The installation instructions for Debian are here. We used the following version:
Download the Docker image from here and load with:
docker load < klee38pspa.tar.gz
Run the docker image:
docker run --rm -it klee38pspa bash
This opens a shell into the docker image with all the sources and benchmarks used for our evaluation. Folders of interest:
/klee-dsa-benchmarks
- Contains the benchmarks used in the evaluation /data/dependencies/SVF-dynamic
- Source code for PSPA analysis/src/client-chopper
- Source code for Chopper on top of PSPA/src/client-resolution
- Source code for KLEE with PSPA based object resolution/src/client-wit
- Source code for KLEE performing WIT with PSPA/src/client-chopper-build/bin
- Binaries for Chopper on top of PSPA/src/client-resolution-build/bin
- Binaries for KLEE with PSPA based object resolution/src/client-wit-build/bin
- Binaries for KLEE performing WIT with PSPAWe provide a script for each of the experiments in our evaluation.
The results of the experiment are written to the /output
directory.
Note: The results in most of the experiments might be affected by the performance of the underlying host machine, so some deviations are expected. We ran the experiments on an Intel i7-6700 machine with 32GB RAM (Ubuntu 16.04).
To run the experiments for table 1 (estimated running time: 10 minutes):
cd /klee-dsa-benchmarks
./run_statistics_experiment.sh
To parse the results:
python3.5 parse_statistics.py /output/statistics/
The expected output:
/output/statistics/libosip-mod: 17.788527634865403
/output/statistics/libosip-mod: 2.94666381979289
/output/statistics/libtasn1-mod: 7.357920358822886
/output/statistics/libtasn1-mod: 1.5588283557597638
/output/statistics/libtiff-mod: 140.16058016058017
/output/statistics/libtiff-mod: 12.958559958559958
/output/statistics/libosip-ref: 32.98462950707683
/output/statistics/libosip-ref: 3.684694879788188
/output/statistics/libtasn1-ref: 8.245487364620939
/output/statistics/libtasn1-ref: 1.9454928344820042
/output/statistics/libtiff-ref: 126.63247863247864
/output/statistics/libtiff-ref: 17.44107744107744
To run the experiments for figure 4 (estimated running time: 10 hours):
cd /klee-dsa-benchmarks
./run_modref_experiment.sh
To parse the results:
python3.5 parse_recoveries.py /output/cse/recoveries/ /out.csv
The expected output:
benchmark,config,mode,paths,recoveries,snapshots,usage,overhead
libosip,skip_10,static,450747,1012604,631923,0,0.0
libosip,skip_10,symbolic,484435,831105,677439,677439,4.17
libosip,skip_5,static,123327,374093,12523,0,0.0
libosip,skip_5,symbolic,155969,324350,26004,26004,0.52
libosip,skip_2,static,756060,756009,2,0,0.0
libosip,skip_2,symbolic,734785,734734,2,2,0.93
libosip,skip_4,static,182035,574353,110743,0,0.0
libosip,skip_4,symbolic,277670,506769,169877,169877,1.41
libosip,skip_1,static,116742,396563,30988,0,0.0
libosip,skip_1,symbolic,117886,382565,34867,34867,0.49
libosip,skip_3,static,757665,757615,2,0,0.0
libosip,skip_3,symbolic,736741,736691,2,2,0.94
libosip,skip_7,static,747275,747224,2,0,0.0
libosip,skip_7,symbolic,731407,731356,2,2,0.93
libosip,skip_8,static,33697,483091,52531,0,0.0
libosip,skip_8,symbolic,43869,481941,65442,65442,0.65
libosip,skip_9,static,26473,245088,3253,0,0.0
libosip,skip_9,symbolic,24623,228282,3147,3147,0.21
libosip,skip_6,static,471682,1473900,716983,0,0.0
libosip,skip_6,symbolic,479723,1275087,751046,751046,5.4
libtasn1,skip_10,static,360,253932,5042,0,0.0
libtasn1,skip_10,symbolic,1350,255957,15930,14637,1.08
libtasn1,skip_5,static,605,183520,2686,0,0.0
libtasn1,skip_5,symbolic,4057,60724,24674,24674,10.73
libtasn1,skip_2,static,253,525676,3034,0,0.0
libtasn1,skip_2,symbolic,1285,537402,19692,19692,1.76
libtasn1,skip_4,static,261,372451,758,0,0.0
libtasn1,skip_4,symbolic,421,498311,1309,1309,1.08
libtasn1,skip_1,static,582,344230,523,0,0.0
libtasn1,skip_1,symbolic,6397,374246,14099,14099,1.31
libtasn1,skip_3,static,458,238892,589,0,0.0
libtasn1,skip_3,symbolic,2023,32188,2820,2820,0.38
libtasn1,skip_7,static,83,281003,106,0,0.0
libtasn1,skip_7,symbolic,1447,146778,3313,1929,0.59
libtasn1,skip_8,static,450,225546,531,0,0.0
libtasn1,skip_8,symbolic,4032,73837,5863,5863,1.58
libtasn1,skip_9,static,5581,380338,12163,0,0.0
libtasn1,skip_9,symbolic,15732,114801,35292,35292,1.86
libtasn1,skip_6,static,7564,162676,27152,0,0.0
libtasn1,skip_6,symbolic,19343,128583,70911,51631,2.2
libtiff,skip_10,static,252,384599,11,0,0.0
libtiff,skip_10,symbolic,719,1160,14,14,0.41
libtiff,skip_5,static,731,19810,690,0,0.0
libtiff,skip_5,symbolic,731,894,694,694,0.4
libtiff,skip_2,static,731,16101,4,0,0.0
libtiff,skip_2,symbolic,731,14817,4,4,0.14
libtiff,skip_4,static,731,38349,564,0,0.0
libtiff,skip_4,symbolic,1188,1029,934,934,0.32
libtiff,skip_1,static,618,443865,161,0,0.0
libtiff,skip_1,symbolic,636,13451,2506,1917,1.6
libtiff,skip_3,static,5118,293947,6117,0,0.0
libtiff,skip_3,symbolic,5660,23297,6954,6954,2.26
libtiff,skip_7,static,2878,37492,2891,0,0.0
libtiff,skip_7,symbolic,27112,14413,44332,44332,8.22
libtiff,skip_8,static,481,210033,15,0,0.0
libtiff,skip_8,symbolic,497,1265,17,17,0.29
libtiff,skip_9,static,10665,42413,16,0,0.0
libtiff,skip_9,symbolic,139657,20508,5137,5137,0.92
libtiff,skip_6,static,10665,43344,959,0,0.0
libtiff,skip_6,symbolic,130397,10,2245,2245,0.91
To run the experiments for tables 2 and 3 (estimated running time: 18 hours):
cd /klee-dsa-benchmarks
./run_coverage.sh
To parse the results for table 2, we use gcov and lcov to compute the line coverage. For each benchmark (libosip, libtasn1, libtiff), run:
cd /klee-dsa-benchmarks/<benchmark>
./compute_coverage.sh /output/cse/coverage/libosip/klee-out-<search>_<mode> && ./gen_report.sh /report
where search is one dfs
or random-path
, and mode is static
or pspa
.
For example, for libosip with dfs and _pspa_, the expected output is:
...
...
Overall coverage rate:
lines......: 10.1% (519 of 5144 lines)
...
...
To parse the results for table 3:
python3.5 parse_overhead.py /output/cse/coverage/
The expected output:
/output/cse/coverage/libosip / dfs
No R: 97.87, R: 6.41, N: 6680404
/output/cse/coverage/libosip / random-path
No R: 90.18, R: 2.66, N: 919326
/output/cse/coverage/libtasn1 / dfs
No R: 3.06, R: 2.31, N: 967
/output/cse/coverage/libtasn1 / random-path
No R: 9.8, R: 3.54, N: 6800
/output/cse/coverage/libtiff / dfs
No R: 3.24, R: 0.04, N: 401
/output/cse/coverage/libtiff / random-path
No R: 95.74, R: 0.52, N: 155688
To run the experiments for table 4 (estimated running time: 16 hours):
cd /klee-dsa-benchmarks/cve/libtasn1/
./run_cve_experiments.sh
To parse the results for table 4: S_PA without chopping-aware heuristic (column 1):
python3.5 parse_cve.py /output/cse/cve/static-without-cah/
PS_PA without chopping-aware heuristic (column 2):
python3.5 parse_cve.py /output/cse/cve/pspa-without-cah/
S_PA with chopping-aware heuristic (column 3):
python3.5 parse_cve.py /output/cse/cve/static/
PS_PA with chopping-aware heuristic (column 4):
python3.5 parse_cve.py /output/cse/cve/pspa/
To run the experiments for table 5 (estimated running time: 9 hours):
cd /klee-dsa-benchmarks
./run_all_path.sh
To parse the results for table 5:
python3.5 parse_termination.py /output/cse/termination/
The expected output:
libosip/vanilla: 00:33:30
libosip/static: 01:00:00
libosip/pspa: 00:04:16
libtasn1/vanilla: 00:41:29
libtasn1/static: 01:00:00
libtasn1/pspa: 00:02:12
libtiff/vanilla: 00:32:40
libtiff/static: 01:00:00
libtiff/pspa: 00:10:02
To run the experiments for table 6 (estimated running time: 6 hours):
cd /klee-dsa-benchmarks
./run_wit.sh
To parse the results for table 6:
python3.5 parse_wit.py /output/wit/
The expected output:
libosip/static: Colours: 70, Transitions: 108532593
libosip/pspa: Colours: 277, Transitions: 302069717
libtasn1/static: Colours: 157, Transitions: 8848420
libtasn1/pspa: Colours: 645, Transitions: 39456716
libtiff/static: Colours: 1047, Transitions: 1938
libtiff/pspa: Colours: 1101, Transitions: 1938
To run the experiments for table 7 (estimated running time: 9 hours):
cd /klee-dsa-benchmarks
./run_resolution_experiment.sh
To parse the results for table 7:
python3.5 parse_resolution.py /output/resolution/
The expected output:
m4/vanilla: Q: 1902, RT: 56%, ET: 00:49:13, SA: 0.0%
m4/static: Q: 1836, RT: 55%, ET: 00:47:08, SA: 0.15%
m4/pspa: Q: 960, RT: 38%, ET: 00:34:46, SA: 0.54%
make/vanilla: Q: 21832, RT: 56%, ET: 01:05:02, SA: 0.0%
make/static: Q: 18872, RT: 52%, ET: 00:59:48, SA: 0.16%
make/pspa: Q: 6222, RT: 30%, ET: 00:41:13, SA: 1.22%
sqlite/vanilla: Q: 7726, RT: 28%, ET: 00:43:19, SA: 0.0%
sqlite/static: Q: 7726, RT: 23%, ET: 00:50:47, SA: 14.23%
sqlite/pspa: Q: 1166, RT: 5%, ET: 00:33:23, SA: 0.51%
Most of the experiments have a long running time. The running time can be reduced, but take into account that the results will change accordingly.
Figure 4
Update the value of MAX_TIME in /klee-dsa-benchmarks/run_modref_experiment.sh
:
MAX_TIME=60
Tables 2 and 3
Update the value of MAX_TIME to:
MAX_TIME=120
in the following files:
/klee-dsa-benchmarks/libosip/run_coverage.sh
/klee-dsa-benchmarks/libtasn1/run_coverage.sh
/klee-dsa-benchmarks/libtiff/run_coverage.sh
Table 4
Update the following line in /klee-dsa-benchmarks/cve/libtasn1/run_cve_experiments.sh
:
FLAGS="-max-memory=8000 -max-time=120 -use-forked-solver=0 -use-recovery-cache=1"
Table 5
Update the value of MAX_TIME to:
MAX_TIME=120
in the following files:
/klee-dsa-benchmarks/libosip/run_term.sh
/klee-dsa-benchmarks/libtasn1/run_term.sh
/klee-dsa-benchmarks/libtiff/run_term.sh
Table 6
Update the following line in /klee-dsa-benchmarks/libosip/run_wit.sh
:
-max-instructions=105348978
Update the following line in /klee-dsa-benchmarks/libtasn1/run_wit.sh
:
-max-instructions=8318716
Update the following line in /klee-dsa-benchmarks/libtiff/run_wit.sh
:
-max-instructions=2235833
Table 7
Add the following line at the end of the file /klee-dsa-benchmarks/resolution/flags.sh
:
FLAGS+="-max-time=60 "
Note:
Before re-running the experiments, you should remove the generated klee-*
directories in the /output
directory. The most easy way to do that is loading the original image.
Compile the benchmark to LLVM 3.8 bitcode.
To see how we performed this step for our benchmarks,
you can look for example at the $(BC_TARGET)
target in the following file:
/klee-dsa-benchmarks/libosip/Makefile
The klee
executable for Chopper (with support for past-sensitive pointer analysis) is located at /src/client-chopper-build/bin/klee
.
The basic command is:
/src/client-chopper-build/bin/klee --use-modular-pta -use-pta-mode=${mode} -skip-functions=${skip_functions} <bitcode_file>
where $mode
is either static
(static pointer analysis) or symbolic
(past-sensitive pointer analysis),
and $skip_functions
are the functions you want to skip.
For more details, see for example /klee-dsa-benchmarks/libosip/run_coverage.sh
(which is used for the coverage experiments).
The klee
executable for the WIT client analysis is located at /src/client-wit-build/bin/klee
.
The basic command is:
/src/client-wit-build/bin/klee -use-pta-mode=${mode} --create-unique-as=1 -use-strong-updates=0 -pta-target=${target} <bitcode_file>
where $mode
is the pointer analysis mode (as before),
and ${target}
denotes the function from which you want to start the analysis.
For more details, see for example /klee-dsa-benchmarks/libosip/run_wit.sh
.
The klee
executable for the symbolic pointer resolution client analysis is located at /src/client-resolution-build/bin/klee
.
The basic command is:
/src/client-resolution-build/bin/klee -use-pta-mode=${mode} --use-sa-resolve=1 <bitcode_file>
For more details, see for example /klee-dsa-benchmarks/resolution/m4/run.sh
.
Our extension of SVF is located at /data/dependencies/SVF-dynamic
.
The interesting class to look at is AndersenDynamic
which is implemented in these files:
lib/WPA/AndersenDynamic.cpp
include/WPA/AndersenDynamic.h
This class enables running the pointer analysis algorithm locally on a given function from an arbitrary initial abstract state, as opposed to whole program analysis.
The base implementation of PSPA is located in /src/pspa-master
.
Some interesting files to look at:
Implements the abstraction function for a single symbolic memory location. In the context of static analysis, this function is often called the embedding function.
Implements the abstraction of a symbolic state, that is, builds the abstract points-to graph.
Executor::updatePointsToOnCallSymbolic
:
Executor::analyzeTargetFunction
:
The client analyses are in /src/client-{chopper,resolution,wit}
, which are derived from /src/pspa-master
with the respective client built on top.
If you make any changes, running make
in the corresponding ´-build´ directory (/src/client-{chopper,resolution,wit}-build
) will enable you to re-run our experiments.