KLEE-Array is made available both as source tarball and in a binary form, pre-installed in a virtual machine. For more details on KLEE-Array and the benchmarks, please see the ISSTA’17 paper.
The source code of KLEE-Array is available here.
To compile and run KLEE-Array, you need a Linux-based operating system. You also need:
To compile the code, follow the KLEE’s documentation. The only difference is that, instead of checking out KLEE, you need to use the source distribution available above.
All the benchmark applications and the necessary infrastructure have been pre-installed under the VM.
The VM was prepared under VirtualBox and is based on Ubuntu 14.04 LTS.
Click here to download the virtual machine (md5: 663dc4184378a1c76aadb2756a1c9470).
The source code as well as the compiled binaries of all the tools used can be found in the home directory of the VM user (/home/user
). In particular, klee-array
contains our modified version of KLEE.
This replication package is located in the home folder and is structured as follows:
RQ1_2
contains the replication infrastructure to answer research question RQ1 and RQ2case_studies
is the main folder where you can find the source code and binaries of the subject programscommon.mk
is the main configuration file for our experimental infrastructuresandbox.tgz
contains KLEE’s sandbox directorytesting-env.sh
is a script that creates an environmental file for KLEEbandicoot
, bc
, binutils
, bzip2
, and coreutils
contain the source code and binaries of the subject programs, as well as a Makefile script to replicate the experimentsRQ3
contains the replication infrastructure to answer research question RQ3bc-1.06-mod
is the folder where you can find the source code and binary for the program BC that has been modified so that some symbolic inputs are concretized during symbolic exeuction.bc-1.06-gcov
is the folder where you can find the source code and binary for the original version of BC. During the experiments this program will be compiled so that it collects code and branch coverage information.If you have any further questions about the VM or you spot an issue, please contact .
We implemented our transformations within the KLEE symbolic executor. KLEE is invoked through command-line:
# klee [klee-options] <program.bc> [program-options]
The general form of a KLEE command-line is first the arguments for KLEE itself ([klee-options]
), then the LLVM bitcode file to execute (program.bc
), and then any arguments to pass to the application ([program-options]
). In particular, the KLEE option -posix-runtime
enables the use of the symbolic environment options as part of the program’s options. To get a complete list of KLEE’s command-line options run: klee --help
. More information on how to use KLEE can be found at: http://klee.github.io/docs/options/.
Our transformations can be enabled and controlled through the following command-line options:
-optimize-array - Optimize accesses to arrays. (default=off)
=all - Combining constant, index, and value optimizations
=constant - Constant value optimization
=index - Index-based optimization
=read_value - Value-based at ReadExpr
=read_value_concrete - Value-based at ReadExpr (concrete only)
=read_value_concsymb - Value-based at ReadExpr (concrete/symbolic only)
=value - Value-based at branch
=value_concrete - Value-based at branch (concrete only)
=value_concsymb - Value-based at branch (concrete/symbolic only)
-optimize-array-rewrite - Expression rewriting strategy.
=or - k == 0 | k ==1 | .. | k == 5
=range - (0 <= k && k <= 5)
=hybrid - k!=8
-array-value-ratio=R - Maximum ratio of unique values to array size for which
the array optimizations are applied.
-array-value-symb-ratio=R - Maximum ratio of symbolic values to array size for which
the array optimizations are applied.
In particular, we use the following combinations in our experimental evaluation:
--optimize-array=constant
--optimize-array=index --optimize-array-rewrite=hybrid
--optimize-array=read_value --array-value-ratio=0.75 --array-value-symb-ratio=0.75
--optimize-array=all --optimize-array-rewrite=hybrid --array-value-ratio=0.75 --array-value-symb-ratio=0.75
Our transformations can be applied to all the programs supported by standard KLEE. The most difficult part in applying KLEE to a new case study is to compile the subject program to bitcode. We usually followed the procedure and the tools (i.e., whole-program-llvm
) suggested in KLEE’s documentation.
More examples can be found inside our replication package.
We provide a pre-configured virtual machine containing our set of experiments.
The artifact includes the tool together with the complete set of experiments and its replication infrastructure to allow other researchers to reproduce the experiments.
Our experiments are very sensitive to changes in both environment and memory since they directly impact the code exploration strategies of KLEE. The best way to replicate our results is to use a physical machine, although the configuration is non-trivial. We did our best to set up the VM to replicate as close as possible the environment on our physical servers. However, the VM environment might be over-restrictive for our experiments, due to memory, CPU, and disk limits.
The current potential known issues are:
Determinism: Running KLEE deterministically is essential for our experiments, and yet it is non-trivial. We ran our experiments on a physical machine with 16 GB RAM, and we optimistically reserve 8 GB of RAM to the VM. Inside the VM, we disable ASLR and rely on TCMalloc to deterministically allocate memory. Failures in deterministically allocating memory may result in divergences during symbolic execution. KLEE might fail to allocate memory if either the VM, the host system, or both do not have enough available memory. In case of failures, a VM reboot might be enough to fix the issue.
Reducing the allocated memory might result in KLEE hitting the memory limit, thus consequently discarding execution states and exposing behavior divergences unrelated to our transformations.
KLEE runs all programs in a sandbox directory located in /tmp/sandbox
, which is deleted and re-created at every run. Failures in setting up the sandbox will likely result in divergences.
Runtime fluctuations: The data reported in the paper is collected on a physical machine without a middle-layer hypervisor. Moreover, to avoid subtle and suboptimal processor allocation by the Linux scheduler, we pinned processes to CPUs, but unfortunately VMs like Virtualbox do not allow users to pin physical CPUs. As a result, the performance recorded with this VM will likely differ from the one in the paper. However, we expect you to observe similar performance benefits when our optimizations are applied. It is crucial to allocate at least two CPUs to the VM and to run all experiments sequentially.
Disk usage: The complete replication of our experiments requires almost a terabyte of disk space, and one week of (physical) CPU time. Recording execution traces also incurs 3x slowdown. If disk usage is an issue (either space, performance, or both), please follow the suggestion in the \textsf{Replicating the Experiments} section, although you will no longer be able to record execution traces, and thus to check results for RQ1.
From now on, the description—and in particular all the paths in the examples—will refer to our virtual machine setup.
This step-by-step tutorial focuses on the replication of all the experiments used to evaluate our technique.
Our experimental infrastructure is based on GNU make and is composed of two main parts: A configuration file which is common to all experiments (case_studies/common.mk
) and a Makefile
file inside the directory of each case study (e.g., case_studies/bc/Makefile
).
The common.mk
file contains the configuration of KLEE used in our experiments. The KLEE options used during the evaluation are the following:
--simplify-sym-indices \
--output-module \
--max-memory=4095 \
--allocate-determ=true \
--allocate-determ-size=4095 \
--allocate-determ-start-address=0x7ffef66f3000 \
--max-sym-array-size=4096 \
--disable-inlining \
--optimize \
--use-forked-solver \
--use-cex-cache \
--libc=uclibc \
--posix-runtime \
--allow-external-sym-calls \
--only-output-states-covering-new \
--watchdog \
--max-memory-inhibit=false \
--use-query-log=solver:kquery \
--max-static-fork-pct=1 \
--max-static-solve-pct=1 \
--max-static-cpfork-pct=1 \
--switch-type=internal \
--dump-states-on-halt=false \
--debug-print-instructions=$(KINSTRFORMAT) \
--debug-compress-instructions \
--compress-query-log \
--environ=/tmp/test.env \
--run-in=/tmp/sandbox \
--max-instruction-time=$(KINSTRTIMEOUT) \
--max-solver-time=$(KINSTRTIMEOUT) \
--max-time=$(KTIMEOUT) \
--search=$(KSEARCH) \
$(KOPT)
Where $(KINSTRFORMAT)
, $(KINSTRTIMEOUT)
, $(KTIMEOUT)
, $(KSEARCH)
, and $(KOPT)
have the following default values:
KINSTRFORMAT:=compact:file
KINSTRTIMEOUT:=200.
KTIMEOUT:=4000.
KSEARCH=dfs
KOPT=
NOTE: In case disk usage is an issue, remove the following lines:
--use-query-log=solver:kquery
--debug-print-instructions=$(KINSTRFORMAT)
--debug-compress-instructions
--compress-query-log
Please notice that by removing these flags:
We essentially have two sets of experiments: The first set is used to answer research questions RQ1 and RQ2, thus filling the data in Table 3. The second set is used to answer RQ3.
The experiment follows the protocol described in the paper in Section 5.2: First, we create a baseline, that is, we run KLEE for a given amount of time and record the number of executed instructions (Creating a baseline). Second, we re-run KLEE by limiting the exploration to the number of instructions previously recorded, with and without our transformations (Running KLEE with transformations).
The creation and configuration of the baseline in our scripts is non-trivial. Our VM is already configured with the instruction limits that we recorded and used for our experiments on a 30 minute run. Hence, you can skip this step, although running KLEE in the VM on our instruction limits will incur additional runtime (up to 4x slowdown).
We now describe the experiment referring to the coreutils
case study where not explicitly mentioned.
Go to the coreutils
directory of the replication package:
# cd ~/RQ1_2/case_studies/coreutils
Invoke the generation of the baseline. You can specify the
timeout (in seconds) through the KTIMEOUT
variable. For
example, to create a baseline for 10 seconds:
# make KTIMEOUT=10. create-baseline
..
Makefile:92: warning: ignoring old commands for target `baseline-klee-od'
Makefile:92: warning: ignoring old commands for target `baseline-klee-pathchk'
..
Creating KLEE baseline with dfs on program [
..
Warnings about overriding of commands in the Makefile are expected in the output.
Ge the instruction count from KLEE run statistics. The invocation will create a directory and a log file for each subject program in the case study:
# ls
klee-out-[ klee-out-base64 klee-out-basename klee-out-cat klee-out-cksum
baseline-klee-dfs-[.log baseline-klee-dfs-base64.log ...
You can get the executed instruction count with the following command:
# tail -n1 klee-out-[/run.stats | cut -d ',' -f1 | cut -d '(' -f2
1843496
Once you obtained the executed instructions for all the subject programs, you need to configure the Makefile to use such limits in the next runs. The configuration steps to follow depends on the case study. In general, you need to open the Makefile in the case study directory (e.g., bc/Makefile
) and modify the run-*
targets with the new instruction limit (e.g., 1234
). For example, for the BC case study:
# emacs bc/Makefile
[..]
# Run KLEE on BC given a instruction limit
run-bc: $(KLEE) bc.bc prepare
@echo "Running KLEE with $(KSEARCH) on program BC"
-taskset -c 0,1 $(KLEE) --stop-after-n-instructions=1234 \
--stats-write-interval=0 --istats-write-interval=0 \
--stats-write-after-instructions=1234 --istats-write-after-instructions=1234 \
$(KFLAGS) \
--output-dir=klee-out-bc \
$(BC_DIR)/bc/bc.bc \
A --sym-files 1 8 --sym-stdin 8 --sym-stdout \
> run-klee-$(KSEARCH)-bc.log 2>&1
[..]
In case of coreutils
and binutils
, the process is quite different. In fact, given the substantial number of the subject programs (63), we rely on the automatic generation of Makefile recipes through macros. This strategy allows us to decouple instruction limits and program invocation, thus simplifying most of the recipes. The instruction limits are thus stored in a separate file called targets.txt
inside the case studies’ own directory. The file format is CSV and is quite simple: app-name:instructions:statistics
. The instructions
column should contain the executed instructions recorded during the creation of the baseline, while the statistics
should contain at how many instructions the statistics should be dumped. Since dumping statistics might be expensive, we force KLEE to dump only at the beginning and end of the execution by using the same value of executed instructions. For example, the targets.txt
file in coreutils
contains:
# cat coreutils/targets.txt
[:1843496:1843496
base64:2948624182:2948624182
[..]
Unfortunately, as discussed on KLEE’s website some of the subject programs require special symbolic arguments to maximize code coverage. Such special Makefile recipes are stored into the Makefile.specials
file. Therefore, once you obtain the executed instructions for all the subject programs, you need to:
targets.txt
;run-*
targets inside Makefile.specials
since it contains a reference to instruction and statistic limits;run-*
targets inside Makefile.specials
as shown for the general case.After the baseline is created and the configuration updated, you need to move the results to another directory, since new runs will be placed directly in the same path. For example, you can create a directory baseline
:
# mkdir baseline
# mv klee-out-* baseline
# mv baseline-* baseline
You can now follow the instructions to run KLEE with and without transformations and compare their runtime.
As mentioned in the introduction, KLEE might fail to deterministically allocate memory. You can check if the runs completed successfully by inspecting the log files. The error message is the following: Could not allocate memory deterministically
. In case of failures, please reboot the VM and try to run the command again.
Go to the coreutils
directory of the replication package:
# cd ~/RQ1_2/case_studies/coreutils
Now you can run different types of experiments:
# make all-dfs
..
Running KLEE with dfs on program [
..
KOPT
variable. For example, to run KLEE with constant-folding optimizations:# make KOPT="--optimize-array=constant" all-dfs
..
Running KLEE with dfs on program [
..
bandicoot
, BC
, and bzip2
.)# make run-klee-[_1843496_1843496
..
Running KLEE with dfs on program [
..
# make KOPT="--optimize-array=index" run-bc
..
Running KLEE with dfs on program BC
..
make
in the terminal and press TAB
key twice:# make <TAB><TAB>
Display all 175 possibilities? (y or n) y
Makefile baseline-klee-pathchk run-klee-base64_2948624182_2948624182 ..
To completely replicate our experimental evaluation to answer RQ1 and RQ2 you have to run, for each case study:
# make <run-command>
# make KOPT="--optimize-array=constant" <run-command>
# make KOPT="--optimize-array=index --optimize-array-rewrite=hybrid" <run-command>
# make KOPT="--optimize-array=read value --array-value-ratio=0.75 --array-value-symb-ratio=0.75" <run-command>
# make KOPT="--optimize-array=all --optimize-array-rewrite=hybrid --array-value-ratio=0.75 --array-value-symb-ratio=0.75" <run-command>
Where <run-command>
may be either all-dfs
(for coreutils
and binutils
), or a run command specific for a case study (e.g. run-bc
for BC
).
NOTE: After each command invocation, you need to save the current logs and KLEE output directory:
# mkdir constant
# mv klee-out-* constant
# mv run-klee-* constant
Warnings about overriding of commands in the Makefile are expected in the output.
The runtime for each program should be very close to the one used to create the baseline. Minor fluctuations can be observed due to the heuristic nature of constraint solvers.
As mentioned in the introduction, KLEE might fail to deterministically allocate memory. You can check if the runs completed successfully by inspecting the log files. The error message is the following: Could not allocate memory deterministically
. In case of failures, please reboot the VM and try to run the command again.
The experiment follows the protocol described in the paper in Section 5.4:
bc-1.06-mod/bc/scan.c
at lines 1005 through 1067. These modifications insert calls to a special function, klee_assume
, which allows the user to place constraints on variables found in the program.Go to the RQ3
directory of the replication package:
# cd ~/RQ3
Now you can run the two experiments:
# make klee-bc-orig
..
Running KLEE without transformations on BC
..
# make klee-bc-opt
..
Running KLEE with transformations on BC
..
Finally, you can collect the code/branch coverage information:
# make gcov-bc-orig
..
Replaying inputs generated during run with no transformations
..
# make gcov-bc-opt
..
Replaying inputs generated during run with transformations
..
After running the commands to collect coverage information the data for each file in BC will be displayed on the screen. Additionally, the total coverage information will be at the very bottom of the output. Unfortunately, due to the slowdown that is a result of emulation we expect the code coverage achieved to be less than what is reported in the paper. However, the difference between the coverage achieved by the original and optimized runs should be similar.
As mentioned in the introduction, KLEE might fail to deterministically allocate memory. You can check if the runs completed successfully by inspecting the log files. The error message is the following: Could not allocate memory deterministically
. In case of failures, please reboot the VM and try to run the command again.
During the execution of the experiments, KLEE stores two types of data:
Such data is recorded on a per-run basis, although KLEE generates the same files for all runs. We will present how to obtain the results to answer the research question by referring to the BC case study:
# cd PATH_TO_CASE_STUDIES/bc
Each run of BC will generate a folder named klee-out-bc
with the following files inside:
# ls klee-out-bc
array-details.stats array.stats assembly.ll final.bc info instructions.txt.gz
messages.txt run.istats run.stats solver-queries.kquery.gz warnings.txt
Inside the KLEE-generated folder there should be a file named instructions.txt.gz
. This file contains all instructions interpreted by KLEE during symbolic execution.
The file size can quickly grow and, together with file solver-queries.kquery.gz
, is the main source of disk usage in our experiments (up to 10GB each). Since it is infeasible to compare with diff
the files, we rely on the CRC code to check for equality:
# gzip -lv original/klee-out-bc/instructions.txt.gz | tail -n1 | cut -d ' ' -f2
d22f4d6d
# gzip -lv index/klee-out-bc/instructions.txt.gz | tail -n1 | cut -d ' ' -f2
d22f4d6d
In practice, we found this method extremely accurate. The manual inspection of the two CRC codes for their equality allows the generation of the data to fill Table 3 column Eq
in the paper (RQ1).
KLEE stores statistics on the progress of the execution in a CSV file named run.stats
. The file is stored withing the KLEE-generated folder for the run. The CSV file contains several statistics, for example the number of executed instructions, the number of queries issued, and the time required to complete the execution. It is possible to easily inspect the content of the run.stats
file with the klee-stats
tool.
To fill Table 3 (RQ2) it is sufficient to:
# tail -n1 klee-out-bc/run.stats | cut -d ',' -f11
warnings.txt
# grep "OPT.*success" klee-out-bc/warnings.txt | wc -l