Fuzzing SAPI for PHP
--------------------

The following `./configure` options can be used to enable the fuzzing SAPI, as well as all available fuzzers. If you don't build the exif/json/mbstring extensions, fuzzers for these extensions will not be built.

```sh
CC=clang CXX=clang++ \
./configure \
    --disable-all \
    --enable-fuzzer \
    --with-pic \
    --enable-debug-assertions \
    --enable-address-sanitizer \
    --enable-exif \
    --enable-mbstring
```

The `--with-pic` option is required to avoid a linking failure. The `--enable-debug-assertions` option can be used to enable debug assertions despite the use of a release build.

You can combine fuzzing with `--enable-address-sanitizer`, `--enable-undefined-sanitizer` or `--enable-memory-sanitizer`. The first two options can also be used together.

You will need a recent version of clang that supports the `-fsanitize=fuzzer-no-link` option.
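One way to check the clang requirement before configuring is to compile a trivial translation unit with the flag. The following is a minimal sketch, assuming a POSIX shell; it is not part of the PHP build system, and `supports_fuzzer_no_link` is a made-up helper name.

```sh
# Hypothetical helper: succeeds if the given compiler accepts
# -fsanitize=fuzzer-no-link when compiling a trivial C translation unit.
supports_fuzzer_no_link() {
    cc=${1:-clang}
    # Fail early if the compiler cannot be found at all.
    command -v "$cc" >/dev/null 2>&1 || return 1
    echo 'int fuzz_probe;' | "$cc" -fsanitize=fuzzer-no-link -x c -c -o /dev/null - >/dev/null 2>&1
}

if supports_fuzzer_no_link clang; then
    echo "clang supports -fsanitize=fuzzer-no-link"
else
    echo "clang is missing or too old for -fsanitize=fuzzer-no-link"
fi
```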

Running `make` creates the following binaries in `sapi/fuzzer/`:

* `php-fuzz-parser`: Fuzzing language parser and compiler
* `php-fuzz-unserialize`: Fuzzing `unserialize()` function
* `php-fuzz-unserializehash`: Fuzzing `unserialize()` for HashContext objects
* `php-fuzz-json`: Fuzzing JSON parser
* `php-fuzz-exif`: Fuzzing `exif_read_data()` function (requires `--enable-exif`)
* `php-fuzz-mbstring`: Fuzzing `mb_convert_encoding()` (requires `--enable-mbstring`)
* `php-fuzz-mbregex`: Fuzzing `mb_ereg[i]()` (requires `--enable-mbstring`)
* `php-fuzz-execute`: Fuzzing the executor
* `php-fuzz-function-jit`: Fuzzing the function JIT (requires `--enable-opcache`)
* `php-fuzz-tracing-jit`: Fuzzing the tracing JIT (requires `--enable-opcache`)
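Because several of these fuzzers are only built when their extension is enabled, it can help to check which binaries `make` actually produced. The following is a minimal sketch, assuming a POSIX shell run from the top of the build tree; `list_built_fuzzers` is a made-up helper name, not something PHP ships.

```sh
# Hypothetical helper: print the names of executable php-fuzz-* files
# found in the given directory (defaults to sapi/fuzzer).
list_built_fuzzers() {
    dir=${1:-sapi/fuzzer}
    for f in "$dir"/php-fuzz-*; do
        # Skip the unexpanded glob and anything that is not an executable file.
        [ -f "$f" ] && [ -x "$f" ] || continue
        echo "${f##*/}"
    done
}

list_built_fuzzers sapi/fuzzer
```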

Some fuzzers have a seed corpus in `sapi/fuzzer/corpus`. You can use it as follows:

```sh
cp -r sapi/fuzzer/corpus/exif ./my-exif-corpus
sapi/fuzzer/php-fuzz-exif ./my-exif-corpus
```

For the unserialize fuzzer, a dictionary of internal classes should be generated first:

```sh
sapi/cli/php sapi/fuzzer/generate_unserialize_dict.php
cp -r sapi/fuzzer/corpus/unserialize ./my-unserialize-corpus
sapi/fuzzer/php-fuzz-unserialize -dict=$PWD/sapi/fuzzer/dict/unserialize ./my-unserialize-corpus
```

For the unserializehash fuzzer, generate a corpus of initial hash serializations:

```sh
sapi/cli/php sapi/fuzzer/generate_unserializehash_corpus.php
# Use a separate directory so this corpus does not mix with the unserialize one.
cp -r sapi/fuzzer/corpus/unserializehash ./my-unserializehash-corpus
sapi/fuzzer/php-fuzz-unserializehash ./my-unserializehash-corpus
```

For the parser fuzzer, a corpus may be generated from Zend test files:

```sh
sapi/cli/php sapi/fuzzer/generate_parser_corpus.php
mkdir ./my-parser-corpus
sapi/fuzzer/php-fuzz-parser -merge=1 ./my-parser-corpus sapi/fuzzer/corpus/parser
sapi/fuzzer/php-fuzz-parser -only_ascii=1 ./my-parser-corpus
```

For the execute, function-jit and tracing-jit fuzzers, a corpus may be generated from any set of test files:

```sh
sapi/cli/php sapi/fuzzer/generate_execute_corpus.php ./execute-corpus Zend/tests ext/opcache/tests/jit
sapi/fuzzer/php-fuzz-function-jit ./execute-corpus
```
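Since the same corpus serves all three executor-based fuzzers, they can be run one after another. The following is a minimal sketch, assuming a POSIX shell run from the top of the build tree; `run_execute_fuzzers` is a made-up helper name, and the time limit relies on libFuzzer's `-max_total_time` option.

```sh
# Hypothetical helper: run each executor-based fuzzer on the same corpus
# for a bounded number of seconds, skipping binaries that were not built.
run_execute_fuzzers() {
    corpus=$1
    seconds=${2:-60}
    for name in execute function-jit tracing-jit; do
        bin="sapi/fuzzer/php-fuzz-$name"
        if [ ! -x "$bin" ]; then
            echo "skipping $bin (not built)"
            continue
        fi
        # -max_total_time is a libFuzzer option that stops the run after N seconds.
        "$bin" -max_total_time="$seconds" "$corpus" || echo "$bin exited with status $?"
    done
}

run_execute_fuzzers ./execute-corpus 60
```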

For the mbstring fuzzer, a dictionary of encodings should be generated first:

```sh
sapi/cli/php sapi/fuzzer/generate_mbstring_dict.php
# The corpus directory must exist before the fuzzer starts.
mkdir -p ./my-mbstring-corpus
sapi/fuzzer/php-fuzz-mbstring -dict=$PWD/sapi/fuzzer/dict/mbstring ./my-mbstring-corpus
```

For the mbregex fuzzer, you may want to build the libonig dependency with instrumentation. At this time, libonig is not clean under UBSan, so only the fuzzer and address sanitizers may be used.

```sh
git clone https://github.com/kkos/oniguruma.git
pushd oniguruma
autoreconf -vfi
./configure CC=clang CFLAGS="-fsanitize=fuzzer-no-link,address -O2 -g"
make
popd

# Point PHP's configure at the instrumented build.
export ONIG_CFLAGS="-I$PWD/oniguruma/src"
export ONIG_LIBS="-L$PWD/oniguruma/src/.libs -l:libonig.a"
```

With these variables exported before running PHP's `./configure`, an instrumented libonig is linked statically into the PHP binary.
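Because `ONIG_LIBS` names `libonig.a` explicitly, it may be worth confirming that the oniguruma build produced the header and static archive before configuring PHP. The following is a minimal sketch, assuming a POSIX shell and the directory layout implied by the `ONIG_CFLAGS` and `ONIG_LIBS` paths above; `check_onig_build` is a made-up helper name.

```sh
# Hypothetical helper: succeeds if the oniguruma checkout contains the header
# and static archive that ONIG_CFLAGS and ONIG_LIBS point at.
check_onig_build() {
    root=${1:-$PWD/oniguruma}
    [ -f "$root/src/oniguruma.h" ] && [ -f "$root/src/.libs/libonig.a" ]
}

if check_onig_build "$PWD/oniguruma"; then
    echo "instrumented libonig found"
else
    echo "libonig.a or oniguruma.h missing; build oniguruma first"
fi
```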