summaryrefslogtreecommitdiff
path: root/doc
diff options
context:
space:
mode:
authorNobuyoshi Nakada <nobu@ruby-lang.org>2023-07-01 12:22:22 +0900
committerNobuyoshi Nakada <nobu@ruby-lang.org>2023-07-01 12:22:22 +0900
commit2952dbad5eedb3ed7d0bfdd8b6c7a546c775c76b (patch)
tree970b3b77a82123f89011b497483104a3f60b8dc9 /doc
parent1bc8838d60ef3fc6812d3b64ed87caaf0ae943d9 (diff)
[DOC] Move document for YARP
Diffstat (limited to 'doc')
-rw-r--r--doc/yarp/fuzzing.md93
1 files changed, 93 insertions, 0 deletions
diff --git a/doc/yarp/fuzzing.md b/doc/yarp/fuzzing.md
new file mode 100644
index 0000000000..c9571d4a2c
--- /dev/null
+++ b/doc/yarp/fuzzing.md
@@ -0,0 +1,93 @@
+# Fuzzing
+
+We use fuzzing to test the various entrypoints to the library. The fuzzer we use is [AFL++](https://aflplus.plus). All files related to fuzzing live within the `fuzz` directory, which has the following structure:
+
+```
+fuzz
+├── corpus
+│   ├── parse fuzzing corpus for parsing (a symlink to our fixtures)
+│   ├── regexp fuzzing corpus for regexp
+│   └── unescape fuzzing corpus for unescaping strings
+├── dict a AFL++ dictionary containing various tokens
+├── docker
+│   └── Dockerfile for building a container with the fuzzer toolchain
+├── fuzz.c generic entrypoint for fuzzing
+├── heisenbug.c entrypoint for reproducing a crash or hang
+├── parse.c fuzz handler for parsing
+├── parse.sh script to run parsing fuzzer
+├── regexp.c fuzz handler for regular expression parsing
+├── regexp.sh script to run regexp fuzzer
+├── tools
+│   ├── backtrace.sh generates backtrace files for a crash directory
+│   └── minimize.sh generates minimized crash or hang files
+├── unescape.c fuzz handler for unescape functionality
+└── unescape.sh script to run unescape fuzzer
+```
+
+## Usage
+
+There are currently three fuzzing targets
+
+- `yp_parse_serialize` (parse)
+- `yp_regexp_named_capture_group_names` (regexp)
+- `yp_unescape_manipulate_string` (unescape)
+
+Respectively, fuzzing can be performed with
+
+```
+make fuzz-run-parse
+make fuzz-run-regexp
+make fuzz-run-unescape
+```
+
+To end a fuzzing job, interrupt with CTRL+C. To enter a container with the fuzzing toolchain and debug utilities, run
+
+```
+make fuzz-debug
+```
+
+# Out-of-bounds reads
+
+Currently, encoding functionality implementing the `yp_encoding_t` interface can read outside of inputs. For the time being, ASAN instrumentation is disabled for functions from src/enc. See `fuzz/asan.ignore`.
+
+To disable ASAN read instrumentation globally, use the `FUZZ_FLAGS` environment variable e.g.
+
+```
+FUZZ_FLAGS="-mllvm -asan-instrument-reads=false" make fuzz-run-parse
+```
+
+Note, that this may make reproducing bugs difficult as they may depend on memory outside of the input buffer. In that case, try
+
+```
+make fuzz-debug # enter the docker container with build tools
+make build/fuzz.heisenbug.parse # or .unescape or .regexp
+./build/fuzz.heisenbug.parse path-to-problem-input
+```
+
+# Triaging Crashes and Hangs
+
+Triaging crashes and hangs is easier when the inputs are as short as possible. In the fuzz container, an entire crash or hang directory can be minimized using
+
+```
+./fuzz/tools/minimize.sh <directory>
+```
+
+e.g.
+```
+./fuzz/tools/minimize.sh fuzz/output/parse/default/crashes
+```
+
+This may take a long time. In the the crash/hang directory, for each input file there will appear a minimized version with the extension `.min` appended.
+
+Backtraces for crashes (not hangs) can be generated en masse with
+
+```
+./fuzz/tools/backtrace.sh <directory>
+```
+
+Files with basename equal to the input file name with extension `.bt` will be created e.g.
+
+```
+id:000000,sig:06,src:000006+000190,time:8480,execs:18929,op:splice,rep:4
+id:000000,sig:06,src:000006+000190,time:8480,execs:18929,op:splice,rep:4.bt
+```