9220a0fdd0f3dc2c8dd7cbeefac7d11106451b51 tests: Add one specialized ProcessMessage(...) fuzzing binary per message type for optimal results when using coverage-guided fuzzing (practicalswift)
fd1dae10b4a549ba9292d837235d59bd9eebbed3 tests: Add fuzzing harness for ProcessMessage(...) (practicalswift)
Pull request description:
Add fuzzing harness for `ProcessMessage(...)`. Enables high-level fuzzing of the P2P layer.
All code paths reachable from this fuzzer can be assumed to be reachable for an untrusted peer.
Seeded from thin air (an empty corpus) this fuzzer reaches roughly 20 000 lines of code.
To test this PR:
```
$ make distclean
$ ./autogen.sh
$ CC=clang CXX=clang++ ./configure --enable-fuzz \
--with-sanitizers=address,fuzzer,undefined
$ make
$ src/test/fuzz/process_message
…
```
Worth noting about this fuzzing harness:
* To achieve a reasonable number of executions per seconds the state of the fuzzer is unfortunately not entirely reset between `test_one_input` calls. The set-up (`FuzzingSetup` ctor) and tear-down (`~FuzzingSetup`) work is simply too costly to be run on every iteration. There is a trade-off to handle here between a.) achieving high executions/second and b.) giving the fuzzer a totally blank slate for each call. Please let me know if you have any suggestion on how to improve this situation while maintaining >1000 executions/second.
* To achieve optimal results when using coverage-guided fuzzing I've chosen to create one specialised fuzzing binary per message type (`process_message_addr`, `process_message_block`, `process_message_blocktxn `, etc.) and one general fuzzing binary (`process_message`) which handles all messages types. The latter general fuzzer can be seeded with inputs generated by the former specialised fuzzers.
Happy fuzzing friends!
ACKs for top commit:
MarcoFalke:
ACK 9220a0fdd0 🏊
Tree-SHA512: c314ef12b0db17b53cbf3abfb9ecc10ce420fb45b17c1db0b34cabe7c30e453947b3ae462020b0c9f30e2c67a7ef1df68826238687dc2479cd816f0addb530e5
6590395f6047cbfbe29f491d816c25c9a28d23a2 tests: Remove FUZZERS_MISSING_CORPORA (practicalswift)
815c7a679316e34b2072a45949ad4ecb1ae1c7fb tests: Add basic fuzzing harness for CNetAddr/CService/CSubNet related functions (netaddress.h) (practicalswift)
Pull request description:
Add basic fuzzing harness for `CNetAddr`/`CService`/`CSubNet` related functions (`netaddress.h`).
To test this PR:
```
$ make distclean
$ ./autogen.sh
$ CC=clang CXX=clang++ ./configure --enable-fuzz \
--with-sanitizers=address,fuzzer,undefined
$ make
$ src/test/fuzz/netaddress
…
```
Top commit has no ACKs.
Tree-SHA512: 69dc0e391d56d5e9cdb818ac0ac4b69445d0195f714442a06cf662998e38b6e0bbaa635dce78df37ba797feed633e94abba4764b946c1716d392756e7809112d
9ff41f64198e8ddb969544fc1a5328763f1fa183 tests: Add float to FUZZERS_MISSING_CORPORA (temporarily) (practicalswift)
8f6fb0a85ae6399c8fb4f205ad35c319c42294f1 tests: Add serialization/deserialization fuzzing for integral types (practicalswift)
3c82b92d2e01e409cc46261bffcf3643102f0b94 tests: Add fuzzing harness for functions taking floating-point types as input (practicalswift)
c2bd5888607d283a229c9361747a93c83dfea0de Add missing includes (practicalswift)
Pull request description:
Add simple fuzzing harness for functions with floating-point parameters (such as `ser_double_to_uint64(double)`, etc.).
Add serialization/deserialization fuzzing for integral types.
Add missing includes.
To test this PR:
```
$ make distclean
$ ./autogen.sh
$ CC=clang CXX=clang++ ./configure --enable-fuzz \
--with-sanitizers=address,fuzzer,undefined
$ make
$ src/test/fuzz/float
…
```
Top commit has no ACKs.
Tree-SHA512: 9b5a0c4838ad18d715c7398e557d2a6d0fcc03aa842f76d7a8ed716170a28f17f249eaede4256998aa3417afe2935e0ffdfaa883727d71ae2d2d18a41ced24b5
7e9c7113afbed96cef80c327cc93e82000d6bb69 compressor: Make the domain of CompressAmount(...) explicit (practicalswift)
4a7fd7a7124f84e010b01d0769ef0572bf031ee8 tests: Add amount compression/decompression fuzzing to existing fuzzing harness: test compression round-trip (practicalswift)
Pull request description:
Small fuzzing improvement:
Add amount compression/decompression fuzzing to existing fuzzing harness: test compression round-trip (`DecompressAmount(CompressAmount(…))`).
Make the domain of `CompressAmount(…)` explicit.
Amount compression primer:
```
Compact serialization for amounts
Special serializer/deserializer for amount values. It is optimized for
values which have few non-zero digits in decimal representation. Most
amounts currently in the txout set take only 1 or 2 bytes to
represent.
```
**How to test this PR**
```
$ make distclean
$ ./autogen.sh
$ CC=clang CXX=clang++ ./configure --enable-fuzz \
--with-sanitizers=address,fuzzer,undefined
$ make
$ src/test/fuzz/integer
…
```
Top commit has no ACKs.
Tree-SHA512: 0f7c05b97012ccd5cd05a96c209e6b4d7d2fa73138bac9615cf531baa3f614f9003e29a198015bcc083af9f5bdc752bb52615b82c5df3c519b1a064bd4fc6664
470e2ac602ed2d6e62e5c80f27cd0a60c7cf6bce tests: Avoid hitting some known minor tinyformat issues when fuzzing strprintf(...) (practicalswift)
Pull request description:
Avoid hitting some known minor tinyformat issues when fuzzing `strprintf(...)`. These can be removed when the issues have been resolved upstreams :)
Note to reviewers: The `%c` and `%*` issues are also present for `%<some junk>c` and `%<some junk>*`. That is why simply matching on `"%c"` or `"%*"` is not enough. Note that the intentionally trivial skipping logic overshoots somewhat (`c[…]%` is filtered in addition to `%[…]c`).
Top commit has no ACKs.
Tree-SHA512: 2b002981e8b3f2ee021c3013f1260654ac7e158699313849c9e9660462bb8cd521544935799bb8daa74925959dc04d63440e647495e0b008cfe1b8a8b2202d40
cc668d06fb71463fd406df761b0e89e25d4de968 tests: Add fuzzing harness for strprintf(...) (practicalswift)
ccc3c76e2b5d28a2372ae5752c08256396bf43e6 tests: Add fuzzer strprintf to FUZZERS_MISSING_CORPORA (temporarily) (practicalswift)
6ef04912af7f216f3112e0e9919f67e36415a792 tests: Update FuzzedDataProvider.h from upstream (LLVM) (practicalswift)
Pull request description:
Add fuzzing harness for `strprintf(…)`.
Update `FuzzedDataProvider.h`.
Avoid hitting some issues in tinyformat (reported upstreams in https://github.com/c42f/tinyformat/issues/70).
---
Found issues in tinyformat:
**Issue 1.** The following causes a signed integer overflow followed by an allocation of 9 GB of RAM (or an OOM in memory constrained environments):
```
strprintf("%.777777700000000$", 1.0);
```
**Issue 2.** The following causes a stack overflow:
```
strprintf("%987654321000000:", 1);
```
**Issue 3.** The following causes a stack overflow:
```
strprintf("%1$*1$*", -11111111);
```
**Issue 4.** The following causes a `NULL` pointer dereference:
```
strprintf("%.1s", (char *)nullptr);
```
**Issue 5.** The following causes a float cast overflow:
```
strprintf("%c", -1000.0);
```
**Issue 6.** The following causes a float cast overflow followed by an invalid integer negation:
```
strprintf("%*", std::numeric_limits<double>::lowest());
```
Top commit has no ACKs.
Tree-SHA512: 9b765559281470f4983eb5aeca94bab1b15ec9837c0ee01a20f4348e9335e4ee4e4fecbd7a1a5a8ac96aabe0f9eeb597b8fc9a2c8faf1bab386e8225d5cdbc18
4de934b9b5b4be1bac8fe205f4ee9a79e772dc34 Convert compression.h to new serialization framework (Pieter Wuille)
ca34c5cba5fbb9b046b074a234f06ecf6ed5d610 Add FORMATTER_METHODS, similar to SERIALIZE_METHODS, but for formatters (Pieter Wuille)
Pull request description:
This is the next piece of the puzzle from #10785. It includes:
* The `FORMATTER_METHODS` macro, similar to `SERIALIZE_METHODS`, for defining a formatter with a unified serialization/deserialization implementation.
* Updating `compression.h` to consist of 3 formatters, rather than old-style wrappers (`ScriptCompression`, `AmountCompression`, `TxOutCompression`).
ACKs for top commit:
laanwj:
code review ACK 4de934b9b5b4be1bac8fe205f4ee9a79e772dc34
ryanofsky:
Code review ACK 4de934b9b5b4be1bac8fe205f4ee9a79e772dc34. Only change since last review is removing REF usages
Tree-SHA512: d52ca21eb1ce87d9bc3c90d00c905bd4fada522759aaa144c02a58b4d738d5e8647c0558b8ce393c707f6e3c4d20bf93781a2dcc1e1dcbd276d9b5ffd0e02cd6