57d8b1f1b3375452213c48ce80e8207e2fe53ebc cmake: Avoid fuzzer "multiple definition of `main'" errors (Ryan Ofsky)
Pull request description:
This change builds libraries with `-fsanitize=fuzzer-no-link` instead of `-fsanitize=fuzzer` when the cmake `-DSANITIZERS=fuzzer` option is specified. This is necessary to make fuzzing and IPC cmake options compatible with each other and avoid CI failures in #30975 which enables IPC in the fuzzer CI build:
https://cirrus-ci.com/task/5366255504326656?logs=ci#L2817https://cirrus-ci.com/task/5233064575500288?logs=ci#L2384
The failures can also be reproduced by checking out #31741 and building with `cmake -B build -DBUILD_FOR_FUZZING=ON -DSANITIZERS=fuzzer -DENABLE_IPC=ON` with this fix reverted.
The fix updates the cmake build so when `-DSANITIZERS=fuzzer` is specified, the fuzz test binary is built with `-fsanitize=fuzzer` (so it can use libFuzzer's main function), and libraries are built with `-fsanitize=fuzzer-no-link` (so they can be linked into other executables with their own main functions).
Previously when `-DSANITIZERS=fuzzer` was specified, `-fsanitize=fuzzer` was applied to ALL libraries and executables. This was inappropriate because it made it impossible to build any executables other than the fuzz test executable without triggering link errors:
- `` multiple definition of `main' ``
- `` "undefined reference to `LLVMFuzzerTestOneInput' ``
if they depended on any libraries instrumented for fuzzing.
This was especially a problem when the `ENABLE_IPC` option was set because it made building the `mpgen` code generator impossible so nothing else that depended on generated sources, including the fuzz test binary, could be built either.
This commit was previously part of https://github.com/bitcoin/bitcoin/pull/31741 and had some discussion there starting in https://github.com/bitcoin/bitcoin/pull/31741#pullrequestreview-2619682385
---
This PR is part of the [process separation project](https://github.com/bitcoin/bitcoin/issues/28722).
ACKs for top commit:
hebasto:
ACK 57d8b1f1b3375452213c48ce80e8207e2fe53ebc, tested on Ubuntu 24.04.
Tree-SHA512: 4011adbc0b08742e83cf7c0560d3d5b5694a863358e6ac9a21239626b4a8fedceca66db34b5a46136a7b26849bb1d8710c894689322ae97e1c407687c3f57d50
This abstracts out the finding of the connected component that includes
a given element from FindConnectedComponent (which just finds any connected
component).
Use this in the txgraph fuzz test, which was effectively reimplementing this
logic. At the same time, improve its performance by replacing a vector with a
set.
b1de59e8965354fff5a149bc0fe61ed0704aea7a fuzz: extract unsequenced operations with side-effects (Lőrinc)
Pull request description:
https://github.com/bitcoin/bitcoin/pull/30746#discussion_r1817851827 introduced unsequenced operations with side-effects - which is undefined behavior, i.e. the right hand side can be evaluated before the left hand side, which happens to mutate it.
<details>
<summary>Tried to find other occurrences</summary>
```bash
clang++ --analyze -std=c++20 -I./src -I./src/test -I./src/test/fuzz src/test/fuzz/base_encode_decode.cpp src/psbt.cpp
```
but it didn't warn about UB.
Grepped for similar ones, but could find any other one in the codebase:
```bash
> grep -rnE --include='*.cpp' --include='*.h' '\b(\w+)\(([^)]*\b(\w+)\b[^)]*)\)\s*==\s*\3\.' .
./src/test/arith_uint256_tests.cpp:373: BOOST_CHECK(R1L.GetHex() == R1L.ToString());
./src/test/arith_uint256_tests.cpp:374: BOOST_CHECK(R2L.GetHex() == R2L.ToString());
./src/test/arith_uint256_tests.cpp:375: BOOST_CHECK(OneL.GetHex() == OneL.ToString());
./src/test/arith_uint256_tests.cpp:376: BOOST_CHECK(MaxL.GetHex() == MaxL.ToString());
./src/test/fuzz/cluster_linearize.cpp:565: assert(depgraph.FeeRate(best_anc.transactions) == best_anc.feerate);
./src/test/fuzz/cluster_linearize.cpp:646: assert(depgraph.FeeRate(found.transactions) == found.feerate);
./src/test/fuzz/cluster_linearize.cpp:765: assert(depgraph.FeeRate(chunk_info.transactions) == chunk_info.feerate);
./src/test/fuzz/base_encode_decode.cpp:95: assert(DecodeBase64PSBT(psbt, random_string, error) == error.empty());
./src/test/fuzz/key.cpp:102: assert(pubkey.data() == pubkey.begin());
./src/test/skiplist_tests.cpp:42: BOOST_CHECK(vIndex[from].GetAncestor(0) == vIndex.data());
./src/script/signingprovider.cpp:535: ComputeTapbranchHash(node.sub[1]->hash, node.sub[1]->hash) == node.hash) {
./src/pubkey.h:78: return vch.size() > 0 && GetLen(vch[0]) == vch.size();
./src/cluster_linearize.h:881: Assume(elem.inc.feerate.IsEmpty() == elem.pot_feerate.IsEmpty());
```
</details>
Hodlinator deduced the UB on Windows in https://github.com/bitcoin/bitcoin/issues/32135#issuecomment-2751723855Fixes#32135
ACKs for top commit:
maflcko:
lgtm ACK b1de59e8965354fff5a149bc0fe61ed0704aea7a
hodlinator:
ACK b1de59e8965354fff5a149bc0fe61ed0704aea7a
marcofleon:
Nice, ACK b1de59e8965354fff5a149bc0fe61ed0704aea7a
brunoerg:
code review ACK b1de59e8965354fff5a149bc0fe61ed0704aea7a
Tree-SHA512: d66524424c7f749eba870f5bd6038da79666ac638047b31dd8ff15a77d927facb54b4735e8afb7984648fdc9e2dd59ea213996c352301fa05978f041511361d4
b2ea3656481b4196acaf6a1b5f3949a9ba4cf48f txgraph: Add Get{Ancestors,Descendants}Union functions (feature) (Pieter Wuille)
54bceddd3ab39918834d72e9c77eb14e41996652 txgraph: Multiple inputs to Get{Ancestors,Descendant}Refs (preparation) (Pieter Wuille)
aded04701925781ffe194e11e4782261e4736339 txgraph: Add CountDistinctClusters function (feature) (Pieter Wuille)
b685d322c9739ca03b9d0bb9fa57aabea1890060 txgraph: Add DoWork function (feature) (Pieter Wuille)
295a1ca8bbbe5e61bd936158ca33cda5d5e58afd txgraph: Expose ability to compare transactions (feature) (Pieter Wuille)
22c68cd153b72f867dffcc7a62a3f65cef9038fb txgraph: Allow Refs to outlive the TxGraph (feature) (Pieter Wuille)
82fa3573e197f184054fc5096f13ea2520a8d219 txgraph: Destroying Ref means removing transaction (feature) (Pieter Wuille)
6b037ceddfd0160981bd401630c610ad2a3cf000 txgraph: Cache oversizedness of graphs (optimization) (Pieter Wuille)
8c70688965bc4038f28f41e4490180e40a88b5ee txgraph: Add staging support (feature) (Pieter Wuille)
c99c7300b4443f70e452cb97c42b9c2513b372d7 txgraph: Abstract out ClearLocator (refactor) (Pieter Wuille)
34aa3da5adea40615d80588bb0ff8b78d6d292a8 txgraph: Group per-graph data in ClusterSet (refactor) (Pieter Wuille)
36dd5edca5b00f4140f19f364ff93a5a7dd4bbe3 txgraph: Special-case removal of tail of cluster (Optimization) (Pieter Wuille)
5801e0fb2b99f44ac24531779acf0d44ec35b98c txgraph: Delay chunking while sub-acceptable (optimization) (Pieter Wuille)
57f5499882afe170612e0afd4ef6d91561738288 txgraph: Avoid looking up the same child cluster repeatedly (optimization) (Pieter Wuille)
1171953ac6091950f06646a8cc85ca10683023ce txgraph: Avoid representative lookup for each dependency (optimization) (Pieter Wuille)
64f69ec8c383436d1a657add1b8a7eee3e75f61f txgraph: Make max cluster count configurable and "oversize" state (feature) (Pieter Wuille)
1d27b74c8e3bf055fb8b0a5fc5d664bd5048bec6 txgraph: Add GetChunkFeerate function (feature) (Pieter Wuille)
c80aecc24ddd878c62be9753a2746e36860e3a97 txgraph: Avoid per-group vectors for clusters & dependencies (optimization) (Pieter Wuille)
ee57e93099f243cf9fbf9c10265057a53f06e062 txgraph: Add internal sanity check function (tests) (Pieter Wuille)
05abf336f997f477c6f48412809ab540fccf1cb0 txgraph: Add simulation fuzz test (tests) (Pieter Wuille)
8ad3ed26818a620cb973cd4e5eaa7b49313f562b txgraph: Add initial version (feature) (Pieter Wuille)
6eab3b2d7380b8ff818e3a1cefeb7731b7342e04 feefrac: Introduce tagged wrappers to distinguish vsize/WU rates (Pieter Wuille)
d4497738999873c8432d02fd71e14f1afc2065a8 scripted-diff: (refactor) ClusterIndex -> DepGraphIndex (Pieter Wuille)
bfeb69f6e00d94b94171cebf351fac69bec489cc clusterlin: Make IsAcyclic() a DepGraph member function (Pieter Wuille)
0aa874a357865dd4768091f26dff238e66fb8d83 clusterlin: Add FixLinearization function + fuzz test (Pieter Wuille)
Pull request description:
Part of cluster mempool: #30289.
### 1. Overview
This introduces the `TxGraph` class, which encapsulates knowledge about the (effective) fees, sizes, and dependencies between all mempool transactions, but nothing else. In particular, it lacks knowledge about `CTransaction`, inputs, outputs, txids, wtxids, prioritization, validatity, policy rules, and a lot more. Being restricted to just those aspects of the mempool makes the behavior very easy to fully specify (ignoring the actual linearizations produced), and write simulation-based tests for (which are included in this PR).
### 2. Interface
The interface can be largely categorized into:
* Mutation functions:
* `AddTransaction` (add a new transaction with specified feerate, and get a `Ref` object back to identify it).
* `RemoveTransaction` (given a `Ref` object, remove the transaction).
* `AddDependency` (given two `Ref` objects, add a dependency between them).
* `SetTransactionFee` (modify the fee associated with a Ref object).
* Inspector functions:
* `GetAncestors` (get the ancestor set in the form of `Ref*` pointers)
* `GetAncestorsUnion` (like above, but for the union of ancestors of multiple `Ref*` pointers)
* `GetDescendants` (get the descendant set in the form of `Ref*` pointers)
* `GetDescendantsUnion` (like above, but for the union of ancestors of multiple `Ref*` pointers)
* `GetCluster` (get the connected component set in the form of `Ref*` pointers, in the order they would be mined).
* `GetIndividualFeerate` (get the feerate of a transaction)
* `GetChunkFeerate` (get the mining score of a transaction)
* `CountDistinctClusters` (count the number of distinct clusters a list of `Ref`s belong to)
* Staging functions:
* `StartStaging` (make all future mutations operate on a proposed transaction graph)
* `CommitStaging` (apply all the changes that are staged)
* `AbortStaging` (discard all the changes that are staged)
* Miscellaneous functions:
* `DoWork` (do queued-up computations now, so that future operations are fast)
This `TxGraph::Ref` type used as a "handle" on transactions in the graph can be inherited from, and the idea is that in the full cluster mempool implementation (#28676, after it is rebased on this), `CTxMempoolEntry` will inherit from it, and all actually used Ref objects will be `CTxMempoolEntry`s. With that, the mempool code can just cast any `Ref*` returned by txgraph to `CTxMempoolEntry*`.
### 3. Implementation
Internally the graph data is kept in clustered form (partitioned into connected components), for which linearizations are maintained and updated as needed using the `cluster_linearize.h` algorithms under the hood, but this is hidden from the users of this class. Implementation-wise, mutations are generally applied lazily, appending to queues of to-be-removed transactions and to-be-added dependencies, so they can be batched for higher performance. Inspectors will generally only evaluate as much as is needed to answer queries, with roughly 5 levels of processing to go to fully instantiated and acceptable cluster linearizations, in order:
1. `ApplyRemovals` (take batches of to-be-removed transactions and translate them to "holes" in the corresponding Clusters/DepGraphs).
2. `SplitAll` (creating holes in Clusters may cause them to break apart into smaller connected components, so make turn them into separate Clusters/linearizations).
3. `GroupClusters` (figure out which Clusters will need to be combined in order to add requested to-be-added dependencies, as these may span clusters).
4. `ApplyDependencies` (actually merge Clusters as precomputed by `GroupClusters`, and add the dependencies between them).
5. `MakeAcceptable` (perform the LIMO linearization algorithm on Clusters to make sure their linearizations are acceptable).
### 4. Future work
This is only an initial version of TxGraph, and some functionality is missing before #28676 can be rebased on top of it:
* The ability to get comparative feerate diagrams before/after for the set of staged changes (to evaluate RBF incentive-compatibility).
* Mining interface (ability to iterate transactions quickly in mining score order) (see #31444).
* Eviction interface (reverse of mining order, plus memory usage accounting) (see #31444).
* Ability to fix oversizedness of clusters (before or after committing) - this is needed for reorgs where aborting/rejecting the change just is not an option (see #31553).
* Interface for controlling how much effort is spent on LIMO. In this PR it is hardcoded.
Then there are further improvements possible which would not block other work:
* Making Cluster a virtual class with different implementations based on transaction count (which could dramatically reduce memory usage, as most Clusters are just a single transaction, for which the current implementation is overkill).
* The ability to have background thread(s) for improving cluster linearizations.
ACKs for top commit:
instagibbs:
reACK b2ea3656481b4196acaf6a1b5f3949a9ba4cf48f
ajtowns:
reACK b2ea3656481b4196acaf6a1b5f3949a9ba4cf48f
ismaelsadeeq:
reACK b2ea3656481b4196acaf6a1b5f3949a9ba4cf48f 🚀
glozow:
ACK b2ea3656481
Tree-SHA512: 0f86f73d37651fe47d469db1384503bbd1237b4556e5d50b1d0a3dd27754792d6fc3481f77a201cf2ed36c6ca76e0e44c30e175d112aacb53dfdb9e11d8abc6b
https://github.com/bitcoin/bitcoin/pull/30746#discussion_r1817851827 introduced an unsequenced operations with side-effects - which is undefined behavior, i.e. the right hand side can be evaluated before the left hand side, which happens to mutate it.
Tried:
```
clang++ --analyze -std=c++20 -I./src -I./src/test -I./src/test/fuzz src/test/fuzz/base_encode_decode.cpp src/psbt.cpp
```
but it didn't warn about UB.
Grepped for similar ones, but could find any other one in the codebase:
> grep -rnE --include='*.cpp' --include='*.h' '\b(\w+)\(([^)]*\b(\w+)\b[^)]*)\)\s*==\s*\3\.' .
```
./src/test/arith_uint256_tests.cpp:373: BOOST_CHECK(R1L.GetHex() == R1L.ToString());
./src/test/arith_uint256_tests.cpp:374: BOOST_CHECK(R2L.GetHex() == R2L.ToString());
./src/test/arith_uint256_tests.cpp:375: BOOST_CHECK(OneL.GetHex() == OneL.ToString());
./src/test/arith_uint256_tests.cpp:376: BOOST_CHECK(MaxL.GetHex() == MaxL.ToString());
./src/test/fuzz/cluster_linearize.cpp:565: assert(depgraph.FeeRate(best_anc.transactions) == best_anc.feerate);
./src/test/fuzz/cluster_linearize.cpp:646: assert(depgraph.FeeRate(found.transactions) == found.feerate);
./src/test/fuzz/cluster_linearize.cpp:765: assert(depgraph.FeeRate(chunk_info.transactions) == chunk_info.feerate);
./src/test/fuzz/base_encode_decode.cpp:95: assert(DecodeBase64PSBT(psbt, random_string, error) == error.empty());
./src/test/fuzz/key.cpp:102: assert(pubkey.data() == pubkey.begin());
./src/test/skiplist_tests.cpp:42: BOOST_CHECK(vIndex[from].GetAncestor(0) == vIndex.data());
./src/script/signingprovider.cpp:535: ComputeTapbranchHash(node.sub[1]->hash, node.sub[1]->hash) == node.hash) {
./src/pubkey.h:78: return vch.size() > 0 && GetLen(vch[0]) == vch.size();
./src/cluster_linearize.h:881: Assume(elem.inc.feerate.IsEmpty() == elem.pot_feerate.IsEmpty());
```
Hodlinator deduced the UB on Windows in https://github.com/bitcoin/bitcoin/issues/32135#issuecomment-2751723855
Co-authored-by: Hodlinator <172445034+hodlinator@users.noreply.github.com>
In order to make it possible for higher layers to compare transaction quality
(ordering within the implicit total ordering on the mempool), expose a comparison
function and test it.
Before this commit, if a TxGraph::Ref object is destroyed, it becomes impossible
to refer to, but the actual corresponding transaction node in the TxGraph remains,
and remains indefinitely as there is no way to remove it.
Fix this by making the destruction of TxGraph::Ref trigger immediate removal of
the corresponding transaction in TxGraph, both in main and staging if it exists.
In order to make it easy to evaluate proposed changes to a TxGraph, introduce a
"staging" mode, where mutators (AddTransaction, AddDependency, RemoveTransaction)
do not modify the actual graph, but just a staging version of it. That staging
graph can then be commited (replacing the main one with it), or aborted (discarding
the staging).
Instead of leaving the responsibility on higher layers to guarantee that
no connected component within TxGraph (a barely exposed concept, except through
GetCluster()) exceeds the cluster count limit, move this responsibility to
TxGraph itself:
* TxGraph retains a cluster count limit, but it becomes configurable at construction
time (this primarily helps with testing that it is properly enforced).
* It is always allowed to perform mutators on TxGraph, even if they would cause the
cluster count limit to be exceeded. Instead, TxGraph exposes an IsOversized()
function, which queries whether it is in a special "oversize" state.
* During oversize state, many inspectors are unavailable, but mutators remain valid,
so the higher layer can "fix" the oversize state before continuing.
To make testing more powerful, expose a function to perform an internal sanity
check on the state of a TxGraph. This is especially important as TxGraphImpl
contains many redundantly represented pieces of information:
* graph contains clusters, which refer to entries, but the entries refer back
* graph maintains pointers to Ref objects, which point back to the graph.
This lets us make sure they are always in sync.
This adds a simulation fuzz test for txgraph, by comparing with a naive
reimplementation that models the entire graph as a single DepGraph, and
clusters in TxGraph as connected components within that DepGraph.
Since cluster_linearize.h does not actually have a Cluster type anymore, it is more
appropriate to rename the index type to DepGraphIndex.
-BEGIN VERIFY SCRIPT-
sed -i 's/Data type to represent transaction indices in clusters./Data type to represent transaction indices in DepGraphs and the clusters they represent./' $(git grep -l 'using ClusterIndex')
sed -i 's|\<ClusterIndex\>|DepGraphIndex|g' $(git grep -l 'ClusterIndex')
-END VERIFY SCRIPT-
This function takes an existing ordering for transactions in a DepGraph, and
makes it a valid linearization for it (i.e., topological). Any topological
prefix of the input remains untouched.
63b534f97e591d4e107fd5148909852eb2965d27 fuzz: sanity check hardcoded snapshot in utxo_snapshot target (Antoine Poinsot)
3b85eba83abe561078c91f5a5c49cf26c682c19b test util: split up ConnectBlock from MineBlock (Antoine Poinsot)
d1527f6b88656ff4aab3c671c6d9780ea2ae986e qa: correct off-by-one in utxo snapshot fuzz target (Antoine Poinsot)
Pull request description:
The assumeutxo data for the fuzz target could change and invalidate the hash silently, preventing the fuzz target from reaching some code paths. Fix this by introducing a unit test which would break if the snapshot data the fuzz target relies on were to change.
In implementing this i noticed the height used for coins in the fuzz target is actually off-by-one (as if the first block in the created chain was the genesis but it's block `1`), so fix that too.
ACKs for top commit:
mzumsande:
Code Review ACK 63b534f97e591d4e107fd5148909852eb2965d27
fjahr:
tACK 63b534f97e591d4e107fd5148909852eb2965d27
Tree-SHA512: 2399b6e74db9b78aab8efba67c57a405d2d7d880ae3b7d8518a1c96cc6266f61f5e77722cd999adeac5d3e03e73d84cf9ae7bdbcc0afae198cc87049dde4012f
fa4fb6a8f15c295a2a3d3ffd737e115b8be46c1f fuzz: Use serial task runner to increase fuzz stability (MarcoFalke)
Pull request description:
Leaking a scheduler with a non-empty queue from the fuzz initialization phase into the fuzz target execution phase is problematic, because it messes with coverage data. This in turn is problematic, because it leads to:
* Decrease in fuzz target execution stability (non-determinism when running the fuzz target).
* Decrease in fuzz input merge stability (non-determinism when selecting a minimum set of fuzz input to reach maximum coverage), which leads to qa-assets bloat.
Fix one such issue. Tracking issue: https://github.com/bitcoin/bitcoin/issues/29018
Can be tested via: `RUST_BACKTRACE=1 cargo run --manifest-path ./contrib/devtools/deterministic-fuzz-coverage/Cargo.toml -- $PWD/bld-cmake $PWD/../b-c-qa-assets/fuzz_corpora/ partially_downloaded_block`.
The failure is non-deterministic (obviously) and will show coverage in validation signals such as `UpdatedBlockTip` before this change and will have this one fixed after this change.
ACKs for top commit:
marcofleon:
ACK fa4fb6a8f15c295a2a3d3ffd737e115b8be46c1f
dergoegge:
Code review ACK fa4fb6a8f15c295a2a3d3ffd737e115b8be46c1f
Tree-SHA512: fd1f66562c1d3c21553c7dd324399cdc16faa2fedfdb8e7544ea6a68b8b356e7c81d81815ecf70e0d334307dab6b275c1889b3b889b6f15eec514beee22c95f4
ffff4a293ad878494e12f8f00108cc99ee2b713e bench: Update span-serialize comment (MarcoFalke)
fa4d6ec97bcb1790a7cd4363a13fda7c80c3dd90 refactor: Avoid false-positive gcc warning (MarcoFalke)
fa942332b40c97375af0722f32f7575bca3af819 scripted-diff: Bump copyright headers after std::span changes (MarcoFalke)
fa0c6b7179c062b7ca92d120455ce02a9f4e9e19 refactor: Remove unused Span alias (MarcoFalke)
fade0b5e5e6e80e3da1ab6448b6212244bafa5d3 scripted-diff: Use std::span over Span (MarcoFalke)
fadccc26c03db00a2be3f703aa7e5eec4312bd2e refactor: Make Span an alias of std::span (MarcoFalke)
fa27e36717ec18d64b7ff7bba71b8f0c202ba31d test: Fix broken span_tests (MarcoFalke)
fadf02ef8bf96ad5b3b8e34fd425b31b555f4371 refactor: Return std::span from MakeUCharSpan (MarcoFalke)
fa720b94be17fa9e7c91188710e6a04939ceab11 refactor: Return std::span from MakeByteSpan (MarcoFalke)
Pull request description:
`Span` has some issues:
* It does not support fixed-size spans, which are available through `std::span`.
* It is confusing to have it available and in use at the same time with `std::span`.
* It does not obey the standard library iterator build hardening flags. See https://github.com/bitcoin/bitcoin/issues/31272 for a discussion. For example, this allows to catch issues like the one fixed in commit fabeca3458b38a3d8930cb0cbc866388c3f120f1.
Both types are type-safe and can even implicitly convert into each other in most contexts.
However, exclusively using `std::span` seems less confusing, so do it here with a scripted-diff.
ACKs for top commit:
l0rinc:
reACK ffff4a293ad878494e12f8f00108cc99ee2b713e
theuni:
ACK ffff4a293ad878494e12f8f00108cc99ee2b713e.
Tree-SHA512: 9cc2f1f43551e2c07cc09f38b1f27d11e57e9e9bc0c6138c8fddd0cef54b91acd8b14711205ff949be874294a121910d0aceffe0e8914c4cff07f1e0e87ad5b8
fac3d93c2ba84899c2c6516b5449f61ef653d9fa fuzz: Speed up *_package_eval fuzz targets a bit (MarcoFalke)
fa40fd043ab23eb8948c208ca82f75f3d40bb2e4 fuzz: [refactor] Avoid confusing c-style cast (MarcoFalke)
Pull request description:
Each target is at least 10% faster for me when running over the current set of qa-assets, which seems nice.
The changes `outpoints_value` from a map to an unordered map, which is safe, because the element order is not used in the fuzz test and the map is only used for lookup.
(`mempool_outpoints` can't be changed, because the order matters here. Using unordered_set here may result in a non-deterministic fuzz target, given the same fuzz input.)
ACKs for top commit:
l0rinc:
ACK fac3d93c2ba84899c2c6516b5449f61ef653d9fa
dergoegge:
Code review ACK fac3d93c2ba84899c2c6516b5449f61ef653d9fa
Tree-SHA512: 8ae5d4e281505aff76a4003d6e9ea388dbb73860e167385bd6a0a201b3acc939db29ee212594952a9e80e85b3cc4cd726ce6dd49551f74013cb4da8a15cbdfb3
21e9d39a3725cd6107b742f0cb97f65b3640201b docs: add release notes for 31603 (brunoerg)
a8b548d75d9a376c9bb66e06bb918c876416d615 test: `getdescriptorinfo`/`importdescriptors` with whitespace in pubkeys (brunoerg)
c7afca3d62cf5d3ea9b98d5a76e4e54cac07bc3c test: descriptor: check whitespace into keys (brunoerg)
cb722a3cea16a04844c83e56fd6deaa1f0dc0a7e descriptor: check whitespace in ParsePubkeyInner (brunoerg)
50856695ef6c02ecbaa0cf448567355b6b86b510 test: fix descriptors in `ismine_tests` (brunoerg)
Pull request description:
Currently, we successfully parse descriptors which contains spaces in the beginning or end of the public/private key within a fragment (e.g. `pk( KEY)`, `pk(KEY )` or `pk( KEY )`). I have noticed that one of the reasons is that the `DecodeBase58` function simply ignore these whitespaces.
This PR changes the `ParsePubkeyInner ` to reject pubkeys that contain a whitespace at the beginning and/or at the end. We will only check the whitespace in some RPCs (e.g. `importdescriptors`), but an already imported descriptor won't be affected by this check, especially because we store descriptors from `ToString`.
For context: https://github.com/brunoerg/bitcoinfuzz/issues/72
ACKs for top commit:
rkrux:
tACK 21e9d39a3725cd6107b742f0cb97f65b3640201b
darosior:
re-ACK 21e9d39a3725cd6107b742f0cb97f65b3640201b
sipa:
utACK 21e9d39a3725cd6107b742f0cb97f65b3640201b
Tree-SHA512: 54f48a89a235517e5cdc29a46dceeb7dabbee93c7616a166288ff3f90131808eb0ece43b0797a11fe827a5f7bd51d65e3e75c16789b0a42020934cabb684cc8f
54e6eacc1fccd602897d9e3025c62f83194ffd5b test: Enable ResetCoverageCounters beyond Linux (janb84)
Pull request description:
In PR [#31901](https://github.com/bitcoin/bitcoin/pull/31901), Coverage.cpp was introduced as a separate utility file, based on existing code. However, the macro defined in Coverage.cpp was limited to Clang and Linux, which caused issues for users on macOS when using the newly introduced deterministic test tooling.
This change adds fallback functions which are used when building without code coverage on non linux env.
This adds support for macOS to ResetCoverageCounters. ResetCoverageCounters is used by the unit tests in `g_rng_temp_path_init` to support the deterministic unit test tooling. It is also used in fuzz tests to completely suppress coverage from anything init-related.
See [Readme](https://github.com/bitcoin/bitcoin/blob/master/contrib/devtools/README.md) on how to test this for deterministic unit & fuzz test.
Suggestion for test files:
- for unit test: `util_string_tests`
- for fuzz test: `addition_overflow `
These files should give deterministic results
ACKs for top commit:
maflcko:
review-only ACK 54e6eacc1fccd602897d9e3025c62f83194ffd5b
hodlinator:
re-ACK 54e6eacc1fccd602897d9e3025c62f83194ffd5b
Tree-SHA512: dd71da6f76d4fc9e64bf521bbfe5e7483d77c2ca0380f9e692502e64b529068ea33f21b19399481feb7c6780a23d893d8b7f733cef641a2db18a13397c98deea
4cd95a2921805f447a8bcecc6b448a365171eb93 refactor: modernize remaining outdated trait patterns (Lőrinc)
ab2b67fce20fd7d8017f8a26425cab99e91f420d scripted-diff: modernize outdated trait patterns - values (Lőrinc)
8327889f358289f918d04ddb9469fb5562720bf4 scripted-diff: modernize outdated trait patterns - types (Lőrinc)
Pull request description:
The use of [`std::underlying_type_t<T>`](https://en.cppreference.com/w/cpp/types/underlying_type) or [`std::is_enum_v<T>`](https://en.cppreference.com/w/cpp/types/is_enum) (and similar ones, introduced in C++14) replace the `typename std::underlying_type<T>::type` and `std::is_enum<T>::value` constructs (available in C++11).
The `_t` and `_v` helper alias templates offer a more concise way to extract the type and value directly.
I've modified the instances I found in the codebase one-by-one (noticed them while investigating https://github.com/bitcoin/bitcoin/pull/31868), and afterwards extracted scripted diff commits to do the trivial ones automatically.
The last commit contains the values that were easier done manually.
I've excluded changes from `src/bench/nanobench.h`, `src/leveldb`, `src/minisketch`, `src/span.h` and `src/sync.h` - let me know if you think they should be included instead.
A few of the code changes can also be reproduced by clang-tidy (but not all of them):
```bash
cmake -B build -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_EXPORT_COMPILE_COMMANDS=ON -DBUILD_BENCH=ON -DBUILD_FUZZ_BINARY=ON -DBUILD_FOR_FUZZING=ON && cmake --build build -j$(nproc)
run-clang-tidy -quiet -p build -j $(nproc) -checks='-*,modernize-type-traits' -fix $(git grep -lE '::(value|type)' ./src ':(exclude)src/bench/nanobench.h' ':(exclude)src/leveldb' ':(exclude)src/minisketch' ':(exclude)src/span.h' ':(exclude)src/sync.h')
```
ACKs for top commit:
laanwj:
Concept and code review ACK 4cd95a2921805f447a8bcecc6b448a365171eb93
Tree-SHA512: a4bcf0f267c0f4e02983b4d548ed6f58d464ec379ac5cd1f998b9ec0cf698b53a9f2557a05a342b661f1d94adefc9a0ce2dc8f764d49453aaea95451e2c4c581
Non-Linux linkers require a fallback implementation for when coverage is not enabled.
The fallbacks are marked weak to have lower precedence than built-in implementations when available, removing ambiguity from the linker.
d5537c18a9034647ba4c9ed4008abd7fee33989e fuzz: make sure DecodeBase58(Check) is called with valid values more often (Lőrinc)
bad1433ef2b5b02ac4b1c6c1d9482c513e5b2192 fuzz: Always restrict base conversion input lengths (Lőrinc)
Pull request description:
This is a follow-up to https://github.com/bitcoin/bitcoin/pull/30746, expanding coverage by:
* restricting every input for the base58 conversions, capping max sizes to `100` instead of `1000` or all available input (suggested by marcofleon in https://github.com/bitcoin/bitcoin/pull/30746#discussion_r1963718683) since most actual usage has lengths of e.g. `21`, `34`, `78`.
* providing more valid values to the decoder (suggested by maflcko in https://github.com/bitcoin/bitcoin/pull/30746#discussion_r1957847712) by randomly providing a random input or a valid encoded one; this also enables unifying the roundtrip tests to a single roundtrip per fuzz.
ACKs for top commit:
mzumsande:
Code Review / lightly tested ACK d5537c18a9034647ba4c9ed4008abd7fee33989e
maflcko:
review ACK d5537c18a9034647ba4c9ed4008abd7fee33989e 🚛
Tree-SHA512: 50365654cdac8a38708a7475eaa43396642b7337e2ee8999374c3faafff4f05457abc1a54c701211e0ed24d36c12af77bcad17b49695699be42664f2be660659
3c5d1a468199722da620f1f3d8ae3319980a46d5 Remove checkpoints (marcofleon)
632ae47372de90064f61e3e622d8da766d1d12de update comment on MinimumChainWork check (marcofleon)
Pull request description:
The headers presync logic (only downloading headers that lead to a chain with sufficient work, implemented in https://github.com/bitcoin/bitcoin/pull/25717) should be enough to prevent memory DoS using low-work headers. Therefore, we no longer have any use for checkpoints.
All checkpoints and checkpoint logic are removed in a single commit, to make it easy to revert if necessary.
Some previous discussion can be found in https://github.com/bitcoin/bitcoin/pull/25725. The conclusion at the time was that more testing of the presync logic was needed. Now that we have [unit](https://github.com/bitcoin/bitcoin/blob/master/src/test/headers_sync_chainwork_tests.cpp), [functional](https://github.com/bitcoin/bitcoin/blob/master/test/functional/p2p_headers_sync_with_minchainwork.py), and [fuzz](https://github.com/bitcoin/bitcoin/blob/master/src/test/fuzz/p2p_headers_presync.cpp) tests for this logic, it seems safe to move forward with checkpoint removal.
ACKs for top commit:
Sjors:
Code review ACK 3c5d1a468199722da620f1f3d8ae3319980a46d5
instagibbs:
reACK 3c5d1a468199722da620f1f3d8ae3319980a46d5
dergoegge:
ACK 3c5d1a468199722da620f1f3d8ae3319980a46d5
Tree-SHA512: 051a6f9b82cd0262f4d3be4403906812fc6d1be022731fac16bb1c02bca471f31dfc7fc4b834ab2469e8f087265a6d99e84a1d665823cda1b112363a8e8f337d
fa99c3b544b631cfe34d52fb5e71636aedb1b423 test: Exclude SeedStartup from coverage counts (MarcoFalke)
fa579d663d716c967ccd45d67b46e779e2fa0b48 contrib: Add deterministic-unittest-coverage (MarcoFalke)
fa3940b1cbc94c8ccfde36be1db1adca04fbcaa6 contrib: deterministic-fuzz-coverage fixups (MarcoFalke)
faf905b9b694313bed4531d1299568a101f33fb8 doc: Remove unused -fPIC (MarcoFalke)
fa1e0a72281fde13d704c7766d4d704e009274da gitignore: target/ (MarcoFalke)
Pull request description:
The `contrib/devtools/test_deterministic_coverage.sh` script is problematic:
* It is written in bash. This can lead to issues when running with the ancient bash version shipped by macOS by default, or can lead to other compatibility issues, such as https://github.com/bitcoin/bitcoin/pull/31588#discussion_r1946784827. Also, pipefail isn't set, so IO errors may be silently ignored.
* It is based on gcov. This can lead to issues, such as https://github.com/bitcoin/bitcoin/pull/31588#pullrequestreview-2602169248 (possibly due to prefix-map), or https://github.com/bitcoin/bitcoin/pull/31588#issuecomment-2646395385 (gcovr processing error), or https://github.com/bitcoin/bitcoin/pull/31588#pullrequestreview-2605954001 (gcovr assertion error).
* The script is severely outdated, with the last update to `NON_DETERMINISTIC_TESTS` being in the prior decade.
Instead of patching around all issues one-by-one, just provide a fresh rewrite, based on the recently added `deterministic-fuzz-coverage` tool based on clang, llvm-cov, and llvm-profdata. (Initial feedback indicates that this is a more promising attempt: https://github.com/bitcoin/bitcoin/pull/31588#issuecomment-2649356408 and https://github.com/bitcoin/bitcoin/pull/31588#issuecomment-2649354598).
The new tool also sets `RANDOM_CTX_SEED=21` as suggested by hodlinator in https://github.com/bitcoin/bitcoin/pull/31588#issuecomment-2650784726.
ACKs for top commit:
Prabhat1308:
Concept ACK [`fa99c3b`](fa99c3b544)
hodlinator:
re-ACK fa99c3b544b631cfe34d52fb5e71636aedb1b423
brunoerg:
light ACK fa99c3b544b631cfe34d52fb5e71636aedb1b423
dergoegge:
tACK fa99c3b544b631cfe34d52fb5e71636aedb1b423
janb84:
Concept ACK [fa99c3b](fa99c3b544)
Tree-SHA512: 491d5e6413d929395a5c7caea54817bdc1a0e00562c9728a374d4e92f2e2017dba4a770ecdb2e7317e049df9fdeb390d83c90dff9aa5709f97aa3f6a0e70cdb4
cadbd4137d84b71be26effd6a2ae177d5031345e miner: have waitNext return after 20 min on testnet (Sjors Provoost)
d4020f502a63cb4390ec241fc5f989e988afa022 Add waitNext() to BlockTemplate interface (Sjors Provoost)
Pull request description:
This PR introduces `waitNext()`. It waits for either the tip to update or for fees at the top of the mempool to rise sufficiently. It then returns a new template, with which the caller can rinse and repeat.
On testnet3 and testnet4 the difficulty drops after 20 minutes, so the second ensures that a new template is returned in that case.
Alternative approach to #31003, suggested in https://github.com/bitcoin/bitcoin/issues/31109#issuecomment-2451942362
ACKs for top commit:
ryanofsky:
Code review ACK cadbd4137d84b71be26effd6a2ae177d5031345e. Main change since last review is adding back a missing `m_interrupt` check in the waitNext loop. Also made various code cleanups in both commits.
ismaelsadeeq:
Code review ACK cadbd4137d84b71be26effd6a2ae177d5031345e
vasild:
ACK cadbd4137d84b71be26effd6a2ae177d5031345e
Tree-SHA512: c5a40053723c1c1674449ba1e4675718229a2022c8b0a4853b12a2c9180beb87536a1f99fde969a0ef099bca9ac69ca14ea4f399d277d2db7f556465ce47de95
Historically, the headers have been bumped some time after a file has
been touched. Do it now to avoid having to touch them again in the
future for that reason.
-BEGIN VERIFY SCRIPT-
sed -i --regexp-extended 's;( 20[0-2][0-9])(-20[0-2][0-9])? The Bitcoin Core developers;\1-present The Bitcoin Core developers;g' $( git show --pretty="" --name-only HEAD~1 )
-END VERIFY SCRIPT-
This uses a macro, which can be a bit more brittle than an alias
template. However, class template argument deduction for alias templates
is only implemented in clang-19.
* The comment is wrong claiming that void* was returned when void was
returned in reality.
* The namespace is missing a name, leading to compile errors that are
suppressed with non-standard pragmas, and leading to compile errors in
future commits. Instead of using more non-standard suppressions, just
add the missing name.
* The SpanableYes/No types are missing begin/end iterators, which will
be needed when using std::span.
In theory this commit should only touch the span.h header, because
std::span can implicilty convert into Span in most places, if needed.
However, at least when using the clang compiler, there are some
false-positive lifetimebound warnings and some implicit conversions can
not be resolved.
Thus, this refactoring commit also changed the affected places to
replace Span with std::span.
e637dc2c01c3b566e6c51c911c5881a8d206c924 refactor: Replace uint256 type with Wtxid in PackageMempoolAcceptResult struct (marcofleon)
a3baead7cb8376e3b09f1726b8c466648d187524 validation: use wtxid instead of txid in CheckEphemeralSpends (marcofleon)
Pull request description:
This PR addresses a small bug in [`AcceptMultipleTransactions`](45719390a1/src/validation.cpp (L1598)) where a txid was being inserted into a map that should only hold wtxids. `CheckEphemeralSpends` has an out parameter on failure that records that the child transaction did not spend the parent's dust. Instead of using the txid of this child, use its wtxid.
The second commit in this PR is a refactor of the `PackageMempoolAcceptResult` struct to use the `Wtxid` type instead of `uint256`. This helps to prevent errors like this in the future.
ACKs for top commit:
instagibbs:
ACK e637dc2c01
glozow:
ACK e637dc2c01c, hooray for type safety
dergoegge:
Code review ACK e637dc2c01c3b566e6c51c911c5881a8d206c924
Tree-SHA512: 17039efbb241b7741e2610be5a6d6f88f4c1cbe22d476931ec99e43f993d259a1a5e9334e1042651aff49edbdf7b9e1c1cd070a28dcba5724be6db842e4ad1e0
568fcdddaec2cc8decba5a098257f31729cc1caa scripted-diff: Adjust documentation per top-level target output location (Hennadii Stepanov)
026bb226e96919603af829d0b677779a234a0f6e cmake: Set top-level target output locations (Hennadii Stepanov)
Pull request description:
This PR sets the target output locations to the `bin` and `lib` subdirectories within the build tree, creating a directory structure that mirrors that of the installed targets.
This approach is widely adopted by the large projects, such as [LLVM](e146c1867e/lldb/cmake/modules/LLDBStandalone.cmake (L128-L130)):
```cmake
set(CMAKE_RUNTIME_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR}/bin)
set(CMAKE_LIBRARY_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR}/lib${LLVM_LIBDIR_SUFFIX})
set(CMAKE_ARCHIVE_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR}/lib${LLVM_LIBDIR_SUFFIX})
```
The `libsecp256k1` project has also recently [adopted](https://github.com/bitcoin-core/secp256k1/pull/1553) this approach.
With this PR, all binaries are conveniently located. For example, run:
```
$ ./build/bin/fuzz
```
instead of:
```
$ ./build/src/test/fuzz/fuzz
```
On Windows, all required DLLs are now located in the same directory as the executables, allowing to run `bitcoin-chainstate.exe` (which loads `bitcoinkernel.dll`) without the need to copy DLLs or modify the `PATH` variable.
The idea was briefly discussed among the build team during the recent CoreDev meeting.
---
**Warning**: This PR changes build locations of newly built executables like `bitcoind` and `test_bitcoin` from `src/` to `bin/` without deleting previously built executables. A clean build is recommended to avoid accidentally running old binaries.
ACKs for top commit:
theStack:
Light re-ACK 568fcdddaec2cc8decba5a098257f31729cc1caa
ryanofsky:
Code review ACK 568fcdddaec2cc8decba5a098257f31729cc1caa. Only change since last review was rebasing. I'm ok with this PR in its current form if other developers are happy with it. I just personally think it is inappropriate to \*silently\* break an everyday developer workflow like `git pull; make bitcoind`. I wouldn't have a problem with this PR if it triggered an explicit error, or if the problem was limited to less common workflows like changing cmake options in an existing build.
TheCharlatan:
Re-ACK 568fcdddaec2cc8decba5a098257f31729cc1caa
theuni:
ACK 568fcdddaec2cc8decba5a098257f31729cc1caa
Tree-SHA512: 1aa5ecd3cd49bd82f1dcc96c8e171d2d19c58aec8dade4bc329df89311f9e50cbf6cf021d004c58a0e1016c375b0fa348ccd52761bcdd179c2d1e61c105e3b9f
In Base58 fuzz the two roundtrips are merged now, the new `decode_input` switches between a completely random input and a valid encoded one, to make sure the decoding passes more often.
The `max_ret_len` can also exceed the original length now and is being validated more thoroughly.
Co-authored-by: maflcko <6399679+maflcko@users.noreply.github.com>
Co-authored-by: marcofleon <marleo23@proton.me>
They seem to cause timeouts:
> Issue 397734700: bitcoin-core:base58check_encode_decode: Timeout in base58check_encode_decode
The `encoded_string.empty()` check was corrected here to `decoded.empty()` to make sure the `(0, decoded.size() - 1)` range is always valid.
Co-authored-by: maflcko <6399679+maflcko@users.noreply.github.com>
Co-authored-by: marcofleon <marleo23@proton.me>
Co-authored-by: Martin Zumsande <mzumsande@gmail.com>
This change builds libraries with -fsanitize=fuzzer-no-link instead of
-fsanitize=fuzzer when the cmake -DSANITIZERS=fuzzer option is specified. This
is necessary to make fuzzing and IPC cmake options compatible with each other
and avoid CI failures in #30975 which enables IPC in the fuzzer CI build:
https://cirrus-ci.com/task/5366255504326656?logs=ci#L2817https://cirrus-ci.com/task/5233064575500288?logs=ci#L2384
The failures can also be reproduced by checking out #31741 and building with
`cmake -B build -DBUILD_FOR_FUZZING=ON -DSANITIZERS=fuzzer -DENABLE_IPC=ON`
with this fix reverted.
The fix updates the cmake build so when -DSANITIZERS=fuzzer is specified, the
fuzz test binary is built with -fsanitize=fuzzer (so it can use libFuzzer's
main function), and libraries are built with -fsanitize=fuzzer-no-link (so they
can be linked into other executables with their own main functions).
Previously when -DSANITIZERS=fuzzer was specified, -fsanitize=fuzzer was
applied to ALL libraries and executables. This was inappropriate because it
made it impossible to build any executables other than the fuzz test executable
without triggering link errors:
- "multiple definition of `main'"
- "undefined reference to `LLVMFuzzerTestOneInput'"
if they depended on any libraries instrumented for fuzzing.
This was especially a problem when the ENABLE_IPC option was set because it
made building the mpgen code generator impossible so nothing else that depended
on generated sources, including the fuzz test binary, could be built either.
This commit was previously part of
https://github.com/bitcoin/bitcoin/pull/31741 and had some discussion there
starting in
https://github.com/bitcoin/bitcoin/pull/31741#pullrequestreview-2619682385