This reverts commit b2d536100282bd901d3e0be7f7f4a6966e0ef817.
Also, adjust the doc to reflect the new minimum version. Versions 17.6
or 17.11 (or anything in between) may still work on a best-effor basis,
but it is not checked by CI or by developers.
It is only used in test. There it is problematic, because it sometimes
relies on m_default_address_type. If the default were changed to
BECH32M, those tests would fail the assert(false).
So just use PKHash{} in all tests and remove GetDestinationForKey.
This interface lets one iterate efficiently over the chunks of the main
graph in a TxGraph, in the same order as CompareMainOrder. Each chunk
can be marked as "included" or "skipped" (and in the latter case,
dependent chunks will be skipped).
a58cb3b1c12c8cb75a87375c50f94c4605bb805d qa: sanity check mined block have their coinbase timelocked to height (Antoine Poinsot)
8f2078af6a55448c003b3f7f3021955fbb351caa miner: timelock coinbase transactions (Antoine Poinsot)
788aeebf343526760fa8f3ed969ac3713212a5b6 qa: use prev height as nLockTime for coinbase txs created in unit tests (Antoine Poinsot)
c76dbe9b8b6f03b761a0ef97e1b8cd133b934714 qa: timelock coinbase transactions created in fuzz targets (Antoine Poinsot)
9c94069d8b6cf67a24eb03c51230a4f2b2bf2d64 contrib: timelock coinbase transactions in signet miner (Antoine Poinsot)
a5f52cfcc400ad0adb41a78c65b8abb971e0d622 qa: timelock coinbase transactions created in functional tests (Antoine Poinsot)
Pull request description:
The Consensus Cleanup soft fork proposal includes enforcing that coinbase transactions set their
nLockTime field to the block height minus 1, as well as their nSequence such as to not disable the
timelock. If such a fork were to be activated by Bitcoin users, miners need to be ready to produce
compliant blocks at the risk of losing substantial amounts mining would-be invalid blocks. As miners
are unfamously slow to upgrade, it's good to make this change as early as possible.
Although Bitcoin Core's GBT implementation does not provide the `coinbasetxn` field, and mining
pool software crafts the coinbase on its own, updating the Bitcoin Core mining code is a first step
toward convincing pools to update their (often closed source) code. A possible followup is also to
introduce new fields to GBT. In addition, this first step also makes it possible to test future
Consensus Cleanup changes.
The commit making the change also updates a bunch of seemingly-unrelated tests. This is because those tests were asserting error messages based on the txid of transactions involved, and changing the coinbase transaction structure necessarily changes the txid of all tests' transactions.
ACKs for top commit:
Sjors:
Code review ACK a58cb3b1c12c8cb75a87375c50f94c4605bb805d
achow101:
ACK a58cb3b1c12c8cb75a87375c50f94c4605bb805d
TheCharlatan:
Re-ACK a58cb3b1c12c8cb75a87375c50f94c4605bb805d
Tree-SHA512: a2aae009a187eb760d34435f518a895ee76c6b02a667eb030ddf6bd584da6e8eae2737d974dbf81a928d60c07bcb4820f055adc067e18d8819640db0240bb513
f1b142856a4ecd0a0d90bc3d73ef5137997b14ff test: Same addr, diff port is already connected (David Gumberg)
94e85a82a753a0aa5ad688fc46330e83c9a697fe net: remove unnecessary check from AlreadyConnectedToAddress() (Vasil Dimov)
Pull request description:
`CConnman::AlreadyConnectedToAddress()` searches the existent nodes by address or by address-and-port:
```cpp
FindNode(static_cast<CNetAddr>(addr)) || FindNode(addr.ToStringAddrPort())
```
but:
* if there is a match by just the address, then the address-and-port search will not be evaluated and the whole condition will be `true`
* if the there is no node with the same address, then the second search by address-and-port will not find a match either.
The search by address-and-port is comparing against `CNode::m_addr_name` which could be a hostname, e.g. `"node.foobar.com:8333"`, but `addr.ToStringAddrPort()` is always going to be numeric.
---
In other words: let `A` be "CNetAddr equals" and `B` be "addr:port string matches", then:
* If `A` (is `true`), then `B` is irrelevant, so the condition `A || B` is equivalent to `A` is `true`.
* Observation in this PR: if `!A` (`A` is `false`), then `!B` for sure, thus the condition `A || B` is equivalent to `A` is `false`.
So, simplify `A || B` to `A`.
https://en.wikipedia.org/wiki/Modus_tollens `!A => !B` is equivalent to `B => A`. So the added fuzz test asserts that if `B` is `true`, then `A` is `true`.
ACKs for top commit:
davidgumberg:
crACK f1b142856a4ecd0a0d90bc3d
achow101:
ACK f1b142856a4ecd0a0d90bc3d73ef5137997b14ff
theuni:
utACK f1b142856a4ecd0a0d90bc3d73ef5137997b14ff
mzumsande:
Code Review ACK f1b142856a4ecd0a0d90bc3d73ef5137997b14ff
Tree-SHA512: d744b60e9bace121faa3a746463f6b6e0e6ef08eac0e7879326cbd5f4721e47e6e10f6203dfd3870a2057c4ddd1860692c070ef048a76d773b84e6c2f840cc86
e3014017bacff42d8d69f3061ce1ee621aaa450a test: add IsActiveAfter tests for versionbits (Anthony Towns)
60950f77c35e54e2884cfc14ab67623f3e325099 versionbits: docstrings for BIP9Info (Anthony Towns)
7565563bc7a5bb98ebf03a7d6881912a74d3f302 tests: refactor versionbits fuzz test (Anthony Towns)
2e4e9b9608c722aaf767638e9dba498d8dc3e772 tests: refactor versionbits unit test (Anthony Towns)
525c00f91bb27d0f2a1b2e5532aebec7fac97d3a versionbits: Expose VersionBitsConditionChecker via impl header (Anthony Towns)
e74a7049b477d1853191ded75fdf25024a6e233f versionbits: Expose StateName function (Anthony Towns)
d00d1ed52c8ee95eeed665d68d6715a694bd4c1f versionbits: Split out internal details into impl header (Anthony Towns)
37b9b67a39554465104c9cf1a74690f40019dbad versionbits: Simplify VersionBitsCache API (Anthony Towns)
1198e7d2fd665bf2bc49fd26773d4fd5fbc2b716 versionbits: Move BIP9 status logic for getblocktemplate to versionbits (Anthony Towns)
b1e967c3ec92738affb22d3b58483ebcdd8dfea2 versionbits: Move getdeploymentinfo logic to versionbits (Anthony Towns)
3bd32c20550e69688a4ff02409fb34b9a637b9c4 versionbits: Move WarningBits logic from validation to versionbits (Anthony Towns)
5da119e5d0e61f0b583f0fe21b9a00ee815a3e46 versionbits: Change BIP9Stats to uint32_t types (Anthony Towns)
a679040ec19ef17f3f03988a52207f1c03af701e consensus/params: Move version bits period/threshold to bip9 param (Anthony Towns)
e9d617095d4ce9525a4337d33624cac9d6b4abe6 versionbits: Remove params from AbstractThresholdConditionChecker (Anthony Towns)
9bc41f1b48b2e0cc6abf9714e860a29989d7809c versionbits: Use std::array instead of C-style arrays (Anthony Towns)
Pull request description:
Increases the encapsulation/modularity of the versionbits code, moving more of the logic into the versionbits module rather than having it scattered across validation and rpc code. Updates unit/fuzz tests to test the actual code used rather than just a close approximation of it.
ACKs for top commit:
achow101:
ACK e3014017bacff42d8d69f3061ce1ee621aaa450a
TheCharlatan:
Re-ACK e3014017bacff42d8d69f3061ce1ee621aaa450a
darosior:
ACK e3014017bacff42d8d69f3061ce1ee621aaa450a
Tree-SHA512: 2978db5038354b56fa1dd6aafd511099e9c16504d6a88daeac2ff2702c87bcf3e55a32e2f0a7697e3de76963b68b9d5ede7976ee007e45862fa306911194496d
`CConnman::AlreadyConnectedToAddress()` searches the existent nodes by
address or by address-and-port:
```cpp
FindNode(static_cast<CNetAddr>(addr)) || FindNode(addr.ToStringAddrPort())
```
but:
* if there is a match by just the address, then the address-and-port
search will not be evaluated and the whole condition will be `true`
* if the there is no node with the same address, then the second search
by address-and-port will not find a match either.
The search by address-and-port is comparing against `CNode::m_addr_name`
which could be a hostname, e.g. `"node.foobar.com:8333"`, but
`addr.ToStringAddrPort()` is always going to be numeric.
55b931934a34bab11446e8eed7bdaef92bb056de removed duplicate calling of GetDescriptorScriptPubKeyMan (Saikiran)
Pull request description:
Removed duplicate call to GetDescriptorScriptPubKeyMan and
Instead of checking linearly I have used find method so time complexity reduced significantly for GetDescriptorScriptPubKeyMan
after this fix improved performance of importdescriptor part refs https://github.com/bitcoin/bitcoin/issues/32013.
**Steps to reproduce in testnet environment**
**Input size:** 2 million address in the wallet
**Step1:** call importaddresdescriptor rpc method
observe the time it has taken.
**With the provided fix:**
Do the same steps again
observe the time it has taken.
There is a huge improvement in the performance. (previously it may take 5 to 6 seconds now it will take 1 seconds or less)
main changes i've made during this pr:
1. remove duplicate call to GetDescriptorScriptPubKeyMan method
2. And inside GetDescriptorScriptPubKeyMan method previously we checking **each address linearly** so each time it is calling HasWallet method which has aquired lock.
3. Now i've modified this logic call **find method on the map (O(logn)**) time it is taking, so only once we calling HasWallet method.
**Note:** Smaller inputs in the wallet you may not see the issue but huge wallet size it will definitely impact the performance.
ACKs for top commit:
achow101:
ACK 55b931934a34bab11446e8eed7bdaef92bb056de
w0xlt:
ACK 55b931934a
Tree-SHA512: 4a7fdbcbb4e55bd034e9cf28ab4e7ee3fb1745fc8847adb388c98a19c952a1fb66d7b54f0f28b4c2a75a42473923742b4a99fb26771577183a98e0bcbf87a8ca
3669ecd4ccd8e7a1e2b1a9dcbe708c51c78e4d6c doc: Document fuzz build options (Anthony Towns)
c1d01f59acc2067ecbf8a8b42ba0d8e596694439 fuzz: enable running fuzz test cases in Debug mode (Anthony Towns)
Pull request description:
When building with
BUILD_FOR_FUZZING=OFF
BUILD_FUZZ_BINARY=ON
CMAKE_BUILD_TYPE=Debug
allow the fuzz binary to execute given test cases (without actual fuzzing) to make it easier to reproduce fuzz test failures in a more normal debug build.
In Debug builds, deterministic fuzz behaviour is controlled via a runtime variable, which is normally false, but set to true automatically in the fuzz binary, unless the FUZZ_NONDETERMINISM environment variable is set.
ACKs for top commit:
maflcko:
re-ACK 3669ecd4ccd8e7a1e2b1a9dcbe708c51c78e4d6c 🏉
marcofleon:
re ACK 3669ecd4ccd8e7a1e2b1a9dcbe708c51c78e4d6c
ryanofsky:
Code review ACK 3669ecd4ccd8e7a1e2b1a9dcbe708c51c78e4d6c with just variable renamed and documentation added since last review
Tree-SHA512: 5da5736462f98437d0aa1bd01aeacb9d46a9cc446a748080291067f7a27854c89f560f3a6481b760b9a0ea15a8d3ad90cd329ee2a008e5e347a101ed2516449e
When building with
BUILD_FOR_FUZZING=OFF
BUILD_FUZZ_BINARY=ON
CMAKE_BUILD_TYPE=Debug
allow the fuzz binary to execute given test cases (without actual
fuzzing) to make it easier to reproduce fuzz test failures in a more
normal debug build.
In Debug builds, deterministic fuzz behaviour is controlled via a runtime
variable, which is normally false, but set to true automatically in the
fuzz binary, unless the FUZZ_NONDETERMINISM environment variable is set.
2835216ec09cc2d86b091824376b15601e7c7b8a txgraph: make GroupClusters use partition numbers directly (optimization) (Pieter Wuille)
c72c8d5d45d9a87e4cf78e66f9737b9e6f371d2d txgraph: compare sequence numbers instead of Cluster* (bugfix) (Pieter Wuille)
Pull request description:
Part of cluster mempool: #30289
The implicit transaction ordering for transactions in a TxGraphImpl is defined by:
1. higher chunk feerate first
2. lower Cluster* object pointer first
3. lower position within cluster linearization first.
Number (2) is not deterministic, as it intricately depends on the heap allocation algorithm. Fix this by giving each Cluster a unique `uint64_t m_sequence` value, and sorting by those instead.
The second commit then uses this new approach to optimize GroupClusters a bit more, avoiding some repeated checks and dereferences, by making a local copy of the involved sequence numbers.
Thanks to @dergoegge for pointing this out.
ACKs for top commit:
instagibbs:
reACK 2835216ec09cc2d86b091824376b15601e7c7b8a
marcofleon:
ACK 2835216ec09cc2d86b091824376b15601e7c7b8a
glozow:
utACK 2835216ec09cc2d86b091824376b15601e7c7b8a
Tree-SHA512: d772a55b9ed620159b934a42a39fca7f900d4aa89c099a280a0c61ea0bd7c4fc39b388281ffc775064ea77b0b17263871b4c9763aa71c710a79287d5eb2cd4b4
fa6a007b8e7b68d559b30c04dd8d76c877bef133 fuzz: Avoid integer sanitizer warnings in policy_estimator target (MarcoFalke)
Pull request description:
It seems odd to write a fuzz target to trigger integer sanitizer warnings in `CBlockPolicyEstimator::processBlockTx` and then suppress them. If the scenario can happen in reality, the code should be properly fixed to handle the cases. If not, it seems better to fix the fuzz target to not trigger meaningless traces.
Do that here by keeping track of the current height and limiting mempool entries to at most this entry height.
ACKs for top commit:
brunoerg:
ACK fa6a007b8e7b68d559b30c04dd8d76c877bef133
dergoegge:
utACK fa6a007b8e7b68d559b30c04dd8d76c877bef133
Tree-SHA512: 2092017dc309fb095fe5d43cfb76efb691795f303d567ee919be2b5cac19a944293636229903dc4d1e8b9fe5daf9dc3058544321eff1735f91f804c3baa36cd0
ec81a72b369ab9efe23681ebb6e8fab34ce2e0f2 net: Add randomized prefix to Tor stream isolation credentials (laanwj)
c47f81e8ac110b1d7f78bce0232e8015366d13e7 net: Rename `_randomize_credentials` Proxy parameter to `tor_stream_isolation` (laanwj)
Pull request description:
Add a class TorsStreamIsolationCredentialsGenerator that generates unique credentials based on a randomly generated session prefix and an atomic counter. Use this in `ConnectThroughProxy` instead of a simple atomic int counter.
This makes sure that different launches of the application won't share the same credentials, and thus circuits, even in edge cases.
Example with `-debug=proxy`:
```
2025-03-31T16:30:27Z [proxy] SOCKS5 sending proxy authentication 0afb2da441f5c105-0:0afb2da441f5c105-0
2025-03-31T16:30:31Z [proxy] SOCKS5 sending proxy authentication 0afb2da441f5c105-1:0afb2da441f5c105-1
```
Thanks to hodlinator in https://github.com/bitcoin/bitcoin/pull/32166#discussion_r2020973352 for the idea.
ACKs for top commit:
hodlinator:
re-ACK ec81a72b369ab9efe23681ebb6e8fab34ce2e0f2
jonatack:
ACK ec81a72b369ab9efe23681ebb6e8fab34ce2e0f2
danielabrozzoni:
tACK ec81a72b369ab9efe23681ebb6e8fab34ce2e0f2
Tree-SHA512: 195f5885fade77545977b91bdc41394234ae575679cb61631341df443fd8482cd74650104e323c7dbfff7826b10ad61692cca1284d6810f84500a3488f46597a
faa3ce31998cdbdcb888132b0d38ee7a1e743682 fuzz: Avoid influence on the global RNG from peerman m_rng (MarcoFalke)
faf4c1b6fc330885ddf84b643838d1e301aaeab2 fuzz: Disable unused validation interface and scheduler in p2p_headers_presync (MarcoFalke)
fafaca6cbc2fdabf95c286c9d8346e331a2de216 fuzz: Avoid setting the mock-time twice (MarcoFalke)
fad22149f4677a650939be5aa0865f5f6e709aec refactor: Use MockableSteadyClock in ReportHeadersPresync (MarcoFalke)
fa9c38794efb476d2c6e9dcbc9b137b8a1bac64a test: Introduce MockableSteadyClock::mock_time_point and ElapseSteady helper (MarcoFalke)
faf2d512c529db9ec7edd14bb2cd3e75f037489c fuzz: Move global node id counter along with other global state (MarcoFalke)
fa98455e4b162876d6a13057a0d6c03efae32505 fuzz: Set ignore_incoming_txs in p2p_headers_presync (MarcoFalke)
faf2e238fbbb23fd0e20970dd95dcf01846df9aa fuzz: Shuffle files before testing them (MarcoFalke)
Pull request description:
This should make the `p2p_headers_presync` fuzz target more deterministic.
Tracking issue: https://github.com/bitcoin/bitcoin/issues/29018.
The first commits adds an `ElapseSteady` helper and type aliases. The second commit uses those helpers in `ReportHeadersPresync` and in the fuzz target to increase determinism.
### Testing
It can be tested via (setting 32 parallel threads):
```
cargo run --manifest-path ./contrib/devtools/deterministic-fuzz-coverage/Cargo.toml -- $PWD/bld-cmake/ $PWD/../b-c-qa-assets/fuzz_corpora/ p2p_headers_presync 32
```
The failing diff is contained in the commit messages, if applicable.
ACKs for top commit:
Crypt-iQ:
tACK faa3ce31998cdbdcb888132b0d38ee7a1e743682
janb84:
Re-ACK [faa3ce3](faa3ce3199)
marcofleon:
ACK faa3ce31998cdbdcb888132b0d38ee7a1e743682
Tree-SHA512: 7e2e0ddf3b4e818300373d6906384df57a87f1eeb507fa43de1ba88cf03c8e6752a26b6e91bfb3ee26a21efcaf1d0d9eaf70d311d1637b671965ef4cb96e6b59
This should avoid the remaining non-determistic code coverage paths.
Without this patch, the tool would report a diff (only when running
without libFuzzer):
cargo run --manifest-path ./contrib/devtools/deterministic-fuzz-coverage/Cargo.toml -- $PWD/bld-cmake/ $PWD/../qa-assets/fuzz_corpora/ p2p_headers_presync 32
It should be sufficient to set it once. Especially, if the dynamic value
is only used by ResetAndInitialize.
This also avoids non-determistic code paths, when ResetAndInitialize may
re-initialize m_next_inv_to_inbounds.
Without this patch, the tool would report a diff:
cargo run --manifest-path ./contrib/devtools/deterministic-fuzz-coverage/Cargo.toml -- $PWD/bld-cmake/ $PWD/../qa-assets/fuzz_corpora/ p2p_headers_presync 32
...
- 1126| 3| m_next_inv_to_inbounds = now + m_rng.rand_exp_duration(average_interval);
- 1127| 3| }
+ 1126| 10| m_next_inv_to_inbounds = now + m_rng.rand_exp_duration(average_interval);
+ 1127| 10| }
1128| 491| return m_next_inv_to_inbounds;
...
This allows the clock to be mockable in tests. Also, replace cs_main
with GetMutex() while touching this function.
Also, use the ElapseSteady test helper in the p2p_headers_presync fuzz
target to make it more deterministic.
The m_last_presync_update variable is a global that is not reset in
ResetAndInitialize. However, it is only used for logging, so completely
disable it for now.
Without this patch, the tool would report a diff:
cargo run --manifest-path ./contrib/devtools/deterministic-fuzz-coverage/Cargo.toml -- $PWD/bld-cmake/ $PWD/../qa-assets/fuzz_corpora/ p2p_headers_presync 32
...
4468| 81| auto now = std::chrono::steady_clock::now();
4469| 81| if (now < m_last_presync_update + std::chrono::milliseconds{250}) return;
- ^80
+ ^79
...
The global m_headers_presync_stats is not reset in ResetAndInitialize.
This may lead to non-determinism.
Fix it by incrementing the global node id counter instead.
Without this patch, the tool would report a diff:
cargo run --manifest-path ./contrib/devtools/deterministic-fuzz-coverage/Cargo.toml -- $PWD/bld-cmake/ $PWD/../qa-assets/fuzz_corpora/ p2p_headers_presync 32
...
2587| 3.73k| if (best_it == m_headers_presync_stats.end()) {
------------------
- | Branch (2587:17): [True: 80, False: 3.65k]
+ | Branch (2587:17): [True: 73, False: 3.66k]
------------------
...
When iterating over all fuzz input files in a folder, the order should
not matter.
However, shuffling may be useful to detect non-determinism.
Thus, shuffle in fuzz.cpp, when using neither libFuzzer, nor AFL.
Also, shuffle in the deterministic-fuzz-coverage tool, when using
libFuzzer.
Rather than use an ad-hoc reimplementation of wide multiplication inside the
fuzz test, reuse arith_uint256, which already has this. It's larger than what we
need here, but performance isn't a concern in this test, and it does what we need.
Add a class TorsStreamIsolationCredentialsGenerator that generates
unique credentials based on a randomly generated session prefix
and an atomic counter.
This makes sure that different launches of the application won't share
the same credentials, and thus circuits, even in edge cases.
Example with `-debug=proxy`:
```
2025-03-31T16:30:27Z [proxy] SOCKS5 sending proxy authentication 0afb2da441f5c105-0:0afb2da441f5c105-0
2025-03-31T16:30:31Z [proxy] SOCKS5 sending proxy authentication 0afb2da441f5c105-1:0afb2da441f5c105-1
```
Thanks to hodlinator for the idea.
57d8b1f1b3375452213c48ce80e8207e2fe53ebc cmake: Avoid fuzzer "multiple definition of `main'" errors (Ryan Ofsky)
Pull request description:
This change builds libraries with `-fsanitize=fuzzer-no-link` instead of `-fsanitize=fuzzer` when the cmake `-DSANITIZERS=fuzzer` option is specified. This is necessary to make fuzzing and IPC cmake options compatible with each other and avoid CI failures in #30975 which enables IPC in the fuzzer CI build:
https://cirrus-ci.com/task/5366255504326656?logs=ci#L2817https://cirrus-ci.com/task/5233064575500288?logs=ci#L2384
The failures can also be reproduced by checking out #31741 and building with `cmake -B build -DBUILD_FOR_FUZZING=ON -DSANITIZERS=fuzzer -DENABLE_IPC=ON` with this fix reverted.
The fix updates the cmake build so when `-DSANITIZERS=fuzzer` is specified, the fuzz test binary is built with `-fsanitize=fuzzer` (so it can use libFuzzer's main function), and libraries are built with `-fsanitize=fuzzer-no-link` (so they can be linked into other executables with their own main functions).
Previously when `-DSANITIZERS=fuzzer` was specified, `-fsanitize=fuzzer` was applied to ALL libraries and executables. This was inappropriate because it made it impossible to build any executables other than the fuzz test executable without triggering link errors:
- `` multiple definition of `main' ``
- `` "undefined reference to `LLVMFuzzerTestOneInput' ``
if they depended on any libraries instrumented for fuzzing.
This was especially a problem when the `ENABLE_IPC` option was set because it made building the `mpgen` code generator impossible so nothing else that depended on generated sources, including the fuzz test binary, could be built either.
This commit was previously part of https://github.com/bitcoin/bitcoin/pull/31741 and had some discussion there starting in https://github.com/bitcoin/bitcoin/pull/31741#pullrequestreview-2619682385
---
This PR is part of the [process separation project](https://github.com/bitcoin/bitcoin/issues/28722).
ACKs for top commit:
hebasto:
ACK 57d8b1f1b3375452213c48ce80e8207e2fe53ebc, tested on Ubuntu 24.04.
Tree-SHA512: 4011adbc0b08742e83cf7c0560d3d5b5694a863358e6ac9a21239626b4a8fedceca66db34b5a46136a7b26849bb1d8710c894689322ae97e1c407687c3f57d50
This abstracts out the finding of the connected component that includes
a given element from FindConnectedComponent (which just finds any connected
component).
Use this in the txgraph fuzz test, which was effectively reimplementing this
logic. At the same time, improve its performance by replacing a vector with a
set.
b1de59e8965354fff5a149bc0fe61ed0704aea7a fuzz: extract unsequenced operations with side-effects (Lőrinc)
Pull request description:
https://github.com/bitcoin/bitcoin/pull/30746#discussion_r1817851827 introduced unsequenced operations with side-effects - which is undefined behavior, i.e. the right hand side can be evaluated before the left hand side, which happens to mutate it.
<details>
<summary>Tried to find other occurrences</summary>
```bash
clang++ --analyze -std=c++20 -I./src -I./src/test -I./src/test/fuzz src/test/fuzz/base_encode_decode.cpp src/psbt.cpp
```
but it didn't warn about UB.
Grepped for similar ones, but could find any other one in the codebase:
```bash
> grep -rnE --include='*.cpp' --include='*.h' '\b(\w+)\(([^)]*\b(\w+)\b[^)]*)\)\s*==\s*\3\.' .
./src/test/arith_uint256_tests.cpp:373: BOOST_CHECK(R1L.GetHex() == R1L.ToString());
./src/test/arith_uint256_tests.cpp:374: BOOST_CHECK(R2L.GetHex() == R2L.ToString());
./src/test/arith_uint256_tests.cpp:375: BOOST_CHECK(OneL.GetHex() == OneL.ToString());
./src/test/arith_uint256_tests.cpp:376: BOOST_CHECK(MaxL.GetHex() == MaxL.ToString());
./src/test/fuzz/cluster_linearize.cpp:565: assert(depgraph.FeeRate(best_anc.transactions) == best_anc.feerate);
./src/test/fuzz/cluster_linearize.cpp:646: assert(depgraph.FeeRate(found.transactions) == found.feerate);
./src/test/fuzz/cluster_linearize.cpp:765: assert(depgraph.FeeRate(chunk_info.transactions) == chunk_info.feerate);
./src/test/fuzz/base_encode_decode.cpp:95: assert(DecodeBase64PSBT(psbt, random_string, error) == error.empty());
./src/test/fuzz/key.cpp:102: assert(pubkey.data() == pubkey.begin());
./src/test/skiplist_tests.cpp:42: BOOST_CHECK(vIndex[from].GetAncestor(0) == vIndex.data());
./src/script/signingprovider.cpp:535: ComputeTapbranchHash(node.sub[1]->hash, node.sub[1]->hash) == node.hash) {
./src/pubkey.h:78: return vch.size() > 0 && GetLen(vch[0]) == vch.size();
./src/cluster_linearize.h:881: Assume(elem.inc.feerate.IsEmpty() == elem.pot_feerate.IsEmpty());
```
</details>
Hodlinator deduced the UB on Windows in https://github.com/bitcoin/bitcoin/issues/32135#issuecomment-2751723855Fixes#32135
ACKs for top commit:
maflcko:
lgtm ACK b1de59e8965354fff5a149bc0fe61ed0704aea7a
hodlinator:
ACK b1de59e8965354fff5a149bc0fe61ed0704aea7a
marcofleon:
Nice, ACK b1de59e8965354fff5a149bc0fe61ed0704aea7a
brunoerg:
code review ACK b1de59e8965354fff5a149bc0fe61ed0704aea7a
Tree-SHA512: d66524424c7f749eba870f5bd6038da79666ac638047b31dd8ff15a77d927facb54b4735e8afb7984648fdc9e2dd59ea213996c352301fa05978f041511361d4
b2ea3656481b4196acaf6a1b5f3949a9ba4cf48f txgraph: Add Get{Ancestors,Descendants}Union functions (feature) (Pieter Wuille)
54bceddd3ab39918834d72e9c77eb14e41996652 txgraph: Multiple inputs to Get{Ancestors,Descendant}Refs (preparation) (Pieter Wuille)
aded04701925781ffe194e11e4782261e4736339 txgraph: Add CountDistinctClusters function (feature) (Pieter Wuille)
b685d322c9739ca03b9d0bb9fa57aabea1890060 txgraph: Add DoWork function (feature) (Pieter Wuille)
295a1ca8bbbe5e61bd936158ca33cda5d5e58afd txgraph: Expose ability to compare transactions (feature) (Pieter Wuille)
22c68cd153b72f867dffcc7a62a3f65cef9038fb txgraph: Allow Refs to outlive the TxGraph (feature) (Pieter Wuille)
82fa3573e197f184054fc5096f13ea2520a8d219 txgraph: Destroying Ref means removing transaction (feature) (Pieter Wuille)
6b037ceddfd0160981bd401630c610ad2a3cf000 txgraph: Cache oversizedness of graphs (optimization) (Pieter Wuille)
8c70688965bc4038f28f41e4490180e40a88b5ee txgraph: Add staging support (feature) (Pieter Wuille)
c99c7300b4443f70e452cb97c42b9c2513b372d7 txgraph: Abstract out ClearLocator (refactor) (Pieter Wuille)
34aa3da5adea40615d80588bb0ff8b78d6d292a8 txgraph: Group per-graph data in ClusterSet (refactor) (Pieter Wuille)
36dd5edca5b00f4140f19f364ff93a5a7dd4bbe3 txgraph: Special-case removal of tail of cluster (Optimization) (Pieter Wuille)
5801e0fb2b99f44ac24531779acf0d44ec35b98c txgraph: Delay chunking while sub-acceptable (optimization) (Pieter Wuille)
57f5499882afe170612e0afd4ef6d91561738288 txgraph: Avoid looking up the same child cluster repeatedly (optimization) (Pieter Wuille)
1171953ac6091950f06646a8cc85ca10683023ce txgraph: Avoid representative lookup for each dependency (optimization) (Pieter Wuille)
64f69ec8c383436d1a657add1b8a7eee3e75f61f txgraph: Make max cluster count configurable and "oversize" state (feature) (Pieter Wuille)
1d27b74c8e3bf055fb8b0a5fc5d664bd5048bec6 txgraph: Add GetChunkFeerate function (feature) (Pieter Wuille)
c80aecc24ddd878c62be9753a2746e36860e3a97 txgraph: Avoid per-group vectors for clusters & dependencies (optimization) (Pieter Wuille)
ee57e93099f243cf9fbf9c10265057a53f06e062 txgraph: Add internal sanity check function (tests) (Pieter Wuille)
05abf336f997f477c6f48412809ab540fccf1cb0 txgraph: Add simulation fuzz test (tests) (Pieter Wuille)
8ad3ed26818a620cb973cd4e5eaa7b49313f562b txgraph: Add initial version (feature) (Pieter Wuille)
6eab3b2d7380b8ff818e3a1cefeb7731b7342e04 feefrac: Introduce tagged wrappers to distinguish vsize/WU rates (Pieter Wuille)
d4497738999873c8432d02fd71e14f1afc2065a8 scripted-diff: (refactor) ClusterIndex -> DepGraphIndex (Pieter Wuille)
bfeb69f6e00d94b94171cebf351fac69bec489cc clusterlin: Make IsAcyclic() a DepGraph member function (Pieter Wuille)
0aa874a357865dd4768091f26dff238e66fb8d83 clusterlin: Add FixLinearization function + fuzz test (Pieter Wuille)
Pull request description:
Part of cluster mempool: #30289.
### 1. Overview
This introduces the `TxGraph` class, which encapsulates knowledge about the (effective) fees, sizes, and dependencies between all mempool transactions, but nothing else. In particular, it lacks knowledge about `CTransaction`, inputs, outputs, txids, wtxids, prioritization, validatity, policy rules, and a lot more. Being restricted to just those aspects of the mempool makes the behavior very easy to fully specify (ignoring the actual linearizations produced), and write simulation-based tests for (which are included in this PR).
### 2. Interface
The interface can be largely categorized into:
* Mutation functions:
* `AddTransaction` (add a new transaction with specified feerate, and get a `Ref` object back to identify it).
* `RemoveTransaction` (given a `Ref` object, remove the transaction).
* `AddDependency` (given two `Ref` objects, add a dependency between them).
* `SetTransactionFee` (modify the fee associated with a Ref object).
* Inspector functions:
* `GetAncestors` (get the ancestor set in the form of `Ref*` pointers)
* `GetAncestorsUnion` (like above, but for the union of ancestors of multiple `Ref*` pointers)
* `GetDescendants` (get the descendant set in the form of `Ref*` pointers)
* `GetDescendantsUnion` (like above, but for the union of ancestors of multiple `Ref*` pointers)
* `GetCluster` (get the connected component set in the form of `Ref*` pointers, in the order they would be mined).
* `GetIndividualFeerate` (get the feerate of a transaction)
* `GetChunkFeerate` (get the mining score of a transaction)
* `CountDistinctClusters` (count the number of distinct clusters a list of `Ref`s belong to)
* Staging functions:
* `StartStaging` (make all future mutations operate on a proposed transaction graph)
* `CommitStaging` (apply all the changes that are staged)
* `AbortStaging` (discard all the changes that are staged)
* Miscellaneous functions:
* `DoWork` (do queued-up computations now, so that future operations are fast)
This `TxGraph::Ref` type used as a "handle" on transactions in the graph can be inherited from, and the idea is that in the full cluster mempool implementation (#28676, after it is rebased on this), `CTxMempoolEntry` will inherit from it, and all actually used Ref objects will be `CTxMempoolEntry`s. With that, the mempool code can just cast any `Ref*` returned by txgraph to `CTxMempoolEntry*`.
### 3. Implementation
Internally the graph data is kept in clustered form (partitioned into connected components), for which linearizations are maintained and updated as needed using the `cluster_linearize.h` algorithms under the hood, but this is hidden from the users of this class. Implementation-wise, mutations are generally applied lazily, appending to queues of to-be-removed transactions and to-be-added dependencies, so they can be batched for higher performance. Inspectors will generally only evaluate as much as is needed to answer queries, with roughly 5 levels of processing to go to fully instantiated and acceptable cluster linearizations, in order:
1. `ApplyRemovals` (take batches of to-be-removed transactions and translate them to "holes" in the corresponding Clusters/DepGraphs).
2. `SplitAll` (creating holes in Clusters may cause them to break apart into smaller connected components, so make turn them into separate Clusters/linearizations).
3. `GroupClusters` (figure out which Clusters will need to be combined in order to add requested to-be-added dependencies, as these may span clusters).
4. `ApplyDependencies` (actually merge Clusters as precomputed by `GroupClusters`, and add the dependencies between them).
5. `MakeAcceptable` (perform the LIMO linearization algorithm on Clusters to make sure their linearizations are acceptable).
### 4. Future work
This is only an initial version of TxGraph, and some functionality is missing before #28676 can be rebased on top of it:
* The ability to get comparative feerate diagrams before/after for the set of staged changes (to evaluate RBF incentive-compatibility).
* Mining interface (ability to iterate transactions quickly in mining score order) (see #31444).
* Eviction interface (reverse of mining order, plus memory usage accounting) (see #31444).
* Ability to fix oversizedness of clusters (before or after committing) - this is needed for reorgs where aborting/rejecting the change just is not an option (see #31553).
* Interface for controlling how much effort is spent on LIMO. In this PR it is hardcoded.
Then there are further improvements possible which would not block other work:
* Making Cluster a virtual class with different implementations based on transaction count (which could dramatically reduce memory usage, as most Clusters are just a single transaction, for which the current implementation is overkill).
* The ability to have background thread(s) for improving cluster linearizations.
ACKs for top commit:
instagibbs:
reACK b2ea3656481b4196acaf6a1b5f3949a9ba4cf48f
ajtowns:
reACK b2ea3656481b4196acaf6a1b5f3949a9ba4cf48f
ismaelsadeeq:
reACK b2ea3656481b4196acaf6a1b5f3949a9ba4cf48f 🚀
glozow:
ACK b2ea3656481
Tree-SHA512: 0f86f73d37651fe47d469db1384503bbd1237b4556e5d50b1d0a3dd27754792d6fc3481f77a201cf2ed36c6ca76e0e44c30e175d112aacb53dfdb9e11d8abc6b
https://github.com/bitcoin/bitcoin/pull/30746#discussion_r1817851827 introduced an unsequenced operations with side-effects - which is undefined behavior, i.e. the right hand side can be evaluated before the left hand side, which happens to mutate it.
Tried:
```
clang++ --analyze -std=c++20 -I./src -I./src/test -I./src/test/fuzz src/test/fuzz/base_encode_decode.cpp src/psbt.cpp
```
but it didn't warn about UB.
Grepped for similar ones, but could find any other one in the codebase:
> grep -rnE --include='*.cpp' --include='*.h' '\b(\w+)\(([^)]*\b(\w+)\b[^)]*)\)\s*==\s*\3\.' .
```
./src/test/arith_uint256_tests.cpp:373: BOOST_CHECK(R1L.GetHex() == R1L.ToString());
./src/test/arith_uint256_tests.cpp:374: BOOST_CHECK(R2L.GetHex() == R2L.ToString());
./src/test/arith_uint256_tests.cpp:375: BOOST_CHECK(OneL.GetHex() == OneL.ToString());
./src/test/arith_uint256_tests.cpp:376: BOOST_CHECK(MaxL.GetHex() == MaxL.ToString());
./src/test/fuzz/cluster_linearize.cpp:565: assert(depgraph.FeeRate(best_anc.transactions) == best_anc.feerate);
./src/test/fuzz/cluster_linearize.cpp:646: assert(depgraph.FeeRate(found.transactions) == found.feerate);
./src/test/fuzz/cluster_linearize.cpp:765: assert(depgraph.FeeRate(chunk_info.transactions) == chunk_info.feerate);
./src/test/fuzz/base_encode_decode.cpp:95: assert(DecodeBase64PSBT(psbt, random_string, error) == error.empty());
./src/test/fuzz/key.cpp:102: assert(pubkey.data() == pubkey.begin());
./src/test/skiplist_tests.cpp:42: BOOST_CHECK(vIndex[from].GetAncestor(0) == vIndex.data());
./src/script/signingprovider.cpp:535: ComputeTapbranchHash(node.sub[1]->hash, node.sub[1]->hash) == node.hash) {
./src/pubkey.h:78: return vch.size() > 0 && GetLen(vch[0]) == vch.size();
./src/cluster_linearize.h:881: Assume(elem.inc.feerate.IsEmpty() == elem.pot_feerate.IsEmpty());
```
Hodlinator deduced the UB on Windows in https://github.com/bitcoin/bitcoin/issues/32135#issuecomment-2751723855
Co-authored-by: Hodlinator <172445034+hodlinator@users.noreply.github.com>
In order to make it possible for higher layers to compare transaction quality
(ordering within the implicit total ordering on the mempool), expose a comparison
function and test it.
Before this commit, if a TxGraph::Ref object is destroyed, it becomes impossible
to refer to, but the actual corresponding transaction node in the TxGraph remains,
and remains indefinitely as there is no way to remove it.
Fix this by making the destruction of TxGraph::Ref trigger immediate removal of
the corresponding transaction in TxGraph, both in main and staging if it exists.
In order to make it easy to evaluate proposed changes to a TxGraph, introduce a
"staging" mode, where mutators (AddTransaction, AddDependency, RemoveTransaction)
do not modify the actual graph, but just a staging version of it. That staging
graph can then be commited (replacing the main one with it), or aborted (discarding
the staging).