fa48f8c8655d93e78b32b560a870577909b666d3 test: Add missing timeout_factor to zmq socket (MarcoFalke)
Pull request description:
Fixes https://github.com/bitcoin/bitcoin/issues/34189
Otherwise, the test may intermittently fail on slow CI systems that have `--timeout-factor=` properly set.
It can be tested by running `./bld-cmake/test/functional/interface_zmq.py --timeout-factor=10` with this diff:
```diff
diff --git a/src/validationinterface.cpp b/src/validationinterface.cpp
index c7be6abc3a..b14cf2aee6 100644
--- a/src/validationinterface.cpp
+++ b/src/validationinterface.cpp
@@ -166,2 +166,3 @@ void ValidationSignals::SyncWithValidationInterfaceQueue()
LOG_EVENT(fmt, local_name, __VA_ARGS__); \
+ UninterruptibleSleep(45ms); \
event(); \
diff --git a/test/functional/interface_zmq.py b/test/functional/interface_zmq.py
index 6717007626..eee377daea 100755
--- a/test/functional/interface_zmq.py
+++ b/test/functional/interface_zmq.py
@@ -176,3 +176,3 @@ class ZMQTest (BitcoinTestFramework):
for sub in subscribers:
- sub.socket.set(zmq.RCVTIMEO, recv_timeout*1000)
+ sub.socket.set(zmq.RCVTIMEO, int(recv_timeout * 1000))
@@ -271,3 +271,3 @@ class ZMQTest (BitcoinTestFramework):
[(topic, address) for topic in ["hashblock", "hashtx"]],
- recv_timeout=2) # 2 second timeout to check end of notifications
+ recv_timeout=0.2) # 2 second timeout to check end of notifications
self.disconnect_nodes(0, 1)
```
Before this pull: Test fails with `zmq.error.Again: Resource temporarily unavailable`
After this pull: Test passes
ACKs for top commit:
l0rinc:
code review ACK fa48f8c8655d93e78b32b560a870577909b666d3
achow101:
ACK fa48f8c8655d93e78b32b560a870577909b666d3
Tree-SHA512: 5ff8bdd807ff4b644daa634bb7b469da3da3915f58afba63a90e662df99cbebc86636e34e2b1b313c8629773aef2a239fb3025226a84d2ec22f6ecd4cea666c4
bbc8f1e0a7e5739f15b2e646a4ace338083309a3 ipc mining: Prevent ``Assertion `m_node.chainman' failed`` errors on early startup (Ryan Ofsky)
a7cabf92e4de83c87f6b9274ddd2fb70712d29f8 init refactor: Only initialize node.notifications one time (Ryan Ofsky)
c8e332cb33594cc307cdf89b6036a0e95c238cd8 init refactor: Remove node.init accesss in AppInitInterfaces (Ryan Ofsky)
Pull request description:
This fixes ``Assertion `m_node.chainman' failed`` errors first reported https://github.com/bitcoin/bitcoin/issues/33994#issuecomment-3602551596 when IPC mining methods are called before ChainstateManager is loaded.
The fix works by making the `Init.makeMining` method wait until chainstate data is loaded. It's probably the simplest possible fix but other alternatives like moving the wait to `Mining.createNewBlock` were discussed in the thread https://github.com/bitcoin/bitcoin/pull/34661#discussion_r2848176298 and could be implemented later without changes to clients.
ACKs for top commit:
Sjors:
utACK bbc8f1e0a7e5739f15b2e646a4ace338083309a3
ismaelsadeeq:
ACK bbc8f1e0a7e5739f15b2e646a4ace338083309a3
achow101:
ACK bbc8f1e0a7e5739f15b2e646a4ace338083309a3
Tree-SHA512: 3e2e4e28ccff364b2303efd06ce337a229c28609076638500acb29559f716a15ad99409c6970ce9ad91776d53e3f9d959f1bbbd144ea9a4a2fb578ddbf2da267
fa36adeb719b5d2a6ce8e1b69d3bbd8e2bd70349 ci: [refactor] Drop last use of pwsh (MarcoFalke)
fae31b1e2f1ed35b70a0bb855f37279e448c4ec2 ci: [refactor] Move github_import_vs_env to python script (MarcoFalke)
Pull request description:
The use of pwsh was a bit confusing and inconsistent around the exit code. See also https://github.com/bitcoin/bitcoin/pull/32672
I think it is fine to drop it and purely use Bash/Python.
This also moves a bit of code from the github yaml to the python script.
ACKs for top commit:
m3dwards:
Looks good! re-ACK fa36adeb71
hodlinator:
re-ACK fa36adeb719b5d2a6ce8e1b69d3bbd8e2bd70349
Tree-SHA512: 78edffc60c58c476b0acca5224150169d154b0b818114844a04295af9ba19b7cdf1fb2afb738f6cafd6172f0f477d546018ebf95061eb5bd8bbb35e065a129d4
fab51e470e738944dfc832926b05019e8b70f8c6 test: Move valgrind.supp to the other sanitizer_suppressions files (MarcoFalke)
fa9cf81d39e1354e4239933a053f6c91a6727fec test: Add missing resolve() to valgrind.supp file (MarcoFalke)
Pull request description:
(see commit message for details)
Can be tested via:
```
./bld-cmake/test/functional/feature_framework_testshell.py --valgrind
```
Before:
```
bitcoind exited with status 1 during initialization. ==1735813== FATAL: can't open suppressions file "/b-c/b-c-2/bld-cmake/contrib/valgrind.supp"
```
After: (passes)
Also, move the suppression file to all the other suppression files.
ACKs for top commit:
frankomosh:
Re-ACK fab51e470e738944dfc832926b05019e8b70f8c6
Tree-SHA512: d3e3e130c0e2292ca3ab9e80d2ebec6b4edc7914280ed90fb4af8a65cd1c9edd19064d86375a6787b864925fe0e47bab2321f89b9be8d1613a0aebf4ec2443fe
fa18be2f2ba19d5d35cb8a04fd4e1a7c4fc441ce test: Fix typo (MarcoFalke)
fac932698f6539e9ac4a13df51194bcb60d5d933 ci: Set TEST_RUNNER_PORT_MIN in test-each after cirrus runner switch (MarcoFalke)
Pull request description:
This is needed after the recent switch to cirrus runners in the task in commit c8c9c1e61759f689615a304254fed33cda7f895e.
Otherwise, the CI will fail:
```
node1 stderr Error: Unable to bind to 127.0.0.1:12321 on this computer. Bitcoin Core is probably already running.
```
(https://github.com/bitcoin/bitcoin/actions/runs/22398358349/job/64837827234?pr=31723#step:9:2605)
Also, include a random second commit, so that the CI task is run in this pull.
ACKs for top commit:
l0rinc:
ACK fa18be2f2ba19d5d35cb8a04fd4e1a7c4fc441ce
willcl-ark:
ACK fa18be2f2ba19d5d35cb8a04fd4e1a7c4fc441ce
Tree-SHA512: 6b63f645bf62d3e951ca155cddf3dc562b7ce675ccae4f9179e2202679685b5c147844eb350bd219b173fe2bb976376d0caa073d3e827a48c13aa015f4745b2c
Normally, when a symlinked test script is executed directly (e.g.,
`./bld-cmake/test/functional/wallet_disable.py`), Python's default
behavior is to resolve the symlink of the script itself, setting
`sys.path[0]` to the directory containing the physical source file.
Consequently, the test framework util.py is imported from the source
tree, and `Path(__file__).parents[3]` correctly resolves to the source
root.
However, `feature_framework_testshell.py` is unique because it manually
inserts `Path(__file__).parent` into `sys.path`. That refers to the
build tree and when importing the test framework util.py, the
`Path(__file__).parents[3]` will incorrectly point to the build
directory instead of the source root.
Use `.resolve()` to ensure the Valgrind suppressions file path is always
calculated relative to the physical source file, regardless of how the
framework was imported.
fa6af856341384e4a84c5674e66fe7c1f13dd73c refactor: Use static_cast<decltype(...)> to suppress integer sanitizer warning (MarcoFalke)
fa692974ac2d69e01091f03625cd8a227e310065 util: Fix UB in SetStdinEcho when ENOTTY (MarcoFalke)
Pull request description:
The call to `tcgetattr` may fail with `ENOTTY`, leaving the struct possibly uninitialized (UB).
Fix this UB by returning early when `isatty` fails, or when `tcgetattr` fails. (Same for Windows)
This can be tested by a command that fails valgrind before the change and passes after:
```
echo 'pipe' | valgrind --quiet ./bld-cmake/bin/bitcoin-cli -stdinrpcpass uptime
ACKs for top commit:
achow101:
ACK fa6af856341384e4a84c5674e66fe7c1f13dd73c
l0rinc:
lightly tested code review ACK fa6af856341384e4a84c5674e66fe7c1f13dd73c
sedited:
ACK fa6af856341384e4a84c5674e66fe7c1f13dd73c
Tree-SHA512: 76e2fbcb6c323b17736ee057dbd5e932b2e8cbb7d9fe4488c1dc7ab6ea928a3cde7e72ca0a63f8c8c78871ccb8b669263b712c0e1b530d88f2d45ea41f071201
45133c589a970390bc10e4db4f7d9921dbaa0832 doc: clarify `git range-diff` add/delete output (Lőrinc)
Pull request description:
### Problem
Range diffs in git are useful after PR rebases, but it has an easy-to-misread failure mode: if it cannot match a commit between the old and new ranges, it will show the old commit as removed (<) and the new commit as added (>), without showing any patch contents for that commit.
It can look like there were no code changes when in reality the commit was just treated as unrelated and needs full re-review.
### Example
```bash
git fetch upstream ff338fdb53a66ab40a36e1277e7371941fc89840 dd76338a57b9b1169ac27f7b783d6d0d4c6e38ab
git range-diff ff338fdb53a6...dd76338a57b9
```
This produced output like:
```patch
1: 0ca4295f2e = 93: 139aa4b27e bench: add on-disk `HaveInputs` benchmark
2: 4b32181dbb < -: ---------- test: add `HaveInputs` call-path unit tests
-: ---------- > 94: 277c57f0c5 test: add `HaveInputs` call-path unit tests
3: 8c57687f86 ! 95: c0c94ec986 dbwrapper: have `Read` and `Exists` reuse `ReadRaw`
@@ Metadata
## Commit message ##
dbwrapper: have `Read` and `Exists` reuse `ReadRaw`
- `ExistsImpl` was removed since it duplicates `CDBWrapper::ReadImpl` (except that it copies the resulting string on success, but that will be needed for caching anyway).
+ `ExistsImpl` was removed since it duplicates `CDBWrapper::ReadImpl`.
```
Even though the subject matches, there is no diff shown because the commits did not match - the reviewer could think that only the commit message was changed.
This should be treated as **unmatched** rather than **unchanged**.
If you expected a match, you can try increasing the search effort:
```bash
git range-diff --creation-factor=95 ff338fdb53a6...dd76338a57b9
```
which would show for example:
```patch
1: 0ca4295f2e = 93: 139aa4b27e bench: add on-disk `HaveInputs` benchmark
2: 4b32181dbb ! 94: 277c57f0c5 test: add `HaveInputs` call-path unit tests
@@ Commit message
The tests document that `HaveInputs()` consults the cache first and that a cache miss pulls from the backing view via `GetCoin()`.
+ Co-authored-by: Novo <eunovo9@gmail.com>
+
## src/test/coins_tests.cpp ##
@@ src/test/coins_tests.cpp: BOOST_FIXTURE_TEST_CASE(ccoins_flush_behavior, FlushTest)
}
}
-+BOOST_AUTO_TEST_CASE(ccoins_haveinputs_cache_miss_uses_base_getcoin)
++BOOST_AUTO_TEST_CASE(ccoins_cache_behavior)
```
### Fix
This PR updates `doc/productivity.md` to raise awareness and document this pitfall and mentions `--creation-factor` as a knob to try when the output seems unexpectedly empty.
ACKs for top commit:
maflcko:
review ACK 45133c589a970390bc10e4db4f7d9921dbaa0832 🏦
Sjors:
ACK 45133c589a970390bc10e4db4f7d9921dbaa0832
rkrux:
crACK 45133c5
sedited:
ACK 45133c589a970390bc10e4db4f7d9921dbaa0832
Tree-SHA512: 52dcf6db51425a3ac9789627f80233fb1e3437f7a351acf4a761504d9917837aa1ff8c964605a842ee099fae9842946784f7603f9bffa7051429b2f04b7900be
8834e4e86c851c4eb01a7e978cdbde5b3cbd24cc test: remove appveyor reference in comment (Max Edwards)
Pull request description:
Appveyor is not longer used however the test still requires to check for permissions including 666 as otherwise the tests fail on Windows
Fixes: #32576
ACKs for top commit:
maflcko:
lgtm ACK 8834e4e86c851c4eb01a7e978cdbde5b3cbd24cc
hebasto:
ACK 8834e4e86c851c4eb01a7e978cdbde5b3cbd24cc.
Tree-SHA512: 655b44e52da5e0c6c11c79bb4f92c701c6e0e66dce8d7791ccf1d64e4561fe4d1d5f37c1317bead89c88e4d7208a278925168b419482a6be17abf93d0ebc5dfa
408d5b12e80151feded31c2a5509e2dc5f15edf3 test: include response body in non-JSON HTTP error msg (Matthew Zipkin)
9dc653b3b4f3049b0e742499b762f7c13bb006cc test: threadpool, add coverage for all Submit() errors (furszy)
ce2a984ee324d37ba1dd7c2c4e27e40e0508bedc test: cleanup, use HasReason in threadpool_tests.cpp (l0rinc)
d9c6769d0324b65121935b7c8a285c6421fe74a6 test: refactor, decouple HasReason from test framework machinery (furszy)
dbbb780af02d850a1f9257f18610cfb9de9cb828 test: move and simplify BOOST_CHECK ostream helpers (Hodlinator)
3b7cbcafcb9b318bf1fa00a3499f514c5ebe9bb6 test: ensure Stop() thread helps drain the queue (seduless)
ca101a2315774f0ed65da633ba99899fd0dad740 test: coverage for queued tasks completion after interrupt (furszy)
bf2c607aaa22d253b9367c11b0a198bd4244ad2f threadpool: active-wait during shutdown (furszy)
e88d2744301a434064714f0a21e1395d41ac3984 test: add threadpool Start-Stop race coverage (furszy)
8cd4a4363fb85f5487a19ace82aa0d12d5fab450 threadpool: guard against Start-Stop race (furszy)
9ff1e82e7dbdf31ddf1c534853da4581a1f41bd5 test: cleanup, block threads via semaphore instead of shared_future (l0rinc)
Pull request description:
A few follow-ups to #33689, includes:
1) `ThreadPool` active-wait during shutdown:
Instead of just waiting for workers to finish processing tasks, `Stop()` now helps them actively.
This speeds up the JSON-RPC and REST server shutdown, resulting in a faster node shutdown when many requests remain unhandled. This wasn't included in the original PR due to the behavior change this introduces.
2) Decouple `HasReason` from the unit test framework machinery
This avoids providing the entire unit test framework dependency to low-level tests that only require access to the `HasReason` utility class. Examples are: `reverselock_tests.cpp`, `sync_tests.cpp`, `util_check_tests.cpp`, `util_string_tests.cpp`, `script_parse_tests.cpp` and `threadpool_tests.cpp`. These tests no longer gain access to unnecessary components like the chainstate, node context, caches, etc. It includes l0rinc's `threadpool_tests.cpp` `HasReason` changes.
3) Include response body in non-JSON HTTP error messages
Straight from pinheadmz [comment](https://github.com/bitcoin/bitcoin/pull/33689#discussion_r2783817192), it makes debugging CI issues easier.
ACKs for top commit:
maflcko:
review ACK 408d5b12e80151feded31c2a5509e2dc5f15edf3 🕗
achow101:
ACK 408d5b12e80151feded31c2a5509e2dc5f15edf3
hodlinator:
re-ACK 408d5b12e80151feded31c2a5509e2dc5f15edf3
Tree-SHA512: 57aa0ef96886f32bf95a0bd7f87c878d31c9df9e34cb96de615eee703ce0824b5cfdf8f5c9cd19a3594559994295b5810c38c94f5efd6291cbbd83a95473357a
0a8d303d667cc10a68fd15d799039b9ae26c3315 test: fix test_limit_enforcement_package (Greg Sanders)
Pull request description:
The current test has a couple issues:
1) the parent_tx_good is regenerating the exact same transaction that is already in the cluster, so it's resulting in no replacements on submission
2) once fixed, the additional fee needs to be allocated to the parent transaction in the package, not the child. If the RBF fees are allocated to the child, this triggers the package RBF logic, which requires no in-mempool ancestors to be present.
Fix the bug and add a few assertions to protect against regressions.
ACKs for top commit:
bensig:
ACK 0a8d303d667cc10a68fd15d799039b9ae26c3315
achow101:
ACK 0a8d303d667cc10a68fd15d799039b9ae26c3315
sipa:
ACK 0a8d303d667cc10a68fd15d799039b9ae26c3315
Tree-SHA512: 0ba184d82edc5a502e9119a6876e80c4564c876fa51ee39293d47bd30c18bf3ded50fbd2f6f2a3394784fad05d8f6370a90682068b30358b077280abd2477252
17a079c2fb96c44f25d6392389db22a7f424f8f1 ci: fix vcpkg tools cache key collision between windows matrix jobs (will)
Pull request description:
The standard and fuzz matrix jobs share the same github.job value
(windows-native-dll), so both try to save the vcpkg tools cache with the
same key.
Since the tools are identical across build types, let them share a
single cache entry by restricting the save to the standard job only.
ACKs for top commit:
maflcko:
lgtm ACK 17a079c2fb96c44f25d6392389db22a7f424f8f1
hebasto:
ACK 17a079c2fb96c44f25d6392389db22a7f424f8f1, this should fix issues like https://github.com/bitcoin/bitcoin/pull/34559#issuecomment-3897330307.
Tree-SHA512: 2e83846bfc88947b4bc321baa23563e0c093cd4f55f11f8105c2ecf867b78028aa71aa4780f928103b7a9f1f2e3cf72dbb072f05e7925bc1d00069234acf23c9
The standard and fuzz matrix jobs share the same github.job value
(windows-native-dll), so both try to save the vcpkg tools cache with the
same key.
Since the tools are identical across build types, let them share a
single cache entry by restricting the save to the standard job only.
Avoid providing the entire unit test framework dependency to tests that only
require access to the HasReason utility class.
E.g. reverselock_tests.cpp, sync_tests.cpp, util_check_tests.cpp, util_string_tests.cpp,
and script_parse_tests.cpp only require access to HasReason and nothing else.
Move the operator<< overloads used by BOOST_CHECK_* out of the
unit test machinery test/setup_common, into test/util/common.h.
And replace the individual per-type ToString() overloads with
a single concept-constrained template that covers any type
exposing a ToString() method. This is important to not add
uint256.h and transaction_identifier.h dependencies to the
shared test/util/common.h file.
Co-authored-by: furszy <matiasfurszyfer@protonmail.com>
Instead of waiting for the workers to finish processing tasks, help
them actively inside Stop().
This speeds up the JSON-RPC and REST server shutdown procedure,
and results in a faster node shutdown when many requests remain unhandled
e5f0613503b6973dbc886eba8e999f208d84853b net processing: Check if we are in ibd before processing block for txdownloadman (sedited)
ce8b692897f6aacbe936fe2220e85f23cd83cbf2 Add functional test exercising tx downloadman recently confirmed filter (Lőrinc)
Pull request description:
Calculating the rolling bloom filters for the txorphanage takes some CPU time from the scheduler thread. This can be observed for example in [this flamegraph](https://bitcoin-dev-tools.github.io/benchcoin/results/pr-172/20066462508/mainnet-default-instrumented-base-flamegraph.svg?x=920203898521&y=780), where handling the filter takes about 2.6% of total time (and most of the scheduler thread's time).
During ibd the entries in the tx download bloom filter are just continuously rolled over and aren't consumed, since no mempool entries are created by incoming transactions from peers during ibd. The mempool does accept transactions via RPC, or the wallet at the time, however these don't interact with the orphanage and the txdownloadman, because adding anything to those is guarded by IsInitialBlockDownload() checks as well.
We're usually latching ibd to false a few blocks before catching up to the tip, so this should also not significantly degrade the performance of the filter once fully caught up.
ACKs for top commit:
l0rinc:
ACK e5f0613503b6973dbc886eba8e999f208d84853b
instagibbs:
ACK e5f0613503b6973dbc886eba8e999f208d84853b
fjahr:
Code review ACK e5f0613503b6973dbc886eba8e999f208d84853b
Tree-SHA512: d667e677f5723c438cdf5b34f0f9c1ade7cc1b2e98530c23f14384514daa38217c4e7c3b756194b6831b590a487449c4514b52bf0fb461ae8083061722824270
33fbaed310a6a37d41d26af8fb34308d088d72c8 policy: don't CheckEphemeralSpends on reorg (Greg Sanders)
Pull request description:
Similar reasoning to https://github.com/bitcoin/bitcoin/pull/33504
During a deeper reorg it's possible that a long sequence of dust-having transactions that are connected in a linear fashion. On reorg, this could cause each subsequent "generation" to be rejected. These rejected transactions may contain a large amount of competitive fees via normal means.
PreCheck based `PreCheckEphemeralSpends` is left in place because we wouldn't have relayed them prior to the reorg.
ACKs for top commit:
darosior:
re-ACK 33fbaed310a
ismaelsadeeq:
reACK 33fbaed310a6a37d41d26af8fb34308d088d72c8
sedited:
ACK 33fbaed310a6a37d41d26af8fb34308d088d72c8
Tree-SHA512: cf0a6945066e9f5f8f9a847394c2c1225facf475a8aa4bc811b436513eff79c0a720d4ad21ba6b0f1cc4dfdd61cf46acb148333ac592b2ee252953732326ad1d
744d47fcee0d32a71154292699bfdecf954a6065 clusterlin: adopt trained cost model (feature) (Pieter Wuille)
4eefdfc5b7d0b86a523683de2a90da910b77a106 clusterlin: rescale costs (preparation) (Pieter Wuille)
ecc9a84f854e5b77dfc8876cf7c9b8d0f3de89d0 clusterlin: use 'cost' terminology instead of 'iters' (refactor) (Pieter Wuille)
9e7129df2962f7c52d07c14a56398bb285cac084 clusterlin: introduce CostModel class (preparation) (Pieter Wuille)
Pull request description:
Part of #30289, replaces earlier #34138.
This introduces a more accurate cost model for SFL, to control how much CPU time is spent inside the algorithm for clusters that cannot be linearized perfectly within a reasonable amount of time.
The goal is having a metric for the amount of work performed, so that txmempool can impose limits on that work: a lower bound that is always performed (unless optimality is reached before that point, of course), and an upper bound to limit the latency and total CPU time spent on this. There are conflicting design goals here:
* On the one hand, it seems ideal if this metric is closely correlated to actual CPU time, because otherwise the limits become inaccurate.
* On the other hand, it seems a nightmare to have the metric be platform/system dependent, as it makes network-wide reasoning nearly impossible. It's expected that slower systems take longer to do the same thing; this holds for everything, and we don't need to compensate for this.
There are multiple solutions to this:
* One extreme is just measuring the time. This is very accurate, but extremely platform dependent, and also non-deterministic due to random scheduling/cache effects.
* The other extreme is using a very abstract metric like counting how many times certain loops/function inside the algorithm run. That is what is implemented in master right now, just counting the sum of the numbers of transactions updated across all `UpdateChunks()` calls. It however necessarily fails to account for significant portions of runtime spent elsewhere, resulting in a rather wide range of "ns per cost" values.
* This PR takes a middle ground, counting many function calls / branches / loops, with weights that were determined through benchmarking on an average on a number of systems.
Specifically, the cost model was obtained by:
* For a variety of machines:
* Running a fixed collection of ~385000 clusters found through random generation and fuzzing, optimizing for difficulty of linearization.
* Linearize each 1000-5000 times, with different random seeds. Sometimes without input linearization, sometimes with a bad one.
* Gather cycle counts for each of the operations included in this cost model, broken down by their parameters.
* Correct the data by subtracting the runtime of obtaining the cycle count.
* Drop the 5% top and bottom samples from each cycle count dataset, and compute the average of the remaining samples.
* For each operation, fit a least-squares linear function approximation through the samples.
* Rescale all machine expressions to make their total time match, as we only care about relative cost of each operation.
* Take the per-operation average of operation expressions across all machines, to construct expressions for an average machine.
* Approximate the result with integer coefficients.
The benchmarks were performed by `l0rinc <pap.lorinc@gmail.com>` and myself, on AMD Ryzen 5950X, AMD Ryzen 7995WX, AMD Ryzen 9980X, Apple M4 Max, Intel Core i5-12500H, Intel Core Ultra 7 155H, Intel N150 (Umbrel), Intel Core i7-7700, Intel Core i9-9900K, Intel Haswell (VPS, virtualized), Intel Xeon E5-2637, ARM Cortex-A76 (Raspberry Pi 5), ARM Cortex-A72 (Raspberry Pi 4).
Based on final benchmarking, the "acceptable" iteration count (which is the minimum spent on every cluster) is to 75000 units, which corresponds to roughly 50 μs on Ryzen 5950X and similar modern desktop hardware.
ACKs for top commit:
instagibbs:
ACK 744d47fcee0d32a71154292699bfdecf954a6065
murchandamus:
reACK 744d47fcee0d32a71154292699bfdecf954a6065
Tree-SHA512: 5cb37a6bdd930389937c435f910410c3581e53ce609b9b594a8dc89601e6fca6e6e26216e961acfe9540581f889c14bf289b6a08438a2d7adafd696fc81ff517
3281824ecfa72c4f69ab69c94003b7f5a82c7265 fuzz: prevent invalid `FRESH` entries and surface `BatchWrite` errors (Lőrinc)
780f460635af86b91c4215e761b6895be762ed3e fuzz: avoid invalid `AddCoin` overwrites (Lőrinc)
d7e0d510f2bf2981e92e3b323aeba1c845377950 fuzz: make `AddCoins` query view for overwrites (Lőrinc)
b8fa6f0f701f04cffca6a085337b508381016649 util: introduce `TrySub` to prevent unsigned underflow (Lőrinc)
Pull request description:
### Problem
This is an alternative approach to #34647, fixes#34645.
### Fix
First, add `CheckedSub` and use it for decrements of `m_dirty_count` and `cachedCoinsUsage`, so unsigned underflows turn into immediate failures instead of silently wrapping and only failing later.
<details><summary>Assertion `j <= i' failed.</summary>
```bash
util/overflow.h:44 T CheckedSub(const T, const U) [T = unsigned long, U = bool]: Assertion `j <= i' failed.
==72817== ERROR: libFuzzer: deadly signal
#0 0x556e9225eab5 in __sanitizer_print_stack_trace (/mnt/my_storage/bitcoin/build_fuzz/bin/fuzz+0x191dab5) (BuildId: d77c4d5f9dfd38ea06fab463f49341735205e109)
#1 0x556e921acafc in fuzzer::PrintStackTrace() (/mnt/my_storage/bitcoin/build_fuzz/bin/fuzz+0x186bafc) (BuildId: d77c4d5f9dfd38ea06fab463f49341735205e109)
#2 0x556e92191bb7 in fuzzer::Fuzzer::CrashCallback() (/mnt/my_storage/bitcoin/build_fuzz/bin/fuzz+0x1850bb7) (BuildId: d77c4d5f9dfd38ea06fab463f49341735205e109)
#3 0x7164cfc458cf (/lib/x86_64-linux-gnu/libc.so.6+0x458cf) (BuildId: ae7440bbdce614e0e79280c3b2e45b1df44e639c)
#4 0x7164cfca49bb in __pthread_kill_implementation nptl/pthread_kill.c:43:17
#5 0x7164cfca49bb in __pthread_kill_internal nptl/pthread_kill.c:89:10
#6 0x7164cfca49bb in pthread_kill nptl/pthread_kill.c💯10
#7 0x7164cfc4579d in raise signal/../sysdeps/posix/raise.c:26:13
#8 0x7164cfc288cc in abort stdlib/abort.c:73:3
#9 0x556e92f9d591 in assertion_fail(std::source_location const&, std::basic_string_view<char, std::char_traits<char>>) /mnt/my_storage/bitcoin/src/util/check.cpp:41:5
#10 0x556e9250daf0 in bool&& inline_assertion_check<false, bool>(bool&&, std::source_location const&, std::basic_string_view<char, std::char_traits<char>>) /mnt/my_storage/bitcoin/src/util/check.h:90:13
#11 0x556e9250daf0 in unsigned long CheckedSub<unsigned long, bool>(unsigned long, bool) /mnt/my_storage/bitcoin/src/util/overflow.h:44:5
#12 0x556e9250daf0 in CoinsViewCacheCursor::NextAndMaybeErase(std::pair<COutPoint const, CCoinsCacheEntry>&) /mnt/my_storage/bitcoin/src/coins.h:282:25
#13 0x556e92507eb2 in (anonymous namespace)::MutationGuardCoinsViewCache::BatchWrite(CoinsViewCacheCursor&, uint256 const&) /mnt/my_storage/bitcoin/src/test/fuzz/coins_view.cpp:90:75
#14 0x556e92c17a2b in CCoinsViewCache::Flush(bool) /mnt/my_storage/bitcoin/src/coins.cpp:282:11
#15 0x556e924fb732 in TestCoinsView(FuzzedDataProvider&, CCoinsViewCache&, CCoinsView&, bool)::$_1::operator()() const /mnt/my_storage/bitcoin/src/test/fuzz/coins_view.cpp:135:34
#16 0x556e924fb732 in unsigned long CallOneOf<TestCoinsView(FuzzedDataProvider&, CCoinsViewCache&, CCoinsView&, bool)::$_0, TestCoinsView(FuzzedDataProvider&, CCoinsViewCache&, CCoinsView&, bool)::$_1, TestCoinsView(FuzzedDataProvider&, CCoinsViewCache&, CCoinsView&, bool)::$_2, TestCoinsView(FuzzedDataProvider&, CCoinsViewCache&, CCoinsView&, bool)::$_3, TestCoinsView(FuzzedDataProvider&, CCoinsViewCache&, CCoinsView&, bool)::$_4, TestCoinsView(FuzzedDataProvider&, CCoinsViewCache&, CCoinsView&, bool)::$_5, TestCoinsView(FuzzedDataProvider&, CCoinsViewCache&, CCoinsView&, bool)::$_6, TestCoinsView(FuzzedDataProvider&, CCoinsViewCache&, CCoinsView&, bool)::$_7, TestCoinsView(FuzzedDataProvider&, CCoinsViewCache&, CCoinsView&, bool)::$_8, TestCoinsView(FuzzedDataProvider&, CCoinsViewCache&, CCoinsView&, bool)::$_9, TestCoinsView(FuzzedDataProvider&, CCoinsViewCache&, CCoinsView&, bool)::$_10, TestCoinsView(FuzzedDataProvider&, CCoinsViewCache&, CCoinsView&, bool)::$_11>(FuzzedDataProvider&, TestCoinsView(FuzzedDataProvider&, CCoinsViewCache&, CCoinsView&, bool)::$_0, TestCoinsView(FuzzedDataProvider&, CCoinsViewCache&, CCoinsView&, bool)::$_1, TestCoinsView(FuzzedDataProvider&, CCoinsViewCache&, CCoinsView&, bool)::$_2, TestCoinsView(FuzzedDataProvider&, CCoinsViewCache&, CCoinsView&, bool)::$_3, TestCoinsView(FuzzedDataProvider&, CCoinsViewCache&, CCoinsView&, bool)::$_4, TestCoinsView(FuzzedDataProvider&, CCoinsViewCache&, CCoinsView&, bool)::$_5, TestCoinsView(FuzzedDataProvider&, CCoinsViewCache&, CCoinsView&, bool)::$_6, TestCoinsView(FuzzedDataProvider&, CCoinsViewCache&, CCoinsView&, bool)::$_7, TestCoinsView(FuzzedDataProvider&, CCoinsViewCache&, CCoinsView&, bool)::$_8, TestCoinsView(FuzzedDataProvider&, CCoinsViewCache&, CCoinsView&, bool)::$_9, TestCoinsView(FuzzedDataProvider&, CCoinsViewCache&, CCoinsView&, bool)::$_10, TestCoinsView(FuzzedDataProvider&, CCoinsViewCache&, CCoinsView&, bool)::$_11) /mnt/my_storage/bitcoin/src/test/fuzz/util.h:42:27
#17 0x556e924fb732 in TestCoinsView(FuzzedDataProvider&, CCoinsViewCache&, CCoinsView&, bool) /mnt/my_storage/bitcoin/src/test/fuzz/coins_view.cpp:114:9
#18 0x556e92503b0c in coins_view_overlay_fuzz_target(std::span<unsigned char const, 18446744073709551615ul>) /mnt/my_storage/bitcoin/src/test/fuzz/coins_view.cpp:404:5
#19 0x556e92bcb7a5 in std::function<void (std::span<unsigned char const, 18446744073709551615ul>)>::operator()(std::span<unsigned char const, 18446744073709551615ul>) const /usr/lib/gcc/x86_64-linux-gnu/15/../../../../include/c++/15/bits/std_function.h:593:9
#20 0x556e92bcb7a5 in test_one_input(std::span<unsigned char const, 18446744073709551615ul>) /mnt/my_storage/bitcoin/src/test/fuzz/fuzz.cpp:88:5
#21 0x556e92bcb7a5 in LLVMFuzzerTestOneInput /mnt/my_storage/bitcoin/src/test/fuzz/fuzz.cpp:216:5
#22 0x556e9219318f in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) (/mnt/my_storage/bitcoin/build_fuzz/bin/fuzz+0x185218f) (BuildId: d77c4d5f9dfd38ea06fab463f49341735205e109)
#23 0x556e92192799 in fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*, bool, bool*) (/mnt/my_storage/bitcoin/build_fuzz/bin/fuzz+0x1851799) (BuildId: d77c4d5f9dfd38ea06fab463f49341735205e109)
#24 0x556e92194139 in fuzzer::Fuzzer::MutateAndTestOne() (/mnt/my_storage/bitcoin/build_fuzz/bin/fuzz+0x1853139) (BuildId: d77c4d5f9dfd38ea06fab463f49341735205e109)
#25 0x556e92194c95 in fuzzer::Fuzzer::Loop(std::vector<fuzzer::SizedFile, std::allocator<fuzzer::SizedFile>>&) (/mnt/my_storage/bitcoin/build_fuzz/bin/fuzz+0x1853c95) (BuildId: d77c4d5f9dfd38ea06fab463f49341735205e109)
#26 0x556e92181255 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) (/mnt/my_storage/bitcoin/build_fuzz/bin/fuzz+0x1840255) (BuildId: d77c4d5f9dfd38ea06fab463f49341735205e109)
#27 0x556e921ad696 in main (/mnt/my_storage/bitcoin/build_fuzz/bin/fuzz+0x186c696) (BuildId: d77c4d5f9dfd38ea06fab463f49341735205e109)
#28 0x7164cfc2a577 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
#29 0x7164cfc2a63a in __libc_start_main csu/../csu/libc-start.c:360:3
#30 0x556e921757e4 in _start (/mnt/my_storage/bitcoin/build_fuzz/bin/fuzz+0x18347e4) (BuildId: d77c4d5f9dfd38ea06fab463f49341735205e109)
NOTE: libFuzzer has rudimentary signal handlers.
Combine libFuzzer with AddressSanitizer or similar for better crash reports.
SUMMARY: libFuzzer: deadly signal
MS: 2 PersAutoDict-CopyPart- DE: "\005\000"-; base unit: ecb626aff8724f0fdde38a0a6965718f2096d474
artifact_prefix='/tmp/fuzz_artifacts/'; Test unit written to /tmp/fuzz_artifacts/crash-1d19026c1a23f08bfe693fd684a56ce51187c6e5
./build_fuzz/bin/fuzz /tmp/fuzz_corpus/coins_view_overlay -max_total_time=3600 -rss_limit_mb=2560 -artifact_prefix=/tmp/fuzz_artifacts/ >fuzz-16.log 2>&1
```
</details>
The coins view fuzz targets can call `AddCoin`/`AddCoins` and construct `BatchWrite` cursors in ways that violate `CCoinsViewCache` caller contracts. These invalid states can trigger `BatchWrite` `std::logic_error` and can desync dirty-entry accounting (caught by `Assume(m_dirty_count == 0)` currently).
Make the fuzzer avoid generating invalid states instead of catching and resetting:
* Derive `AddCoin`’s `possible_overwrite` from `PeekCoin`, so `possible_overwrite=false` is only used when the outpoint is absent - similarly to 67c0d1798e/src/test/fuzz/coinscache_sim.cpp (L312-L317)
- Only use `AddCoins(check=false)` when we have confirmed the txid has no unspent outputs; otherwise fall back to `check=true` so `AddCoins` determines overwrites via the view.
- When constructing a `CoinsViewCacheCursor`, avoid setting `FRESH` when the parent already has an unspent coin, and ensure `FRESH` implies `DIRTY`.
### Fuzzing
The original error could be reproduced in ~10 minutes using `coins_view_overlay`. I ran the `coins_view`, `coins_view_db`, `coins_view_overlay`, and `coinscache_sim` fuzzers for this PR overnight and they didn't fail anymore.
ACKs for top commit:
achow101:
ACK 3281824ecfa72c4f69ab69c94003b7f5a82c7265
sipa:
ACK 3281824ecfa72c4f69ab69c94003b7f5a82c7265. Ran the 4 relevant fuzz tests for ~1 CPU day each. Will run more overnight.
andrewtoth:
ACK 3281824ecfa72c4f69ab69c94003b7f5a82c7265
Tree-SHA512: b8155e8d21740eb7800e373c27a8a1457eb84468c24af879bac5a1ed251ade2aec99c34a350a31f2ebb74e41bb7380bf20214d38d14fe23310a43282d2434fb7
5cd57943b8adc76ed0b8a75a83f27bc0f971cbef test: verify node state after restart in assumeutxo (Yash Bhutwala)
Pull request description:
## Summary
This PR replaces the TODO comment in `wallet_assumeutxo.py` (line 242) with actual test assertions that verify node and wallet behavior after a restart during assumeutxo background sync.
## Changes
The new tests verify:
- Two chainstates exist (background validation not complete)
- Background chainstate is still at `START_HEIGHT`
- Snapshot chainstate has synced to at least `PAUSE_HEIGHT`
- Wallets cannot be loaded after restart (expected behavior)
- Wallet backup from before snapshot height cannot be restored
## Motivation
During implementation, I discovered that **wallets cannot be loaded after a node restart during assumeutxo background sync**. This is expected behavior because:
- The wallet loading code checks if required blocks are available for rescanning
- During assumeutxo background sync, blocks before the snapshot are not available
- This applies to all wallets, including watch-only wallets created at the snapshot height
This is a valuable test addition because it documents this expected behavior and ensures it doesn't regress. Users should be aware that if they restart their node during assumeutxo background sync, they won't be able to load their wallets until the background sync completes.
## Related
refs #28648
Addresses the TODO comment that was originally added as part of the assumeutxo wallet test implementation.
ACKs for top commit:
Bicaru20:
Code review ACK 5cd57943b8adc76ed0b8a75a83f27bc0f971cbef
achow101:
ACK 5cd57943b8adc76ed0b8a75a83f27bc0f971cbef
fjahr:
ACK 5cd57943b8adc76ed0b8a75a83f27bc0f971cbef
sedited:
ACK 5cd57943b8adc76ed0b8a75a83f27bc0f971cbef
polespinasa:
code lgtm ACK 5cd57943b8adc76ed0b8a75a83f27bc0f971cbef
Tree-SHA512: 4a125c5247168da2bbf4d855b4150ca453bb5e4cce1a62e633ce5e43acdc2c58883a6a94dcc46b38f8b4c44206fe42cec4db151a76aded53d8ea433ea5eb2562
50cf6838e6aa51e0d712cbc1e13d686253bc8fe0 wallet: rpc: manpage: fix example missing `fee_rate` argument (SomberNight)
Pull request description:
The function signature for the `send` RPC is:
```
send [{"address":amount,...},{"data":"hex"},...] ( conf_target "estimate_mode" fee_rate options version )
```
The last example in the manpage is missing the `fee_rate` arg, but is trying to specify the `options` arg, by index. The parser confuses the intended `options` arg as the missing `fee_rate` arg.
See:
```
$ bitcoin-cli -rpcuser=doggman -rpcpassword=donkey -rpcport=18554 -regtest send '{"bcrt1qusm48zmlzwr32csxdw4ar7atw260h22c9ten9l": 0.1}' 1 economical '{"add_to_wallet": false, "inputs": [{"txid":"0b7e1a471dc948b7a6187936b16e6d7d9833629b2f9dd8a392eb89928f63aaad", "vout":0}]}'
error code: -8
error message:
Cannot specify both conf_target and fee_rate. Please provide either a confirmation target in blocks for automatic fee estimation, or an explicit fee rate.
```
vs
```
$ bitcoin-cli -rpcuser=doggman -rpcpassword=donkey -rpcport=18554 -regtest send '{"bcrt1qusm48zmlzwr32csxdw4ar7atw260h22c9ten9l": 0.1}' 1 economical null '{"add_to_wallet": false, "inputs": [{"txid":"0b7e1a471dc948b7a6187936b16e6d7d9833629b2f9dd8a392eb89928f63aaad", "vout":0}]}'
{
"psbt": "cHNidP8BAHECAAAAAa2qY4+SieuSo9idL5tiM5h9bW6xNnkYprdIyR1HGn4LAAAAAAD9////AkR2DwQAAAAAFgAUpLDwJu+wFRHLQAgKAb0psk7UVd2AlpgAAAAAABYAFOQ3U4t/E4cVYgZrq9H7q3K0+6lYAAAAAAABAIUCAAAAAQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wQC4wMA/////wLIF6gEAAAAABYAFLMY1zihXrefAA0DA5nld4MCPjkrAAAAAAAAAAAmaiSqIant4vYcP3HR3v0/qZnfo2lTdVxpBol5mWK0i+vYNpdOjPkAAAAAAQEfyBeoBAAAAAAWABSzGNc4oV63nwANAwOZ5XeDAj45KwEIawJHMEQCIElTV4pbUrsPR9qHWcioowVv3QVWHizxwevfD0u/I8YyAiBCY3OzF81PSLM00h4ueQkehYuxDFZu7Jk51iejphKnnwEhA0VKdYVSyBpWoxBwTDOupB58Fi3mEBs+u+OOqEYVd2sZACICA98YLWyH7dBCfXVxe7woiLSTgV1mJN8Zc8KgZ77pVSg+GNBMeT5UAACAAQAAgAAAAIABAAAAbAAAAAAA",
"txid": "625b71b314a6ac4f738634e29dc007cd5edc0427c1ae96ab706d06a62910cea2",
"hex": "02000000000101adaa638f9289eb92a3d89d2f9b6233987d6d6eb1367918a6b748c91d471a7e0b0000000000fdffffff0244760f0400000000160014a4b0f026efb01511cb40080a01bd29b24ed455dd8096980000000000160014e437538b7f13871562066babd1fbab72b4fba9580247304402204953578a5b52bb0f47da8759c8a8a3056fdd05561e2cf1c1ebdf0f4bbf23c6320220426373b317cd4f48b334d21e2e79091e858bb10c566eec9939d627a3a612a79f012103454a758552c81a56a310704c33aea41e7c162de6101b3ebbe38ea84615776b1900000000",
"complete": true
}
```
ACKs for top commit:
svanstaa:
tACK 50cf6838e6
kannapoix:
Tested ACK 50cf6838e6aa51e0d712cbc1e13d686253bc8fe0
achow101:
ACK 50cf6838e6aa51e0d712cbc1e13d686253bc8fe0
rkrux:
tACK 50cf6838e6aa51e0d712cbc1e13d686253bc8fe0
theStack:
Tested ACK 50cf6838e6aa51e0d712cbc1e13d686253bc8fe0
Tree-SHA512: 499701729038cd863b612698098a73ce995589fc5ab08a2962f8edf1ff3cb3de6f7090e04722ca13ba7707a566fa3750ae549b6ad55750a3d01127eb6b94a79f
Stop() has two windows where Start() could cause troubles:
1) m_workers is temporarily empty while workers are being joined,
this creates a window where Start() could slip through and reset
m_interrupt to false, preventing the old workers from exiting and
causing a deadlock.
2) Start() could be called after workers are joined but before the
empty() sanity check on m_work_queue, causing a crash.
Fix both races by keeping m_interrupt set for the entire duration
of Stop(), so any concurrent Start() call is rejected until all
workers have exited.
Co-authored-by: Hodlinator <172445034+hodlinator@users.noreply.github.com>
e8f8b74a46aa075bf6c74c104fd572cc89d3b53b test: index, improve txospenderindex_initial_sync() test code (furszy)
ac3bea07cdceac9e316448a9a5f190848156efd5 test: improve rpc_gettxspendingprevout.py code (furszy)
Pull request description:
Fixes#34637.
Was reviewing #34637 and, while reading the new txospender index
test code for the first time, found it could use some cleanups. Finding
stuff in there is harder than it should be due to the amount of dup code.
The first commit cleans up `rpc_gettxspendingprevout.py` by introducing
helper functions to avoid repeating the same dicts everywhere, using
for-loops instead of duplicating the same checks for each node, and
renaming variables to better reflect what they actually represent.
The second commit reorganizes `txospenderindex_initial_sync()`
moving index initialization after the test setup phase, since the index
doesn't participate in it anyway. It adds a post-sync check to catch
cases where `Sync()` aborted prematurely.
Note:
This is just a pre-work for deeper index changes I'm cooking.
ACKs for top commit:
achow101:
ACK e8f8b74a46aa075bf6c74c104fd572cc89d3b53b
sedited:
Re-ACK e8f8b74a46aa075bf6c74c104fd572cc89d3b53b
w0xlt:
reACK e8f8b74a46aa075bf6c74c104fd572cc89d3b53b
Tree-SHA512: 3f7026712ab20a43f376afa28c683dcd5daec8ed1bbf1c36d7ec6bbf231f468d4de74efae4aa8295ff3afb83986286ccaf31c03b34e45fc9971652f064791ed0
This fixes ``Assertion `m_node.chainman' failed`` errors first reported
https://github.com/bitcoin/bitcoin/issues/33994#issuecomment-3602551596 when
IPC mining methods are called before ChainstateManager is loaded.
The fix works by making the `Init.makeMining` method block until chainstate
data is loaded.
Instead of having the InitAndLoadChainstate function delete and create the
KernelNotifications object each time it is called (it can be called twice when
reindexing) to clear cached state, create it just one time and add a
setChainstateLoaded() method to manage state as it is loaded and unloaded.
This refactoring should make sense by itself to be more explicit about how
KernelNotifications state is cleared, but it's also needed to make outside code
accessing KernelNotifications state (currently just mining code) safe during
node startup and shutdown so the KernelNofications mutex can be used for
synchronization and does not get recreated itself.
This parametrizes the cost model for the SFL algorithm with another
class. Right now, the behavior of that class matches the naive cost
model so far, but it will be replaced with a more advanced on in a
future commit.
The reason for abstracting this out is that it makes benchmarking for
creating such cost models easy, by instantiating the cost model class
with one that tracks time.
The index is now initialized after the setup phase (chain generation
and txs creation), since it doesn't participate on it at all.
This improves readability and splits setup from what we actually
want to check.
This also adds a check after Sync() to verify the index best block hash
matches the tip, so we know it fully synced before checking the
processed data. This will help catching errors as Sync() could have
aborted prematurely.
As a happy side effect, the SyncWithValidationInterfaceQueue() call at
the end of the test is no longer needed and has been removed.