2024 Commits

Author SHA1 Message Date
Fabian Jahr
fa41fc6a1a
refactor: Operate on bytes instead of bits in Asmap code
Co-authored-by: Hodlinator <172445034+hodlinator@users.noreply.github.com>
2026-01-20 23:45:28 +01:00
Ava Chow
f7e88e298a
Merge bitcoin/bitcoin#32471: wallet/rpc: fix listdescriptors RPC fails to return descriptors with private key information when wallet contains descriptors missing any key
9c7e4771b13d4729fd20ea08b7e2e3209b134fff test: Test listdescs with priv works even with missing priv keys (Novo)
ed945a685473712c1a822379effa42fd49223515 walletrpc: reject listdes with priv key on w-only wallets (Novo)
9e5e9824f11b1b0f9e2a4e28124edbb1616af519 descriptor: ToPrivateString() pass if  at least 1 priv key exists (Novo)
5c4db25b61d417a567f152169f4ab21a491afb95 descriptor: refactor ToPrivateString for providers (Novo)
2dc74e3f4e5e6f01c8810359b91041bc6865f1c7 wallet/migration: use HavePrivateKeys in place of ToPrivateString (Novo)
e842eb90bb6db39076a43b010c0c7898d50b8d92 descriptors: add HavePrivateKeys() (Novo)

Pull request description:

  _TLDR:
  Currently, `listdescriptors [private=true]` will fail for a non-watch-only wallet if any descriptor has a missing private key(e.g `tr()`, `multi()`, etc.). This PR changes that while making sure `listdescriptors [private=true]` still fails if there no private keys. Closes #32078_

  In non-watch-only wallets, it's possible to import descriptors as long as at least one private key is included. It's important that users can still view these descriptors when they need to create a backup—even if some private keys are missing ([#32078 (comment)](https://github.com/bitcoin/bitcoin/issues/32078#issuecomment-2781428475)). This change makes it possible to do so.

  This change also helps prevent `listdescriptors true` from failing completely, because one descriptor is missing some private keys.

  ### Notes
  - The new behaviour is applied to all descriptors including miniscript descriptors
  - `listdescriptors true` still fails for watch-only wallets to preserve existing behaviour https://github.com/bitcoin/bitcoin/pull/24361#discussion_r920801352
  - Wallet migration logic previously used `Descriptor::ToPrivateString()` to determine which descriptor was watchonly. This means that modifying the `ToPrivateString()` behaviour caused descriptors that were previously recognized as "watchonly" to be "non-watchonly". **In order to keep the scope of this PR limited to the RPC behaviour, this PR uses a different method to determine `watchonly` descriptors for the purpose of wallet migration.** A follow-up PR can be opened to update migration logic to exclude descriptors with some private keys from the `watchonly` migration wallet.

  ### Relevant PRs
  https://github.com/bitcoin/bitcoin/pull/24361
  https://github.com/bitcoin/bitcoin/pull/32186

  ### Testing
  Functional tests were added to test the new behaviour

  EDIT
  **`listdescriptors [private=true]` will still fail when there are no private keys because non-watchonly wallets must have private keys and calling `listdescriptors [private=true]` for watchonly wallet returns an error**

ACKs for top commit:
  Sjors:
    ACK 9c7e4771b13d4729fd20ea08b7e2e3209b134fff
  achow101:
    ACK 9c7e4771b13d4729fd20ea08b7e2e3209b134fff
  w0xlt:
    reACK 9c7e4771b1 with minor nits
  rkrux:
    re-ACK 9c7e4771b13d4729fd20ea08b7e2e3209b134fff

Tree-SHA512: f9b3b2c3e5425a26e158882e39e82e15b7cb13ffbfb6a5fa2868c79526e9b178fcc3cd88d3e2e286f64819d041f687353780bbcf5a355c63a136fb8179698b60
2026-01-20 12:17:19 -08:00
merge-script
7f5ebef56a
Merge bitcoin/bitcoin#34302: fuzz: Restore SendMessages coverage in process_message(s) fuzz targets
fabf8d1c5bdb6a944762792cdc762caa6c1a760b fuzz: Restore SendMessages coverage in process_message(s) fuzz targets (MarcoFalke)
fac7fed397f0fa3924600c1806a46d235a62ee5f refactor: Use std::reference_wrapper<AddrMan> in Connman (MarcoFalke)

Pull request description:

  *Found and reported by Crypt-iQ (thanks!)*

  Currently the process_message(s) fuzz targets do not have any meaningful `SendMessages` code coverage. This is not ideal.

  Fix the problem by adding back the coverage, and by hardening the code here, so that the problem hopefully does not happen again in the future.

  ### Historic context for this regression

  The regression was introduced in commit fa11eea4059a608f591db4469c07a341fd33a031, which built a new deterministic peerman object. However, the patch was incomplete, because it was missing one hunk to replace `g_setup->m_node.peerman->SendMessages(&p2p_node);` with `peerman->SendMessages(&p2p_node);`.

  This means the stale and empty peerman from the node context and not the freshly created and deterministic peerman was used.

  A simple fix would be to just submit the missing patch hunk. However, this still leaves the risk that the issue is re-introduced at any time in the future. So instead, I think the stale and empty peerman should be de-constructed, so that any call to it will lead to a hard sanitizer error and fuzz failure.

  Doing that also uncovered another issue: The connman was holding on to a reference to a stale and empty addrman.

  So fix all issues by:

  * Allowing the addrman reference in connman to be re-seatable
  * Clearing all stale objects, before creating new objects, and then using references to the new objects in all code

ACKs for top commit:
  Crypt-iQ:
    crACK fabf8d1c5bdb6a944762792cdc762caa6c1a760b
  frankomosh:
    ACK fabf8d1c5bdb6a944762792cdc762caa6c1a760b
  marcofleon:
    code review ACK fabf8d1c5bdb6a944762792cdc762caa6c1a760b
  sedited:
    ACK fabf8d1c5bdb6a944762792cdc762caa6c1a760b

Tree-SHA512: 2e478102b3e928dc7505f00c08d4b9e4f8368407b100bc88f3eb3b82aa6fea5a45bae736c211f5af1551ca0de1a5ffd4a5d196d9473d4c3b87cfed57c9a0b69d
2026-01-20 16:45:18 +01:00
merge-script
c57fbbe99d
Merge bitcoin/bitcoin#31650: refactor: Avoid copies by using const references or by move-construction
fa64d8424b8de49e219bffb842a33d484fb03212 refactor: Enforce readability-avoid-const-params-in-decls (MarcoFalke)
faf0c2d942c8de7868a3fd3afc7fc9ea700c91d4 refactor: Avoid copies by using const references or by move-construction (MarcoFalke)

Pull request description:

  Top level `const` in declarations is problematic for many reasons:

  * It is often a typo, where one wanted to denote a const reference. For example `bool PSBTInputSignedAndVerified(const PartiallySignedTransaction psbt, ...` is missing the `&`. This will create a redundant copy of the value.
  * In constructors it prevents move construction.
  * It can incorrectly imply some data is const, like in an imaginary example `std::span<int> Shuffle(const std::span<int>);`, where the `int`s are *not* const.
  * The compiler ignores the `const` from the declaration in the implementation.
  * It isn't used consistently anyway, not even on the same line.

  Fix some issues by:

  * Using a const reference to avoid a copy, where read-only of the value is intended. This is only done for values that may be expensive to copy.
  * Using move-construction to avoid a copy
  * Applying `readability-avoid-const-params-in-decls` via clang-tidy

ACKs for top commit:
  l0rinc:
    diff reACK fa64d8424b8de49e219bffb842a33d484fb03212
  hebasto:
    ACK fa64d8424b8de49e219bffb842a33d484fb03212, I have reviewed the code and it looks OK.
  sedited:
    ACK fa64d8424b8de49e219bffb842a33d484fb03212

Tree-SHA512: 293c000b4ebf8fdcc75259eb0283a2e4e7892c73facfb5c3182464d6cb6a868b7f4a6682d664426bf2edecd665cf839d790bef0bae43a8c3bf1ddfdd3d068d38
2026-01-19 11:44:04 +01:00
MarcoFalke
fabf8d1c5b
fuzz: Restore SendMessages coverage in process_message(s) fuzz targets 2026-01-15 15:17:12 +01:00
merge-script
d08c1b3ed9
Merge bitcoin/bitcoin#34288: fuzz: Exclude too expensive inputs in miniscript_string target
fac70ea8b5bb33d05a47c36f2c5f1d79f119315c fuzz: Exclude too expensive inputs in miniscript_string target (MarcoFalke)
fa907864786056258302a611bf4df0319138a71b iwyu: Fix includes for test/fuzz/util/descriptor module (MarcoFalke)

Pull request description:

  Fixes https://github.com/bitcoin/bitcoin/issues/30498

  Accepting "expensive" fuzz inputs which have no real use-case is problematic, because it prevents the fuzz engine from spending time on the next useful fuzz input.

  For example this one will take several seconds (the flamegraph shows the time is spent in minscipt `NoDupCheck`):

  ```
  curl -fLO '41bae50cff'
  FUZZ=miniscript_string /usr/bin/time   ./bld-cmake/bin/fuzz  ./41bae50cffd1741150a1b330d02ab09f46ff8cd1
  ```

  Inspecting the inputs shows that it has many sub frags, so rejecting based on `HasTooManySubFrag` should be sufficient.

ACKs for top commit:
  darosior:
    ACK fac70ea8b5bb33d05a47c36f2c5f1d79f119315c
  brunoerg:
    code review ACK fac70ea8b5bb33d05a47c36f2c5f1d79f119315c
  dergoegge:
    utACK fac70ea8b5bb33d05a47c36f2c5f1d79f119315c

Tree-SHA512: 7f1e0d9ce24d67ec63e5b7c2dd194efa51f38beb013564690afe0f920e5ff1980c85ce344828c0dc3f34b6851db7fe72a76b1a775c6d51c94fb91431834f453b
2026-01-15 13:55:27 +00:00
merge-script
baa554f708
Merge bitcoin/bitcoin#34259: Find minimal chunks in SFL
da56ef239b12786e3a177afda14352dda4a70bc6 clusterlin: minimize chunks (feature) (Pieter Wuille)

Pull request description:

  Part of #30289.

  This was split off from #34023, because it's not really an optimization but a feature. The feature existed pre-SFL, so this brings SFL to parity in terms of functionality with the old code.

  The idea is that while optimality - as achieved by SFL before this PR - guarantees a linearization whose feerate diagram is optimal, it may be possible to split chunks into smaller equal-feerate parts. This is desirable because even though it doesn't change the diagram, it provides more flexibility for optimization (binpacking is easier when the pieces are smaller).

  Thus, this PR introduces the stronger notion of "minimality": optimal chunks, which are also split into their smallest possible pieces. To accomplish that, an additional step in the SFL algorithm is added which aims to split chunks into minimal equal-feerate parts where possible, without introducing circular dependencies between them. It works based on the observation that if an (already otherwise optimal) chunk has a way of being split into two equal-feerate parts, and T is a given transaction in the chunk, then we can find the split in two steps:
  * One time, pretend T has $\epsilon$ higher feerate than it really has. If a split exists with T in the top part, this will find it.
  * The other time, pretend T has $\epsilon$ lower feerate than it really has. If a split exists with T in the bottom part, this will find it.

  So we try both on each found optimal chunk. If neither works, the chunk is minimal. If one works, recurse into the split chunks to split them further.

ACKs for top commit:
  instagibbs:
    reACK da56ef239b
  marcofleon:
    crACK da56ef239b12786e3a177afda14352dda4a70bc6

Tree-SHA512: 2e94d6b78725f5f9470a939dedef46450b85c4e5e6f30cba0b038622ec2b417380747e8df923d1f303706602ab6d834350716df9678de144f857e3a8d163f6c2
2026-01-15 10:07:21 +00:00
MarcoFalke
fa64d8424b
refactor: Enforce readability-avoid-const-params-in-decls 2026-01-14 23:04:12 +01:00
Ava Chow
b0b65336e7
Merge bitcoin/bitcoin#32740: refactor: Header sync optimisations & simplifications
de4242f47476769d0a7f3e79e8297ed2dd60d9a4 refactor: Use reference for chain_start in HeadersSyncState (Daniela Brozzoni)
e37555e5401f9fca39ada0bd153e46b2c7ebd095 refactor: Use initializer list in CompressedHeader (Daniela Brozzoni)
0488bdfefe92b2c9a924be9244c91fe472462aab refactor: Remove unused parameter in ReportHeadersPresync (Daniela Brozzoni)
256246a9fa5b05141c93aeeb359394b9c7a80e49 refactor: Remove redundant parameter from CheckHeadersPoW (Daniela Brozzoni)
ca0243e3a6d77d2b218749f1ba113b81444e3f4a refactor: Remove useless CBlock::GetBlockHeader (Pieter Wuille)
45686522224598bed9923e60daad109094d7bc29 refactor: Use std::span in HasValidProofOfWork (Daniela Brozzoni)
4066bfe561a45f61a3c9bf24bec7f600ddcc7467 refactor: Compute work from headers without CBlockIndex (Daniela Brozzoni)
0bf6139e194f355d121bb2aea74715d1c4099598 p2p: Avoid an IsAncestorOfBestHeaderOrTip call (Pieter Wuille)

Pull request description:

  This is a partial* revival of #25968

  It contains a list of most-unrelated simplifications and optimizations to the code merged in #25717:

  - Avoid an IsAncestorOfBestHeaderOrTip call: Just don't call this function when it won't have any effect.
  - Compute work from headers without CBlockIndex: Avoid the need to construct a CBlockIndex object just to compute work for a header, when its nBits value suffices for that. Also use some Spans where possible.
  - Remove useless CBlock::GetBlockHeader: There is no need for a function to convert a CBlock to a CBlockHeader, as it's a child class of it.

  It also contains the following code cleanups, which were suggested by reviewers in #25968:
  - Remove redundant parameter from CheckHeadersPoW: No need to pass consensusParams, as CheckHeadersPow already has access to m_chainparams.GetConsensus()
  - Remove unused parameter in ReportHeadersPresync
  - Use initializer list in CompressedHeader, also make GetFullHeader const
  - Use reference for chain_start in HeadersSyncState: chain_start can never be null, so it's better to pass it as a reference rather than a raw pointer

  *I decided to leave out three commits that were in #25968 (4e7ac7b94d04e056e9994ed1c8273c52b7b23931, ab52fb4e95aa2732d1a1391331ea01362e035984, 7f1cf440ca1a9c86085716745ca64d3ac26957c0), since they're a bit more involved, and I'm a new contributor. If this PR gets merged, I'll comment under #25968 to note that these three commits are still up for grabs :)

ACKs for top commit:
  l0rinc:
    ACK de4242f47476769d0a7f3e79e8297ed2dd60d9a4
  polespinasa:
    re-ACK de4242f47476769d0a7f3e79e8297ed2dd60d9a4
  sipa:
    ACK de4242f47476769d0a7f3e79e8297ed2dd60d9a4
  achow101:
    ACK de4242f47476769d0a7f3e79e8297ed2dd60d9a4
  hodlinator:
    re-ACK de4242f47476769d0a7f3e79e8297ed2dd60d9a4

Tree-SHA512: 1de4f3ce0854a196712505f2b52ccb985856f5133769552bf37375225ea8664a3a7a6a9578c4fd461e935cd94a7cbbb08f15751a1da7651f8962c866146d9d4b
2026-01-14 11:38:07 -08:00
MarcoFalke
fac70ea8b5
fuzz: Exclude too expensive inputs in miniscript_string target 2026-01-14 20:02:38 +01:00
MarcoFalke
fa90786478
iwyu: Fix includes for test/fuzz/util/descriptor module
Also, fix a typo.
2026-01-14 19:19:18 +01:00
Pieter Wuille
da56ef239b clusterlin: minimize chunks (feature)
After the normal optimization process finishes, and finds an optimal
spanning forest, run a second process (while computation budget remains)
to split chunks into minimal equal-feerate chunks.
2026-01-12 17:38:30 -05:00
MarcoFalke
fa8d56f9f0
fuzz: Reject too large descriptor leaf sizes in scriptpubkeyman target 2026-01-08 14:26:29 +01:00
MarcoFalke
333333356f
fuzz: [refactor] Use std::span over FuzzBufferType in descriptor utils
They are exactly the same, but the descriptor utils should not prescribe
to use the FuzzBufferType. Using a dedicated type for them clarifies
that the utils are not tied to FuzzBufferType.

Also, while touching the lines, use `const` only where it is meaningful.
2026-01-08 12:18:01 +01:00
Novo
9e5e9824f1 descriptor: ToPrivateString() pass if at least 1 priv key exists
- Refactor Descriptor::ToPrivateString() to allow descriptors with
  missing private keys to be printed. Useful in descriptors with
  multiple keys e.g tr() etc.
- The existing behaviour of listdescriptors is preserved as much as
  possible, if no private keys are availablle ToPrivateString will
  return false
2026-01-07 10:44:38 +01:00
Pieter Wuille
1808b5aaf7 clusterlin: remove unused FixLinearization (cleanup) 2026-01-05 11:48:34 -05:00
Pieter Wuille
01ffcf464a clusterlin: support fixing linearizations (feature)
This also updates FixLinearization to just be a thin wrapper around Linearize.
In a future commit, FixLinearization will be removed entirely.
2026-01-05 11:48:16 -05:00
Ava Chow
ab233255d4
Merge bitcoin/bitcoin#33866: refactor: Let CCoinsViewCache::BatchWrite return void
6da6f503a6dd8de85780ca402f5202095aa8b6ea refactor: Let CCoinsViewCache::BatchWrite return void (TheCharlatan)

Pull request description:

  CCoinsViewCache::BatchWrite always returns true if called from a backed cache, so just return void instead. Also return void from ::Sync and ::Flush.

  This allows for dropping a FatalError condition and simplifying some dead error handling code a bit.

  Since we now no longer exercise the "error path" when returning from `CCoinsView::BatchWrite`, make the method clear the cache instead. This should only be exercised by tests and not change production behaviour. This might slightly improve the coins_view fuzz test's ability to generate better coverage.

ACKs for top commit:
  l0rinc:
    ACK 6da6f503a6dd8de85780ca402f5202095aa8b6ea
  andrewtoth:
    re-ACK 6da6f503a6dd8de85780ca402f5202095aa8b6ea
  achow101:
    ACK 6da6f503a6dd8de85780ca402f5202095aa8b6ea
  w0xlt:
    ACK 6da6f503a6

Tree-SHA512: dfaa325b0cf8108910aebf1b27434aaddb639d10d860e96797c77ea42eca9035a54a7dc1d6a5d4eae2b75fcc9356206d3d5672243d2c906e80d19024c8b95408
2026-01-02 16:49:23 -08:00
bensig
08ed802bab doc: fix double-word typos in comments 2025-12-30 12:12:26 -08:00
fanquake
3e4765ee10
scripted-diff: [doc] Unify stale copyright headers
-BEGIN VERIFY SCRIPT-

sed --in-place --regexp-extended \
       's;( 20[0-2][0-9])(-20[0-2][0-9])? The Bitcoin Core developers;\1-present The Bitcoin Core developers;g' \
       $( git grep -l 'The Bitcoin Core developers' -- ':(exclude)COPYING' ':(exclude)src/ipc/libmultiprocess' ':(exclude)src/minisketch' )

-END VERIFY SCRIPT-
2025-12-19 16:58:36 +00:00
merge-script
7f295e1d9b
Merge bitcoin/bitcoin#34084: scripted-diff: [doc] Unify stale copyright headers
fa4cb13b52030c2e55c6bea170649ab69d75f758 test: [doc] Manually unify stale headers (MarcoFalke)
fa5f29774872d18febc0df38831a6e45f3de69cc scripted-diff: [doc] Unify stale copyright headers (MarcoFalke)

Pull request description:

  Historically, the upper year range in file headers was bumped manually
  or with a script.

  This has many issues:

  * The script is causing churn. See for example commit 306ccd4, or
    drive-by first-time contributions bumping them one-by-one. (A few from
    this year: https://github.com/bitcoin/bitcoin/pull/32008,
    https://github.com/bitcoin/bitcoin/pull/31642,
    https://github.com/bitcoin/bitcoin/pull/32963, ...)
  * Some, or likely most, upper year values were wrong. Reasons for
    incorrect dates could be code moves, cherry-picks, or simply bugs in
    the script.
  * The upper range is not needed for anything.
  * Anyone who wants to find the initial file creation date, or file
    history, can use `git log` or `git blame` to get more accurate
    results.
  * Many places are already using the `-present` suffix, with the meaning
    that the upper range is omitted.

  To fix all issues, this bumps the upper range of the copyright headers
  to `-present`.

  Further notes:

  * Obviously, the yearly 4-line bump commit for the build system (c.f.
    b537a2c02a9921235d1ecf8c3c7dc1836ec68131) is fine and will remain.
  * For new code, the date range can be fully omitted, as it is done
    already by some developers. Obviously, developers are free to pick
    whatever style they want. One can list the commits for each style.
  * For example, to list all commits that use `-present`:
    `git log --format='%an (%ae) [%h: %s]' -S 'present The Bitcoin'`.
  * Alternatively, to list all commits that use no range at all:
    `git log --format='%an (%ae) [%h: %s]' -S '(c) The Bitcoin'`.

  <!--
  * The lower range can be wrong as well, so it could be omitted as well,
    but this is left for a follow-up. A previous attempt was in
    https://github.com/bitcoin/bitcoin/pull/26817.

ACKs for top commit:
  l0rinc:
    ACK fa4cb13b52030c2e55c6bea170649ab69d75f758
  rkrux:
    re-ACK fa4cb13b52030c2e55c6bea170649ab69d75f758
  janb84:
    ACK fa4cb13b52030c2e55c6bea170649ab69d75f758

Tree-SHA512: e5132781bdc4417d1e2922809b27ef4cf0abb37ffb68c65aab8a5391d3c917b61a18928ec2ec2c75ef5184cb79a5b8c8290d63e949220dbeab3bd2c0dfbdc4c5
2025-12-19 16:56:02 +00:00
Pieter Wuille
75bdb925f4 clusterlin: drop support for improvable chunking (simplification)
With MergeLinearizations() gone and the LIMO-based Linearize() replaced by SFL, we do not
need a class (LinearizationChunking) that can maintain an incrementally-improving chunk
set anymore.

Replace it with a function (ChunkLinearizationInfo) that just computes the chunks as
SetInfos once, and returns them as a vector. This simplifies several call sites too.
2025-12-18 16:01:31 -05:00
Pieter Wuille
91399a7912 clusterlin: remove unused MergeLinearizations (cleanup)
This ended up never being used in txgraph.
2025-12-18 16:01:31 -05:00
Pieter Wuille
13aad26b78 clusterlin: randomize various decisions in SFL (feature)
This introduces a local RNG inside the SFL state, which is used to randomize
various decisions inside the algorithm, in order to make it hard to create
pathological clusters which predictably have bad performance.

The decisions being randomized are:
* When deciding what chunk to attempt to split, the queue order is
  randomized.
* When deciding which dependency to split on, a uniformly random one is
  chosen among those with higher top feerate than bottom feerate within
  the chosen chunk.
* When deciding which chunks to merge, a uniformly random one among those
  with the higher feerate difference is picked.
* When merging two chunks, a uniformly random dependency between them is
  now activated.
* When making the state topological, the queue of chunks to process is
  randomized.
2025-12-18 16:01:31 -05:00
Pieter Wuille
ddbfa4dfac clusterlin: keep FIFO queue of improvable chunks (preparation)
This introduces a queue of chunks that still need processing, in both
MakeTopological() and OptimizationStep(). This is simultaneously:
* A preparation for introducing randomization, by allowing permuting the
  queue.
* An improvement to the fairness of suboptimal solutions, by distributing
  the work more fairly over chunks.
* An optimization, by avoiding retrying chunks over and over again which
  are already known to be optimal.
2025-12-18 16:01:31 -05:00
Pieter Wuille
3efc94d656 clusterlin: replace cluster linearization with SFL (feature)
This replaces the existing LIMO linearization algorithm (which internally uses
ancestor set finding and candidate set finding) with the much more performant
spanning-forest linearization algorithm.

This removes the old candidate-set search algorithm, and several of its tests,
benchmarks, and needed utility code.

The worst case time per cost is similar to the previous algorithm, so
ACCEPTABLE_ITERS is unchanged.
2025-12-18 16:01:31 -05:00
Pieter Wuille
6a8fa821b8 clusterlin: add support for loading existing linearization (feature) 2025-12-18 16:01:22 -05:00
Pieter Wuille
da48ed9f34 clusterlin: ReadLinearization for non-topological (tests)
Rather than using an ad-hoc no-dependency copy of the graph when a potentially
non-topological linearization is needed in the clusterlin fuzz test, add this
directly as a feature in ReadLinearization().

This is preparation for a later commit where another use for such a function
is added.
2025-12-18 15:49:07 -05:00
Pieter Wuille
c461259fb6 clusterlin: add class implementing SFL state (preparation)
This adds a data structure representing the optimization state for the spanning-forest
linearization algorithm (SFL), plus a fuzz test for its correctness.

This is preparation for switching over Linearize() to use this algorithm.

See https://delvingbitcoin.org/t/spanning-forest-cluster-linearization/1419 for
a description of the algorithm.
2025-12-18 15:49:01 -05:00
merge-script
516ae5ede4
Merge bitcoin/bitcoin#31533: fuzz: Add fuzz target for block index tree and related validation events
db2d39f642979f929261e5f1cd67f0c2f2ca045f fuzz: add subtest for re-downloading a previously pruned block (Eugene Siegel)
45f5b2dac330906368352a1c585183f0d75d779d fuzz: Add fuzzer for block index (Martin Zumsande)
c011e3aa542631a8857039df796ebf13a653e8a6 test: Wrap validation functions with TestChainstateManager (Martin Zumsande)

Pull request description:

  This adds a fuzz target for the block index and various events in validation that interact with it.

  It can create arbitrary tree-like structure of block indexes, simulating (so far) the following events:
  - Adding a header
  - Receiving the full block (may be valid or not)
  - `ActivateBestChain()` - Reorging the chain to a new chain tip (possibly encountering invalid blocks on the way)
  - Pruning a block in the best chain
  - Receiving a previously pruned block again (`getblockfrompeer`)

  It might be interesting / possible to extend this to more events, such as dealing with more than one chainstate (assumeutxo).

  The test skips all actual validation of header/ block / transaction data by just simulating the outcome, and also doesn't interact with the data directory.
  The main goal is to ensure the integrity of the block index tree in all fuzzed constellations, by calling `CheckBlockIndex()` at the end of each iteration.

  Compared to #29158 this approach has a more limited scope (by skipping all actual validation), but it is fast - it doesn't do a full init sequence on each iteration, but "cleans up" after itself by resetting the global validation state after each iteration.

ACKs for top commit:
  Crypt-iQ:
    reACK db2d39f642
  maflcko:
    review ACK db2d39f642979f929261e5f1cd67f0c2f2ca045f 🍶
  sedited:
    Re-ACK db2d39f642979f929261e5f1cd67f0c2f2ca045f

Tree-SHA512: 76cd5f8f4d7d7258620b46d7438bad4508c3bdc98825b48b60f694b5a9838e2b2cf4967c0ead181f86f66f4939ddfe552471851b9d18f84f584c03dd7e09fc43
2025-12-18 15:26:42 +00:00
merge-script
8d38b6f5f1
Merge bitcoin/bitcoin#34091: fuzz: doc: remove any mention to address_deserialize_v2
caf4843a59a9d2512d69f8fd88a9672112bd80ac fuzz: doc: remove any mention to address_deserialize_v2 (brunoerg)

Pull request description:

  We don't have `address_deserialize_v2` target anymore since fac81affb527132945773a5315bd27fec61ec52f (we used to have `address_deserialize_v1_notime`, `address_deserialize_v1_withtime` and `address_deserialize_v2` but now we only have a single `address_deserialize` target) so it removes any mention to it.

ACKs for top commit:
  maflcko:
    review ACK caf4843a59a9d2512d69f8fd88a9672112bd80ac 🎾
  marcofleon:
    ACK caf4843a59a9d2512d69f8fd88a9672112bd80ac

Tree-SHA512: 539d69edbfe4ca11eb0701ed5c789ad81976e3e85e8a229e39e9dc1b1c72264f01d10a1c16d0a3bb4a354794412dc8b625298f4f72430905a00b65faeaa37d6b
2025-12-18 11:35:41 +00:00
Ryan Ofsky
ab513103df
Merge bitcoin/bitcoin#33192: refactor: unify container presence checks
d9319b06cf82664d55f255387a348135fd7f91c7 refactor: unify container presence checks - non-trivial counts (Lőrinc)
039307554eb311ce41648d1f9a12b543f480f871 refactor: unify container presence checks - trivial counts (Lőrinc)
8bb9219b6301215f53e43967d17445aaf1b81090 refactor: unify container presence checks - find (Lőrinc)

Pull request description:

  ### Summary
  Instead of counting occurrences in sets and maps, the C++20 `::contains` method expresses the intent unambiguously and can return early on first encounter.

  ### Context
  Applied clang‑tidy's [readability‑container‑contains](https://clang.llvm.org/extra/clang-tidy/checks/readability/container-contains.html) check, though many cases required manual changes since tidy couldn't fix them automatically.

  ### Changes
  The changes made here were:

  | From                   | To               |
  |------------------------|------------------|
  | `m.find(k) == m.end()` | `!m.contains(k)` |
  | `m.find(k) != m.end()` | `m.contains(k)`  |
  | `m.count(k)`           | `m.contains(k)`  |
  | `!m.count(k)`          | `!m.contains(k)` |
  | `m.count(k) == 0`      | `!m.contains(k)` |
  | `m.count(k) != 1`      | `!m.contains(k)` |
  | `m.count(k) == 1`      | `m.contains(k)`  |
  | `m.count(k) < 1`       | `!m.contains(k)`  |
  | `m.count(k) > 0`       | `m.contains(k)`  |
  | `m.count(k) != 0`      | `m.contains(k)`  |

  > Note that `== 1`/`!= 1`/`< 1` only apply to simple [maps](https://en.cppreference.com/w/cpp/container/map/contains)/[sets](https://en.cppreference.com/w/cpp/container/set/contains) and had to be changed manually.

  There are many other cases that could have been changed, but we've reverted most of those to reduce conflict with other open PRs.

  -----

  <details>
  <summary>clang-tidy command on Mac</summary>

  ```bash
  rm -rfd build && \
  cmake -B build \
    -DCMAKE_C_COMPILER="$(brew --prefix llvm)/bin/clang" \
    -DCMAKE_CXX_COMPILER="$(brew --prefix llvm)/bin/clang++" \
    -DCMAKE_OSX_SYSROOT="$(xcrun --show-sdk-path)" \
    -DCMAKE_C_FLAGS="-target arm64-apple-macos11" \
    -DCMAKE_CXX_FLAGS="-target arm64-apple-macos11" \
    -DCMAKE_EXPORT_COMPILE_COMMANDS=ON -DBUILD_BENCH=ON -DBUILD_FUZZ_BINARY=ON -DBUILD_FOR_FUZZING=ON

   "$(brew --prefix llvm)/bin/run-clang-tidy" -quiet -p build -j$(nproc) -checks='-*,readability-container-contains' | grep -v 'clang-tidy'
  ```

  </details>

  Note: this is a take 2 of https://github.com/bitcoin/bitcoin/pull/33094 with fewer contentious changes.

ACKs for top commit:
  optout21:
    reACK d9319b06cf82664d55f255387a348135fd7f91c7
  sedited:
    ACK d9319b06cf82664d55f255387a348135fd7f91c7
  janb84:
    re ACK d9319b06cf82664d55f255387a348135fd7f91c7
  pablomartin4btc:
    re-ACK d9319b06cf82664d55f255387a348135fd7f91c7
  ryanofsky:
    Code review ACK d9319b06cf82664d55f255387a348135fd7f91c7. I manually reviewed the full change, and it seems there are a lot of positive comments about this and no more very significant conflicts, so I will merge it shortly.

Tree-SHA512: e4415221676cfb88413ccc446e5f4369df7a55b6642347277667b973f515c3c8ee5bfa9ee0022479c8de945c89fbc9ff61bd8ba086e70f30298cbc1762610fe1
2025-12-17 16:17:29 -05:00
brunoerg
caf4843a59 fuzz: doc: remove any mention to address_deserialize_v2 2025-12-17 11:57:11 -03:00
MarcoFalke
fa5f297748
scripted-diff: [doc] Unify stale copyright headers
-BEGIN VERIFY SCRIPT-

 sed --in-place --regexp-extended \
   's;( 20[0-2][0-9])(-20[0-2][0-9])? The Bitcoin Core developers;\1-present The Bitcoin Core developers;g' \
   $( git grep -l 'The Bitcoin Core developers' -- ':(exclude)COPYING' ':(exclude)src/ipc/libmultiprocess' ':(exclude)src/minisketch' )

-END VERIFY SCRIPT-
2025-12-16 22:21:15 +01:00
Eugene Siegel
db2d39f642 fuzz: add subtest for re-downloading a previously pruned block
This imitates the use of the getblockfrompeer rpc.
Note that currently pruning is limited to blocks in the active chain.

Co-authored-by: Martin Zumsande <mzumsande@gmail.com>
2025-12-16 11:25:46 -05:00
Martin Zumsande
45f5b2dac3 fuzz: Add fuzzer for block index
This fuzz target creates arbitrary tree-like structure of indices,
simulating the following events:
- Adding a header to the block tree db
- Receiving the full block (may be valid or not)
- Reorging to a new chain tip (possibly encountering invalid blocks on
  the way)
- pruning
The test skips all actual validation of header/ block / transaction data
by just simulating the outcome, and also doesn't interact with the data directory.

The main goal is to test the integrity of the block index tree in
all fuzzed constellations, by calling CheckBlockIndex()
at the end of each iteration.
2025-12-16 11:25:46 -05:00
merge-script
13891a8a68
Merge bitcoin/bitcoin#34050: fuzz: exercise ComputeMerkleRoot without mutated parameter
7e9de20c0c144f5ccea5efe6d90601dd72bc7461 fuzz: exercise `ComputeMerkleRoot` without mutated parameter (Lőrinc)

Pull request description:

  The `mutated` parameter in `ComputeMerkleRoot` unlocks a different path that was always exercised in the fuzz test.
  Adjusted to be fuzzer to pass `nullptr` as well to make sure that path is also tested: 24ed820d4f/src/consensus/merkle.cpp (L49-L53)

  Follow-up to https://github.com/bitcoin/bitcoin/pull/33805#discussion_r2589073735

ACKs for top commit:
  frankomosh:
    ACK [7e9de20](7e9de20c0c)
  hodlinator:
    ACK 7e9de20c0c144f5ccea5efe6d90601dd72bc7461
  sedited:
    ACK 7e9de20c0c144f5ccea5efe6d90601dd72bc7461

Tree-SHA512: bf27029ac04003447b24a95544ec863f9ceca6c28d51ea811dd6ca2b412a2a780bb9fdbcdc82719f39dd710a746eb2446263e8377d67a8be52a1694571d03498
2025-12-16 14:25:55 +00:00
merge-script
4f11ef058b
Merge bitcoin/bitcoin#30214: refactor: Improve assumeutxo state representation
82be652e40ec7e1bea4b260ee804a92a3e05f496 doc: Improve ChainstateManager documentation, use consistent terms (Ryan Ofsky)
af455dcb39dbd53700105e29c87de5db65ecf43c refactor: Simplify pruning functions (TheCharlatan)
ae85c495f1b507ca5871ea98f5d884fccb15adba refactor: Delete ChainstateManager::GetAll() method (Ryan Ofsky)
6a572dbda92ceb8c5af379f51cf6f9b93fb5e486 refactor: Add ChainstateManager::ActivateBestChains() method (Ryan Ofsky)
491d827d5284ed984ee2b11daaee50321217eac5 refactor: Add ChainstateManager::m_chainstates member (Ryan Ofsky)
e514fe61168109bd467d7cb2ac7561442b17b5f6 refactor: Delete ChainstateManager::SnapshotBlockhash() method (Ryan Ofsky)
ee35250683ab9a395b70a0e90ebc68b1858387c7 refactor: Delete ChainstateManager::IsSnapshotValidated() method (Ryan Ofsky)
d9e82299fc4e45fbc0f5a34dcbb1d51397d0bd35 refactor: Delete ChainstateManager::IsSnapshotActive() method (Ryan Ofsky)
4dfe383912761669a968f8535ed43437da160ec8 refactor: Convert ChainstateRole enum to struct (Ryan Ofsky)
352ad27fc1b1b350c8dbeb26a9813b01025cad31 refactor: Add ChainstateManager::ValidatedChainstate() method (Ryan Ofsky)
a229cb9477e6622087241be7a105551d1329503b refactor: Add ChainstateManager::CurrentChainstate() method (Ryan Ofsky)
a9b7f5614c24fe6f386448604c325ec4fa6c98a5 refactor: Add Chainstate::StoragePath() method (Ryan Ofsky)
840bd2ef230ed0582fe33a90ec2636bfefa21709 refactor: Pass chainstate parameters to MaybeCompleteSnapshotValidation (Ryan Ofsky)
1598a15aedb9fd9c4e4a671785ebebf56fc1e072 refactor: Deduplicate Chainstate activation code (Ryan Ofsky)
9fe927b6d654e752dac82156e209e45d31b75779 refactor: Add Chainstate m_assumeutxo and m_target_utxohash members (Ryan Ofsky)
6082c84713f42f5fa66f9a76baef17e8ed231633 refactor: Add Chainstate::m_target_blockhash member (Ryan Ofsky)
de00e87548f7ddd623355b7094924b0387a36280 test: Fix broken chainstatemanager_snapshot_init check (Ryan Ofsky)

Pull request description:

  This PR contains the first part of #28608, which tries to make assumeutxo code more maintainable, and improve it by not locking `cs_main` for a long time when the snapshot block is connected, and by deleting the snapshot validation chainstate when it is no longer used, instead of waiting until the next restart.

  The changes in this PR are just refactoring. They make `Chainstate` objects self-contained, so for example, it is possible to determine what blocks to connect to a chainstate without querying `ChainstateManager`, and to determine whether a Chainstate is validated without basing it on inferences like `&cs != &ActiveChainstate()` or `GetAll().size() == 1`.

  The PR also tries to make assumeutxo terminology less confusing, using "current chainstate" to refer to the chainstate targeting the current network tip, and "historical chainstate" to refer to the chainstate downloading old blocks and validating the assumeutxo snapshot. It removes uses of the terms "active chainstate," "usable chainstate," "disabled chainstate," "ibd chainstate," and "snapshot chainstate" which are confusing for various reasons.

ACKs for top commit:
  maflcko:
    re-review ACK 82be652e40ec7e1bea4b260ee804a92a3e05f496 🕍
  fjahr:
    re-ACK 82be652e40ec7e1bea4b260ee804a92a3e05f496
  sedited:
    Re-ACK 82be652e40ec7e1bea4b260ee804a92a3e05f496

Tree-SHA512: 81c67abba9fc5bb170e32b7bf8a1e4f7b5592315b4ef720be916d5f1f5a7088c0c59cfb697744dd385552f58aa31ee36176bae6a6e465723e65861089a1252e5
2025-12-16 14:03:34 +00:00
TheCharlatan
6da6f503a6
refactor: Let CCoinsViewCache::BatchWrite return void
CCoinsViewCache::BatchWrite always returns true if called from a backed
cache, so just return void instead. Also return void from ::Sync and
::Flush.

This allows for dropping a FatalError condition and simplifying some
dead error handling code a bit.

Since we now no longer exercise the "error path" when returning from
`CCoinsView::BatchWrite`, make the method clear the cache instead. This
should only be exercised by tests and not change production behaviour.
This might slightly improve the coins_view fuzz test's ability to
generate better coverage.

Co-authored-by: l0rinc <pap.lorinc@gmail.com>
2025-12-14 22:25:31 +01:00
marcofleon
a70a14a3f4 refactor: Separate out logic for building a tree-shaped dependency graph 2025-12-12 16:09:53 +01:00
marcofleon
ce29d7d626 fuzz: Fix variable in clusterlin_postlinearize_tree check
The test intends to verify that running `PostLinearize` a
second time on a tree-structured graph doesn't change the
result. But `PostLinearize` was being called on the original
variable, not the copy. So the check was comparing the
unmodified copy against itself, which is useless.

Fix by post-linearizing the correct variable.
2025-12-12 15:04:10 +00:00
marcofleon
876e2849b4 fuzz: Fix incorrect loop bounds in clusterlin_postlinearize_tree
The dependency graphs generated by this test can have holes
(unused indices) in them. This means some of the transactions
were skipped when using `depgraph_gen.TxCount()` as the upper
bound of the loop. Switch to using `depgraph.Positions()` to
correctly handle sparse graphs.
2025-12-12 15:02:26 +00:00
Ryan Ofsky
e514fe6116 refactor: Delete ChainstateManager::SnapshotBlockhash() method
SnapshotBlockhash() is only called two places outside of tests, and is used
redundantly in some tests, checking the same field as other checks. Simplify by
dropping the method and using the m_from_snapshot_blockhash field directly.
2025-12-12 06:49:59 -04:00
Ryan Ofsky
840bd2ef23 refactor: Pass chainstate parameters to MaybeCompleteSnapshotValidation
Remove hardcoded references to m_ibd_chainstate and m_snapshot_chainstate so
MaybeCompleteSnapshotValidation function can be simpler and focus on validating
the snapshot without dealing with internal ChainstateManager states.

This is a step towards being able to validate the snapshot outside of
ActivateBestChain loop so cs_main is not locked for minutes when the snapshot
block is connected.
2025-12-12 06:49:59 -04:00
Lőrinc
7e9de20c0c
fuzz: exercise ComputeMerkleRoot without mutated parameter
Co-authored-by: sedited <seb.kung@gmail.com>
2025-12-11 12:47:18 +01:00
Ava Chow
b26762bdcb
Merge bitcoin/bitcoin#33805: merkle: migrate path arg to reference and drop unused args
24ed820d4f0d8f7fa2f69e1909c2d98f809d2f94 merkle: remove unused `mutated` arg from `BlockWitnessMerkleRoot` (Lőrinc)
63d640fa6a7090b4e615153b42ebf2e48d909db0 merkle: remove unused `proot` and `pmutated` args from `MerkleComputation` (Lőrinc)
be270551df30dd42b4f1b664234d0c22c09be625 merkle: migrate `path` arg of `MerkleComputation` to a reference (Lőrinc)

Pull request description:

  ### Summary
  Simplifies merkle tree computation by removing dead code found through coverage analysis (following up on #33768 and #33786).

  ### History

  #### BlockWitnessMerkleRoot
  Original `MerkleComputation` was added in ee60e5625b (diff-706988c23877f8a557484053887f932b2cafb3b5998b50497ce7ff8118ac85a3R131) where it was called for either `&hash, mutated` or `position, &ret` args.
  In 1f0e7ca09c (diff-706988c23877f8a557484053887f932b2cafb3b5998b50497ce7ff8118ac85a3L135-L165) the first usage was inlined in `ComputeMerkleRoot`, leaving the `proot` and , `pmutated` values unused in `MerkleComputation`.
  Later in 4defdfab94 the method was moved to test and in 63d6ad7c89 (diff-706988c23877f8a557484053887f932b2cafb3b5998b50497ce7ff8118ac85a3R87-R95) was restored to the code, though with unused parameters again.

  #### BlockWitnessMerkleRoot
  `BlockWitnessMerkleRoot` was introduced in 8b49040854 where it was already called with `NULL` 8b49040854 (diff-34d21af3c614ea3cee120df276c9c4ae95053830d7f1d3deaf009a4625409ad2R3509) or an unused dummy 8b49040854 (diff-34d21af3c614ea3cee120df276c9c4ae95053830d7f1d3deaf009a4625409ad2R3598-R3599) for the `mutated` parameter.

  ### Fixes

  #### BlockWitnessMerkleRoot
  - Converts `path` parameter from pointer to reference (always non-null at call site)
  - Removes `proot` and `pmutated` parameters (always `nullptr` at call site)

  #### BlockWitnessMerkleRoot
  - Removes unused `mutated` output parameter (always passed as `nullptr`)

  The change is a refactor that shouldn't introduce *any* behavioral change, only remove dead code, leftovers from previous refactors.

  ### Coverage proof
  https://maflcko.github.io/b-c-cov/total.coverage/src/consensus/merkle.cpp.gcov.html

ACKs for top commit:
  optout21:
    utACK 24ed820d4f0d8f7fa2f69e1909c2d98f809d2f94
  Sjors:
    utACK 24ed820d4f0d8f7fa2f69e1909c2d98f809d2f94
  achow101:
    ACK 24ed820d4f0d8f7fa2f69e1909c2d98f809d2f94
  sedited:
    ACK 24ed820d4f0d8f7fa2f69e1909c2d98f809d2f94
  hodlinator:
    ACK 24ed820d4f0d8f7fa2f69e1909c2d98f809d2f94

Tree-SHA512: 6960411304631bc381a3db7a682f6b6ba51bd58936ca85aa237c69a9109265b736b22ec4d891875bddfcbe8517bd3f014c44a4b387942eee4b01029c91ec93e1
2025-12-10 15:28:50 -08:00
Ava Chow
0f6d8a347a
Merge bitcoin/bitcoin#30442: precalculate SipHash constant salt XORs
6eb5ba569141b722a5e27ef36e04994886769c44 refactor: extract shared `SipHash` state into `SipHashState` (Lőrinc)
118d22ddb4ba6af6cd54204dda579c2ff9a70c12 optimization: cache `PresaltedSipHasher` in `CBlockHeaderAndShortTxIDs` (Lőrinc)
9ca52a4cbece2963363d201ef2b15d58d96ea221 optimization: migrate `SipHashUint256` to `PresaltedSipHasher` (Lőrinc)
ec11b9fede2af826abb144a443115fb586c49597 optimization: introduce `PresaltedSipHasher` for repeated hashing (Lőrinc)
20330548cf5f44cec057c0ed099b64c81afb124d refactor: extract `SipHash` C0-C3 constants to class scope (Lőrinc)
9f9eb7fbc053b38daeb1f4ca4e284d51e07fe50c test: rename k1/k2 to k0/k1 in `SipHash` consistency tests (Lőrinc)

Pull request description:

  This change is part of [[IBD] - Tracking PR for speeding up Initial Block Download](https://github.com/bitcoin/bitcoin/pull/32043)

  ### Summary

  The in-memory representation of the UTXO set uses (salted) [SipHash](https://github.com/bitcoin/bitcoin/blob/master/src/coins.h#L226) to avoid key collision attacks.

  Hashing `uint256` keys is performed frequently throughout the codebase. Previously, specialized optimizations existed as standalone functions (`SipHashUint256` and `SipHashUint256Extra`), but the constant salting operations (C0-C3 XOR with keys) were recomputed on every call.

  This PR introduces `PresaltedSipHasher`, a class that caches the initial SipHash state (v0-v3 after XORing constants with keys), eliminating redundant constant computations when hashing multiple values with the same keys. The optimization is applied uniformly across:
  - All `Salted*Hasher` classes (`SaltedUint256Hasher`, `SaltedTxidHasher`, `SaltedWtxidHasher`, `SaltedOutpointHasher`)
  - `CBlockHeaderAndShortTxIDs` for compact block short ID computation

  ### Details

  The change replaces the standalone `SipHashUint256` and `SipHashUint256Extra` functions with `PresaltedSipHasher` class methods that cache the constant-salted state. This is particularly beneficial for hash map operations where the same salt is used repeatedly (as suggested by Sipa in https://github.com/bitcoin/bitcoin/pull/30442#issuecomment-2628994530).

  `CSipHasher` behavior remains unchanged; only the specialized `uint256` paths and callers now reuse the cached state instead of recomputing it.

  ### Measurements

  Benchmarks were run using local `SaltedOutpointHasherBench_*` microbenchmarks (not included in this PR) that exercise `SaltedOutpointHasher` in realistic `std::unordered_set` scenarios.

  <details>
  <summary>Benchmarks</summary>

  ```C++
  diff --git a/src/bench/crypto_hash.cpp b/src/bench/crypto_hash.cpp
  --- a/src/bench/crypto_hash.cpp(revision 9b1a7c3e8dd78d97fbf47c2d056d043b05969176)
  +++ b/src/bench/crypto_hash.cpp(revision e1b4f056b3097e7e34b0eda31f57826d81c9d810)
  @@ -2,7 +2,6 @@
   // Distributed under the MIT software license, see the accompanying
   // file COPYING or http://www.opensource.org/licenses/mit-license.php.

  -
   #include <bench/bench.h>
   #include <crypto/muhash.h>
   #include <crypto/ripemd160.h>
  @@ -12,9 +11,11 @@
   #include <crypto/sha512.h>
   #include <crypto/siphash.h>
   #include <random.h>
  -#include <span.h>
   #include <tinyformat.h>
   #include <uint256.h>
  +#include <primitives/transaction.h>
  +#include <util/hasher.h>
  +#include <unordered_set>

   #include <cstdint>
   #include <vector>
  @@ -205,6 +206,98 @@
       });
   }

  +static void SaltedOutpointHasherBench_hash(benchmark::Bench& bench)
  +{
  +    FastRandomContext rng{/*fDeterministic=*/true};
  +    constexpr size_t size{1000};
  +
  +    std::vector<COutPoint> outpoints(size);
  +    for (auto& outpoint : outpoints) {
  +        outpoint = {Txid::FromUint256(rng.rand256()), rng.rand32()};
  +    }
  +
  +    const SaltedOutpointHasher hasher;
  +    bench.batch(size).run([&] {
  +        size_t result{0};
  +        for (const auto& outpoint : outpoints) {
  +            result ^= hasher(outpoint);
  +        }
  +        ankerl::nanobench::doNotOptimizeAway(result);
  +    });
  +}
  +
  +static void SaltedOutpointHasherBench_match(benchmark::Bench& bench)
  +{
  +    FastRandomContext rng{/*fDeterministic=*/true};
  +    constexpr size_t size{1000};
  +
  +    std::unordered_set<COutPoint, SaltedOutpointHasher> values;
  +    std::vector<COutPoint> value_vector;
  +    values.reserve(size);
  +    value_vector.reserve(size);
  +
  +    for (size_t i{0}; i < size; ++i) {
  +        COutPoint outpoint{Txid::FromUint256(rng.rand256()), rng.rand32()};
  +        values.emplace(outpoint);
  +        value_vector.push_back(outpoint);
  +        assert(values.contains(outpoint));
  +    }
  +
  +    bench.batch(size).run([&] {
  +        bool result{true};
  +        for (const auto& outpoint : value_vector) {
  +            result ^= values.contains(outpoint);
  +        }
  +        ankerl::nanobench::doNotOptimizeAway(result);
  +    });
  +}
  +
  +static void SaltedOutpointHasherBench_mismatch(benchmark::Bench& bench)
  +{
  +    FastRandomContext rng{/*fDeterministic=*/true};
  +    constexpr size_t size{1000};
  +
  +    std::unordered_set<COutPoint, SaltedOutpointHasher> values;
  +    std::vector<COutPoint> missing_value_vector;
  +    values.reserve(size);
  +    missing_value_vector.reserve(size);
  +
  +    for (size_t i{0}; i < size; ++i) {
  +        values.emplace(Txid::FromUint256(rng.rand256()), rng.rand32());
  +        COutPoint missing_outpoint{Txid::FromUint256(rng.rand256()), rng.rand32()};
  +        missing_value_vector.push_back(missing_outpoint);
  +        assert(!values.contains(missing_outpoint));
  +    }
  +
  +    bench.batch(size).run([&] {
  +        bool result{false};
  +        for (const auto& outpoint : missing_value_vector) {
  +            result ^= values.contains(outpoint);
  +        }
  +        ankerl::nanobench::doNotOptimizeAway(result);
  +    });
  +}
  +
  +static void SaltedOutpointHasherBench_create_set(benchmark::Bench& bench)
  +{
  +    FastRandomContext rng{/*fDeterministic=*/true};
  +    constexpr size_t size{1000};
  +
  +    std::vector<COutPoint> outpoints(size);
  +    for (auto& outpoint : outpoints) {
  +        outpoint = {Txid::FromUint256(rng.rand256()), rng.rand32()};
  +    }
  +
  +    bench.batch(size).run([&] {
  +        std::unordered_set<COutPoint, SaltedOutpointHasher> set;
  +        set.reserve(size);
  +        for (const auto& outpoint : outpoints) {
  +            set.emplace(outpoint);
  +        }
  +        ankerl::nanobench::doNotOptimizeAway(set.size());
  +    });
  +}
  +
   static void MuHash(benchmark::Bench& bench)
   {
       MuHash3072 acc;
  @@ -276,6 +369,10 @@
   BENCHMARK(SHA256_32b_AVX2, benchmark::PriorityLevel::HIGH);
   BENCHMARK(SHA256_32b_SHANI, benchmark::PriorityLevel::HIGH);
   BENCHMARK(SipHash_32b, benchmark::PriorityLevel::HIGH);
  +BENCHMARK(SaltedOutpointHasherBench_hash, benchmark::PriorityLevel::HIGH);
  +BENCHMARK(SaltedOutpointHasherBench_match, benchmark::PriorityLevel::HIGH);
  +BENCHMARK(SaltedOutpointHasherBench_mismatch, benchmark::PriorityLevel::HIGH);
  +BENCHMARK(SaltedOutpointHasherBench_create_set, benchmark::PriorityLevel::HIGH);
   BENCHMARK(SHA256D64_1024_STANDARD, benchmark::PriorityLevel::HIGH);
   BENCHMARK(SHA256D64_1024_SSE4, benchmark::PriorityLevel::HIGH);
   BENCHMARK(SHA256D64_1024_AVX2, benchmark::PriorityLevel::HIGH);

  ```

  </details>

  > cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release && cmake --build build -j$(nproc) && build/bin/bench_bitcoin -filter='SaltedOutpointHasherBench' -min-time=10000

  > Before:

  |               ns/op |                op/s |    err% |     total | benchmark
  |--------------------:|--------------------:|--------:|----------:|:----------
  |               58.60 |       17,065,922.04 |    0.3% |     11.02 | `SaltedOutpointHasherBench_create_set`
  |               11.97 |       83,576,684.83 |    0.1% |     11.01 | `SaltedOutpointHasherBench_hash`
  |               14.50 |       68,985,850.12 |    0.3% |     10.96 | `SaltedOutpointHasherBench_match`
  |               13.90 |       71,942,033.47 |    0.4% |     11.03 | `SaltedOutpointHasherBench_mismatch`

  > After:

  |               ns/op |                op/s |    err% |     total | benchmark
  |--------------------:|--------------------:|--------:|----------:|:----------
  |               57.27 |       17,462,299.19 |    0.1% |     11.02 | `SaltedOutpointHasherBench_create_set`
  |               11.24 |       88,997,888.48 |    0.3% |     11.04 | `SaltedOutpointHasherBench_hash`
  |               13.91 |       71,902,014.20 |    0.2% |     11.01 | `SaltedOutpointHasherBench_match`
  |               13.29 |       75,230,390.31 |    0.1% |     11.00 | `SaltedOutpointHasherBench_mismatch`

  compared to master:
  ```python
  create_set - 17,462,299.19 / 17,065,922.04 - 2.3% faster
  hash       - 88,997,888.48 / 83,576,684.83 - 6.4% faster
  match      - 71,902,014.20 / 68,985,850.12 - 4.2% faster
  mismatch   - 75,230,390.31 / 71,942,033.47 - 4.5% faster
  ```

  > C++ compiler .......................... GNU 13.3.0

  > Before:

  |               ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |         bra/op |   miss% |     total | benchmark
  |--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
  |              136.76 |        7,312,133.16 |    0.0% |        1,086.67 |          491.12 |  2.213 |         119.54 |    1.1% |     11.01 | `SaltedOutpointHasherBench_create_set`
  |               23.82 |       41,978,882.62 |    0.0% |          252.01 |           85.57 |  2.945 |           4.00 |    0.0% |     11.00 | `SaltedOutpointHasherBench_hash`
  |               60.42 |       16,549,695.42 |    0.1% |          460.51 |          217.04 |  2.122 |          21.00 |    1.4% |     10.99 | `SaltedOutpointHasherBench_match`
  |               78.66 |       12,713,595.35 |    0.1% |          555.59 |          282.52 |  1.967 |          20.19 |    2.2% |     10.74 | `SaltedOutpointHasherBench_mismatch`

  > After:

  |               ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |         bra/op |   miss% |     total | benchmark
  |--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
  |              135.38 |        7,386,349.49 |    0.0% |        1,078.19 |          486.16 |  2.218 |         119.56 |    1.1% |     11.00 | `SaltedOutpointHasherBench_create_set`
  |               23.67 |       42,254,558.08 |    0.0% |          247.01 |           85.01 |  2.906 |           4.00 |    0.0% |     11.00 | `SaltedOutpointHasherBench_hash`
  |               58.95 |       16,962,220.14 |    0.1% |          446.55 |          211.74 |  2.109 |          20.86 |    1.4% |     11.01 | `SaltedOutpointHasherBench_match`
  |               76.98 |       12,991,047.69 |    0.1% |          548.93 |          276.50 |  1.985 |          20.25 |    2.3% |     10.72 | `SaltedOutpointHasherBench_mismatch`

  ```python
  compared to master:
  create_set -  7,386,349.49 / 7,312,133.16  - 1.0% faster
  hash       - 42,254,558.08 / 41,978,882.62 - 0.6% faster
  match      - 16,962,220.14 / 16,549,695.42 - 2.4% faster
  mismatch   - 12,991,047.69 / 12,713,595.35 - 2.1% faster
  ```

ACKs for top commit:
  achow101:
    ACK 6eb5ba569141b722a5e27ef36e04994886769c44
  vasild:
    ACK 6eb5ba569141b722a5e27ef36e04994886769c44
  sipa:
    ACK 6eb5ba569141b722a5e27ef36e04994886769c44

Tree-SHA512: 9688b87e1d79f8af9efc18a8487922c5f1735487a9c5b78029dd46abc1d94f05d499cd1036bd615849aa7d6b17d11653c968086050dd7d04300403ebd0e81210
2025-12-10 15:22:34 -08:00
Ava Chow
c2975f26d6
Merge bitcoin/bitcoin#33602: [IBD] coins: reduce lookups in dbcache layer propagation
0ac969cddfdba52f7947e9b140ef36e2b19c2c41 validation: don't reallocate cache for short-lived CCoinsViewCache (Lőrinc)
c8f5e446dc95712a63e4dd88786e2f7cb697b986 coins: reduce lookups in dbcache layer propagation (Lőrinc)

Pull request description:

  This change is part of [[IBD] - Tracking PR for speeding up Initial Block Download](https://github.com/bitcoin/bitcoin/pull/32043)

  ### Summary

  Previously, when the parent coins cache had no entry and the child did, `BatchWrite` performed a find followed by `try_emplace`, which resulted in multiple `SipHash` computations and bucket traversals on the common insert path.
  On a different path, these caches were recreated needlessly for every block connection.

  ### Fix for double fetch

  This change uses a single leading `try_emplace` and branches on the returned `inserted` flag. In the `FRESH && SPENT` case (not used in production, only exercised by tests), we erase the just-inserted placeholder (which is constant time with no rehash anyway). Semantics are unchanged for all valid parent/child state combinations.

  This change is a minimal version of [bitcoin/bitcoin@`723c49b` (#32128)](723c49b63b) and draws simplification ideas [bitcoin/bitcoin@`ae76ec7` (#30673)](ae76ec7bcf) and https://github.com/bitcoin/bitcoin/pull/30326.

  ### Fix for temporary cache recreation

  Related to parent cache propagation, the second commit makes it possible to avoid destructuring-recreating-destructuring of these short-live parent caches created for each new block.
  A few temporary `CCoinsViewCache`'s are destructed right after the `Flush()`, therefore it is not necessary to call `ReallocateCache` to recreate them right before they're killed anyway.

  This change was based on a subset of https://github.com/bitcoin/bitcoin/pull/28945, the original authors and relevant commenters were added as coauthors to this version.

  -----

  Reindex-chainstate indicates ~1% speedup.
  <details>
  <summary>Details</summary>

  ```python
  COMMITS="647cdb4f7e8041affed887e2325ee03a91078bb1 0b0c3293ffd75afb27dadc0b28426b40132a8c6b"; \
  STOP=909090; DBCACHE=4500; \
  CC=gcc; CXX=g++; \
  BASE_DIR="/mnt/my_storage"; DATA_DIR="$BASE_DIR/BitcoinData"; LOG_DIR="$BASE_DIR/logs"; \
  (echo ""; for c in $COMMITS; do git fetch -q origin $c && git log -1 --pretty='%h %s' $c || exit 1; done; echo "") && \
  hyperfine \
    --sort command \
    --runs 2 \
    --export-json "$BASE_DIR/rdx-$(sed -E 's/(\w{8})\w+ ?/\1-/g;s/-$//'<<<"$COMMITS")-$STOP-$DBCACHE-$CC.json" \
    --parameter-list COMMIT ${COMMITS// /,} \
    --prepare "killall bitcoind 2>/dev/null; rm -f $DATA_DIR/debug.log; git checkout {COMMIT}; git clean -fxd; git reset --hard && \
      cmake -B build -G Ninja -DCMAKE_BUILD_TYPE=Release -DENABLE_IPC=OFF && ninja -C build bitcoind && \
      ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=$STOP -dbcache=1000 -printtoconsole=0; sleep 20" \
    --cleanup "cp $DATA_DIR/debug.log $LOG_DIR/debug-{COMMIT}-$(date +%s).log" \
    "COMPILER=$CC ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=$STOP -dbcache=$DBCACHE -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0"

  647cdb4f7e Merge bitcoin/bitcoin#33311: net: Quiet down logging when router doesn't support natpmp/pcp
  0b0c3293ff validation: don't reallocate cache for short-lived CCoinsViewCache

  Benchmark 1: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=909090 -dbcache=4500 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = 647cdb4f7e8041affed887e2325ee03a91078bb1)
    Time (mean ± σ):     16233.508 s ±  9.501 s    [User: 19064.578 s, System: 951.672 s]
    Range (min … max):   16226.790 s … 16240.226 s    2 runs

  Benchmark 2: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=909090 -dbcache=4500 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = 0b0c3293ffd75afb27dadc0b28426b40132a8c6b)
    Time (mean ± σ):     16039.626 s ± 17.284 s    [User: 18870.130 s, System: 950.722 s]
    Range (min … max):   16027.405 s … 16051.848 s    2 runs

  Relative speed comparison
          1.01 ±  0.00  COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=909090 -dbcache=4500 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = 647cdb4f7e8041affed887e2325ee03a91078bb1)
          1.00          COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=909090 -dbcache=4500 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = 0b0c3293ffd75afb27dadc0b28426b40132a8c6b)
  ```

  </details>

ACKs for top commit:
  optout21:
    utACK 0ac969cddfdba52f7947e9b140ef36e2b19c2c41
  achow101:
    ACK 0ac969cddfdba52f7947e9b140ef36e2b19c2c41
  andrewtoth:
    utACK 0ac969cddfdba52f7947e9b140ef36e2b19c2c41
  sedited:
    ACK 0ac969cddfdba52f7947e9b140ef36e2b19c2c41

Tree-SHA512: 9fcc3f1a8314368576a4fba96ca72665527eaa3a97964ab5b39491757f3527147d134f79a5c3456f76c1330c7ef862989d23f764236c5e2563be89a81c1cee47
2025-12-10 15:02:25 -08:00
Lőrinc
9ca52a4cbe
optimization: migrate SipHashUint256 to PresaltedSipHasher
Replaces standalone `SipHashUint256` with an `operator()` overload in `PresaltedSipHasher`.
Updates all hasher classes (`SaltedUint256Hasher`, `SaltedTxidHasher`, `SaltedWtxidHasher`) to use `PresaltedSipHasher` internally, enabling the same constant-state caching optimization while keeping behavior unchanged.

Benchmark was also adjusted to cache the salting part.
2025-12-09 17:16:15 +01:00
Lőrinc
ec11b9fede
optimization: introduce PresaltedSipHasher for repeated hashing
Replaces the `SipHashUint256Extra` function with the `PresaltedSipHasher` class that caches the constant-salted state (v[0-3] after XORing with keys).
This avoids redundant XOR operations when hashing multiple values with the same keys, benefiting use cases like `SaltedOutpointHasher`.

This essentially brings the precalculations in the `CSipHasher` constructor to the `uint256`-specialized SipHash implementation.

> cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release && cmake --build build -j$(nproc) && build/src/bench/bench_bitcoin -filter='SaltedOutpointHasherBench.*' -min-time=10000

> C++ compiler .......................... AppleClang 16.0.0.16000026

|               ns/op |                op/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|               57.27 |       17,462,299.19 |    0.1% |     11.02 | `SaltedOutpointHasherBench_create_set`
|               11.24 |       88,997,888.48 |    0.3% |     11.04 | `SaltedOutpointHasherBench_hash`
|               13.91 |       71,902,014.20 |    0.2% |     11.01 | `SaltedOutpointHasherBench_match`
|               13.29 |       75,230,390.31 |    0.1% |     11.00 | `SaltedOutpointHasherBench_mismatch`

compared to master:
create_set - 17,462,299.19/17,065,922.04 - 2.3% faster
hash       - 88,997,888.48/83,576,684.83 - 6.4% faster
match      - 71,902,014.20/68,985,850.12 - 4.2% faster
mismatch   - 75,230,390.31/71,942,033.47 - 4.5% faster

> C++ compiler .......................... GNU 13.3.0

|               ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |         bra/op |   miss% |     total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
|              135.38 |        7,386,349.49 |    0.0% |        1,078.19 |          486.16 |  2.218 |         119.56 |    1.1% |     11.00 | `SaltedOutpointHasherBench_create_set`
|               23.67 |       42,254,558.08 |    0.0% |          247.01 |           85.01 |  2.906 |           4.00 |    0.0% |     11.00 | `SaltedOutpointHasherBench_hash`
|               58.95 |       16,962,220.14 |    0.1% |          446.55 |          211.74 |  2.109 |          20.86 |    1.4% |     11.01 | `SaltedOutpointHasherBench_match`
|               76.98 |       12,991,047.69 |    0.1% |          548.93 |          276.50 |  1.985 |          20.25 |    2.3% |     10.72 | `SaltedOutpointHasherBench_mismatch`

compared to master:
create_set -  7,386,349.49/7,312,133.16  - 1% faster
hash       - 42,254,558.08/41,978,882.62 - 0.6% faster
match      - 16,962,220.14/16,549,695.42 - 2.4% faster
mismatch   - 12,991,047.69/12,713,595.35 - 2% faster

Co-authored-by: sipa <pieter@wuille.net>
2025-12-09 17:13:44 +01:00