233 Commits

Author SHA1 Message Date
TheCharlatan
d69a582e72
kernel: Remove some unnecessary non-kernel includes
Specifically gets rid of batchpriority, chainparams, script/sign.h and
system includes.

Also take the opportunity to clean up the headers of the affected
files and add them to the iwyu-enforced set.
2025-12-21 10:24:09 +01:00
merge-script
7f295e1d9b
Merge bitcoin/bitcoin#34084: scripted-diff: [doc] Unify stale copyright headers
fa4cb13b52030c2e55c6bea170649ab69d75f758 test: [doc] Manually unify stale headers (MarcoFalke)
fa5f29774872d18febc0df38831a6e45f3de69cc scripted-diff: [doc] Unify stale copyright headers (MarcoFalke)

Pull request description:

  Historically, the upper year range in file headers was bumped manually
  or with a script.

  This has many issues:

  * The script is causing churn. See for example commit 306ccd4, or
    drive-by first-time contributions bumping them one-by-one. (A few from
    this year: https://github.com/bitcoin/bitcoin/pull/32008,
    https://github.com/bitcoin/bitcoin/pull/31642,
    https://github.com/bitcoin/bitcoin/pull/32963, ...)
  * Some, or likely most, upper year values were wrong. Reasons for
    incorrect dates could be code moves, cherry-picks, or simply bugs in
    the script.
  * The upper range is not needed for anything.
  * Anyone who wants to find the initial file creation date, or file
    history, can use `git log` or `git blame` to get more accurate
    results.
  * Many places are already using the `-present` suffix, with the meaning
    that the upper range is omitted.

  To fix all issues, this bumps the upper range of the copyright headers
  to `-present`.

  Further notes:

  * Obviously, the yearly 4-line bump commit for the build system (cf.
    b537a2c02a9921235d1ecf8c3c7dc1836ec68131) is fine and will remain.
  * For new code, the date range can be fully omitted, as it is done
    already by some developers. Obviously, developers are free to pick
    whatever style they want. One can list the commits for each style.
  * For example, to list all commits that use `-present`:
    `git log --format='%an (%ae) [%h: %s]' -S 'present The Bitcoin'`.
  * Alternatively, to list all commits that use no range at all:
    `git log --format='%an (%ae) [%h: %s]' -S '(c) The Bitcoin'`.

  * The lower range can be wrong as well, so it could be omitted as well,
    but this is left for a follow-up. A previous attempt was in
    https://github.com/bitcoin/bitcoin/pull/26817.

ACKs for top commit:
  l0rinc:
    ACK fa4cb13b52030c2e55c6bea170649ab69d75f758
  rkrux:
    re-ACK fa4cb13b52030c2e55c6bea170649ab69d75f758
  janb84:
    ACK fa4cb13b52030c2e55c6bea170649ab69d75f758

Tree-SHA512: e5132781bdc4417d1e2922809b27ef4cf0abb37ffb68c65aab8a5391d3c917b61a18928ec2ec2c75ef5184cb79a5b8c8290d63e949220dbeab3bd2c0dfbdc4c5
2025-12-19 16:56:02 +00:00
merge-script
a005fdff6c
Merge bitcoin/bitcoin#34074: A few followups after introducing /rest/blockpart/ endpoint
59b93f11e8600d5224359f4d05619c0f56aef1e6 rest: print also HTTP response reason in case of an error (Roman Zeyde)
7fe94a04934a89b63f1248cb46d59f0ab45439b5 rest: add a test for unsupported `/blockpart/` request type (Roman Zeyde)
55d0d19b5c02a65d8dfafd99f352769224ab51a4 rest: deduplicate `interface_rest.py` negative tests (Roman Zeyde)
89eb531024d9921f5c825d390b90c0a7bd3756cc rest: update release notes for `/blockpart/` endpoint (Roman Zeyde)
41118e17f87561afc8cbe1f3dd528624f06906a7 blockstorage: simplify partial block read validation (Roman Zeyde)
599effdeab4d6687da783de04f8edf1d88959169 rest: reformat `uri_prefixes` initializer list (Roman Zeyde)

Pull request description:

  The commits below should resolve a few leftovers from #33657.

ACKs for top commit:
  l0rinc:
    ACK 59b93f11e8600d5224359f4d05619c0f56aef1e6
  hodlinator:
    re-ACK 59b93f11e8600d5224359f4d05619c0f56aef1e6

Tree-SHA512: ae45e08edd315018e11283b354fb32f9658f5829c956554dc662a81c2e16397def7c3700e6354e0a91ff03c850def35638a69ec2668b7c015d25d6fed42b92bb
2025-12-17 15:09:15 +00:00
MarcoFalke
fa5f297748
scripted-diff: [doc] Unify stale copyright headers
-BEGIN VERIFY SCRIPT-

 sed --in-place --regexp-extended \
   's;( 20[0-2][0-9])(-20[0-2][0-9])? The Bitcoin Core developers;\1-present The Bitcoin Core developers;g' \
   $( git grep -l 'The Bitcoin Core developers' -- ':(exclude)COPYING' ':(exclude)src/ipc/libmultiprocess' ':(exclude)src/minisketch' )

-END VERIFY SCRIPT-
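
For readers less familiar with sed, the substitution above can be sketched in C++ with `std::regex`. This is only an illustration of what the pattern matches; the helper name `UnifyCopyrightHeader` is hypothetical and is not part of the commit.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: applies the same substitution as the sed expression
// above to a single line. A year, optionally followed by "-<year>", is
// rewritten to "<year>-present".
std::string UnifyCopyrightHeader(const std::string& line)
{
    static const std::regex pattern{
        R"(( 20[0-2][0-9])(-20[0-2][0-9])? The Bitcoin Core developers)"};
    return std::regex_replace(line, pattern, "$1-present The Bitcoin Core developers");
}
```

Lines that already end in `-present` do not match the pattern and are left unchanged.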
2025-12-16 22:21:15 +01:00
merge-script
4f11ef058b
Merge bitcoin/bitcoin#30214: refactor: Improve assumeutxo state representation
82be652e40ec7e1bea4b260ee804a92a3e05f496 doc: Improve ChainstateManager documentation, use consistent terms (Ryan Ofsky)
af455dcb39dbd53700105e29c87de5db65ecf43c refactor: Simplify pruning functions (TheCharlatan)
ae85c495f1b507ca5871ea98f5d884fccb15adba refactor: Delete ChainstateManager::GetAll() method (Ryan Ofsky)
6a572dbda92ceb8c5af379f51cf6f9b93fb5e486 refactor: Add ChainstateManager::ActivateBestChains() method (Ryan Ofsky)
491d827d5284ed984ee2b11daaee50321217eac5 refactor: Add ChainstateManager::m_chainstates member (Ryan Ofsky)
e514fe61168109bd467d7cb2ac7561442b17b5f6 refactor: Delete ChainstateManager::SnapshotBlockhash() method (Ryan Ofsky)
ee35250683ab9a395b70a0e90ebc68b1858387c7 refactor: Delete ChainstateManager::IsSnapshotValidated() method (Ryan Ofsky)
d9e82299fc4e45fbc0f5a34dcbb1d51397d0bd35 refactor: Delete ChainstateManager::IsSnapshotActive() method (Ryan Ofsky)
4dfe383912761669a968f8535ed43437da160ec8 refactor: Convert ChainstateRole enum to struct (Ryan Ofsky)
352ad27fc1b1b350c8dbeb26a9813b01025cad31 refactor: Add ChainstateManager::ValidatedChainstate() method (Ryan Ofsky)
a229cb9477e6622087241be7a105551d1329503b refactor: Add ChainstateManager::CurrentChainstate() method (Ryan Ofsky)
a9b7f5614c24fe6f386448604c325ec4fa6c98a5 refactor: Add Chainstate::StoragePath() method (Ryan Ofsky)
840bd2ef230ed0582fe33a90ec2636bfefa21709 refactor: Pass chainstate parameters to MaybeCompleteSnapshotValidation (Ryan Ofsky)
1598a15aedb9fd9c4e4a671785ebebf56fc1e072 refactor: Deduplicate Chainstate activation code (Ryan Ofsky)
9fe927b6d654e752dac82156e209e45d31b75779 refactor: Add Chainstate m_assumeutxo and m_target_utxohash members (Ryan Ofsky)
6082c84713f42f5fa66f9a76baef17e8ed231633 refactor: Add Chainstate::m_target_blockhash member (Ryan Ofsky)
de00e87548f7ddd623355b7094924b0387a36280 test: Fix broken chainstatemanager_snapshot_init check (Ryan Ofsky)

Pull request description:

  This PR contains the first part of #28608, which tries to make assumeutxo code more maintainable, and improve it by not locking `cs_main` for a long time when the snapshot block is connected, and by deleting the snapshot validation chainstate when it is no longer used, instead of waiting until the next restart.

  The changes in this PR are just refactoring. They make `Chainstate` objects self-contained, so for example, it is possible to determine what blocks to connect to a chainstate without querying `ChainstateManager`, and to determine whether a Chainstate is validated without basing it on inferences like `&cs != &ActiveChainstate()` or `GetAll().size() == 1`.

  The PR also tries to make assumeutxo terminology less confusing, using "current chainstate" to refer to the chainstate targeting the current network tip, and "historical chainstate" to refer to the chainstate downloading old blocks and validating the assumeutxo snapshot. It removes uses of the terms "active chainstate," "usable chainstate," "disabled chainstate," "ibd chainstate," and "snapshot chainstate" which are confusing for various reasons.

ACKs for top commit:
  maflcko:
    re-review ACK 82be652e40ec7e1bea4b260ee804a92a3e05f496 🕍
  fjahr:
    re-ACK 82be652e40ec7e1bea4b260ee804a92a3e05f496
  sedited:
    Re-ACK 82be652e40ec7e1bea4b260ee804a92a3e05f496

Tree-SHA512: 81c67abba9fc5bb170e32b7bf8a1e4f7b5592315b4ef720be916d5f1f5a7088c0c59cfb697744dd385552f58aa31ee36176bae6a6e465723e65861089a1252e5
2025-12-16 14:03:34 +00:00
Roman Zeyde
41118e17f8
blockstorage: simplify partial block read validation
Use `SaturatingAdd` following https://github.com/bitcoin/bitcoin/pull/33657#discussion_r2610832092.
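
A minimal sketch of why a saturating add simplifies the range check: `offset + size` can no longer wrap around, so a single comparison is enough. `SaturatingAddSketch` and `IsValidPartRange` are hypothetical stand-ins, and rejecting a zero size is an assumption rather than something the commit states.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Stand-in for a saturating add: clamps to the maximum instead of wrapping.
constexpr uint64_t SaturatingAddSketch(uint64_t a, uint64_t b)
{
    constexpr uint64_t max{std::numeric_limits<uint64_t>::max()};
    return a > max - b ? max : a + b;
}

// True if [offset, offset + size) lies inside a block of block_size bytes.
// Because the sum saturates at the 64-bit maximum, it always exceeds any
// 32-bit block size when it would otherwise have overflowed.
constexpr bool IsValidPartRange(uint32_t block_size, uint64_t offset, uint64_t size)
{
    return size > 0 && SaturatingAddSketch(offset, size) <= block_size;
}
```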
2025-12-14 10:44:12 +01:00
merge-script
938d7aacab
Merge bitcoin/bitcoin#33657: rest: allow reading partial block data from storage
07135290c1720a14c9d2f18a5700bb6565ae7a10 rest: allow reading partial block data from storage (Roman Zeyde)
4e2af1c06547230b9245d94e7bcb1129f2c49714 blockstorage: allow reading partial block data from storage (Roman Zeyde)
f2fd1aa21c7694cef393b4a13e472ae9d3fc54fc blockstorage: return an error code from `ReadRawBlock()` (Roman Zeyde)

Pull request description:

  It allows fetching specific transactions using an external index, following https://github.com/bitcoin/bitcoin/pull/32541#issuecomment-3267485313.

  Currently, electrs and other indexers map an address/scripthash to the list of the relevant transactions.

  However, in order to fetch those transactions from bitcoind, electrs relies on reading the whole block and post-filtering for a specific transaction[^1]. Other indexers use a `txindex` to fetch a transaction using its txid [^2][^3][^4].

  The above approach has significant storage and CPU overhead, since the `txid` is a pseudo-random 32-byte value. Also, mainnet `txindex` takes ~60GB today.

  This PR adds support for fetching a transaction directly by its position within its block via the [REST API](https://github.com/bitcoin/bitcoin/blob/master/doc/REST-interface.md), using the following HTTP request:

  ```
  GET /rest/blockpart/BLOCKHASH.bin?offset=OFFSET&size=SIZE
  ```

  - The offsets' index can be encoded much more efficiently ([~1.3GB today](https://github.com/romanz/bindex-rs/pull/66#issuecomment-3508476436)).

  - Address history query performance can be tested on mainnet using [1BitcoinEaterAddressDontSendf59kuE](https://mempool.space/address/1BitcoinEaterAddressDontSendf59kuE) - assuming warm OS block cache, [it takes <1s to fetch 5200 txs, i.e. <0.2ms per tx](https://github.com/romanz/bindex-rs/pull/66#issuecomment-3508476436) with [bindex](https://github.com/romanz/bindex-rs).

  - Only binary and hex response formats are supported.

  [^1]: https://github.com/romanz/electrs/blob/master/doc/schema.md
  [^2]: https://github.com/Blockstream/electrs/blob/new-index/doc/schema.md#txstore
  [^3]: https://github.com/spesmilo/electrumx/blob/master/docs/HOWTO.rst#prerequisites
  [^4]: https://github.com/cculianu/Fulcrum/blob/master/README.md#requirements
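
As an illustration, a client holding an external index of `(block hash, offset, size)` entries could build the request path as sketched here. `BlockPartPath` is a hypothetical helper; only the path layout is taken from the request shown above.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical client-side helper: builds the path for the binary
// /rest/blockpart/ request from an index entry.
std::string BlockPartPath(const std::string& block_hash_hex, uint64_t offset, uint64_t size)
{
    return "/rest/blockpart/" + block_hash_hex + ".bin?offset=" +
           std::to_string(offset) + "&size=" + std::to_string(size);
}
```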

ACKs for top commit:
  maflcko:
    review ACK 07135290c1720a14c9d2f18a5700bb6565ae7a10 🏪
  l0rinc:
    ACK 07135290c1720a14c9d2f18a5700bb6565ae7a10
  hodlinator:
    re-ACK 07135290c1720a14c9d2f18a5700bb6565ae7a10

Tree-SHA512: bcce7bf4b9a3e5e920ab5a83e656f50d5d7840cdde6b7147d329cf578f8a2db555fc1aa5334e8ee64d5630d25839ece77a2cf421c6c3ac1fa379bb453163bd4f
2025-12-12 13:22:00 +00:00
TheCharlatan
af455dcb39
refactor: Simplify pruning functions
Move GetPruneRange from ChainstateManager to Chainstate.
2025-12-12 11:49:59 +01:00
Ryan Ofsky
ae85c495f1
refactor: Delete ChainstateManager::GetAll() method
Just use m_chainstates array instead.
2025-12-12 06:49:59 -04:00
Ryan Ofsky
6a572dbda9
refactor: Add ChainstateManager::ActivateBestChains() method
Deduplicate code looping over chainstate objects and calling
ActivateBestChain() and avoid need for code outside ChainstateManager to use
the GetAll() method.
2025-12-12 06:49:59 -04:00
Ryan Ofsky
4dfe383912
refactor: Convert ChainstateRole enum to struct
Change ChainstateRole parameter passed to wallets and indexes. Wallets and
indexes need to know whether chainstate is historical and whether it is fully
validated. They should not be aware of the assumeutxo snapshot validation
process.
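
A rough sketch of the idea, assuming the struct carries the two facts named above. The type, field names, and the `Consumer` policy are hypothetical illustrations, not the actual definitions in the codebase.

```cpp
#include <cassert>

// Hypothetical shape of the role: two independent facts instead of an enum
// that encodes assumeutxo-specific states.
struct ChainstateRoleSketch {
    bool validated{true};   // all blocks up to the tip are fully validated
    bool historical{false}; // chainstate is downloading and validating old blocks
};

// Hypothetical consumer policies, for illustration only.
enum class Consumer { FollowsTip, NeedsValidatedHistory };

bool ShouldProcessBlock(Consumer consumer, const ChainstateRoleSketch& role)
{
    switch (consumer) {
    case Consumer::FollowsTip: return !role.historical;
    case Consumer::NeedsValidatedHistory: return role.validated;
    }
    return false;
}
```

The consumer decides from the two flags alone and does not need to know how snapshot validation works.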
2025-12-12 06:49:59 -04:00
Roman Zeyde
4e2af1c065
blockstorage: allow reading partial block data from storage
It will allow fetching specific transactions using an external index,
following https://github.com/bitcoin/bitcoin/pull/32541#issuecomment-3267485313.

No logging takes place in case of an invalid offset/size (to avoid spamming the log),
by using a new `ReadRawError::BadPartRange` error variant.

Co-authored-by: Hodlinator <172445034+hodlinator@users.noreply.github.com>
Co-authored-by: Lőrinc <pap.lorinc@gmail.com>
2025-12-11 18:54:55 +01:00
Roman Zeyde
f2fd1aa21c
blockstorage: return an error code from ReadRawBlock()
It will enable different error handling flows for different error types.

Also, `ReadRawBlockBench` performance has decreased due to no longer reusing a vector
with an unchanging capacity - mirroring our production code behavior.

Co-authored-by: Hodlinator <172445034+hodlinator@users.noreply.github.com>
Co-authored-by: Lőrinc <pap.lorinc@gmail.com>
2025-12-11 18:54:55 +01:00
MarcoFalke
fa89f60e31
scripted-diff: LogPrintLevel(*,BCLog::Level::*,*) -> LogError()/LogWarning()
This is a minimal behavior change and changes log output from:

  [net:error] Something bad happened
  [net:warning] Something problematic happened

to either

  [error] Something bad happened
  [warning] Something problematic happened

or, when -loglevelalways=1 is enabled:

  [all:error] Something bad happened
  [all:warning] Something problematic happened

Such a behavior change is desired, because all warning and error logs
are written in the same style in the source code and they are logged in
the same format for log consumers.

-BEGIN VERIFY SCRIPT-

 sed --regexp-extended --in-place \
   's/LogPrintLevel\((BCLog::[^,]*), BCLog::Level::(Error|Warning), */Log\2(/g' \
   $( git grep -l LogPrintLevel ':(exclude)src/test/logging_tests.cpp' )

-END VERIFY SCRIPT-
2025-12-09 10:44:33 +01:00
MarcoFalke
fa45a1503e
log: Use LogWarning for non-critical logs
As per doc/developer-notes#logging, LogWarning should be used for severe
problems that do not warrant shutting down the node
2025-11-27 14:33:59 +01:00
Andrew Toth
99d012ec80
refactor: return reference instead of pointer
The return value of BlockManager::GetFirstBlock must always be non-null. This
can be inferred from the implementation, which asserts that the return
value is not null. A raw pointer should only be returned if the result may be
null, so a reference is more appropriate here.
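
The distinction can be sketched with two hypothetical lookups (the names and types below are illustrative, not the real `BlockManager` API): a pointer where "not found" is a legitimate result, and a reference where the function guarantees a result.

```cpp
#include <cassert>
#include <vector>

struct BlockIndexSketch {
    int height;
};

// Pointer return: "no such block" is a valid outcome, so callers must check.
const BlockIndexSketch* FindAtHeight(const std::vector<BlockIndexSketch>& chain, int height)
{
    for (const auto& block : chain) {
        if (block.height == height) return &block;
    }
    return nullptr;
}

// Reference return: the result can never be null. The precondition is
// asserted once here instead of being re-checked by every caller.
const BlockIndexSketch& FirstBlock(const std::vector<BlockIndexSketch>& chain)
{
    assert(!chain.empty());
    return chain.front();
}
```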
2025-11-13 09:57:42 -05:00
merge-script
3789215f73
Merge bitcoin/bitcoin#33724: refactor: Return uint64_t from GetSerializeSize
fa6c0bedd33ac7ad27454adaf9522fd27bef6ea3 refactor: Return uint64_t from GetSerializeSize (MarcoFalke)
fad0c8680ea7ef433c2d6e7c0d5799f81fd861b9 refactor: Use uint64_t over size_t for serialized-size values (MarcoFalke)
fa4f388fc99c9ec7c3cf2bac3863c7b3004bb2ae refactor: Use fixed size ints over (un)signed ints for serialized values (MarcoFalke)
fa01f38e53cfda4155d0ea09ca8b1291b7001fe8 move-only: Move CBlockFileInfo to kernel namespace (MarcoFalke)
fa2bbc9e4cfe017436a5167ab5c443f4412efa3c refactor: [rpc] Remove cast when reporting serialized size (MarcoFalke)
fa364af89bd914ea7cd0d4a5470e0a502e0a2075 test: Remove outdated comment (MarcoFalke)

Pull request description:

  Consensus code should arrive at the same conclusion, regardless of the architecture it runs on. Using architecture-specific types such as `size_t` can lead to issues, such as the low-severity [CVE-2025-46597](https://bitcoincore.org/en/2025/10/24/disclose-cve-2025-46597/).

  The CVE was already worked around, but it may be good to still fix the underlying issue.

  Fixes https://github.com/bitcoin/bitcoin/issues/33709 with a few refactors to use explicit fixed-sized integer types in serialization-size related code and concluding with a refactor to return `uint64_t` from `GetSerializeSize`. The refactors should not change any behavior, because the CVE was already worked around.
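
A small sketch of the general hazard, not a reproduction of the CVE: the same size computation gives different results depending on the width of the integer type. `SizeT32` simulates a 32-bit `size_t`; all names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Simulates size_t on a 32-bit architecture.
using SizeT32 = uint32_t;

// Fixed-width 64-bit arithmetic: same result on every architecture.
constexpr uint64_t TotalSize64(uint64_t count, uint64_t each)
{
    return count * each;
}

// 32-bit arithmetic: the product is truncated modulo 2^32.
constexpr SizeT32 TotalSize32(SizeT32 count, SizeT32 each)
{
    return static_cast<SizeT32>(static_cast<uint64_t>(count) * each);
}
```

With `count = each = 70000`, the 64-bit result is 4,900,000,000, while the simulated 32-bit result wraps to 605,032,704.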

ACKs for top commit:
  Crypt-iQ:
    crACK fa6c0bedd33ac7ad27454adaf9522fd27bef6ea3
  l0rinc:
    ACK fa6c0bedd33ac7ad27454adaf9522fd27bef6ea3
  laanwj:
    Code review ACK fa6c0bedd33ac7ad27454adaf9522fd27bef6ea3

Tree-SHA512: f45057bd86fb46011e4cb3edf0dc607057d72ed869fd6ad636562111ae80fea233b2fc45c34b02256331028359a9c3f4fa73e9b882b225bdc089d00becd0195e
2025-11-12 09:48:10 -05:00
Ava Chow
a4e96cae7d
Merge bitcoin/bitcoin#33042: refactor: inline constant return values from dbwrapper write methods
743abbcbde9e5a2db489bca461c98df461eff7d0 refactor: inline constant return value of `BlockTreeDB::WriteBatchSync` and `BlockManager::WriteBlockIndexDB` and `BlockTreeDB::WriteFlag` (Lőrinc)
e030240e909493549e24aa8bcd5b382cab6e2c79 refactor: inline constant return value of `CDBWrapper::Erase` and `BlockTreeDB::WriteReindexing` (Lőrinc)
cdab9480e9e35656f490878f92dab5427b36f21d refactor: inline constant return value of `CDBWrapper::Write` (Lőrinc)
d1847cf5b5af232ad180f5d302361b72334952b2 refactor: inline constant return value of `TxIndex::DB::WriteTxs` (Lőrinc)
50b63a5698e533376ef7a20bc0c440d3d6bf7a9f refactor: inline constant return value of `CDBWrapper::WriteBatch` (Lőrinc)

Pull request description:

  Related to https://github.com/bitcoin/bitcoin/pull/31144#discussion_r2223587480

  ### Summary
  `WriteBatch` always returns `true` - the errors are handled by throwing `dbwrapper_error` instead.

  ### Context
  This boolean return value of the `Write` methods is confusing because it's inconsistent with `CDBWrapper::Read`, which catches exceptions and returns a boolean to indicate success/failure. It's bad that `Read` returns and `Write` throws - but it's a lot worse that `Write` advertises a return value when it actually communicates errors through exceptions.

  ### Solution
  This PR removes the constant return values from write methods and inlines `true` at their call sites. Many upstream methods had boolean return values only because they were propagating these constants - those have been cleaned up as well.

  Methods that returned a constant `true` value that now return `void`:
  - `CDBWrapper::WriteBatch`, `CDBWrapper::Write`, `CDBWrapper::Erase`
  - `TxIndex::DB::WriteTxs`
  - `BlockTreeDB::WriteReindexing`, `BlockTreeDB::WriteBatchSync`, `BlockTreeDB::WriteFlag`
  - `BlockManager::WriteBlockIndexDB`

  ### Note
  `CCoinsView::BatchWrite` (and transitively `CCoinsViewCache::Flush` & `CCoinsViewCache::Sync`) were intentionally not changed here. While all implementations return `true`, the base `CCoinsView::BatchWrite` returns `false`. Changing this would cause `coins_view` tests to fail with:
  > terminating due to uncaught exception of type std::logic_error: Not all unspent flagged entries were cleared

  We can fix that in a follow-up PR.
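
The shape of the change can be sketched as follows. `DbSketch`, `DbErrorSketch`, and `StoreFlag` are hypothetical stand-ins for the wrapper, its exception type, and a call site; the real classes are not reproduced here.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

struct DbErrorSketch : std::runtime_error {
    using std::runtime_error::runtime_error;
};

class DbSketch
{
public:
    explicit DbSketch(bool fail) : m_fail{fail} {}

    // Before: `bool Write(...)` that could only return true and threw on failure.
    // After: `void`, so the exception is the only failure signal.
    void Write(const std::string& key, const std::string& value)
    {
        if (m_fail) throw DbErrorSketch{"write failed: " + key};
        m_last = value;
    }

    const std::string& Last() const { return m_last; }

private:
    bool m_fail;
    std::string m_last;
};

// Call site: previously `return db.Write("flag", "1");`, now the constant
// `true` is inlined where a bool is still expected.
bool StoreFlag(DbSketch& db)
{
    db.Write("flag", "1");
    return true;
}
```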

ACKs for top commit:
  achow101:
    ACK 743abbcbde9e5a2db489bca461c98df461eff7d0
  janb84:
    ACK 743abbcbde9e5a2db489bca461c98df461eff7d0
  TheCharlatan:
    ACK 743abbcbde9e5a2db489bca461c98df461eff7d0
  sipa:
    ACK 743abbcbde9e5a2db489bca461c98df461eff7d0

Tree-SHA512: b2a550bff066216f1958d2dd9a7ef6a9949de518cc636f8ab9c670e0b7a330c1eb8c838e458a8629acb8ac980cea6616955cd84436a7b8ab9096f6d648073b1e
2025-11-10 09:15:24 -08:00
MarcoFalke
fa01f38e53
move-only: Move CBlockFileInfo to kernel namespace
Also, move it to the blockstorage module, because it is only used inside
that module.

Can be reviewed with the git option --color-moved=dimmed-zebra
2025-10-28 16:08:44 +01:00
Lőrinc
743abbcbde
refactor: inline constant return value of BlockTreeDB::WriteBatchSync and BlockManager::WriteBlockIndexDB and BlockTreeDB::WriteFlag
2025-08-13 15:47:48 -07:00
Lőrinc
e030240e90
refactor: inline constant return value of CDBWrapper::Erase and BlockTreeDB::WriteReindexing
Did both in this commit, since the return value of `WriteReindexing` was ignored anyway - which existed only because of the constant `Erase` being called
2025-08-13 15:47:48 -07:00
Lőrinc
cdab9480e9
refactor: inline constant return value of CDBWrapper::Write
2025-08-13 15:47:48 -07:00
Lőrinc
50b63a5698
refactor: inline constant return value of CDBWrapper::WriteBatch
`WriteBatch` can only ever return `true` - its errors are handled by throwing `dbwrapper_error` instead.
The boolean return value is quite confusing, especially since it looks symmetric with `CDBWrapper::Read`, which catches the exceptions and returns a boolean instead.
We're removing the constant return value and inlining `true` for its usages.
2025-08-13 15:47:39 -07:00
Sergi Delgado Segura
18524b072e
Make nSequenceId init value constants
Make it easier to follow what the values mean without having to read
the comments, and make them easier to maintain.
2025-07-28 10:15:17 -04:00
Sergi Delgado Segura
8b91883a23
Set the same best tip on restart if two candidates have the same work
Before this, if we had two (or more) tip candidates with the same work and restarted our node,
the block set as tip after bootstrap might not match the one set
before stopping. That's because the work and `nSequenceId` of both blocks would be the same
(the latter is only kept in memory), so the active chain after restart depended
on which tip candidate was loaded first.

This makes sure that we are consistent over reboots.
2025-07-28 10:15:14 -04:00
Sergi Delgado Segura
ab145cb3b4
Updates CBlockIndexWorkComparator outdated comment
2025-07-28 10:11:34 -04:00
MarcoFalke
face8123fd
log: [refactor] Use info level for init logs
This refactor does not change behavior.
2025-07-25 09:50:50 +02:00
MarcoFalke
fa183761cb
log: Remove function name from init logs
It is redundant with -logsourcelocations and the log messages are
clearer without it.

Also, remove a double-space.

Also, add braces around `if` touched in the next commit.

This tiny behavior change requires a test fixup.
2025-07-25 09:50:24 +02:00
Lőrinc
478d40afc6
refactor: encapsulate vector/array keys into Obfuscation
2025-07-16 14:33:07 -07:00
Lőrinc
0b8bec8aa6
scripted-diff: unify xor-vs-obfuscation nomenclature
Mechanical refactor of the low-level "xor" wording to signal the intent instead of the implementation used.
The renames are ordered by heaviest-hitting substitutions first, and were constructed such that after each replacement the code is still compilable.

-BEGIN VERIFY SCRIPT-
sed -i \
  -e 's/\bGetObfuscateKey\b/GetObfuscation/g' \
  -e 's/\bxor_key\b/obfuscation/g' \
  -e 's/\bxor_pat\b/obfuscation/g' \
  -e 's/\bm_xor_key\b/m_obfuscation/g' \
  -e 's/\bm_xor\b/m_obfuscation/g' \
  -e 's/\bobfuscate_key\b/m_obfuscation/g' \
  -e 's/\bOBFUSCATE_KEY_KEY\b/OBFUSCATION_KEY_KEY/g' \
  -e 's/\bSetXor(/SetObfuscation(/g' \
  -e 's/\bdata_xor\b/obfuscation/g' \
  -e 's/\bCreateObfuscateKey\b/CreateObfuscation/g' \
  -e 's/\bobfuscate key\b/obfuscation key/g' \
  $(git ls-files '*.cpp' '*.h')
-END VERIFY SCRIPT-
2025-07-16 14:32:01 -07:00
Lőrinc
54ab0bd64c
refactor: commit to 8 byte obfuscation keys
Since 31 byte xor-keys are not used in the codebase, using the common size (8 bytes) makes the benchmarks more realistic.

Co-authored-by: maflcko <6399679+maflcko@users.noreply.github.com>
2025-07-16 13:19:18 -07:00
Ava Chow
ea4285775e
Merge bitcoin/bitcoin#29307: util: explicitly close all AutoFiles that have been written
c10e382d2a3b76b70ebb8a4eb5cd99fc9f14d702 flatfile: check whether the file has been closed successfully (Vasil Dimov)
4bb5dd78ea4b578922a3316b37b486f96cb0beec util: check that a file has been closed before ~AutoFile() is called (Vasil Dimov)
8bb34f07df9ad45faf25c32c99a4dd70759b25be Explicitly close all AutoFiles that have been written (Vasil Dimov)
a69c4098b273b6db5d2212ba91cfc713c1634c5d rpc: take ownership of the file by WriteUTXOSnapshot() (Hodlinator)

Pull request description:

  `fclose(3)` may fail to flush the previously written data to disk, thus a failing `fclose(3)` is as serious as a failing `fwrite(3)`.

  Previously the code ignored `fclose(3)` failures. This PR improves that by changing all users of `AutoFile` that use it to write data to explicitly close the file and handle a possible error.

  ---

  Other alternatives are:

  1. `fflush(3)` after each write to the file (and throw if it fails from the `AutoFile::write()` method) and hope that `fclose(3)` will then always succeed. Assert that it succeeds from the destructor 🙄. Will hurt performance.
  2. Throw nevertheless from the destructor. Exception within the exception in C++ I think results in terminating the program without a useful message.
  3. (this is implemented in the latest incarnation of this PR) Redesign `AutoFile` so that its destructor cannot fail. Adjust _all_ its users 😭. For example, if the file has been written to, then require the callers to explicitly call the `AutoFile::fclose()` method before the object goes out of scope. In the destructor, as a sanity check, assume/assert that this is indeed the case. Defeats the purpose of a RAII wrapper for `FILE*` which automatically closes the file when it goes out of scope and there are a lot of users of `AutoFile`.
  4. Pass a new callback function to the `AutoFile` constructor which will be called from the destructor to handle `fclose()` errors, as described in https://github.com/bitcoin/bitcoin/pull/29307#issuecomment-2243842400. My thinking is that if that callback is going to only log a message, then we can log the message directly from the destructor without needing a callback. If the callback is going to do more complicated error handling then it is easier to do that at the call site by directly calling `AutoFile::fclose()` instead of getting the `AutoFile` object out of scope (so that its destructor is called) and inspecting for side effects done by the callback (e.g. set a variable to indicate a failed `fclose()`).
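
The approach described in option 3 can be sketched with a small wrapper whose `Close()` reports the `fclose(3)` result to the caller. `WriteFileSketch` and `SaveData` are hypothetical and much simpler than the real `AutoFile`; in particular, the destructor here silently closes as a last resort instead of asserting.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

class WriteFileSketch
{
public:
    explicit WriteFileSketch(const std::string& path) : m_file{std::fopen(path.c_str(), "wb")} {}
    WriteFileSketch(const WriteFileSketch&) = delete;
    WriteFileSketch& operator=(const WriteFileSketch&) = delete;

    // Last resort only: an error here cannot be reported to anyone.
    ~WriteFileSketch()
    {
        if (m_file) std::fclose(m_file);
    }

    bool Write(const std::string& data)
    {
        return m_file && std::fwrite(data.data(), 1, data.size(), m_file) == data.size();
    }

    // Returns 0 on success, like fclose(3). Safe to call more than once.
    int Close()
    {
        if (!m_file) return 0;
        const int ret{std::fclose(m_file)};
        m_file = nullptr;
        return ret;
    }

private:
    std::FILE* m_file;
};

// Caller pattern: a failed close is treated exactly like a failed write.
bool SaveData(const std::string& path, const std::string& data)
{
    WriteFileSketch file{path};
    if (!file.Write(data)) return false;
    return file.Close() == 0;
}
```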

ACKs for top commit:
  l0rinc:
    ACK c10e382d2a3b76b70ebb8a4eb5cd99fc9f14d702
  achow101:
    ACK c10e382d2a3b76b70ebb8a4eb5cd99fc9f14d702
  hodlinator:
    re-ACK c10e382d2a3b76b70ebb8a4eb5cd99fc9f14d702

Tree-SHA512: 3994ca57e5b2b649fc84f24dad144173b7500fc0e914e06291d5c32fbbf8d2b1f8eae0040abd7a5f16095ddf4e11fe1636c6092f49058cda34f3eb2ee536d7ba
2025-07-03 15:37:44 -07:00
Vasil Dimov
8bb34f07df
Explicitly close all AutoFiles that have been written
There is no way to report a close error from `AutoFile` destructor.
Such an error could be serious if the file has been written to because
it may mean the file is now corrupted (same as if write fails).

So, change all users of `AutoFile` that use it to write data to
explicitly close the file and handle a possible error.
2025-06-16 15:33:15 +02:00
Roman Zeyde
6ecb9fc65f
chore: use std::vector<std::byte> for BlockManager::ReadRawBlock()
2025-06-13 19:19:44 +03:00
Lőrinc
09ee8b7f27
node: avoid recomputing block hash in ReadBlock
Eliminate one SHA‑256 double‑hash computation of the header per block read by reusing the hash for:
* proof‑of‑work verification;
* (optional) integrity check against the supplied hash.
2025-05-26 23:23:44 +02:00
fanquake
2b85d31bcc
refactor: starts/ends_with changes for clang-tidy 20
2025-04-22 13:16:54 +01:00
Lőrinc
8d801e3efb
optimization: bulk serialization writes in WriteBlockUndo and WriteBlock
Similarly to the serialization reads optimization, buffered writes will enable batched XOR calculations.
This is especially beneficial since the current implementation requires copying the write input's `std::span` to perform obfuscation.
Batching allows us to apply XOR operations on the internal buffer instead, reducing unnecessary data copying and improving performance.
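
A minimal sketch of the idea: once serialized bytes are collected in an internal buffer, the XOR can be applied to that buffer in place in one pass, with no extra copy of the input. The names are hypothetical, and the 8-byte key size follows the obfuscation commits in this history.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using KeySketch = std::array<uint8_t, 8>;

// XORs `buf` in place. `key_offset` is the stream position of buf[0], so the
// key stays aligned when a stream is processed in several batches.
void ObfuscateInPlace(std::vector<uint8_t>& buf, const KeySketch& key, size_t key_offset = 0)
{
    for (size_t i{0}; i < buf.size(); ++i) {
        buf[i] ^= key[(key_offset + i) % key.size()];
    }
}
```

Processing a stream in two batches with the right offsets gives the same bytes as processing it in one pass, and applying the function twice restores the original data.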

------

> macOS Sequoia 15.3.1
> C++ compiler .......................... Clang 19.1.7
> cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ && cmake --build build -j$(nproc) && build/bin/bench_bitcoin -filter='WriteBlockBench' -min-time=10000

Before:

|               ns/op |                op/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|        5,149,564.31 |              194.19 |    0.8% |     10.95 | `WriteBlockBench`

After:

|               ns/op |                op/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|        2,990,564.63 |              334.39 |    1.5% |     11.27 | `WriteBlockBench`

------

> Ubuntu 24.04.2 LTS
> C++ compiler .......................... GNU 13.3.0
> cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_COMPILER=gcc -DCMAKE_CXX_COMPILER=g++ && cmake --build build -j$(nproc) && build/bin/bench_bitcoin -filter='WriteBlockBench' -min-time=20000

Before:

|               ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |         bra/op |   miss% |     total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
|        5,152,973.58 |              194.06 |    2.2% |   19,350,886.41 |    8,784,539.75 |  2.203 |   3,079,335.21 |    0.4% |     23.18 | `WriteBlockBench`

After:

|               ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |         bra/op |   miss% |     total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
|        4,145,681.13 |              241.21 |    4.0% |   15,337,596.85 |    5,732,186.47 |  2.676 |   2,239,662.64 |    0.1% |     23.94 | `WriteBlockBench`

Co-authored-by: Ryan Ofsky <ryan@ofsky.org>
Co-authored-by: Cory Fields <cory-nospam-@coryfields.com>
2025-04-14 12:04:06 +02:00
Lőrinc
520965e293
optimization: bulk serialization reads in UndoRead, ReadBlock
The obfuscation (XOR) operations are currently done byte-by-byte during serialization. Buffering the reads will enable batching the obfuscation operations later.

Different operating systems handle file caching differently, so reading larger batches (and processing them from memory) is measurably faster, likely because of fewer native fread calls and reduced lock contention.

Note that `ReadRawBlock` doesn't need buffering since it already reads the whole block directly.
Unlike `ReadBlockUndo`, the new `ReadBlock` implementation delegates to `ReadRawBlock`, which uses more memory than a buffered alternative but results in slightly simpler code and a small performance increase (~0.4%). This approach also clearly documents that `ReadRawBlock` is a logical subset of `ReadBlock` functionality.

The current implementation, which iterates over a fixed-size buffer, provides a more general alternative to Cory Fields' solution of reading the entire block size in advance.

Buffer sizes were selected based on benchmarking to ensure the buffered reader produces performance similar to reading the whole block into memory. Smaller buffers were slower, while larger ones showed diminishing returns.

------

> macOS Sequoia 15.3.1
> C++ compiler .......................... Clang 19.1.7
> cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ && cmake --build build -j$(nproc) && build/bin/bench_bitcoin -filter='ReadBlockBench' -min-time=10000

Before:

|               ns/op |                op/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|        2,271,441.67 |              440.25 |    0.1% |     11.00 | `ReadBlockBench`

After:

|               ns/op |                op/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|        1,738,971.29 |              575.05 |    0.2% |     10.97 | `ReadBlockBench`

------

> Ubuntu 24.04.2 LTS
> C++ compiler .......................... GNU 13.3.0
> cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_COMPILER=gcc -DCMAKE_CXX_COMPILER=g++ && cmake --build build -j$(nproc) && build/bin/bench_bitcoin -filter='ReadBlockBench' -min-time=20000

Before:

|               ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |         bra/op |   miss% |     total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
|        6,895,987.11 |              145.01 |    0.0% |   71,055,269.86 |   23,977,374.37 |  2.963 |   5,074,828.78 |    0.4% |     22.00 | `ReadBlockBench`

After:

|               ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |         bra/op |   miss% |     total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
|        5,771,882.71 |              173.25 |    0.0% |   65,741,889.82 |   20,453,232.33 |  3.214 |   3,971,321.75 |    0.3% |     22.01 | `ReadBlockBench`

Co-authored-by: maflcko <6399679+maflcko@users.noreply.github.com>
Co-authored-by: Ryan Ofsky <ryan@ofsky.org>
Co-authored-by: Martin Leitner-Ankerl <martin.ankerl@gmail.com>
Co-authored-by: Cory Fields <cory-nospam-@coryfields.com>
2025-04-14 12:04:06 +02:00
Lőrinc
056cb3c0d2 refactor: clear up blockstorage/streams in preparation for optimization
Made every OpenBlockFile#fReadOnly value explicit.

Replaced hard-coded values in ReadRawBlock with STORAGE_HEADER_BYTES.
Changed `STORAGE_HEADER_BYTES` and `UNDO_DATA_DISK_OVERHEAD` to `uint32_t` to avoid casts.

Also added `LIFETIMEBOUND` to the `AutoFile` parameter of `BufferedFile`, which stores a reference to the underlying `AutoFile`, allowing Clang to emit warnings if the referenced `AutoFile` might be destroyed while `BufferedFile` still exists.
Without this attribute, code with lifetime violations wouldn't trigger compiler warnings.
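
A minimal sketch of how such an annotation works, assuming the macro expands to Clang's `[[clang::lifetimebound]]` attribute. `LIFETIMEBOUND_SKETCH`, `Source`, `View`, and `SafeUse` are hypothetical stand-ins for this example, not the real `BufferedFile` or `AutoFile` types.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Assumption: a local stand-in for the project's macro, enabled only on Clang.
#if defined(__clang__)
#define LIFETIMEBOUND_SKETCH [[clang::lifetimebound]]
#else
#define LIFETIMEBOUND_SKETCH
#endif

struct Source {
    std::string data;
};

// Holds a reference to a Source, so the Source must outlive the View.
class View
{
    const Source& m_src;

public:
    explicit View(const Source& src LIFETIMEBOUND_SKETCH) : m_src{src} {}
    std::size_t size() const { return m_src.data.size(); }
};

// Safe usage: the referenced object outlives the view.
void SafeUse()
{
    Source source{"abc"};
    View view{source};
    assert(view.size() == 3);
}

// Unsafe usage, left commented out: with the attribute in place, Clang can
// warn that the temporary Source dies at the end of the full expression.
// View dangling{Source{"abc"}};
```

The attribute does not change generated code. It only gives the compiler enough information to diagnose the commented-out pattern.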

Co-authored-by: maflcko <6399679+maflcko@users.noreply.github.com>
2025-04-14 11:57:14 +02:00
Lőrinc
67fcc64802 log: unify error messages for (read/write)[undo]block
Co-authored-by: maflcko <6399679+maflcko@users.noreply.github.com>
2025-04-13 23:44:46 +02:00
Lőrinc
a4de160492 scripted-diff: shorten BLOCK_SERIALIZATION_HEADER_SIZE constant
Renames the constant to be less verbose and better reflect its purpose:
it represents the size of the storage header that precedes serialized block data on disk,
not to be confused with a block's own header.
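
To illustrate the distinction, here is a rough sketch of a record framed by such a storage header. It assumes the header is a 4-byte network magic followed by a 4-byte little-endian payload length, which this commit message does not spell out. `MagicBytes`, `kStorageHeaderBytes`, and `FrameRecord` are hypothetical names for the example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <tuple>
#include <vector>

using MagicBytes = std::array<uint8_t, 4>; // assumed 4-byte network magic

// Assumed layout: magic bytes plus a 32-bit payload length.
constexpr uint32_t kStorageHeaderBytes{std::tuple_size_v<MagicBytes> + sizeof(uint32_t)};
static_assert(kStorageHeaderBytes == 8);

// Prefix `payload` with the storage header. The payload itself (for a block,
// starting with the block's own header) follows untouched.
std::vector<uint8_t> FrameRecord(const MagicBytes& magic, const std::vector<uint8_t>& payload)
{
    assert(payload.size() <= UINT32_MAX);
    std::vector<uint8_t> out;
    out.reserve(kStorageHeaderBytes + payload.size());
    out.insert(out.end(), magic.begin(), magic.end());
    const uint32_t size{static_cast<uint32_t>(payload.size())};
    for (int shift{0}; shift < 32; shift += 8) {
        out.push_back(static_cast<uint8_t>(size >> shift)); // little-endian length
    }
    out.insert(out.end(), payload.begin(), payload.end());
    return out;
}
```

Under these assumptions, a reader that knows a record's payload position can step back `kStorageHeaderBytes` to find the start of the framing.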

-BEGIN VERIFY SCRIPT-
git grep -q "STORAGE_HEADER_BYTES" $(git ls-files) && echo "Error: Target name STORAGE_HEADER_BYTES already exists in the codebase" && exit 1
sed -i 's/BLOCK_SERIALIZATION_HEADER_SIZE/STORAGE_HEADER_BYTES/g' $(git grep -l 'BLOCK_SERIALIZATION_HEADER_SIZE')
-END VERIFY SCRIPT-
2025-04-13 23:44:46 +02:00
Lőrinc
6640dd52c9 Narrow scope of undofile write to avoid possible resource management issue
`AutoFile{OpenUndoFile(pos)}` was still in scope when `FlushUndoFile(pos.nFile)` was called, which could lead to file handle conflicts or other unexpected behavior.
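
The scoping pattern can be sketched as follows, using a plain `std::FILE` wrapper rather than `AutoFile`. `ScopedFile` and `WriteThenVerify` are hypothetical, and the read-back step merely stands in for a follow-up operation such as a flush that needs its own access to the file.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <memory>
#include <string>

struct FileCloser {
    void operator()(std::FILE* f) const { std::fclose(f); }
};
using ScopedFile = std::unique_ptr<std::FILE, FileCloser>;

// Hypothetical helper: write `payload`, then run a follow-up step that reopens the file.
bool WriteThenVerify(const std::string& path, const std::string& payload)
{
    assert(!path.empty());
    {
        ScopedFile file{std::fopen(path.c_str(), "wb")};
        if (!file) return false;
        if (std::fwrite(payload.data(), 1, payload.size(), file.get()) != payload.size()) return false;
    } // `file` is closed here, before the follow-up step runs

    // Stand-in for a flush-like step that needs its own access to the file.
    ScopedFile check{std::fopen(path.c_str(), "rb")};
    if (!check) return false;
    std::string read_back(payload.size() + 1, '\0');
    const std::size_t n{std::fread(read_back.data(), 1, read_back.size(), check.get())};
    return n == payload.size() && read_back.compare(0, n, payload) == 0;
}
```

The inner braces are what matter: the writing handle is released at the closing brace, so the follow-up step never runs while that handle is still open.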

Co-authored-by: Hodlinator <172445034+hodlinator@users.noreply.github.com>
Co-authored-by: maflcko <6399679+maflcko@users.noreply.github.com>
2025-04-13 23:44:46 +02:00
Lőrinc
3197155f91 refactor: collect block read operations into try block
Reorganized error handling in block-related operations by grouping related operations together within the same scope.

In `ReadBlockUndo()` and `ReadBlock()`, moved all deserialization operations, comments and checksum verification inside a single try/catch block for cleaner error handling.
In `WriteBlockUndo()`, consolidated hash calculation and data writing operations within a common block to better express their logical relationship.
2025-04-13 23:44:44 +02:00
marcofleon
3c5d1a4681 Remove checkpoints
The headers presync logic should be enough to prevent memory DoS using
low-work headers. Therefore, we no longer have any use for checkpoints.
2025-03-13 11:13:13 +00:00
Ava Chow
601a6a6917
Merge bitcoin/bitcoin#30965: kernel: Move block tree db open to block manager
0cdddeb2240d1f33c8b2dd28bb0c9d84d9420e3d kernel: Move block tree db open to BlockManager constructor (TheCharlatan)
7fbb1bc44b1461f008284533f1667677e729f0c0 kernel: Move block tree db open to block manager (TheCharlatan)
57ba59c0cdf20de322afabe4a132ad17e483ce77 refactor: Remove redundant reindex check (TheCharlatan)

Pull request description:

  Before this change the block tree db was needlessly re-opened during startup when loading a completed snapshot. Improve this by letting the block manager open it on construction. This also simplifies the test code a bit.

  The change was initially motivated to make it easier for users of the kernel library to instantiate a BlockManager that may be used to read data from disk without loading the block index into a cache.

ACKs for top commit:
  maflcko:
    re-ACK 0cdddeb2240d1f33c8b2dd28bb0c9d84d9420e3d 🏪
  achow101:
    ACK 0cdddeb2240d1f33c8b2dd28bb0c9d84d9420e3d
  mzumsande:
    re-ACK 0cdddeb2240d1f33c8b2dd28bb0c9d84d9420e3d

Tree-SHA512: fe3d557a725367e549e6a0659f64259cfef6aaa565ec867d9a177be0143ff18a2c4a20dd57e35e15f97cf870df476d88c05b03b6a7d9e8d51c568d9eda8947ef
2025-01-31 15:28:06 -05:00
Ava Chow
9ecc7af41f
Merge bitcoin/bitcoin#31674: init: Lock blocksdir in addition to datadir
2656a5658c14b43c32959db7235e9db55a17d4c8 tests: add a test for the new blocksdir lock (Cory Fields)
bdc0a68e676f237bcbb5195a60bb08bb34123e71 init: lock blocksdir in addition to datadir (Cory Fields)
cabb2e5c24282c88ccc7fcaede854eaa8d7ff162 refactor: introduce a more general LockDirectories for init (Cory Fields)
1db331ba764d27537d82abd91dde50fc9d7e0ff4 init: allow a new xor key to be written if the blocksdir is newly created (Cory Fields)

Pull request description:

  This probably should've been included in #12653 when `-blocksdir` was introduced. Credit TheCharlatan for noticing that it's missing.

  This guards against 2 processes running with separate datadirs but the same blocksdir. I didn't add `walletdir` as I assume sqlite has us covered there.

  It's not likely to happen currently, but may be more relevant in the future with applications using the kernel. Note that the kernel does not currently do any dir locking, but it should.

ACKs for top commit:
  maflcko:
    review ACK 2656a5658c14b43c32959db7235e9db55a17d4c8 🏼
  kevkevinpal:
    ACK [2656a56](2656a5658c)
  achow101:
    ACK 2656a5658c14b43c32959db7235e9db55a17d4c8
  tdb3:
    Code review and light test ACK 2656a5658c14b43c32959db7235e9db55a17d4c8

Tree-SHA512: 3ba17dc670126adda104148e14d1322ea4f67d671c84aaa9c08c760ef778ca1936832c0dc843cd6367e09939f64c6f0a682b0fa23a5967e821b899dff1fff961
2025-01-24 18:15:00 -05:00
TheCharlatan
0cdddeb224
kernel: Move block tree db open to BlockManager constructor
Make the block db open RAII style by calling it in the BlockManager
constructor.

Before this change the block tree db was needlessly re-opened during
startup when loading a completed snapshot. Improve this by letting the
block manager open it on construction. This also simplifies the test
code a bit.

The change was initially motivated to make it easier for users of the
kernel library to instantiate a BlockManager that may be used to read
data from disk without loading the block index into a cache.
2025-01-20 21:27:50 +01:00
Cory Fields
1db331ba76 init: allow a new xor key to be written if the blocksdir is newly created
A subsequent commit will add a .lock file to this dir at startup, meaning that
the blocksdir is never empty by the time the xor key is being read/written.

Ignore all hidden files when determining if this is the first run.
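
A rough sketch of such a check using `std::filesystem`, treating any entry whose name starts with a dot as hidden. `IsFirstRun` is a hypothetical helper for this example and is not the function changed by this commit.

```cpp
#include <algorithm>
#include <cassert>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: true if `dir` holds no entries other than hidden (dot-prefixed) ones.
bool IsFirstRun(const fs::path& dir)
{
    assert(fs::is_directory(dir));
    return std::none_of(fs::directory_iterator{dir}, fs::directory_iterator{}, [](const fs::directory_entry& entry) {
        const std::string name{entry.path().filename().string()};
        return name.empty() || name.front() != '.'; // a visible entry means this is not the first run
    });
}
```

Under this definition a directory containing only `.lock` still counts as a first run, while any visible entry makes it a non-first run.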
2025-01-16 21:06:21 +00:00
Lőrinc
223081ece6 scripted-diff: rename block and undo functions for consistency
Co-authored-by: Ryan Ofsky <ryan@ofsky.org>
Co-authored-by: Hodlinator <172445034+hodlinator@users.noreply.github.com>

-BEGIN VERIFY SCRIPT-
grep -r -wE 'WriteBlock|ReadRawBlock|ReadBlock|WriteBlockUndo|ReadBlockUndo' $(git ls-files src/ ':!src/leveldb') && \
    echo "Error: One or more target names already exist!" && exit 1
sed -i \
    -e 's/\bSaveBlockToDisk/WriteBlock/g' \
    -e 's/\bReadRawBlockFromDisk/ReadRawBlock/g' \
    -e 's/\bReadBlockFromDisk/ReadBlock/g' \
    -e 's/\bWriteUndoDataForBlock/WriteBlockUndo/g' \
    -e 's/\bUndoReadFromDisk/ReadBlockUndo/g' \
    $(git ls-files src/ ':!src/leveldb')
-END VERIFY SCRIPT-
2025-01-09 15:17:02 +01:00
Lőrinc
baaa3b2846 refactor,blocks: remove costly asserts and modernize affected logs
When the behavior was changed in a previous commit (caching `GetSerializeSize` and avoiding `AutoFile.tell`), (static) asserts were added so that reviewers and CI could validate that the behavior was preserved.
We can safely remove them now.

The affected logs were also slightly modernized, since doing so was trivial.

Co-authored-by: Anthony Towns <aj@erisian.com.au>
Co-authored-by: Hodlinator <172445034+hodlinator@users.noreply.github.com>
2025-01-09 15:16:49 +01:00