Mechanical refactor of the low-level "xor" wording to signal the intent rather than the implementation used.
The renames are ordered with the heaviest-hitting substitutions first and were constructed so that the code still compiles after each replacement.
-BEGIN VERIFY SCRIPT-
sed -i \
-e 's/\bGetObfuscateKey\b/GetObfuscation/g' \
-e 's/\bxor_key\b/obfuscation/g' \
-e 's/\bxor_pat\b/obfuscation/g' \
-e 's/\bm_xor_key\b/m_obfuscation/g' \
-e 's/\bm_xor\b/m_obfuscation/g' \
-e 's/\bobfuscate_key\b/m_obfuscation/g' \
-e 's/\bOBFUSCATE_KEY_KEY\b/OBFUSCATION_KEY_KEY/g' \
-e 's/\bSetXor(/SetObfuscation(/g' \
-e 's/\bdata_xor\b/obfuscation/g' \
-e 's/\bCreateObfuscateKey\b/CreateObfuscation/g' \
-e 's/\bobfuscate key\b/obfuscation key/g' \
$(git ls-files '*.cpp' '*.h')
-END VERIFY SCRIPT-
Since 31-byte xor keys are not used in the codebase, using the common size (8 bytes) makes the benchmarks more realistic.
Co-authored-by: maflcko <6399679+maflcko@users.noreply.github.com>
c10e382d2a3b76b70ebb8a4eb5cd99fc9f14d702 flatfile: check whether the file has been closed successfully (Vasil Dimov)
4bb5dd78ea4b578922a3316b37b486f96cb0beec util: check that a file has been closed before ~AutoFile() is called (Vasil Dimov)
8bb34f07df9ad45faf25c32c99a4dd70759b25be Explicitly close all AutoFiles that have been written (Vasil Dimov)
a69c4098b273b6db5d2212ba91cfc713c1634c5d rpc: take ownership of the file by WriteUTXOSnapshot() (Hodlinator)
Pull request description:
`fclose(3)` may fail to flush the previously written data to disk, thus a failing `fclose(3)` is as serious as a failing `fwrite(3)`.
Previously the code ignored `fclose(3)` failures. This PR improves that by requiring all users of `AutoFile` that write data to explicitly close the file and handle a possible error.
---
Other alternatives are:
1. `fflush(3)` after each write to the file (and throw from the `AutoFile::write()` method if it fails) and hope that `fclose(3)` will then always succeed. Assert that it succeeds from the destructor 🙄. Will hurt performance.
2. Throw from the destructor anyway. In C++, throwing an exception while another exception is already propagating (I think) results in terminating the program without a useful message.
3. (this is implemented in the latest incarnation of this PR) Redesign `AutoFile` so that its destructor cannot fail. Adjust _all_ its users 😭. For example, if the file has been written to, then require the callers to explicitly call the `AutoFile::fclose()` method before the object goes out of scope. In the destructor, as a sanity check, assume/assert that this is indeed the case. Defeats the purpose of a RAII wrapper for `FILE*`, which automatically closes the file when it goes out of scope, and there are a lot of users of `AutoFile`.
4. Pass a new callback function to the `AutoFile` constructor which will be called from the destructor to handle `fclose()` errors, as described in https://github.com/bitcoin/bitcoin/pull/29307#issuecomment-2243842400. My thinking is that if that callback is going to only log a message, then we can log the message directly from the destructor without needing a callback. If the callback is going to do more complicated error handling then it is easier to do that at the call site by directly calling `AutoFile::fclose()` instead of getting the `AutoFile` object out of scope (so that its destructor is called) and inspecting for side effects done by the callback (e.g. set a variable to indicate a failed `fclose()`).
ACKs for top commit:
l0rinc:
ACK c10e382d2a3b76b70ebb8a4eb5cd99fc9f14d702
achow101:
ACK c10e382d2a3b76b70ebb8a4eb5cd99fc9f14d702
hodlinator:
re-ACK c10e382d2a3b76b70ebb8a4eb5cd99fc9f14d702
Tree-SHA512: 3994ca57e5b2b649fc84f24dad144173b7500fc0e914e06291d5c32fbbf8d2b1f8eae0040abd7a5f16095ddf4e11fe1636c6092f49058cda34f3eb2ee536d7ba
There is no way to report a close error from the `AutoFile` destructor.
Such an error could be serious if the file has been written to because
it may mean the file is now corrupted (same as if write fails).
So, change all users of `AutoFile` that use it to write data to
explicitly close the file and handle a possible error.
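For illustration, here is a minimal standalone sketch of the pattern (plain `FILE*` rather than `AutoFile`; the helper name is hypothetical): a failed `fclose(3)` is treated exactly like a failed `fwrite(3)`.

```cpp
#include <cstdio>
#include <stdexcept>
#include <string>

// Hypothetical helper: write data and close explicitly, treating a failed fclose()
// like a failed fwrite(), since buffered data may never have reached the disk.
void WriteFileChecked(const std::string& path, const std::string& data)
{
    std::FILE* f{std::fopen(path.c_str(), "wb")};
    if (!f) throw std::runtime_error("fopen failed");
    if (std::fwrite(data.data(), 1, data.size(), f) != data.size()) {
        std::fclose(f);
        throw std::runtime_error("fwrite failed");
    }
    if (std::fclose(f) != 0) {  // may fail while flushing previously written data
        throw std::runtime_error("fclose failed: file contents may be incomplete");
    }
}
```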
Eliminate one SHA-256 double-hash computation of the header per block read by reusing the hash for:
* proof-of-work verification;
* (optional) integrity check against the supplied hash.
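A rough sketch of the reuse (assuming Bitcoin Core's `CBlock`, `uint256` and `CheckProofOfWork`; the wrapper function itself is illustrative only):

```cpp
// Illustrative only: hash the 80-byte header once and reuse the result.
bool CheckReadBlock(const CBlock& block, const uint256* expected_hash, const Consensus::Params& params)
{
    const uint256 hash{block.GetHash()};                              // single SHA-256d of the header
    if (!CheckProofOfWork(hash, block.nBits, params)) return false;   // reuse for PoW verification
    if (expected_hash && hash != *expected_hash) return false;        // reuse for the optional integrity check
    return true;
}
```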
The obfuscation (XOR) operations are currently done byte-by-byte during serialization. Buffering the reads will enable batching the obfuscation operations later.
Different operating systems handle file caching differently, so reading larger batches (and processing them from memory) is measurably faster, likely because of fewer native fread calls and reduced lock contention.
Note that `ReadRawBlock` doesn't need buffering since it already reads the whole block directly.
Unlike `ReadBlockUndo`, the new `ReadBlock` implementation delegates to `ReadRawBlock`, which uses more memory than a buffered alternative but results in slightly simpler code and a small performance increase (~0.4%). This approach also clearly documents that `ReadRawBlock` is a logical subset of `ReadBlock` functionality.
The current implementation, which iterates over a fixed-size buffer, provides a more general alternative to Cory Fields' solution of reading the entire block size in advance.
Buffer sizes were selected based on benchmarking to ensure the buffered reader produces performance similar to reading the whole block into memory. Smaller buffers were slower, while larger ones showed diminishing returns.
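A simplified, standalone sketch of the batching idea (buffer and key sizes here are assumptions for illustration, not the values used in the PR):

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdio>
#include <span>
#include <stdexcept>
#include <vector>

// XOR one chunk against a rolling key in a single pass instead of byte-by-byte per field.
void XorChunk(std::span<std::byte> chunk, std::span<const std::byte, 8> key, size_t key_offset)
{
    for (size_t i{0}; i < chunk.size(); ++i) chunk[i] ^= key[(key_offset + i) % key.size()];
}

// Read `total` obfuscated bytes in fixed-size batches and de-obfuscate each batch from memory.
std::vector<std::byte> ReadObfuscated(std::FILE* file, size_t total, std::span<const std::byte, 8> key)
{
    constexpr size_t BUFFER{1 << 16};  // illustrative chunk size
    std::vector<std::byte> out(total);
    size_t done{0};
    while (done < total) {
        const size_t want{std::min(BUFFER, total - done)};
        if (std::fread(out.data() + done, 1, want, file) != want) throw std::runtime_error("fread failed");
        XorChunk(std::span{out}.subspan(done, want), key, done);
        done += want;
    }
    return out;
}
```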
------
> macOS Sequoia 15.3.1
> C++ compiler .......................... Clang 19.1.7
> cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ && cmake --build build -j$(nproc) && build/bin/bench_bitcoin -filter='ReadBlockBench' -min-time=10000
Before:
| ns/op | op/s | err% | total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
| 2,271,441.67 | 440.25 | 0.1% | 11.00 | `ReadBlockBench`
After:
| ns/op | op/s | err% | total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
| 1,738,971.29 | 575.05 | 0.2% | 10.97 | `ReadBlockBench`
------
> Ubuntu 24.04.2 LTS
> C++ compiler .......................... GNU 13.3.0
> cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_COMPILER=gcc -DCMAKE_CXX_COMPILER=g++ && cmake --build build -j$(nproc) && build/bin/bench_bitcoin -filter='ReadBlockBench' -min-time=20000
Before:
| ns/op | op/s | err% | ins/op | cyc/op | IPC | bra/op | miss% | total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
| 6,895,987.11 | 145.01 | 0.0% | 71,055,269.86 | 23,977,374.37 | 2.963 | 5,074,828.78 | 0.4% | 22.00 | `ReadBlockBench`
After:
| ns/op | op/s | err% | ins/op | cyc/op | IPC | bra/op | miss% | total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
| 5,771,882.71 | 173.25 | 0.0% | 65,741,889.82 | 20,453,232.33 | 3.214 | 3,971,321.75 | 0.3% | 22.01 | `ReadBlockBench`
Co-authored-by: maflcko <6399679+maflcko@users.noreply.github.com>
Co-authored-by: Ryan Ofsky <ryan@ofsky.org>
Co-authored-by: Martin Leitner-Ankerl <martin.ankerl@gmail.com>
Co-authored-by: Cory Fields <cory-nospam-@coryfields.com>
Made every OpenBlockFile#fReadOnly value explicit.
Replaced hard-coded values in ReadRawBlock with STORAGE_HEADER_BYTES.
Changed `STORAGE_HEADER_BYTES` and `UNDO_DATA_DISK_OVERHEAD` to `uint32_t` to avoid casts.
Also added `LIFETIMEBOUND` to the `AutoFile` parameter of `BufferedFile`, which stores a reference to the underlying `AutoFile`, allowing Clang to emit warnings if the referenced `AutoFile` might be destroyed while `BufferedFile` still exists.
Without this attribute, code with lifetime violations wouldn't trigger compiler warnings.
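A minimal sketch of what the attribute buys us (the `LIFETIMEBOUND` macro and the types here are simplified stand-ins, not the real Bitcoin Core definitions):

```cpp
#if defined(__clang__)
#define LIFETIMEBOUND [[clang::lifetimebound]]
#else
#define LIFETIMEBOUND
#endif

struct File {};  // stand-in for AutoFile

class BufferedReader
{
    File& m_src;  // stores a reference to the underlying file
public:
    explicit BufferedReader(File& src LIFETIMEBOUND) : m_src{src} {}
    File& Source() const { return m_src; }
};

BufferedReader MakeDangling()
{
    File f;
    return BufferedReader{f};  // Clang can now warn: the returned object outlives 'f'
}
```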
Co-authored-by: maflcko <6399679+maflcko@users.noreply.github.com>
Renames the constant to be less verbose and better reflect its purpose:
it represents the size of the storage header that precedes serialized block data on disk,
not to be confused with a block's own header.
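A sketch of what the constant covers, assuming the usual on-disk record layout (4-byte network magic followed by a 4-byte length prefix before the serialized block):

```cpp
#include <cstdint>

// [ 4-byte network magic | 4-byte serialized size | serialized block data ... ]
// Illustrative definition only; the real constant is derived from the message-start size.
static constexpr uint32_t STORAGE_HEADER_BYTES{4 + 4};
```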
-BEGIN VERIFY SCRIPT-
git grep -q "STORAGE_HEADER_BYTES" $(git ls-files) && echo "Error: Target name STORAGE_HEADER_BYTES already exists in the codebase" && exit 1
sed -i 's/BLOCK_SERIALIZATION_HEADER_SIZE/STORAGE_HEADER_BYTES/g' $(git grep -l 'BLOCK_SERIALIZATION_HEADER_SIZE')
-END VERIFY SCRIPT-
`AutoFile{OpenUndoFile(pos)}` was still in scope when `FlushUndoFile(pos.nFile)` was called, which could lead to file handle conflicts or other unexpected behavior.
Co-authored-by: Hodlinator <172445034+hodlinator@users.noreply.github.com>
Co-authored-by: maflcko <6399679+maflcko@users.noreply.github.com>
Reorganized error handling in block-related operations by grouping related operations together within the same scope.
In `ReadBlockUndo()` and `ReadBlock()`, moved all deserialization operations, comments and checksum verification inside a single try/catch block for cleaner error handling.
In `WriteBlockUndo()`, consolidated hash calculation and data writing operations within a common block to better express their logical relationship.
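As a fragment-level sketch of the grouping in `ReadBlockUndo()` (the surrounding variables `filein`, `blockundo`, `index` and `pos` are assumed from the function's context; this is not the exact code):

```cpp
try {
    // Deserialize the undo data and verify the checksum within the same scope,
    // so every failure funnels into one error path.
    HashVerifier verifier{filein};
    verifier << index.pprev->GetBlockHash();
    verifier >> blockundo;
    uint256 checksum;
    filein >> checksum;
    if (checksum != verifier.GetHash()) {
        LogError("Checksum mismatch at %s\n", pos.ToString());
        return false;
    }
} catch (const std::exception& e) {
    LogError("Deserialize or I/O error - %s at %s\n", e.what(), pos.ToString());
    return false;
}
```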
0cdddeb2240d1f33c8b2dd28bb0c9d84d9420e3d kernel: Move block tree db open to BlockManager constructor (TheCharlatan)
7fbb1bc44b1461f008284533f1667677e729f0c0 kernel: Move block tree db open to block manager (TheCharlatan)
57ba59c0cdf20de322afabe4a132ad17e483ce77 refactor: Remove redundant reindex check (TheCharlatan)
Pull request description:
Before this change the block tree db was needlessly re-opened during startup when loading a completed snapshot. Improve this by letting the block manager open it on construction. This also simplifies the test code a bit.
The change was initially motivated to make it easier for users of the kernel library to instantiate a BlockManager that may be used to read data from disk without loading the block index into a cache.
ACKs for top commit:
maflcko:
re-ACK 0cdddeb2240d1f33c8b2dd28bb0c9d84d9420e3d 🏪
achow101:
ACK 0cdddeb2240d1f33c8b2dd28bb0c9d84d9420e3d
mzumsande:
re-ACK 0cdddeb2240d1f33c8b2dd28bb0c9d84d9420e3d
Tree-SHA512: fe3d557a725367e549e6a0659f64259cfef6aaa565ec867d9a177be0143ff18a2c4a20dd57e35e15f97cf870df476d88c05b03b6a7d9e8d51c568d9eda8947ef
2656a5658c14b43c32959db7235e9db55a17d4c8 tests: add a test for the new blocksdir lock (Cory Fields)
bdc0a68e676f237bcbb5195a60bb08bb34123e71 init: lock blocksdir in addition to datadir (Cory Fields)
cabb2e5c24282c88ccc7fcaede854eaa8d7ff162 refactor: introduce a more general LockDirectories for init (Cory Fields)
1db331ba764d27537d82abd91dde50fc9d7e0ff4 init: allow a new xor key to be written if the blocksdir is newly created (Cory Fields)
Pull request description:
This probably should've been included in #12653 when `-blocksdir` was introduced. Credit TheCharlatan for noticing that it's missing.
This guards against 2 processes running with separate datadirs but the same blocksdir. I didn't add `walletdir` as I assume sqlite has us covered there.
It's not likely to happen currently, but may be more relevant in the future with applications using the kernel. Note that the kernel does not currently do any dir locking, but it should.
ACKs for top commit:
maflcko:
review ACK 2656a5658c14b43c32959db7235e9db55a17d4c8 🏼
kevkevinpal:
ACK [2656a56](2656a5658c)
achow101:
ACK 2656a5658c14b43c32959db7235e9db55a17d4c8
tdb3:
Code review and light test ACK 2656a5658c14b43c32959db7235e9db55a17d4c8
Tree-SHA512: 3ba17dc670126adda104148e14d1322ea4f67d671c84aaa9c08c760ef778ca1936832c0dc843cd6367e09939f64c6f0a682b0fa23a5967e821b899dff1fff961
Make the block db open RAII-style by opening it in the BlockManager
constructor.
Before this change the block tree db was needlessly re-opened during
startup when loading a completed snapshot. Improve this by letting the
block manager open it on construction. This also simplifies the test
code a bit.
The change was initially motivated to make it easier for users of the
kernel library to instantiate a BlockManager that may be used to read
data from disk without loading the block index into a cache.
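A simplified, hedged sketch of the resulting RAII shape (the types here are stand-ins, not the real `BlockTreeDB` or options):

```cpp
#include <memory>
#include <string>

struct BlockTreeDB { explicit BlockTreeDB(const std::string& path) { (void)path; /* open the LevelDB at `path` */ } };
struct Options { std::string block_tree_db_path; };

class BlockManager
{
    std::unique_ptr<BlockTreeDB> m_block_tree_db;
public:
    // The db is opened on construction and closed automatically on destruction;
    // there is no separate "open" step for callers to forget or repeat.
    explicit BlockManager(const Options& opts)
        : m_block_tree_db{std::make_unique<BlockTreeDB>(opts.block_tree_db_path)} {}
};
```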
A subsequent commit will add a .lock file to this dir at startup, meaning that
the blocksdir is never empty by the time the xor key is being read/written.
Ignore all hidden files when determining if this is the first run.
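A small illustrative helper (the name is hypothetical) showing the hidden-file rule: a directory that only contains dot-files, such as the upcoming `.lock`, still counts as a first run.

```cpp
#include <filesystem>

bool BlocksDirIsEffectivelyEmpty(const std::filesystem::path& blocksdir)
{
    for (const auto& entry : std::filesystem::directory_iterator{blocksdir}) {
        // Skip hidden files (".lock", ".DS_Store", ...) when deciding whether this is a first run.
        if (!entry.path().filename().string().starts_with(".")) return false;
    }
    return true;
}
```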
When the behavior was changed in a previous commit (caching `GetSerializeSize` and avoiding `AutoFile.tell`), (static) asserts were added to make sure the behavior was kept, so that reviewers and CI could validate it.
We can safely remove them now.
Logs were also slightly modernized, since the changes were trivial.
Co-authored-by: Anthony Towns <aj@erisian.com.au>
Co-authored-by: Hodlinator <172445034+hodlinator@users.noreply.github.com>
For consistency, `UNDO_DATA_DISK_OVERHEAD` was also extracted to avoid an ambiguous constant.
Asserts were added to help with the review - they are removed in the next commit.
Co-authored-by: Ryan Ofsky <ryan@ofsky.org>
Similarly, `WriteBlockToDisk` wasn't really extracting a meaningful subset of the `SaveBlockToDisk` functionality; it was tied closely to its only caller (it needed the header size twice, recalculated the block's serialized size, returned from multiple branches, and mutated a parameter).
The inlined code should only differ in these parts (modernization will be done in other commits):
* renamed `blockPos` to `pos` in `SaveBlockToDisk` to match the parameter name;
* changed `return false` to `return FlatFilePos()`.
Also removed remaining references to `SaveBlockToDisk`.
Co-authored-by: Ryan Ofsky <ryan@ofsky.org>
`UndoWriteToDisk` wasn't really extracting a meaningful subset of the `WriteUndoDataForBlock` functionality; it was tied closely to its only caller (it needed the header size twice, recalculated the undo data's serialized size, returned from multiple branches, modified a parameter, and needed documentation).
The inlined code should only differ in these parts (modernization will be done in other commits):
* renamed `_pos` to `pos` in `WriteUndoDataForBlock` to match the parameter name;
* inlined `hashBlock` parameter usage into `hasher << block.pprev->GetBlockHash()`;
* changed `return false` to `return FatalError`;
* capitalized the comment.
Co-authored-by: Ryan Ofsky <ryan@ofsky.org>
a2955f09792b6232f3a45aa44a498b466279a8b7 validation: Use span for ImportBlocks paths (TheCharlatan)
20515ea3f5bd426f6e3746cf5cddd2324dacae31 validation: Use span for CalculateClaimedHeadersWork (TheCharlatan)
52575e96e72a0402c448f86728b2e84836b1e817 validation: Use span for ProcessNewBlockHeaders (TheCharlatan)
Pull request description:
Makes it friendlier for potential future users of the kernel library if they do not store the headers in a std::vector, but can guarantee contiguous memory.
Take this opportunity to also change the argument of ImportBlocks previously taking a `std::vector` to a `std::span`.
ACKs for top commit:
stickies-v:
re-ACK a2955f09792b6232f3a45aa44a498b466279a8b7 - no changes except further walking the ~file~ path of modernizing variable names.
maflcko:
ACK a2955f09792b6232f3a45aa44a498b466279a8b7 🕑
achow101:
ACK a2955f09792b6232f3a45aa44a498b466279a8b7
danielabrozzoni:
ACK a2955f09792b6232f3a45aa44a498b466279a8b7
Tree-SHA512: 8b07f4ad26e270b65600d1968cd78847b85caca5bfbb83fd9860389f26656b1d9a40b85e0990339f50403d18cedcd2456990054f3b8b0bedce943e50222d2709
Makes it friendlier for potential future users of the kernel library if
they do not store the headers in a std::vector, but can guarantee
contiguous memory.
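As a sketch of the signature change (types reduced to stand-ins; the real functions live in validation):

```cpp
#include <span>
#include <vector>

struct CBlockHeader {};  // stand-in for the real header type

// Before: callers had to own a std::vector<CBlockHeader>.
// void ProcessNewBlockHeaders(const std::vector<CBlockHeader>& headers);

// After: any contiguous range works - a vector, a C array, or a sub-range of either.
void ProcessNewBlockHeaders(std::span<const CBlockHeader> headers)
{
    for (const auto& header : headers) { (void)header; /* ... */ }
}

void Example()
{
    std::vector<CBlockHeader> from_vector(3);
    CBlockHeader from_array[2];
    ProcessNewBlockHeaders(from_vector);  // implicit conversion to span
    ProcessNewBlockHeaders(from_array);
}
```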
fa895c72832f9555b52d5bb1dba1093f73de3136 mingw: Document mode wbx workaround (MarcoFalke)
fa359255fe6b4de5f26784bfc147dbfb58bef116 Add -blocksxor boolean option (MarcoFalke)
fa7f7ac040a9467c307b20e77dc47c87d7377ded Return XOR AutoFile from BlockManager::Open*File() (MarcoFalke)
Pull request description:
Currently the *.dat files in the blocksdir store the data received from remote peers as-is. This may be problematic when a program other than Bitcoin Core tries to interpret them by accident. For example, an anti-virus program or other program may scan them and move them into quarantine, or delete them, or corrupt them. This may cause Bitcoin Core to fail a reorg, or fail to reply to block requests (via P2P, RPC, REST, ...).
Fix this, similar to https://github.com/bitcoin/bitcoin/pull/6650, by rolling a random XOR pattern over the dat files when writing or reading them.
Obviously this can only protect against programs that accidentally and unintentionally are trying to mess with the dat files. Any program that intentionally wants to mess with the dat files can still trivially do so.
The XOR pattern is only applied when the blocksdir is freshly created, and there is an option to disable it (on creation), so that people can disable it, if needed.
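A minimal sketch of the key-creation side of this (illustrative only: the real code uses the node's RNG rather than `std::random_device`, and applies the key via the serialization layer):

```cpp
#include <array>
#include <cstddef>
#include <random>

// An all-zero key means "no obfuscation" (-blocksxor=0); otherwise pick 8 random
// bytes once, when the blocksdir is first created, and XOR all *.dat I/O with them.
std::array<std::byte, 8> MakeBlocksXorKey(bool blocksxor_enabled)
{
    std::array<std::byte, 8> key{};
    if (!blocksxor_enabled) return key;
    std::random_device rng;
    for (auto& b : key) b = static_cast<std::byte>(rng() & 0xFF);
    return key;
}
```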
ACKs for top commit:
achow101:
ACK fa895c72832f9555b52d5bb1dba1093f73de3136
TheCharlatan:
Re-ACK fa895c72832f9555b52d5bb1dba1093f73de3136
hodlinator:
ACK fa895c72832f9555b52d5bb1dba1093f73de3136
Tree-SHA512: c92a6a717da83bc33a9b8671a779eeefde2c63b192362ba1d71e6535ee31d08e2802b74acc908345197de9daac6930e4771595ee25b09acd5a67f7ea34854720
Instead of constructing a new instance every time a file operation is done,
construct one each for the undo and block files when a new
BlockManager is created.
In future, this might make it easier to introduce an abstract block
store.
fa14e1d9d5c5dc44396a01583ae94480b7bc29ee log: Fix __func__ in LogError in blockstorage module (MarcoFalke)
fad59a2f0f37f5b7f6076fd91be43448e35f4b7e log: LogError with FlatFilePos in UndoReadFromDisk (MarcoFalke)
aaaa3323f37526862ebf2a2a4bf522c661e6976e refactor: Mark IsBlockPruned const (MarcoFalke)
Pull request description:
These errors should never happen in normal operation. If they do,
knowing the `FlatFilePos` may be useful to determine if data corruption
happened. Also, handle the error `pos.IsNull()` as part of `OpenUndoFile`,
because it may as well have happened due to data corruption.
This mirrors the `LogError` behavior from `ReadBlockFromDisk`.
Also, two other fixup commits in this module.
ACKs for top commit:
kevkevinpal:
ACK [fa14e1d](fa14e1d9d5)
tdb3:
cr and light test ACK fa14e1d9d5c5dc44396a01583ae94480b7bc29ee
ryanofsky:
Code review ACK fa14e1d9d5c5dc44396a01583ae94480b7bc29ee. This should make logging clearer and more consistent
Tree-SHA512: abb492a919b4796698d1de0a7874c8eae355422b992aa80dcd6b59c2de1ee0d2949f62b3cf649cd62892976fee640358f7522867ed9d48a595d6f8f4e619df50
These errors should never happen. However, when they do happen, it is
useful to log the correct error location (function name).
For example, this fixes an incorrect "ConnectBlock()" in
"WriteUndoDataForBlock".
These errors should never happen in normal operation. If they do,
knowing the FlatFilePos may be useful to determine if data corruption
happened. Also, handle the error pos.IsNull() as part of OpenUndoFile,
because it may as well have happened due to data corruption.
This mirrors the LogError behavior from ReadBlockFromDisk.
8789dc8f315a9d9ad7142d831bc9412f780248e7 doc: Add note to getblockfrompeer on missing undo data (Fabian Jahr)
4a1975008b602aeacdad9a74d1837a7455148074 rpc: Make pruneheight also reflect undo data presence (Fabian Jahr)
96b4facc912927305b06a233cb8b36e7e5964c08 refactor, blockstorage: Generalize GetFirstStoredBlock (Fabian Jahr)
Pull request description:
The function `GetFirstStoredBlock()` helps us find the first block for which we have data. So far this function only looked for a block with `BLOCK_HAVE_DATA`. However, this doesn't mean that we also have the undo data of that block, and undo data might be required for what a user would like to do with those blocks. One example of how this might happen is if some blocks were fetched using the `getblockfrompeer` RPC. Blocks fetched from a peer will have data but no undo data.
The first commit here allows `GetFirstStoredBlock()` to check for undo data as well by passing a parameter. This alone is useful for #29553 and I would use it there.
In the second commit I am applying the undo check to the RPCs that report `pruneheight` to the user. I find this much more intuitive because I think the user expects to be able to do all operations on blocks up until the `pruneheight` but that is not the case if undo data is missing. I personally ran into this once before and now again when testing for assumeutxo when I had used `getblockfrompeer`. The following commit adds test coverage for this change of behavior.
The last commit adds a note in the docs of `getblockfrompeer` that undo data will not be available.
ACKs for top commit:
achow101:
ACK 8789dc8f315a9d9ad7142d831bc9412f780248e7
furszy:
Code review ACK 8789dc8f315a9d9ad7142d831bc9412f780248e7.
stickies-v:
ACK 8789dc8f315a9d9ad7142d831bc9412f780248e7
Tree-SHA512: 90ae8bdd07a496ade579aa25240609c61c9ed173ad38d30533f6c631fe674e5a41727478ade69ca4b71a571ad94c9da4b33ebba6b5d8821109313c2de3bdfb3d
GetFirstStoredBlock is generalized to check for any data status with a
status mask that needs to be passed as a parameter. To reflect this the
function is also renamed to GetFirstBlock.
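A hedged sketch of the generalized walk (the `CBlockIndex` here is a bare stand-in; the real function lives on `BlockManager`):

```cpp
#include <cstdint>

struct CBlockIndex { uint32_t nStatus; CBlockIndex* pprev; };  // stand-in for the real type

// Walk back from `upper_block` while the requested status bits (e.g.
// BLOCK_HAVE_DATA | BLOCK_HAVE_UNDO) are still set, returning the lowest such block.
const CBlockIndex& GetFirstBlock(const CBlockIndex& upper_block, uint32_t status_mask)
{
    const CBlockIndex* last{&upper_block};
    for (const CBlockIndex* prev{upper_block.pprev}; prev && (prev->nStatus & status_mask) == status_mask; prev = prev->pprev) {
        last = prev;
    }
    return *last;
}
```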
Co-authored-by: stickies-v <stickies-v@protonmail.com>
This is just a mechanical change, renaming and inverting the meaning
of the indexing variable.
"m_blockfiles_indexed" is a more straightforward name for this variable
because this variable just indicates whether or not
<datadir>/blocks/blk?????.dat files have been indexed in the
<datadir>/blocks/index LevelDB database. The name "m_reindexing" was
more confusing: it could be true even if -reindex was not specified, and
false when it was specified. Also, the previous name unnecessarily
required thinking about the whole reindexing process just to understand
simple checks in validation code about whether blocks were indexed.
The motivation for this change is to follow up on previous commits,
moving away from having multiple variables called "reindex" internally,
and instead naming variables individually after what they do and
represent.
b47bd959207e82555f07e028cc2246943d32d4c3 kernel: De-globalize fReindex (TheCharlatan)
Pull request description:
fReindex is one of the last remaining globals exposed by the kernel library, so move it into the blockstorage class to reduce the amount of global mutable state and make the kernel library a bit less awkward to use.
---
This pull request is part of the [libbitcoinkernel project](https://github.com/bitcoin/bitcoin/issues/27587).
ACKs for top commit:
achow101:
ACK b47bd959207e82555f07e028cc2246943d32d4c3
ryanofsky:
Code review ACK b47bd959207e82555f07e028cc2246943d32d4c3. I rereviewed the whole PR, but the only change since last review was reverting the bugfix https://github.com/bitcoin/bitcoin/pull/29817#discussion_r1578327024 and make the change a pure refactoring.
mzumsande:
Code Review ACK b47bd959207e82555f07e028cc2246943d32d4c3
stickies-v:
ACK b47bd959207e82555f07e028cc2246943d32d4c3
Tree-SHA512: f7399d01f93bc0c0c7428fe95d19b9d29b4ed00a4f1deabca78fb0c4fecb434ec971e890feecb105938b5247c926850b1b7b4a4a9caa333a061e40777d0c8463
e41667b720372dae8438ea86e9819027e62b54e0 blockstorage: Don't move cursor backwards in UpdateBlockInfo (Ryan Ofsky)
17103637c6fa2dfcf5374ebb0cd715e540dd4ce1 blockstorage: Rename FindBlockPos and have it return a FlatFilePos (Martin Zumsande)
d9e477c4dc39d9623ed66c35c06e28f94ae62ad5 validation, blockstorage: Separate code paths for reindex and saving new blocks (Martin Zumsande)
064859bbad6984a6ec85c744064abdf757807c58 blockstorage: split up FindBlockPos function (Martin Zumsande)
fdae638e83522c28a1222e65c43d1cbca3e34cba doc: Improve doc for functions involved in saving blocks to disk (Martin Zumsande)
0d114e3cb20cb9e03fc9ba8daf3d03436b491742 blockstorage: Add Assume for fKnown / snapshot chainstate (Martin Zumsande)
Pull request description:
`SaveBlockToDisk` / `FindBlockPos` are used for two purposes, depending on whether they are called during reindexing (`dbp` set, `fKnown = true`) or in the "normal" case when adding new blocks (`dbp == nullptr`, `fKnown = false`).
The actual tasks are quite different
- In normal mode, preparations for saving a new block are made, which is then saved: find the correct position on disk (maybe skipping to a new blk file), check for available disk space, update the blockfile info db, save the block.
- During reindex, most of this is not necessary (the block is already on disk after all); only the blockfile info needs to be rebuilt because the reindex wiped the leveldb it is saved in.
Using one function with many conditional statements for this leads to code that is hard to read / understand and bug-prone:
- many code paths in `FindBlockPos` are conditional on `fKnown` or `!fKnown`
- It's not really clear what actually needs to be done during reindex (we don't need to "save a block to disk" or "find a block pos" as the function names suggest)
- logic that should be applied to only one of the two modes is sometimes applied to both (see first commit, or #27039)
#24858 and #27039 were recent bugs directly related to the differences between reindexing and normal mode, and in both cases the simple fix took a long time to be reviewed and merged.
This PR proposes to clean this code up by splitting out the reindex logic into a separate function (`UpdateBlockInfo`) which will be called directly from validation. As a result, `SaveBlockToDisk` and `FindBlockPos` only need to cover the non-reindex logic.
ACKs for top commit:
paplorinc:
ACK e41667b720372dae8438ea86e9819027e62b54e0
TheCharlatan:
Re-ACK e41667b720372dae8438ea86e9819027e62b54e0
ryanofsky:
Code review ACK e41667b720372dae8438ea86e9819027e62b54e0. Just improvements to comments since last review.
Tree-SHA512: a14ff9a0facf6b1e3c1cd724a2d19a79a25d4b48de64398fdd172671532a472bc10a20cbb64ac3a3e55814dcc877d0597a3e1699cabc4f9d9a86b439b6eaba20
fReindex is one of the last remaining globals exposed by the kernel
library, so move it into the blockstorage class to reduce the amount of
global mutable state and make the kernel library a bit less awkward to
use.
Previously, it was possible to move the cursor back to an older file
if blocks were encountered out of order during reindex.
This would mean that MaxBlockfileNum() would be incorrect, and
a wrong DB_LAST_BLOCK could be written to disk.
This improves the logic by only ever moving the cursor forward (if possible)
but not backwards.
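The rule boils down to something like the following sketch (names illustrative):

```cpp
#include <algorithm>

// Never move the blockfile cursor backwards: when a block from an older file shows up
// out of order during reindex, keep the cursor where it is so MaxBlockfileNum() and the
// DB_LAST_BLOCK record stay correct.
void AdvanceCursor(int& cursor_file_num, int encountered_file_num)
{
    cursor_file_num = std::max(cursor_file_num, encountered_file_num);
}
```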
Co-authored-by: Martin Zumsande <mzumsande@gmail.com>
The new name reflects that it is no longer called with existing blocks
for which the position is already known.
Returning a FlatFilePos directly simplifies the interface.
By calling SaveBlockToDisk only when we actually want to save a new
block to disk. In the reindex case, we now call UpdateBlockInfo
directly from validation.
This commit doesn't change behavior.