1632fc104be8f171f59a023800b2f9f20d0a3cff txgraph: Track multiple potential would-be clusters in Trim (improvement) (Pieter Wuille)
4608df37e02abde09da5d6668dcd4dc5a6e7c569 txgraph: add Trim benchmark (benchmark) (Pieter Wuille)
9c436ff01cffe51e34d0f29c445db11bb807c6c3 txgraph: add fuzz test scenario that avoids cycles inside Trim() (tests) (Pieter Wuille)
938e86f8fecd65ca90b97e6cf896f8c59fb590ba txgraph: add unit test for TxGraph::Trim (tests) (glozow)
a04e205ab03efa105f7886449c2c9316f67a85c1 txgraph: Add ability to trim oversized clusters (feature) (Pieter Wuille)
eabcd0eb6fca5790ea19e7c1613ae06df8d21918 txgraph: remove unnecessary m_group_oversized (simplification) (Greg Sanders)
19b14e61eae7f0f0bfc8d17b06cc0e3ebd1f994d txgraph: Permit transactions that exceed cluster size limit (feature) (Pieter Wuille)
c4287b9b71c6d5222bcd0d2af3185de93ce76078 txgraph: Add ability to configure maximum cluster size/weight (feature) (Pieter Wuille)
Pull request description:
Part of cluster mempool (#30289).
During reorganisations, it is possible that dependencies get added which would result in clusters that violate policy limits (cluster count, cluster weight), when linking the new from-block transactions to the old from-mempool transactions. Unlike RBF scenarios, we cannot simply reject the changes when they are due to received blocks. To accommodate this, add a `TxGraph::Trim()`, which removes some subset of transactions (including descendants) in order to make all resulting clusters satisfy the limits.
Conceptually, the way this is done is by defining a rudimentary linearization for the entire would-be too-large cluster, iterating it from beginning to end, and reasoning about the counts and weights of the clusters that would be reached using transactions up to that point. If a transaction is encountered whose addition would violate the limit, it is removed, together with all its descendants.
This rudimentary linearization is like a merge sort of the chunks of the clusters being combined, but respecting topology. More specifically, it is continuously picking the highest-chunk-feerate remaining transaction among those which have no unmet dependencies left. For efficiency, this rudimentary linearization is computed lazily, by putting all viable transactions in a heap, sorted by chunk feerate, and adding new transactions to it as they become viable.
The `Trim()` function is rather unusual compared to the `TxGraph` functionality added in previous PRs, in that `Trim()` makes it own decisions about what the resulting graph contents will be, without good specification of how it makes that decision - it is just a best-effort attempt (which is improved in the last commit). All other `TxGraph` mutators are simply to inform the graph about changes the calling mempool code decided on; this one lets the decision be made by txgraph.
As part of this, the "oversized" property is expanded to also encompass a configurable cluster weight limit (in addition to cluster count limit).
ACKs for top commit:
instagibbs:
reACK 1632fc104be8f171f59a023800b2f9f20d0a3cff
glozow:
reACK 1632fc104be via range-diff
ismaelsadeeq:
reACK 1632fc104be8f171f59a023800b2f9f20d0a3cff 🛰️
Tree-SHA512: ccacb54be8ad622bd2717905fc9b7e42aea4b07f824de1924da9237027a97a9a2f1b862bc6a791cbd2e1a01897ad2c7c73c398a2d5ccbce90bfbeac0bcebc9ce
c10e382d2a3b76b70ebb8a4eb5cd99fc9f14d702 flatfile: check whether the file has been closed successfully (Vasil Dimov)
4bb5dd78ea4b578922a3316b37b486f96cb0beec util: check that a file has been closed before ~AutoFile() is called (Vasil Dimov)
8bb34f07df9ad45faf25c32c99a4dd70759b25be Explicitly close all AutoFiles that have been written (Vasil Dimov)
a69c4098b273b6db5d2212ba91cfc713c1634c5d rpc: take ownership of the file by WriteUTXOSnapshot() (Hodlinator)
Pull request description:
`fclose(3)` may fail to flush the previously written data to disk, thus a failing `fclose(3)` is as serious as a failing `fwrite(3)`.
Previously the code ignored `fclose(3)` failures. This PR improves that by changing all users of `AutoFile` that use it to write data to explicitly close the file and handle a possible error.
---
Other alternatives are:
1. `fflush(3)` after each write to the file (and throw if it fails from the `AutoFile::write()` method) and hope that `fclose(3)` will then always succeed. Assert that it succeeds from the destructor 🙄. Will hurt performance.
2. Throw nevertheless from the destructor. Exception within the exception in C++ I think results in terminating the program without a useful message.
3. (this is implemented in the latest incarnation of this PR) Redesign `AutoFile` so that its destructor cannot fail. Adjust _all_ its users 😭. For example, if the file has been written to, then require the callers to explicitly call the `AutoFile::fclose()` method before the object goes out of scope. In the destructor, as a sanity check, assume/assert that this is indeed the case. Defeats the purpose of a RAII wrapper for `FILE*` which automatically closes the file when it goes out of scope and there are a lot of users of `AutoFile`.
4. Pass a new callback function to the `AutoFile` constructor which will be called from the destructor to handle `fclose()` errors, as described in https://github.com/bitcoin/bitcoin/pull/29307#issuecomment-2243842400. My thinking is that if that callback is going to only log a message, then we can log the message directly from the destructor without needing a callback. If the callback is going to do more complicated error handling then it is easier to do that at the call site by directly calling `AutoFile::fclose()` instead of getting the `AutoFile` object out of scope (so that its destructor is called) and inspecting for side effects done by the callback (e.g. set a variable to indicate a failed `fclose()`).
ACKs for top commit:
l0rinc:
ACK c10e382d2a3b76b70ebb8a4eb5cd99fc9f14d702
achow101:
ACK c10e382d2a3b76b70ebb8a4eb5cd99fc9f14d702
hodlinator:
re-ACK c10e382d2a3b76b70ebb8a4eb5cd99fc9f14d702
Tree-SHA512: 3994ca57e5b2b649fc84f24dad144173b7500fc0e914e06291d5c32fbbf8d2b1f8eae0040abd7a5f16095ddf4e11fe1636c6092f49058cda34f3eb2ee536d7ba
Trim internally builds an approximate dependency graph of the merged cluster,
replacing all existing dependencies within existing clusters with a simple
linear chain of dependencies. This helps keep the complexity of the merging
operation down, but may result in cycles to appear in the general case, even
though in real scenarios (where Trim is called for stitching re-added mempool
transactions after a reorg back to the existing mempool transactions) such
cycles are not possible.
Add a test that specifically targets Trim() but in scenarios where it is
guaranteed not to have any cycles. It is a special case, is much more a
whitebox test than a blackbox test, and relies on randomness rather than
fuzz input. The upside is that somewhat stronger properties can be tested.
Co-authored-by: Greg Sanders <gsanders87@gmail.com>
During reorganisations, it is possible that dependencies get add which
result in clusters that violate limits (count, size), when linking the
new from-block transactions to the old from-mempool transactions.
Unlike RBF scenarios, we cannot simply reject these policy violations
when they are due to received blocks. To accomodate this, add a Trim()
function to TxGraph, which removes transactions (including descendants)
in order to make all resulting clusters satisfy the limits.
In the initial version of the function added here, the following approach
is used:
- Lazily compute a naive linearization for the to-be-merged cluster (using
an O(n log n) algorithm, optimized for far larger groups of transactions
than the normal linearization code).
- Initialize a set of accepted transactions to {}
- Iterate over the transactions in this cluster one by one:
- If adding the transaction to the set makes it exceed the max cluster size
or count limit, stop.
- Add the transaction to the set.
- Remove all transactions from the cluster that were not included in the set
(note that this necessarily includes all descendants too, because they
appear later in the naive linearization).
Co-authored-by: Greg Sanders <gsanders87@gmail.com>
This removes the restriction added in the previous commit that individual
transactions do not exceed the max cluster size limit.
With this change, the responsibility for enforcing cluster size limits can
be localized purely in TxGraph, without callers (and in particular, tests)
needing to duplicate the enforcement for individual transactions.
This is integrated with the oversized property: the graph is oversized when
any connected component within it contains more than the cluster count limit
many transactions, or when their combined size/weight exceeds the cluster size
limit.
It becomes disallowed to call AddTransaction with a size larger than this limit,
though this limit will be lifted in the next commit.
In addition, SetTransactionFeeRate becomes SetTransactionFee, so that we do not
need to deal with the case that a call to this function might affect the
oversizedness.
a34fb9ad6c6cb4ffafdcefefa1ab957a430b69cf miniscript: Make `operator""_mst` `consteval` (Pieter Wuille)
14052162b19ac22f465f7db7880a6ab5d588a98c Revert "miniscript: make operator_mst consteval" (Hennadii Stepanov)
Pull request description:
Same as https://github.com/bitcoin/bitcoin/pull/28657, but without the refactoring required to work around [fixed](https://github.com/bitcoin/bitcoin/pull/28657#discussion_r2095743353) MSVC bugs.
The second commit has been taken from https://github.com/bitcoin/bitcoin/pull/29167.
ACKs for top commit:
sipa:
ACK a34fb9ad6c6cb4ffafdcefefa1ab957a430b69cf
hodlinator:
re-ACK a34fb9ad6c6cb4ffafdcefefa1ab957a430b69cf
Tree-SHA512: 8b531f9d6c450a8a5218865da05ffb5093d09ce2c0bee9874c0160795c4b1713928730d894ea3cd0b12b133346971ae3a00ed2fe8d9fd8a50b67a74ef81fde98
28299ce77636d7563ec545d043cf1b61bd2f01c1 p2p: remove vestigial READ_STATUS_CHECKBLOCK_FAILED (Greg Sanders)
bac9ee4830664c86c1cb3d38a5b19c722aae2f54 p2p: Add witness mutation check inside FillBlock (Greg Sanders)
Pull request description:
Since #29412, we have not allowed mutated blocks to continue being processed immediately the block is received, but this is only done for the legacy BLOCK message.
Extend these checks as belt-and-suspenders to not allow similar mutation strategies to affect relay by honest peers by applying the check inside `PartiallyDownloadedBlock::FillBlock`, immediately before returning `READ_STATUS_OK`.
ACKs for top commit:
Crypt-iQ:
ACK 28299ce77636d7563ec545d043cf1b61bd2f01c1
achow101:
ACK 28299ce77636d7563ec545d043cf1b61bd2f01c1
stratospher:
ACK 28299ce7.
dergoegge:
Code review ACK 28299ce77636d7563ec545d043cf1b61bd2f01c1
Tree-SHA512: 883d7c12ca096234b425e6fe12e46b0611607600916e6ac8d1c8112224aa76924b7b074754910163ac2ec15379075d618a9ece3642649ac7629cddb0d4e432ea
95969bc58ae0cd928e536d7cb8541de93e8c7205 test: added fuzz coverage to consensus/merkle.cpp (kevkevinpal)
Pull request description:
### Summary
This adds a new fuzz target "merkle" which adds fuzz coverage to `consensus/merkle.cpp`
I can also add this to an existing fuzz target if that is preferable
Before:

After:

ACKs for top commit:
marcofleon:
ReACK 95969bc58ae0cd928e536d7cb8541de93e8c7205
Prabhat1308:
ACK [`95969bc`](95969bc58a)
maflcko:
lgtm ACK 95969bc58ae0cd928e536d7cb8541de93e8c7205
achow101:
ACK 95969bc58ae0cd928e536d7cb8541de93e8c7205
Tree-SHA512: e1fe8b69444733516bfa6cf2adaa199fde4c7c5582b7b908408f9313ed0f2e8cb803d27d707a1716d49606d5eaef8c1e722990bbc3cffc30fa91fe73d2233e9d
This reverts commit 63317103c9f2b0635558da814567bb79c17ae851.
operator""_mst has been manually adjusted according to commit
faf21625652fd0d4bbf9b86fd9ebedb5857505ea
There is no way to report a close error from `AutoFile` destructor.
Such an error could be serious if the file has been written to because
it may mean the file is now corrupted (same as if write fails).
So, change all users of `AutoFile` that use it to write data to
explicitly close the file and handle a possible error.
Rather than this exhaustive linearization check happening inline inside
clusterlin_simple_linearize, abstract it out into a Linearize()-like
function for clarity.
Note that this isn't exactly a refactor, because the old code would compare the
found linearization against all (valid) permutations, while the new code instead
first computes the best linearization from all valid permutations, and then
compares it with the found one.
In several call sites for ReadTopologicalSubset, a non-empty result is
expected, necessitating a special case at the call site for empty results.
Fix this by adding a bool non_empty argument, which does this special
casing (more efficiently) inside ReadTopologicalSubset itself.
Whenever a non-topological permutation is encountered, fast forward to the
last permutation with the same non-topological prefix, skipping over
potentially many permutations that are non-topological for the same reason.
With that, increase the checking of all permutations to clusters of size 8
instead of 7.
The separates the existing fuzz test into:
* clusterlin_linearize: establishes the correctness of Linearize() using the
simpler SimpleLinearize() function.
* clusterlin_simple_linearize: establishes the correctness of SimpleLinearize() by
comparing with all valid linearizations computed by
std::next_permutation.
rather than combining the first two into a single fuzz test.
This separates the existing fuzz test into:
* clusterlin_search_finder: establishes SearchCandidateFinder's correctness using the
simpler SimpleCandidateFinder.
* clusterlin_simple_finder: establishes SimpleCandidateFinder's correctness using the
(even) simpler ExhaustiveCandidateFinder.
rather than trying to do both at once.
Only count the number of actual new subsets added. If the queue contains
a work item that completely covers a component, no transaction can be added
to it without creating a disconnected component. In this case, also don't
count it as an iteration.
With this, the number of iterations performed by SimpleCandidateFinder is
bounded by the number of distinct connected topologically-valid subsets of
the cluster.
IsValid() also returns false for blocks that have not been
validated yet up to the default validity level of BLOCK_VALID_TRANSACTIONS but
are not marked as invalid - e.g. if we only know the header.
Here, we specifically want to filter for invalid blocks.
Also removes the default arg from IsValid() which is now unused outside
of tests, to prevent this kind of misuse for the future.
Co-authored-by: TheCharlatan <seb.kung@gmail.com>
fa9ca13f35be0a023aeed78775ad66f95717b28b refactor: Sort includes of touched source files (MarcoFalke)
facb152697b8d7b75a9e6108f8896f774b06b35f scripted-diff: Bump copyright headers after include changes (MarcoFalke)
fae71d30f7227594e2f59499cf7f7f9420284e04 clang-tidy: Apply modernize-deprecated-headers (MarcoFalke)
Pull request description:
Bitcoin Core is written in C++, so it is confusing to sometimes use the deprecated C headers (with the `.h` extension). For example, it is less clear whether `string.h` refers to the file in this repo or the cstring stdlib header (https://github.com/bitcoin/bitcoin/pull/31308#discussion_r2121492797).
The check is currently disabled for headers, to exclude subtree headers.
ACKs for top commit:
l0rinc:
ACK fa9ca13f35be0a023aeed78775ad66f95717b28b
achow101:
ACK fa9ca13f35be0a023aeed78775ad66f95717b28b
janb84:
ACK fa9ca13f35be0a023aeed78775ad66f95717b28b
stickies-v:
ACK fa9ca13f35be0a023aeed78775ad66f95717b28b
Tree-SHA512: 6639608308c598d612e24435aa519afe92d71b955874b87e527245291fb874b67f3ab95d3a0a5125c6adce5eb41c0d62f6ca488fbbfd60a94f2063d734173f4d
Since #29412, we have not allowed mutated blocks to continue
being processed immediately the block is received, but this
is only done for the legacy BLOCK message.
Extend these checks as belt-and-suspenders to not allow
similar mutation strategies to affect relay by honest peers
by applying the check inside
PartiallyDownloadedBlock::FillBlock, immediately before
returning READ_STATUS_OK.
This also removes the extraneous CheckBlock call.
a189d636184b1c28fa4a325b56c1fab8f44527b1 add release note for datacarriersize default change (Greg Sanders)
a141e1bf501bb2660f3a62083a65678250085e56 Add more OP_RETURN mempool acceptance functional tests (Peter Todd)
0b4048c73385166144d0b3e76beb9a2ac4cc1eca datacarrier: deprecate startup arguments for future removal (Greg Sanders)
63091b79e70b8e230a122fa6fb3dac91c80638e7 test: remove unnecessary -datacarriersize args from tests (Greg Sanders)
9f36962b07eff2369577a17c8adeaa0433697e1c policy: uncap datacarrier by default (Greg Sanders)
Pull request description:
Retains the `-datacarrier*` args, marks them as deprecated, and does not require another startup argument for multiple OP_RETURN outputs.
If a user has set `-datacarriersize` the value is "budgeted" across all seen OP_RETURN output scriptPubKeys. In other words the total script bytes stays the same, but can be spread across any number of outputs. This is done to not introduce an additional argument to support multiple outputs.
I do not advise people use the option with custom arguments and it is marked as deprecated to not mislead as a promise to offer it forever. The argument itself can be removed in some future release to clean up the code and minimize footguns for users.
ACKs for top commit:
stickies-v:
re-ACK a189d636184b1c28fa4a325b56c1fab8f44527b1
Sjors:
re-ACK a189d636184b1c28fa4a325b56c1fab8f44527b1
polespinasa:
re-ACK a189d636184b1c28fa4a325b56c1fab8f44527b1
hodlinator:
re-ACK a189d636184b1c28fa4a325b56c1fab8f44527b1
ajtowns:
reACK a189d636184b1c28fa4a325b56c1fab8f44527b1
mzumsande:
re-ACK a189d636184b1c28fa4a325b56c1fab8f44527b1
petertodd:
ACK a189d636184b1c28fa4a325b56c1fab8f44527b1
theStack:
re-ACK a189d636184b1c28fa4a325b56c1fab8f44527b1
1440000bytes:
re-ACK a189d636184b1c28fa4a325b56c1fab8f44527b1
willcl-ark:
ACK a189d636184b1c28fa4a325b56c1fab8f44527b1
dergoegge:
ACK a189d636184b1c28fa4a325b56c1fab8f44527b1
fanquake:
ACK a189d636184b1c28fa4a325b56c1fab8f44527b1
murchandamus:
ACK a189d636184b1c28fa4a325b56c1fab8f44527b1
darosior:
Concept ACK a189d636184b1c28fa4a325b56c1fab8f44527b1.
Tree-SHA512: 3da2f1ef2f50884d4da7e50df2121bf175cb826edaa14ba7c3068a6d5b2a70beb426edc55d50338ee1d9686b9f74fdf9e10d30fb26a023a718dd82fa1e77b038
* Make the methods of `CThreadInterrupt` virtual and store a pointer to
it in `CConnman`, thus making it possible to override with a mocked
instance.
* Initialize `CConnman::m_interrupt_net` from the constructor, making it
possible for callers to supply mocked version.
* Introduce `FuzzedThreadInterrupt` and `ConsumeThreadInterrupt()` and
use them in `src/test/fuzz/connman.cpp` and `src/test/fuzz/i2p.cpp`.
This improves the CPU utilization of the `connman` fuzz test.
As a nice side effect, the `std::shared_ptr` used for
`CConnman::m_interrupt_net` resolves the possible lifetime issues with
it (see the removed comment for that variable).
Now that all network calls done by `CConnman::OpenNetworkConnection()`
are done via `Sock` they can be redirected (mocked) to `FuzzedSocket`
for testing.
`FuzzedSock::Accept()` properly returns a new socket, but it forgot to
set the output argument `addr`, like `accept(2)` is expected to.
This could lead to reading uninitialized data during testing when we
read it, e.g. from `CService::SetSockAddr()` which reads the `sa_family`
member.
Set `addr` to a fuzzed IPv4 or IPv6 address.
cfc42ae5b7ef6d3892907cfe36dc3ae923495d37 fuzz: add a target for the coins database (Antoine Poinsot)
46e14630f7fe8e2d51adc18a4f50345eb26970c7 fuzz: move the coins_view target's body into a standalone function (Antoine Poinsot)
56d878c4650cc46ce54d1d79db054ac44b486605 fuzz: avoid underflow in coins_view target (Antoine Poinsot)
Pull request description:
This reopens https://github.com/bitcoin/bitcoin/pull/28216.
The current `coins_view` target only tests `CCoinsViewCache` using a basic `CCoinsView` instance. The addition of the `coins_view_db` target enables testing with an actual `CCoinsViewDB` as the backend.
ACKs for top commit:
maflcko:
lgtm ACK cfc42ae5b7ef6d3892907cfe36dc3ae923495d37
l0rinc:
code review ACK cfc42ae5b7ef6d3892907cfe36dc3ae923495d37
TheCharlatan:
ACK cfc42ae5b7ef6d3892907cfe36dc3ae923495d37
Tree-SHA512: d3a92f122629f075767453a1abd9819a1c9716db53b997418993fef62d27683324740d0a8f84df76d8a7a45e508ccadeb69553b6f69e29a1238cd7c0be5276ca
Historically, the headers have been bumped some time after a file has
been touched. Do it now to avoid having to touch them again in the
future for that reason.
-BEGIN VERIFY SCRIPT-
sed -i --regexp-extended 's;( 20[0-2][0-9])(-20[0-2][0-9])? The Bitcoin Core developers;\1-present The Bitcoin Core developers;g' $( git show --pretty="" --name-only HEAD~0 )
-END VERIFY SCRIPT-
This can be reproduced according to the developer notes with something
like
( cd ./src/ && ../contrib/devtools/run-clang-tidy.py -p ../bld-cmake -fix -j $(nproc) )
Also, the header related changes were done manually.
Datacarrier output script sizes and output counts are now
uncapped by default.
To avoid introducing another startup argument, we modify the
OP_RETURN accounting to "budget" the spk sizes.
If a user has set a custom default, this results in that
budget being spent over the sum of all OP_RETURN outputs'
scripts in the transaction, no longer capping the number
of OP_RETURN outputs themselves. This should allow a
superset of current behavior while respecting the passed
argument in terms of total arbitrary data storage.
Co-authored-by: Anthony Towns <aj@erisian.com.au>
It reuses the logic from the `coins_view` target, except it uses an
in-memory CCoinsViewDB as the backend.
Note `CCoinsViewDB` will assert the best block hash is never null, so we
slightly modify the coins_view fuzz logic to take care of this.