c5b36b1c1b11f04e5da7fb44183f61d09a14e40d Mempool Update Cut-Through Optimization (Jeremy Rubin)
c49daf9885e86ba08acdc8332d2a34bc5951a487 [TESTS] Increase limitancestorcount in tournament RPC test to showcase improved algorithm (Jeremy Rubin)
Pull request description:
Often when we're updating mempool entries we update entries that we ultimately end up removing the updated entries shortly thereafter. This patch makes it so that we filter for such entries a bit earlier in processing, which yields a mild improvement for these cases, and is negligible overhead otherwise.
There's potential for a better -- but more sophisticated -- algorithm that can be used taking advantage of epochs, but I figured it is better to do something that is simple and works first and upgrade it later as the other epoch mempool work proceeds as it makes the patches for the epoch algorithm simpler to understand, so you can consider this as preparatory work. It could either go in now if it is not controversial, or we could wait until the other patch is ready to go.
ACKs for top commit:
instagibbs:
reACK c5b36b1
sipa:
utACK c5b36b1c1b11f04e5da7fb44183f61d09a14e40d
mzumsande:
Code Review ACK c5b36b1c1b11f04e5da7fb44183f61d09a14e40d
Tree-SHA512: 78b16864f77a637d8a68a65e23c019a9757d8b2243486728ef601d212ae482f6084cf8e69d810958c356f1803178046e4697207ba40d6d10529ca57de647fae6
b4adc5ad6769e4a5a6179dfff271cd4c9dc47a5b [bugfix] update lockpoints correctly during reorg (glozow)
b6002b07a36f0d58dc6becd04bfcf78599056b7c MOVEONLY: update_lock_points to txmempool.h (glozow)
Pull request description:
I introduced a bug in #22677 (sorry! 😅)
Mempool entries cache `LockPoints`, containing the first height/blockhash/`CBlockIndex*` at which the transaction becomes valid. During a reorg, we re-check timelocks on all mempool entries using `CheckSequenceLocks(useExistingLockPoints=false)` and remove any now-invalid entries. `CheckSequenceLocks()` also mutates the `LockPoints` passed in, and we update valid entries' `LockPoints` using `update_lock_points`. Thus, `update_lock_points(lp)` needs to be called right after `CheckSequenceLocks(lp)`, otherwise we lose the data in `lp`. I incorrectly assumed they could be called in separate loops.
The incorrect behavior introduced is: if we have a reorg in which a timelocked mempool transaction is still valid but becomes valid at a different block, the cached `LockPoints` will be incorrect.
This PR fixes the bug, adds a test, and adds an assertion at the end of `removeForReorg()` to check that all mempool entries' lockpoints are valid. You can reproduce the bug by running the test added in the [test] commit on the code before the [bugfix] commit.
ACKs for top commit:
jnewbery:
ACK b4adc5ad6769e4a5a6179dfff271cd4c9dc47a5b
vasild:
ACK b4adc5ad6769e4a5a6179dfff271cd4c9dc47a5b
mzumsande:
Code Review ACK b4adc5ad6769e4a5a6179dfff271cd4c9dc47a5b
hebasto:
ACK b4adc5ad6769e4a5a6179dfff271cd4c9dc47a5b
MarcoFalke:
re-ACK b4adc5ad6769e4a5a6179dfff271cd4c9dc47a5b 🏁
Tree-SHA512: 16b59f6ff8140d0229079ca1c6b04f2f4a00a2e49931275150e4f3fe5ac4ec109698b083fa6b223ba9511f328271cc1ab081263669d5da020af7fee83c13e401
Often when we're updating mempool entries we update entries that we
ultimately end up removing the updated entries shortly thereafter. This
patch makes it so that we filter for such entries a bit earlier in
processing, which yields a mild improvement for these cases, and is
negligible overhead otherwise.
During a reorg, we re-check timelocks on all mempool entries using
CheckSequenceLocks(useExistingLockPoints=false) and remove any
now-invalid entries. CheckSequenceLocks() also mutates the LockPoints
passed in, and we update valid entries' LockPoints using
update_lock_points. Thus, update_lock_points(lp) needs to be called
right after CheckSequenceLocks(lp), otherwise we lose the data in lp.
commit bedf246 introduced a bug by separating those two loops.
No behavior change.
Parameterize removeForReorg using a CChain and callable that
encapsulates validation logic. The mempool shouldn't need to know a
bunch of details about coinbase maturity and lock finality. Instead,
just pass in a callable function that says true/false. Breaks circular
dependency by removing txmempool's dependency on validation.
Each entry's lockpoints are independent of one another, so there isn't
any reason to update lockpoints for entries that will be removed.
Separating the loops also allows us to move validation logic out and
leave lockpoints in txmempool.
65aaf9495d19ea3fb875228a7e14aab6c1f2986d refactor: move `update_*` structs from txmempool.h to .cpp file (Sebastian Falbesoner)
9947ce62626c05bd186ae8a4864aa382f673ec1a refactor: use const reference for parents in `CTxMemPool::UpdateAncestorsOf` (Sebastian Falbesoner)
Pull request description:
These helpers are exclusively used in txmempool.cpp, hence they should also be moved there. The PR also contains a commit which fixes const-correctness for parents in `CTxMemPool::UpdateAncestorsOf` and declares them as reference to avoid a copy.
ACKs for top commit:
promag:
Code review ACK 65aaf9495d19ea3fb875228a7e14aab6c1f2986d. Verified move-only commit locally.
Tree-SHA512: 7ce29f3ba0e68b5355001f27725b00f6d54cc993015356eb40b61b8cdd17db49b980f4c3d798c8e0c940d245dc3a72c474bb9ff3c0ee971ead450786076812c2
082c5bf099c64e3d27abe9b68a71ce500b693e7e [refactor] pass coinsview and height to check() (glozow)
ed6115f1eae0eb4669601106a9aaff078a2f3a74 [mempool] simplify some check() logic (glozow)
9e8d7ad5d9cc4b013826daead9cee09aad539401 [validation/mempool] use Spend/AddCoin instead of UpdateCoins (glozow)
09d18916afb0ecae90700d4befd9d5dc52767970 MOVEONLY: remove single-use helper func CheckInputsAndUpdateCoins (glozow)
e8639ec26aaf4de3fae280963434bf1cf2017b6f [mempool] remove now-unnecessary code (glozow)
54c6f3c1da01090aee9691a2c2bee0984a054ce8 [mempool] speed up check() by using coins cache and iterating in topo order (glozow)
30e240f65e69c6dffcd033afc63895345bd51f53 [bench] Benchmark CTxMemPool::check() (glozow)
cb1407196fba648aa75504e3ab3d46aa0181563a [refactor/bench] make mempool_stress bench reusable and parameterizable (glozow)
Pull request description:
Remove the txmempool <-> validation circular dependency by removing txmempool's dependency on validation. There are two functions in txmempool that need validation right now: `check()` and `removeForReorg()`. This PR removes the dependencies in `check()`.
This PR also improves the performance of `CTxMemPool::check()` by walking through the entries exactly once, in ascending ancestorcount order, which guarantees that we see parents before children.
ACKs for top commit:
jnewbery:
reACK 082c5bf099c64e3d27abe9b68a71ce500b693e7e
GeneFerneau:
tACK [082c5bf](082c5bf099)
rajarshimaitra:
tACK 082c5bf099
theStack:
Code-review ACK 082c5bf099c64e3d27abe9b68a71ce500b693e7e
Tree-SHA512: 40ac622af1627b5c3e6abb4f0f035d833265a8c5e8dc88faf5354875dfb5137f137825e54bbd2a2668ed37b145c5d02285f776402629f58596e51853a9a79d29
No transaction in the mempool should ever be a coinbase.
Since mempoolDuplicate's backend is the chainstate coins view, it should
always contain the coins available.
UpdateCoins is an unnecessary dependency on validation. All we need to
do is add and remove coins to check inputs. We don't need the extra
logic for checking coinbases and handling TxUndos.
Also remove the wrapper function in validation.h which constructs a
throwaway TxUndo object before calling UpdateCoins because it is now
unused.
Remove variables used for keeping track of mempool transactions for
which we haven't processed the parents yet. Since we're iterating in
topological order now, they're always unused.
No behavior changes.
Before, we're always adding transactions to the "check later" queue if
they have any parents in the mempool. But there's no reason to do this
if all of its inputs are already available from mempoolDuplicate.
Instead, check for inputs, and only mark fDependsWait=true if the
parents haven't been processed yet.
Reduce the amount of "check later" transactions by looking at
ancestors before descendants. Do this by iterating through them in
ascending order by ancestor count. This works because a child will
always have more in-mempool ancestors than its parent.
We should never have any entries in the "check later" queue
after this commit.
This removes a bunch of boilerplate, makes the code easier to read.
Also, C++11 member initialization avoids accidental uninitialized
members.
Can be reviewed with the git option "--word-diff-regex=." or with "git
difftool --tool=meld".
When calculating ancestor/descendant counts for transactions in the
package, as a heuristic, count every transaction in the package as an
ancestor and descendant of every other transaction in the package.
This may overestimate, but will not underestimate, the
ancestor/descendant counts. This shortcut still produces an accurate
count for packages of 1 parent + 1 child.
This does not change existing behavior.
The ancestor/descendant limits are inclusive of the entries themselves,
but CalculateAncestorsAndCheckLimits() does not need access to them.
Allow CheckSequenceLocks to use heights and coins from any CoinsView and
CBlockIndex provided. This means that CheckSequenceLocks() doesn't need
to hold the mempool lock or cs_main. The caller is responsible for
ensuring the CoinsView and CBlockIndex are consistent before passing
them in. The typical usage is still to create a CCoinsViewMemPool from
the mempool and grab the CBlockIndex from the chainstate tip.
In #21415 we decided to return `std::optional` rather than `{}` for
uninitialized values. This PR repalces the two remaining usages of `{}`
with `std::nullopt`.
As a side-effect, this also quells the spurious GCC 10.2.x warning that
we've had reported quite a few times. i.e #21318, #21248, #20797.
```bash
txmempool.cpp: In member function ‘CTxMemPool::setEntries CTxMemPool::GetIterSet(const std::set<uint256>&) const’:
txmempool.cpp:898:13: warning: ‘<anonymous>’ may be used uninitialized in this function [-Wmaybe-uninitialized]
898 | return {};
| ^
```
e11b6496506246882df450586acf735dabedf731 validation: CVerifyDB::VerifyDB: Use locking annotation (Carl Dong)
03f75c42e12a272057adccb6f0077e71f971eeef validation: Use existing chain member in CChainState::LoadGenesisBlock (Carl Dong)
5e4af773809415b71a10e3120cc44854d61c4c19 validation: Use existing chain member in CChainState::AcceptBlock (Carl Dong)
fee73347c0015e4c24595c9708831d76cd6eec8c validation: Pass in chain to FindBlockPos+SaveBlockToDisk (Carl Dong)
a9d28bcd8d8f71d089322b1d631390352e31ee2b validation: Use *this in CChainState::ActivateBestChainStep (Carl Dong)
4744efc9bae8b22efb76152a3c045d054c880399 validation: Pass in chainstate to CTxMemPool::check (Carl Dong)
1fb7b2c59505a6b9768789f6caad215a0a22ef16 validation: Use *this in CChainState::InvalidateBlock (Carl Dong)
8cdb2f7e58dfd9a631a8cbb8f0ee7e8c0c304eb4 validation: Move LoadBlockIndexDB to CChainState (Carl Dong)
8b99efbcc08ab41caf657c6d730a27e6a91bc356 validation: Move invalid block handling to CChainState (Carl Dong)
2bdf37fe187ba6f090a0f5299b74d5d82cde4697 validation: Pass in chainstate to CVerifyDB::VerifyDB (Carl Dong)
31eac50c721dd3b0bd2347e76196bf16913e9be9 validation: Remove global ::VersionBitsTip{State,SinceHeight,Statistics} (Carl Dong)
63e4c7316a537900f525e221d8042587b443cc3d validation: Pass in chainstate to ::PruneBlockFilesManual (Carl Dong)
4bada76237d734c1de38d3bd58689caeefd5e8cb validation: Pass in chainstate to UpdateTip (Carl Dong)
a3ba08ba7dec0b016e42233cd4a061ba1a0e86c1 validation: Remove global ::{{Precious,Invalidate}Block,ResetBlockFailureFlags} (Carl Dong)
4927c9e6991b09a36a41aab93a0e05332d899611 validation: Remove global ::LoadGenesisBlock (Carl Dong)
9da106be4db692fa5db7b4de79f9cf7bfef37075 validation: Check chain tip is non-null in CheckFinalTx (Carl Dong)
Pull request description:
Overall PR: #20158 (tree-wide: De-globalize ChainstateManager)
Based on:
- [x] #20750 | [Bundle 2/n] Prune g_chainman usage in mempool-related validation functions
Note to reviewers:
1. This bundle may _apparently_ introduce usage of `g_chainman` or `::Chain(state|)Active()` globals, but these are resolved later on in the overall PR. [Commits of overall PR](https://github.com/bitcoin/bitcoin/pull/20158/commits)
2. There may be seemingly obvious local references to `ChainstateManager` or other validation objects which are not being used in callers of the current function in question, this is done intentionally to **_keep each commit centered around one function/method_** to ease review and to make the overall change systematic. We don't assume anything about our callers. Rest assured that once we are considering that particular caller in later commits, we will use the obvious local references. [Commits of overall PR](https://github.com/bitcoin/bitcoin/pull/20158/commits)
3. When changing a function/method that has many callers (e.g. `LookupBlockIndex` with 55 callers), it is sometimes easier (and less error-prone) to use a scripted-diff. When doing so, there will be 3 commits in sequence so that every commit compiles like so:
1. Add `new_function`, make `old_function` a wrapper of `new_function`, divert all calls to `old_function` to `new_function` **in the local module only**
2. Scripted-diff to divert all calls to `old_function` to `new_function` **in the rest of the codebase**
3. Remove `old_function`
Note to self:
- [x] Address: https://github.com/bitcoin/bitcoin/pull/20750#discussion_r579400663
ACKs for top commit:
laanwj:
Code review ACK e11b6496506246882df450586acf735dabedf731
Tree-SHA512: 205a451a741e32f17d5966de289f2f5a3f0817738c0087b70ff4755ddd217b53d01050ed396669bda2b1d216a88d927b9778777f9ff95ab1fe20e59c5f341776
fd6580e405699ccb051fd2a34525e48d3253673d [refactor] txmempool: split epoch logic into class (Anthony Towns)
Pull request description:
Splits the epoch logic introduced in #17925 into a separate class.
Uses clang's thread safety annotations and encapsulates the data more strongly to reduce chances of bugs from API misuse.
ACKs for top commit:
jonatack:
ACK fd6580e405699ccb051fd2a34525e48d3253673d using clang thread safety annotations looks like a very good idea, and the encapsulation this change adds should improve robustness (and possible unit test-ability) of the code. Verified that changing some of the locking duly provoked build-time warnings with Clang 9 on Debian and that small changes in the new `Epoch` class were covered by failing functional test assertions in `mempool_updatefromblock.py`, `mempool_resurrect.py`, and `mempool_reorg.py`
hebasto:
re-ACK fd6580e405699ccb051fd2a34525e48d3253673d, since my [previous](https://github.com/bitcoin/bitcoin/pull/18017#pullrequestreview-569619362) review:
Tree-SHA512: 7004623faa02b56639aa05ab7a078320a6d8d54ec62d8022876221e33f350f47df51ddff056c0de5be798f8eb39b5c03c2d3f035698555d70abc218e950f2f8c
Several other parameters are now redundant since they can be safely
obtained from the chainstate given that ::cs_main is locked. These are
now removed.