82 Commits

Author SHA1 Message Date
Pieter Wuille
6f113cb184 txgraph: use fallback order to sort chunks (feature)
This makes TxGraph also use the fallback order to decide the order of
chunks from distinct clusters.

The order of chunks across clusters becomes:
1. Feerate (high to low)
2. Equal-feerate-chunk-prefix (small to large)
3. Max-txid (chunk with lowest maximum-txid first)

This makes the full TxGraph ordering fully deterministic as long as all
clusters in it are optimally linearized.
2026-02-09 15:55:58 -05:00
Pieter Wuille
0a3351947e txgraph: use fallback order when linearizing (feature)
Add glue to make TxGraph use the fallback order provided to it, in the
fallback comparator it provides to the cluster linearization code.

The order of chunks within a cluster becomes:
1. Topology (chunks after their dependencies)
2. Feerate (high to low)
3. Weight (small to large)
4. Max-txid (chunk with lowest maximum-txid first)

The order of transactions within a chunk becomes:
1. Topology (parents before children)
2. Individual transaction feerate (high to low)
3. Weight (small to large)
4. Txid (low to high txid)

This makes optimal cluster linearization, both the order of chunks
within a cluster, and the order of transactions within those chunks,
completely deterministic.
2026-02-09 15:55:58 -05:00
Pieter Wuille
fba004a3df txgraph: pass fallback_order to TxGraph (preparation)
This adds an std::function<strong_ordering(Ref&,Ref&)> argument to the
MakeTxGraph function, which can be used by the caller (e.g., mempool
code) to provide a fallback order to TxGraph.

This is just preparation; TxGraph does not yet use this fallback order
for anything.
2026-02-09 15:55:58 -05:00
Pieter Wuille
39d0052cbf clusterlin: make optimal linearizations deterministic (feature)
This allows passing in a fallback order comparator to Linearize(), which
is used as final tiebreak when deciding the order of chunks and
transactions within a chunk, rather than a random tiebreak.

The order of transactions within a chunk becomes:
1. Topology (parents before children)
2. Individual transaction feerate (high to low)
3. Weight (small to large)
4. Fallback (low to high fallback order)

The order of chunks within a cluster becomes:
1. Topology (chunks after their dependencies)
2. Feerate (high to low)
3. Weight (small to large)
4. Max-fallback (chunk with lowest maximum-fallback-tx first)

For now, txgraph passes a naive comparator to Linearize(), which makes
the cluster order deterministic when treating the input transactions as
identified by the DepGraphIndex. However, since DepGraphIndexes are the
result of possibly-randomized operations inside txgraph, this doesn't
actually make txgraph's per-cluster ordering deterministic. That will be
changed in a later commit, by using a txid-based fallback instead.
2026-02-09 15:55:58 -05:00
Pieter Wuille
8bfbba3207 txgraph: sort distinct-cluster chunks by equal-feerate-prefix size (feature)
This makes TxGraph track the equal-feerate-prefix size of all chunks in
all clusters in the main graph, and uses it to sort chunks coming from
distinct clusters.

The order of chunks across clusters becomes:
1. Feerate (high to low)
2. Equal-feerate-prefix (small to large)
3. Cluster sequence number (old to new); this will be changed later.

The equal-feerate-prefix size of a chunk C is defined as the sum
of the weights of all chunks in the same cluster as C, with the same
feerate as C, up to and including C itself, in linearization order (but
excluding such chunks that appear after C).

This is an approximation of sorting chunks from small to large across
clusters, while remaining consistent with intra-cluster linearization
order.
2026-02-09 15:55:58 -05:00
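The definition of the equal-feerate-prefix size can be made concrete with a small worked sketch. `ToyChunk` and `EqualFeeratePrefixSizes` are hypothetical names, the input is assumed to be one cluster's chunks in linearization order, and the quadratic loop is chosen for clarity rather than speed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical chunk summary: total fee and total weight.
struct ToyChunk {
    int64_t fee;
    int32_t weight; // must be positive
};

// For each chunk C of a single cluster (given in linearization order), returns
// the summed weight of all chunks up to and including C that have C's feerate.
std::vector<int64_t> EqualFeeratePrefixSizes(const std::vector<ToyChunk>& chunks)
{
    std::vector<int64_t> out;
    out.reserve(chunks.size());
    for (std::size_t i = 0; i < chunks.size(); ++i) {
        assert(chunks[i].weight > 0);
        int64_t sum = 0;
        for (std::size_t j = 0; j <= i; ++j) {
            // Equal feerate test by cross-multiplication (assumes no 64-bit overflow).
            if (chunks[j].fee * chunks[i].weight == chunks[i].fee * chunks[j].weight) {
                sum += chunks[j].weight;
            }
        }
        out.push_back(sum);
    }
    return out;
}
```

For chunks with (fee, weight) of (10, 2), (5, 1), (20, 4), (3, 1), (6, 2), the first three share feerate 5 and the last two share feerate 3, giving prefix sizes 2, 3, 7, 1, 3. Within one cluster the values grow along each equal-feerate run, which is why sorting on them stays consistent with the intra-cluster order.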
Pieter Wuille
6c1bcb2c7c txgraph: clear cluster's chunk index in ~Ref (preparation)
Whenever a TxGraph::Ref is destroyed, if it by then still appears inside
main-level clusters, wipe the chunk index entries for those clusters, to
prevent having lingering indexes for transactions without Ref.

This is preparation for enabling a callback being passed to MakeTxGraph
to define a fallback order on objects. Once the Ref for a transaction is
gone, it is not possible to invoke the callback anymore. To prevent the
index becoming inconsistent, we need to immediately get rid of the index
entries when the Ref disappears.

This is not a problem, because such destructions necessarily will
trigger a relinearization of the cluster (assuming there are
transactions in it left) before becoming acceptable again, and the chunk
ordering is not observable (through CompareMainOrder, or through the
BlockBuilder interface) until that point. However, the index itself
needs to remain consistent in the meantime, even if not meaningful.
2026-02-09 15:55:58 -05:00
Pieter Wuille
7427c7d098 txgraph: update chunk index on Compact (preparation)
This makes TxGraphImpl::Compact() invoke Cluster::Updated() on all
affected clusters, in case they have internal GraphIndex values stored
that may have become outdated with the renumbering of GraphIndex values
that Compact() caused.

No such GraphIndex values are currently stored, but this will change in
a future commit.
2026-02-09 15:55:58 -05:00
Pieter Wuille
3ddafceb9a txgraph: initialize Ref in AddTransaction (preparation)
Instead of returning a TxGraph::Ref from TxGraph::AddTransaction(),
pass in a TxGraph::Ref& which is updated to refer to the new transaction
in that graph.

This cleans up the usage somewhat, avoiding the need for dummy Refs in
CTxMemPoolEntry constructor calls, but the motivation is that a future
commit will allow a callback to be passed to MakeTxGraph to define a
fallback order on the transaction objects. This does not work when a
Ref is created separately from the CTxMemPoolEntry it ends up living in,
as passing the newly-created Ref to the callback would be UB before it's
emplaced in its final CTxMemPoolEntry.
2026-02-09 15:55:55 -05:00
Pieter Wuille
da56ef239b clusterlin: minimize chunks (feature)
After the normal optimization process finishes, and finds an optimal
spanning forest, run a second process (while computation budget remains)
to split chunks into minimal equal-feerate chunks.
2026-01-12 17:38:30 -05:00
Pieter Wuille
34a77138b7 txgraph: permit non-topological clusters to defer fixing (optimization) 2026-01-05 11:48:30 -05:00
Pieter Wuille
3380e0cbb5 txgraph: use PostLinearize less prior to linearizing
With the new SFL algorithm, the process of loading an existing linearization into the
SFL state is very similar to what PostLinearize does. This means there is little benefit
to performing an explicit PostLinearize step before linearizing inside txgraph. Instead,
it seems better to use our allotted CPU time to perform more SFL optimization steps.
2026-01-05 11:48:16 -05:00
Pieter Wuille
62dd88624a txgraph: drop NEEDS_SPLIT_ACCEPTABLE (simplification)
With the SFL algorithm, we will practically be capable of keeping
most if not all clusters optimal. With that, it seems less valuable
to avoid doing work after splitting an acceptable cluster, because by
doing some work we may get it to OPTIMAL.

This reduces the complexity of the code a bit as well.
2026-01-05 11:48:16 -05:00
bensig
08ed802bab doc: fix double-word typos in comments 2025-12-30 12:12:26 -08:00
Pieter Wuille
75bdb925f4 clusterlin: drop support for improvable chunking (simplification)
With MergeLinearizations() gone and the LIMO-based Linearize() replaced by SFL, we do not
need a class (LinearizationChunking) that can maintain an incrementally-improving chunk
set anymore.

Replace it with a function (ChunkLinearizationInfo) that just computes the chunks as
SetInfos once, and returns them as a vector. This simplifies several call sites too.
2025-12-18 16:01:31 -05:00
Pieter Wuille
5ce2800745 clusterlin: randomize equal-feerate parts of linearization (privacy)
This places equal-feerate chunks (with no dependencies between them) in random
order in the linearization output, hiding information about DepGraph insertion
order from the output. Likewise, it randomizes the order of transactions within
chunks for the same reason.
2025-12-18 16:01:31 -05:00
Pieter Wuille
3efc94d656 clusterlin: replace cluster linearization with SFL (feature)
This replaces the existing LIMO linearization algorithm (which internally uses
ancestor set finding and candidate set finding) with the much more performant
spanning-forest linearization algorithm.

This removes the old candidate-set search algorithm, and several of its tests,
benchmarks, and needed utility code.

The worst-case time per cost is similar to the previous algorithm, so
ACCEPTABLE_ITERS is unchanged.
2025-12-18 16:01:31 -05:00
Lőrinc
039307554e refactor: unify container presence checks - trivial counts
The changes made here were:

| From              | To               |
|-------------------|------------------|
| `m.count(k)`      | `m.contains(k)`  |
| `!m.count(k)`     | `!m.contains(k)` |
| `m.count(k) == 0` | `!m.contains(k)` |
| `m.count(k) != 0` | `m.contains(k)`  |
| `m.count(k) > 0`  | `m.contains(k)`  |

The commit contains the trivial, mechanical refactors where it doesn't matter whether the container can hold multiple elements.

Co-authored-by: Jan B <608446+janb84@users.noreply.github.com>
2025-12-03 13:36:58 +01:00
Anthony Towns
ade0397f59 txgraph: drop move assignment operator 2025-11-25 07:36:50 -05:00
Lőrinc
2d23820ee1 refactor: remove dead branches in SingletonClusterImpl
`SplitAll()` always calls `ApplyRemovals()` first. For a singleton, that empties the cluster, so any `SingletonClusterImpl` passed to `Split()` must be empty.

`TxGraphImpl::ApplyDependencies()` first merges each dependency group and asserts the group has at least one dependency.
Since `parent` != `child`, `TxGraphImpl::Merge()` upgrades the merge target to `GenericClusterImpl`, so `ApplyDependencies()` is never dispatched to `SingletonClusterImpl`.

Found during review: https://github.com/bitcoin/bitcoin/pull/33157#discussion_r2423058928
Coverage evidence:
* https://maflcko.github.io/b-c-cov/fuzz.coverage/src/txgraph.cpp.gcov.html#L1446
* https://storage.googleapis.com/oss-fuzz-coverage/bitcoin-core/reports/20251103/linux/src/bitcoin-core/src/txgraph.cpp.html#L1446
2025-11-03 12:19:05 +01:00
Greg Sanders
9b43428c96 TxGraph: change m_excluded_clusters
Change BlockBuilderImpl's m_excluded_clusters to an unordered
set since ordering is not used.

Change the set to a set of sequence numbers for a modest
stability increase under fuzz testing.
2025-10-14 12:44:57 -04:00
Pieter Wuille
023cd5a546 txgraph: add SingletonClusterImpl (mem optimization)
This adds a specialized Cluster implementation for singleton clusters, saving
a significant amount of memory by avoiding the need for m_depgraph, m_mapping,
and m_linearization, and their overheads.
2025-10-11 17:46:43 -04:00
Pieter Wuille
e346250732 txgraph: give Clusters a range of intended tx counts (preparation) 2025-10-11 17:32:35 -04:00
Pieter Wuille
e93b0f09cc txgraph: abstract out creation of empty Clusters (refactor) 2025-10-11 17:32:35 -04:00
Pieter Wuille
6baf12621f txgraph: comment fixes (doc fix) 2025-10-11 17:32:35 -04:00
Pieter Wuille
726b995739 txgraph: make Cluster an abstract class (refactor) 2025-10-11 17:32:32 -04:00
Pieter Wuille
2602d89edd txgraph: avoid accessing other Cluster internals (refactor)
This adds 4 functions to Cluster to help implement Merge() and Split() without
needing access to the internals of the other Cluster. This is a preparation for
a follow-up that will make Clusters a virtual class whose internals are abstracted
away.
2025-10-11 17:26:39 -04:00
Pieter Wuille
04c808ac4c txgraph: expose memory usage estimate function (feature) 2025-10-11 17:25:09 -04:00
Pieter Wuille
7680bb8fd4 txgraph: keep track of Cluster memory usage (preparation) 2025-10-11 17:25:09 -04:00
Pieter Wuille
4ba562e5f4 txgraph: keep data structures compact (mem optimization) 2025-10-11 17:25:09 -04:00
Pieter Wuille
b1637a90de txgraph: avoid holes in DepGraph positions (mem optimization) 2025-10-11 17:25:05 -04:00
Pieter Wuille
2b1d302508 txgraph: move some sanity checks from Cluster to TxGraphImpl (refactor) 2025-10-11 17:16:05 -04:00
Pieter Wuille
d40302fbaf txgraph: Make level of Cluster implicit (optimization)
This reduces per-Cluster memory usage by making Clusters not aware of their
own level. Instead, track it either in calling code, or infer it based on
the transactions in them.
2025-10-11 17:13:50 -04:00
Pieter Wuille
d45f3717d2 txgraph: use enum Level instead of bool main_only 2025-09-10 08:03:17 -04:00
Pieter Wuille
f3c2fc867f txgraph: add work limit to DoWork(), try optimal (feature)
This adds an `iters` parameter to DoWork(), which controls how much work it is
allowed to do right now.

Additionally, DoWork() won't stop at just getting everything ACCEPTABLE, but if
there is work budget left, will also attempt to get every cluster linearized
optimally.
2025-07-14 10:28:54 -04:00
Pieter Wuille
e96b00d99e txgraph: make number of acceptable iterations configurable (feature) 2025-07-14 09:42:58 -04:00
Pieter Wuille
cfe9958852 txgraph: track amount of work done in linearization (preparation) 2025-07-14 09:41:17 -04:00
Pieter Wuille
6ba316eaa0 txgraph: 1-or-2-tx split-off clusters are optimal (optimization) 2025-07-14 09:30:24 -04:00
Pieter Wuille
fad0eb091e txgraph: reset quality when merging clusters (bugfix) 2025-07-14 09:30:24 -04:00
Pieter Wuille
1632fc104b txgraph: Track multiple potential would-be clusters in Trim (improvement)
In the existing Trim function, as soon as the set of accepted transactions
would exceed the max cluster size or count limit, the acceptance loop is
stopped, removing all later transactions. However, it is possible that by
excluding some of those transactions the would-be cluster splits apart into
multiple would-be clusters. And those clusters may well permit far more
transactions before their limits are reached.

Take this into account by using a union-find structure inside TrimTxData to
keep track of the count/size of all would-be clusters that would be formed
at any point, and only reject transactions which would cause these resulting
partitions to exceed their limits.

This is not an optimization in terms of CPU usage or memory; it just
improves the quality of the transactions removed by Trim().
2025-07-02 16:01:57 -04:00
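The union-find idea can be sketched as follows. This is an illustrative reconstruction, not the TrimTxData code: `TrimTx` and `TrimSketch` are hypothetical, transactions are assumed to arrive in an order where parents precede children, and a transaction with any rejected parent is itself rejected.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical per-transaction input: its size and its parents (by earlier index).
struct TrimTx {
    int64_t size;
    std::vector<std::size_t> parents;
};

// Returns, per transaction, whether it is kept.
std::vector<bool> TrimSketch(const std::vector<TrimTx>& txs, std::size_t max_count, int64_t max_size)
{
    const std::size_t n = txs.size();
    std::vector<std::size_t> rep(n); // union-find representative links
    std::iota(rep.begin(), rep.end(), std::size_t{0});
    std::vector<std::size_t> count(n, 0); // per-root transaction count
    std::vector<int64_t> size(n, 0);      // per-root total size
    std::vector<bool> accepted(n, false);

    auto find_root = [&](std::size_t x) {
        while (rep[x] != x) {
            rep[x] = rep[rep[x]]; // path halving
            x = rep[x];
        }
        return x;
    };

    for (std::size_t i = 0; i < n; ++i) {
        bool ok = true;
        std::vector<std::size_t> roots; // distinct would-be clusters this tx would join
        std::size_t new_count = 1;
        int64_t new_size = txs[i].size;
        for (std::size_t p : txs[i].parents) {
            assert(p < i); // parents must come earlier in the order
            if (!accepted[p]) { ok = false; break; }
            const std::size_t r = find_root(p);
            if (std::find(roots.begin(), roots.end(), r) == roots.end()) {
                roots.push_back(r);
                new_count += count[r];
                new_size += size[r];
            }
        }
        // Reject only if the merged would-be cluster would exceed a limit.
        if (!ok || new_count > max_count || new_size > max_size) continue;
        accepted[i] = true;
        for (std::size_t r : roots) rep[r] = i; // merge all joined partitions under i
        count[i] = new_count;
        size[i] = new_size;
    }
    return accepted;
}
```

With a count limit of 2 and a chain 0 <- 1 <- 2 followed by an unrelated pair 3 <- 4, transaction 2 is rejected but 3 and 4 are still kept, whereas stopping the loop at the first violation would have dropped them as well.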
Pieter Wuille
a04e205ab0 txgraph: Add ability to trim oversized clusters (feature)
During reorganisations, it is possible that dependencies get added which
result in clusters that violate limits (count, size), when linking the
new from-block transactions to the old from-mempool transactions.

Unlike RBF scenarios, we cannot simply reject these policy violations
when they are due to received blocks. To accommodate this, add a Trim()
function to TxGraph, which removes transactions (including descendants)
in order to make all resulting clusters satisfy the limits.

In the initial version of the function added here, the following approach
is used:
- Lazily compute a naive linearization for the to-be-merged cluster (using
  an O(n log n) algorithm, optimized for far larger groups of transactions
  than the normal linearization code).
- Initialize a set of accepted transactions to {}
- Iterate over the transactions in this cluster one by one:
  - If adding the transaction to the set makes it exceed the max cluster size
    or count limit, stop.
  - Add the transaction to the set.
- Remove all transactions from the cluster that were not included in the set
  (note that this necessarily includes all descendants too, because they
  appear later in the naive linearization).

Co-authored-by: Greg Sanders <gsanders87@gmail.com>
2025-07-02 14:52:54 -04:00
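The acceptance loop of this initial version can be sketched in a few lines. This is a simplification under stated assumptions: the naive linearization is taken as given (only the sizes in that order are passed in), and `NaiveTrimPrefix` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Given transaction sizes in (naive) linearization order, returns how many
// leading transactions are accepted; everything after that prefix is removed.
std::size_t NaiveTrimPrefix(const std::vector<int64_t>& sizes, std::size_t max_count, int64_t max_size)
{
    std::size_t accepted = 0;
    int64_t total = 0;
    for (int64_t s : sizes) {
        assert(s >= 0);
        // Stop as soon as adding the next transaction would exceed a limit.
        if (accepted + 1 > max_count || total + s > max_size) break;
        ++accepted;
        total += s;
    }
    return accepted;
}
```

Because descendants appear after their ancestors in the linearization, cutting at a prefix never keeps a transaction whose ancestor was removed.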
Greg Sanders
eabcd0eb6f txgraph: remove unnecessary m_group_oversized (simplification) 2025-07-02 14:52:54 -04:00
Pieter Wuille
19b14e61ea txgraph: Permit transactions that exceed cluster size limit (feature)
This removes the restriction added in the previous commit that individual
transactions do not exceed the max cluster size limit.

With this change, the responsibility for enforcing cluster size limits can
be localized purely in TxGraph, without callers (and in particular, tests)
needing to duplicate the enforcement for individual transactions.
2025-07-02 14:52:54 -04:00
Pieter Wuille
c4287b9b71 txgraph: Add ability to configure maximum cluster size/weight (feature)
This is integrated with the oversized property: the graph is oversized when
any connected component within it contains more transactions than the cluster
count limit, or when their combined size/weight exceeds the cluster size
limit.

It becomes disallowed to call AddTransaction with a size larger than this limit,
though this limit will be lifted in the next commit.

In addition, SetTransactionFeeRate becomes SetTransactionFee, so that we do not
need to deal with the case that a call to this function might affect the
oversizedness.
2025-07-02 14:52:54 -04:00
fanquake
e50312eab0 doc: fix typos
Co-authored-by: Ragnar <rodiondenmark@gmail.com>
Co-authored-by: VolodymyrBg <aqdrgg19@gmail.com>
2025-06-03 08:09:28 +01:00
Pieter Wuille
8673e8f019 txgraph: Special-case singletons in chunk index (optimization) 2025-05-12 17:07:30 -04:00
Pieter Wuille
abdd9d35a3 txgraph: Skipping end of cluster has no impact (optimization) 2025-05-12 17:07:30 -04:00
Pieter Wuille
604acc2c28 txgraph: Reuse discarded chunkindex entries (optimization) 2025-05-12 17:07:30 -04:00
Pieter Wuille
c734081454 txgraph: Introduce TxGraph::GetWorstMainChunk (feature)
It returns the last chunk that would be suggested for mining by BlockBuilder
objects. This is intended for eviction.
2025-05-12 17:07:30 -04:00
Pieter Wuille
394dbe2142 txgraph: Introduce BlockBuilder interface (feature)
This interface lets one iterate efficiently over the chunks of the main
graph in a TxGraph, in the same order as CompareMainOrder. Each chunk
can be marked as "included" or "skipped" (and in the latter case,
dependent chunks will be skipped).
2025-05-12 17:07:30 -04:00
Pieter Wuille
883df3648e txgraph: Generalize GetClusterRefs to support subsections (preparation)
This is preparation for a next commit which will need a way to extract Refs
for just individual chunks from a cluster.
2025-05-12 17:07:30 -04:00