6f113cb1847c6890f1fbd052ff7eb8ea41ccafc5 txgraph: use fallback order to sort chunks (feature) (Pieter Wuille)
0a3351947e736c646a6dfffef24b83d003c569e7 txgraph: use fallback order when linearizing (feature) (Pieter Wuille)
fba004a3df02d8d5d47f1ad0bb1ccbfde01bb2af txgraph: pass fallback_order to TxGraph (preparation) (Pieter Wuille)
941c432a4637efd4e5040259f47f2bfed073af7c txgraph test: subclass TxGraph::Ref like mempool does (preparation) (Pieter Wuille)
39d0052cbf478a729ae0288262003bba9c12690b clusterlin: make optimal linearizations deterministic (feature) (Pieter Wuille)
8bfbba32077cb8682208ef31748a10562be027db txgraph: sort distinct-cluster chunks by equal-feerate-prefix size (feature) (Pieter Wuille)
e0bc73ba9270b860d81e479a7bddcff8cfd8bfb6 clusterlin: sort tx in chunk by feerate and size (feature) (Pieter Wuille)
6c1bcb2c7c1a0017562e99195d74c3a05444633b txgraph: clear cluster's chunk index in ~Ref (preparation) (Pieter Wuille)
7427c7d0983050543f1fc7863121d8e2bf4b1511 txgraph: update chunk index on Compact (preparation) (Pieter Wuille)
3ddafceb9afd9d493b927bc91dae324225ed8e32 txgraph: initialize Ref in AddTransaction (preparation) (Pieter Wuille)

Pull request description:

Part of #30289.

TxGraph's fundamental responsibility is deciding the order of transactions in the mempool. It relies on the `cluster_linearize.h` code to optimize it, but there can and often will be many different orderings that are essentially equivalent from a quality perspective, so we have to pick one.

At a high level, the solution will involve one or more of:
* Deciding based on **internal identifiers** (`Cluster::m_sequence`, `DepGraphIndex`). This is very simple, but risks leaking information about transaction receive order.
* Deciding **randomly**, which is private, but may interfere with relay expectations, block propagation, and ability to monitor network behavior.
* Deciding **based on txid**, which is private and deterministic, but risks incentivizing grinding to get an edge (though we haven't really seen such behavior).
* Deciding **based on size** (e.g. prefer smaller transactions), which is somewhat related to quality, but not unconditionally (depending on mempool layout, the ideal ordering might call for smaller transactions first, last, or anywhere in between). It's also not a strong ordering as there can be many identically-sized transactions. However, if it were to encourage grinding behavior, incentivizing smaller transactions is probably not a bad thing.

As of #32545, the current behavior is primarily picking randomly, though inconsistently, as some code paths also use internal identifiers and size. #33335 sought to change it to use random (preferring size in a few places), with the downsides listed above.

This PR is an alternative to that, which changes the order to tie-break based on size everywhere possible, and use lowest-txid-first as final fallback. This is fully deterministic: for any given set of mempool transactions, if all linearized optimally, the transaction order exposed by TxGraph is deterministic.

The transactions within a chunk are sorted according to:
1. `PostLinearize` (which improves sub-chunk order), using an initial linearization created using the rules 2-5 below.
2. Topology (parents before children).
3. Individual transaction feerate (high to low)
4. Individual transaction weight (small to large)
5. Txid (low to high txid)

The chunks within a cluster are sorted according to:
1. Topology (chunks after their dependencies)
2. Chunk feerate (high to low)
3. Chunk weight (small to large)
4. Max-txid (chunk with lowest maximum-txid first)

The chunks across clusters are sorted according to:
1. Feerate (high to low)
2. Equal-feerate-chunk-prefix weight (small to large)
3. Max-txid (chunk with lowest maximum-txid first)

The equal-feerate-chunk-prefix weight of a chunk C is defined as the sum of the weights of all chunks in the same cluster as C, with the same feerate as C, up to and including C itself, in linearization order (but excluding such chunks that appear after C). This is a well-defined approximation of sorting chunks from small to large across clusters, while remaining consistent with intra-cluster linearization order.

ACKs for top commit:
ajtowns: reACK 6f113cb1847c6890f1fbd052ff7eb8ea41ccafc5 it was good before and now it's better
instagibbs: ACK 6f113cb1847c6890f1fbd052ff7eb8ea41ccafc5
marcofleon: light crACK 6f113cb1847c6890f1fbd052ff7eb8ea41ccafc5

Tree-SHA512: 16dc43c62b7e83c81db1ee14c01e068ae2f06c1ffaa0898837d87271fa7179dd98baeb74abc9fe79220e01fdba6876defe60022c2b72badc21d770644a0fe0ac
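The cross-cluster chunk order and the equal-feerate-chunk-prefix weight described above can be illustrated with a small standalone sketch. This is not the TxGraph implementation: `Chunk`, `ComputePrefixWeights`, and `ChunkBefore` are hypothetical names, txids are assumed to be equal-length lowercase hex strings compared lexicographically, and fees and weights are assumed small enough that their products fit in 64 bits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical, simplified chunk record; not a TxGraph type.
struct Chunk {
    int cluster;            // identifier of the cluster the chunk belongs to
    int64_t fee;            // total fee of the chunk
    int64_t weight;         // total weight of the chunk (must be positive)
    std::string max_txid;   // largest txid in the chunk (assumed hex string)
    int64_t prefix_weight;  // filled in by ComputePrefixWeights
};

// Feerates are compared by cross-multiplication to avoid division.
bool SameFeerate(const Chunk& a, const Chunk& b) { return a.fee * b.weight == b.fee * a.weight; }
bool HigherFeerate(const Chunk& a, const Chunk& b) { return a.fee * b.weight > b.fee * a.weight; }

// `linearization` holds the chunks of ONE cluster in linearization order.
// For each chunk C, sum the weights of all chunks at or before C that have
// the same feerate as C (chunks after C are excluded).
void ComputePrefixWeights(std::vector<Chunk>& linearization)
{
    for (std::size_t i = 0; i < linearization.size(); ++i) {
        assert(linearization[i].weight > 0);
        int64_t sum = 0;
        for (std::size_t j = 0; j <= i; ++j) {
            if (SameFeerate(linearization[j], linearization[i])) sum += linearization[j].weight;
        }
        linearization[i].prefix_weight = sum;
    }
}

// Cross-cluster order: feerate (high to low), then prefix weight (small to
// large), then max-txid (lowest first).
bool ChunkBefore(const Chunk& a, const Chunk& b)
{
    if (HigherFeerate(a, b)) return true;
    if (HigherFeerate(b, a)) return false;
    return std::tie(a.prefix_weight, a.max_txid) < std::tie(b.prefix_weight, b.max_txid);
}
```

Because weights are positive, two equal-feerate chunks from the same cluster always differ in prefix weight, with the earlier one smaller. That is why sorting with `ChunkBefore` cannot reorder them relative to the cluster's own linearization.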
Unit tests
The sources in this directory are unit test cases. Boost includes a unit testing framework, and since Bitcoin Core already uses Boost, it makes sense to simply use this framework rather than require developers to configure some other framework (we want as few impediments to creating unit tests as possible).
The build system is set up to compile an executable called test_bitcoin
that runs all of the unit tests. The main source file for the test library is found in
util/setup_common.cpp.
The examples in this document assume the build directory is named
build. You'll need to adapt them if you named it differently.
Compiling/running unit tests
Unit tests will be automatically compiled if dependencies were met during the generation of the Bitcoin Core build system and tests weren't explicitly disabled.
The unit tests can be run with ctest --test-dir build, which includes unit
tests from subtrees.
Run build/bin/test_bitcoin --list_content for the full list of tests.
To run the unit tests manually, launch build/bin/test_bitcoin. To recompile
after a test file was modified, run cmake --build build and then run the test again. If you
modify a non-test file, use cmake --build build --target test_bitcoin to recompile only what's needed
to run the unit tests.
To add more unit tests, add BOOST_AUTO_TEST_CASE functions to the existing
.cpp files in the test/ directory or add new .cpp files that
implement new BOOST_AUTO_TEST_SUITE sections.
To run the GUI unit tests manually, launch build/bin/test_bitcoin-qt.
To add more GUI unit tests, add them to the src/qt/test/ directory and
the src/qt/test/test_main.cpp file.
Running individual tests
The test_bitcoin runner accepts command line arguments from the Boost
framework. To see the list of arguments that may be passed, run:
build/bin/test_bitcoin --help
For example, to run only the tests in the getarg_tests file, with full logging:
build/bin/test_bitcoin --log_level=all --run_test=getarg_tests
or
build/bin/test_bitcoin -l all -t getarg_tests
or, to run only the doubledash test in getarg_tests:
build/bin/test_bitcoin --run_test=getarg_tests/doubledash
The --log_level= (or -l) argument controls the verbosity of the test output.
The test_bitcoin runner also accepts some of the command line arguments accepted by
bitcoind. Use -- to separate these sets of arguments:
build/bin/test_bitcoin --log_level=all --run_test=getarg_tests -- -printtoconsole=1
The -printtoconsole=1 after the two dashes sends debug logging, which
normally goes only to debug.log within the data directory, to the
standard terminal output as well.
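As an illustration of the `--` convention only, the following sketch splits an argument list at the first bare `--`. It is not how test_bitcoin or the Boost framework parse their command line; `SplitAtDoubleDash` and `Args` are hypothetical names introduced for this example.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

using Args = std::vector<std::string>;

// Split `args` at the first bare "--". The first element of the result holds
// the arguments before the separator (test runner options), the second holds
// the arguments after it (bitcoind-style options). The separator is dropped.
std::pair<Args, Args> SplitAtDoubleDash(const Args& args)
{
    Args runner, node;
    bool seen = false;
    for (const std::string& arg : args) {
        if (!seen && arg == "--") {
            seen = true;
            continue;
        }
        (seen ? node : runner).push_back(arg);
    }
    // Sanity check: every argument except the separator was kept.
    assert(runner.size() + node.size() + (seen ? 1 : 0) == args.size());
    return {std::move(runner), std::move(node)};
}
```

Applied to the command above, the first group would hold --log_level=all and --run_test=getarg_tests, and the second group would hold -printtoconsole=1.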
Running test_bitcoin creates a temporary working (data) directory with a randomly
generated pathname within test_common bitcoin/, which in turn is within
the system's temporary directory (see
std::filesystem::temp_directory_path).
This data directory looks like a simplified form of the standard bitcoind data
directory. Its content will vary depending on the test, but it will always
have a debug.log file, for example.
The location of the temporary data directory can be specified with the
-testdatadir option. This can make debugging easier. The directory
path used is the argument path appended with
/test_common bitcoin/<test-name>/datadir.
The directory path is created if necessary.
Specifying this argument also causes the data directory
not to be removed after the last test. This is useful for looking at
what the test wrote to debug.log after it completes, for example.
(The directory is removed at the start of the next test run,
so no leftover state is used.)
$ build/bin/test_bitcoin --run_test=getarg_tests/doubledash -- -testdatadir=/somewhere/mydatadir
Test directory (will not be deleted): "/somewhere/mydatadir/test_common bitcoin/getarg_tests/doubledash/datadir"
Running 1 test case...
*** No errors detected
$ ls -l '/somewhere/mydatadir/test_common bitcoin/getarg_tests/doubledash/datadir'
total 8
drwxrwxr-x 2 admin admin 4096 Nov 27 22:45 blocks
-rw-rw-r-- 1 admin admin 1003 Nov 27 22:45 debug.log
If you run an entire test suite, such as --run_test=getarg_tests, or all the test suites
(by not specifying --run_test), a separate directory
will be created for each individual test.
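The directory layout described above can be sketched with std::filesystem. This is only an illustration of the documented path rule, not the code in util/setup_common.cpp; `TestDataDir` is a hypothetical helper name.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper mirroring the documented layout:
//   <testdatadir>/test_common bitcoin/<test-name>/datadir
// `test_name` is the suite/case path, e.g. "getarg_tests/doubledash".
fs::path TestDataDir(const fs::path& testdatadir, const std::string& test_name)
{
    assert(!test_name.empty());
    return testdatadir / "test_common bitcoin" / test_name / "datadir";
}
```

With /somewhere/mydatadir and getarg_tests/doubledash as inputs, this yields the same path as the "Test directory" line in the example output above. Each individual test gets its own path because the test name is part of it.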
Adding test cases
To add a new unit test file to our test suite, you need
to add the file to either src/test/CMakeLists.txt or
src/wallet/test/CMakeLists.txt for wallet-related tests. The pattern is to create
one test file for each class or source file for which you want to create
unit tests. The file naming convention is <source_filename>_tests.cpp
and such files should wrap their tests in a test suite
called <source_filename>_tests. For an example of this pattern,
see uint256_tests.cpp.
Logging and debugging in unit tests
ctest --test-dir build will write to the log file build/Testing/Temporary/LastTest.log. You can
additionally use the --output-on-failure option to display the logs of failed tests
automatically. For running individual tests verbosely, refer to the
"Running individual tests" section above.
To write to logs from unit tests you need to use specific message methods
provided by Boost. The simplest is BOOST_TEST_MESSAGE.
For debugging you can launch the test_bitcoin executable with gdb or lldb and
start debugging, just like you would with any other program:
gdb build/bin/test_bitcoin
Segmentation faults
If you hit a segmentation fault during a test run, you can diagnose where the fault
is happening by running gdb ./build/bin/test_bitcoin, starting the tests with the run
command, and then using the bt command within gdb once the fault occurs.
Another tool that can be used to resolve segmentation faults is valgrind.
If for whatever reason you want to produce a core dump file for this fault, you can do
that as well. By default, the Boost test runner will intercept system errors and not
produce a core file. To bypass this, add --catch_system_errors=no to the
test_bitcoin arguments and ensure that your ulimits are set properly (e.g. ulimit -c unlimited).
Running the tests and hitting a segmentation fault should now produce a file called core
(on Linux platforms, the file name will likely depend on the contents of
/proc/sys/kernel/core_pattern).
You can then explore the core dump using
gdb build/bin/test_bitcoin core
(gdb) bt # produce a backtrace for where a segfault occurred