merge-script b9bf24cfe2
Merge bitcoin/bitcoin#34616: Cluster mempool: SFL cost model (take 2)
744d47fcee0d32a71154292699bfdecf954a6065 clusterlin: adopt trained cost model (feature) (Pieter Wuille)
4eefdfc5b7d0b86a523683de2a90da910b77a106 clusterlin: rescale costs (preparation) (Pieter Wuille)
ecc9a84f854e5b77dfc8876cf7c9b8d0f3de89d0 clusterlin: use 'cost' terminology instead of 'iters' (refactor) (Pieter Wuille)
9e7129df2962f7c52d07c14a56398bb285cac084 clusterlin: introduce CostModel class (preparation) (Pieter Wuille)

Pull request description:

  Part of #30289, replaces earlier #34138.

  This introduces a more accurate cost model for SFL, to control how much CPU time is spent inside the algorithm for clusters that cannot be linearized perfectly within a reasonable amount of time.

  The goal is having a metric for the amount of work performed, so that txmempool can impose limits on that work: a lower bound that is always performed (unless optimality is reached before that point, of course), and an upper bound to limit the latency and total CPU time spent on this. There are conflicting design goals here:
  * On the one hand, it seems ideal if this metric is closely correlated to actual CPU time, because otherwise the limits become inaccurate.
  * On the other hand, it seems a nightmare to have the metric be platform/system dependent, as it makes network-wide reasoning nearly impossible. It's expected that slower systems take longer to do the same thing; this holds for everything, and we don't need to compensate for this.

  There are multiple solutions to this:
  * One extreme is just measuring the time. This is very accurate, but extremely platform dependent, and also non-deterministic due to random scheduling/cache effects.
  * The other extreme is using a very abstract metric, like counting how many times certain loops/functions inside the algorithm run. That is what is implemented in master right now, which just counts the sum of the numbers of transactions updated across all `UpdateChunks()` calls. However, it necessarily fails to account for significant portions of runtime spent elsewhere, resulting in a rather wide range of "ns per cost" values.
  * This PR takes a middle ground, counting many function calls / branches / loops, with weights that were determined through benchmarking and averaged over a number of systems.
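
As a rough illustration of the middle-ground approach and of the lower and upper bounds described earlier, the sketch below charges each operation a linear cost and decides when work may stop. All names (`OpCost`, `CostMeter`, `Check`) and all weights are hypothetical; they are not the `CostModel` class or the trained coefficients from this PR. Treating the range between the two bounds as "may stop" is also an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical linear cost expression for one kind of operation:
// base + per_item * n abstract cost units. Not the PR's coefficients.
struct OpCost {
    uint64_t base;
    uint64_t per_item;
};

// Accumulates a deterministic, platform-independent cost total.
class CostMeter
{
    uint64_t m_cost{0};

public:
    // Charge one operation that touched n items.
    void Charge(const OpCost& op, uint64_t n) { m_cost += op.base + op.per_item * n; }
    uint64_t Total() const { return m_cost; }
};

enum class Decision { CONTINUE, MAY_STOP, MUST_STOP };

// Below min_cost work always continues (unless the result is already
// optimal); at or above max_cost it always stops. In between, the
// caller may choose (an assumption of this sketch).
Decision Check(uint64_t cost, bool optimal, uint64_t min_cost, uint64_t max_cost)
{
    assert(min_cost <= max_cost);
    if (optimal || cost >= max_cost) return Decision::MUST_STOP;
    if (cost < min_cost) return Decision::CONTINUE;
    return Decision::MAY_STOP;
}
```

Because the total depends only on which operations ran and on their sizes, two machines processing the same cluster with the same seed would report the same cost, even though their wall-clock times differ.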

  Specifically, the cost model was obtained by:
  * For a variety of machines:
    * Running a fixed collection of ~385000 clusters found through random generation and fuzzing, optimizing for difficulty of linearization.
      * Linearize each 1000-5000 times, with different random seeds. Sometimes without input linearization, sometimes with a bad one.
        * Gather cycle counts for each of the operations included in this cost model, broken down by their parameters.
    * Correct the data by subtracting the runtime of obtaining the cycle count.
    * Drop the 5% top and bottom samples from each cycle count dataset, and compute the average of the remaining samples.
    * For each operation, fit a least-squares linear function approximation through the samples.
  * Rescale all machine expressions to make their total time match, as we only care about relative cost of each operation.
  * Take the per-operation average of operation expressions across all machines, to construct expressions for an average machine.
  * Approximate the result with integer coefficients.
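
The per-machine statistics steps above (trimming outliers, fitting a linear function, rounding to integers) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names are hypothetical, plain rounding stands in for whatever integer approximation was actually used, and the cross-machine rescaling and averaging steps are not shown.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Mean of the samples after dropping the lowest and highest
// trim_percent percent (5 in the description above).
double TrimmedMean(std::vector<double> samples, std::size_t trim_percent)
{
    std::sort(samples.begin(), samples.end());
    const std::size_t drop = samples.size() * trim_percent / 100;
    assert(2 * drop < samples.size());
    const auto first = samples.begin() + drop;
    const auto last = samples.end() - drop;
    return std::accumulate(first, last, 0.0) / static_cast<double>(last - first);
}

struct Line {
    double intercept;
    double slope;
};

// Ordinary least-squares fit of y = intercept + slope * x, where x is an
// operation parameter and y the (trimmed) average cycle count.
Line FitLine(const std::vector<std::pair<double, double>>& points)
{
    assert(points.size() >= 2);
    const double n = static_cast<double>(points.size());
    double sx = 0, sy = 0, sxx = 0, sxy = 0;
    for (const auto& p : points) {
        sx += p.first;
        sy += p.second;
        sxx += p.first * p.first;
        sxy += p.first * p.second;
    }
    const double denom = n * sxx - sx * sx;
    assert(denom != 0.0); // requires at least two distinct x values
    const double slope = (n * sxy - sx * sy) / denom;
    return {(sy - slope * sx) / n, slope};
}

// Assumption: integer coefficients are obtained by rounding to nearest.
long long RoundCoefficient(double value) { return std::llround(value); }
```

For example, fitting the points (0, 5), (1, 8), (2, 11), (3, 14) gives an intercept of 5 and a slope of 3, i.e. a cost expression of the form 5 + 3n for an operation with parameter n.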

  The benchmarks were performed by `l0rinc <pap.lorinc@gmail.com>` and myself, on AMD Ryzen 5950X, AMD Ryzen 7995WX, AMD Ryzen 9980X, Apple M4 Max, Intel Core i5-12500H, Intel Core Ultra 7 155H, Intel N150 (Umbrel), Intel Core i7-7700, Intel Core i9-9900K, Intel Haswell (VPS, virtualized), Intel Xeon E5-2637, ARM Cortex-A76 (Raspberry Pi 5), ARM Cortex-A72 (Raspberry Pi 4).

  Based on final benchmarking, the "acceptable" iteration count (which is the minimum spent on every cluster) is set to 75000 units, which corresponds to roughly 50 μs on Ryzen 5950X and similar modern desktop hardware.

ACKs for top commit:
  instagibbs:
    ACK 744d47fcee0d32a71154292699bfdecf954a6065
  murchandamus:
    reACK 744d47fcee0d32a71154292699bfdecf954a6065

Tree-SHA512: 5cb37a6bdd930389937c435f910410c3581e53ce609b9b594a8dc89601e6fca6e6e26216e961acfe9540581f889c14bf289b6a08438a2d7adafd696fc81ff517
2026-02-25 12:11:13 +00:00

Unit tests

The sources in this directory are unit test cases. Boost includes a unit testing framework, and since Bitcoin Core already uses Boost, it makes sense to simply use this framework rather than require developers to configure some other framework (we want as few impediments to creating unit tests as possible).

The build system is set up to compile an executable called test_bitcoin that runs all of the unit tests. The main source file for the test library is found in util/setup_common.cpp.

The examples in this document assume the build directory is named build. You'll need to adapt them if you named it differently.

Compiling/running unit tests

Unit tests will be automatically compiled if dependencies were met during the generation of the Bitcoin Core build system and tests weren't explicitly disabled.

The unit tests can be run with ctest --test-dir build, which includes unit tests from subtrees.

Run build/bin/test_bitcoin --list_content for the full list of tests.

To run the unit tests manually, launch build/bin/test_bitcoin. To recompile after a test file was modified, run cmake --build build and then run the test again. If you modify a non-test file, use cmake --build build --target test_bitcoin to recompile only what's needed to run the unit tests.

To add more unit tests, add BOOST_AUTO_TEST_CASE functions to the existing .cpp files in the test/ directory or add new .cpp files that implement new BOOST_AUTO_TEST_SUITE sections.

To run the GUI unit tests manually, launch build/bin/test_bitcoin-qt

To add more GUI unit tests, add them to the src/qt/test/ directory and the src/qt/test/test_main.cpp file.

Running individual tests

The test_bitcoin runner accepts command line arguments from the Boost framework. To see the list of arguments that may be passed, run:

build/bin/test_bitcoin --help

For example, to run only the tests in the getarg_tests file, with full logging:

build/bin/test_bitcoin --log_level=all --run_test=getarg_tests

or

build/bin/test_bitcoin -l all -t getarg_tests

or to run only the doubledash test in getarg_tests:

build/bin/test_bitcoin --run_test=getarg_tests/doubledash

The --log_level= (or -l) argument controls the verbosity of the test output.

The test_bitcoin runner also accepts some of the command line arguments accepted by bitcoind. Use -- to separate these sets of arguments:

build/bin/test_bitcoin --log_level=all --run_test=getarg_tests -- -printtoconsole=1

The -printtoconsole=1 after the two dashes sends debug logging, which normally goes only to debug.log within the data directory, to the standard terminal output as well.

Running test_bitcoin creates a temporary working (data) directory with a randomly generated pathname within test_common bitcoin/ (the space in this directory name is intentional), which in turn is within the system's temporary directory (see temp_directory_path). This data directory looks like a simplified form of the standard bitcoind data directory. Its content will vary depending on the test, but it will always contain a debug.log file, for example.

The location of the temporary data directory can be specified with the -testdatadir option. This can make debugging easier. The directory path used is the argument path appended with /test_common bitcoin/<test-name>/datadir. The directory path is created if necessary. Specifying this argument also causes the data directory not to be removed after the last test. This is useful for looking at what the test wrote to debug.log after it completes, for example. (The directory is removed at the start of the next test run, so no leftover state is used.)

$ build/bin/test_bitcoin --run_test=getarg_tests/doubledash -- -testdatadir=/somewhere/mydatadir
Test directory (will not be deleted): "/somewhere/mydatadir/test_common bitcoin/getarg_tests/doubledash/datadir"
Running 1 test case...

*** No errors detected
$ ls -l '/somewhere/mydatadir/test_common bitcoin/getarg_tests/doubledash/datadir'
total 8
drwxrwxr-x 2 admin admin 4096 Nov 27 22:45 blocks
-rw-rw-r-- 1 admin admin 1003 Nov 27 22:45 debug.log

If you run an entire test suite, such as --run_test=getarg_tests, or all the test suites (by not specifying --run_test), a separate directory will be created for each individual test.

Adding test cases

To add a new unit test file to our test suite, you need to add the file to either src/test/CMakeLists.txt or src/wallet/test/CMakeLists.txt for wallet-related tests. The pattern is to create one test file for each class or source file for which you want to create unit tests. The file naming convention is <source_filename>_tests.cpp and such files should wrap their tests in a test suite called <source_filename>_tests. For an example of this pattern, see uint256_tests.cpp.

Logging and debugging in unit tests

Running ctest --test-dir build writes to the log file build/Testing/Temporary/LastTest.log. You can additionally pass the --output-on-failure option to display the logs of failed tests automatically. For running individual tests verbosely, refer to the section above.

To write to logs from unit tests you need to use specific message methods provided by Boost. The simplest is BOOST_TEST_MESSAGE.

For debugging you can launch the test_bitcoin executable with gdb or lldb and start debugging, just like you would with any other program:

gdb build/bin/test_bitcoin

Segmentation faults

If you hit a segmentation fault during a test run, you can diagnose where the fault is happening by running gdb ./build/bin/test_bitcoin and then using the bt command within gdb.

Another tool that can be used to resolve segmentation faults is valgrind.

If for whatever reason you want to produce a core dump file for this fault, you can do that as well. By default, the Boost test runner will intercept system errors and not produce a core file. To bypass this, add --catch_system_errors=no to the test_bitcoin arguments and ensure that your ulimits are set properly (e.g. ulimit -c unlimited).

Running the tests and hitting a segmentation fault should now produce a file called core (on Linux platforms, the file name will likely depend on the contents of /proc/sys/kernel/core_pattern).

You can then explore the core dump using:

gdb build/bin/test_bitcoin core

(gdb) bt  # produce a backtrace for where a segfault occurred