4fec726c4d352daf2fb4a7e5ed463e44c8815ddb refactor: Simplify Interpret asmap function (Fabian Jahr)
79e97d45c16f043d23ba318a661cc39ec53cf760 doc: Add more extensive docs to asmap implementation (Fabian Jahr)
cf4943fdcdd167a56c278ba094cecb0fa241a8f8 refactor: Use span instead of vector for data in util/asmap (Fabian Jahr)
385c34a05261846dac2b42d47f69b317f534dd40 refactor: Unify asmap version calculation and naming (Fabian Jahr)
fa41fc6a1a7d492b894e206f83e0c9786b44a2f0 refactor: Operate on bytes instead of bits in Asmap code (Fabian Jahr)
Pull request description:
This is a second slice carved out of #28792. It contains the following changes that are crucial for the embedding of asmap data which is added the following PR in the series (probably this will remain in #28792).
The changes are:
- Modernizes and simplifies the asmap code by operating on `std::byte` instead of bits
- Unifies asmap version calculation and naming (previously it was called version and checksum interchangeably)
- Operate on a `span` rather than a vector in the asmap internal to prevent holding the asmap data in memory twice
- Add more extensive documentation to the asmap implementation
- Unify asmap casing in implemetation function names
The first three commits were already part of #28792, the others are new.
The documentation commit came out of feedback gathered at the latest CoreDev. The primary input for the documentation was the documentation that already existed in the Python implementation (`contrib/asmap/asmap.py`) but there are several other comments as well. Please note: I have also asked several LLMs to provide suggestions on how to explain pieces of the implementation and better demonstrate how the parts work together. I have copied bits and pieces that I liked but everything has been edited further by me and obviously all mistakes here are my own.
ACKs for top commit:
hodlinator:
re-ACK 4fec726c4d352daf2fb4a7e5ed463e44c8815ddb
sipa:
ACK 4fec726c4d352daf2fb4a7e5ed463e44c8815ddb
sedited:
Re-ACK 4fec726c4d352daf2fb4a7e5ed463e44c8815ddb
Tree-SHA512: 950a591c3fcc9ddb28fcfdc3164ad3fbd325fa5004533c4a8b670fbf8b956060a0daeedd1fc2fced1f761ac49cd992b79cabe12ef46bc60b2559a7a613d0e166
Verify that addresses announcing themselves (addr == source) are exempt
from time penalties, while addresses announced by others receive the
expected penalty.
This prevents holding the asmap data in memory twice.
The version hash changes due to spans being serialized without their size-prefix (unlike vectors).
This change allows to drop brittle sizeof calls in favor of the
std::span::size method.
Other improvements include:
* Use of a namespace to mark test and bench data
* Use of the modern std::byte
* Drop of a no longer used std::vector copy and the bench/data module
There are no changes to behavior. Changes in this commit are all additions, and
are easiest to review using "git diff -U0 --word-diff-regex=." options.
Motivation for this change is to keep util functions with really generic names
like "Split" and "Join" out of the global namespace so it is easier to see
where these functions are defined, and so they don't interfere with function
overloading, especially since the util library is a dependency of the kernel
library and intended to be used with external code.
3ea54e5db7d53da5afa321e1800c29aa269dd3b3 net: Add continuous ASMap health check logging (Fabian Jahr)
28d7e55dff826a69f3f8e58139dbffb611cc5947 test: Add tests for unfiltered GetAddr usage (Fabian Jahr)
b8843d37aed1276ff8527328c956c70c6e02ee13 fuzz: Let fuzzers use filter options in GetAddr/GetAddresses (Fabian Jahr)
e16f420547fc72a5a2902927aa7138e43c0fb7c8 net: Optionally include terrible addresses in GetAddr results (Fabian Jahr)
Pull request description:
There are certain statistics we can collect by running all our known clearnet addresses against the ASMap file. This could show issues with a maliciously manipulated file or with an old file that has decayed with time.
This is just a proof of concept for now. My idea currently is to run the analysis once per day and print the results to logs if an ASMap file is used.
ACKs for top commit:
achow101:
ACK 3ea54e5db7d53da5afa321e1800c29aa269dd3b3
mzumsande:
ACK 3ea54e5db7d53da5afa321e1800c29aa269dd3b3
brunoerg:
crACK 3ea54e5db7d53da5afa321e1800c29aa269dd3b3
Tree-SHA512: 777acbfac43cc43ce4a0a3612434e4ddbc65f59ae8ffc9e24f21de09011bccb297f0599cbaa82bcf40ef68e5af582c4e98556379db7ceff7d9f97574a1cf8e09
This also cleans up the addrman (de)serialization code paths to only
allow `Disk` serialization. Some unit tests previously forced a
`Network` serialization, which does not make sense, because Bitcoin Core
in production will always `Disk` serialize.
This cleanup idea was suggested by Pieter Wuille and implemented by Anthony
Towns.
Co-authored-by: Pieter Wuille <pieter@wuille.net>
Co-authored-by: Anthony Towns <aj@erisian.com.au>
if an addr matching the network requirements is only on the new table and
select is invoked with new_only = false, ensure that the code selects the new
table.
in order to test this case, we use a non deterministic addrman. this means we
cannot have more than one address in any addrman table, or risk sporadic
failures when the second address happens to conflict.
if the code chose a table at random, the test would fail 50% of the time
Co-authored-by: Martin Zumsande <mzumsande@gmail.com>
this adds coverage for the 7 different cases of which table should be selected
when the network is specified. the different cases are the result of new_only
being true or false and whether there are network addresses on both, neither,
or one of new vs tried tables. the only case not covered is when new_only is
false and the only network addresses are on the new table.
Co-authored-by: Martin Zumsande <mzumsande@gmail.com>
Co-authored-by: Vasil Dimov <vd@FreeBSD.org>
c9d548c91fb12fba516dee896f1f97692cfa2104 net: remove CService::ToStringPort() (Vasil Dimov)
fd4f0f41e915d99c9b0eac1afd21c5628222e368 gui: simplify OptionsDialog::updateDefaultProxyNets() (Vasil Dimov)
96c791dd20fea54c17d224000dee677bc158f66a net: remove CService::ToString() use ToStringAddrPort() instead (Vasil Dimov)
944a9de08a00f8273e73cd28b40e46cc0eb0bad1 net: remove CNetAddr::ToString() and use ToStringAddr() instead (Vasil Dimov)
043b9de59aec88ae5e29daac7dc2a8b51a9414ce scripted-diff: rename ToStringIP[Port]() to ToStringAddr[Port]() (Vasil Dimov)
Pull request description:
Before this PR we had the somewhat confusing combination of methods:
`CNetAddr::ToStringIP()`
`CNetAddr::ToString()` (duplicate of the above)
`CService::ToStringIPPort()`
`CService::ToString()` (duplicate of the above, overrides a non-virtual method from `CNetAddr`)
`CService::ToStringPort()`
Avoid [overriding non-virtual methods](https://github.com/bitcoin/bitcoin/pull/25349/#issuecomment-1185226396).
"IP" stands for "Internet Protocol" and while sometimes "IP addresses" are called just "IPs", it is incorrect to call Tor or I2P addresses "IPs". Thus use "Addr" instead of "IP".
Change the above to:
`CNetAddr::ToStringAddr()`
`CService::ToStringAddrPort()`
The changes touch a lot of files, but are mostly mechanical.
ACKs for top commit:
sipa:
utACK c9d548c91fb12fba516dee896f1f97692cfa2104
achow101:
ACK c9d548c91fb12fba516dee896f1f97692cfa2104
jonatack:
re-ACK c9d548c91fb12fba516dee896f1f97692cfa2104 only change since my previous reviews is rebase, but as a sanity check rebased to current master and at each commit quickly re-reviewed and re-verified clean build and green unit tests
LarryRuane:
ACK c9d548c91fb12fba516dee896f1f97692cfa2104
Tree-SHA512: 633fb044bdecf9f551b5e3314c385bf10e2b78e8027dc51ec324b66b018da35e5b01f3fbe6295bbc455ea1bcd1a3629de1918d28de510693afaf6a52693f2157
Now that Size() performs internal consistency checks,
it will rightfully fail (and assert) when dealing with
a corrupted AddrMan. Therefore remove this check.
The functionality of the old size() is covered by the new Size()
when no arguments are specified, so this does not change behavior.
Co-authored-by: Martin Zumsande <mzumsande@gmail.com>
For now, the new functionality will be used in the context of
querying fixed seeds. Other possible applications for
future changes is the use in the context of making automatic
connections to specific networks, or making more detailed info
about addrman accessible via rpc.
Both methods do the same thing, so simplify to having just one.
`ToString()` is too generic in this case and it is unclear what it does,
given that there are similar methods:
`ToStringAddr()` (inherited from `CNetAddr`),
`ToStringPort()` and
`ToStringAddrPort()`.
In addrman unit tests, make it possible to override the check ratio from
the command line, without recompiling:
```
test_bitcoin --run_test="addrman_tests/*" -- -checkaddrman=1
```
Also, make the arguments of the constructor of `AddrManTest` the
same as the arguments of `AddrMan`.
ea4c9fd4ab9aaa2e8f2c2e38a75c9f05d0bfc866 test: Cover eviction by timeout in addrman_evictionworks (Martin Zumsande)
4f1bb467b556ec93c9b8f758783fda4d050da491 test: Add test for multiplicity in addrman new tables (Martin Zumsande)
e880bb7836dab2018049390884220177c6db9b92 test: Add test for updating addrman entries (Martin Zumsande)
f02eee8c8784dfc8db80a21ab6508f7c99298255 test: introduce utility function to retrieve an addrman (Martin Zumsande)
f0e5efb82493f7a14580335ce719d5be81c8713e test: Remove unused AddrManTest class (Martin Zumsande)
b696d7870b29232057600df5ddd8351888253b95 test: Remove tests for internal helper functions (Martin Zumsande)
0538520091bf2982a029a0298835400f5afbdc15 test: use AddrMan instead of AddrManTest where possible (Martin Zumsande)
1c65d427bbf61bb558cf7e18f7aff99b19f68508 test: Inline SimConnFail function (Martin Zumsande)
5b7aac34f2363822c3a1cfafda8ffc9528905058 test: delete unused GetBucketAndEntry function (Amiti Uttarwar)
2ba1e74e59a325ca6cb140757067dd5e0c7c249b test: Update addrman_serialization unit test to use AddrMan's interface (Amiti Uttarwar)
dad5f760211df314d650999e0a76edb0151b4fe1 addrman: Introduce a test-only function to lookup addresses (Amiti Uttarwar)
Pull request description:
This PR (joint work with Amiti Uttarwar) changes the addrman unit tests such that they only use the public `AddrMan` interface:
This has the advantage that the tests are less implementation-dependent, i.e. it would be possible to rewrite the internal addrman implementation (as drafted [here](https://github.com/sipa/bitcoin/tree/202106_multiindex_addrman) for using a multiindex) without having to adjust the tests.
This includes the following steps:
* Adding a test-only function `FindAddressEntry()` to the public addrman interface which returns info about an address in addrman (e.g. bucket, position, whethe the address is in new or tried). Obviously we want to do this sparingly, but I think a single test-only function is ok (which could also be useful elsewhere, e.g. in fuzz tests).
* Removal of the `AddrManTest` subclass which would reach into AddrMan's internals, using `AddrMan` instead
* Removal of tests for internal helper functions that are not publicly exposed (these are still tested indirectly via the public functions calling them).
* Additional tests for previously untested features such as multiplicity in the new tables, that can be tested with the help of `FindAddressEntry()`.
All in all, this PR increases the unit test coverage of AddrMan by a bit.
ACKs for top commit:
jnewbery:
ACK ea4c9fd4ab9aaa2e8f2c2e38a75c9f05d0bfc866
josibake:
reACK ea4c9fd4ab
Tree-SHA512: c2d4ec8bdc62ffd6055ddcd37dea85ec08c76889e9e417e8d7c62a96cf68a8bcbe8c67bec3344d91fa7d3c499f6d9f810962da1dddd38e70966186b10b8ab447
The logic of these functions is already covered by existing unit tests
using publicly exposed functions of the interface.
Therefore, removing them does not decrease test coverage.
Switches to AddrMan for tests that use no features of AddrManTest.
Also removes unusued AddrManTest variables
Co-Authored-By: Amiti Uttarwar <amiti@uttarwar.org>