b0c706795ce6a3a00bf068a81ee99fef2ee9bf7e Remove unreliable seed from chainparams.cpp, and the associated README (SatsAndSports)
Pull request description:
The DNS seed `dnsseed.bitcoin.dashjr-list-of-p2p-nodes.us.` is not returning a representative sample of bitcoin nodes. It currently returns nothing later than 28.1.0, breaching the policy.
This PR removes that seed from the list of DNS seeds
### Rationale
The [policy for seeds](https://github.com/bitcoin/bitcoin/blob/master/doc/dnsseed-policy.md) includes this:
> The DNS seed results must consist exclusively of fairly selected and functioning Bitcoin nodes from the public network
A number of comments below, in response to this PR, include apparent breaches of this policy: [1](https://github.com/bitcoin/bitcoin/pull/33723#issuecomment-3458071231) [2](https://github.com/bitcoin/bitcoin/pull/33723#issuecomment-3457655364), [3](https://github.com/bitcoin/bitcoin/pull/33723#issuecomment-3457712557), in particular the first linked comment ([1](https://github.com/bitcoin/bitcoin/pull/33723#issuecomment-3458071231)) comparing the distribution at this seed to other seeds. This seed is not including anything later than 28.2.0, breaching this policy.
To ensure the policy is followed, and the seeds include a representative sample of Bitcoin nodes, this PR removes this seed from the list
### Data
I ran this:
```
# Get some ip address from that seed:
# Repeated multiple times, to get many different IPs:
dig +short dnsseed.bitcoin.dashjr-list-of-p2p-nodes.us >> dnsseed.bitcoin.dashjr-list-of-p2p-nodes.us
# For each distinct ip gathered from the seed, get basic info about the node, including it's User Agent string:
cat dnsseed.bitcoin.dashjr-list-of-p2p-nodes.us | sort -u | while read ip; do echo ===; echo $ip; nmap -p 8333 --script bitcoin-info "$ip"; done > seed_versions.txt
```
and then summarized the agents with `egrep 'User Agent' seed_versions.txt | sort | uniq -c` and got:
```
1 User Agent: /Satoshi:22.0.0/
1 User Agent: /Satoshi:22.1.0/
5 User Agent: /Satoshi:24.0.1/
1 User Agent: /Satoshi:25.1.0/
30 User Agent: /Satoshi:27.0.0/
1 User Agent: /Satoshi:27.1.0/
1 User Agent: /Satoshi:27.1.0/Knots:20240801/
1 User Agent: /Satoshi:28.0.0/
7 User Agent: /Satoshi:28.1.0/
2 User Agent: /Satoshi:28.1.0/Knots:20250305/
```
ACKs for top commit:
l0rinc:
reACK b0c706795ce6a3a00bf068a81ee99fef2ee9bf7e
delta1:
reACK b0c706795c
Crypt-iQ:
crACK b0c706795ce6a3a00bf068a81ee99fef2ee9bf7e
laanwj:
ACK b0c706795ce6a3a00bf068a81ee99fef2ee9bf7e
murchandamus:
ACK b0c706795c
RandyMcMillan:
ACK b0c7067
wiz:
ACK b0c706795ce6a3a00bf068a81ee99fef2ee9bf7e
dergoegge:
ACK b0c706795ce6a3a00bf068a81ee99fef2ee9bf7e
stickies-v:
re-ACK b0c706795ce6a3a00bf068a81ee99fef2ee9bf7e
mzumsande:
ACK b0c706795ce6a3a00bf068a81ee99fef2ee9bf7e
instagibbs:
ACK b0c706795ce6a3a00bf068a81ee99fef2ee9bf7e
Tree-SHA512: 7230b8dd24560ce6f8247e2e82ae7846ded8b91e230c59cc3643da3f5b9c12b5f025c1bb14490c19ca55f3794e81ce08106b31b3bf883d5c2dced05017123ac4
Historically, the headers have been bumped some time after a file has
been touched. Do it now to avoid having to touch them again in the
future for that reason.
-BEGIN VERIFY SCRIPT-
sed -i --regexp-extended 's;( 20[0-2][0-9])(-20[0-2][0-9])? The Bitcoin Core developers;\1-present The Bitcoin Core developers;g' $( git show --pretty="" --name-only HEAD~0 )
-END VERIFY SCRIPT-
The encoding arg is confusing, because it is not applied consistently
for all IO.
Also, it is useless, as the majority of files are ASCII encoded, which
are fine to encode and decode with any mode.
Moreover, UTF-8 is already required for most scripts to work properly,
so setting the encoding twice is redundant.
So remove the encoding from most IO. It would be fine to remove from all
IO, however I kept it for two files:
* contrib/asmap/asmap-tool.py: This specifically looks for utf-8
encoding errors, so it makes sense to sepecify the utf-8 encoding
explicitly.
* test/functional/test_framework/test_node.py: Reading the debug log in
text mode specifically counts the utf-8 characters (not bytes), so it
makes sense to specify the utf-8 encoding explicitly.
We shouldn't have | at the end of the last clause, as this will make it match
the empty string too (so effectively everything starting with Satoshi: matches).
While doing this, put the | at the beginning of every line of regex rather than
the end, to make it easier to update in the future without accidentally running
into this problem again.
Regenerate mainnet seeds from new sources without the need for hardcoded
data. Result has 512 nodes from each network type except cjdns, for
which only eight nodes were found that match the seed node criteria.
Pull additional nodes from virtu's crawler. Data includes sufficient
Onion and I2P nodes to align the uptime requirements for these networks
to that of clearnet nodes (i.e., 50%). Data also includes more than
three times the number of CJDNS nodes currently hardcoded into
nodes_main_manual.txt, so hardcoded nodes becomes obsolete.
The seeders now produce onion and i2p seeds, so there is no need to keep these
in the manual list.
Although should also be produced, there are not enough
good ones detected by the seeder, so we keep the manual seeds for them.
The crawlers are not guaranteed to output nodes in a random order, so
shuffle the ips list after parsing to break any biasing that may be
caused by the output order.
Update the user agent regex to match all 3 digits of the version number,
not just the first 2 digits.
Also updates it to include 24.2, 25.2, 26.1, 27.0, 27.1, 27.99, 28.0 and
28.99.
fa52e13ee81fcc7543890dbd6986fcb55168583f test: Remove struct.pack from almost all places (MarcoFalke)
fa826db477a981b48bff53021f9695a5f6682dc0 scripted-diff: test: Use int.to_bytes over struct packing (MarcoFalke)
faf2a975ad46799d075e3a70c699da0d8182aab9 test: Use int.to_bytes over struct packing (MarcoFalke)
faf3cd659a72473a1aa73c4367a145f4ec64f146 test: Normalize struct.pack format (MarcoFalke)
Pull request description:
`struct.pack` has many issues:
* The format string consists of characters that may be confusing and may need to be looked up in the documentation, as opposed to using easy to understand self-documenting code.
This lead to many test bugs, which weren't hit, which is fine, but still confusing. Ref: https://github.com/bitcoin/bitcoin/pull/29400, https://github.com/bitcoin/bitcoin/pull/29399, https://github.com/bitcoin/bitcoin/pull/29363, fa3886b7c69cbbe564478f30bb2c35e9e6b1cffa, ...
Fix all issues by using the built-in `int` helpers `to_bytes` via a scripted diff.
Review notes:
* For `struct.pack` and `int.to_bytes` the error behavior is the same, although the error messages are not identical.
* Two `struct.pack` remain. One for float serialization in a C++ code comment, and one for native serialization.
ACKs for top commit:
achow101:
ACK fa52e13ee81fcc7543890dbd6986fcb55168583f
rkrux:
tACK [fa52e13](fa52e13ee8)
theStack:
Code-review ACK fa52e13ee81fcc7543890dbd6986fcb55168583f
Tree-SHA512: ee80d935b68ae43d1654b047e84ceb80abbd20306df35cca848b3f7874634b518560ddcbc7e714e2e7a19241e153dee64556dc4701287ae811e26e4f8c57fe3e
6abe772a17e09fe96e68cd3311280d5a30f6378b contrib: Add asmap-tool (Fabian Jahr)
Pull request description:
This adds `asmap.py` and `asmap-tool.py` from sipa's `nextgen` branch: https://github.com/sipa/asmap/tree/nextgen
The motivation is that we should maintain the tooling for de- and encoding asmap files within the bitcoin core repository because it is not possible to use an asmap file that is not encoded.
We already had an earlier version of `asmap.py` within the seeds contrib tools. The newer version only had a small amount of changes and is still compatible, so the old version is removed from contrib/seeds and the new version is made available to `makeseeds.py`.
ACKs for top commit:
virtu:
ACK [6abe772](6abe772a17)
0xB10C:
ACK 6abe772a17e09fe96e68cd3311280d5a30f6378b
achow101:
ACK 6abe772a17e09fe96e68cd3311280d5a30f6378b
brunoerg:
ACK 6abe772a17e09fe96e68cd3311280d5a30f6378b
Tree-SHA512: cc2a82ffa4eb46fa0ce4ca769dd82f8d0d2f37fc3652aa748eeb060e1142f9da4035008fe89433e2fd524a4dc153b7b9c085748944b49137b37009b0c0be8afb
Since Python 3.9, type hinting has become a little less awkward, as for
collection types one doesn't need to import the corresponding
capitalized types (`Dict`, `List`, `Set`, `Tuple`, ...) anymore, but can
use the built-in types directly. [1] [2]
This commit applies the replacement for all Python scripts (i.e. in the
contrib and test folders) for the basic types:
- typing.Dict -> dict
- typing.List -> list
- typing.Set -> set
- typing.Tuple -> tuple
[1] https://docs.python.org/3.9/whatsnew/3.9.html#type-hinting-generics-in-standard-collections
[2] https://peps.python.org/pep-0585/#implementation for a list of type
a2bef805c11a4ab391f4ea635bfde7dd2ec9ce81 kernel: update m_assumed_* chain params for 25.x (fanquake)
4128e01dbaa630396b0e7576ee8a1987267f77b9 kernel: update chainTxData for 25.x (fanquake)
00b2b114b442835c863e7772348cb1d1881ba0f2 kernel: update nMinimumChainWork & defaultAssumeValid for 25.x (fanquake)
07fcc0a82cfd08c991f61c2340630f0c4659a3a0 doc: update references to kernel/chainparams.cpp (fanquake)
Pull request description:
Update chainparams pre `25.x` branch off.
Co-Author in the commits as a PR (#27223) had previously been opened too-early to do the same.
Note: Remember that some variance is expected in the `m_assumed_*` sizes.
ACKs for top commit:
achow101:
ACK a2bef805c11a4ab391f4ea635bfde7dd2ec9ce81
josibake:
ACK a2bef805c1
gruve-p:
ACK a2bef805c1
dergoegge:
ACK a2bef805c11a4ab391f4ea635bfde7dd2ec9ce81 on the new mainnet params
Tree-SHA512: 0b19c2ef15c6b15863d6a560a1053ee223057c7bfb617ffd3400b1734cee8f75bc6fd7f04d8f8e3f5af6220659a1987951a1b36945d6fe17d06972004fd62610
3cc989da5c750e740705131bed05bbf93bfdf169 Fix checking bad dns seeds without casting (Yusuf Sahin HAMZA)
Pull request description:
- Since seed lines comes with `str` type, comparing `good` column directly with **0** (`int` type) in the if statement was not working at all. This is fixed by casting `int` type to the values in the `good` column of seeds text file.
- Lines that starts with comment in the seeds text file are now ignored.
- If statement for checking bad seeds are moved to the top of the `parseline` function as if a seed is bad; there is no point of going forward from there.
Since this bug-fix eliminates bad seeds over **550k** in the first place, in my case; particular job for parsing all seeds speed is up by **600%** and whole script's speed is up by **%30**.
Note that **stats** in the terminal are not going to include bad seeds after this fix, which would be the same if this bug were never there before.
ACKs for top commit:
achow101:
ACK 3cc989da5c750e740705131bed05bbf93bfdf169
jonatack:
ACK 3cc989da5c750e740705131bed05bbf93bfdf169
Tree-SHA512: 13c82681de4d72de07293f0b7f09721ad8514a2ad99b0584d1c94fa5f2818821df2000944f9514d6a222a5dccc82856d16c8c05aa36d905cfa7d4610c629fd38