Normally, the values are small enough to fit in size_t, but the risk
that it may not fit sometimes is a reason to use uint64_t consistently
for all architectures.
On 64-bit systems, this refactor is a no-op. On 32-bit systems, it could
avoid bugs in the theoretical and unexpected case where a 32-bit size_t
is too small and overflows.
For example, 32-bit Bitcoin Core versions with CVE-2025-46597 unfixed
may overflow while checking for the bad-blk-length violation when
receiving a malformed and bloated compact block.
Made every signed/unsigned conversion in the serialization helpers explicit so the UBSan `implicit-sign-change` check passes and the `serialize.h` suppression can be dropped.
For consistency, a few other simple changes were also applied to the serialization helpers:
* remove redundant `inline` on function templates;
* unify formatting to make the differences between similar methods obvious.
ffff4a293ad878494e12f8f00108cc99ee2b713e bench: Update span-serialize comment (MarcoFalke)
fa4d6ec97bcb1790a7cd4363a13fda7c80c3dd90 refactor: Avoid false-positive gcc warning (MarcoFalke)
fa942332b40c97375af0722f32f7575bca3af819 scripted-diff: Bump copyright headers after std::span changes (MarcoFalke)
fa0c6b7179c062b7ca92d120455ce02a9f4e9e19 refactor: Remove unused Span alias (MarcoFalke)
fade0b5e5e6e80e3da1ab6448b6212244bafa5d3 scripted-diff: Use std::span over Span (MarcoFalke)
fadccc26c03db00a2be3f703aa7e5eec4312bd2e refactor: Make Span an alias of std::span (MarcoFalke)
fa27e36717ec18d64b7ff7bba71b8f0c202ba31d test: Fix broken span_tests (MarcoFalke)
fadf02ef8bf96ad5b3b8e34fd425b31b555f4371 refactor: Return std::span from MakeUCharSpan (MarcoFalke)
fa720b94be17fa9e7c91188710e6a04939ceab11 refactor: Return std::span from MakeByteSpan (MarcoFalke)
Pull request description:
`Span` has some issues:
* It does not support fixed-size spans, which are available through `std::span`.
* It is confusing to have it available and in use at the same time with `std::span`.
* It does not obey the standard library iterator build hardening flags. See https://github.com/bitcoin/bitcoin/issues/31272 for a discussion. For example, this allows to catch issues like the one fixed in commit fabeca3458b38a3d8930cb0cbc866388c3f120f1.
Both types are type-safe and can even implicitly convert into each other in most contexts.
However, exclusively using `std::span` seems less confusing, so do it here with a scripted-diff.
ACKs for top commit:
l0rinc:
reACK ffff4a293ad878494e12f8f00108cc99ee2b713e
theuni:
ACK ffff4a293ad878494e12f8f00108cc99ee2b713e.
Tree-SHA512: 9cc2f1f43551e2c07cc09f38b1f27d11e57e9e9bc0c6138c8fddd0cef54b91acd8b14711205ff949be874294a121910d0aceffe0e8914c4cff07f1e0e87ad5b8
Historically, the headers have been bumped some time after a file has
been touched. Do it now to avoid having to touch them again in the
future for that reason.
-BEGIN VERIFY SCRIPT-
sed -i --regexp-extended 's;( 20[0-2][0-9])(-20[0-2][0-9])? The Bitcoin Core developers;\1-present The Bitcoin Core developers;g' $( git show --pretty="" --name-only HEAD~1 )
-END VERIFY SCRIPT-
The use of e.g. `std::underlying_type_t<T>` replaces the older `typename std::underlying_type<T>::type`.
The `_t` helper alias template (such as `std::underlying_type_t<T>`) introduced in C++14 offers a cleaner and more concise way to extract the type directly.
See https://en.cppreference.com/w/cpp/types/underlying_type for details.
-BEGIN VERIFY SCRIPT-
sed -i -E 's/(typename )?(std::[a-z_]+)(<[^<>]+>)::type\b/\2_t\3/g' $(git grep -l '::type' ./src ':(exclude)src/bench/nanobench.h' ':(exclude)src/leveldb' ':(exclude)src/minisketch' ':(exclude)src/span.h' ':(exclude)src/sync.h')
-END VERIFY SCRIPT-
8d491ae9ecf1948ea29f67b50ca7259123f602aa serialization: Add ParamsStream GetStream() method (Ryan Ofsky)
951203bcc496c4415b7754cd764544659b76067f net: Simplify ParamsStream usage (Ryan Ofsky)
e6794e475c84d9edca4a2876e2342cbb1d85f804 serialization: Accept multiple parameters in ParamsStream constructor (Ryan Ofsky)
cb28849a88339c1e7ba03ffc7e38998339074e6e serialization: Reverse ParamsStream constructor order (Ryan Ofsky)
83436d14f06551026bcf5529df3b63b4e8a679fb serialization: Drop unnecessary ParamsStream references (Ryan Ofsky)
84502b755bcc35413ad466047893b5edf134c53f serialization: Drop references to GetVersion/GetType (Ryan Ofsky)
f3a2b5237688e9f574444e793724664b00fb7f2a serialization: Support for multiple parameters (Ryan Ofsky)
Pull request description:
Currently it is only possible to attach one serialization parameter to a stream at a time. For example, it is not possible to set a parameter controlling the transaction format and a parameter controlling the address format at the same time because one parameter will override the other.
This limitation is inconvenient for multiprocess code since it is not possible to create just one type of stream and serialize any object to it. Instead it is necessary to create different streams for different object types, which requires extra boilerplate and makes using the new parameter fields a lot more awkward than the older version and type fields.
Fix this problem by allowing an unlimited number of serialization stream parameters to be set, and allowing them to be requested by type. Later parameters will still override earlier parameters, but only if they have the same type.
For an example of different ways multiple parameters can be set, see the new [`with_params_multi`](40f505583f/src/test/serialize_tests.cpp (L394-L410)) unit test.
This change requires replacing the `stream.GetParams()` method with a `stream.GetParams<T>()` method in order for serialization code to retrieve the desired parameters. The change is more verbose, but probably a good thing for readability because previously it could be difficult to know what type the `GetParams()` method would return, and now it is more obvious.
---
This PR is part of the [process separation project](https://github.com/bitcoin/bitcoin/issues/28722).
ACKs for top commit:
maflcko:
ACK 8d491ae9ecf1948ea29f67b50ca7259123f602aa 🔵
sipa:
utACK 8d491ae9ecf1948ea29f67b50ca7259123f602aa
TheCharlatan:
ACK 8d491ae9ecf1948ea29f67b50ca7259123f602aa
Tree-SHA512: 40b7041ee01c0372b1f86f7fd6f3b6af56ef24a6383f91ffcedd04d388e63407006457bf7ed056b0e37b4dec9ffd5ca006cb8192e488ea2c64678567e38d4647
86b7f28d6c507155a9d3a15487ee883989b88943 serialization: use internal endian conversion functions (Cory Fields)
432b18ca8d0654318a8d882b28b20af2cb2d2e5d serialization: detect byteswap builtins without autoconf tests (Cory Fields)
297367b3bb062c57142747719ac9bf2e12717ce9 crypto: replace CountBits with std::bit_width (Cory Fields)
52f9bba889fd9b50a0543fd9fedc389592cdc7e5 crypto: replace non-standard CLZ builtins with c++20's bit_width (Cory Fields)
Pull request description:
This replaces #28674, #29036, and #29057. Now ready for testing and review.
Replaces platform-specific endian and byteswap functions. This is especially useful for kernel, as it means that our deep serialization code no longer requires bitcoin-config.h.
I apologize for the size of the last commit, but it's hard to avoid making those changes at once.
All platforms now use our internal functions rather than libc or platform-specific ones, with the exception of MSVC.
Sadly, benchmarking showed that not all compilers are capable of detecting and optimizing byteswap functions, so compiler builtins are instead used where possible. However, they're now detected via macros rather than autoconf checks.
This[ matches how libc++ implements std::byteswap for c++23](https://github.com/llvm/llvm-project/blob/main/libcxx/include/__bit/byteswap.h#L26).
I suggest we move/rename `compat/endian.h`, but I left that out of this PR to avoid bikeshedding.
#29057 pointed out some irregularities in benchmarks. After messing with various compilers and configs for a few weeks with these changes, I'm of the opinion that we can't win on every platform every time, so we should take the code that makes sense going forward. That said, if any real-world slowdowns are caused here, we should obviously investigate.
ACKs for top commit:
maflcko:
ACK 86b7f28d6c507155a9d3a15487ee883989b88943 📘
fanquake:
ACK 86b7f28d6c507155a9d3a15487ee883989b88943 - we can finish pruning out the __builtin_clz* checks/usage once the minisketch code has been updated. This is more good cleanup pre-CMake & for the kernal.
Tree-SHA512: 715a32ec190c70505ffbce70bfe81fc7b6aa33e376b60292e801f60cf17025aabfcab4e8c53ebb2e28ffc5cf4c20b74fe3dd8548371ad772085c13aec8b7970e
These replace our platform-specific mess in favor of c++20 endian detection
via std::endian and internal byteswap functions when necessary.
They no longer rely on autoconf detection.
Before this change it was possible but awkward to create ParamStream streams
with multiple parameter objects. After this change it is straightforward.
The change to support multiple parameters is implemented by letting
ParamsStream contain substream instances, instead of just references to
external substreams. So a side-effect of this change is that ParamStream can
now accept rvalue stream arguments and be easier to use in some other cases. A
test for rvalues is added in this commit, and some simplifications to non-test
code are made in the next commit.
Move parameter argument after stream argument so will be possible to accept
multiple variadic parameter arguments in the following commit.
Also reverse template parameter order for consistency.
9d1dbbd4ceb8c04340927f5127195dc306adf3fc scripted-diff: Fix bitcoin_config_h includes (TheCharlatan)
Pull request description:
As mentioned in https://github.com/bitcoin/bitcoin/pull/26924#issuecomment-1403449932 and https://github.com/bitcoin/bitcoin/pull/29263#issuecomment-1922334399, it is currently not safe to remove `bitcoin-config.h` includes from headers because some unrelated file might be depending on it.
See also #26972 for discussion.
Solve this by including the file directly everywhere it's required, regardless of whether or not it's already included by another header.
There should be no functional change here, but it will allow us to safely remove includes from headers in the future.
~I'm afraid it's a bit tedious to reproduce these commits, but it's reasonably straightforward:~
Edit: See note below
```bash
# All commands executed from the src/ subdir.
# Collect all tokens from bitcoin-config.h.in
# Isolate the tokens and remove blank lines
# Replace newlines with | and remove the last trailing one
# Collect all files which use these tokens
# Filter out subprojects (proper forwarding can be verified from Makefiles)
# Filter out .rc files
# Save to a text file
git grep -E -l `grep undef config/bitcoin-config.h.in | cut -d" " -f2 | grep -v '^$' | tr '\n' '|' | sed 's/|$//'` | grep -v -e "^leveldb/" -e "^secp256k1/" -e "^crc32c/" -e "^minisketch/" -e "^Makefile" -e "\.rc$" > files-with-config-include.txt
# Find all files from the above list which don't include bitcoin-config.h
git grep -L -E "config/bitcoin-config.h" -- `cat files-with-config-include.txt`
# Include them manually with the exception of some files in crypto:
# crypto/sha256_arm_shani.cpp crypto/sha256_avx2.cpp crypto/sha256_sse41.cpp crypto/sha256_x86_shani.cpp
# These are exceptions which don't use bitcoin-config.h, rather the Makefile.am adds these cppflags manually.
# Commit changes. This should match the first commit of this PR.
# Use the same search as above to find all files which DON'T use any config tokens
git grep -E -L `grep undef config/bitcoin-config.h.in | cut -d" " -f2 | grep -v '^$' | tr '\n' '|' | sed 's/|$//'` | grep -v -e "^leveldb/" -e "^secp256k1/" -e "^crc32c/" -e "^minisketch/" -e "^Makefile" -e "\.rc$" > files-without-config-include.txt
# Manually remove the includes and commit changes. This should match the second commit of this PR.
```
Edit: I'll keep this old description for posterity, but the manual approach has been replaced with a scripted diff from TheCharlatan
ACKs for top commit:
maflcko:
ACK 9d1dbbd4ceb8c04340927f5127195dc306adf3f 🚪
TheCharlatan:
ACK 9d1dbbd4ceb8c04340927f5127195dc306adf3fc
hebasto:
ACK 9d1dbbd4ceb8c04340927f5127195dc306adf3fc, I have reviewed the code and it looks OK.
fanquake:
ACK 9d1dbbd4ceb8c04340927f5127195dc306adf3fc
Tree-SHA512: f11ddc4ae6a887f96b954a6b77f310558ddb271088a3fda3edc833669c4251b7f392515224bbb8e5f67eb2c799b4ffed3b07d96454e82ec635c686d0df545872
-BEGIN VERIFY SCRIPT-
regex_string='^(?!//).*(AC_APPLE_UNIVERSAL_BUILD|BOOST_PROCESS_USE_STD_FS|CHAR_EQUALS_INT8|CLIENT_VERSION_BUILD|CLIENT_VERSION_IS_RELEASE|CLIENT_VERSION_MAJOR|CLIENT_VERSION_MINOR|COPYRIGHT_HOLDERS|COPYRIGHT_HOLDERS_FINAL|COPYRIGHT_HOLDERS_SUBSTITUTION|COPYRIGHT_YEAR|ENABLE_ARM_SHANI|ENABLE_AVX2|ENABLE_EXTERNAL_SIGNER|ENABLE_SSE41|ENABLE_TRACING|ENABLE_WALLET|ENABLE_X86_SHANI|ENABLE_ZMQ|HAVE_BOOST|HAVE_BUILTIN_CLZL|HAVE_BUILTIN_CLZLL|HAVE_BYTESWAP_H|HAVE_CLMUL|HAVE_CONSENSUS_LIB|HAVE_CXX20|HAVE_DECL_BE16TOH|HAVE_DECL_BE32TOH|HAVE_DECL_BE64TOH|HAVE_DECL_BSWAP_16|HAVE_DECL_BSWAP_32|HAVE_DECL_BSWAP_64|HAVE_DECL_FORK|HAVE_DECL_FREEIFADDRS|HAVE_DECL_GETIFADDRS|HAVE_DECL_HTOBE16|HAVE_DECL_HTOBE32|HAVE_DECL_HTOBE64|HAVE_DECL_HTOLE16|HAVE_DECL_HTOLE32|HAVE_DECL_HTOLE64|HAVE_DECL_LE16TOH|HAVE_DECL_LE32TOH|HAVE_DECL_LE64TOH|HAVE_DECL_PIPE2|HAVE_DECL_SETSID|HAVE_DECL_STRERROR_R|HAVE_DEFAULT_VISIBILITY_ATTRIBUTE|HAVE_DLFCN_H|HAVE_DLLEXPORT_ATTRIBUTE|HAVE_ENDIAN_H|HAVE_EVHTTP_CONNECTION_GET_PEER_CONST_CHAR|HAVE_FDATASYNC|HAVE_GETENTROPY_RAND|HAVE_GETRANDOM|HAVE_GMTIME_R|HAVE_INTTYPES_H|HAVE_LIBADVAPI32|HAVE_LIBCOMCTL32|HAVE_LIBCOMDLG32|HAVE_LIBGDI32|HAVE_LIBIPHLPAPI|HAVE_LIBKERNEL32|HAVE_LIBOLE32|HAVE_LIBOLEAUT32|HAVE_LIBSHELL32|HAVE_LIBSHLWAPI|HAVE_LIBUSER32|HAVE_LIBUUID|HAVE_LIBWINMM|HAVE_LIBWS2_32|HAVE_MALLOC_INFO|HAVE_MALLOPT_ARENA_MAX|HAVE_MINIUPNPC_MINIUPNPC_H|HAVE_MINIUPNPC_UPNPCOMMANDS_H|HAVE_MINIUPNPC_UPNPERRORS_H|HAVE_NATPMP_H|HAVE_O_CLOEXEC|HAVE_POSIX_FALLOCATE|HAVE_PTHREAD|HAVE_PTHREAD_PRIO_INHERIT|HAVE_STDINT_H|HAVE_STDIO_H|HAVE_STDLIB_H|HAVE_STRERROR_R|HAVE_STRINGS_H|HAVE_STRING_H|HAVE_STRONG_GETAUXVAL|HAVE_SYSCTL|HAVE_SYSCTL_ARND|HAVE_SYSTEM|HAVE_SYS_ENDIAN_H|HAVE_SYS_PRCTL_H|HAVE_SYS_RESOURCES_H|HAVE_SYS_SELECT_H|HAVE_SYS_STAT_H|HAVE_SYS_SYSCTL_H|HAVE_SYS_TYPES_H|HAVE_SYS_VMMETER_H|HAVE_THREAD_LOCAL|HAVE_TIMINGSAFE_BCMP|HAVE_UNISTD_H|HAVE_VM_VM_PARAM_H|LT_OBJDIR|PACKAGE_BUGREPORT|PACKAGE_NAME|PACKAGE_STRING|PACKAGE_TARNAME|PACKAGE_URL|PACKAGE_VERSION|PTHREAD_CREATE_JOINABLE|QT_QPA_PLATFORM_ANDROID|QT_QPA_PLATFORM_COCOA|QT_QPA_PLATFORM_MINIMAL|QT_QPA_PLATFORM_WINDOWS|QT_QPA_PLATFORM_XCB|QT_STATICPLUGIN|STDC_HEADERS|STRERROR_R_CHAR_P|USE_ASM|USE_BDB|USE_DBUS|USE_NATPMP|USE_QRCODE|USE_SQLITE|USE_UPNP|_FILE_OFFSET_BITS|_LARGE_FILES)'
exclusion_files=":(exclude)src/minisketch :(exclude)src/crc32c :(exclude)src/secp256k1 :(exclude)src/crypto/sha256_arm_shani.cpp :(exclude)src/crypto/sha256_avx2.cpp :(exclude)src/crypto/sha256_sse41.cpp :(exclude)src/crypto/sha256_x86_shani.cpp"
git grep --perl-regexp --files-with-matches "$regex_string" -- '*.cpp' $exclusion_files | xargs git grep -L "bitcoin-config.h" | while read -r file; do line_number=$(awk -v my_file="$file" '/\/\/ file COPYING or https?:\/\/www.opensource.org\/licenses\/mit-license.php\./ {line = NR} /^\/\// && NR == line + 1 {while(getline && /^\/\//) line = NR} END {print line+1}' "$file"); sed -i "${line_number}i\\\\n\#if defined(HAVE_CONFIG_H)\\n#include <config/bitcoin-config.h>\\n\#endif" "$file"; done;
git grep --perl-regexp --files-with-matches "$regex_string" -- '*.h' $exclusion_files | xargs git grep -L "bitcoin-config.h" | while read -r file; do sed -i "/#define.*_H/a \\\\n\#if defined(HAVE_CONFIG_H)\\n#include <config/bitcoin-config.h>\\n\#endif" "$file"; done;
for file in $(git grep --files-with-matches 'bitcoin-config.h' -- '*.cpp' '*.h' $exclusion_files); do if ! grep -q --perl-regexp "$regex_string" $file; then sed -i '/HAVE_CONFIG_H/{N;N;N;d;}' $file; fi; done;
-END VERIFY SCRIPT-
The first command creates a regular expression for matching all bitcoin-config.h symbols in the following form: ^(?!//).*(AC_APPLE_UNIVERSAL_BUILD|BOOST_PROCESS_USE_STD_FS|...|_LARGE_FILES). It was generated with:
./autogen.sh && printf '^(?!//).*(%s)' $(awk '/^#undef/ {print $2}' src/config/bitcoin-config.h.in | paste -sd "|" -)
The second command holds a list of files and directories that should not be processed. These include subtree directories as well as some crypto files that already get their symbols through the makefile.
The third command checks for missing bitcoin-config headers in .cpp files and adds the header if it is missing.
The fourth command checks for missing bitcoin-config headers in .h files and adds the header if it is missing.
The fifth command checks for unneeded bitcoin-config headers in sources files and removes the header if it is unneeded.
This commit makes a minimal change to the ParamsStream class to let it retrieve
multiple parameters. Followup commits after this commit clean up code using
ParamsStream and make it easier to set multiple parameters.
Currently it is only possible to attach one serialization parameter to a stream
at a time. For example, it is not possible to set a parameter controlling the
transaction format and a parameter controlling the address format at the same
time because one parameter will override the other.
This limitation is inconvenient for multiprocess code since it is not possible
to create just one type of stream and serialize any object to it. Instead it is
necessary to create different streams for different object types, which
requires extra boilerplate and makes using the new parameter fields a lot more
awkward than the older version and type fields.
Fix this problem by allowing an unlimited number of serialization stream
parameters to be set, and allowing them to be requested by type. Later
parameters will still override earlier parameters, but only if they have the
same type.
This change requires replacing the stream.GetParams() method with a
stream.GetParams<T>() method in order for serialization code to retrieve the
desired parameters. This change is more verbose, but probably a good thing for
readability because previously it could be difficult to know what type the
GetParams() method would return, and now it is more obvious.
fa1a38470697796a1a67397a815c8f8256f59224 Move compat.h include from system.h to system.cpp (MarcoFalke)
88887531b704f3943fdb33abbdd5378ecfeee14f Move compat/assumptions.h include to one place that actually needs it (MarcoFalke)
77774110f4dd591a71441851813d59c03c9e3c78 Remove __cplusplus from compat/assumptions.h (MarcoFalke)
faa3d4f1d8ecff444be53215d72e32d71d9ce138 Remove duplicate NDEBUG check from compat/assumptions.h (MarcoFalke)
Pull request description:
Generally, compile-time checks should be close to the code that use them. Especially, since `compat/assumptions.h` is only included in one place, where iwyu suggests to remove it.
Fix all issues:
* The `NDEBUG` check is used in `util/check`, so it is redundant in `compat/assumptions.h`.
* The `__cplusplus` check is redundant with `doc/dependencies.md` (see commit message).
* Add missing `// IWYU pragma: keep` to avoid removing the include by accident.
ACKs for top commit:
achow101:
ACK fa1a38470697796a1a67397a815c8f8256f59224
TheCharlatan:
re-ACK fa1a38470697796a1a67397a815c8f8256f59224
theuni:
ACK fa1a38470697796a1a67397a815c8f8256f59224
Tree-SHA512: f8b6db84be5d8844a2267345c0b1405fcbc39b8b5eeaa24db5b8412a74145fe44cf188b6b0c39cc2b062690ed37ca5b4662473484afe28dbec6469e79961389b
Protocol version is no longer needed to work out the serialized size
of objects so drop that information from CSizeComputer and rename the
class to SizeComputer.
The moved part can be reviewed with the git options
--ignore-all-space --color-moved=dimmed-zebra --color-moved-ws=ignore-all-space
(Modified by Marco Falke)
Co-authored-by: Pieter Wuille <pieter@wuille.net>
Instead of recursively calling `UnserializeMany` and peeling off one
argument at a time, use a fold expression. This simplifies the code,
makes it most likely faster because it reduces the number of function
calls, and compiles faster because there are fewer template
instantiations.
Instead of recursively calling `SerializeMany` and peeling off one
argument at a time, use a fold expression. This simplifies the code,
makes it most likely faster because it reduces the number of function
calls, and compiles faster because there are fewer template
instantiations.