There are many cases where we assume message processing is
single-threaded in order for how we access node-related memory to be
safe. Add an explicit mutex that we can use to document this, which allows
the compiler to catch any cases where we try to access that memory from
other threads and break that assumption.
`Misbehaving` has several coding related issues (ignoring the conceptual
issues here for now):
* It is public, but it is not supposed to be called from outside of
net_processing. Fix that by making it private and creating a public
`UnitTestMisbehaving` method for unit testing only.
* It doesn't do anything if a `nullptr` is passed. It would be less
confusing to just skip the call instead. Fix that by passing `Peer&`
to `Misbehaving()`.
* It calls `GetPeerRef`, causing `!m_peer_mutex` lock annotations to be
propagated. This is harmless, but verbose. Fix it by removing the no
longer needed call to `GetPeerRef` and the no longer needed lock
annotations.
703b1e612a4bd4521e20ae21eb8fb7c19f4ef942 Close minor startup race between main and scheduler threads (Larry Ruane)
Pull request description:
This is a low-priority bug fix. The scheduler thread runs `CheckForStaleTipAndEvictPeers()` every 45 seconds (EXTRA_PEER_CHECK_INTERVAL). If its first run happens before the active chain is set up (`CChain::SetTip()`), `bitcoind` will assert:
```
(...)
2021-07-28T22:16:49Z init message: Loading block index…
bitcoind: validation.cpp:4968: CChainState& ChainstateManager::ActiveChainstate() const: Assertion `m_active_chainstate' failed.
Aborted (core dumped)
```
I ran into this while using the debugger to investigate an unrelated problem. Single-stepping through threads with a debugger can cause the relative thread execution timing to be very different than usual. I don't think any automated tests are needed for this PR. I'll give reproduction steps in the next PR comment.
ACKs for top commit:
MarcoFalke:
cr ACK 703b1e612a4bd4521e20ae21eb8fb7c19f4ef942
tryphe:
tested ACK 703b1e612a4bd4521e20ae21eb8fb7c19f4ef942
0xB10C:
ACK 703b1e612a4bd4521e20ae21eb8fb7c19f4ef942
glozow:
code review ACK 703b1e612a4bd4521e20ae21eb8fb7c19f4ef942 - it makes sense to me to start peerman's background tasks here, after `chainstate->LoadChainTip()` and `node.connman->Start()` have been called.
Tree-SHA512: 9316ad768cba3b171f62e2eb400e3790af66c47d1886d7965edb38d9710fc8c8f8e4fb38232811c9346732ce311d39f740c5c2aaf5f6ca390ddc48c51a8d633b
Don't schedule class PeerManagerImpl's background tasks from its
constructor, but instead do that from a separate method,
StartScheduledTasks(), that can be called later at the end of startup,
after other things, such as the active chain, are initialzed.
680eb56d828ce358b4e000c140f5b247ff5e6179 [net processing] Don't pass CConnman to RelayTransactions (John Newbery)
a38a4e8f039dfabfd9435f3a63f1a9b56de086d6 [net processing] Move RelayTransaction into PeerManager (John Newbery)
Pull request description:
This is the first part of #21160. It moves the RelayTransaction() function to be a member function of the PeerManager class. This is required in order to move the transaction inventory data into the Peer object, since Peer objects are only accessible from within PeerManager.
ACKs for top commit:
ajtowns:
ACK 680eb56d828ce358b4e000c140f5b247ff5e6179
Tree-SHA512: 8c93491a4392b6369bb7f090de326a63cd62a088de59026e202f226f64ded50a0cf1a95ed703328860f02a9d2f64d3a87ca1bca9a6075b978bd111d384766235
0eaea66e8bfdfb23ff86c0f0924c2d75f5aca75f Make tx relay data structure use std::chrono types (Pieter Wuille)
55e82881a1503bff146970856c1474a6ea659c94 Make all Poisson delays use std::chrono types (Pieter Wuille)
c733ac4d8a597b1001555b142f488ed9fbecc405 Convert block/header sync timeouts to std::chrono types (Pieter Wuille)
4d98b401fbd821700f7a792b0a4cb52c9b71bc9f Change all ping times to std::chrono types (Pieter Wuille)
Pull request description:
(Picking up #20044. Rebased against master.)
This changes various uses of integers to represent timestamps and durations to `std::chrono` duration types with type-safe conversions, getting rid of various `.count()`, constructors, and conversion factors.
ACKs for top commit:
jnewbery:
utACK 0eaea66e8bfdfb23ff86c0f0924c2d75f5aca75f
vasild:
ACK 0eaea66e8bfdfb23ff86c0f0924c2d75f5aca75f
MarcoFalke:
re-ACK 0eaea66e8bfdfb23ff86c0f0924c2d75f5aca75f, only changes: minor rename, using C++11 member initializer, using 2min chrono literal, rebase 🤚
ajtowns:
utACK 0eaea66e8bfdfb23ff86c0f0924c2d75f5aca75f
Tree-SHA512: 2dbd8d53bf82e98f9b4611e61dc14c448e8957d1a02575b837fadfd59f80e98614d0ccf890fc351f960ade76a6fb8051b282e252e81675a8ee753dba8b1d7f57
We don't mark RelayTransaction as const. Even though it doesn't mutate
PeerManagerImpl state, it _is_ mutating the internal state of a CNode
object, by updating setInventoryTxToSend. In a subsequent commit, that
field will be moved to the Peer object, which is owned by
PeerMangerImpl.
This requires PeerManagerImpl::ReattemptInitialBroadcast() to no longer
be const.
To make eclipse attacks more difficult, regularly initiate outbound connections
and stay connected long enough to sync headers and potentially learn of new
blocks. If we learn a new block, rotate out an existing block-relay peer in
favor of the new peer.
This augments the existing outbound peer rotation that exists -- currently we
make new full-relay connections when our tip is stale, which we disconnect
after waiting a small time to see if we learn a new block. As block-relay
connections use minimal bandwidth, we can make these connections regularly and
not just when our tip is stale.
Like feeler connections, these connections are not aggressive; whenever our
timer fires (once every 5 minutes on average), we'll try to initiate a new
block-relay connection as described, but if we fail to connect we just wait for
our timer to fire again before repeating with a new peer.