fa4cb13b52030c2e55c6bea170649ab69d75f758 test: [doc] Manually unify stale headers (MarcoFalke)
fa5f29774872d18febc0df38831a6e45f3de69cc scripted-diff: [doc] Unify stale copyright headers (MarcoFalke)
Pull request description:
Historically, the upper year range in file headers was bumped manually
or with a script.
This has many issues:
* The script is causing churn. See for example commit 306ccd4, or
drive-by first-time contributions bumping them one-by-one. (A few from
this year: https://github.com/bitcoin/bitcoin/pull/32008,
https://github.com/bitcoin/bitcoin/pull/31642,
https://github.com/bitcoin/bitcoin/pull/32963, ...)
* Some, or likely most, upper year values were wrong. Reasons for
incorrect dates could be code moves, cherry-picks, or simply bugs in
the script.
* The upper range is not needed for anything.
* Anyone who wants to find the initial file creation date, or file
history, can use `git log` or `git blame` to get more accurate
results.
* Many places are already using the `-present` suffix, with the meaning
that the upper range is omitted.
To fix all issues, this bumps the upper range of the copyright headers
to `-present`.
Further notes:
* Obviously, the yearly 4-line bump commit for the build system (c.f.
b537a2c02a9921235d1ecf8c3c7dc1836ec68131) is fine and will remain.
* For new code, the date range can be fully omitted, as it is done
already by some developers. Obviously, developers are free to pick
whatever style they want. One can list the commits for each style.
* For example, to list all commits that use `-present`:
`git log --format='%an (%ae) [%h: %s]' -S 'present The Bitcoin'`.
* Alternatively, to list all commits that use no range at all:
`git log --format='%an (%ae) [%h: %s]' -S '(c) The Bitcoin'`.
<!--
* The lower range can be wrong as well, so it could be omitted as well,
but this is left for a follow-up. A previous attempt was in
https://github.com/bitcoin/bitcoin/pull/26817.
ACKs for top commit:
l0rinc:
ACK fa4cb13b52030c2e55c6bea170649ab69d75f758
rkrux:
re-ACK fa4cb13b52030c2e55c6bea170649ab69d75f758
janb84:
ACK fa4cb13b52030c2e55c6bea170649ab69d75f758
Tree-SHA512: e5132781bdc4417d1e2922809b27ef4cf0abb37ffb68c65aab8a5391d3c917b61a18928ec2ec2c75ef5184cb79a5b8c8290d63e949220dbeab3bd2c0dfbdc4c5
The test was a bit confusing, because it just referred to the "global
log level" without explicitly specifying what it is. The level is set
though the LogSetup constructor. However, it is easier to follow unit
tests, if they are self-contained. So just set the level to Debug
explicitly here.
Also, add a new debug_3 log, to further document the intended behavior
of the unit test.
Also, replace the LogPrintLevel with the shorter and exact replacements
LogTrace and LogDebug.
As reported by hebasto in https://github.com/bitcoin/bitcoin/issues/33545,
newer libc++ versions implementing https://wg21.link/lwg3430 will no longer
implicitly convert `fs::path` objects to `std::filesystem::path` objects when
constructing `std::ifstream` and `std::ofstream` types.
This is not a problem in Unix systems since `fs::path` objects use
`std::string` as their native string type, but it causes compile errors on
Windows which use `std::wstring` as their string type, since `fstream`s can't
be constructed from `wstring`s.
Fix the windows libc++ compile errors by adding a new `fs::path::std_path()`
method and using it construct `fstream`s more portably.
Additionally, delete `fs::path`'s implicit `native_string` conversion so these
errors will not go undetected in the future, even though there is not currently
a CI job testing Windows libc++ builds.
Change time_window from 20s to 1h so Reset is not accidentally called
if the test takes a while.
Change num_lines from 1024 to 10 since LogRateLimiter is parameterized
and does not require logging 1MiB of data.
Co-Authored-By: stickies-v <stickies-v@protonmail.com>
- Add helper functions and structs to improve readability and
reusability of test code
- Make tests more specific by comparing all produced log lines with
expected log lines instead of relying on approximations or proxies.
To mitigate disk-filling attacks caused by unsafe usages of LogPrintf and
friends, we rate-limit them by passing a should_ratelimit bool that
eventually makes its way to LogPrintStr which may call
LogRateLimiter::Consume. The rate limiting is accomplished by
adding a LogRateLimiter member to BCLog::Logger which tracks source
code locations for the given logging window.
Every hour, a source location can log up to 1MiB of data. Source
locations that exceed the limit will have their logs suppressed for the
rest of the window determined by m_limiter.
This change affects the public LogPrintLevel function if called with
a level >= BCLog::Level::Info.
The UpdateTipLog function has been changed to use the private LogPrintLevel_
macro with should_ratelimit set to false. This allows UpdateTipLog to log
during IBD without hitting the rate limit.
Note that on restart, a source location that was rate limited before the
restart will be able to log until it hits the rate limit again.
Co-Authored-By: Niklas Gogge <n.goeggi@gmail.com>
Co-Authored-By: stickies-v <stickies-v@protonmail.com>
The std::source_location conveniently stores the file name, line number,
and function name of a source code location. We switch to using it instead
of the __func__ identifier and the __FILE__ and __LINE__ macros.
BufferedLog is changed to have a std::source_location member, replacing the
source_file, source_line, and logging_function members. As a result,
MemUsage no longer explicitly counts source_file or logging_function as the
std::source_location memory usage is included in the MallocUsage call.
This also changes the behavior of -logsourcelocations as std::source_location
includes the entire function signature. Because of this, the functional test
feature_config_args.py must be changed to no longer include the function
signature as the function signature can differ across platforms.
Co-Authored-By: Niklas Gogge <n.goeggi@gmail.com>
Co-Authored-By: stickies-v <stickies-v@protonmail.com>
LogRateLimiter will be used to keep track of source locations and our
current time-based logging window. It contains an unordered_map and a
m_suppressions_active bool to track source locations. The map is keyed
by std::source_location, so a custom Hash function (SourceLocationHasher)
and custom KeyEqual function (SourceLocationEqual) is provided.
SourceLocationHasher uses CSipHasher(0,0) under the hood to get a
uniform distribution.
A public Reset method is provided so that a scheduler (e.g. the
"b-scheduler" thread) can periodically reset LogRateLimiter's state when
the time window has elapsed.
The LogRateLimiter::Consume method checks if we have enough available
bytes in our rate limiting budget to log an additional string. It
returns a Status enum that denotes the rate limiting status and can
be used by the caller to emit a warning, skip logging, etc.
The Status enum has three states:
- UNSUPPRESSED (logging was successful)
- NEWLY_SUPPRESSED (logging was succcesful, next log will be suppressed)
- STILL_SUPPRESSED (logging was unsuccessful)
LogLimitStats counts the available bytes left for logging per source
location for the current logging window. It does not track actual source
locations; it is used as a value in m_source_locations.
Also exposes a SuppressionsActive() method so the logger can use
that in a later commit to prefix [*] to logs whenenever suppressions
are active.
Co-Authored-By: Niklas Gogge <n.goeggi@gmail.com>
Co-Authored-By: stickies-v <stickies-v@protonmail.com>
This test checks m_log_sourcelocations, not the formatting with format
specifiers. Those are tested in logging_LogPrintMacros below.
So just use LogPrintStr directly in this test, without format specifiers
and format args.
This is required for a follow-up commit.
LogPrint*ThreadNames is redundant with LogWith(out)ThreadNames, because
they all measure toggling the thread names (and check that it has no
effect on performance).
This also allows to remove unused and deprecated macros.
There are no changes to behavior. Changes in this commit are all additions, and
are easiest to review using "git diff -U0 --word-diff-regex=." options.
Motivation for this change is to keep util functions with really generic names
like "Split" and "Join" out of the global namespace so it is easier to see
where these functions are defined, and so they don't interfere with function
overloading, especially since the util library is a dependency of the kernel
library and intended to be used with external code.
These provide simple and clear ways to write the most common logging
operations:
LogInfo("msg");
LogDebug(BCLog::LogFlags::NET, "msg");
LogError("msg");
LogWarning("msg");
LogTrace(BCLog::LogFlags::NET, "msg");
For cases where the level cannot be hardcoded, LogPrintLevel(category,
level, ...) remains available.
for verbose log messages for development or debugging only, as bitcoind may run
more slowly, that are more granular/frequent than the Debug log level, i.e. for
very high-frequency, low-level messages to be logged distinctly from
higher-level, less-frequent debug logging that could still be usable in production.
An example would be to log higher-level peer events (connection, disconnection,
misbehavior, eviction) as Debug, versus Trace for low-level, high-volume p2p
messages in the BCLog::NET category. This will enable the user to log only the
former without the latter, in order to focus on high-level peer management events.
With respect to the name, "trace" is suggested as the most granular level
in resources like the following:
- https://sematext.com/blog/logging-levels
- https://howtodoinjava.com/log4j2/logging-levels
Update the test framework and add test coverage.
This is more consistent with the other functions, as well as with the
logging output itself. If we want to make this change, we should do it
before it's all over the place.