0.99.[012] have a bug https://github.com/JelteF/derive_more/issues/114
which makes the Deref derive for bridgedesc::StateGuard not work
and therefore breaks minimal-versions CI.
It seems simpler to require the newer version everywhere.
The explicit list of variant names, that needs to be kept in sync, and
is a test failure semver break hazard, is now gone.
All the necessary code is now generated automatically, and cannot be
wrong.
I want this because I find myself wanting to add a second
implementation of FlagEvent, for another type.
The feature we want is `#[doc = include_str!("README.md")]`, which is
stable since 1.54 and our MSRV is now 1.56.
This commit is precisely the result of the following Perl rune:
perl -i~ -0777 -pe 's{(^//!(?!.*\@\@).*\n)+}{#![doc = include_str!("../README.md")]\n}m' crates/*/src/lib.rs
(Since the APIs for the `Schedule::sleep*` functions changed, this
is a breaking change in tor-rtcompat. Therefore, the Runtime trait
in tor-rtcompat is now a different trait. Therefore, anything that
uses the Runtime trait in its APIs has also broken.)
This fixes an busy-loop.
When the last `TaskHandle` on a `TaskSchedule` is dropped, the
schedule is permanently canceled: whatever operation it was
scheduling should no longer be performed. But our code was broken:
the `sleep()` and `sleep_until_wallclock()` functions don't verify
whether the handles are dropped or not.
This breakage caused an CPU-eating busy-loop in
`sleep_until_wallclock`.
With this patch, we now return a `Result<(), SleepError>` from these
functions.
Fixes#572.
Because we want to work more on ensuring that our semver stability
story is solid, we are _not_ bumping arti-client to 1.0.0 right now.
Here are the bumps we _are_ doing. Crates with "minor" bumps have
had API breaks; crates with "patch" bumps have had new APIs added.
Note that `tor-congestion` is not bumped here: it's a new crate, and
hasn't been published before.
```
tor-basic-utils minor
fs-mistrust minor
tor-config minor
tor-rtcompat minor
tor-rtmock minor
tor-llcrypto patch
tor-bytes patch
tor-linkspec minor
tor-cell minor
tor-proto minor
tor-netdoc patch
tor-netdir minor
tor-persist patch
tor-chanmgr minor
tor-guardmgr minor
tor-circmgr minor
tor-dirmgr minor
arti-client minor
arti-hyper minor
arti major
arti-bench minor
arti-testing minor
```
Do _not_ bump the dependency versions on crates that have had no
changes since arti 0.0.5, since those crates do not depend on the
new APIs.
```
cargo set-version -p tor-basic-utils --bump patch
cargo set-version -p tor-llcrypto --bump patch
git restore crates/tor-checkable
git restore crates/tor-consdiff
git restore crates/tor-rtmock
```
This performs the transitive closure of the last operation:
everything that depends on a crate with a breaking change gets the
version which it depends on bumped.
```
cargo set-version -p tor-proto --bump minor
cargo set-version -p tor-netdoc --bump minor
cargo set-version -p arti-hyper --bump minor
cargo set-version -p arti-bench --bump minor
cargo set-version -p arti-testing --bump minor
cargo set-version -p tor-config --bump minor
```
Over the years we've found that most callers who want a netdir want
what C Tor calls a "reasonably live" network directory: One that is
not expired by too much, or too far in the future. But a few want a
_strictly_ live directory: one that says it is valid now, with no
tolerances. And a few want _any_ directory, no matter how expired
it is.
This commit adds net methods to NetDirProvider to provide these
directories. I think that most use cases will want to explicitly
think about what kind of directory they want, so I've made `netdir`
the simplest method. I might remove `timely_netdir` by the end of
this branch; see TODO comments.
Part of #518.
This name is more accurate because we aren't only dealing with
clock skew here: we're also trying to tolerate the case where the
authorities fail to reach consensus for a while.
According to doc/Errors.md, and in keeping with current best
practices, we should not include display an error's `source()` as
part of that error's display method. Instead, we should let the
caller decide to call source() and display that error in turn.
Part of #323.
The `DirBootstrapStatus` type now exposes a blockage() method to
return an `Option<DirBlockage>`.
The blockage types reported are more low-level than I'd like, but
they are IMO good enough for now: we'll want to get experience with
actual vs hypothetical problems before we refine them.
A "reset" happens whenever we have to start a download attempt over
-- either because we ran out of retries, or we found something wrong
with the consensus after fetching certificates.
An "error" happens when we have a recoverable error from one or more
directory sources.
A "stall" happens whenever a round of downloads or cache loads leads
to no change in the status.
We don't yet use this as part of our status reporting.
Previously we used the "if-modified-since" time associated with the
consensus download, and/or the "valid-after" time in the consensus
attempt, to put multiple attempts into sequence, and to tell one
from another. But that approach was always a kludge, and will soon
get more unreliable as the DirStatus logic gets a bit more complex.
With this commit, we change separate download attempts to be
identified with an AttemptId that increments whenever we decide to
get a different directory from the one we have. IMO this new code
is _much_ cleaner.
This is about to become only a _part_ of what defines a DirStatus: a
DirStatus will also include a reset count, and some kind of info
about how long we've gone without progress.
This change also means that we need to create the handle and scheduler
earlier in the process of creating the DirMgr. If we don't, we won't
have a way to manage the task before bootstrap() returns.
The "full" feature is a catch-all for all features, _except_:
* Those that select a particular implementation (like
tor-llcrypto/with-openssl) or build flag (like "static")
* Those that are experimental or unstable (like "experimental-api")
* Those that are testing-only.
This change (not yet exposed as an API) will let the TorClient have
a `TaskHandle` corresponding to the directory task, letting it
make the directory task dormant as needed.
This only affects uses of thread_rng(), and affects them all more or
less indiscriminately. One test does not work with
ARTI_TEST_PRNG=deterministic; the next commit will fix it.
This commit was made by reverting the previous commit, then
re-running the script I used to generate it. In theory there should
be no semantic changes: only changes due to improved formatting from
cargo edit.
I followed the following procedure to make these changes:
* I used maint/changed_crates to find out which crates had changed
since 0.3.0.
* I used grep and maint/list_crates to sort those crates in
topological (dependency) order.
* I looked through semver_status to find which crates were listed as
having semver-relevant changes (new APIs and breaking changes).
* I scanned through the git logs of the crates with no
semver-relevant changes listed to confirm that, indeed, they had
no changes. For those crates, I incremented their patch-level
version _without_ changing the version that other crates depend on.
* I scanned through the git logs of the crates with no
semver-relevant changes listed to confirm that, indeed, they had
no obvious breaking changes.
* I treated all crates that depend on `arti` and/or `arti-client` as
having breaking changes.
* I identified crates that depend on crates that have changed, even
if they have not changed themselves, and identified them as having
a non-breaking change.
* For all of the crates, I used `cargo set-version -p $CRATE --bump
$STATUS` (where `STATUS` is `patch` or `minor`) to update the
versions, and the depended-upon versions.
Now that the relevant functions now report changed/not-changed
status via a boolean out-parameter (see !527), there's no reason to
have a separate NoChanged error case.
Closes#484.
Previously in !511 I had introduced a bug where, if there was an
error more serious than "no change", that error would keep us from
noticing that we had no change, and we'd loop until the safety
counter ran out. Then we'd panic.
This commit fixes the bug by reintroducing the `changed` boolean --
this time as an outparam for the add_from_* methods.
Fixes#482.
We no longer have separate return paths for recoverable and fatal
errors; instead, they are merged, and distinguished based on
recovery actions.
Since it is now possible for download() to give an error that should
_not_ destroy the previous state, it takes `&mut Box<dyn DirState>`.
This change unfortunately means that we can no longer call `state =
state.advance()`, but instead have to do some mem::swap junk with
poisoned values. Any better solution would be a good thing.
Additionally, the reset() and advance() methods can no longer fail.
There is still a separate return path for reset-triggering errors;
I'm about to fix that.
Previously, we did this in `advance()`, but that wasn't so great: it
meant that we could fail in the advance() code, whereas the calls to
`advance()` treated errors as fatal.
This treats failed verification as a blocking error that requires a
reset.
Fixes one aspect of #439.
Previously DocSource would tell you whether the document was from
a local store or a cache server, but it wouldn't tell you _which_
server it came from.
This change required adding DocSource as an argument to
DirState::add_from_download.
This was anomalous, in that it contains &'static str, rather than a
proper nested error (eg a config::ConfigError, maybe).
But in fact it tursn out it is now not constructed. The last
construction site was removed a long time ago in
Use derive_builder for Authority and FallbackDir.
With this API we can now stop consensus download attempts early if
any consensus that the directory cache gave us would be necessarily
too far in the future or in the past.
This saves wasted bandwidth for clients with skewed clocks.
Closes#466.
If we're happy with a directory from 3 days ago, we should say
"if-modified-since 3 days ago".
This patch is larger than I'd like, since I had to add &DirMgrConfig
as an argument to the functions that make a consensus request.
Closes#467.
Since we want to be willing to use older consensuses, we don't
necessarily want to reset a download just because the consensus is
expired.
This new behavior isn't ideal either; I've added a TODO that relates
to #433.
Related of #412
This new section describes how much variance we accept when it comes
to expired and not-yet-valid directory documents. (Currently, the
only ones where this matters for are consensus documents and
authority certificates.) A document that is invalid by no more than
these tolerances is not _live_, but it can still be used.
These tolerances serve two purposes:
* First, they allow clients to run with a little more clock skew
than they would tolerate otherwise.
* Second, they allow clients to survive the situation where the
authorities are unable to reach a consensus for a day or two.
Compare with Tor's REASONABLY_LIVE_TIME and NETWORKSTATUS_ALLOW_SKEW
constants; also compare with proposal 212.
Closes#412.
- Some FIXMEs got removed or amended.
- AddMicrodescs now yields a mutable reference, so we can use .drain()
and reuse the allocation.
- Some panics were downgraded to debug_asserts.
- We don't want to inadvertently replace our netdir with one that's
actually older, so detect and error on this condition.
- Also, print a debug line when we get a new netdir without enough
guards.
- (An unrelated TODO was also added.)
- Taking a previous netdir directly and keeping it around before we need
it is a bit of a waste of memory, and also doesn't mesh well with how
SharedMutArc works.
- To remedy this, introduce a new trait `PreviousNetDir` and have the
state machines take that instead. (I was a bit tempted to just pass in
the SharedMutArc directly. Maybe I should've done that.)
- GetMicrodescsState now uses the NetDirChange API to propagate netdir
changes, instead of modifying the netdir directly.
- PendingNetDir was refactored in order to support this use case.
- As a result, the netdir-related methods in WriteNetDir can be removed,
leaving only the DirFilter for now.
- add_from_cache() no longer takes a store, because nothing uses it.
- (bodge: apply_netdir_changes() was put in a few places missed
previously)
- The new DirState::get_netdir_change() API lets the state machine
export a NetDirChange: a request to either replace the current netdir,
or add microdescs to it.
- bootstrap.rs now consumes this new API, even though nothing implements
it yet.
- This will let us implement GetMicrodescsState without having to
directly mutate the netdir. The calling code also handles checking the
netdir against the circmgr for sufficiency, and updating the consensus
metadata in the store, meaning the revised GetMicrodescsState will not
have to perform these tasks.
- The additional parameters passed to GetConsensusState are now passed
through all the states, and used as well.
- WriteNetDir doesn't have a now() or config() method any more, since
the states now get this from the runtime or the config parameters.
- This required modifying the tests to make a mocked runtime and custom
config directly, instead of using DirRcv for this purpose.
- Additionally, because we don't have to upgrade a weak reference for
DirState::dl_config(), that function no longer wraps its return value
in Result.
- (A bunch of the FIXMEs from the previous commit that introduced the
additional parameters have now been rectified as a result.)
- GetConsensusState::new now takes a set of parameters matching what it
actually needs, instead of just taking a writedir. (It still *does*
take a writedir, and indeed still uses it for basically everything,
but that will eventually go away.)
- Its call sites were updated.
- Some tests now need to take a runtime, and got indented a lot as a
result.
- Resetting was made non-functional, because we need to thread through
the parameters passed to GetConsensusState to all of the other
states, too. This will happen in a later commit.
- Given that this is effectively an implementation detail, it doesn't
really make sense to have it be in the crate root...
- (also, we're going to change it a bunch now)
- fetch_single now takes what it needs, instead of an Arc<DirMgr<R>>.
- This required refactoring the CANNED_RESPONSE mechanism, given the
test would otherwise fail due to not having a CircMgr to pass to
fetch_single.
- query_into_requests is now called make_requests_for_documents, and
does the &[DocId] -> DocQuery conversion internally instead.
- DirMgr::make_consensus_request and DirMgr::query_into_requests are now
gone. The tests use the new functions, as does fetch_multiple.
- There's no good reason these functions needed to be part of the
dirmgr, apart from needing a runtime and a store.
- However, we can just add those as arguments and copy them over. This
commit does that.
- Function renamed & docs tidied up a bit
- Function signature now takes what it needs (immutable &dyn Store
instead of mutex, slice instead of Vec) and nothing more
- DocQuery::load_documents_into was also renamed
DocQuery::load_from_store_into and given similar treatment
Annoyingly, Rust doesn't automatically generate this sort of `impl` for
you, and I'd like to reduce the usage of Mutex<DynStore> everywhere else
in favour of either &dyn Store or &mut dyn Store.
(This is for two reasons: firstly, we might have a Store implementation
that doesn't use a mutex as above, or similar refactors; secondly,
passing the raw trait object reference lets us encode mutability into
the function signature, which I believe is quite valuable.)
For reference, the git source for this crate (and the others in its
workspace) currently lives in my personal github account (ijackson).
If this fork turns out to be long-lived and gains features and/or
users, it would be good to move it to a gitlab somewhere.
I have granted Nick crate ownership on the crates.io system.
* Builders additionally derive: Debug, Serialize, Deserialize.
* Validated structs no longer derive: Serialize, Deserialize
and all related attributes deleted.
* As a consequence, all the `#[serde(deny_unknown_fields)]`
are gone. That means that right now unknown fields are totally
ignored. This is good for compatibility but poor for useability.
Doing something better here is arti#417, in progress.
* As a consequence, delete tor_dirmgr::retry::default_parallelism.
(The default value was already duplicated into a builder attr.)
And drop the ad-hoc orport() method. This brings FallbackDir's
orports field in line with our list builder API.
The general semver note in "configuation" seems to cover most of this.
This type was returned by the public DownloadSchedule::builder
function. But the only thing that seems to have noticed that the type
name itself wasn't exported, was rustdoc. Hmmm.
The new API is (roughly) as discussed in
https://gitlab.torproject.org/tpo/core/arti/-/issues/451
This is quite a large commit and it is not convenient to split it up.
It contains the following changes:
* Redo the list builder and accessor macros implemnetation,
including docs and tests.
* Change uses of define_list_config_builder. In each case:
- Move the docs about the default value to the containing field.
- Remove the other docs (which were just recapitulations, and
are now not needed since the ListBuilder is no longer public).
- Rewmove or replace `pub` in the define_list_builder_helper call,
so that the builder is no longer public.
- Change the main macro call site to use define_list_builder_helper.
- Add a call to define_list_builder_accessors.
* Make the module `list_builder` pub so that we have somewhere to
put the overview documentation.
* Consequential changes:
- Change `outer.inner().replace(X)` to `outer.set_inner(X)`
- Consequential changes to imports (`use` statements).
Previously this field was differently named to its serde and to its
accessors. We are about to introduce a macro_rules macro which will
provide list accessors and we don't want that macro to have a field
renaming feature.
So stop renaming the field.
The `[patch]` approach causes the tree not to build when used as a
dependency, unless the `[patch]` is replicated into the depending
project.
Instead, replace our `derive_builer =` dependencies with a reference
to a specific git commit:
perl -i~ -pe 'next unless m/^derive_builder/; s#"(0\.11\.2)"#{ version = "$1", git = "https://github.com/ijackson/rust-derive-builder", rev = "ba0c1a5311bd9f93ddf5f5b8ec2a5f6f03b22fbe" }#' crates/*/Cargo.toml
Note that the commitid has changed. This is because derive_builder is
in fact a workspace of 4 crates. 3 of them are of interest to arti
itself (the 4th exists only for testing). So the same "add git
revision" treatment had to be done to the `derive_builder` and
`derive_builder_macro` crates. Each dependency edge involves a new
commit in the derive_builder workspace, since we can't create a git
commit containing its own commitid. (We want to use commits, rather
than a branch, so that what we are depending on is actually properly
defined, and not subject to the whims of my personal github
namespace.)
There are no actual code changes in derive_builder.
This is actually a number of *attempts* not a number of *retries*.
The setter method was already called "attempts".
This chnages the deserialisation of the config.
Use sub_builder. We must do something special for defaults.
This involves moving the actual default values for retry_bootstrap and
retry_microdescs into config.rs, since they need to access the fields
of the un-built version of the structure. (An alternative would be to
generate "weak setters" which do not override previous settings, but
derive_builder does not offer to generate them and that seems
overkill.)
Instead, everyone should use DownloadScheduleBuilder.
The new() method would in any case be useless in a moment, since we're
going to embed DownloadScheduleBuilder in the NetworkConfig, not
DownloadSchedule.
The call sites in the tests are all about to change again.
The current behaviour is to treat 0 as indicating "use the default",
which is quite strange. We are going to get rid of that.
The new way will be to reject zero, during
DownloadScheduleBuilder::build, Add a test case for that.
And add an imprecation in define_list_config_builder's doc comment do
do so in future for other invocations of the macro.
Add add the missing full stops.
This is an automated change made with a perl one-liner and verified
with grep -L and grep -l.
Some warnings are introduced with this change; they will be removed
in subsequent commits.
See arti#208 for older discussion on this issue.
NetworkConfigBuilder needs to not contain any validated structs, so
that its serde does not expose the validated details.
AuthorityListBuilder is what ought to go here - and it contains
Vec<AuthorityBuilder>, not Vec<Authority>. As a consequence, many
places now deal with AuthorityBuilder, rather than Authority.
Now the network fallbacks configuration wants to Deserialize
a Vec<FallbackDirBuilder>, rather than validated Vec<FallbackDir>.
Methods on FallbackListBuilder are as per
https://gitlab.torproject.org/tpo/core/arti/-/merge_requests/462#note_2797697
mutatis mutandi for the fact that this struct has only fallbacks in it.
This is where the FallbackList type is. We are going to want to
provide a builder too, which ought to impl Default.
This means that the default value for the type must be next to the
type. In any case, it was anomalous that it wasn't.
This commit is pure code motion.
This commitid is the current head of my MR branch
https://github.com/colin-kiegel/rust-derive-builder/pull/253https://github.com/ijackson/rust-derive-builder/tree/field-builder
Using the commitid prevents surprises if that branch is updated.
We will require this newer version of derive_builder. The version
will need to be bumped again later, assuming the upstream MR is merged
and upstream do a release containing the needed changes.
We will need the new version of not only `derive_builder_core` (the
main macro implementation) but also`derive_builder` for a new error
type.
Doing this will make us treat caches that send us these objects as
not-working, and close circuits to them instead of trying over and
over.
The case where we add a document from the cache requires special
handling: it isn't actually a error to find an expired document in
our cache (unless the passage of time itself is erroneous, which is
a debatable proposition at best).
Fixes#431.
Not all of these strictly need to be bumped to 0.2.0; many could go
to 0.1.1 instead. But since everything at the tor-rtcompat and
higher layers has had breaking API changes, it seems not so useful
to distinguish. (It seems unlikely that anybody at this stage is
depending on e.g. tor-protover but not arti-client.)
The previous algorithm had two flaws:
* It would wait even after the final attempt, when there were no
more retries to do.
* It would fail to wait between attempts if an error occurred.
This refactoring fixes both of these issues, and adds some comments.