This commitid is the current head of my MR branch
https://github.com/colin-kiegel/rust-derive-builder/pull/253
https://github.com/ijackson/rust-derive-builder/tree/field-builder
Using the commitid prevents surprises if that branch is updated.
We will require this newer version of derive_builder. The version
will need to be bumped again later, assuming the upstream MR is merged
and upstream do a release containing the needed changes.
We will need the new version of not only `derive_builder_core` (the
main macro implementation) but also `derive_builder` for a new error
type.
Rather than running preemptive circuit construction every 10
seconds, we change it to back off when it is "failing". (We define
"failing" as creating no new circuits, and as giving at least one
error.)
This change means that we'll have one less reason to hammer the
network when our connectivity has failed for some reason.
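
A minimal sketch of the idea, not the code in this commit: the type
names, the doubling schedule, and the cap are all assumptions made
for illustration. Only the definition of "failing" comes from the
text above.

```rust
use std::time::Duration;

/// Hypothetical summary of one round of preemptive circuit construction.
#[derive(Clone, Copy, Debug)]
pub struct RoundOutcome {
    pub circuits_built: usize,
    pub errors: usize,
}

impl RoundOutcome {
    /// "Failing" as defined above: no new circuits, and at least one error.
    pub fn is_failing(&self) -> bool {
        self.circuits_built == 0 && self.errors >= 1
    }
}

/// Assumed schedule: double the delay while failing (up to `max`),
/// and return to `base` as soon as a round is not failing.
pub struct PreemptiveBackoff {
    base: Duration,
    max: Duration,
    current: Duration,
}

impl PreemptiveBackoff {
    pub fn new(base: Duration, max: Duration) -> Self {
        PreemptiveBackoff { base, max, current: base }
    }

    /// Return how long to wait before the next round.
    pub fn next_delay(&mut self, outcome: RoundOutcome) -> Duration {
        if outcome.is_failing() {
            self.current = self.current.saturating_mul(2).min(self.max);
        } else {
            self.current = self.base;
        }
        self.current
    }
}
```

A round that builds nothing but also reports no errors is not
"failing" under this definition, so it does not lengthen the delay.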
Closes #437.
Part of #329.
This feature is similar to ChanProvenance from ChanMgr, except that
we don't yet need to report it outside the crate. I'm going to use
it to distinguish newly created circuits from existing circuits in
the preemptive circuit builder.
This lets us say that the UsageMismatch cases in some parts of the
code reflect a programming error (RetryTime::Never), whereas in
other cases it reflects another circuit request getting to the
circuit first (RetryTime::Immediate).
Previously we did not distinguish errors that came from pending
circuits from errors that came from the circuits we were
building. We also reported errors as coming from "Left" or "Right",
instead of a more reasonable description.
We were treating restrict_mut() failures as internal errors, and
using internal errors to represent them. But in fact, these
failures are entirely possible based on timing. Here's how it
happens:
* Two different circuit requests arrive at the same time, and both
notice a pending circuit that they could use.
* The pending circuit completes; both pending requests are notified.
* The first request calls restrict_mut(), and restricts the circuit
in such a way that the second couldn't use it.
* The second request calls restrict_mut(), and gets a failure.
Because of this issue, we treat these errors as transient failures
and just wait for another circuit.
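
The race can be sketched as follows. This is an illustration under
assumptions, not the crate's code: `CircuitSpec`, `Claim`,
`claim_pending`, and the use of a bare `u64` as an isolation value
are hypothetical stand-ins, and only the name restrict_mut() comes
from the text above.

```rust
/// Returned when a circuit cannot be restricted to suit a request.
#[derive(Debug, PartialEq, Eq)]
pub struct RestrictionFailed;

/// Hypothetical stand-in for what a completed circuit may be used for.
#[derive(Debug, Default)]
pub struct CircuitSpec {
    /// `None` until the first request claims the circuit.
    isolation: Option<u64>,
}

impl CircuitSpec {
    /// Narrow this circuit so that it only serves `wanted`.
    pub fn restrict_mut(&mut self, wanted: u64) -> Result<(), RestrictionFailed> {
        match self.isolation {
            None => {
                self.isolation = Some(wanted);
                Ok(())
            }
            Some(existing) if existing == wanted => Ok(()),
            Some(_) => Err(RestrictionFailed),
        }
    }
}

/// What a request does after a pending circuit completes.
#[derive(Debug, PartialEq, Eq)]
pub enum Claim {
    UseCircuit,
    WaitForAnother,
}

/// Treat a failed restriction as transient, not as an internal error.
pub fn claim_pending(spec: &mut CircuitSpec, wanted: u64) -> Claim {
    match spec.restrict_mut(wanted) {
        Ok(()) => Claim::UseCircuit,
        Err(RestrictionFailed) => Claim::WaitForAnother,
    }
}
```

If two requests with incompatible isolation are both notified about
the same circuit, the first call returns `Claim::UseCircuit` and the
second returns `Claim::WaitForAnother`.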
Closes #427.
(This is not a breaking API change, since `AbstractSpec` is a
crate-private trait.)
Not all of these strictly need to be bumped to 0.2.0; many could go
to 0.1.1 instead. But since everything at the tor-rtcompat and
higher layers has had breaking API changes, it seems not so useful
to distinguish. (It seems unlikely that anybody at this stage is
depending on e.g. tor-protover but not arti-client.)
The older default seems (experimentally) to be ridiculously high.
Generally, if we can't build a circuit within a handful of attempts,
that circuit has already timed out... unless there is a fast-failure
condition, in which case we're just hammering the network (or our
view of it).
Found with `arti-testing` for #329.
Previously, if we had launch_parallelism > 1, and we were willing to
retry building a circuit max_retries times, then we'd launch up to
max_retries * launch_parallelism circuits before giving up. Ouch!
With this patch, we try to keep the total number of circuits
planned and attempted to the actual max_retries limit.
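
The arithmetic can be sketched like this. The helper names are
hypothetical and the batching loop is an assumption about how
launches are grouped; only `max_retries` and `launch_parallelism`
come from the text above.

```rust
/// How many circuits to launch in the next batch, given how many
/// have already been planned or attempted for this request.
pub fn next_batch_size(
    max_retries: usize,
    launch_parallelism: usize,
    already_attempted: usize,
) -> usize {
    let remaining = max_retries.saturating_sub(already_attempted);
    launch_parallelism.min(remaining)
}

/// Worst case: every batch fails, so we keep launching until the
/// budget is used up. Returns the total number of circuits launched.
pub fn worst_case_launches(max_retries: usize, launch_parallelism: usize) -> usize {
    let mut attempted = 0;
    loop {
        let batch = next_batch_size(max_retries, launch_parallelism, attempted);
        if batch == 0 {
            break;
        }
        attempted += batch;
    }
    attempted
}
```

With max_retries = 3 and launch_parallelism = 2, this launches a
batch of 2 and then a batch of 1, for 3 circuits in total, where the
old behavior could launch up to 3 * 2 = 6.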
Part of #329; found with arti-testing.
The FirstHopId type now records an enum that stores whether the hop
is a guard or a fallback. This change addresses concerns about
remembering to check the type or source of an Id before passing it
down to the FallbackState or GuardSet.
Making this change required an API change, so that dirmgr can
report success/failure status without actually knowing whether it's
using a fallback or a guard.
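
A rough sketch of the shape of this change. Apart from the names
FirstHopId, GuardSet, and FallbackState, everything here (the
`RelayId` type, the fields, and the methods) is an assumption made
for illustration.

```rust
/// Hypothetical stand-in for a relay identity.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct RelayId(pub [u8; 32]);

/// The identity of a first hop, tagged with where it came from.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub enum FirstHopId {
    Guard(RelayId),
    Fallback(RelayId),
}

/// Simplified state holders; here they only remember failures.
#[derive(Default)]
pub struct GuardSet {
    failed: Vec<RelayId>,
}

#[derive(Default)]
pub struct FallbackState {
    failed: Vec<RelayId>,
}

#[derive(Default)]
pub struct FirstHopStatus {
    guards: GuardSet,
    fallbacks: FallbackState,
}

impl FirstHopStatus {
    /// Callers report a failure without knowing the kind of hop;
    /// the enum routes the report to the right place.
    pub fn note_failure(&mut self, id: &FirstHopId) {
        match id {
            FirstHopId::Guard(relay) => self.guards.failed.push(relay.clone()),
            FirstHopId::Fallback(relay) => self.fallbacks.failed.push(relay.clone()),
        }
    }

    pub fn guard_failures(&self) -> usize {
        self.guards.failed.len()
    }

    pub fn fallback_failures(&self) -> usize {
        self.fallbacks.failed.len()
    }
}
```

Because the variant travels with the Id, there is no separate "is
this a guard?" check for the caller to forget.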
The code here uses a new iterator type, since I couldn't find one of
these on crates.io. I tried writing the code without it, but it was
harder to follow and test.
We do this by creating a new FallbackSet type that includes status
information, and updating the GuardMgr APIs to record success and
failure about it when appropriate. We can use this to mark
FallbackDirs retriable (or not).
With this change, FallbackDir is now stored internally as a Guard in
the GuardMgr crate. That's fine: the FallbackDir type really only
matters for configuration.
If we're building a path with the guard manager involved, we now ask
the guard manager to pick our first hop no matter what. We only
pick from the fallback list ourselves if we're using the API with no
guard manager.
This causes some follow-on changes where we have to remember an
OwnedChanTarget object in a TorPath we've built, and where we gain
the ability to say we're building a path "from nothing extra at
all." Those are all internal to the crate, though.
Closes #220, by making sure that we use our guards to get a fresh
netdir (if we can) before falling back to any fallbacks, even if our
consensus is old.
Compilation should be fixed in the next commit.
The guard manager is responsible for handing out the first hops of
tor circuits, keeping track of their successes and failures, and
remembering their states. Given that, it makes sense to store this
information here. It is not yet used; I'll be fixing that in
upcoming commits.
Arguably, this information no longer belongs in the directory
manager: I've added a todo about moving it.
This commit will break compilation on its own in a couple of places;
subsequent commits will fix it up.
This is the logical place for it, I think: the GuardMgr's job is to
pick the first hop for a circuit depending on remembered status for
possible first hops. Making this change will let us streamline the
code that interacts with these objects.
The various background daemon tasks that `arti-client` used to spawn are
now handled inside their respective crates instead, with functions
provided to spawn them that return `TaskHandle`s.
This required introducing a new trait, `NetDirProvider`, which steals
some functionality from the `DirProvider` trait to enable `tor-circmgr`
to depend on it (`tor-circmgr` is a dependency of `tor-dirmgr`, so it
can't depend on `DirProvider` directly).
While we're at it, we also make some of the tasks wait for events from
the `NetDirProvider` instead of sleeping, slightly increasing
efficiency.
Some error types indicate that the guard has failed as a dircache.
We should treat these errors as signs to close the circuit, and to
mark the guard as having failed.
We already have the ability to get peer information from ChanMgr
errors, and therefore from any RetryErrors that contain ChanMgr
errors.
This commit adds optional peer information to tor-proto errors, and
a function to expose whatever peer information is available.
It'll soon be more convenient to pass in FallbackDirs as a slice of
references, rather than just a slice of FallbackDirs: I'm going to
be changing how we handle these in tor-dirmgr.
If all guards are down and they won't be retriable for a while, try
waiting that long to get whichever guard _is_ retriable.
Additionally, if we are making multiple circuit plans in parallel,
only report our planning as having failed if we failed at making
_all_ the plans. Previously we treated any failure as fatal for the
other plans, which could lead to trouble in the case when guards
were all down or pending.
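
The "only fail if all plans failed" rule can be sketched as below.
The function name and the generic plan and error types are
hypothetical; this is not the crate's planning code.

```rust
/// Combine the results of several planning attempts made in parallel.
///
/// Returns every plan that succeeded if there was at least one;
/// otherwise returns all of the errors. (With no attempts at all,
/// this returns an empty error list.)
pub fn combine_plans<P, E>(results: Vec<Result<P, E>>) -> Result<Vec<P>, Vec<E>> {
    let mut plans = Vec::new();
    let mut errors = Vec::new();
    for result in results {
        match result {
            Ok(plan) => plans.push(plan),
            Err(err) => errors.push(err),
        }
    }
    if plans.is_empty() {
        Err(errors)
    } else {
        Ok(plans)
    }
}
```

Under the previous behavior, a single `Err` in the input would have
been reported as a failure even when another plan had succeeded.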
Part of #407.
Instead of requiring a `Box<dyn Isolation>`, it now takes either a
`Box<dyn Isolation>`, or an arbitrary `T` that implements
`Isolation`.
This API still allows the user to pass in a `Box<dyn Isolation>` if
that's what they have, but it doesn't require them to Box the
isolation on their own.
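
One way to get this kind of signature is an `Into<Box<dyn Isolation>>`
bound plus a blanket `From` impl, as sketched below. This is an
assumption about the approach, not a copy of the real API: the
trait's `label` method, `ExampleToken`, `StreamPrefs`, and
`set_isolation` are simplified stand-ins.

```rust
/// Simplified stand-in for the real trait.
pub trait Isolation: Send + Sync + 'static {
    /// Hypothetical method, here only so the example can be checked.
    fn label(&self) -> String;
}

/// Any concrete `Isolation` can be boxed automatically. This does
/// not conflict with the reflexive `From<T> for T` impl, because
/// `Box<dyn Isolation>` does not itself implement `Isolation`.
impl<T: Isolation> From<T> for Box<dyn Isolation> {
    fn from(isolation: T) -> Self {
        Box::new(isolation)
    }
}

/// Hypothetical concrete isolation type.
pub struct ExampleToken(pub u64);

impl Isolation for ExampleToken {
    fn label(&self) -> String {
        format!("token-{}", self.0)
    }
}

#[derive(Default)]
pub struct StreamPrefs {
    isolation: Option<Box<dyn Isolation>>,
}

impl StreamPrefs {
    /// Accepts either a `Box<dyn Isolation>` or any `T: Isolation`.
    pub fn set_isolation<T>(&mut self, isolation: T) -> &mut Self
    where
        T: Into<Box<dyn Isolation>>,
    {
        self.isolation = Some(isolation.into());
        self
    }

    pub fn isolation_label(&self) -> Option<String> {
        self.isolation.as_ref().map(|iso| iso.label())
    }
}
```

With this shape, `prefs.set_isolation(ExampleToken(7))` and
`prefs.set_isolation(boxed)` (where `boxed` is a
`Box<dyn Isolation>`) both compile.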
Part of #414.