We must not apply our new path-bias behavior (where we blame a guard
if it gives us too many indeterminate circuit failures) if the path
was not chosen at random. If too many random paths fail, we know
that's suspicious, since the other relays are a random sample. But
if a bunch of user-provided paths fail, that could simply be because
the user's chosen exit is down.
We now track, for every guard: the total number of successful
circuits we've built through it, along with the total number of
"indeterminate" circuits.
Recall that a circuit's status is "indeterminate" if it has failed
for a reason that _might_ be the guard's fault, or might not be the
guard's fault. For example, if extending to the second hop of the
circuit fails, we have no way to know whether the guard deliberately
refused to connect there, or whether the second hop is just offline.
But we don't want to forgive all indeterminate circuit failures: if
we did, then a malicious guard could simply reject any second hops
that it didn't like, thereby filtering the client into a chosen
set of circuits.
As a stopgap solution, this patch now makes guards become
permanently disabled if the fraction of their circuit failures
becomes too high.
See also general-purpose path bias selection (arti#65), and Mike's
idea for changing the guard reachability definition (torspec#67).
This patch doesn't do either of those.
Closes#185.
Instead of racily advancing time forward, this commit attempts to rework
how WaitFor works, such that it makes advances when all sleeper futures
that have been created have been polled (by handing the MockSleepRuntime
a Waker with which to wake up the WaitFor).
The above described mechanics work well enough for the double timeout
test, but fail in the presence of code that spawns asynchronous /
background tasks that must make progress before time is advanced for the
test to work properly. In order to deal with these cases, a set of APIs
are introduced in order to block time from being advanced until some
code has run, and a carveout added in order to permit small advances in
time where required.
(In some cases, code needed to be hacked up a bit in order to be made
properly testable using these APIs; the `MockablePlan` trait included in
here is somewhat unfortunate.)
This should fix arti#149.
Now that we have two kinds of isolation tokens (those set on a
stream, and those set by the stream's associated TorClient), we
need a more sophisticated kind of isolation.
This fixes the bug introduced with the previous commit, where
per-stream tokens would override per-TorClient tokens.
The new `hyper` tor-client example demonstrates integrating arti with the
popular Rust `hyper` HTTP library by implementing a custom Hyper "connector"
(a type that can initiate connections to HTTP servers) that proxies said
connections via the Tor network.
This will let callers use the tokio traits on these types too, if
they call `split()` on the DataStream.
(Tokio also has a `tokio::io::split()` method, but it requires a
lock whereas `DataStream::split()` doesn't.)
futures::io::AsyncRead (and Write) isn't the same thing as tokio::io::AsyncRead,
which is a somewhat annoying misfeature of the Rust async ecosystem (!).
To mitigate this somewhat for people trying to use the `DataStream` struct with
tokio, implement the tokio versions of the above traits using `tokio-util`'s
compat layer, if a crate feature (`tokio`) is enabled.
The three arguments TorClient::bootstrap requires by way of configuration
have been factored into a new TorClientConfig object.
This object gains two associated functions: one which uses `tor_config`'s
`CfgPath` machinery to generate sane defaults for the state and cache
directories, and one that accepts said directories in order to create a
config object with those inserted.
(this commit was inspired by trying to use arti as a library and being somewhat
overwhelmed by the amount of config stuff there was to do :p)
Previously we'd say that we were "waiting for the other process to
bootstrap" even if it was already bootstrapped: and we wouldn't
actually declare success when it was done.
The `get_relay` function was confusing, since it would return None if
the relay was present, but wasn't actually a guard. We only used it
in one place, and in that one place we used it wrong, leading to a
panic bug.
Fixes#193.
For now, this avoids having to separately handle
AuthorityBuilderError, DirMgrConfigBuilderError, DownloadScheduleConfigBuilderError,
NetworkConfigBuilderError and FallbackDirBuilderError when anyhow is not
used.
Turn off a clippy warning.
The default soft limit is typically enough for process usage on most
Unixes, but OSX has a pretty low default (256), which you can run
into easily under heavy usage.
With this patch, we're going to aim for as much as 16384, if we're
allowed.
Fixes part of #188.
I don't love this approach, but those errors aren't distinguished by
ErrorKind, so we have to use libc or winapi, apparently. At least
nothing here is unsafe.
Addresses part of #188.
The previous code would report all failures to build a circuit as
failures of the guard. But of course that's not right: If we
fail to extend to the second or third hop, that might or might not
be the guard's fault.
Now we use the "pending status" feature of the GuardMonitor type so
that an early failure is attributed to the guard, but a later
failure is attributed as "Indeterminate". Only a complete circuit
is called a success. We use a new "GuardStatusHandle" type here so
that we can report the status early if there is a timeout.
(When we're building a path with a guard, we need to tell the guard
manager whether the path succeeded, and we need to wait to hear
whether the guard is usable.)
The advantage here is that we no longer have to use a futures-aware
Mutex, or a blocking send operation, and therefore can simplify a
bunch of the GuardMgr APIs to no longer be async. That'll avoid
having to propagate the asyncness up the stack.
The disadvantage is that unbounded channels are just that: nothing
in the channel prevents us from overfilling it. Fortunately, the
process that consumes from the channel shouldn't block much, and
the channel only gets filled when we're planning a circuit path.
I tried using -Z minimal-versions to downgrade all first-level
dependencies to their oldest permitted versions, and found that we
were apparently depending on newer features of all three crates.
I'm kind of surprised there were only three.
Also, refactor our message handling to be more like the tor_proto
reactors. The previous code had a bug where, once the stream of
events was exhausted, we wouldn't actually get any more
notifications.
There are some missing parts here (like persistence and tests)
and some incorrect parts (I am 90% sure that the "exploratory
circuit" flag is bogus). Also it is not integrated with the circuit
manager code.
The limitations with toml seemed to be reaching a head, and I wasn't
able to refactor the guardmgr code enough to actually have its state
be serializable as toml. Json's limitations are much narrower.
We'll need `id_pair_is_listed()` to track whether a sampled guard is
(or is not) listed in the consensus.
We'll need `missing_descriptor_for` to see whether we've downloaded
enough microdescs to use a consensus.
(Thank goodness for rust; we messed up the coherency in C here so
many times, but I'm pretty sure that this time around we can't have
gotten it wrong.)