Additionally, refactor the IoError out of tor_cell::Error:
nothing in TorCell created this; it was only used by tor_proto.
This required refactoring in tor_proto to use a new error type. Here I
decided to use a new CodecError for now, though we may refactor that
away soon too.
This error type doesn't impement HasKind, since the kind will depend
on context.
However, the existing implementation was pretty messy and inconsistent:
Some errors had positions, some didn't.
Some took messages as str, some as String.
Some had internal errors that were somewhat orthogonal to their actual
types.
This commit refactors tor_netdoc::Error to use a ParseErrorKind, and
adds a set of convenience functions to add positions and
messages to the errors that need them.
Since the only purpose of this function is to make sure that no
bootstrapping task is running, a simple futures:🔒:Mutex
should do the job just fine.
Closes#337.
This commit changes how the `TorClient` type works, enabling it to be
constructed synchronously without initiating the bootstrapping process.
Daemon tasks are still started on construction (although some of them
won't do anything if the client isn't bootstrapped).
The old bootstrap() methods are now reimplemented in terms of the new
create_unbootstrapped() and bootstrap_existing() methods.
This required refactoring how the `DirMgr` works to enable the same sort
of thing there.
closes#293
This needs two kinds. We have decided to treat a non-shutdown
SpawnError as "unexplained" rather than as an InternalError.
There are many crates whose
From<futures::task::SpawnError> for Error
erroneously treat it as an internal error. We will fix them in a moment.
tor-netdir needs to bump because tor-netdoc bumped, even though
there were no other changes in tor-netdir. Whoops.
tor-guardmgr needs to bump because it already published, with the
older tor-netdir.
This is based on @janimo's approach in !74, but diverges in a few
important ways.
1. It assumes that something like !251 will merge, so that we can
have separate implementations for native_tls and rustls compiled
at the same time.
2. It assumes that we can implement this for the futures::io traits
only with no real penalty.
3. It uses the `x509-signature` crate to work around the pickiness of
the `webpki` crate. If webpki eventually solves their
[bug 219](https://github.com/briansmith/webpki/issues/219), we
can remove a lot of that workaround.
Closes#86.
The information is pretty basic here: we use "have we been able to
connect/TLS-handshake/Tor-handshake" as a proxy for "are we on the
internet? Are we on a reasonably unfiltered part of the internet?"
Eventually we'll want to make the information gathered and exported
more detailed: I've noted a few places in the code. For now,
however, this is about as good as C Tor does today, and it should be
a good starting point.
This uses a slightly different design from tor-dirmgr. Instead of
exporting an entire state structure via `postage::watch`, it exports
only the parts of that structure which the user is supposed to
read. I think that's more reasonable in this case because most of
the possible internal transitions in the tor-chanmgr state don't
cause a change in the exposed status.
The interface is similar to the one exposed by `arti-client`: it
internally uses postage::watch to give a series of events showing
when a bootstrap status is changing.
Thanks to the existing state/driver separation in the DirMgr design
we don't need much new logic: each download state needs to expose
(internally) how far along it is in its download, which the
bootstrap code passes to the DirMgr if it has changed.
I believe that in the long run, we'll probably want to expose more
(or different) information here, and we'll want to process it
differently. With that in mind, I've made the API for
`DirBootstrapStatus` deliberately narrow, so that we can change its
of its internal later on without breaking code that depends on it.
(The information exposed by this commit is not yet summarized in
`arti-client`.)
Part of #96.
We now do multiple samples (configurable; default 3) per type of
`arti-bench` benchmark run, and take a mean and median average of all
data collected, in order to hopefully be a bit more resilient to random
outliers / variation.
This uses some `futures::stream::Stream` hacks, which might result in
more connections being made than required (and might impact the TTFB
metrics somewhat, at least for downloading).
Results now get collected into a `BenchmarkResults` struct per type of
benchmark, which will be in turn placed into a `BenchmarkSummary` in a
later commit; this will also add the ability to serialize the latter
struct out to disk, for future reference.
part of arti#292
The purpose of a this API is to tell the user how far along Arti is
in getting bootstrapped, and if it's stuck, what it's stuck on.
This API doesn't yet expose any useful information: by the time it's
observable to a client, it's always "100% bootstrapped." But I'm
putting it in a MR now so that we can review the basic idea, and to
avoid conflicts with later work on tickets like #293 and #278.
This is part of #96.
This test should only fail very rarely (around 1/2.4e8) when guards
are chosen from a list of 20 with uniform probability. But that
wasn't what we were doing on the mock test network: we were choosing
from a list of 10 viable guards, with nonuniform probability.
As a fix, we change the test network probabilities so that the
guards _are_ chosen with a uniform probability for this test, and we
use a modified version of the test network where there are indeed 20
Guard-flagged relays with the required DirCache=2 protocol.
Closes#276.
Previously we could only configure one global tracing filter that
applied to stdout and journald. There was no support for log files,
either.
This patch fixes both issues, by substantially revising the
configuration format: There are now separate filters for each log
file, for journald, and for the console log. Because we want to
allow multiple logfiles, they have to go into an array in the
configuration.
The configuration logic has grown a bit complicated in its types,
since the tracing_subscriber crate would prefer to have the complete
structure of tracing Layers known statically. That's fine when you
know how many you have, and which kinds there will be, but for
the runtime-configuration case we need to mess around with
`Box<dyn Layer ...>`.
I also had to switch from tracing_subscriber's EnvFilter to its
Targets filter. It seems "EnvFilter" can only be applied as a Layer
in itself, and won't work as a Filter on an individual Layer.
Closes#166.
Closes#170.
The new `arti-bench` crate does a simple end-to-end benchmark test
embedding Arti: it generates some random data (of configurable amount,
depending on command-line parameters), and then sends said data back and
forth via Arti (which should be configured to use a local Chutney
network).
Additionally, the benchmark can also be run via a local SOCKS5 server
(in order to benchmark the performance via a local Chutney node, for
comparison).
The `tests/chutney/arti-bench.sh` sets up and tears down Chutney as
required to make this work.
This is very much a first cut; there are many things that should
eventually get added, such as support for multiple connections, JSON
output capabilities, running multiple tests, ...
This approach tries to preserve the current interface, but uses a
counter-based event backend to implement a coalescing stream of
events that can be represented as small integers. The advantage
here is that publishing events no longer needs to be a blocking
operation, since there is no queue to fill up.
This patch doesn't actually make anything reconfigurable, but it
does create an API that will tell you "you can't change the value of
that!" If the API looks reasonable, I can start making it possible
to change the values of individual items.
The rand crate's documentation says it's not okay to rely on StdRng
having reproducible output. So instead, let's switch to ChaCha12Rng
instead (which is what StrRng currently uses).
This implements a basic typed event broadcast mechanism, as described in
arti#230: consumers of the new `tor-events` crate can emit `TorEvent`
events, which others can consume via the `TorEventReceiver`.
Under the hood, the crate uses the `async-broadcast`
(https://github.com/smol-rs/async-broadcast) crate, and a
`futures::mpsc::UnboundedSender` for the event emitters; these are glued
together in the `EventReactor`, which must be run in a background thread
for things to work. (This is done so event sending is always cheap and
non-blocking, since `async-broadcast` senders don't have this
functionality.)
Additionally, the `TorEventKind` type is used to implement selective
event reception / emission: receivers can subscribe to certain event
types (and in fact start out receiving nothing), which filters the set
of events they receive. Having no subscribers for a given event type
means it won't even be emitted in the first place, making things more
efficient.
To do this at all neatly, I had to split out `tor-config` from
`arti-config` again, and putting the lower level stuff (paths,
builder errors) into tor-config. I also changed our use of
derive_builder to always use a common error type, to avoid
error type proliferation.
Rather like e8e9699c3c ("Get rid of
tor-proto's ChannelImpl, and use the reactor more instead"), this
admittedly rather large commit refactors the way circuits in `tor-proto`
work, centralising all of the logic in one large nonblocking reactor
which other things send messages into and out of, instead of having a
bunch of `-Impl` types that are protected by mutexes.
Congestion control becomes a lot simpler with this refactor, since the
reactor can manage both stream- and circuit-level congestion control
unilaterally without having to share this information with consumers,
meaning we can get rid of some locks.
The way streams work also changes, in order to facilitate better
handling of backpressure / fairness between streams: each stream now has
a set of channels to send and receive messages over, instead of sending
relay cells directly onto the channel (now, the reactor pulls messages
off each stream in each map, and tries to avoid doing so if it won't be
able to forward them yet).
Additionally, a lot of "close this circuit / stream" messages aren't
required any more, since that state is simply indicated by one end of a
channel going away. This should make cleanup a lot less brittle.
Getting all of this to work involved writing a fair deal of intricate
nonblocking code in Reactor::run_once that tries very hard to be mindful
of making backpressure work correctly (and congestion control); the old
code could get away with having tasks .await on things, but the new
reactor can't really do this (as it'd lock the reactor up), so has to do
everything in a nonblocking manner.
It requires tracing-subscriber 0.2, which is a lower version than we
want, and which causes trouble with our minimal-versions CI test.
There is a pending issue to fix this; we can reinstate tracing-test
once it is merged: https://github.com/dbrgn/tracing-test/pull/11
Instead of awkwardly sharing the internals of a `tor-proto` `Channel`
between the reactor task and any other tasks, move most of the internals
into the reactor and have other tasks communicate with the reactor via
message-passing to allocate circuits and send cells.
This makes a lot of things simple, and has convenient properties like
not needing to wrap the `Channel` in an `Arc` (though some places in the
code still do this for now).
A lot of test code required tweaking in order to deal with the refactor;
in fact, fixing the tests probably took longer than writing the mainline
code (!). Importantly, we now use `tokio`'s `tokio::test` annotation
instead of `async_test`, so that we can run things in the background
(which is required to have reactors running for the circuit tests).
This is an instance of #205, and also kind of #217.
We need this for the circuit timeout estimator (#57). It needs to
know "how recently have we got some incoming traffic", so that it
can tell whether a circuit has truly timed out, or whether the
entire network is down.
I'm implementing this with coarsetime, since we need to update these
in response to every single incoming cell, and we need the timestamp
operation to be _fast_.
(This reinstates an earlier commit, f30b2280, which I reverted
because we didn't need it at the time.)
Closes#179.
This switches out `arti`'s argument-parsing library with `clap`, which
is a lot more featureful (and very widely used within the Rust
ecosystem). We also now use a lot of `clap`'s features to improve the
CLI experience:
- The CLI now expects a subcommand (currently, either "help", or "proxy"
for the existing SOCKS proxy behaviour). This should let us add
additional non-SOCKS-proxy features to arti in future.
- `clap` supports default values determined at runtime, so the way the
default config file is loaded was changed: now, we determine the
OS-specific path for said file before invoking `clap`, so the help
command can show it properly.
- The behaviour of `tor_config` was also changed; now, one simply
specifies a list of configuration files to load, together with
whether they're required.
- That function also way overused generics; this has been fixed.
- Instead of using the ARTI_LOG environment variable to configure
logging, one now uses the `-l, --log-level` CLI option.
(The intent is for this option to be more discoverable by users.)
- The `proxy` subcommand allows the user to override the SOCKS port used
on the CLI without editing the config file.
The new `hyper` tor-client example demonstrates integrating arti with the
popular Rust `hyper` HTTP library by implementing a custom Hyper "connector"
(a type that can initiate connections to HTTP servers) that proxies said
connections via the Tor network.
futures::io::AsyncRead (and Write) isn't the same thing as tokio::io::AsyncRead,
which is a somewhat annoying misfeature of the Rust async ecosystem (!).
To mitigate this somewhat for people trying to use the `DataStream` struct with
tokio, implement the tokio versions of the above traits using `tokio-util`'s
compat layer, if a crate feature (`tokio`) is enabled.
The three arguments TorClient::bootstrap requires by way of configuration
have been factored into a new TorClientConfig object.
This object gains two associated functions: one which uses `tor_config`'s
`CfgPath` machinery to generate sane defaults for the state and cache
directories, and one that accepts said directories in order to create a
config object with those inserted.
(this commit was inspired by trying to use arti as a library and being somewhat
overwhelmed by the amount of config stuff there was to do :p)
Thanks to the chrono update, we no longer include an
obsolete/vulnerable version of the `time` crate. Unfortunately, it
turns out that chrono has the same trouble as `time`: it, too, looks
at the environment via localtime_r, and the environment isn't
threadsafe.
One step forward, one step back. At least the underlying issue is
one that lots of people seem to care about; let's hope they come up
with a solution.
The default soft limit is typically enough for process usage on most
Unixes, but OSX has a pretty low default (256), which you can run
into easily under heavy usage.
With this patch, we're going to aim for as much as 16384, if we're
allowed.
Fixes part of #188.
I don't love this approach, but those errors aren't distinguished by
ErrorKind, so we have to use libc or winapi, apparently. At least
nothing here is unsafe.
Addresses part of #188.
Also, refactor our message handling to be more like the tor_proto
reactors. The previous code had a bug where, once the stream of
events was exhausted, we wouldn't actually get any more
notifications.
There are some missing parts here (like persistence and tests)
and some incorrect parts (I am 90% sure that the "exploratory
circuit" flag is bogus). Also it is not integrated with the circuit
manager code.
The limitations with toml seemed to be reaching a head, and I wasn't
able to refactor the guardmgr code enough to actually have its state
be serializable as toml. Json's limitations are much narrower.