This has the humantime_serde::option module, which we have upstreamed
and are about to switch to.
The remaining dependency with version = "1" is going to be removed
in a moment.
This should save around 1MB per consensus, since every relay has a
'protocols' lines, but there are only a few distinct possibilities
for such a line.
Closes#385.
When the version is a Tor version, we can just parse it; otherwise,
we can intern it. This shrinks GenericRouterStatus and avoids a lot
of extra help allocations.
This is an API break: now one must use `.tor()` to access the Tor
configuration parts.
But it is not a config file format break, because `#[serde(flatten)]`.
This hashtable starts out pretty large, but it can spend most of our
runtime (when we aren't downloading) being small. To avoid doing
too much work, I've made it so we only call shrink_to_fit twice per
consensus: once when we're no longer pending, and once when we're
complete.
Closes#388.
Previously we'd allocate an error as a place-holder here, but it's
not a great idea to do that with a `Bug`: each `Bug` stores a whole
stack trace, which uses a whole pile of allocations to construct.
Now we keep an `Option<Error>` instead.
Found while heap profiling.
Closes#383.
Replace the recapitulation of TorClientConfig fields in ArtiConfig and
instead just have it contain one. This is part of #374.
The conversions from ArtiConfig back to ArtiConfigBuilder and
TorClientConfigBuilder would need to change, but, since we don't want
them anyway,
No longer impl Deserialize for ArtiConfig. (As per #371 this will
want to become a private type.)
No longer impl From<ArtiConfig> for ArtiConfigBuilder and
TorClientConfigBuilder. And abolish tests of that code.
(This all has to be in one commit, because previously
ArtiConfig::tor_client_config used the validated-to-builder config
retcon.)
I used
git-grep -P '\#\[serde\((?!default|deny_unknown)'
to find places where I needed to add additional attributes on the
builder method fields.
This is currently a bit duplicative, but when #371 is completely done,
the validated (non-builder) configs won't need to be Deserialize any
more.
This is part of #371 and #372.
We are going to want to specify custom attributes on fields of the
builder struct. This feature was missing from derive_builder.
This commitid is the current head of my MR branch
https://github.com/colin-kiegel/rust-derive-builder/pull/237https://github.com/ijackson/rust-derive-builder/tree/builder-field-attrs
Using the commitid prevents surprises if that branch is updated.
We will require this newer version of derive_builder. The version
will need to be bumped again later, assuming the upstream MR is merged
and upstream do a release containing the needed changes.
This commit adds support for a BrokenTcp provider that can make
connection attempts fail or time out. It doesn't yet have a way to
turn on the failure.
This makes Arti usable in IPv6-only environments (arti#92) by letting us
attempt multiple connections to a given relay using all of its
addresses instead of just using the first (probably IPv4) one, using the
strategy from RFC 8305 § 5.
This isn't a complete implementation of Happy Eyeballs; ideally, we'd
sort the address list before doing concurrent connections. However, it
works (and has been tested inside an IPv6-only container inside eta's
network :p)
The doc include rune does not work with our MSRV; it needs 1.54.
The alternative would be some kind of cfg() but that would
- not provide the crate-level doc on Rust 1.53
- involve the use of cfg_attr
Instead, just do it the old way.
Previously we tried to do each connection in a run, and only then did we
start transferring data over them. Now we collect a bunch of the
futures that return an open stream, and run them all in parallel
with using them. This change includes connect-time in our
benchmarks, and allows us to test contention in our connect code.
Instead of using a Stream, I've changed the connection-generation
code to call a future-returning function directly, so we have a way
to explicitly pass which run we're in.
This commit changes the main parsing code for RsaIdentity in
tor-netdoc, and .
Previously, parse_hex_ident was something like 10% of our startup
CPU time; now it's only like ~2%. (Still not perfect, but way
better.)
Closes#377.
We perform this operation in a bunch of places, and most of them
use hex::decode(). That's not great, since hex::decode() has to do
heap allocation. This implementation uses hex::decode_to_slice(),
which should be faster.
(In the future we might choose to use one of the faster hex
implementations, but I'm hoping that this change will be sufficient
to get hex decoding out of our profiles.)
Part of #377.
Previously they returned an Arc, which wasn't necessary unless the
client actually _wanted_ a new Arc.
This would be an API break, except that these functions are marked
'experimental-api', so semver does not apply; nonetheless I've noted
the break in semver_status.md, just in case we care.
Closes#369
This commit adds a new program to try to implement the ideas behind
experimentation in arti#329. In particular, it tries to implement
basic client "can I bootstrap and connect" functionality testing,
with a lot of instrumentation, and support for breaking things.
So far, the instrumentation is limited to counting TCP bytes and
connections, and counting events. Still, this is enough to measure
behavior on some of the incorrect-clock tests.
NOTE:
For now, you are _required_ to pass in an explicit configuration, in
hopes that this will lead you to override your storage directories
for doing specific experiments.
This just clones the fields. It is not clear to me why it was
written this way in be86df631d
Remove anyhow dependency from tor-retry, and rename it to retry-error
Previously, I think, RetryError wasn't Clone.
When I wrote this, I arranged to skip dumping the field `pending`.
This must have been because I thought that either
(a) PendingEntry couldn't `#[derive(Debug)]` (but it can)
and/or
(b) Some of the fields of PendingEntry ought not to be dumped because
they might contain (eg) packet data. But I think they don't: there's
just the spec, and the Result which is (basically) a Circ.
I tried preseving something closer to the original using educe, but
educe gets somehow tangled up with the generics, and the result fails
to compile. I haven't investigated this further.
This makes using the `PreferredRuntime` the first-class option inside
`arti-client`, freeing users who don't want to think about runtimes from
having to do so.
`TorClient::create_unbootstrapped` and `builder` now automatically
use this runtime, leaving only `builder_custom` for users who wish to
manually specify a runtime.
This lets us clean up the docs a lot: mentions of using custom runtimes
are now relegated to nearer the end of the crate-level documentation,
and we mostly just link to `tor_rtcompat`'s docs to explain more there.
Instead, we take some more time to explain how you use the builder API
to create clients synchronously.
Other doc cleanups included getting rid of the explanation of `TorAddr`
in the main crate-level doc; this is already well-documented elsewhere,
and is something users should discover organically later.
fixes arti#326
Add tls_conn field to ArtiHttpConnector (and argument to constructor).
Introduce MaybeHttpsStream and use it in ArtiHttpConnection.
Have the example program pass the native TLS connector.
Currently the TLS connector and the HTTPS variant are not used, but
this commit is very noisy and fomrulaic, so I have split out the code
to use them into a separate commit for easier preparation and review.
arti-hyper wants to be able to have a kind for TLS failure.
Given that arti-hyper is above arti-client, this shows that callers
above arti-client might need to invent kinds for their own errors.
Possibly this means we need other Other errors for other locations.
If we have pluggable components we might even want OtherTorError.
Fixes#365
Inspection of the code and logs shows that:
* One of the plan futures' oneshots must be returning Cancelled
* This means that the corresponding sender must have been dropped
* The sender is owned by the task spawned by spawn_launch
Presumably that entire task gets dropped as part of executor shutdown,
or something.
The correct response in this situation is to declare that we are
shutting down, and stop trying to do stuff.
Unfortunately, despite trying quite hard by putting sleeps in various
strategic places, I have not been able to reproduce the problem. So I
can't be 100% sure that the new behaviour is correct.
But I am reasonably confident that this ought not to be able to occur
unless either 1. the task from spawn_launch is dropped, or 2. that
task somehow panics despite its attempts to trap panics and report
them as errors through the oneshot.
So this "burn it all down" action ought only to occur in actually
serious situations.
I observe that
3ff9b187ea
Handle panics from circuit construction.
changed the EK for PendingCanceled to EK::ReactorShuttingDown,
and there's From impl. I think, therefore, that it is right
to reuse this Error variant.
I don't quite understand why when take_action gets an actual error it
doesn't push it, but just logs it. But I am not changing that for
now.
Arguably the two instances of retry_error.push are a sign of an
inferior flow control pattern - maybe the loop body including the code
I am adding ought to be an IEFE returning
`Result<Option<circ>, crate::Error>`.
I wanted this while debugging something.
The ad-hoc impl Debug with f.debug_struct is getting repetitive
and I've already perpetrated one copy-paste mistake.
We should consider using something like the `educe` crate's Clone.
`PreferredRuntime` is now an opaque wrapper struct that contains the
"actual" preferred runtime as a type alias. The `current_user_runtime`
and `create_runtime` functions become `PreferredRuntime::current` and
`PreferredRuntime::create`, respectively.
This removes the dependence on `impl Trait`, meaning we can now name the
returned runtime (yay!).
In addition, the documentation was cleaned up a bit to make it
(hopefully) flow better. Items that don't make sense to publicize, like
testing implementation details, have been marked #[doc(hidden)] and
semver warnings added.
Previously coarsetime and the traffic-timestamp feature were
enabled, since they were only required for a small corner of the
guardmgr algorithm.
But in 1.0 and beyond we'll be adding a bunch of other features (eg,
netflow padding, DoS prevention) that will need coarsetime all over
the place.
And since we're going to be doing coarsetime all over the place, the
previous justification for making traffic-timestamping optional (the
tiny performance hit) is no longer relevant.
`StateMgr` got a new `unlock()` method that does what it says on the
tin. We now call it from `bootstrap()` using the new
`util::StateMgrUnlockGuard`, which works in a manner similar to the
`BoolResetter` from `tor_dirmgr`.
(A decent small little task in future might be to unify these types in
some sort of general arti utility crate?)
closes arti#335
This lint is IMO inherently ill-conceived.
I have looked for the reasons why this might be thought to be a good
idea and there were basically two (and they are sort of contradictory):
I. "Calling ‘.clone()` on an Rc, Arc, or Weak can obscure the fact
that only the pointer is being cloned, not the underlying data."
This is the wording from
https://rust-lang.github.io/rust-clippy/v0.0.212/#clone_on_ref_ptr
It is a bit terse; we are left to infer why it is a bad idea to
obscure this fact. It seems to me that if it is bad to obscure some
fact, that must be because the fact is a hazard. But why would it be
a hazard to not copy the underlying data ?
In other languages, faliing to copy the underlying data is a serious
correctness hazard. There is a whose class of bugs where things were
not copied, and then mutated and/or reused in multiple places in ways
that were not what the programmer intended. In my experience, this is
a very common bug when writing Python and Javascript. I'm told it's
common in golang too.
But in Rust this bug is much much harder to write. The data inside an
Arc is immutable. To have this bug you'd have use interior mutability
- ie mess around with Mutex or RefCell. That provides a good barrier
to these kind of accidents.
II. "The reason for writing Rc::clone and Arc::clone [is] to make it
clear that only the pointer is being cloned, as opposed to the
underlying data. The former is always fast, while the latter can
be very expensive depending on what is being cloned."
This is the reasoning found here
https://github.com/rust-lang/rust-clippy/issues/2048
This is saying that *not* using Arc::clone is hazardous.
Specifically, that a deep clone is a performance hazard.
But for this argument, the lint is precisely backwards. It's linting
the "good" case and asking for it to be written in a more explicit
way; while the supposedly bad case can be written conveniently.
Also, many objects (in our codebase, and in all the libraries we use)
that are Clone are in fact simply handles. They contain Arc(s) (or
similar) and are cheap to clone. Indeed, that is the usual case.
It does not make sense to distinguish in the syntax we use to clone
such a handle, whether the handle is a transparent Arc, or an opaque
struct containing one or more other handles.
Forcing Arc::clone to be written as such makes for code churn when a
type is changed from Arc<Something> to Something: Clone, or vice
versa.
My proximate motivation is that tls-api wants its inner streams to be
Debug. But in general, I agree with the Rust API Guidelines notion
that almost everything should be Debug.
I have gone for the "dump all the things" approach. A more nuanced
approach would be possible too.
Now all of the runtime types we provide all
impl<S> TlsProvider<S> where S: ...
rather than merely TlsProvider<Self::TcpStream>.
And we document and intent to perhaps require this in the future.
This patch removes the EventStream associated type and the Runtime
parameter. The Runtime parameter wasn't actually used for anything,
and the EventStream was easy enough to replace with a BoxStream in
this case.
Also replaced DirBootstrapEvents with a BoxStream to avoid tying
anything to our backend.
This helps the user distinguish between protocol violations that
happen when connecting to the tor network from those that happen
while connected.
Closes#358.
Try to make the `hook-tcp` example a bit easier to read by
adding/changing comments, and renaming the lifetimes for
`async_trait`-generated trait methods.
To implement this, we had to refactor the tor_circmgr api for
flushing state changes to disk, so that it checks if it has the lock,
and only then tries to store.
There was only one use of this, and it was in as-yet-unused relay-only
code.
Removing this type required refactoring the relay onion handshake code
to use its own error type, which is probably clever anyway.
We handle them by reporting them to task that's waiting for the
circuit, then relaying the panic.
Doing so allows the waiting task to distinguish panics
(EK::Internal) from cases where the reactor dropped the task
entirely (EK::ReactorShuttingDown). And doing _that_ removes one
case of EK::Canceled, which helps us on our goals towards #348.
Closes#347.
This is the first one where anyhow::Error impl AsRef<dyn StdError>
We want this because we want to add error reporting functionality
which works with all kinds of errors, which means we need an
anyhow::Error which can be vieweed as a StdError.
(The alternative would be to deref at the call sites of
report_and_exit, making it less ergonomic.)
anyhow 1.0.23 is from November 2019.
These (&foo).into() constructions are needed so we use the
implemnetation of `From<&Foo>`, not a recursive call to this very
function.
This is a partial revert of the previous commit. I'm making this a
separate commit for the benefit of posterity.