Now that the relevant functions report changed/not-changed
status via a boolean out-parameter (see !527), there's no reason to
have a separate NoChanged error case.
Closes #484.
This gets rid of `#[serde(flatten)]` which prevents serde_ignored (and
other kinds of introspection) from working properly.
The price is that the toplevel now has to deal with two configuration
objects.
The Resolvable trait is overkill right now, but is going to do More
Things in a moment. In particular, we need the impl on tuples, so
that the whole config can be processed in one go.
We are going to need this for some generic code which is going to
appear shortly. Having it produced by impl_standard_builder seems
best. But that does mean being able to disable it, so extra stuff in
the macro. Nothing uses this trait yet.
ConfigResolveError is not used now either, but will be in a moment.
We are going to reorganise ArtiConfig to not contain a
TorClientConfig. This test case's calls to bld.tor() will all need to
change. Do this in advance to make that future commit more readable.
I don't understand why this isn't tripping all the time. Maybe
because this is in a macro. Anyway, I am going to add a new
invocation of this macro from within a test where, empirically, it
trips.
This change requires a little refactoring of TorClientBuilder: now,
instead of enabling or disabling mistrust, it enables or disables
the decision to _override_ the mistrust in the config.
We support all of the following (in TOML notation):
```
user = "rose" # by name
user = 413 # by ID
user = false # no user
user = ":current" # A 'special' user.
user = { name = "rose" }
user = { id = 413 }
user = { special = ":none" }
user = { special = ":current" }
```
Previously in !511 I had introduced a bug where, if there was an
error more serious than "no change", that error would keep us from
noticing that we had no change, and we'd loop until the safety
counter ran out. Then we'd panic.
This commit fixes the bug by reintroducing the `changed` boolean --
this time as an outparam for the add_from_* methods.
Fixes #482.
This change adds unit tests for the public methods of StreamPrefs.
Although these are mostly "setter"-style functions, the tests confirm
the basic expectations and operation.
When I added these tests, they didn't find any bugs in my own
implementation, but I did find a bug in futures::future::unfold.
See the in-code comment.
This is a general-purpose implementation of the ad-hoc approach
currently taken in (eg) crates/tor-proto/src/channel/reactor.rs,
with an API intended to defend against the more obvious mistakes.
This allows us to separate the two concerns: the channel reactor can
focus on handling channel cells and control messages and is over 2.5x
shorter.
The complexity of the manual sink implementation, and the machinery
needed to avoid having to suspend while holding an item, are dealt
with separately. That separate implementation now has proper
documentation. (Tests are in the next commit to avoid this one being
even more unwieldy.)
We use `extend` to define this as an extension trait. A competitor is
`ext` but in my personal projects I have found `extend` slightly
better.
We no longer have separate return paths for recoverable and fatal
errors; instead, they are merged, and distinguished based on
recovery actions.
Since it is now possible for download() to give an error that should
_not_ destroy the previous state, it takes `&mut Box<dyn DirState>`.
This change unfortunately means that we can no longer call `state =
state.advance()`, but instead have to do some mem::swap junk with
poisoned values. Any better solution would be a good thing.
Additionally, the reset() and advance() methods can no longer fail.
There is still a separate return path for reset-triggering errors;
I'm about to fix that.
Previously, we did this in `advance()`, but that wasn't so great: it
meant that we could fail in the advance() code, whereas the calls to
`advance()` treated errors as fatal.
This treats failed verification as a blocking error that requires a
reset.
Fixes one aspect of #439.
Previously DocSource would tell you whether the document was from
a local store or a cache server, but it wouldn't tell you _which_
server it came from.
This change required adding DocSource as an argument to
DirState::add_from_download.
Generally, change the paths that mention the crate name to go via a
module-level "use".
This involves adding tor-config as a direct dependency for a few
crates.
This crate no longer has any reason to exist. All its remaining
functionality is generic enough to go into tor-config.
In this commit, we move the contents of lib.rs into a new file in
tor-config. It contains:
* Code motion
* The minimal "mod" and "use" changes
* The minimal doc comment
* A new compat alias for ConfigurationSources.
The compat alias is there because various crates currently speak of
arti_config::ConfigurationSources and it is most convenient to fix
them up after the type is available in tor_config.
This is a benchmarking tool, and fs-mistrust doesn't like the
permissions in our CI. The env var ARTI_FS_DISABLE_PERMISSION_CHECKS
is (of course) specific to arti. Maybe it should be honoured here,
or this should be done via the config files.
But disabling this is fine for now.
Test the Deserialize impl of every config struct.
This detects bugs like the one fixed in !502.
The macro now becomes more complex because it needs to take options.
Right now this tt-munching option parser is overkill, but it
leaves space for further options in the future.
This was anomalous, in that it contains &'static str, rather than a
proper nested error (eg a config::ConfigError, maybe).
But in fact it turns out it is now not constructed. The last
construction site was removed a long time ago in
"Use derive_builder for Authority and FallbackDir".
I have Plans for this macro. In particular:
* I have a wip branch which tests that the Builder can be
deserialised from an empty config (ie, that config reading
of a config with a blank section for this item works).
* I think we should autogenerate $Config::builder(),
and promote that, rather than $ConfigBuilder::default().
This macro could do that.
This macro is kind of derive-y. Also it has a test in it, and failing
to call it could allow bugs to exist, as well as missing bits of API.
Putting it next to the structs makes it easy to see that it's actually
been called.
We expect that a user may copy this file and use it as a starting
point for their own configuration.
When they do that, we don't want them to freeze the default config in
time. Instead, we can expect them to uncomment settings they wish to
change. Then when they upgrade arti, *other* settings will get the
new defaults, which I think is right.
This is redundant, because the defaults have to be supplied by the
config builders (usually via builder default attributes).
That this is actually done and correct is tested by the
`default_config()` test case in arti/src/cfg.rs.
Discovered by a test case in my local tree. The test case was
macro-generated by an extension of impl_standard_builder (which
macro is itself currently awaiting review, arti!499)
Have also sent an MR to update the upstream docs
https://github.com/jean-airoldie/humantime-serde/pull/8
The type of ret.map_err(codec_err_to_chan)? is (). ISTM that
writing `let () = ` makes it clear that there is nothing there,
but the lint forbids this.
This lint is warn by default and trips here for me on current nightly.
It seems wrong to me. We should be able to make it clear to the
reader that there is nothing here - note how this differs from the
lines below where Ready contains msg. A let () binding is a good way
to do that.
I think the lint allow ought to be added everywhere, but that doesn't
seem easy right now - see this issue about maint/add_warning:
https://gitlab.torproject.org/tpo/core/arti/-/issues/469
With this API we can now stop consensus download attempts early if
any consensus that the directory cache gave us would be necessarily
too far in the future or in the past.
This saves wasted bandwidth for clients with skewed clocks.
Closes #466.
If we're happy with a directory from 3 days ago, we should say
"if-modified-since 3 days ago".
This patch is larger than I'd like, since I had to add &DirMgrConfig
as an argument to the functions that make a consensus request.
Closes #467.
Since we want to be willing to use older consensuses, we don't
necessarily want to reset a download just because the consensus is
expired.
This new behavior isn't ideal either; I've added a TODO that relates
to #433.
Related to #412
This new section describes how much variance we accept when it comes
to expired and not-yet-valid directory documents. (Currently, the
only documents for which this matters are consensus documents and
authority certificates.) A document that is invalid by no more than
these tolerances is not _live_, but it can still be used.
These tolerances serve two purposes:
* First, they allow clients to run with a little more clock skew
than they would tolerate otherwise.
* Second, they allow clients to survive the situation where the
authorities are unable to reach a consensus for a day or two.
Compare with Tor's REASONABLY_LIVE_TIME and NETWORKSTATUS_ALLOW_SKEW
constants; also compare with proposal 212.
Closes #412.
These generic arguments weren't consistent. It doesn't make sense to
insist on getting a borrowed type and then cloning it. So tidy things
up in the direction of taking owned values, which is what
ConfigurationSources actually needs.
(My personal preference would be &dyn to avoid monomorphisation code
bloat but that was controversial last time I proposed it somewhere.)
- arti#445 highlighted the lack of good documentation around Arti's
multiple runtime support, as well as it being difficult to determine
what runtime was actually in use.
- Improve the documentation to solve the first problem.
- To solve the second problem, make Runtime require Debug (which is
arguably a good idea anyway, since it makes them easier to embed in
things), and print out the current runtime's Debug information when
arti is invoked with `--version`.
- (It also prints out other Cargo features, too!)
Fixes arti#445.
A build script reaching into your .git/hooks/ and modifying them
nonconsensually was a bit of a horrifying concept, and also made it hard
to build arti with the feature disabled. Remove this crate, and replace
it with manual instructions on how to install the hooks in
CONTRIBUTING.md.
- Some FIXMEs got removed or amended.
- AddMicrodescs now yields a mutable reference, so we can use .drain()
and reuse the allocation.
- Some panics were downgraded to debug_asserts.
- We don't want to inadvertently replace our netdir with one that's
actually older, so detect and error on this condition.
- Also, print a debug line when we get a new netdir without enough
guards.
- (An unrelated TODO was also added.)
- Taking a previous netdir directly and keeping it around before we need
it is a bit of a waste of memory, and also doesn't mesh well with how
SharedMutArc works.
- To remedy this, introduce a new trait `PreviousNetDir` and have the
state machines take that instead. (I was a bit tempted to just pass in
the SharedMutArc directly. Maybe I should've done that.)
- GetMicrodescsState now uses the NetDirChange API to propagate netdir
changes, instead of modifying the netdir directly.
- PendingNetDir was refactored in order to support this use case.
- As a result, the netdir-related methods in WriteNetDir can be removed,
leaving only the DirFilter for now.
- add_from_cache() no longer takes a store, because nothing uses it.
- (bodge: apply_netdir_changes() was put in a few places missed
previously)
- The new DirState::get_netdir_change() API lets the state machine
export a NetDirChange: a request to either replace the current netdir,
or add microdescs to it.
- bootstrap.rs now consumes this new API, even though nothing implements
it yet.
- This will let us implement GetMicrodescsState without having to
directly mutate the netdir. The calling code also handles checking the
netdir against the circmgr for sufficiency, and updating the consensus
metadata in the store, meaning the revised GetMicrodescsState will not
have to perform these tasks.
- The additional parameters passed to GetConsensusState are now passed
through all the states, and used as well.
- WriteNetDir doesn't have a now() or config() method any more, since
the states now get this from the runtime or the config parameters.
- This required modifying the tests to make a mocked runtime and custom
config directly, instead of using DirRcv for this purpose.
- Additionally, because we don't have to upgrade a weak reference for
DirState::dl_config(), that function no longer wraps its return value
in Result.
- (A bunch of the FIXMEs from the previous commit that introduced the
additional parameters have now been rectified as a result.)
Previously, CompoundRuntime would use the default implementations of
SleepProvider::now() and ::wallclock(), instead of using its wrapped
SleepProvider. This mildly embarrassing omission has been rectified.
- GetConsensusState::new now takes a set of parameters matching what it
actually needs, instead of just taking a writedir. (It still *does*
take a writedir, and indeed still uses it for basically everything,
but that will eventually go away.)
- Its call sites were updated.
- Some tests now need to take a runtime, and got indented a lot as a
result.
- Resetting was made non-functional, because we need to thread through
the parameters passed to GetConsensusState to all of the other
states, too. This will happen in a later commit.
- Given that this is effectively an implementation detail, it doesn't
really make sense to have it be in the crate root...
- (also, we're going to change it a bunch now)
- fetch_single now takes what it needs, instead of an Arc<DirMgr<R>>.
- This required refactoring the CANNED_RESPONSE mechanism, given the
test would otherwise fail due to not having a CircMgr to pass to
fetch_single.
- query_into_requests is now called make_requests_for_documents, and
does the &[DocId] -> DocQuery conversion internally instead.
- DirMgr::make_consensus_request and DirMgr::query_into_requests are now
gone. The tests use the new functions, as does fetch_multiple.
- There's no good reason these functions needed to be part of the
dirmgr, apart from needing a runtime and a store.
- However, we can just add those as arguments and copy them over. This
commit does that.
- Function renamed & docs tidied up a bit
- Function signature now takes what it needs (immutable &dyn Store
instead of mutex, slice instead of Vec) and nothing more
- DocQuery::load_documents_into was also renamed
DocQuery::load_from_store_into and given similar treatment
Annoyingly, Rust doesn't automatically generate this sort of `impl` for
you, and I'd like to reduce the usage of Mutex<DynStore> everywhere else
in favour of either &dyn Store or &mut dyn Store.
(This is for two reasons: firstly, we might have a Store implementation
that doesn't use a mutex as above, or similar refactors; secondly,
passing the raw trait object reference lets us encode mutability into
the function signature, which I believe is quite valuable.)
Relay nicknames are always between 1 and 19 characters long, and
they're always ASCII: That means that storing them in a [u8;19] will
always be possible, and always use less resources than storing them
in a String.
Fortunately, the tinystr crate already helps us with this kind of
thing.
If the target directory itself is unreadable by untrusted users,
then its contents can't be read[*] by them regardless of their
permissions. If the target directory _is_ readable, then _it_ will
be rejected if we are forbidding readable objects. (And if we
aren't we don't care if the contents are readable.)
A similar argument would apply to writable objects within an
unreadable target directory. We're not making that argument, since
such contents are likelier to be a mistake.
[*] Unless they're hard-linked; see comments in "Limitations"
section.
I'm doing this per discussion, so that we can have it be part of the
TorConfig later on, and not break stuff as we change the Mistrust
API to have a builder.
This change, unfortunately, results in a little more internal
complexity and duplicated code in arti and arti-client. I've marked
those points with TODOs.
This is derived from the environment, not the configuration file: We
might not want to trust the configuration file until we've decided
whether we like its permissions.
This includes:
* a CachedDir::join method.
* functions to read and write from provided filenames in a
CachedDir.
* a method to tell whether a fs-mistrust error is about bad file
permissions, or failure to inspect file permissions or some other
kind of IO problem.
For reference, the git source for this crate (and the others in its
workspace) currently lives in my personal github account (ijackson).
If this fork turns out to be long-lived and gains features and/or
users, it would be good to move it to a gitlab somewhere.
I have granted Nick crate ownership on the crates.io system.
Here we add a config option to disable safe logging, and ensure that
safe logging is disabled when we are formatting an error message on
exit (since we assume it's safe to write sensitive info to stderr).
This is a rough first-cut of an API that I think might help us with
keeping limited categories of sensitive information out of our logs.
I'll refine it based on experiences with using it.
* Builders additionally derive: Debug, Serialize, Deserialize.
* Validated structs no longer derive: Serialize, Deserialize
and all related attributes deleted.
* As a consequence, all the `#[serde(deny_unknown_fields)]`
are gone. That means that right now unknown fields are totally
ignored. This is good for compatibility but poor for useability.
Doing something better here is arti#417, in progress.
* As a consequence, delete tor_dirmgr::retry::default_parallelism.
(The default value was already duplicated into a builder attr.)
And drop the ad-hoc orport() method. This brings FallbackDir's
orports field in line with our list builder API.
The general semver note in "configuration" seems to cover most of this.
This allows us to use this with an item builder type which doesn't
impl Default. (Obviously this only makes sense for items which aren't
actually builders.)
It is Quite Vexing that we have to use [ ] rather than the < > around
the generics, particularly given that we are also using [ ] to signal
"this is arrayish".
Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
Although these do not appear in the config, it does have a builder.
It seems sensible to get rid of this ad-hoc list manipulation site,
and replace it with our standard list builder API.
define_list_builder_helper requires that the builder element type be
Deserialize. Currently GuardUsageRestriction is a transparent, public
enum, so we aren't really exposing anything.
We could introduce GuardUsageRestrictionBuilder now, but
since it's not in the config and therefore only in the public API of
the lower crates, we can definitely put that off.
This type was returned by the public DownloadSchedule::builder
function. But the only thing that seems to have noticed that the type
name itself wasn't exported, was rustdoc. Hmmm.
The new API is (roughly) as discussed in
https://gitlab.torproject.org/tpo/core/arti/-/issues/451
This is quite a large commit and it is not convenient to split it up.
It contains the following changes:
* Redo the list builder and accessor macros implementation,
including docs and tests.
* Change uses of define_list_config_builder. In each case:
- Move the docs about the default value to the containing field.
- Remove the other docs (which were just recapitulations, and
are now not needed since the ListBuilder is no longer public).
- Remove or replace `pub` in the define_list_builder_helper call,
so that the builder is no longer public.
- Change the main macro call site to use define_list_builder_helper.
- Add a call to define_list_builder_accessors.
* Make the module `list_builder` pub so that we have somewhere to
put the overview documentation.
* Consequential changes:
- Change `outer.inner().replace(X)` to `outer.set_inner(X)`
- Consequential changes to imports (`use` statements).
Previously this field was differently named to its serde and to its
accessors. We are about to introduce a macro_rules macro which will
provide list accessors and we don't want that macro to have a field
renaming feature.
So stop renaming the field.
Document that this can contain either a string for expansion, or a
literal PathBuf not for expansion.
Rename the `from_path` method to `new_literal`: a very important
difference is whether it gets expanded - less important than the Rust
type. Also, now it takes `Into<PathBuf>`, which avoids a needless
clone.
(We don't change the API in `arti-client` because
`&tempfile::TempDir` doesn't implement `Into<PathBuf>`, so
`arti-client` has to have some new `as_ref` calls.)
Provide accessors `as_unexpanded_str` and `as_literal_path`. The
deserialisation already makes this part of the stable API, so not
providing accessors seems just obstructive. They are useful for tests,
too.
Add tests for the new entrypoints, and for deserialisation of both
variants from TOML (via config, or directly) and JSON.
We introduce LiteralPath struct, so that a literal path deserialises
from
some_path = { literal: "actual path string" }
This makes the deserialisation unambiguous.
arti uses this. Somehow this seems to be enabled by some other thing
in the crate graph, but I found that adding a similar dependency to
another crate resulted in a `config` which doesn't compile.
This helps make it possible to use `SecureDir` (name pending) even
when we want to disable permissions checks. Otherwise, optional
permission checking would require users of this crate to maintain
separate code paths for the "check" and "don't check" cases.
This required a bit of poking through the `users` crate, to mess
with the user and group dbs. The original goal was to "trust the
group with the same name as us", but it turned into a bit of a
production, since:
* We want to take our own name from $USER, assuming that matches
our uid. (Otherwise we want to ask getpwuid_r().)
* We only want to trust the group if we are actually a member of
that group.
* We want to cache this information.
* We want to test this code.
Previously we would temporarily put self.resolved into an invalid
state by adding a path component that might be a symlink. With this
change, we create a new temporary path object (using Cow to avoid
unnecessary allocations) and only conditionally replace
self.resolved.
The only way to get a SecureDir is by having checked a directory.
Once you have one, it encourages you to open and create files and
directories with the right permissions, and checks them for you.
This crate is meant to solve #315 by giving a way to make sure that
a file or directory is only accessible by trusted users. I've tried
to explain carefully (in comments and documentation) what this crate
is doing and why, under the assumption that it will someday be read
by another person like me who does _not_ live and breathe unix file
permissions. The crate is still missing some key features, noted in
the TODO section.
It differs from the first version of the crate by taking a more
principled approach to directory checking: it emulates the path
lookup process (reading symlinks and all) one path change at a time,
thus ensuring that we check every directory which could enable
an untrusted user to get to our target file, _or_ which could
enable them to get to any symlink that would get them to the target
file.
The API is also slightly different: It separates the `Mistrust`
object (where you configure what you do or do not trust) from the
`Verifier` (where you set up a check that you want to perform on a
single object). Verifiers are set up to be a bit ephemeral,
so that it is hard to accidentally declare that _every_ object
is meant to be readable when you only mean that _some_ objects
may be readable.
The `[patch]` approach causes the tree not to build when used as a
dependency, unless the `[patch]` is replicated into the depending
project.
Instead, replace our `derive_builder =` dependencies with a reference
to a specific git commit:
perl -i~ -pe 'next unless m/^derive_builder/; s#"(0\.11\.2)"#{ version = "$1", git = "https://github.com/ijackson/rust-derive-builder", rev = "ba0c1a5311bd9f93ddf5f5b8ec2a5f6f03b22fbe" }#' crates/*/Cargo.toml
Note that the commitid has changed. This is because derive_builder is
in fact a workspace of 4 crates. 3 of them are of interest to arti
itself (the 4th exists only for testing). So the same "add git
revision" treatment had to be done to the `derive_builder` and
`derive_builder_macro` crates. Each dependency edge involves a new
commit in the derive_builder workspace, since we can't create a git
commit containing its own commitid. (We want to use commits, rather
than a branch, so that what we are depending on is actually properly
defined, and not subject to the whims of my personal github
namespace.)
There are no actual code changes in derive_builder.
The arti crate itself is only used in benchmarking and testing crates.
I think this sentence does not belong here.
Also extend retry-error description (from Architecture.md).
Now that we require Rust 1.56, we can upgrade to AES 0.8. This
forces us to have some slight API changes.
We require cipher 0.4.1, not cipher 0.4.0, since 0.4.0 has
compatibility issues with Rust 1.56.
This is actually a number of *attempts* not a number of *retries*.
The setter method was already called "attempts".
This changes the deserialisation of the config.
Use sub_builder. We must do something special for defaults.
This involves moving the actual default values for retry_bootstrap and
retry_microdescs into config.rs, since they need to access the fields
of the un-built version of the structure. (An alternative would be to
generate "weak setters" which do not override previous settings, but
derive_builder does not offer to generate them and that seems
overkill.)
Instead, everyone should use DownloadScheduleBuilder.
The new() method would in any case be useless in a moment, since we're
going to embed DownloadScheduleBuilder in the NetworkConfig, not
DownloadSchedule.
The call sites in the tests are all about to change again.
The current behaviour is to treat 0 as indicating "use the default",
which is quite strange. We are going to get rid of that.
The new way will be to reject zero, during
DownloadScheduleBuilder::build. Add a test case for that.
Really, we probably don't want any of these not to be pub, but it
triggers "unreachable pub" in my test cases, and making it not pub by
mistake seems not very serious, and likely to be noticed.
Making the struct private in the test cases has the useful effect of
checking that all the methods are tested.
And add an imprecation in define_list_config_builder's doc comment to
do so in future for other invocations of the macro.
Add the missing full stops.
This is an automated change made with a perl one-liner and verified
with grep -L and grep -l.
Some warnings are introduced with this change; they will be removed
in subsequent commits.
See arti#208 for older discussion on this issue.
This means that `NetworkConfig::initial_predicted_ports` is now like
the other list-like things, returning `&mut list_builder` with the same
`set()` and `append()` methods.
NetworkConfigBuilder needs to not contain any validated structs, so
that its serde does not expose the validated details.
AuthorityListBuilder is what ought to go here - and it contains
Vec<AuthorityBuilder>, not Vec<Authority>. As a consequence, many
places now deal with AuthorityBuilder, rather than Authority.
Since "logfiles: Introduce LogfileListConfigBuilder", this code is in
LogfileListConfigBuilder::build(), which is called by derive_builder's
generated LoggingConfig::build(), and which will add a file context
itself due to the `sub_builder` feature.
So this is otiose. And, we are about to replace this whole thing with
macro_rules-generated code (which won't do this).
If duration addition overflows, then continue with Never.
Caching the AbsRetryTime constructed with duration from supplied
function also reduces the overhead of earliest_absolute.
In
https://gitlab.torproject.org/tpo/core/arti/-/merge_requests/462#note_2797697
we decided not to do this.
However, having looked again at the way the FallbackList works, I
think there is a lot of value in making these two things (and anything
else like them[1]) as similar as possible.
[1] At least PreemptiveCircuitConfig.initial_predicted_ports and
NetworkConfig.authorities need the same treatment, and perhaps also
GuardUsage.restrictions (although there is no
GuardRestrictionBuilder).
In the irc discussion I imagined `LogfilesConfigBuilder` as opposed to
`LogfileConfigBuilder` (differing only in the `s`) which would be bad,
but we can use `List` instead.
We do *not* need to abstract away the validated version of the config.
Providing a type alias helps the derive_builder sub_builder DTRT
without needing special overrides.
I have split this commit so that we can drop it, if we conclude it's
not wanted.
Now the network fallbacks configuration wants to Deserialize
a Vec<FallbackDirBuilder>, rather than validated Vec<FallbackDir>.
Methods on FallbackListBuilder are as per
https://gitlab.torproject.org/tpo/core/arti/-/merge_requests/462#note_2797697
mutatis mutandis for the fact that this struct has only fallbacks in it.
This is where the FallbackList type is. We are going to want to
provide a builder too, which ought to impl Default.
This means that the default value for the type must be next to the
type. In any case, it was anomalous that it wasn't.
This commit is pure code motion.
This commitid is the current head of my MR branch
https://github.com/colin-kiegel/rust-derive-builder/pull/253
https://github.com/ijackson/rust-derive-builder/tree/field-builder
Using the commitid prevents surprises if that branch is updated.
We will require this newer version of derive_builder. The version
will need to be bumped again later, assuming the upstream MR is merged
and upstream do a release containing the needed changes.
We will need the new version of not only `derive_builder_core` (the
main macro implementation) but also `derive_builder` for a new error
type.
Rename "deletable" to "obsolete".
Simplify function structure.
Report errors from `metadata()` and `modified()`.
Don't claim that we're going to delete something unless we are.
Comment about making CUTOFF configurable.
Rather than running preemptive circuit construction every 10
seconds, we change it to back off when it is "failing". (We define
"failing" as creating no new circuits, and as giving at least one
error.)
This change means that we'll have one less reason to hammer the
network when our connectivity is failed for some reason.
Closes #437.
Part of #329.
This feature is similar to ChanProvenance from ChanMgr, except that
we don't yet need to report it outside the crate. I'm going to use
it to distinguish newly created circuits from existing circuits in
the preemptive circuit builder.
Instead of just having a function that recalculates the latest clock
skew, instead recalculate the clock skew when it may have changed,
and notify other processes via a postage::watch.
Apparently on OSX you are not allowed to construct an Instant that is a
long time before the time when the test is running.
Also, fix the length of a year in this test.
This time, our estimator discards outliers, takes the mean of what's
left, and uses the standard deviation to try to figure out how
seriously to take our report of skew/not-skew.
These estimates are still not actually used.
Fortunately, we don't need a separate type here: authenticated
clock skew can only come attached to a `tor_proto::Error`.
We also remove skew from `tor_proto::Error::HandshakeCertsExpired`,
since it would now be redundant.
of a channel.
At first I wanted to have this information not be a part of channels
at all, but it is a fairly tiny amount of data, and the alternatives
are pretty crufty.
Upstream 0.8.2 has broken compilation with Rust 1.53; versions
0.8.{0,1} have been yanked.
Possibly by the time the next arti version comes out, they'll have
fixed this situation, or we'll have upgraded our MSRV.
Upstream issue at https://github.com/Nugine/rlimit/issues/42 .
Doing this will make us treat caches that send us these objects as
not-working, and close circuits to them instead of trying over and
over.
The case where we add a document from the cache requires special
handling: it isn't actually an error to find an expired document in
our cache (unless the passage of time itself is erroneous, which is
a debatable proposition at best).
Fixes#431.
The old version of this function was error-prone, and in fact had
errors: it was too easy to forget to add non-persistent fields, and
that's exactly what we forgot in a few cases
(`microdescriptor_missing`, `circ_history`, and
`suspicious_behavior_warned`).
The new version of this function consumes both of the incoming
Guards, and constructs every field explicitly so that we can't
forget to list any.
Closes#429.
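The pattern in miniature (with a hypothetical, much-reduced stand-in for the real Guard type, which has many more fields): because a struct literal must name every field, forgetting to handle a new field becomes a compile error instead of a silent bug.

```rust
/// Hypothetical stand-in for the real Guard type.
struct Guard {
    id: u32,                          // persistent
    unlisted_since: Option<u64>,      // persistent
    circ_history: u32,                // non-persistent
    suspicious_behavior_warned: bool, // non-persistent
}

/// Consume both incoming copies and construct every field explicitly.
fn update_from_loaded(persistent: Guard, volatile: Guard) -> Guard {
    Guard {
        // Persistent fields come from the stored copy...
        id: persistent.id,
        unlisted_since: persistent.unlisted_since,
        // ...and non-persistent fields from the in-memory copy.
        circ_history: volatile.circ_history,
        suspicious_behavior_warned: volatile.suspicious_behavior_warned,
    }
}
```

If a field is added to `Guard` later, this function stops compiling until the merge logic accounts for it, which is exactly the property the old function lacked.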
Previously, we treated successfully building a circuit to a guard as
a "success", and any failure, including a directory cache failure,
as a failure. With this change, guards now have separate
success/failure and retry status for circuit usage and directory
usage.
This change is needed for guard-as-directory retry to have
reasonable behavior. Otherwise, when a guard succeeds at building a
circuit, that clears the directory-is-failing status and makes us
retry the guards too quickly.
Previously the code would do stuff like
```
schedule = RetrySchedule::new(INITIAL_DELAY);
```
which is needlessly verbose, since the schedule already keeps track
of its initial delay.
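A sketch of the simplification (the real RetrySchedule API may differ; this shape is illustrative): the schedule stores its own initial delay, so resetting needs no argument.

```rust
use std::time::Duration;

/// Hypothetical sketch, not the actual tor-guardmgr type: the
/// schedule remembers its initial delay, so call sites never have to
/// repeat `INITIAL_DELAY` when starting over.
pub struct RetrySchedule {
    initial_delay: Duration,
    current_delay: Duration,
}

impl RetrySchedule {
    pub fn new(initial_delay: Duration) -> Self {
        RetrySchedule { initial_delay, current_delay: initial_delay }
    }

    /// Return the delay to wait now, and double it for next time.
    pub fn next_delay(&mut self) -> Duration {
        let d = self.current_delay;
        self.current_delay = self.current_delay.saturating_mul(2);
        d
    }

    /// Start over from the schedule's own remembered initial delay.
    pub fn reset(&mut self) {
        self.current_delay = self.initial_delay;
    }
}
```

With this shape, call sites can write `schedule.reset()` rather than reconstructing the schedule with its initial delay.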
This lets us say that the UsageMismatch cases in some parts of the
code reflect a programming error (RetryTime::Never), whereas in
other cases it reflects another circuit request getting to the
circuit first (RetryTime::Immediate).
For each case, describe its semantics (in addition to when you would
create it).
Explain the relationship between After and At.
Stop saying "Strategy": we renamed this type to "RetryTime".
Previously we did not distinguish errors that came from pending
circuits from errors that came from the circuits we were
building. We also reported errors as coming from "Left" or "Right",
instead of a more reasonable description.
We were treating restrict_mut() failures as internal errors. But
in fact, these
failures are entirely possible based on timing. Here's how it
happens:
* Two different circuit requests arrive at the same time, and both
notice a pending circuit that they could use.
* The pending circuit completes; both pending requests are notified.
* The first request calls restrict_mut(), and restricts the request
in such a way that the second couldn't use it.
* The second request calls restrict_mut(), and gets a failure.
Because of this issue, we treat these errors as transient failures
and just wait for another circuit.
Closes#427.
(This is not a breaking API change, since `AbstractSpec` is a
crate-private trait.)
Not all of these strictly need to be bumped to 0.2.0; many could go
to 0.1.1 instead. But since everything at the tor-rtcompat and
higher layers has had breaking API changes, it seems not so useful
to distinguish. (It seems unlikely that anybody at this stage is
depending on e.g. tor-protover but not arti-client.)
Unlike the rest of the crates, these don't have a "tor-" or "arti-"
prefix, and are potentially used by code outside arti. With that in
mind, it's probably for the best not to bump them to 0.2.0 along
with the rest of our crates.
They have had no changes since 0.1.0 other than refactoring and
changing of clippy lints. Therefore, I'm not bumping the
dependencies from other crates onto these: it's fine whether our
other crates use caret/retry-error 0.1.0 or 0.1.1.
This feature allows us to detect different failing cases for
arti#329 that would otherwise be hard to induce. It works by
filtering consensus directory objects and/or microdescriptor objects
before introducing them to the directory manager.
Closes#397.
This commit uses the `visibility` and `visible` crates to
conditionally make certain structs and their fields public
(respectively). This is incredibly dangerous to use for anything
besides testing, and I've tried to write the documentation for the
feature accordingly.
The older default seems (experimentally) to be ridiculously high.
Generally, if we can't build a circuit within a handful of attempts,
that circuit has already timed out... unless there is a fast-failure
condition, in which case we're just hammering the network (or our
view of it.)
Found with `arti-testing` for #329.
Previously, if we had launch_parallelism > 1, and we were willing to
retry building a circuit max_retries times, then we'd launch up to
max_retries * launch_parallelism circuits before giving up. Ouch!
With this patch, we try to keep the total number of circuits
planned and attempted within the actual max_retries limit.
Part of #329; found with arti-testing.
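The accounting can be sketched like this (function and parameter names are hypothetical, not the circmgr internals): the number of new circuits to launch is bounded by what remains of the retry budget, not by parallelism alone.

```rust
/// How many new circuits may we launch right now?  Attempts already
/// in flight and attempts that have already failed both count
/// against the overall budget, so parallelism no longer multiplies
/// the number of retries.
fn circuits_to_launch(max_retries: usize,
                      launch_parallelism: usize,
                      in_flight: usize,
                      failed: usize) -> usize {
    let remaining_budget = max_retries.saturating_sub(in_flight + failed);
    remaining_budget.min(launch_parallelism)
}
```

With max_retries = 3 and launch_parallelism = 4, the old behavior could launch up to 12 circuits; under this accounting, at most 3 are ever planned in total.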
The previous algorithm had two flaws:
* It would wait even after the final attempt, when there were no
more retries to do.
* It would fail to wait between attempts if an error occurred.
This refactoring fixes both of these issues, and adds some comments.
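The fixed control flow looks roughly like this (a generic helper for illustration, not the actual code): we wait between attempts even when an attempt errors, and never after the last one.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Run `op` up to `max_attempts` times, waiting `delay` between
/// attempts.  The wait happens whether or not the attempt errored,
/// but never after the final attempt.
fn retry<T, E>(max_attempts: usize, delay: Duration,
               mut op: impl FnMut() -> Result<T, E>) -> Result<T, E> {
    let mut last_err = None;
    for attempt in 1..=max_attempts {
        match op() {
            Ok(v) => return Ok(v),
            Err(e) => last_err = Some(e),
        }
        // Second flaw fixed: we reach this wait even after an error.
        if attempt < max_attempts {
            sleep(delay);
        }
        // First flaw fixed: after the final attempt we fall out with
        // no pointless wait.
    }
    Err(last_err.expect("max_attempts was zero"))
}
```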
The FirstHopId type now records an enum that stores whether the hop
is a guard or a fallback. This change addresses concerns about
remembering to check the type or source of an Id before passing it
down to the FallbackState or GuardSet.
Making this change required an API change, so that dirmgr can
report success/failure status without actually knowing whether it's
using a fallback or a guard.
The code here uses a new iterator type, since I couldn't find one of
these on crates.io. I tried writing the code without it, but it was
harder to follow and test.
We do this by creating a new FallbackSet type that includes status
information, and updating the GuardMgr APIs to record success and
failure about it when appropriate. We can use this to mark
FallbackDirs retriable (or not).
With this change, FallbackDir is now stored internally as a Guard in
the GuardMgr crate. That's fine: the FallbackDir type really only
matters for configuration.
If we're building a path with the guard manager involved, we now ask
the guard manager to pick our first hop no matter what. We only
pick from the fallback list ourselves if we're using the API with no
guard manager.
This causes some follow-on changes where we have to remember an
OwnedChanTarget object in a TorPath we've built, and where we gain
the ability to say we're building a path "from nothing extra at
all." Those are all internal to the crate, though.
Closes#220, by making sure that we use our guards to get a fresh
netdir (if we can) before falling back to any fallbacks, even if our
consensus is old.
Compilation should be fixed in the next commit.
We only do this when we fail to get a regular guard (e.g., because
they're all down), and when we have been asked for a guard for a
one-hop directory.
Most of the change in this commit is plumbing to make all of the
types match up.
As before, compilation may still be broken.
We need to extend our notion of "the origin of a guard" to include
"somewhere outside the guard list"; we need the ability to return a
FallbackDir as a Guard; and we need to remember a few more pieces of
information in each pending request.
As before, this commit may break compilation; it will be restored soon.
The guard manager is responsible for handing out the first hops of
tor circuits, keeping track of their successes and failures, and
remembering their states. Given that, it makes sense to store this
information here. It is not yet used; I'll be fixing that in
upcoming commits.
Arguably, this information no longer belongs in the directory
manager: I've added a todo about moving it.
This commit will break compilation on its own in a couple of places;
subsequent commits will fix it up.
This is the logical place for it, I think: the GuardMgr's job is to
pick the first hop for a circuit depending on remembered status for
possible first hops. Making this change will let us streamline the
code that interacts with these objects.
The various background daemon tasks that `arti-client` used to spawn are
now handled inside their respective crates instead, with functions
provided to spawn them that return `TaskHandle`s.
This required introducing a new trait, `NetDirProvider`, which steals
some functionality from the `DirProvider` trait to enable `tor-circmgr`
to depend on it (`tor-circmgr` is a dependency of `tor-dirmgr`, so it
can't depend on `DirProvider` directly).
While we're at it, we also make some of the tasks wait for events from
the `NetDirProvider` instead of sleeping, slightly increasing
efficiency.
There are two reasons why the DynFilter newtype might be needed:
1. To impl Default. But we don't need it to impl Default since we can
have an accessor which does the defaulting.
2. To hide the API. But this is surely an unstable API.
Just writing Arc<dyn> gets rid of a lot of unnecessary boilerplate and
conversion code.
This is a revised version of !397; it implements a scheduling system for
periodic tasks that can be externally controlled, and then uses the
external control aspect to implement a basic dormant mode (#90).
More technically, the scheduling system consists of a `Stream` that
periodic tasks are expected to embed in a `while` loop or similar, a
way for tasks themselves to choose how long to wait until the stream
next yields a result, and a handle to control this outside of the task.
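A much-simplified, synchronous sketch of that shape (the real version is an async `Stream`, and these names are invented): the task picks its own wait each iteration, and a handle outside the task can wake or stop it.

```rust
use std::sync::mpsc;
use std::time::Duration;

/// Task-side end: yields once per "tick", where a tick is either the
/// requested delay elapsing or the handle firing early.
struct TaskSchedule { rx: mpsc::Receiver<()> }

/// Controller-side end, held outside the task.
struct ScheduleHandle { tx: mpsc::Sender<()> }

fn schedule() -> (ScheduleHandle, TaskSchedule) {
    let (tx, rx) = mpsc::channel();
    (ScheduleHandle { tx }, TaskSchedule { rx })
}

impl TaskSchedule {
    /// The task embeds this in its loop:
    /// `while sched.next(delay) { do_work(); }`.
    /// Returns false once the handle is dropped, ending the task.
    fn next(&self, delay: Duration) -> bool {
        match self.rx.recv_timeout(delay) {
            Ok(()) => true,                                     // woken early
            Err(mpsc::RecvTimeoutError::Timeout) => true,       // normal tick
            Err(mpsc::RecvTimeoutError::Disconnected) => false, // shut down
        }
    }
}

impl ScheduleHandle {
    /// Make the task run its next iteration immediately.
    fn fire(&self) { let _ = self.tx.send(()); }
}
```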
We now check the handshake certificates unconditionally, and only
report them as _expired_ as a last resort.
(Rationale: if somebody is presenting the wrong identity from a year
ago, it is more interesting that they are presenting the wrong ID
than it is that they are doing so with an expired cert.)
We also now report a different error if the certificate is expired,
but its expiration is within the range of reported clock skew.
(Rationale: it's helpful to distinguish this case, so that we can
blame the failure on possible clock skew rather than definitely
attributing it to a misbehaving relay.)
Part of #405.
NETINFO cells, which are sent in every handshake, may contain
timestamps. This patch adds an accessor for the timestamp in the
Netinfo messages, and teaches the tor-proto code how to compute the
minimum clock skew.
The computation isn't terribly precise, but it doesn't need to be:
Tor should work fine if your clock is accurate to within a few
hours.
This patch also notes a Y2038 problem in the protocol: see
torspec#80.
Part of #405.
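The "minimum skew" idea can be sketched like this (a hypothetical helper, not the tor-proto implementation): any part of the difference that could be explained by our own send/receive window is discounted, so we only report skew we cannot otherwise account for.

```rust
use std::time::{Duration, SystemTime};

/// Conservative skew bound: the first element says whether the peer's
/// clock appears ahead (true) or behind (false); the second is the
/// minimum skew we can demonstrate.  If the peer's timestamp falls
/// inside our own send/receive window, network delay alone could
/// explain it, and we report zero.
fn minimum_skew(peer_time: SystemTime,
                sent_at: SystemTime,
                received_at: SystemTime) -> (bool, Duration) {
    if peer_time > received_at {
        // Peer claims a time later than our latest plausible moment.
        (true, peer_time.duration_since(received_at).unwrap())
    } else if peer_time < sent_at {
        // Peer claims a time earlier than our earliest plausible moment.
        (false, sent_at.duration_since(peer_time).unwrap())
    } else {
        (true, Duration::ZERO)
    }
}
```

This is deliberately imprecise, in keeping with the commit's point: an estimate accurate to within the round-trip time is plenty when only multi-hour skews matter.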
Some error types indicate that the guard has failed as a dircache.
We should treat these errors as signs to close the circuit, and to
mark the guard as having failed.
This commit refactors the dirclient error type into two cases:
errors when constructing a circuit, and errors that occur once we
already have a one-hop circuit. The latter can usually be
attributed to the specific cache we're talking to.
This commit also adds a function to expose the information about
which directory gave us the info.
We already have the ability to get peer information from ChanMgr
errors, and therefore from any RetryErrors that contain ChanMgr
errors.
This commit adds optional peer information to tor-proto errors, and
a function to expose whatever peer information is available.
It'll soon be more convenient to pass in FallbackDirs as a slice of
references, rather than just a slice of FallbackDirs: I'm going to
be changing how we handle these in tor-dirmgr.
If all guards are down and they won't be retriable for a while, try
waiting that long to get whichever guard _is_ retriable.
Additionally, if we are making multiple circuit plans in parallel,
only report our planning as having failed if we failed at making
_all_ the plans. Previously we treated any failure as fatal for the
other plans, which could lead to trouble in the case when guards
were all down or pending.
Part of #407.
When all guards are down, we would previously mark them all as up,
and retry aggressively. But that's far too aggressive: if there's
something wrong with our ability to connect to guards, it makes us
hammer the network over and over, ignoring all the guard retry
timeouts in practice.
Instead,
* We now allow the `pick_guard()` function to fail without
automatically retrying.
* We give different errors in the cases when all our guards are
down, and when all of the guards selected by our active usage
are down.
* Our "guards are down" error includes the time at which a guard
will next be retriable.
This is part of #407.
C tor used one schedule, and guard-spec specified another. But in
reality we should probably use a randomized schedule to retry
guards, for the reasons explained in the documentation for
RetrySchedule.
I've chosen the minima to be not too far from our previous minima
for primary and non-primary guards.
This is part of #407.
Currently, Arti doesn't need this. But once it does, it will be
way better to have a separate type for connected sockets, rather
than having to error-check every time somebody gives us a socket.
Part of #410
Each channel now remembers an OwnedChanTarget.
Each circuit now remembers a vector of OwnedChanTarget to represent
the path that it was constructed for.
Part of #415.
Instead of requiring a `Box<dyn Isolation>`, it now takes either a
`Box<dyn Isolation>`, or an arbitrary `T` that implements
`Isolation`.
This API still allows the user to pass in a `Box<dyn Isolation>` if
that's what they have, but it doesn't require them to Box the
isolation on their own.
Part of #414.
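The API shape can be sketched like this (trait and method names simplified, not the real arti-client signatures): the setter is generic over `T: Isolation`, and an impl of `Isolation` for `Box<dyn Isolation>` keeps already-boxed values usable.

```rust
/// Simplified stand-in for the real Isolation trait.
trait Isolation: 'static {
    fn token(&self) -> u64;
}

/// A boxed isolation is itself an isolation, so callers holding a
/// `Box<dyn Isolation>` can pass it to the generic setter unchanged.
impl Isolation for Box<dyn Isolation> {
    fn token(&self) -> u64 { (**self).token() }
}

struct StreamPrefs { isolation: Option<Box<dyn Isolation>> }

impl StreamPrefs {
    /// Accept any concrete isolation, boxing on the caller's behalf.
    fn set_isolation<T: Isolation>(&mut self, iso: T) -> &mut Self {
        self.isolation = Some(Box::new(iso));
        self
    }
}
```

Callers with a plain value avoid the `Box::new` boilerplate, while callers that already hold a `Box<dyn Isolation>` lose nothing.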