arti/doc/dev/notes/hssvc-ipt-algorithms.md

# HS service IPTs and descriptor publication algorithms

## Code structure

There are three main pieces:

 * IPT Establisher:
   One per IPT.
   Given a single IPT relay attempts to set up,
   verify, maintain, and report on the introduction point.
   Persistent (on-disk) state: none.

 * IPT Manager:
   One per HS.
   Selects IPTs, creates and destroys IPT establishers,
   monitors their success/failure, etc.
   Persistent (on-disk) state:
   current set of IPT Relays.
   Optional persistent (on-disk) state:
   current list of IPTs and their (last) states, fault counters, etc.,
   including secret keys necessary to re-stablish that IPT;
   all previous descriptor contents (`IptSetForDescriptor`)
   issued to hsdir publisher,
   that have not yet expired.

 * hsdir publisher:
   One per HS.
   Identifies the hsdirs for the relevant time periods.
   Constructs descriptors according to the IPT manager's instructions,
   and publishes them to the hsdirs.
   Persistent (on-disk) state (optional):
   which versions (`IptSetForDescriptor`) are published where.

Output of the whole thing:
Stream of introduction requests,
done by passing an mpsc sender into the IPT Manager's constructor,
which is simply cloned and given to each IPT Establisher.

(Each IPT Establisher is told by the IPT Manager
when a descriptor mentioning that IPT is about to be published,
so that the IPT Establisher can reject introduction attempts
using an unpublished IPT.)

I think there are too many possible IPTs
to maintain experience information about IPTs we used to use;
the list of experience information would grow to the size of the network.
Is this true?
If not, would recording *all* our IPT experiences
lead to distinguishability ?

Some of the persistent state is optional:
for a persistent hidden service, we prefer to store this information,
to improve resilience after service restarts.
But we can work without it,
for example when we are operating an ephemeral service.

## IPT selection and startup for a new HS, overall behaviour

 * Select N suitable relays randomly to be IPTs

 * Attempt to establish and verify them, in parallel

 * Wait a short time
   and then publish a short-lifetime descriptor listing the ones
   set up so far (this gets us some working descriptors right away)

 * When we have all the IPTs set up, republish the descriptor.

(This behaviour follows from the detailed algorithm below.)

## Verification and monitoring (optional, probably not in v1)

After ESTABLISH_INTRO,
we attempt (via a 2nd set of circuits)
an INTRODUCE probe, to see if the IPT is working.

We do such probes periodically at random intervals.

NOTE: there is a behaviour/privacy risk here,
which should be properly considered before implementation.

## General operation, IPT selection

We maintain records of each still-possibly-relevant IPT.
(We distinguish "IPT",
an intended or established introduction point with particular keys etc.,
from an "IPT Relay", which is a relay at which we'll establish the IPT.)

We attempt to maintain a pool of N established and verified IPTs,
at N IPT Relays.

When we have fewer than N IPT Relays
that have `Establishing` or `Good` IPTs (see below)
and fewer than k*N IPT Relays overall,
we choose a new IPT Relay at random from the consensus
and try to establish an IPT on it.

(Rationale for the k*N limit:
we do want to try to replace faulty IPTs, but
we don't want an attacker to be able to provoke us into
rapidly churning through IPT candidates.)

When we select a new IPT Relay, we randomly choose a planned replacement time,
after which it becomes `Retiring`.

Additionally, any IPT becomes `Retiring`
after it has been used for a certain number of introductions
(c.f. C Tor `#define INTRO_POINT_MIN_LIFETIME_INTRODUCTIONS 16384`.)
When this happens we retain the IPT Relay,
and make new parameters to make a new IPT at the same Relay.

## IPT states

Each IPT Relay can have multiple IPTs,
but all but one are Retiring.

Each IPT can be in the following states:

 * `Establishing`:
   The IPT has been selected,
   but we are still establishing it
   and verifying it for the first time
   (either because we restarted, or because the HS was just created,
   or because our connect to the Tor network failed).
   It won't be published in any descriptor.

 * `Good`:
   The IPT is good.  We have a circuit to it,
   and the last verification was successful.
   This IPT will be included in descriptors.

 * `Faulty`:
   The IPT has been advertised but appears to be faulty.
   (For example, the circuit to it has collapsed
   and could not be reestablished.)
   But we won't publish it in any descriptor.
   We will allow the re-establishment attempt to proceed,
   but if it doesn't yield success within a reasonable time,
   we will try to replace this IPT with another IPT.

 * `Retiring`:
   We have reached the IPT's planned replacement time,
   or the IPT has been used for many rendezvous requests.
   (We will continue to maintain our circuit to it
   so long as descriptors with it are valid.)

(`Establishing/Good/Faulty` are reported by the IPT Establisher
to the IPT Manager.
`Retiring` is actually orthogonal, and dealt with by the IPT Manager.)

We also maintain for each IPT:

 * The duration of the last or current establishment attempt.

 * The latest expiry time of any descriptor that mentions it
   that we published (or tried to).

 * A fault counter (per IPT Relay, not per IPT)
   which is incremented each time the IPT enters the state `Faulty`.

An IPT is removed from our records, and we give up on it,
when it is no longer `Good` or `Establishing`
and all descriptors that mentioned it have expired.

(Until all published descriptors mentioning an IPT expire,
we consider ourselves bound by those previously-published descriptors,
and try to maintain the IPT.
TODO: Allegedly this is unnecessary, but I don't see how it could be.)

When we lose our circuit to an IPT,
we look at the `ErrorKind` to try to determine
if the fault was local (and would therefore affect all relays and IPTs):

 * `TorAccessFailed`, `LocalNetworkError`, `ExternalToolFailed`
   and perhaps others:
   Return the IPT to `Establishing`.

 * Others: declare the IPT `Faulty`.

If our verification probe fails,
but the circuit to the IPT appears to remain up:

 * If we didn't manage to build the test circuit to the IPT,
   check the `ErrorKind`, as above.

 * If we managed to build the test circuit to the IPT,
   but the probe failed (or the probe payload didn't arrive),
   declare the IPT `Faulty`.

## IPT sets and lifetimes

We remember every IPT we have published that is still valid.

At each point in time we have an idea of set of IPTs we want to publish.
The possibilities are:

 * `Certain`:
   We are sure of which IPTs we want to publish.
   We try to do so, talking to hsdirs as necessary,
   updating any existing information.
   (We also republish to an hsdir if its descriptor will expire soon,
   or we haven't published there since Arti was restarted.)

 * `Unknown`:
   We have no idea which IPTs to publish.
   We leave whatever is on the hsdirs as-is.

 * `Uncertain`:
   We have some IPTs we could publish,
   but we're not confident about them.
   We publish these to a particular hsdir if:
    - our last-published descriptor has expired
    - or it will expire soon
    - or if we haven't published since Arti was restarted.

The idea of what to publish is calculated as follows:

 * If we have at least N `Good` IPTs: `Certain`.
   (We publish the "best" N IPTs for some definition of "best".
   TODO: should we use the fault count?  recency?)

 * Unless we have at least one `Good` IPT: `Unknown`.

 * Otherwise: if there are IPTs in `Establishing`,
   and they have been in `Establishing` only a short time [1]:
   `Unknown`; otherwise `Uncertain`.

The effect is that we delay publishing an initial descriptor
by at most 1x the fastest IPT setup time,
at most doubling the initial setup time.

Each update to the IPT set that isn't `Unknown` comes with a
proposed descriptor expiry time,
which is used if the descriptor is to be actually published.
The proposed descriptor lifetime for `Uncertain`
is the minimum (30 minutes).
Otherwise, we double the lifetime each time,
unless any IPT in the previous descriptor was declared `Faulty`,
in which case we reset it back to the minimum.
TODO: Perhaps we should just pick fixed short and long lifetimes instead,
to limit distinguishability.

(Rationale: if IPTs are regularly misbehaving,
we should be cautious and limit our exposure to the damage.)

[1] NOTE: We wait a "short time" between establishing our first IPT,
and publishing an incomplete (<N) descriptor -
this is a compromise between
availability (publishing as soon as we have any working IPT)
and
exposure and hsdir load
(which would suggest publishing only when our IPT set is stable).
One possible strategy is to wait as long again
as the time it took to establish our first IPT.
Another is to somehow use our circuit timing estimator.

## Descriptor publication

The descriptor output from the IPT maintenance algorithm is
an updated (`postage::watch`) `IptSetStatus`:

```
enum IptSetStatus {
    Unknown,
    Certain(IptSetForDescriptor),
    Uncertain(IptSetForDescriptor),
}
struct IptSetForDescriptor {
    ipts: list of introduction points for descriptor
    expiry_time: Instant,
}
```

We run a publication algorithm separately for each hsdir:

We record for each hsdir what we have published.

We attempt publication in the following cases:

 * `Certain`, if: the IPT list has changed from what was published,
   and we haven't published a `Certain` set recently
 * `Uncertain`, if: nothing is published,
   or what is published will expire soon,
   or we haven't published since Arti was restarted

If a publication attempt failed
we block further attempts
according to an exponential backoff schedule;
when the timer expires we reconsider
if and what we want to publish.

## Tuning parameters

TODO: Review these tuning parameters both for value and origin.
Some of these may be in `param-spec.txt` section "8. V3 onion service parameters"
Some of them may be in C Tor.

 * N, number of IPTs to try to maintain:
   configurable, default is 3, max is 20.
   (rend-spec-v3 2.5.4 NUM_INTRO_POINT)

 * k*N: Maximum number of IPTs including replaced faulty ones.
   (We may actually maintain more than this when we are have *retiring* IPTs,
   but this doesn't expose us to IPT churn since attackers can't
   force us to retire IPTs.

 * IPT replacement time: 4..7 days (uniform random)
   TODO: what is the right value here?  (Should we do time-based rotation at all?)

 * "Soon" for "if the published descriptor will expire soon":
   10 minutes.

 * Verification probe interval:
   descriptor expiry time minus 15 minutes.

 * Backoff schedule for hsdir publication.

## Load balancing (and maybe failover)

This is a sketch, only.
TODO: Look at what Onion Balance does before implementing this.

If it's desired to allow multiple Arti processes to serve a single HS:

The shards will have the IPT Establishers.

There will be one central IPT Manager
(perhaps with a failover).

Each shard will have an IPT Manager Stub
which receives instructions from,
and reports experiences to,
the central IPT Manager.