arti/doc/dev/notes/hssvc-ipt-algorithms.md

# HS service IPTs and descriptor publication algorithms

## Code structure

There are three main pieces:

 * IPT Establisher:
   One per IPT.
   Given a single IPT Relay, attempts to set up,
   verify, maintain, and report on the introduction point.
   Persistent (on-disk) state: none.

 * IPT Manager:
   One per HS.
   Selects IPTs, creates and destroys IPT establishers,
   monitors their success/failure, etc.
   Persistent (on-disk) state:
   current set of IPT Relays.
   Optional persistent (on-disk) state:
   current list of IPTs and their (last) states, fault counters, etc.,
   including secret keys necessary to re-establish that IPT;
   all previous descriptor contents (`IptSetForDescriptor`)
   issued to hsdir publisher,
   that have not yet expired.

 * hsdir publisher:
   One per HS.
   Identifies the hsdirs for the relevant time periods.
   Constructs descriptors according to the IPT manager's instructions,
   and publishes them to the hsdirs.
   Persistent (on-disk) state (optional):
   which versions (`IptSetForDescriptor`) are published where.

Output of the whole thing:
Stream of introduction requests,
done by passing an mpsc sender into the IPT Manager's constructor,
which is simply cloned and given to each IPT Establisher.
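
A minimal sketch of that plumbing, for illustration only.
It uses `std::sync::mpsc` so that it is self-contained;
the real code may well use an async channel instead.
`IntroRequest`, the `ipt_index` field, and the method names are hypothetical.

```rust
use std::sync::mpsc::Sender;

/// Hypothetical stand-in for an incoming introduction request.
#[derive(Debug, PartialEq)]
pub struct IntroRequest {
    /// Which IPT received the request (illustrative only).
    pub ipt_index: usize,
}

/// Hypothetical IPT Establisher: owns its own clone of the sender.
pub struct IptEstablisher {
    ipt_index: usize,
    requests_tx: Sender<IntroRequest>,
}

impl IptEstablisher {
    /// Forward one introduction request to the shared output stream.
    /// Returns false if the receiving end has gone away.
    pub fn deliver(&self) -> bool {
        self.requests_tx
            .send(IntroRequest { ipt_index: self.ipt_index })
            .is_ok()
    }
}

/// Hypothetical IPT Manager: keeps the sender passed to its constructor.
pub struct IptManager {
    requests_tx: Sender<IntroRequest>,
    next_index: usize,
}

impl IptManager {
    pub fn new(requests_tx: Sender<IntroRequest>) -> Self {
        IptManager { requests_tx, next_index: 0 }
    }

    /// Create an establisher, handing it a clone of the sender.
    pub fn new_establisher(&mut self) -> IptEstablisher {
        let ipt_index = self.next_index;
        self.next_index += 1;
        IptEstablisher { ipt_index, requests_tx: self.requests_tx.clone() }
    }
}
```

The caller keeps the receiving end of the channel,
and sees the requests from all establishers as a single stream.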

(Each IPT Establisher is told by the IPT Manager
when a descriptor mentioning that IPT is about to be published,
so that the IPT Establisher can reject introduction attempts
using an unpublished IPT.)

I think there are too many possible IPTs
to maintain experience information about IPTs we used to use;
the list of experience information would grow to the size of the network.
Is this true?
If not, would recording *all* our IPT experiences
lead to distinguishability?

Some of the persistent state is optional:
for a persistent hidden service, we prefer to store this information,
to improve resilience after service restarts.
But we can work without it,
for example when we are operating an ephemeral service.

## IPT selection and startup for a new HS, overall behaviour

 * Select N suitable relays randomly to be IPTs

 * Attempt to establish and verify them, in parallel

 * Wait a short time
   and then publish a short-lifetime descriptor listing the ones
   set up so far (this gets us some working descriptors right away)

 * When we have all the IPTs set up, republish the descriptor.

(This behaviour follows from the detailed algorithm below.)

## Verification and monitoring (optional, probably not in v1)

After ESTABLISH_INTRO,
we attempt (via a 2nd set of circuits)
an INTRODUCE probe, to see if the IPT is working.

We do such probes periodically at random intervals.

NOTE: there is a behaviour/privacy risk here,
which should be properly considered before implementation.

## General operation, IPT selection

We maintain records of each still-possibly-relevant IPT.
(We distinguish "IPT",
an intended or established introduction point with particular keys etc.,
from an "IPT Relay", which is a relay at which we'll establish the IPT.)

We attempt to maintain a pool of N established and verified IPTs,
at N IPT Relays.

When we have fewer than N IPT Relays
that have `Establishing` or `Good` IPTs (see below)
and fewer than k*N IPT Relays overall,
we choose a new IPT Relay at random from the consensus
and try to establish an IPT on it.

(Rationale for the k*N limit:
we do want to try to replace faulty IPTs, but
we don't want an attacker to be able to provoke us into
rapidly churning through IPT candidates.)
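
The selection rule above could be expressed roughly as follows.
This is a sketch only: `IptRelayRecord` and `want_new_ipt_relay` are hypothetical names,
and it assumes each IPT Relay is summarised by the state of its current (non-retiring) IPT.

```rust
/// State reported by the IPT Establisher (names from this document).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum IptState {
    Establishing,
    Good,
    Faulty,
}

/// Hypothetical per-relay record.
pub struct IptRelayRecord {
    /// State of the relay's current (non-retiring) IPT.
    pub current_ipt: IptState,
}

/// Returns true if we should choose another IPT Relay from the consensus.
///
/// `n` is the target number of IPTs; `k` is the churn-limiting multiplier.
pub fn want_new_ipt_relay(relays: &[IptRelayRecord], n: usize, k: usize) -> bool {
    // Relays whose current IPT is `Establishing` or `Good`.
    let usable = relays
        .iter()
        .filter(|r| matches!(r.current_ipt, IptState::Establishing | IptState::Good))
        .count();
    // Both conditions must hold: too few usable relays, and room under k*N.
    usable < n && relays.len() < k * n
}
```

With N=3 and k=2, six `Faulty` relays block further selection
until some of them are removed from our records.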

When we select a new IPT Relay, we randomly choose a planned replacement time,
after which it becomes `Retiring`.

Additionally, any IPT becomes `Retiring`
after it has been used for a certain number of introductions
(cf. C Tor `#define INTRO_POINT_MIN_LIFETIME_INTRODUCTIONS 16384`.)
When this happens we retain the IPT Relay,
and make new parameters to make a new IPT at the same Relay.
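
One possible reading of the two retirement triggers, as a sketch.
It assumes the planned replacement time belongs to the IPT Relay
and the introduction count belongs to the individual IPT;
`RetirementAction` and `retirement_action` are hypothetical names.

```rust
use std::time::Instant;

/// What to do with an IPT after checking the retirement triggers.
#[derive(Debug, PartialEq, Eq)]
pub enum RetirementAction {
    /// Neither trigger has fired.
    Keep,
    /// Retire this IPT, and make a new IPT (new parameters) at the same relay.
    ReplaceIptAtSameRelay,
    /// The relay has reached its planned replacement time.
    RetireRelay,
}

/// Check both triggers; the time-based one takes precedence (an assumption).
pub fn retirement_action(
    now: Instant,
    relay_replacement_time: Instant,
    introductions: u64,
    max_introductions: u64,
) -> RetirementAction {
    if now >= relay_replacement_time {
        RetirementAction::RetireRelay
    } else if introductions >= max_introductions {
        RetirementAction::ReplaceIptAtSameRelay
    } else {
        RetirementAction::Keep
    }
}
```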

## IPT states

Each IPT Relay can have multiple IPTs,
but all but one are Retiring.

Each IPT can be in the following states:

 * `Establishing`:
   The IPT has been selected,
   but we are still establishing it
   and verifying it for the first time
   (either because we restarted, or because the HS was just created,
   or because our connection to the Tor network failed).
   It won't be published in any descriptor.

 * `Good`:
   The IPT is good.  We have a circuit to it,
   and the last verification was successful.
   This IPT will be included in descriptors.

 * `Faulty`:
   The IPT has been advertised but appears to be faulty.
   (For example, the circuit to it has collapsed
   and could not be reestablished.)
   We won't publish it in any further descriptor.
   We will allow the re-establishment attempt to proceed,
   but if it doesn't yield success within a reasonable time,
   we will try to replace this IPT with another IPT.

 * `Retiring`:
   We have reached the IPT's planned replacement time,
   or the IPT has been used for many introduction requests.
   (We will continue to maintain our circuit to it
   so long as descriptors with it are valid.)

(`Establishing/Good/Faulty` are reported by the IPT Establisher
to the IPT Manager.  
`Retiring` is actually orthogonal, and dealt with by the IPT Manager.)

We also maintain for each IPT:

 * The duration of the last or current establishment attempt.

 * The latest expiry time of any descriptor that mentions it
   that we published (or tried to).

 * A fault counter (per IPT Relay, not per IPT)
   which is incremented each time the IPT enters the state `Faulty`.

An IPT is removed from our records, and we give up on it,
when it is no longer `Good` or `Establishing`
and all descriptors that mentioned it have expired.
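
The removal rule could be sketched like this.
`IptRecord` and `can_forget` are hypothetical names,
and treating a `Retiring` IPT as no longer wanted (whatever the Establisher reports)
is an assumption of this sketch.

```rust
use std::time::Instant;

/// State reported by the IPT Establisher (names from this document).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum IptState {
    Establishing,
    Good,
    Faulty,
}

/// Hypothetical record kept for each IPT.
pub struct IptRecord {
    pub state: IptState,
    /// `Retiring` is orthogonal to the Establisher-reported state.
    pub retiring: bool,
    /// Latest expiry of any descriptor mentioning this IPT that we
    /// published (or tried to); `None` if it was never in a descriptor.
    pub last_descriptor_expiry: Option<Instant>,
}

/// Returns true if the IPT can be dropped from our records.
pub fn can_forget(ipt: &IptRecord, now: Instant) -> bool {
    let wanted = !ipt.retiring
        && matches!(ipt.state, IptState::Establishing | IptState::Good);
    // Still bound by a previously-published descriptor?
    let still_advertised = ipt.last_descriptor_expiry.map_or(false, |t| now < t);
    !wanted && !still_advertised
}
```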

(Until all published descriptors mentioning an IPT expire,
we consider ourselves bound by those previously-published descriptors,
and try to maintain the IPT.
TODO: Allegedly this is unnecessary, but I don't see how it could be.)

When we lose our circuit to an IPT,
we look at the `ErrorKind` to try to determine
if the fault was local (and would therefore affect all relays and IPTs):

 * `TorAccessFailed`, `LocalNetworkError`, `ExternalToolFailed`
   and perhaps others:
   Return the IPT to `Establishing`.

 * Others: declare the IPT `Faulty`.
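
As a sketch, the classification might look like the following.
The `ErrorKind` here is a local stand-in listing only the kinds named above
(plus a catch-all `Other`), so that the example is self-contained;
`state_after_circuit_loss` is a hypothetical name.

```rust
/// Local stand-in for the error kinds named in this document.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ErrorKind {
    TorAccessFailed,
    LocalNetworkError,
    ExternalToolFailed,
    /// Anything else (stand-in for all other kinds).
    Other,
}

/// State reported by the IPT Establisher (names from this document).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum IptState {
    Establishing,
    Good,
    Faulty,
}

/// Decide the IPT's new state after we lose our circuit to it.
pub fn state_after_circuit_loss(kind: ErrorKind) -> IptState {
    match kind {
        // Probably a local problem, affecting all relays and IPTs:
        // don't blame this IPT, just go back to establishing it.
        ErrorKind::TorAccessFailed
        | ErrorKind::LocalNetworkError
        | ErrorKind::ExternalToolFailed => IptState::Establishing,
        // Otherwise, blame the IPT.
        ErrorKind::Other => IptState::Faulty,
    }
}
```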

If our verification probe fails,
but the circuit to the IPT appears to remain up:

 * If we didn't manage to build the test circuit to the IPT,
   check the `ErrorKind`, as above.

 * If we managed to build the test circuit to the IPT,
   but the probe failed (or the probe payload didn't arrive),
   declare the IPT `Faulty`.

## IPT sets and lifetimes

We remember every IPT we have published that is still valid.

At each point in time we have an idea of the set of IPTs we want to publish.
The possibilities are:

 * `Certain`:
   We are sure of which IPTs we want to publish.
   We try to do so, talking to hsdirs as necessary,
   updating any existing information.
   (We also republish to an hsdir if its descriptor will expire soon,
   or we haven't published there since Arti was restarted.)

 * `Unknown`:
   We have no idea which IPTs to publish.
   We leave whatever is on the hsdirs as-is.

 * `Uncertain`:
   We have some IPTs we could publish,
   but we're not confident about them.
   We publish these to a particular hsdir if:
    - our last-published descriptor has expired
    - or it will expire soon
    - or if we haven't published since Arti was restarted.

The idea of what to publish is calculated as follows:

 * If we have at least N `Good` IPTs: `Certain`.
   (We publish the "best" N IPTs for some definition of "best".
   TODO: should we use the fault count?  recency?)

 * If we have no `Good` IPT at all: `Unknown`.

 * Otherwise: if there are IPTs in `Establishing`,
   and they have been in `Establishing` only a short time [1]:
   `Unknown`; otherwise `Uncertain`.

The effect is that we delay publishing an initial descriptor
by at most 1x the fastest IPT setup time,
at most doubling the initial setup time.
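
The calculation above, as a sketch.
`PublishIntent`, `IptSummary` and `publish_intent` are hypothetical names.
The sketch assumes that *any* IPT which has been `Establishing` for less than
the "short time" is enough to keep us at `Unknown`;
the text could also be read as requiring all of them to be recent.

```rust
use std::time::Duration;

/// Our idea of what to publish (without the IPT set itself).
#[derive(Debug, PartialEq, Eq)]
pub enum PublishIntent {
    Unknown,
    Certain,
    Uncertain,
}

/// Hypothetical summary of our current IPTs.
pub struct IptSummary {
    /// Number of `Good` IPTs.
    pub good: usize,
    /// For each `Establishing` IPT, how long it has been in that state.
    pub establishing_for: Vec<Duration>,
}

/// Apply the three rules in order.
pub fn publish_intent(s: &IptSummary, n: usize, short_time: Duration) -> PublishIntent {
    if s.good >= n {
        return PublishIntent::Certain;
    }
    if s.good == 0 {
        return PublishIntent::Unknown;
    }
    // Some Good IPTs, but fewer than N: wait briefly for the stragglers.
    let still_settling = s.establishing_for.iter().any(|d| *d < short_time);
    if still_settling {
        PublishIntent::Unknown
    } else {
        PublishIntent::Uncertain
    }
}
```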

Each update to the IPT set that isn't `Unknown` comes with a
proposed descriptor expiry time,
which is used if the descriptor is to be actually published.
The proposed descriptor lifetime for `Uncertain`
is the minimum (30 minutes).
Otherwise, we double the lifetime each time,
unless any IPT in the previous descriptor was declared `Faulty`,
in which case we reset it back to the minimum.
TODO: Perhaps we should just pick fixed short and long lifetimes instead,
to limit distinguishability.

(Rationale: if IPTs are regularly misbehaving,
we should be cautious and limit our exposure to the damage.)
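
A sketch of the lifetime calculation.
The 30 minute minimum is from the text above.
The upper bound (`max`) and the choice to start the first `Certain` descriptor
at the minimum are assumptions of this sketch, not something this document specifies;
`next_lifetime` is a hypothetical name.

```rust
use std::time::Duration;

/// Minimum proposed descriptor lifetime (30 minutes).
pub const MIN_LIFETIME: Duration = Duration::from_secs(30 * 60);

/// Propose the lifetime for the next descriptor.
///
/// * `previous`: lifetime proposed for the previous descriptor, if any.
/// * `uncertain`: true if the new IPT set is `Uncertain`.
/// * `previous_had_fault`: true if any IPT in the previous descriptor
///   was declared `Faulty`.
/// * `max`: hypothetical upper bound on the lifetime.
pub fn next_lifetime(
    previous: Option<Duration>,
    uncertain: bool,
    previous_had_fault: bool,
    max: Duration,
) -> Duration {
    if uncertain || previous_had_fault {
        return MIN_LIFETIME;
    }
    match previous {
        // Assumption: with no history, start at the minimum.
        None => MIN_LIFETIME,
        // Double, without overflowing, and without exceeding the bound.
        Some(p) => p.saturating_mul(2).min(max).max(MIN_LIFETIME),
    }
}
```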

[1] NOTE: We wait a "short time" between establishing our first IPT,
and publishing an incomplete (<N) descriptor -
this is a compromise between
availability (publishing as soon as we have any working IPT)
and
exposure and hsdir load
(which would suggest publishing only when our IPT set is stable).
One possible strategy is to wait as long again
as the time it took to establish our first IPT.
Another is to somehow use our circuit timing estimator.

## Descriptor publication

The descriptor output from the IPT maintenance algorithm is
an updated (`postage::watch`) `IptSetStatus`:

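```
use std::time::Instant;

enum IptSetStatus {
    Unknown,
    Certain(IptSetForDescriptor),
    Uncertain(IptSetForDescriptor),
}

/// Placeholder: the per-IPT details that go into a descriptor.
struct IptInDescriptor;

struct IptSetForDescriptor {
    /// List of introduction points for the descriptor.
    ipts: Vec<IptInDescriptor>,
    /// Proposed descriptor expiry time.
    expiry_time: Instant,
}
```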

We run a publication algorithm separately for each hsdir:

We record for each hsdir what we have published.

We attempt publication in the following cases:

 * `Certain`, if: the IPT list has changed from what was published,
   and we haven't published a `Certain` set recently
 * `Uncertain`, if: nothing is published,
   or what is published will expire soon,
   or we haven't published since Arti was restarted
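
The per-hsdir decision might be sketched as follows.
`Wanted`, `HsDirRecord` and `should_publish` are hypothetical names,
and `rate_limit` stands for "recently", which this document does not quantify.
The sketch also folds in the rule from "IPT sets and lifetimes"
that a `Certain` set is republished when the published descriptor will expire soon
or we haven't published since Arti was restarted.

```rust
use std::time::{Duration, Instant};

/// Current `IptSetStatus`, without its payload.
pub enum Wanted {
    Unknown,
    Certain,
    Uncertain,
}

/// Hypothetical record of what we have published to one hsdir.
pub struct HsDirRecord {
    /// Expiry of the descriptor we published there since Arti started;
    /// `None` if we have not published there since the restart.
    pub published_expiry: Option<Instant>,
    /// True if the wanted IPT list differs from what was published.
    pub ipts_differ: bool,
    /// When we last published a `Certain` set to this hsdir.
    pub last_certain_publish: Option<Instant>,
}

/// Decide whether to attempt publication to this hsdir now.
pub fn should_publish(
    wanted: &Wanted,
    dir: &HsDirRecord,
    now: Instant,
    soon: Duration,
    rate_limit: Duration,
) -> bool {
    // Nothing published since restart, already expired, or expiring soon.
    let stale = match dir.published_expiry {
        None => true,
        Some(expiry) => expiry <= now + soon,
    };
    match wanted {
        Wanted::Unknown => false,
        Wanted::Uncertain => stale,
        Wanted::Certain => {
            let recently = dir
                .last_certain_publish
                .map_or(false, |t| now < t + rate_limit);
            stale || (dir.ipts_differ && !recently)
        }
    }
}
```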

If a publication attempt failed
we block further attempts
according to an exponential backoff schedule;
when the timer expires we reconsider
if and what we want to publish.
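
A sketch of one possible backoff schedule.
The base delay, the cap, and the absence of jitter are all assumptions;
this document leaves the schedule as a tuning parameter.
`backoff_delay` is a hypothetical name.

```rust
use std::time::Duration;

/// Delay to wait after `failures` consecutive failed publication attempts.
///
/// The delay is `base` after the first failure and doubles after each
/// further failure, up to `max`.  With no failures, there is no delay.
pub fn backoff_delay(failures: u32, base: Duration, max: Duration) -> Duration {
    if failures == 0 {
        return Duration::ZERO;
    }
    // 2^(failures-1), saturating rather than overflowing.
    let factor = 2u32.checked_pow(failures - 1).unwrap_or(u32::MAX);
    base.saturating_mul(factor).min(max)
}
```

When the delay has elapsed we do not blindly retry:
we re-run the publication decision, since the wanted IPT set may have changed.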

## Tuning parameters

TODO: Review these tuning parameters both for value and origin.
Some of these may be in `param-spec.txt` section "8. V3 onion service parameters".
Some of them may be in C Tor.

 * N, number of IPTs to try to maintain:
   configurable, default is 3, max is 20.
   (rend-spec-v3 2.5.4 NUM_INTRO_POINT)

 * k*N: Maximum number of IPTs including replaced faulty ones.
   (We may actually maintain more than this when we have *retiring* IPTs,
   but this doesn't expose us to IPT churn since attackers can't
   force us to retire IPTs.)

 * IPT replacement time: 4..7 days (uniform random)
   TODO: what is the right value here?  (Should we do time-based rotation at all?)

 * "Soon" for "if the published descriptor will expire soon":
   10 minutes.

 * Verification probe interval:
   descriptor expiry time minus 15 minutes.

 * Backoff schedule for hsdir publication.
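
For illustration, the concrete values above could be collected like this.
The constant and function names are hypothetical,
the lower bound of 1 on N is an assumption,
and the caller is expected to supply a uniform random sample in `[0, 1]`
(this sketch does not depend on any random number library).

```rust
use std::time::Duration;

/// Default and maximum for N, the number of IPTs to try to maintain.
pub const DEFAULT_NUM_IPTS: usize = 3;
pub const MAX_NUM_IPTS: usize = 20;

/// "Soon", for "the published descriptor will expire soon".
pub const EXPIRES_SOON: Duration = Duration::from_secs(10 * 60);

/// Bounds of the IPT replacement time (4..7 days).
pub const REPLACEMENT_MIN: Duration = Duration::from_secs(4 * 24 * 60 * 60);
pub const REPLACEMENT_MAX: Duration = Duration::from_secs(7 * 24 * 60 * 60);

/// Validate a configured N; `None` means "use the default".
pub fn checked_num_ipts(configured: Option<usize>) -> Option<usize> {
    match configured {
        None => Some(DEFAULT_NUM_IPTS),
        // Assumption: at least one IPT is required.
        Some(n) if (1..=MAX_NUM_IPTS).contains(&n) => Some(n),
        Some(_) => None,
    }
}

/// Map a uniform sample in [0, 1] onto the 4..7 day replacement range.
/// Out-of-range samples are clamped; non-finite samples are treated as 0.
pub fn replacement_delay(unit_sample: f64) -> Duration {
    let u = if unit_sample.is_finite() {
        unit_sample.clamp(0.0, 1.0)
    } else {
        0.0
    };
    REPLACEMENT_MIN + (REPLACEMENT_MAX - REPLACEMENT_MIN).mul_f64(u)
}
```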

## Load balancing (and maybe failover)

This is a sketch, only.
TODO: Look at what Onion Balance does before implementing this.

If it's desired to allow multiple Arti processes to serve a single HS:

The shards will have the IPT Establishers.

There will be one central IPT Manager
(perhaps with a failover).

Each shard will have an IPT Manager Stub
which receives instructions from,
and reports experiences to, 
the central IPT Manager.