Commit Graph

5328 Commits

Author SHA1 Message Date
Ian Jackson 696696857d Merge branch 'hashx_perf' into 'main'
hashx: Performance improvements for program generation

See merge request tpo/core/arti!1524
2023-08-23 09:24:13 +00:00
Nick Mathewson c2faedbca7 arti-client: Fix a couple more typos.
I spotted these while I was working on something else.
2023-08-22 16:23:51 -04:00
Nick Mathewson 7a37641aeb Use "typos-cli" to fix a bunch of typos. 2023-08-22 16:23:51 -04:00
Nick Mathewson 7f44281725 arti::cfg tests: Use fold to make nightly clippy happier 2023-08-22 12:24:51 -04:00
Nick Mathewson c0e050b640 Resolve a pair of warnings about redundant closures. 2023-08-22 12:24:51 -04:00
gabi-250 d633efc28b Merge branch 'netdir-todo-2-take-2' into 'main'
tor-netdir: Add separate functions for computing hsdirs for upload/download.

See merge request tpo/core/arti!1518
2023-08-22 15:46:10 +00:00
Nick Mathewson 232c6d957e hss: Improve comments in IptEstablisher::drop. 2023-08-22 10:51:19 -04:00
Nick Mathewson d792bc2a5f hss: Allow IptEstablisher to start in Advertised mode. 2023-08-22 10:51:19 -04:00
Nick Mathewson 980407a894 hss: switch to select_biased 2023-08-22 10:51:19 -04:00
Nick Mathewson 683e607db7 hss: change terminate oneshot to send "void".
We don't actually want to distinguish drop from not-drop.
2023-08-22 10:51:19 -04:00
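The "void" idea in the commit above can be sketched with the standard library alone. This is a minimal illustration, not the code from the commit: it uses `std::sync::mpsc` rather than a futures oneshot channel, and `Void`, `termination_pair`, and `is_terminated` are hypothetical names. Because the payload type has no values, nothing can ever be sent, so the only observable event is the sender going away.

```rust
use std::sync::mpsc::{channel, Receiver, Sender, TryRecvError};

/// An uninhabited type: no value of `Void` can be constructed,
/// so nothing can actually be sent on a channel that carries it.
pub enum Void {}

/// Create a termination handle (the sender) and its watcher (the receiver).
pub fn termination_pair() -> (Sender<Void>, Receiver<Void>) {
    channel()
}

/// Report whether the termination handle has gone away.
pub fn is_terminated(rx: &Receiver<Void>) -> bool {
    match rx.try_recv() {
        // Unreachable in practice: a `Void` value cannot exist.
        Ok(never) => match never {},
        // The sender is still alive and (necessarily) has sent nothing.
        Err(TryRecvError::Empty) => false,
        // The sender is gone, which is the only signal this channel can carry.
        Err(TryRecvError::Disconnected) => true,
    }
}
```

With this shape the receiver has exactly one thing to react to, the sender going away, and no "sent a value" case to handle separately.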
Nick Mathewson 2a20d1b05a hss: enable tor_proto/experimental-api
Needed for ClientCirc::wait_for_close
2023-08-22 10:51:18 -04:00
Nick Mathewson 1309bc6753 HSS: Use correct timeouts and delays in IptEstablisher 2023-08-22 10:50:43 -04:00
Nick Mathewson 28b8c9c31c HSS: Use a more accurate timeout for ESTABLISH_INTRO handshake. 2023-08-22 10:50:43 -04:00
Nick Mathewson ec6721ec94 HSS: Refactor RendRequest so we can return a stream of it.
We need a type that holds a rend_handshake::IntroRequest object
internally, but where we don't materialize that object from the
Introduce2 message inside the MsgHandler, since that's more crypto
than we want to put in that task.
2023-08-22 10:50:43 -04:00
Nick Mathewson 85c3820a5e HSS: Use DropNotifyWatchSender.
This ensures that the status becomes Faulty when the reactor exits.
2023-08-22 10:50:43 -04:00
Nick Mathewson 8439500e57 HSS: Implement start_accepting and drop for IptEstablisher.
This does not yet do exactly what's documented, but it's closer.
2023-08-22 10:50:43 -04:00
Nick Mathewson 36424540dd hss: launch task to establish introduce requests.
(This requires us to change the type of the data sent in the
stream. I hope to put it back soon.)
2023-08-22 10:50:43 -04:00
Nick Mathewson 07e7eabd3f hss: Once an ipt session is established, let it keep running. 2023-08-22 10:50:43 -04:00
Nick Mathewson 7c14371898 hss: make Ipt establisher code use an mpsc::Sender.
This solves some problems but introduces a few new ones; I've tried
to open comments for the latter.
2023-08-22 10:50:43 -04:00
Nick Mathewson d83ff291bf hss: Establish intro point by RelayIds. 2023-08-22 10:50:43 -04:00
Nick Mathewson 6021976466 proto: fix a comment to refer to circuits, not channels. 2023-08-22 10:50:43 -04:00
Gabriela Moldovan b1c54adae7 tor-netdir: Use an owned HsBlindId instead of a reference.
`HsBlindId` is `Copy`.
2023-08-22 15:48:07 +01:00
Gabriela Moldovan 59b94ed06c tor-netdir: Replace flat_map() with cartesian_product(). 2023-08-22 15:48:04 +01:00
Gabriela Moldovan 3e193ebd63 tor-netdir: Make `hs_dirs_upload` take an iterator instead of a slice (fmt). 2023-08-22 15:48:00 +01:00
Gabriela Moldovan d651b3e3a2 tor-netdir: Make `hs_dirs_upload` take an iterator instead of a slice. 2023-08-22 15:47:57 +01:00
Gabriela Moldovan 3090259b55 tor-netdir: Add TODO about making HsDirOp private.
When `hs_dirs` is removed this won't need to be public anymore.
2023-08-22 15:47:54 +01:00
Gabriela Moldovan b0c3fc73ca tor-hsclient: Use hs_dirs_download instead of the deprecated hs_dirs. 2023-08-22 15:47:51 +01:00
Gabriela Moldovan 83f26aebde tor-netdir: Deprecate hs_dirs(). 2023-08-22 15:47:47 +01:00
Gabriela Moldovan 47e24dc8eb tor-netdir: Add separate functions for computing hsdirs for upload/download.
The hsdir selection algorithm for uploads and downloads is different
enough to justify splitting `hs_dirs` into 2 different functions.
More specifically, when selecting the relays to upload a service's
descriptors to, the service's `hsids` need to be matched up with the
correct `ring` (using the time period) before applying `select_nodes` to
pick the replicas. This is not the case when downloading, because
for downloads we select relays from the current ring.
2023-08-22 15:47:44 +01:00
Gabriela Moldovan 7e4c850efd tor-netdir: Add private helpers for selecting hsdirs.
These will become useful when we split `hs_dirs()` into 2 separate
functions (one for uploading/services, and another for
downloading/clients).
2023-08-22 15:47:41 +01:00
Nick Mathewson f248607119 Merge branch 'send_raw_msg' into 'main'
proto: new ClientCirc::send_raw_msg function.

Closes #1010

See merge request tpo/core/arti!1525
2023-08-22 14:19:54 +00:00
Ian Jackson 403c931072 Merge branch 'upgrade_num_enum' into 'main'
Upgrade num_enum dependency to 0.7

See merge request tpo/core/arti!1530
2023-08-22 14:15:12 +00:00
Nick Mathewson 78328f096e Merge branch 'redundant_config_links' into 'main'
Resolve warnings about ambiguous/redundant doc links

See merge request tpo/core/arti!1531
2023-08-22 14:07:44 +00:00
Nick Mathewson 109efd3152 Merge branch 'hss_ct_from_parts' into 'main'
hsservice: Compute rendezvous points correctly.

See merge request tpo/core/arti!1521
2023-08-22 13:55:17 +00:00
Nick Mathewson b4e0595f8d proto: Add crossrefs between start_conversation and send_raw_msg 2023-08-22 09:35:48 -04:00
Nick Mathewson 80c921c637 Resolve warnings about ambiguous/redundant doc links
Nightly rustdoc now warns if you have a link that isn't necessary,
and if you have a link that might refer to two different things.
2023-08-22 09:00:33 -04:00
Nick Mathewson 1ddf637572 hsservice: Fix an error message. 2023-08-22 08:04:12 -04:00
Nick Mathewson 3352b1373b hsservice: Compute rendezvous points correctly.
This duplicates some code from hsclient as noted in the comments;
it might be good to reduce this, but the remaining nontrivial
duplication is small, and the logic flow is slightly different
because of the two-step process.
2023-08-22 08:04:12 -04:00
Nick Mathewson 3e34cb6d33 Lower linkspec list parsing into new tor-linkspec method. 2023-08-22 08:04:12 -04:00
Micah Elizabeth Scott 26b5ae9a3c hashx: Use a boxed slice for Program storage
This is a very small change that converts our Vec cheaply into a boxed
slice during program generation. Program generation speed shows no
changes, and there's no change when using compiled hashes, but there is a
surprisingly effective 10% speedup to interpreted hash execution.

Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-08-21 15:27:28 -07:00
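The Vec-to-boxed-slice conversion described above can be sketched as follows. This is an illustration under assumptions, not the hashx code: `Instruction`, `Program`, and `generate` are hypothetical stand-ins, and only the standard `Vec::into_boxed_slice` call is taken as given.

```rust
/// Hypothetical stand-in for one generated instruction.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Instruction(pub u32);

/// Hypothetical program type that stores its instructions as a boxed slice.
pub struct Program {
    instructions: Box<[Instruction]>,
}

impl Program {
    /// Build the instructions in a growable Vec, then freeze them.
    pub fn generate(count: u32) -> Self {
        let buf: Vec<Instruction> = (0..count).map(Instruction).collect();
        // `into_boxed_slice` discards any excess capacity and yields a
        // fixed-size, heap-allocated slice.
        Program {
            instructions: buf.into_boxed_slice(),
        }
    }

    /// Read-only view used during execution.
    pub fn instructions(&self) -> &[Instruction] {
        &self.instructions
    }
}
```

After the conversion the storage can no longer grow, which matches a program that is fully generated before it is ever executed.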
Micah Elizabeth Scott 52dbdbea3b hashx: Assembly buffer sizing and tidying
I was looking for ways to optimize out the many redundant capacity
checks in the Assembler. I didn't find any promising approaches, but
I also saw no evidence that it was an important bottleneck. (A simple
unsafe fix didn't improve any important metrics.)

While I was in there, I tightened up the buffer size definitions for
both x86_64 and aarch64, and added assertions to test the limits we
set for the size of prologue, epilogue, and single instructions.

I kept some of the inlining and data type tweaks, even though benchmarks
show no difference. They seem like a step in the right direction, from
the disassembly at least.

Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-08-21 15:27:28 -07:00
Micah Elizabeth Scott acf598e785 hashx: avoid surprising overhead of enum code() method
This is a very simple change that avoids a surprising performance
pitfall: using the code() method on an enum from another crate
caused a non-inlined function call in code where we otherwise expect
a high level of compiler optimization. Replacing code() with a cast
to u8 avoids this function call and allows more intensive optimization
at the call site.

Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-08-21 15:27:28 -07:00
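The two ways of obtaining an enum's numeric code that the commit contrasts can be sketched like this. The `Reg` enum and both functions are hypothetical; the real enum and its `code()` method live in another crate and are not shown in this log.

```rust
/// Hypothetical fieldless enum with explicit one-byte discriminants.
#[repr(u8)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Reg {
    R0 = 0,
    R1 = 1,
    R2 = 2,
    R3 = 3,
}

impl Reg {
    /// Accessor method. When a method like this is defined in a different
    /// crate, the compiler may emit a real function call for it.
    pub fn code(self) -> u8 {
        match self {
            Reg::R0 => 0,
            Reg::R1 => 1,
            Reg::R2 => 2,
            Reg::R3 => 3,
        }
    }
}

/// The replacement: a plain cast, which reads the discriminant directly.
pub fn code_via_cast(r: Reg) -> u8 {
    r as u8
}
```

Both forms return the same value; the cast simply leaves no function call for the optimizer to see through.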
Micah Elizabeth Scott 0af908bcf2 hashx: Rearrange destination register validator for performance
This hoists a few decisions out of the innermost portions of
choose_dst_reg, by moving what we can out of dst_register_allowed.

Wallclock time benchmarks:
  generate-interp improves, -6.0%

Cachegrind benchmarks:
  generate_interp_1000x, -5.0% instructions, -11.6% L2 access, -6% RAM

Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-08-21 15:27:28 -07:00
Micah Elizabeth Scott ceacd5c988 hashx: New approach to avoid memcpy in Program
I was trying to eliminate all the places where we copied a Program
(about 4100 bytes) except for the one final copy into a Box; but that
approach was proving too annoying. Even returning a Program via Result
will cause multiple unnecessary copies that don't optimize out.

This patch switches approaches, and instead allocates a Vec<Instruction>
presized to the correct capacity. This allocation is made as early as
possible and retained for the lifetime of the program if necessary.
This means we'll never avoid a heap allocation, but we can always
avoid extra copies and we don't need a separate Box for interpreted
programs.

Performance effects are subtle. Overall wallclock time doesn't change
much. Cachegrind shows some accesses moving up from RAM to L2 cache.
Using GDB to probe memcpy sizes shows that large (>1024 byte) memcpy
calls are now totally gone in the generate-interp test.

Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-08-21 15:27:28 -07:00
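The presized-allocation approach can be sketched as below. The names and the capacity constant are assumptions made for illustration; the only property relied on is the documented `Vec` guarantee that pushing within the existing capacity does not reallocate.

```rust
/// Hypothetical upper bound on program length (not taken from hashx).
pub const MAX_INSTRUCTIONS: usize = 512;

/// Hypothetical stand-in for one generated instruction.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Instruction(pub u64);

/// Allocate the instruction buffer once, as early as possible.
pub fn new_program_buffer() -> Vec<Instruction> {
    Vec::with_capacity(MAX_INSTRUCTIONS)
}

/// Fill the buffer in place. As long as `count` fits in the existing
/// capacity, `push` never reallocates, so the instructions are written
/// once and never copied to a new location.
pub fn fill(program: &mut Vec<Instruction>, count: usize) {
    assert!(count <= program.capacity());
    for i in 0..count {
        program.push(Instruction(i as u64));
    }
}
```

The buffer's address and capacity are the same before and after `fill`, which is the property that removes the large copies.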
Micah Elizabeth Scott ee6acfa5cd hashx: Rewrite RegisterSet again to reduce CPU frontend stalls
Closer inspection of the CPU counters showed that the branching in
RegisterSet::index() was a big problem, contributing to the overall
CPU frontend stall bottleneck in program generation.

This new version is less general, and closer to the approach used by
the original C implementation. We store a sorted ArrayVec of in-set
registers, and most operations construct the RegisterSet only once
using a combined filter predicate.

Choosing a register from a set is now cheaper in branches, instructions,
and L1 cache space. We now very rarely manipulate an entire RegisterSet
in any way other than by selecting a register randomly. (Just for the
register R5 special case.)

Wallclock time benchmarks:
  generate-interp improves, -7.0%
  generate-x86_64 improves, -7.2%

Cachegrind benchmarks:
  generate_interp_1000x, more total instructions run but a large
  decrease in frontend cache misses. +4.6% instructions, +11% L1
  accesses, -99% L2 access, -40% RAM access.

  generate_compiled_100x, +4.0% instructions, +9.4% L1 access.
  cache miss improvements: -57% L2 access, -25% RAM access.

Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-08-21 15:27:28 -07:00
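A register set stored as a sorted list of in-set registers, built once from a filter predicate, might look like the sketch below. It is a simplification under stated assumptions: a plain fixed-size array stands in for the ArrayVec mentioned in the commit, and `NUM_REGISTERS`, `RegisterId`, `from_filter`, and the method signatures are hypothetical.

```rust
/// Assumed register count for this sketch.
pub const NUM_REGISTERS: usize = 8;

/// Hypothetical register identifier.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct RegisterId(pub u8);

/// Set of registers kept as a sorted, densely packed list.
pub struct RegisterSet {
    regs: [RegisterId; NUM_REGISTERS],
    len: usize,
}

impl RegisterSet {
    /// Build the set in one pass, keeping registers accepted by `pred`.
    /// Visiting registers in ascending order keeps the list sorted.
    pub fn from_filter<P: FnMut(RegisterId) -> bool>(mut pred: P) -> Self {
        let mut regs = [RegisterId(0); NUM_REGISTERS];
        let mut len = 0;
        for id in 0..NUM_REGISTERS as u8 {
            let reg = RegisterId(id);
            if pred(reg) {
                regs[len] = reg;
                len += 1;
            }
        }
        RegisterSet { regs, len }
    }

    /// Number of registers in the set.
    pub fn len(&self) -> usize {
        self.len
    }

    /// Whether the set is empty.
    pub fn is_empty(&self) -> bool {
        self.len == 0
    }

    /// Select the `i`-th in-set register: a single bounds-checked lookup.
    pub fn index(&self, i: usize) -> Option<RegisterId> {
        self.regs[..self.len].get(i).copied()
    }
}
```

Choosing a random member then reduces to picking an index below `len()` and calling `index`, with no scan over all registers.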
Micah Elizabeth Scott e142fd9882 hashx: new RegisterWriter format handles more cases transparently
There was a special case in writer_pair_allowed for making add and
subtract equivalent. This patch changes RegisterWriter's encoding, using
per-opcode variants instead of per-format variants. The Add/Sub merge
can now happen earlier, when RegisterWriter is constructed.

Before and after RegisterWriter sizes are the same, at 8 bytes.
This patch removes many uses of Option<RegisterWriter> in favor
of using a new RegisterWriter::None default, and passes by value
rather than by reference.

Wallclock time benchmarks:
  generate-interp improves, -7.5%
  generate-x86_64 improves, -5.3%

Cachegrind benchmarks:
  generate_interp_1000x, negligible change in total instructions,
  improvement in cache footprint: -22.8% L2 accesses

Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-08-21 15:27:28 -07:00
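The per-opcode encoding with a `None` default can be sketched as follows. The variant names, the `Opcode` enum, and `writer_for` are hypothetical; the sketch only shows how merging Add and Sub at construction time makes a later comparison a plain equality check, and how a `None` variant replaces `Option<RegisterWriter>`.

```rust
/// Hypothetical subset of opcodes.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Opcode {
    Add,
    Sub,
    Mul,
}

/// Record of the last instruction that wrote a register.
/// `None` is the default, used in place of `Option<RegisterWriter>`.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub enum RegisterWriter {
    #[default]
    None,
    /// Add and Sub share one variant, so they compare as equivalent.
    AddSub(u8),
    Mul(u8),
}

/// Construct the writer record for an opcode and its source register.
/// The Add/Sub merge happens here, once, rather than at comparison time.
pub fn writer_for(op: Opcode, src: u8) -> RegisterWriter {
    match op {
        Opcode::Add | Opcode::Sub => RegisterWriter::AddSub(src),
        Opcode::Mul => RegisterWriter::Mul(src),
    }
}
```

Because the type is `Copy` and small, it can be passed by value, and `RegisterWriter::default()` covers the "no previous writer" case without wrapping in `Option`.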
Nick Mathewson e699607ef8 Upgrade num_enum dependency to 0.7 2023-08-21 13:47:08 -04:00
Nick Mathewson e4373e88e2 proto: new ClientCirc::send_raw_msg function.
Closes #1010.
2023-08-21 09:23:46 -04:00
Micah Elizabeth Scott 780e10e1d5 hashx/bench: Add cachegrind microbenchmarks
This uses the 'iai' crate and valgrind to measure fine grained cache
behavior during program generation and hash computation.

Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
2023-08-18 20:09:40 -07:00
gabi-250 15f9da4d0e Merge branch 'hss-err' into 'main'
tor-hsservice errors: Introduce more error types

See merge request tpo/core/arti!1515
2023-08-18 13:39:20 +00:00