
# Arti profiling methodology

This document describes basic tools for profiling Arti's CPU and memory usage. Not all of these tools will make sense for every situation, and we may want to switch to different ones in the future. The main reason for recording them here is so that we don't have to re-learn how to use them the next time we need to do a big round of profiling tests.

## Building for profiling

When you're testing with `cargo build --release`, set `CARGO_PROFILE_RELEASE_DEBUG=true` so that the build includes debug symbols; this makes the profilers' output far more readable.
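
For example:

```
CARGO_PROFILE_RELEASE_DEBUG=true cargo build --release
```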

## Profiling tools

Here I'll talk about a few tools for measuring CPU usage, memory usage, and the like. For now, I'll assume you're on a reasonably modern Linux environment: if you aren't, you'll have to do some stuff differently.

I'll talk about particular scenarios to profile in the next major section.

### cargo flamegraph

`cargo-flamegraph` is a pretty quick-and-easy event profiling visualization tool. It produces nice SVG flamegraphs in a variety of pretty colors. As with all flamegraphs, these are better for visualization than for detailed drill-down. On Linux, `cargo-flamegraph` uses `perf` under the hood.

To install, make sure you have a working version of `perf` installed. Then run `cargo install flamegraph`.

Basic usage:

```
flamegraph {command}
```

Output: `flamegraph.svg`

Also consider using the `--reverse` flag, to reverse the stacks and see the lowest-level functions that get the most use.
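
For example, to profile the directory-bootstrap scenario described below (a sketch: the `-o` output flag and the `--` separator before the command follow current cargo-flamegraph usage; adjust as needed):

```
# Write a stack-reversed flamegraph of a bootstrap run to bootstrap.svg.
flamegraph --reverse -o bootstrap.svg -- arti-testing bootstrap -c ${ARTI_CONF}
```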

### tcmalloc and pprof

This can generate usage graphs showing who allocated your memory when. (It can get a bit confusing in Rust.)

```
HEAPPROFILE=/tmp/heap.hprof \
LD_PRELOAD=/usr/lib64/libtcmalloc_and_profiler.so \
{command}

pprof --pdf --inuse_space {binary} /tmp/heap.hprof > heap.pdf
```

You might need a longer timeout with this one; the profiling instrumentation adds nontrivial overhead.
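
As a concrete sketch, here's the bootstrap scenario from below under the heap profiler. (The paths are assumptions: the tcmalloc library location varies by distribution, and gperftools usually writes numbered dumps such as `/tmp/heap.hprof.0001.heap`, so point `pprof` at the dump you want.)

```
# Run the bootstrap test with tcmalloc's heap profiler preloaded.
HEAPPROFILE=/tmp/heap.hprof \
LD_PRELOAD=/usr/lib64/libtcmalloc_and_profiler.so \
target/release/arti-testing bootstrap -c ${ARTI_CONF}

# Graph in-use heap space for the binary that just ran.
pprof --pdf --inuse_space target/release/arti-testing \
  /tmp/heap.hprof.0001.heap > heap.pdf
```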

### valgrind --tool=massif

This tool can also generate usage graphs like pprof above.

```
valgrind --tool=massif {command}
```

It will generate a file called `massif.out.PID`. You can view it with the `ms_print` tool (included with valgrind) or with the `massif-visualizer` tool (installed separately; highly recommended).
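
For example (where `12345` stands in for whatever PID massif actually recorded):

```
# Profile a command's heap usage, then summarize the result in the terminal.
valgrind --tool=massif arti-testing bootstrap -c ${ARTI_CONF}
ms_print massif.out.12345 | less
```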

## Some commands to profile

These should generally run against a chutney network whenever possible; the `ARTI_CONF` envvar should be set to e.g. `$(pwd)/chutney/net/nodes/arti.toml`.
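
For example:

```
export ARTI_CONF=$(pwd)/chutney/net/nodes/arti.toml
```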

### Bootstrapping a directory

```
arti-testing bootstrap -c ${ARTI_CONF}
```

(This test bootstraps only. It might make sense to do this one on the real network, since the real network's directory data is more complex. You need to start with an empty set of state files for this to test bootstrapping instead of loading.)
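
To get that empty starting state, clear Arti's state and cache directories before the run. A minimal sketch, assuming the default Linux locations (a chutney `arti.toml` may put them somewhere else):

```
# CAUTION: this throws away any existing Arti state and directory cache.
rm -rf ~/.local/share/arti ~/.cache/arti
arti-testing bootstrap -c ${ARTI_CONF}
```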

### Large number of circuits, focusing on circuit construction

Bootstrap outside of benchmarking, then run:

```
arti-bench -u 1 -d 1 -s 100 -C 20 -p 1 -c ${ARTI_CONF}
```

(100 samples, 20 circuits per sample, 1 stream per circuit, only 1 byte to upload or download.)

Note that this test won't necessarily tell you so much about path construction, since path construction on a large real network with different weights, policies, and families is more complex than on a chutney network.

(This one just times out with chutney; the directory changes too fast, I think.)

### Running offline

Also worth profiling:

* Bootstrapping failure conditional
* Going offline
* Primary guards go down after bootstrap

(See `HowToBreak.md`.)

### Data transfer

```
arti-bench -s 20 -C 1 -p 1 {...}
```

(No parallelism, 10 MB up and down.)

### Data transfer with many circuits

```
arti-bench -s 1 -C 64 -p 1 -c ${ARTI_CONF}
```

(Circuit parallelism only, 10 MB up and down.)

### Data transfer with many streams

```
arti-bench -s 1 -C 1 -p 64 -c ${ARTI_CONF}
```

(Stream parallelism only, 10 MB up and down.)

### Huge number of simultaneous connection attempts

```
arti-bench -s 1 -C 16 -p 16 -c ${ARTI_CONF}
```

(Stream and circuit parallelism.)

## TODO

`arti-bench`:

* Take a target address as a string.
* Allow `-p 0` to build a circuit only?
* Some way to build a path only?

Extract chutney boilerplate.

`arti-testing`:

* Ability to make connections aggressively simultaneous.