
trillium bench

A load generator that reports latency as an HDR-histogram summary. It runs in two modes: closed-loop (a fixed pool of connections, each firing the next request as soon as the previous completes) and open-loop (requests scheduled at a target arrival rate, independent of how fast they complete).

trillium bench <URL> [OPTIONS]

Closed-loop (default)

By default, bench opens 50 concurrent connections and runs for 10 seconds:

trillium bench https://localhost:8080

Tune the concurrency and stopping condition:

trillium bench https://localhost:8080 -c 200 -d 30s # 200 connections for 30s
trillium bench https://localhost:8080 -c 100 -n 1000000 # 100 connections, 1M requests total
| Flag | Default | Notes |
| --- | --- | --- |
| -c, --connections | 50 | concurrent connections |
| -d, --duration | 10s | run for this long (e.g. 10s, 1m, 30s500ms) |
| -n, --requests | | stop after this many requests (closed-loop only) |

--duration and --requests are mutually exclusive; with neither, bench runs for 10 seconds.
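
Because each closed-loop connection has at most one request in flight, the request rate a closed-loop run can generate is bounded by the connection count divided by the mean response time (Little's law). The following Python sketch works through that arithmetic. It is an illustration only, not part of bench, and the function name is hypothetical.

```python
def closed_loop_throughput(connections: int, mean_latency_ms: float) -> float:
    """Upper bound on req/s when every connection keeps exactly one request in flight."""
    if connections <= 0 or mean_latency_ms <= 0:
        raise ValueError("connections and mean_latency_ms must be positive")
    # Each connection completes 1000 / mean_latency_ms requests per second.
    return connections * 1000 / mean_latency_ms


# 50 connections (the default) against a server averaging 10 ms per response.
print(closed_loop_throughput(50, 10))   # 5000.0
# If the server slows to 100 ms per response, the generated load falls with it.
print(closed_loop_throughput(50, 100))  # 500.0
```

The second call shows why a closed-loop run cannot hold a fixed load: when responses slow down, the generator slows down too.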

Open-loop

Passing -r / --rate switches to open-loop scheduling: requests are launched at a fixed offered rate (requests per second) regardless of how quickly the server responds. Use this mode to measure latency under a known load. A closed-loop run sends fewer requests whenever the server slows down, so the slow periods are under-sampled (the "coordinated omission" problem); open-loop scheduling keeps the offered rate constant and avoids it.

trillium bench https://localhost:8080 --rate 5000 --duration 30s
trillium bench https://localhost:8080 --rate 5000 --pacing poisson
| Flag | Default | Notes |
| --- | --- | --- |
| -r, --rate | | target arrival rate (req/s); enables open-loop |
| --pacing | uniform | uniform (fixed interval) or poisson (exponential gaps) |
| --max-concurrency | | hard cap on in-flight requests; excess are dropped as saturation |
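
To make the two pacing modes concrete, here is a rough Python sketch that generates scheduled send offsets for a given rate. It is not bench's implementation: `arrival_offsets` and its parameters are hypothetical names, and the only thing taken from the table above is that uniform pacing uses a fixed interval while poisson pacing draws exponentially distributed gaps.

```python
import math
import random


def arrival_offsets(rate, duration_s, pacing="uniform", seed=None):
    """Return scheduled send times, in seconds from the start of the run."""
    if rate <= 0 or duration_s <= 0:
        raise ValueError("rate and duration_s must be positive")
    if pacing == "uniform":
        # Fixed interval of 1/rate seconds between consecutive requests.
        count = math.ceil(rate * duration_s)
        return [i / rate for i in range(count)]
    if pacing == "poisson":
        # Exponential gaps with mean 1/rate: same average rate, bursty arrivals.
        rng = random.Random(seed)
        offsets = []
        t = rng.expovariate(rate)
        while t < duration_s:
            offsets.append(t)
            t += rng.expovariate(rate)
        return offsets
    raise ValueError(f"unknown pacing: {pacing!r}")


print(arrival_offsets(4, 1))  # [0.0, 0.25, 0.5, 0.75]
print(len(arrival_offsets(5000, 30, "poisson", seed=1)))  # close to 150000
```

Both modes offer the same average load. Poisson pacing produces occasional bursts and lulls, which is closer to traffic from many independent clients.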

When the server can't keep up with the offered rate, scheduled requests that would exceed --max-concurrency are counted as saturation drops in the report — a direct signal that you've found the server's ceiling.
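
The drop accounting described above can be sketched with a small simulation. This is a simplified model, not bench's scheduler: it assumes uniform pacing, a constant service time, and that a request scheduled while the cap is full is dropped immediately. The function and parameter names are hypothetical.

```python
import heapq


def simulate_drops(rate, duration_s, service_ms, max_concurrency):
    """Return (launched, dropped) for fixed-interval arrivals and a constant service time."""
    in_flight = []  # min-heap of completion times (microseconds) for running requests
    launched = dropped = 0
    service_us = service_ms * 1000
    for i in range(rate * duration_s):
        now_us = i * 1_000_000 // rate
        # Retire every request that has completed by this scheduled send time.
        while in_flight and in_flight[0] <= now_us:
            heapq.heappop(in_flight)
        if len(in_flight) >= max_concurrency:
            dropped += 1  # cap reached: counted as a saturation drop
        else:
            heapq.heappush(in_flight, now_us + service_us)
            launched += 1
    return launched, dropped


# 100 req/s with 50 ms responses needs about 5 requests in flight, so a cap of 10 is never hit.
print(simulate_drops(100, 1, 50, 10))   # (100, 0)
# With 500 ms responses the same load needs about 50 in flight, so most requests are dropped.
print(simulate_drops(100, 1, 500, 10))  # (20, 80)
```

In this model, drops appear once rate multiplied by response time exceeds the cap, which is the saturation point the report is meant to expose.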

Request shape

bench shares the client flags for method, headers, body, TLS, and HTTP version:

trillium bench https://api.example.com/items -m POST \
-H Content-Type=application/json -b '{"q":"test"}'

trillium bench https://api.example.com/upload --body-size 4kb # synthetic body
| Flag | Default | Notes |
| --- | --- | --- |
| -m, --method | GET | HTTP method |
| -H, --headers | | KEY=VALUE, repeatable |
| -f, --file | | request body from a file |
| -b, --body | | inline request body |
| --body-size | | synthesize a zero-filled body of this size (4kb, 1mb) |
| --http-version | 1.1 | 0.93 |
| -t, --tls | rustls | TLS backend |
| --no-keepalive | | disable HTTP/1.1 connection reuse |

Warmup and timeout

trillium bench https://localhost:8080 -d 1m --warmup 5s --timeout 2s
| Flag | Notes |
| --- | --- |
| -w, --warmup | discard statistics collected during this initial period |
| --timeout | per-request timeout |

--warmup lets connection pools and JITs settle before measurement begins, so the histogram reflects steady state rather than cold-start latency.
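
The effect of discarding the warmup window can be illustrated with a few made-up samples. This Python sketch assumes samples are filtered by their start offset; whether bench uses start or completion time is not stated here, and the function name and data are hypothetical.

```python
from statistics import mean


def steady_state(samples, warmup_s):
    """Keep only (start_offset_s, latency_ms) samples that began after the warmup window."""
    return [(t, latency) for (t, latency) in samples if t >= warmup_s]


# Illustrative data: the first two requests pay cold-start costs.
samples = [(0.1, 250.0), (0.5, 120.0), (5.2, 12.0), (6.0, 10.0), (7.5, 14.0)]

print(mean(latency for _, latency in samples))                   # 81.2
print(mean(latency for _, latency in steady_state(samples, 5)))  # 12.0
```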

Reading the report

When stdout is a terminal, bench shows a live progress bar during the run and then prints a report with these sections:

  • Summary — elapsed time, completed/succeeded counts, request throughput (req/s), and bytes sent/received with receive throughput.
  • Status codes — a count per HTTP status, colored by class.
  • Errors — counts bucketed into io, timeout, protocol, other, plus saturation drops (open-loop).
  • Latency (full response) and Latency (TTFB) — HDR-histogram percentiles (min, mean, p50, p75, p90, p95, p99, p99.9, max, stdev). TTFB is time-to-first-byte.
  • Open-loop queue wait — in open-loop mode, how long scheduled requests waited for a free slot (a second saturation signal).
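
For readers unfamiliar with percentile summaries, the following Python sketch computes a few of the listed statistics from raw samples using the nearest-rank method. It is not how bench computes them: an HDR histogram buckets values with bounded relative error instead of sorting every sample. The function names are hypothetical.

```python
import math
from statistics import mean


def percentile(sorted_samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not sorted_samples:
        raise ValueError("no samples")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    # round() guards against floating-point noise such as 999.0000000000001.
    rank = max(1, math.ceil(round(p * len(sorted_samples) / 100, 9)))
    return sorted_samples[rank - 1]


def summarize(latencies_ms):
    """Return a small subset of the statistics shown in the latency sections."""
    ordered = sorted(latencies_ms)
    return {
        "min": ordered[0],
        "mean": mean(ordered),
        "p50": percentile(ordered, 50),
        "p99": percentile(ordered, 99),
        "max": ordered[-1],
    }


print(summarize(range(1, 101)))
```

The tail percentiles (p99, p99.9) usually matter more than the mean, because a handful of slow responses can hide behind a healthy average.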

Machine-readable output

trillium bench https://localhost:8080 --json > report.json
trillium bench https://localhost:8080 --csv timings.csv
| Flag | Notes |
| --- | --- |
| --json | emit the full report as JSON to stdout (suppresses the bar) |
| --csv <PATH> | write per-request timing samples (scheduled/started offsets, queue, TTFB, total, status, bytes) to a CSV file |
| --no-progress | suppress the live progress display even on a tty |

The CSV captures one row per request, suitable for plotting latency over time or post-hoc percentile analysis.
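
As a sketch of that kind of post-hoc analysis, the Python below groups rows by the second in which they started and reports the worst total latency in each bucket. The column names (`started_ms`, `total_ms`, `status`) and the sample rows are assumptions made for illustration; check the header row of the file bench actually writes and adjust the names and units to match.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data; real column names and units may differ.
SAMPLE = """started_ms,total_ms,status
0,12,200
400,15,200
1100,90,200
1700,30,503
2300,11,200
"""


def worst_latency_per_second(csv_text, start_col="started_ms", total_col="total_ms"):
    """Map each one-second bucket of the run to its maximum total latency."""
    buckets = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        second = int(row[start_col]) // 1000
        buckets[second] = max(buckets[second], int(row[total_col]))
    return dict(sorted(buckets.items()))


print(worst_latency_per_second(SAMPLE))  # {0: 15, 1: 90, 2: 11}
```

To read a real file, pass its contents (for example `open("timings.csv").read()`) in place of `SAMPLE`.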

Tuning the client's HTTP layer

To tune the client's own performance, bench exposes a few trillium_http::HttpConfig knobs. They are rarely needed; reach for them only when the client itself is the bottleneck.

--response-buffer-len <BYTES>
--response-buffer-max-len <BYTES>
--head-max-len <BYTES>
--copy-loops-per-yield <N>
--received-body-max-len <BYTES>

Full flag reference

trillium bench [OPTIONS] <URL>

Options:
  -m, --method <METHOD>              [default: GET]
  -c, --connections <CONNECTIONS>    [default: 50]
  -d, --duration <DURATION>          (conflicts with --requests)
  -n, --requests <REQUESTS>          (conflicts with --duration)
  -r, --rate <RATE>                  target req/s; switches to open-loop
      --pacing <PACING>              [default: uniform] (uniform | poisson)
      --max-concurrency <N>
  -w, --warmup <WARMUP>
      --timeout <TIMEOUT>
  -H, --headers <HEADERS>            KEY=VALUE, repeatable
  -f, --file <FILE>
  -b, --body <BODY>
      --body-size <BODY_SIZE>
      --http-version <HTTP_VERSION>  [default: 1.1]
  -t, --tls <TLS>                    [default: rustls]
      --no-keepalive
      --json
      --csv <CSV>
      --no-progress
  -v, --verbose...
  -q, --quiet...
  -h, --help