trillium bench
A load generator that reports latency as an HDR-histogram summary. It runs in two modes: closed-loop (a fixed pool of connections, each firing the next request as soon as the previous completes) and open-loop (requests scheduled at a target arrival rate, independent of how fast they complete).
trillium bench <URL> [OPTIONS]
Closed-loop (default)
By default, bench opens 50 concurrent connections and runs for 10 seconds:
trillium bench https://localhost:8080
Tune the concurrency and stopping condition:
trillium bench https://localhost:8080 -c 200 -d 30s # 200 connections for 30s
trillium bench https://localhost:8080 -c 100 -n 1000000 # 100 connections, 1M requests total
| Flag | Default | Notes |
|---|---|---|
| -c, --connections | 50 | concurrent connections |
| -d, --duration | 10s | run for this long (e.g. 10s, 1m, 30s500ms) |
| -n, --requests | | stop after this many requests (closed-loop only) |
--duration and --requests are mutually exclusive; with neither, bench
runs for 10 seconds.
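As a rough sanity check on closed-loop numbers, Little's law bounds the throughput a fixed pool can reach: if every connection always has exactly one request in flight, throughput is about the connection count divided by the mean latency. The sketch below is illustrative arithmetic only and is not part of bench; the function name is hypothetical.

```python
def closed_loop_throughput(connections: int, mean_latency_s: float) -> float:
    """Approximate req/s when each connection always has one request in flight."""
    if mean_latency_s <= 0:
        raise ValueError("mean latency must be positive")
    return connections / mean_latency_s


# The default 50 connections against a server that answers in 10 ms on average.
print(closed_loop_throughput(50, 0.010))
```

If the measured req/s sits well below this estimate, the client or the network is a more likely bottleneck than the server.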
Open-loop
Passing -r / --rate switches to open-loop scheduling: requests are launched
at a fixed offered rate (requests per second) regardless of how quickly the
server responds. This is the mode for measuring latency under a known load,
since it doesn't let a slow server throttle the offered rate (avoiding
"coordinated omission").
trillium bench https://localhost:8080 --rate 5000 --duration 30s
trillium bench https://localhost:8080 --rate 5000 --pacing poisson
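To see why coordinated omission matters, the following toy model may help. It is a sketch and not how bench is implemented: the "server" is frozen for an initial stall and then answers every request in a fixed service time, and all function names are hypothetical. The closed-loop client sends its next request only after the previous one completes; the open-loop client sends on schedule and measures latency from the scheduled time.

```python
def completion_time(sent_at: float, stall_s: float, service_s: float) -> float:
    """Toy server: frozen until stall_s, then answers each request in service_s."""
    return max(sent_at, stall_s) + service_s


def closed_loop(stall_s: float, service_s: float, run_s: float) -> list:
    latencies, now = [], 0.0
    while now < run_s:
        done = completion_time(now, stall_s, service_s)
        latencies.append(done - now)
        now = done  # the next request waits for this one to complete
    return latencies


def open_loop(stall_s: float, service_s: float, run_s: float, rate: float) -> list:
    latencies = []
    for i in range(int(run_s * rate)):
        scheduled = i / rate  # sent on schedule, regardless of earlier responses
        latencies.append(completion_time(scheduled, stall_s, service_s) - scheduled)
    return latencies


def slow_count(latencies: list, threshold_s: float) -> int:
    return sum(1 for latency in latencies if latency > threshold_s)


closed = closed_loop(stall_s=1.0, service_s=0.005, run_s=2.0)
opened = open_loop(stall_s=1.0, service_s=0.005, run_s=2.0, rate=100)
print(slow_count(closed, 0.5), "of", len(closed), "closed-loop samples exceed 500 ms")
print(slow_count(opened, 0.5), "of", len(opened), "open-loop samples exceed 500 ms")
```

In this model the closed-loop run records the stall once, because the single connection was blocked for its whole duration, while the open-loop run records a slow sample for every request that was scheduled while the server was frozen.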
| Flag | Default | Notes |
|---|---|---|
| -r, --rate | | target arrival rate (req/s); enables open-loop |
| --pacing | uniform | uniform (fixed interval) or poisson (exponential gaps) |
| --max-concurrency | | hard cap on in-flight requests; excess are dropped as saturation |
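The two pacing modes differ only in how the gap between consecutive sends is chosen. A minimal sketch of that idea, assuming nothing about bench's internal scheduler (the function name and `seed` parameter are hypothetical):

```python
import random


def arrival_offsets(rate: float, count: int, pacing: str = "uniform", seed: int = 0) -> list:
    """Return `count` scheduled send times, in seconds from the start of the run."""
    rng = random.Random(seed)
    offsets, t = [], 0.0
    for _ in range(count):
        offsets.append(t)
        if pacing == "uniform":
            t += 1.0 / rate             # fixed interval between sends
        elif pacing == "poisson":
            t += rng.expovariate(rate)  # exponential gap with mean 1/rate
        else:
            raise ValueError(f"unknown pacing: {pacing}")
    return offsets


print(arrival_offsets(rate=4, count=3))
print(arrival_offsets(rate=4, count=3, pacing="poisson"))
```

Both schedules offer the same average rate. Poisson pacing produces occasional bursts and lulls, which tends to resemble independent clients more closely than a perfectly regular interval does.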
When the server can't keep up with the offered rate, scheduled requests that
would exceed --max-concurrency are counted as saturation drops in the
report — a direct signal that you've found the server's ceiling.
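The relationship between offered rate, response time, and the concurrency cap can be sketched with a small simulation. This is a toy model under stated assumptions, not bench's scheduler: arrivals are uniformly paced, every request takes the same fixed time, and a request that finds no free slot is dropped immediately. The function name is hypothetical.

```python
import heapq


def simulate_open_loop(rate: float, service_s: float, max_concurrency: int, duration_s: float):
    """Return (completed, dropped) for uniform arrivals and a fixed service time."""
    in_flight = []  # min-heap of completion times for requests currently in flight
    completed = dropped = 0
    for i in range(int(rate * duration_s)):
        now = i / rate
        # Free the slots of requests that have finished by now.
        while in_flight and in_flight[0] <= now:
            heapq.heappop(in_flight)
            completed += 1
        if len(in_flight) >= max_concurrency:
            dropped += 1  # no free slot: counted as a saturation drop
        else:
            heapq.heappush(in_flight, now + service_s)
    completed += len(in_flight)  # let the remaining requests finish
    return completed, dropped


print(simulate_open_loop(rate=100, service_s=0.1, max_concurrency=50, duration_s=1))
print(simulate_open_loop(rate=1000, service_s=0.1, max_concurrency=50, duration_s=1))
```

In this model, 50 slots at 100 ms per request sustain roughly 500 req/s, so the first run stays under the cap and the second run drops the excess.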
Request shape
bench shares the client flags for method, headers, body, TLS,
and HTTP version:
trillium bench https://api.example.com/items -m POST \
-H Content-Type=application/json -b '{"q":"test"}'
trillium bench https://api.example.com/upload --body-size 4kb # synthetic body
| Flag | Default | Notes |
|---|---|---|
| -m, --method | GET | HTTP method |
| -H, --headers | | KEY=VALUE, repeatable |
| -f, --file | | request body from a file |
| -b, --body | | inline request body |
| --body-size | | synthesize a zero-filled body of this size (4kb, 1mb) |
| --http-version | 1.1 | 0.9–3 |
| -t, --tls | rustls | TLS backend |
| --no-keepalive | | disable HTTP/1.1 connection reuse |
Warmup and timeout
trillium bench https://localhost:8080 -d 1m --warmup 5s --timeout 2s
| Flag | Notes |
|---|---|
| -w, --warmup | discard statistics collected during this initial period |
| --timeout | per-request timeout |
--warmup lets connection pools and JITs settle before measurement begins, so
the histogram reflects steady state rather than cold-start latency.
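Conceptually, a warmup window just excludes early samples from the statistics. The sketch below illustrates that on a list of (start offset, latency) pairs. It assumes a sample is classified by when it started, which is an assumption about bench rather than documented behavior, and the function name is hypothetical.

```python
def discard_warmup(samples: list, warmup_s: float) -> list:
    """Keep only (start_offset_s, latency_s) samples that started after the warmup window."""
    return [(start, latency) for start, latency in samples if start >= warmup_s]


# Made-up samples: the first two are cold-start requests with inflated latency.
samples = [(0.1, 0.250), (0.8, 0.120), (5.2, 0.012), (9.7, 0.011)]
print(discard_warmup(samples, warmup_s=5.0))
```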
Reading the report
When stdout is a terminal, bench shows a live progress bar during the run and
then prints a report with these sections:
- Summary: elapsed time, completed/succeeded counts, request throughput (req/s), and bytes sent/received with receive throughput.
- Status codes: a count per HTTP status, colored by class.
- Errors: counts bucketed into io, timeout, protocol, other, plus saturation drops (open-loop).
- Latency (full response) and Latency (TTFB): HDR-histogram percentiles (min, mean, p50, p75, p90, p95, p99, p99.9, max, stdev). TTFB is time-to-first-byte.
- Open-loop queue wait: in open-loop mode, how long scheduled requests waited for a free slot (a second saturation signal).
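For readers less familiar with percentile summaries, here is a minimal sketch of what a value such as p99 means. It uses the exact nearest-rank method on a small made-up sample. An HDR histogram instead records values into buckets of bounded relative error, so bench's figures may differ slightly from an exact calculation. The function name is hypothetical.

```python
import math


def percentile(sorted_samples: list, p: float):
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not sorted_samples:
        raise ValueError("no samples")
    rank = max(1, math.ceil(p * len(sorted_samples) / 100))
    return sorted_samples[rank - 1]


# Made-up latencies in milliseconds, with one slow outlier.
latencies_ms = sorted([12, 15, 11, 14, 250, 13, 12, 16, 13, 14])
mean_ms = sum(latencies_ms) / len(latencies_ms)
print("mean", mean_ms, "p50", percentile(latencies_ms, 50), "p99", percentile(latencies_ms, 99))
```

A single outlier pulls the mean far above the median, which is why the report lists tail percentiles alongside the mean.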
Machine-readable output
trillium bench https://localhost:8080 --json > report.json
trillium bench https://localhost:8080 --csv timings.csv
| Flag | Notes |
|---|---|
| --json | emit the full report as JSON to stdout (suppresses the bar) |
| --csv <PATH> | write per-request timing samples (scheduled/started offsets, queue, TTFB, total, status, bytes) to a CSV file |
| --no-progress | suppress the live progress display even on a tty |
The CSV captures one row per request, suitable for plotting latency over time or post-hoc percentile analysis.
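As one example of post-hoc analysis, the sketch below groups rows by the second in which they started and reports the worst total latency in each second, which is a convenient series to plot. The column names, units, and sample rows here are hypothetical; check the header line of your own CSV file and adjust `start_col` and `total_col` to match.

```python
import csv
import io
from collections import defaultdict

# Hypothetical header and made-up rows, not real bench output.
SAMPLE = """started_ms,total_ms,status
120,14,200
870,19,200
1400,230,200
1900,17,503
"""


def worst_latency_per_second(csv_text: str, start_col: str = "started_ms",
                             total_col: str = "total_ms") -> dict:
    """Map each whole second of the run to the largest total latency that started in it."""
    worst = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        second = int(float(row[start_col]) // 1000)
        worst[second] = max(worst[second], float(row[total_col]))
    return dict(sorted(worst.items()))


print(worst_latency_per_second(SAMPLE))
```

To run this against a real file, read it with `open(path).read()` and pass the text in place of `SAMPLE`.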
Tuning the client's HTTP layer
To squeeze more performance out of the client side, bench exposes a few
trillium_http::HttpConfig knobs. These are
rarely needed; reach for them only when the client itself is the bottleneck.
--response-buffer-len <BYTES>
--response-buffer-max-len <BYTES>
--head-max-len <BYTES>
--copy-loops-per-yield <N>
--received-body-max-len <BYTES>
Full flag reference
trillium bench [OPTIONS] <URL>
Options:
-m, --method <METHOD> [default: GET]
-c, --connections <CONNECTIONS> [default: 50]
-d, --duration <DURATION> (conflicts with --requests)
-n, --requests <REQUESTS> (conflicts with --duration)
-r, --rate <RATE> target req/s; switches to open-loop
--pacing <PACING> [default: uniform] (uniform | poisson)
--max-concurrency <N>
-w, --warmup <WARMUP>
--timeout <TIMEOUT>
-H, --headers <HEADERS> KEY=VALUE, repeatable
-f, --file <FILE>
-b, --body <BODY>
--body-size <BODY_SIZE>
--http-version <HTTP_VERSION> [default: 1.1]
-t, --tls <TLS> [default: rustls]
--no-keepalive
--json
--csv <CSV>
--no-progress
-v, --verbose...
-q, --quiet...
-h, --help