M/M/1 Tail Latency Estimator

Percentiles for Wq (wait) and W (time-in-system) under M/M/1 assumptions.

Inputs

Arrival rate λ (requests / second)

Units

Service input: μ (requests / second) or S (ms/request)

Display time unit

Percentiles (comma-separated)

Rounding (significant digits)

Important: This is a math model, not a guarantee. Real systems have timeouts, burstiness, batching, priority queues, retries, non-exponential service, and multi-stage pipelines.

Outputs

Utilization ρ = λ/μ

Stability requires ρ < 1

Mean W (time in system)

Mean Wq (wait time)

Percentiles

p	Wq (wait)	W (system)

How this is computed

For M/M/1 with λ<μ:

Sojourn time W is exponential with rate μ−λ: P(W > t) = e^{-(μ-λ)t}.
Wait time Wq has an atom at 0: P(Wq=0)=1-ρ, and for t>0: P(Wq > t)=ρ·e^{-(μ-λ)t}.
Thus percentiles have closed forms. For a quantile p in (0,1): W_p = -ln(1-p)/(μ-λ), and Wq_p = 0 when p ≤ 1-ρ (the flat region at 0), otherwise Wq_p = ln(ρ/(1-p))/(μ-λ).
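
The formulas above can be sketched in a few lines of Python. This is an illustrative sketch, not the code behind this page. The names `mm1_percentiles` and `service_rate_from_ms` are hypothetical, percentiles are assumed to be given as percents (for example 99 for p99), and results are in seconds. The means use the standard M/M/1 results, mean W = 1/(μ-λ) and mean Wq = ρ/(μ-λ).

```python
import math


def service_rate_from_ms(s_ms):
    """Convert a mean service time S (ms/request) to a rate mu (requests/second)."""
    if s_ms <= 0:
        raise ValueError("service time must be positive")
    return 1000.0 / s_ms


def mm1_percentiles(lam, mu, percentiles):
    """Return utilization, means, and (p, Wq_p, W_p) rows for an M/M/1 queue.

    lam and mu are in requests/second; all times returned are in seconds.
    Percentiles are percents strictly between 0 and 100 (an assumption of
    this sketch).
    """
    if lam <= 0 or mu <= 0:
        raise ValueError("lam and mu must be positive")
    rho = lam / mu
    if rho >= 1:
        raise ValueError("unstable queue: requires rho = lam/mu < 1")

    rate = mu - lam  # exponential decay rate shared by both tails
    rows = []
    for p in percentiles:
        if not 0 < p < 100:
            raise ValueError("percentiles must be strictly between 0 and 100")
        q = p / 100.0
        # Solve P(W > t) = exp(-rate * t) = 1 - q for t.
        w = -math.log(1.0 - q) / rate
        # Wq has an atom of mass 1 - rho at 0, so low quantiles are exactly 0.
        # Otherwise solve rho * exp(-rate * t) = 1 - q for t.
        wq = 0.0 if q <= 1.0 - rho else math.log(rho / (1.0 - q)) / rate
        rows.append((p, wq, w))

    return {
        "rho": rho,
        "mean_w": 1.0 / rate,
        "mean_wq": rho / rate,
        "rows": rows,
    }


if __name__ == "__main__":
    # Example inputs: 80 requests/second arriving, 10 ms mean service time.
    result = mm1_percentiles(80.0, service_rate_from_ms(10.0), [50, 90, 99])
    print(f"rho = {result['rho']:.3g}")
    for p, wq, w in result["rows"]:
        print(f"p{p}: Wq = {wq * 1000:.3g} ms, W = {w * 1000:.3g} ms")
```

The sketch rejects ρ ≥ 1 outright rather than returning infinities, since the closed forms only hold for a stable queue.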