bv.quantile
Approximate quantile of a numeric field over a window or lifetime, backed by DDSketch.
Signature
bv.quantile(
field: str,
q: float,
*,
window: str | None = None,
where: bv.Col | None = None,
exact_threshold: int = 256,
hybrid_alpha: float = 0.01,
) -> AggDescriptor
Previously called
bv.percentile. Renamed toquantileper ADR-002 for Polars-convention consistency. The old name remains as a deprecation alias in v0.0.x and is removed in v0.1.
Description
bv.quantile estimates the q-th quantile (0 ≤ q ≤ 1) of a numeric field
across events that match the optional where= predicate. Backed by a hybrid
exact-then-DDSketch state: while the entity has fewer than exact_threshold
distinct values, the state holds them exactly and returns the precise q-th
order statistic. Once the threshold is crossed, the state promotes to a
DDSketch with relative accuracy hybrid_alpha (default 0.01 = 1% relative
error).
Returns a single f64 per entity. Use bv.quantile(field="amount", q=0.99, window="1h") for "the 99th-percentile transaction amount over the last hour".
Pair with bv.mean and bv.std when
fraud rules need both central-tendency and tail behavior.
The hybrid mode is transparent at the API: callers always read a single
float. Promotion happens server-side without observable behavior change
beyond a small accuracy tradeoff. bv.quantile belongs to the sketch
family and is BoundedSketch per Phase 12.8 V0-MEM-GOV-02 — fixed structural cap regardless of stream length.
Parameters
| Name | Type | Required | Default | Description |
|---|---|---|---|---|
field |
str |
Yes | — | Name of the numeric field (f64 or i64) to take the quantile of. |
q |
float |
Yes | — | The quantile to compute, in [0, 1]. 0.5 is the median; 0.99 is the 99th percentile. |
window |
str |
No | None (lifetime) |
Duration string matching \d+(ms|s|m|h|d) or "forever". |
where |
bv.Col |
No | None |
Boolean expression on event fields; only matching events contribute. |
exact_threshold |
int |
No | 256 |
Distinct-value count below which exact mode is used. |
hybrid_alpha |
float |
No | 0.01 |
DDSketch relative-error parameter once promoted (0.01 = 1% relative error). |
Returns
A single f64. When the entity has seen zero matching events, the result is
null (returned as Python None).
Complexity
| Resource | Bound |
|---|---|
| CPU per event | Tier 2 (Exact mode, ~8 ns floor / ~35 ns measured) — see cost-class.md |
| Tier 3 (DDSketch mode, post-promotion, ~130 ns floor / ~180 ns measured) — see cost-class.md | |
| Memory per entity | BoundedSketch — exact array up to exact_threshold entries, then DDSketch buckets (~few KB max) regardless of stream length |
Lifetime mode (window=None) |
Allowed — BoundedSketch per Phase 12.8 V0-MEM-GOV-02 |
Examples
Example 1: 99th-percentile transaction amount per user, hourly
import beava as bv
@bv.event
class Txn:
user_id: str
amount: float
@bv.table(key="user_id")
def UserAmountStats(txn) -> bv.Table:
return (
txn.group_by("user_id")
.agg(p99_amount_1h=bv.quantile("amount", q=0.99, window="1h"))
)
# Push events
app.push("Txn", {"user_id": "alice", "amount": 12.50})
app.push("Txn", {"user_id": "alice", "amount": 1500.00})
# Query
result = app.get("UserAmountStats", "alice")
# result == {"p99_amount_1h": 1500.0}
Example 2: Median latency for successful events only
@bv.table(key="user_id")
def LatencyP50(reqs) -> bv.Table:
return (
reqs.group_by("user_id")
.agg(median_lat_ok=bv.quantile("latency_ms",
q=0.5,
window="5m",
where=bv.col("status") == 200))
)
Wire
JSON wire form in a register payload:
{
"kind": "derivation",
"name": "UserAmountStats",
"output_kind": "table",
"key": ["user_id"],
"agg": {
"p99_amount_1h": {
"op": "quantile",
"params": {
"field": "amount",
"q": 0.99,
"window": "1h",
"exact_threshold": 256,
"hybrid_alpha": 0.01
}
}
}
}
See examples/wire/register-fraud-team.request.json for a full pipeline example (uses quantile for tx_p99_1h).
Edge cases
- No matching events / cold-start: result is
null. (Cold-start returns{}for the row, not an error.) qout of range: values outside[0, 1]raiseRegistrationError(code="aggregation_invalid_param")at register time.- Non-numeric field: schema validation rejects at register time with structured error code
schema_mismatch. - NaN inputs: poisons the sketch state in DDSketch mode (NaN routes to a non-finite bucket); filter with
where=~bv.col("amount").isnull()if your source can emit NaN. - Lifetime mode (
window=None): explicitly allowed — DDSketch isBoundedSketchper Phase 12.8 V0-MEM-GOV-02. - Hybrid promotion: transparent. Callers cannot observe whether the entity is in exact or sketch mode (intentional). Promotion happens once the entity has seen
exact_thresholddistinct values. exact_thresholdlowered to 0: equivalent to "always sketch mode"; useful only for explicit memory tuning.
See also
- cost-class.md — performance tier (Tier 2 exact / Tier 3 sketch)
- bv.entropy — also a
BoundedSketchfor distributions - bv.top_k — heavy-hitters companion
- bv.n_unique — cardinality companion
- bv.min / bv.max — extreme-value companions (Tier 1)
- pipeline-dsl/compilation-rules.md — chain compilation rules