bv.seasonal_deviation
Z-score of the most recent event's value against this entity's hour-of-day baseline.
Signature
bv.seasonal_deviation(
field: str,
*,
where: bv.Col | None = None,
) -> AggDescriptor
Description
bv.seasonal_deviation returns the z-score of the most recent observation
of field against this entity's running per-hour baseline. The state
maintains 24 (count, sum, sum_sq) HourBucket triples — one per UTC
hour — and updates the bucket at index hour_of_day(now_ms) on every
matching event. The query computes
(latest_value − bucket.mean) / bucket.stddev for the bucket of the most
recent event. Use it for "is this transaction anomalously large for the
hour at which it landed?" — features that detect unusual-hour activity
where the same value would be normal in the entity's typical hour profile
but stands out in the current hour.
The 24 hourly buckets are a structural cap, so bv.seasonal_deviation
qualifies as O(1) per entity under the
V0-MEM-GOV-02 lifetime-aggregation
contract — no required register-time kwarg, no fallback default. State per
entity is [HourBucket; 24] + Option<(f64, usize)> for the latest
observation: 24 × 24 bytes + 16 bytes ≈ 600 bytes. Phase 12.9 boxed
SeasonalDeviationState so the AggOp::SeasonalDeviation variant fits the
80-byte enum cap (the state itself lives on the heap behind a Box); see
crates/beava-core/src/agg_op.rs line 482 and
Phase 12.9 SUMMARY.
bv.seasonal_deviation belongs to the bounded-buffer family — it
shares state-shape with bv.hour_of_day_histogram
but tracks variance, not just counts. Per-event update is Tier 1 (~10 ns
floor / ~30 ns measured per cost-class.md) — three FP
ops on the bucket plus the latest-observation memo. The query path
computes one mean, one variance via E[X²] − E[X]² with Bessel correction,
one sqrt, and one division. There is no window= kwarg in v0 —
seasonal_deviation is lifetime-only. For a "seasonal z-score over the
last 30 days" view, compose with @bv.event(cold_after="30d") per
V0-MEM-GOV-01.
Parameters
| Name | Type | Required | Default | Description |
|---|---|---|---|---|
field |
str |
Yes | — | Numeric field whose hour-of-day baseline is tracked. f64 or i64. |
where |
bv.Col |
No | None |
Boolean expression on event fields; only matching events update the bucket and record the latest value. |
State invariants (informational; not user-tunable):
- 24 hourly buckets — one per UTC hour
0..23(UTC, notimezone=kwarg in v0). - Per-bucket Welford-incompatible textbook variance via running
(n, sum, sum_sq)(single-pass, adequate for v0; subject to catastrophic cancellation only at extreme magnitudes).
Returns
A f64 z-score for the most recent matching event, or null (Python
None) when the entity has not yet accumulated enough data to compute one.
Specifically, the result is null when:
- No matching event has ever been observed (cold-start).
- The bucket for the most recent event has fewer than 2 observations (variance undefined).
- The bucket variance is exactly
0.0(degenerate — every prior observation in this hour was identical).
Complexity
| Resource | Bound |
|---|---|
| CPU per event | Tier 1 (~10 ns floor / ~30 ns measured — n += 1; sum += v; sum_sq += v*v on the indexed bucket) — see cost-class.md |
| Memory per entity | O(1) — [HourBucket; 24] + Option<(f64, usize)> ≈ 600 bytes per Phase 12.8 V0-MEM-GOV-02. Boxed inside AggOp per Phase 12.9 to fit the 80-byte enum cap |
| Lifetime mode | Required — bv.seasonal_deviation has no window= kwarg in v0; lifetime is the only mode |
Examples
Example 1: Anomalous-amount detection per user
import beava as bv
@bv.event
class Txn:
user_id: str
amount: float
@bv.table(key="user_id")
def UserAmountSeasonality(txn) -> bv.Table:
return (
txn.group_by("user_id")
.agg(amount_z_for_hour=bv.seasonal_deviation("amount"))
)
# After many transactions building a per-hour baseline,
# a sudden $5000 charge at 03:00 UTC for a user whose 03:00 history is small
# returns a positive z-score:
result = app.get("UserAmountSeasonality", "alice")
# result == {"amount_z_for_hour": 4.2}
Example 2: Successful-only request-size seasonality per endpoint
@bv.table(key="endpoint")
def EndpointSizeSeasonality(reqs) -> bv.Table:
return (
reqs.group_by("endpoint")
.agg(size_z=bv.seasonal_deviation("size_bytes",
where=bv.col("status") == 200))
)
Wire
JSON wire form in a register payload:
{
"kind": "derivation",
"name": "UserAmountSeasonality",
"output_kind": "table",
"key": ["user_id"],
"agg": {
"amount_z_for_hour": {
"op": "seasonal_deviation",
"params": {
"field": "amount"
}
}
}
}
See examples/wire/register-fraud-team.request.json for a full payload example.
Edge cases
- Empty stream / cold-start: result is
null— there is no latest observation to score. - Single observation in the current bucket: result is
null— variance requiresn ≥ 2. - Degenerate bucket (variance = 0): result is
null— every prior observation in this hour was identical, so no spread to z-score against. - Non-numeric source field (
Value::Str,Value::Bool): event silently dropped (the bucket is not updated, the latest-observation memo is not written). - NaN inputs: dropped —
numeric_from_rowreturnsNoneon non-F64/I64payloads. - Pre-1970 events (
now_ms< 0): the hour index usesrem_euclid, so negativenow_msstill maps to a valid hour (no panic, no wraparound). window=kwarg attempted: raisesTypeErrorat SDK-helper-call time. For a "seasonal z-score over the last N days" use@bv.event(cold_after="...")to bound the lifetime via per-entity TTL.- Catastrophic cancellation at extreme magnitudes: the textbook
E[X²] − E[X]²formula can lose precision whensum²/nandsum_sqare nearly equal. v0 accepts this trade-off (single-pass + atomic-friendly) over a Welford rewrite. v0.1+ may switch to Welford if precision becomes a real-workload issue. - UTC-only: the bucket index is computed against UTC. There is no
timezone=kwarg in v0; if you need local-hour bucketing, derive alocal_hourcolumn and usebv.zscoreon the per-hour split viawhere=bv.col("local_hour") == X. - Out-of-order event-time: does not matter. beava is processing-time-only per
project_redis_shaped_no_event_time_ever; the bucket is keyed on server arrival timenow_ms. - Lifetime mode: the only mode. Per-entity memory is fixed at ~600 bytes per V0-MEM-GOV-02.
See also
- cost-class.md — performance tier (Tier 1)
- bv.hour_of_day_histogram — count-only sibling (same hour-of-day index, no value tracking)
- bv.dow_hour_histogram — 168-bin weekly cousin (counts only)
- bv.z_score — entity-level z-score against a single rolling baseline (no per-hour split)
- bv.ew_zscore — drift-aware z-score with exponential decay (no per-hour split)
- V0-MEM-GOV-01 — cold-entity eviction (
@bv.event(cold_after=...)) for windowing the lifetime - V0-MEM-GOV-02 —
O(1)lifetime-aggregation contract - pipeline-dsl/compilation-rules.md — chain compilation rules