Open specification · v1.0 · vendor-neutral
How often a piece of code runs, and how much it handles, is something the team knows but the codebase never records. Expected Load writes it down — in a comment next to the code, one uniform shape across every language — so any tool can plan capacity, generate load tests, set alerts, estimate cost, and more from a single source of truth.
/**
* @expected-load
* monthly_requests: 5_000_000
* request_duration_ms: 40
*/
export function handler(req: Request): Response {
  // ...
}
Expected Load records a single fact — how often code runs and how much it handles. That fact feeds a whole range of tools, each reading the declaration it cares about. Cost is just one of them.
Size instances, replicas, and pools — and set scale targets — to real expected volume instead of guesswork.
Generate realistic load, spike, and soak scenarios, and assert latency budgets, straight from the declared numbers.
Derive monitoring baselines and anomaly thresholds, and rank what matters most for on-call.
Propagate volume through the call graph to surface hotspots, fan-out, and amplification before they bite.
Estimate spend, forecast budgets, and see the cost impact of a change while it's still in review.
Turn the same load into an energy and carbon estimate, and inform where workloads should run.
The specification defines the data; independent tools provide each of these — one declaration, read many ways.
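As a rough illustration of how one such tool might read the numbers, the sketch below turns the declaration from the first example into an average request rate and an average concurrency using Little's law. The function name `average_load` and the 30-day month are assumptions of this sketch, not part of the specification.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumption: a 30-day month


def average_load(monthly_requests: int, request_duration_ms: int) -> tuple[float, float]:
    """Return (requests per second, average concurrent requests)."""
    rate_per_s = monthly_requests / SECONDS_PER_MONTH
    # Little's law: concurrency = arrival rate x time each request spends in the system.
    concurrency = rate_per_s * (request_duration_ms / 1000)
    return rate_per_s, concurrency


rate, concurrency = average_load(monthly_requests=5_000_000, request_duration_ms=40)
print(f"{rate:.2f} req/s, {concurrency:.3f} concurrent on average")
```

For 5,000,000 requests a month at 40 ms each, this works out to about 1.93 requests per second and roughly 0.08 requests in flight on average. These are averages only; a real capacity planner would also need to account for peaks.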
The numbers travel with the code — same pull request, same history, same review.
They sit next to the relevant code, not in a parallel file someone forgets to update.
One grammar across languages. No proprietary format, no lock-in — any tool can parse it.
It's a comment. No dependency, no annotation library, no runtime cost, no new files.
A marker, expected-load or @expected-load (case-insensitive), introduces a declaration. Fields follow in block form (one per line) or inline form (all on one line). Only the surrounding comment framing changes by language.
# expected-load:
# monthly_requests: 5_000_000
# request_duration_ms: 40
resource "example_function" "api" {
  name = "api"
}
/**
* @expected-load
* monthly_calls: 100_000
* avg_input_tokens: 1_200
*/
export async function summarise(text: string) {}
# expected-load:
# monthly_calls: 100_000
# avg_input_tokens: 1_200
def summarise(text: str) -> str:
    return text[:280]
// expected-load:
// monthly_requests: 5_000_000
// request_duration_ms: 40
func APIHandler(w http.ResponseWriter, r *http.Request) {}
/**
* @expected-load
* monthlyRequests = 5_000_000
* requestDurationMs = 35
*/
public void handle() {}
/// expected-load:
/// monthly_calls: 100_000
/// avg_input_tokens: 1_200
pub fn summarise(text: &str) -> String { todo!() }
# expected-load:
# monthly_requests: 5_000_000
# request_duration_ms: 40
kind: Deployment
metadata:
  name: api
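Because only the comment framing differs between these examples, a reader can get by with line-level pattern matching instead of a parser for each host language. The following is a minimal sketch of that idea, not the specification's processing model: the helper name `declaration_lines`, the two regular expressions, and the rule that an inline declaration ends on the marker line are all assumptions.

```python
import re

# Comment framing seen in the examples above: "///", "//", "#", "/**", "*/" and a leading "*".
_FRAMING = re.compile(r"^\s*(?:///|//|#|/\*\*|\*/|\*)\s?")
# Marker with optional "@" and optional trailing ":", matched case-insensitively.
_MARKER = re.compile(r"^@?expected-load\b:?\s*(.*)$", re.IGNORECASE)


def declaration_lines(source: str) -> list[str]:
    """Return the field text of the first declaration found in source."""
    fields: list[str] = []
    in_declaration = False
    for raw in source.splitlines():
        framing = _FRAMING.match(raw)
        # Text with the comment framing removed; None when the line is not a comment.
        text = raw[framing.end():].strip() if framing else None
        if not in_declaration:
            marker = _MARKER.match(text) if text else None
            if marker is None:
                continue
            if marker.group(1):
                # Inline form: assume the fields all sit on the marker line.
                return [marker.group(1)]
            in_declaration = True
        elif text:
            fields.append(text)
        else:
            break  # end of the comment, or an empty comment line
    return fields
```

Run against the Terraform example, this would return the two field lines with the leading `#` removed. The framing test is a heuristic: any line that happens to start with `*` or `#` is treated as a comment.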
Block form puts each field on its own line. Inline form puts them all on one line: expected-load monthly_requests=5_000_000 request_duration_ms=40
Keys normalize to snake_case, so monthly-requests, monthlyRequests and monthly_requests are the same field.
Numeric values are integers; underscore (_) and comma (,) are allowed as digit separators, so write 5_000_000 rather than 5000000.
Add confidence, source and last_updated to record how trustworthy the numbers are and when they were last checked.
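The rules above are small enough to sketch in a few lines. The version below assumes the comment framing and marker have already been removed, accepts either `:` or `=` between key and value, and keeps non-numeric values such as confidence levels as plain text. The names `normalize_key`, `parse_value` and `parse_fields` are hypothetical, and the authoritative grammar is the one in the specification.

```python
import re

# One "key: value" or "key=value" pair; values are assumed to contain no spaces.
_PAIR = re.compile(r"([A-Za-z][\w-]*)[ \t]*[:=][ \t]*(\S+)")
# An integer that may use "_" or "," as digit separators.
_INT = re.compile(r"[0-9][0-9_,]*")


def normalize_key(key: str) -> str:
    """Map monthly-requests, monthlyRequests and monthly_requests to one spelling."""
    key = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", key)  # split camelCase humps
    return key.replace("-", "_").lower()


def parse_value(raw: str):
    """Return an int for numeric values; leave anything else as text."""
    if _INT.fullmatch(raw):
        return int(raw.replace("_", "").replace(",", ""))
    return raw


def parse_fields(text: str) -> dict:
    """Parse block-form lines or a single inline-form line into a dict."""
    return {normalize_key(key): parse_value(raw) for key, raw in _PAIR.findall(text)}
```

For example, `parse_fields("monthly_requests=5_000_000 request_duration_ms=40")` yields `{"monthly_requests": 5000000, "request_duration_ms": 40}`, and the camelCase keys of the Java example normalize to the same field names.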
A small, stable core so independent producers and consumers agree on the common cases. The vocabulary is open — consumers may define more, and must tolerate fields they don't recognize.
| Field | Unit | Meaning |
|---|---|---|
| monthly_requests | req / month | Invocations per month (HTTP, queue messages, function calls). |
| request_duration_ms | ms | Average wall-clock duration of one request. |
| storage_gb | GB | Average data at rest. |
| monthly_data_processed_gb | GB / month | Data transferred or processed per month. |

| Field | Unit | Meaning |
|---|---|---|
| monthly_calls | calls / month | Model or API invocations per month. |
| avg_input_tokens | tokens | Average input (prompt) tokens per call. |
| avg_output_tokens | tokens | Average output (completion) tokens per call. |
| avg_conversation_turns | turns | Average turns per conversation, for multi-turn workloads. |

| Field | Unit | Meaning |
|---|---|---|
| requests_per_active_minute | req / min | Request rate while the workload is active; captures burstiness a monthly total misses. |

| Field | Values | Meaning |
|---|---|---|
| version | integer (default 1) | Specification major version targeted. |
| confidence | low · medium · high | How sure the author is of the numbers. |
| source | manual · observed · estimated | Where the numbers came from. |
| last_updated | ISO 8601 date | When the numbers were last reviewed. |
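A consumer such as a linter might check the metadata fields against the allowed values listed above while passing over fields it does not recognize. The sketch below is one possible shape for that check. The function name `check_metadata`, the warning wording, and the example field `team_owner` are hypothetical, and the sketch assumes values have already been parsed into a dictionary.

```python
from datetime import date

CONFIDENCE = {"low", "medium", "high"}
SOURCE = {"manual", "observed", "estimated"}


def check_metadata(fields: dict) -> list[str]:
    """Return warnings for malformed metadata; unknown fields are ignored."""
    warnings = []
    if not isinstance(fields.get("version", 1), int):  # version defaults to 1
        warnings.append("version should be an integer")
    if "confidence" in fields and fields["confidence"] not in CONFIDENCE:
        warnings.append("confidence should be low, medium or high")
    if "source" in fields and fields["source"] not in SOURCE:
        warnings.append("source should be manual, observed or estimated")
    if "last_updated" in fields:
        try:
            date.fromisoformat(str(fields["last_updated"]))
        except ValueError:
            warnings.append("last_updated should be an ISO 8601 date")
    return warnings


declared = {
    "monthly_requests": 5_000_000,
    "confidence": "high",
    "source": "observed",
    "last_updated": "2025-01-15",
    "team_owner": "payments",  # hypothetical extension field, tolerated silently
}
print(check_metadata(declared))
```

Here the unrecognized `team_owner` field produces no warning, in line with the requirement that consumers tolerate fields they do not recognize.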
Whatever a tool does with the data — a capacity planner, a load-test generator, a cost estimator, a linter — it reads a declaration in the same five steps, with no host-language parser required.
[…] (#, //, /** … */, ///).
[…] snake_case; values to integers (ignoring _/,).
The full processing model, diagnostics, and ABNF grammar are in the specification.