Rate limiting and stable error codes in our Tasks API

When we built Sanctum Tasks, we built it for one specific audience: programs, not people. Every design decision — from the API structure to the error handling — was made with the assumption that the caller would be an AI agent making thousands of requests, not a human clicking through a dashboard. That's a fundamental shift in how you think about API design. Your errors can't be user-friendly — they have to be integrator-friendly. Your limits can't be intuitive — they have to be predictable.

Integrators Are Agents, Not Humans

When a human uses an API, they expect graceful failures. If something goes wrong, they'd rather see "Oops, something went wrong — try again" than a raw HTTP status. Humans can retry manually. They can figure out what they did wrong. But an agent has none of that flexibility.

When your caller is an AI agent, the API contract is everything. The agent can't negotiate with your API, can't interpret a "friendly" error message, and has no way to know whether a retry is safe unless the response says so. Every error has to tell the agent exactly what happened and what to do next. That's why we've been deliberate about error codes since day one.

That principle shapes the whole Tasks API: no guessing, no interpretation, and no generic "an error occurred" messages that leave the agent stuck.

Why 429 Beats Silent Failure

The simplest example is rate limiting. Every API has limits — we limit requests per minute to keep the service stable. But here's the design question: what do you return when someone hits the limit?

The wrong answer is to return success anyway. That's silent failure, and it's catastrophic for agents because they'd believe the request succeeded and move on. The second wrong answer is to return an error that looks like a request problem — like a 400 — because the agent might interpret that as something it did wrong and never retry.

The right answer is 429 Too Many Requests. It's explicit, it's standard, and it tells the caller exactly what's happening: rate limit hit, back off. We pair it with the standard Retry-After header, which tells the agent how long to wait before trying again, so the agent has everything it needs to proceed correctly.
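As a rough illustration of the client side, here is a minimal Python sketch of an agent honoring Retry-After. The `send` callable and the `(status, headers)` response shape are hypothetical stand-ins, not the Sanctum Tasks client, and the sketch assumes Retry-After carries a number of seconds.

```python
import time
from typing import Callable, Mapping, Tuple

# Hypothetical response shape for this sketch: (status_code, headers).
Response = Tuple[int, Mapping[str, str]]


def retry_after_seconds(headers: Mapping[str, str], default: float = 1.0) -> float:
    """Read Retry-After as seconds; fall back to a default if absent or unparsable."""
    raw = headers.get("Retry-After")
    if raw is None:
        return default
    try:
        return max(0.0, float(raw))
    except ValueError:
        # Retry-After may also be an HTTP date; this sketch only handles seconds.
        return default


def call_with_rate_limit(send: Callable[[], Response],
                         sleep: Callable[[float], None] = time.sleep) -> Response:
    """Send once; on 429, wait as instructed and retry a single time."""
    status, headers = send()
    if status != 429:
        return status, headers
    sleep(retry_after_seconds(headers))
    return send()
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting. A real client would also look up the header case-insensitively.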

We also return 429 for any limit, not just requests per minute. If a bulk create operation exceeds our batch size, we return 429 rather than some custom error. This keeps the error taxonomy simple — 429 always means back off — and the agent can implement a single retry strategy without checking multiple error codes.

Our Error Code Taxonomy

Every error from the Tasks API falls into one of these categories. We're deliberate about this because we want integrators to build reliable handlers without guessing:

400 Bad Request: the request was malformed or invalid. The JSON didn't parse, or the payload didn't match what the API expects. For agents: fix the request and retry.

401 Unauthorized: the API key was missing or invalid. For agents: check your API key, and don't retry until you have a valid one.

403 Forbidden: the API key was valid but not permitted to perform that operation, for example a read-only key trying to create a task. Unlike 401, the caller is authenticated but not allowed. For agents: don't retry; the operation is forbidden for this key.

404 Not Found: the requested resource doesn't exist. For agents: this is usually a bug in your request, so check the ID or path. Don't retry.

429 Too Many Requests: rate limit hit. For agents: stop immediately, read the Retry-After header, wait that long, then retry exactly once.

500 Internal Server Error: something went wrong on our side. For agents: this is the only error where we recommend exponential backoff. Retry with increasing intervals and, after several failures, escalate.
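For the 500 case, one way an agent might implement exponential backoff is sketched below. The base delay, cap, retry count, and the `EscalateError` name are all assumptions chosen for illustration; the API does not prescribe them.

```python
import random
import time
from typing import Callable, List


class EscalateError(RuntimeError):
    """Raised when retries are exhausted and someone else should step in."""


def backoff_delays(retries: int, base: float = 0.5, cap: float = 30.0,
                   jitter: bool = False) -> List[float]:
    """Delay before each retry: base * 2**n, capped, with optional full jitter."""
    delays = []
    for n in range(retries):
        delay = min(cap, base * (2 ** n))
        if jitter:
            delay = random.uniform(0.0, delay)
        delays.append(delay)
    return delays


def call_with_backoff(send: Callable[[], int], max_retries: int = 4,
                      sleep: Callable[[float], None] = time.sleep) -> int:
    """Retry while the status is 500; escalate once max_retries is used up."""
    status = send()
    for delay in backoff_delays(max_retries):
        if status != 500:
            return status
        sleep(delay)
        status = send()
    if status == 500:
        raise EscalateError(f"still failing after {max_retries} retries")
    return status
```

Turning on `jitter` spreads retries out so that many agents recovering from the same outage don't all return at once.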

That covers every possible error from the API. No extra codes, no subtle distinctions, no custom error strings that vary by endpoint. Agents can handle everything with a simple switch.
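The "simple switch" could look something like the following Python sketch. The `Action` names are made up for this example, and routing unlisted status codes to an escalation path is a defensive assumption rather than something the API requires.

```python
from enum import Enum


class Action(Enum):
    PROCEED = "proceed"
    FIX_REQUEST = "fix the request, then retry"
    REPLACE_KEY = "stop until a valid key is supplied"
    DO_NOT_RETRY = "do not retry"
    WAIT_AND_RETRY = "wait for Retry-After, then retry once"
    BACKOFF_AND_RETRY = "retry with exponential backoff, then escalate"
    ESCALATE = "unexpected status, escalate"


# One entry per status code in the taxonomy.
ACTIONS = {
    400: Action.FIX_REQUEST,
    401: Action.REPLACE_KEY,
    403: Action.DO_NOT_RETRY,
    404: Action.DO_NOT_RETRY,
    429: Action.WAIT_AND_RETRY,
    500: Action.BACKOFF_AND_RETRY,
}


def action_for(status: int) -> Action:
    """Map a response status to the agent's next step."""
    if 200 <= status < 300:
        return Action.PROCEED
    # Assumption: anything outside the documented set is escalated, not guessed at.
    return ACTIONS.get(status, Action.ESCALATE)
```

Because the set of codes is small and stable, the whole error-handling policy fits in one table that can be reviewed at a glance.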

What We Log vs Return

There's a difference between what happens internally and what we return. We log extensively for debugging: for every request we record the full request data, all parameters, an identifier for the API key (never the key itself), timestamps, and anything else we need to reproduce an error. That all stays on our side, where it's useful when someone reports a problem.

What we return to the caller is much simpler: just the status code, a brief message like "rate limit exceeded" or "task not found," and any headers required by the API specification (like Retry-After on 429). We don't return full stack traces, database errors, or internal configuration in the response. That's a security practice and an API design practice: error messages are for integrators debugging their code, not for reading our server internals.

There's one exception: on 500 errors, we return a correlation ID in the response. It's a short string that the caller can include when reporting the problem to us. We can look up the full error on our side using that ID.
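To make the split between logged and returned data concrete, here is a hedged server-side sketch in Python. The body field names (`message`, `correlation_id`), the 12-character ID format, and the logger name are assumptions for illustration; they are not the actual Tasks API response schema.

```python
import logging
import uuid
from typing import Dict, Optional, Tuple

logger = logging.getLogger("tasks_api")  # hypothetical logger name


def build_error_response(status: int, message: str, internal_detail: str = "",
                         retry_after: Optional[int] = None
                         ) -> Tuple[int, Dict[str, str], Dict[str, str]]:
    """Return (status, headers, body); internal detail goes to the log only."""
    headers: Dict[str, str] = {}
    body: Dict[str, str] = {"message": message}
    correlation_id = uuid.uuid4().hex[:12]

    # Full detail stays server-side, keyed by the correlation ID.
    logger.error("status=%s correlation_id=%s detail=%s",
                 status, correlation_id, internal_detail)

    if status == 429 and retry_after is not None:
        headers["Retry-After"] = str(retry_after)
    if status == 500:
        # The only case where the ID is exposed to the caller.
        body["correlation_id"] = correlation_id
    return status, headers, body
```

The caller of a failed request gets a brief message and, for a 500, an opaque ID to quote in a bug report; stack traces and database errors never enter the response body.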

Example Failure Mode: Bulk Create

Here's a concrete example of why this matters. Our create-task endpoint accepts a batch format — you can create multiple tasks in a single request. This is useful for agents that want to set up work in a batch.

But that endpoint has limits: we cap at ten tasks per request. What happens when an agent tries to create eleven? We could return success for ten and silently drop the eleventh. That's terrible — the agent would think all eleven tasks were created and proceed with incomplete work.

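We could return a 400 error on the whole request. But in our taxonomy a 400 means the request was malformed, and an eleven-task batch is perfectly well-formed; it's simply over a limit. A 400 would send the agent looking for a bug in its payload instead of adjusting its batch size.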

Instead, we return 429. The agent sees that it hit a limit, reads Retry-After (which we set to zero, with a note to retry immediately with a smaller batch), resubmits the first ten tasks, and sends the eleventh in a follow-up request. The agent doesn't get stuck, and no work is lost.
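An agent that already knows the cap can also avoid the 429 entirely by splitting its work up front. The Python sketch below assumes a hypothetical `post_batch` callable that submits one batch and returns the HTTP status; only the ten-task cap comes from the API as described here.

```python
from typing import Callable, Dict, Iterator, List, Sequence

MAX_BATCH = 10  # per-request cap on the create-task endpoint

Task = Dict[str, str]


def chunked(tasks: Sequence[Task], size: int = MAX_BATCH) -> Iterator[List[Task]]:
    """Yield consecutive batches of at most `size` tasks."""
    for start in range(0, len(tasks), size):
        yield list(tasks[start:start + size])


def create_all(tasks: Sequence[Task],
               post_batch: Callable[[List[Task]], int]) -> int:
    """Submit every task in capped batches; return how many were created."""
    created = 0
    for batch in chunked(tasks):
        status = post_batch(batch)
        if not 200 <= status < 300:
            # Surface the failure with enough context to resume later.
            raise RuntimeError(
                f"batch failed with status {status} after {created} tasks")
        created += len(batch)
    return created
```

With eleven tasks this produces one batch of ten and one batch of one, so nothing is dropped and the agent knows exactly how many tasks exist if a later batch fails.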

That's a design that works for agents in production. Every error is either recoverable or clearly not recoverable.

The Bigger Picture

Everything we've done on the API — the error codes, the rate limiting, the idempotent operations, the predictable responses — is about running a service that agents can depend on. We're not building a human-facing tool that needs user-friendly errors. We're building a tool that agents can reliably integrate into their own pipelines.

That's the philosophy: every decision is made with the assumption that the caller isn't going to read an error message and think. It's going to parse the status code, handle it correctly, and move on. If that's not possible with our API, we've failed.

If you're building APIs for agents, we'd encourage you to apply the same principle: make all errors machine-readable, keep error code sets small and stable, and treat every failure mode as a contract.

We run this in production every day. If you want the same capability for your program, or help turning ops data into a clear decision, get in touch.