Rate limits
Centra’s Integration API Rate Limits
To keep our platform stable for all clients, we limit how often some APIs can be used. We use different methods to enforce these limits. As part of our fair usage policy, we ask developers to follow industry best practices: avoid unnecessary API calls and complexity, cache results, and retry requests responsibly.
Rate Limits in Centra’s Integration API
Centra’s Integration API uses several different plans for rate limiting, described further below. The limits depend on your Centra agreement. With the "Growth" plan as the default, the options are:
| Limit | Growth | Enhanced | High volume | Enterprise |
|---|---|---|---|---|
| Request limit (number of calls to the API in a time period) | 60/10s, 5k/1h | 150/10s, 20k/1h | 450/10s, 40k/1h | 1.5k/10s, 85k/1h |
| Mutation limit (number of mutations in a time period) | 20/10s, 1.8k/1h | 50/10s, 4.5k/1h | 150/10s, 7k/1h | 500/10s, 25k/1h |
| Query complexity limit (aggregated complexity of queries in a time period) | 250k/10s, 30M/1h | 0.9M/10s, 90M/1h | 1.8M/10s, 180M/1h | 4M/10s, 400M/1h |
If your integration requires higher rate limits, you can request an upgrade to a higher tier. You can also query the Integration API to see which rate limits are enforced for a particular client.
Rate limits apply per Centra environment and are shared by all integrations. One integration may exhaust the limit and temporarily rate-limit others.
We may temporarily modify rate limits to ensure the stability of the platform and its performance for all users.
Your integration must be built to handle lowered rate limits gracefully and to behave as a good citizen in an environment where multiple integrations share the same rate limits for the same client. See Avoiding hitting the Rate Limits.
Integration API token bucket rate limit algorithm
The rate limit implementation in Centra’s Integration API is based on the token bucket algorithm. A short explanation of the token bucket algorithm concept:
- There is a bucket with a fixed capacity that contains tokens.
- The bucket is full at the beginning.
- An allowed request consumes one or more tokens.
- If the bucket contains enough tokens for a particular request, the request is allowed and tokens are consumed. Otherwise, the request is denied and no tokens are consumed.
- The bucket is replenished with new tokens at a constant rate (e.g. 30 tokens every second).
- Once the bucket is fully replenished, it stays full until the next request consumes some tokens.
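For illustration only (this is not Centra's internal implementation), a minimal token bucket could look like this:

```typescript
// Illustrative sketch of a token bucket; not Centra's actual implementation.
class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private readonly capacity: number,       // bucket size, e.g. 60 for the Growth burst request limit
    private readonly refillPerSecond: number // constant refill rate, e.g. 6 tokens/second
  ) {
    this.tokens = capacity;                  // the bucket starts full
    this.lastRefill = Date.now();
  }

  // Replenish tokens at a constant rate, never exceeding the capacity.
  private refill(): void {
    const now = Date.now();
    const elapsedSeconds = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefill = now;
  }

  // Allow the request only if enough tokens remain; otherwise consume nothing.
  tryConsume(cost: number): boolean {
    this.refill();
    if (this.tokens < cost) {
      return false;
    }
    this.tokens -= cost;
    return true;
  }
}
```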
Token buckets used for Centra’s Integration API rate limits
The Integration API uses six token buckets for rate limiting:
- Burst request limit: Number of queries and mutations in 10 seconds.
- Burst mutation limit: Number of mutations in 10 seconds.
- Burst query complexity limit: The aggregated complexity of queries in 10 seconds.
- Sustained request limit: Number of queries and mutations in 1 hour.
- Sustained mutation limit: Number of mutations in 1 hour.
- Sustained query complexity limit: The aggregated complexity of queries in 1 hour.
Tokens in each of these buckets are consumed and replenished independently. If at least one of the buckets contains insufficient tokens for a given request, the request is denied.
Token calculation in Centra’s Integration API
The token calculation logic is straightforward:
- For the request limits (burst and sustained), 1 request costs 1 token.
- For the mutation limits (burst and sustained), 1 mutation costs 1 token.
- For the query complexity limits (burst and sustained), 1 complexity point costs 1 token. See how we calculate the query complexity here.
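To make these rules concrete, here is a hypothetical sketch of the cost a single HTTP request puts on each of the six buckets, together with the all-or-nothing check described earlier. The names below are illustrative, not Integration API identifiers:

```typescript
// Illustrative only: bucket names below are not Integration API identifiers.
type Bucket =
  | "burstRequests" | "sustainedRequests"
  | "burstMutations" | "sustainedMutations"
  | "burstComplexity" | "sustainedComplexity";

// Token cost of one HTTP request containing `mutations` mutations
// with an aggregated query complexity of `complexity`.
function costOf(mutations: number, complexity: number): Record<Bucket, number> {
  return {
    burstRequests: 1, sustainedRequests: 1,                       // 1 request = 1 token
    burstMutations: mutations, sustainedMutations: mutations,     // 1 mutation = 1 token
    burstComplexity: complexity, sustainedComplexity: complexity, // 1 complexity point = 1 token
  };
}

// The request is allowed only if every bucket still holds enough tokens.
function allowed(remaining: Record<Bucket, number>, cost: Record<Bucket, number>): boolean {
  return (Object.keys(cost) as Bucket[]).every((bucket) => remaining[bucket] >= cost[bucket]);
}
```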
What counts as one request? And what about batched mutations?
Technically speaking, one HTTP request can contain multiple GraphQL operations, and each operation can include multiple top-level fields, each of which roughly corresponds to a single REST call. For example:
```graphql
query threeInOne {
  viewer {
    name
    integrationName
  }
  counters {
    orders(where: { status: [PENDING] })
  }
  rateLimits {
    type
    intervalSeconds
    quota
    usedQuota
    remainingQuota
  }
}

mutation twoInOne {
  captureShipment(id: 345) {
    userErrors {
      message
      path
    }
    userWarnings {
      message
      path
    }
  }
  addOrderNote(
    input: { order: { externalId: "my-id-123" }, message: "Hello world" }
  ) {
    userErrors {
      message
      path
    }
    userWarnings {
      message
      path
    }
  }
}
```
If you send such a document in your request, the GraphQL server needs to know which of the two operations to run; hence the JSON body must also include the operationName parameter. Here's the official specification: link.
Avoid sending extra (not executed) operations in your requests, as it's an inefficient use of bandwidth and server resources.
So, with a JSON body like this
```json
{
  "query": "(as above)",
  "operationName": "twoInOne",
  "variables": {}
}
```
then, in terms of rate limits, this counts as:
- One request for the `REQUEST_COUNT` buckets.
- Two mutations for the `MUTATION_COUNT` buckets.
- Two points for `QUERY_COMPLEXITY`.
Some GraphQL servers allow executing multiple independent operations in one batch by wrapping them in a JSON array (link). This is not supported by the Integration API.
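As an illustration, here is a minimal sketch of posting the document above with an explicit operationName. The endpoint URL and authorization header are placeholders, not values documented here, and in practice you would send only the operation you intend to execute:

```typescript
// Sketch only: the endpoint and authorization header are placeholders for your own setup.
const ENDPOINT = "https://example.centra.com/graphql"; // placeholder URL
const TOKEN = process.env.INTEGRATION_API_TOKEN ?? ""; // placeholder token source

// The full GraphQL document shown above (query threeInOne + mutation twoInOne).
const documentText = "query threeInOne { ... } mutation twoInOne { ... }";

async function runTwoInOne(): Promise<unknown> {
  const response = await fetch(ENDPOINT, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${TOKEN}`, // placeholder auth scheme
    },
    body: JSON.stringify({
      query: documentText,
      operationName: "twoInOne", // tells the server which operation in the document to run
      variables: {},
    }),
  });
  return response.json();
}
```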
Checking the Integration API rate limit status
The following query can be used to get information about currently enforced rate limits, and available tokens in each token bucket:
```graphql
query {
  rateLimits {
    type
    intervalSeconds
    quota
    usedQuota
    remainingQuota
  }
}
```
The result returned is a list of six objects, which represent the six token buckets:
- `type` – type of the rate limit managed by this token bucket, represented as an enum `RateLimitType` that can have 3 possible values:
  - `REQUEST_COUNT`
  - `QUERY_COMPLEXITY`
  - `MUTATION_COUNT`
- `intervalSeconds` – the time it takes to replenish an empty bucket:
  - `TEN_SECONDS`
  - `ONE_HOUR`
- `quota` – size of the bucket, the maximum number of tokens that will ever fit in the bucket
- `usedQuota` – how many tokens have been consumed by requests and not yet replenished
- `remainingQuota` – how many tokens remain in the bucket and are available for consumption
The query consumes 1 request and 5 complexity points.
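As a hedged sketch, an integration could poll this query and act on the numbers. The endpoint, authorization header, and the 10% warning threshold below are assumptions, and the field types mirror the descriptions above rather than the exact schema:

```typescript
// Sketch only: endpoint and authorization header are placeholders.
const ENDPOINT = "https://example.centra.com/graphql";
const TOKEN = process.env.INTEGRATION_API_TOKEN ?? "";

// Field types mirror the descriptions above, not necessarily the exact schema.
interface RateLimit {
  type: string;            // REQUEST_COUNT, QUERY_COMPLEXITY or MUTATION_COUNT
  intervalSeconds: string; // TEN_SECONDS or ONE_HOUR
  quota: number;
  usedQuota: number;
  remainingQuota: number;
}

async function fetchRateLimits(): Promise<RateLimit[]> {
  const response = await fetch(ENDPOINT, {
    method: "POST",
    headers: { "Content-Type": "application/json", Authorization: `Bearer ${TOKEN}` },
    body: JSON.stringify({
      query: "query { rateLimits { type intervalSeconds quota usedQuota remainingQuota } }",
    }),
  });
  const { data } = await response.json();
  return data.rateLimits;
}

// Example use: warn when any bucket drops below 10% of its quota (the threshold is arbitrary).
async function warnOnLowQuota(): Promise<void> {
  for (const limit of await fetchRateLimits()) {
    if (limit.remainingQuota < limit.quota * 0.1) {
      console.warn(`Low quota for ${limit.type}/${limit.intervalSeconds}:`, limit.remainingQuota);
    }
  }
}
```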
Testing Rate Limits
No matter how good your integration is, it can still encounter the HTTP "429 Too Many Requests" status code and must handle it correctly. To see what such a response looks like and to simplify testing these scenarios, you can include a special header: `X-Trigger-Rate-Limit-Error: true`. The response will contain a `Retry-After` header with a date in RFC 2822 format (https://www.rfc-editor.org/rfc/rfc2822.html), 10 seconds into the future. For example: Fri, 21 Mar 2025 19:15:55 GMT.
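For example, a small sketch of a test helper that forces such a response and reads the `Retry-After` header. The endpoint and authorization header are placeholders:

```typescript
// Sketch only: forces a simulated 429 using the header documented above.
async function triggerSimulatedRateLimit(): Promise<void> {
  const response = await fetch("https://example.centra.com/graphql", { // placeholder URL
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.INTEGRATION_API_TOKEN ?? ""}`, // placeholder auth
      "X-Trigger-Rate-Limit-Error": "true", // ask the API to respond with 429
    },
    body: JSON.stringify({ query: "query { rateLimits { type } }" }),
  });

  console.log(response.status); // expected: 429
  // Retry-After is an RFC 2822 date roughly 10 seconds in the future.
  const retryAfter = response.headers.get("Retry-After");
  if (retryAfter) {
    const waitMs = new Date(retryAfter).getTime() - Date.now();
    console.log(`Should wait ~${Math.max(0, waitMs)} ms before retrying`);
  }
}
```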
Avoiding hitting the Rate Limits
Follow industry best practices to avoid rate limits: reduce unnecessary requests, cache data, monitor errors, and respect backoff signals.
- Architecture
  - Use the Integration API for asynchronous backend integrations only
    Never use the Integration API for serving data to a frontend website. Serving data to a frontend (even through a proxy) means the rate limits will very likely be hit during periods of high website traffic, with user-visible errors as a result. Use the Storefront API for serving frontends instead; it offers several orders of magnitude higher throughput and lower latency.
  - Ensure the data model is efficient
    An inefficient custom data model in Centra may force you to mutate the same data multiple times (e.g. updating a "hand wash only" icon file attached to each of 10,000 products separately, as opposed to using a single file shared between the products). Do not use Dynamic Attributes for data that could be normalized by using Mapped Attributes. This is especially important for translatable attributes. See more about attributes in Centra.
  - Use caching for data that your app uses often
    If you need to access some data frequently, cache it. Some data changes very seldom (e.g. markets, stores, countries, pricelists, the product catalog).
  - Subscribe to events to update your cached data
    Subscribe to events for cache invalidation rather than brute-force polling the API repeatedly. See more about the events.
  - Only mutate data that has changed
    Keep track of mutations that have already been made. Don’t attempt to mutate data that is already up to date in Centra.
- Craft requests
  - Optimize your code to fetch only the data it needs
    Select only the necessary fields, especially for nested objects and lists. Sometimes a different query can simplify the structure. Avoid deep nesting: even though GraphQL is flexible enough to query "everything" in one go, it is sometimes more efficient to issue additional queries instead.
  - Use the most comprehensive mutation for your task
    For some common tasks, the Integration API offers mutations that conveniently carry out multiple activities in a single mutation. For example, when syncing a product with a variant and sizes, use the mutation designed for that rather than multiple mutations in sequence.
  - For batch jobs, use batch operations where available
    The Integration API offers some batch operations geared for larger import jobs. Use those where available for batch tasks, and monitor their status.
- Handle limiting
  - Smoothen out the rate of requests
    Regulate the rate of requests to ensure a smooth distribution. This especially applies if you send requests asynchronously, which enables sharper load spikes (sudden spikes are more likely to get rate-limited).
  - Provide a great user experience while users wait for large operations
    Syncing a large amount of data to or from the Integration API takes time, whether you run multiple mutations in sequence or a batch job. Give your users clear information about the progress and status of large jobs, such as adding a new collection of products to Centra or loading an empty data warehouse with historical data.
  - Handle errors appropriately
    Requests that result in user errors (and warnings) should be handled appropriately, both to prevent spamming the API and to ensure your integration can recover gracefully after having been rate-limited.
  - Respect the backoff time
    When your integration gets rate-limited, the response is returned with HTTP status code 429 and contains the `Retry-After` HTTP header with the timestamp of when you should resume your requests. Your integration should wait until that time has passed before making any further requests (a sketch of such a retry loop follows after this list).
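A minimal sketch of such backoff handling, assuming a generic HTTP client (fetch). The endpoint, authorization header, and the 10-second fallback wait are assumptions:

```typescript
// Sketch only: waits for the Retry-After time after a 429 and then retries once.
async function postWithBackoff(body: unknown): Promise<Response> {
  const ENDPOINT = "https://example.centra.com/graphql"; // placeholder URL
  const headers = {
    "Content-Type": "application/json",
    Authorization: `Bearer ${process.env.INTEGRATION_API_TOKEN ?? ""}`, // placeholder auth
  };

  const send = () =>
    fetch(ENDPOINT, { method: "POST", headers, body: JSON.stringify(body) });

  const response = await send();
  if (response.status !== 429) {
    return response;
  }

  // Respect the Retry-After header; fall back to 10 seconds if it is missing or unparsable.
  const retryAfter = response.headers.get("Retry-After");
  const resumeAt = retryAfter ? new Date(retryAfter).getTime() : NaN;
  const waitMs = Number.isFinite(resumeAt) ? Math.max(0, resumeAt - Date.now()) : 10_000;
  await new Promise((resolve) => setTimeout(resolve, waitMs));

  return send(); // a production integration would also cap and log repeated retries
}
```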
Are you still hitting Rate Limits?
The rate limits have been carefully designed and should be sufficient for most use cases. Please get in touch with our support for guidance if you are struggling to stay within the rate limits.
If you still need higher rate limits despite following the guidelines, we can offer higher rate limits as an additional (paid) service.
Requesting an upgrade to rate limits
To meet increased operational needs, your team can request an upgrade to a higher rate limit tier through either of the following channels:
- Reach out to your Client Success Manager or Centra Support
- From the Centra AMS, go to Systems → API tokens, select Change Integration API Rate Limits → Request an upgrade (note: you must have Full Access Administrator user rights)