
Rate limits

Centra’s Integration API Rate Limits

To keep our platform stable for all clients, we limit how often some APIs can be used. We use different methods to enforce these limits. As part of our fair usage policy, we ask developers to follow industry best practices: avoid unnecessary API calls and complexity, cache results, and retry requests responsibly.

Rate Limits in Centra’s Integration API

Centra’s Integration API uses several different plans for rate limiting, which are described below. Rate limits vary depending on your Centra agreement. With the "Growth" plan being the default, the options are:

  • Request limit – number of calls to the API in a time period:
    • Growth: 60/10s, 5k/1h
    • Enhanced: 150/10s, 20k/1h
    • High volume: 450/10s, 40k/1h
    • Enterprise: 1.5k/10s, 85k/1h
  • Mutation limit – number of mutations in a time period:
    • Growth: 20/10s, 1.8k/1h
    • Enhanced: 50/10s, 4.5k/1h
    • High volume: 150/10s, 7k/1h
    • Enterprise: 500/10s, 25k/1h
  • Query complexity limit – the aggregated complexity of queries in a time period:
    • Growth: 250k/10s, 30M/1h
    • Enhanced: 0.9M/10s, 90M/1h
    • High volume: 1.8M/10s, 180M/1h
    • Enterprise: 4M/10s, 400M/1h

If your integration requires higher rate limits, you can request an upgrade of your tier. It is possible to query the Integration API to see what rate limits are enforced for a particular client.

Rate limits apply per Centra environment and are shared by all integrations. One integration may exhaust the limit and temporarily rate-limit others.

We may temporarily modify rate limits to ensure the stability of the platform and its performance for all users.

Your integration must be built to handle lowered rate limits gracefully and to behave as a good citizen in an environment where multiple integrations share the same rate limits for the same client. See Avoiding hitting the Rate Limits.

Integration API token bucket rate limit algorithm

The rate limit implementation in Centra’s Integration API is based on the token bucket algorithm. A short explanation of the token bucket algorithm concept:

  • There is a bucket with a fixed capacity that contains tokens.
  • The bucket is full at the beginning.
  • An allowed request consumes one or more tokens.
  • If the bucket contains enough tokens for a particular request, the request is allowed and tokens are consumed. Otherwise, the request is denied and no tokens are consumed.
  • The bucket is replenished with new tokens at a constant rate (e.g. 30 tokens every second).
  • Once the bucket is fully replenished, it stays full until the next request consumes some tokens.
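As a rough illustration, the following TypeScript sketch implements the algorithm described above. It is not Centra’s implementation; the class name, method names, and millisecond-based bookkeeping are choices made for this example.

class TokenBucket {
  private tokens: number;
  private lastRefillMs: number;

  constructor(
    private readonly capacity: number,        // fixed bucket size
    private readonly refillPerSecond: number  // constant replenishment rate
  ) {
    this.tokens = capacity;                   // the bucket starts full
    this.lastRefillMs = Date.now();
  }

  // Add tokens for the time elapsed since the last refill, never exceeding the capacity.
  private refill(): void {
    const now = Date.now();
    const elapsedSeconds = (now - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefillMs = now;
  }

  // True if the bucket currently holds at least `cost` tokens.
  hasTokens(cost: number): boolean {
    this.refill();
    return this.tokens >= cost;
  }

  // Consume `cost` tokens if available and report whether the request is allowed.
  // A denied request consumes nothing.
  tryConsume(cost: number): boolean {
    if (!this.hasTokens(cost)) {
      return false;
    }
    this.tokens -= cost;
    return true;
  }
}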

Token buckets used for Centra’s Integration API rate limits

The Integration API uses six token buckets for rate limiting:

  • Burst request limit: Number of queries and mutations in 10 seconds.
  • Burst mutation limit: Number of mutations in 10 seconds.
  • Burst query complexity limit: The aggregated complexity of queries in 10 seconds.
  • Sustained request limit: Number of queries and mutations in 1 hour.
  • Sustained mutation limit: Number of mutations in 1 hour.
  • Sustained query complexity limit: The aggregated complexity of queries in 1 hour.

Tokens in each of these buckets are consumed and replenished independently. If at least one of the buckets contains insufficient tokens for a given request, the request is denied.

Token calculation in Centra’s Integration API

The token calculation logic is straightforward:

  • For the request limits (burst and sustained), 1 request costs 1 token.
  • For the mutation limits (burst and sustained), 1 mutation costs 1 token.
  • For the query complexity limits (burst and sustained), 1 complexity point costs 1 token. See how we calculate the query complexity here.
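Combining the six buckets with these costs, the sketch below shows how a single HTTP request could be checked against all of them, reusing the TokenBucket class from the earlier sketch. The Growth tier numbers from the table above are used as bucket sizes, the refill rate is assumed to be the quota divided by the replenish interval, and the object and function names are invented for this example.

// Growth tier bucket sizes; refill rate assumed to be quota / interval
// ("the time it takes to replenish an empty bucket").
const growthBuckets = {
  burstRequests:       new TokenBucket(60, 60 / 10),
  sustainedRequests:   new TokenBucket(5_000, 5_000 / 3_600),
  burstMutations:      new TokenBucket(20, 20 / 10),
  sustainedMutations:  new TokenBucket(1_800, 1_800 / 3_600),
  burstComplexity:     new TokenBucket(250_000, 250_000 / 10),
  sustainedComplexity: new TokenBucket(30_000_000, 30_000_000 / 3_600),
};

// One HTTP request costs 1 request token, 1 token per mutation,
// and 1 complexity token per complexity point, in the matching buckets.
function allowRequest(mutationCount: number, queryComplexity: number): boolean {
  const costs: Array<[TokenBucket, number]> = [
    [growthBuckets.burstRequests, 1],
    [growthBuckets.sustainedRequests, 1],
    [growthBuckets.burstMutations, mutationCount],
    [growthBuckets.sustainedMutations, mutationCount],
    [growthBuckets.burstComplexity, queryComplexity],
    [growthBuckets.sustainedComplexity, queryComplexity],
  ];
  // Deny (and consume nothing) if at least one bucket is short of tokens.
  if (!costs.every(([bucket, cost]) => bucket.hasTokens(cost))) {
    return false;
  }
  costs.forEach(([bucket, cost]) => bucket.tryConsume(cost));
  return true;
}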

What counts as one request? And what about batched mutations?

Technically speaking, one HTTP request can contain multiple GraphQL operations, and each operation can include multiple top-level fields, each of which roughly corresponds to a REST call. For example:

query threeInOne {
  viewer {
    name
    integrationName
  }

  counters {
    orders(where: { status: [PENDING] })
  }

  rateLimits {
    type
    intervalSeconds
    quota
    usedQuota
    remainingQuota
  }
}

mutation twoInOne {
  captureShipment(id: 345) {
    userErrors {
      message
      path
    }
    userWarnings {
      message
      path
    }
  }

  addOrderNote(
    input: { order: { externalId: "my-id-123" }, message: "Hello world" }
  ) {
    userErrors {
      message
      path
    }
    userWarnings {
      message
      path
    }
  }
}

If you send such a document in your request, the GraphQL server needs to know which of the two operations to run; hence the JSON body must also include the operationName parameter, as defined in the official GraphQL specification.

Info: Avoid sending extra (not executed) operations in your requests, as it's an inefficient use of bandwidth and server resources.

So, with a JSON body like this

{
  "query": "(as above)",
  "operationName": "twoInOne",
  "variables": {}
}

then, in terms of rate limits, this counts as:

  • One request for the REQUEST_COUNT buckets.
  • Two mutations for the MUTATION_COUNT buckets.
  • Two points for QUERY_COMPLEXITY.

Some GraphQL servers allow executing multiple independent operations in one batch by wrapping them in a JSON array. This is not supported by the Integration API.

Checking the Integration API rate limit status

The following query can be used to get information about currently enforced rate limits, and available tokens in each token bucket:

query {
  rateLimits {
    type
    intervalSeconds
    quota
    usedQuota
    remainingQuota
  }
}

The result returned is a list of six objects, which represent the six token buckets:

  • type – type of the rate limit managed by this token bucket, represented as an enum RateLimitType that can have 3 possible values:

    • REQUEST_COUNT

    • QUERY_COMPLEXITY

    • MUTATION_COUNT

  • intervalSeconds – the time it takes to replenish an empty bucket, one of:

    • TEN_SECONDS

    • ONE_HOUR

  • quota – size of the bucket, the maximum number of tokens that will ever fit in the bucket

  • usedQuota – how many tokens have been consumed by requests and not yet replenished

  • remainingQuota – how many tokens remain in the bucket and are available for consumption

The query consumes 1 request and 5 complexity points.
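As an illustration, an integration could run this query before starting a large job and check how many tokens remain. The sketch below uses placeholder values for the API URL, token, and authorization header; replace them with the ones used by your own Centra environment.

const INTEGRATION_API_URL = "https://<your-centra-instance>/graphql"; // placeholder URL
const API_TOKEN = "<your-api-token>";                                 // placeholder token

const RATE_LIMITS_QUERY = `
  query {
    rateLimits {
      type
      intervalSeconds
      quota
      usedQuota
      remainingQuota
    }
  }
`;

interface RateLimit {
  type: string;                     // REQUEST_COUNT, MUTATION_COUNT or QUERY_COMPLEXITY
  intervalSeconds: string | number; // identifies the 10-second or 1-hour bucket
  quota: number;
  usedQuota: number;
  remainingQuota: number;
}

async function fetchRateLimits(): Promise<RateLimit[]> {
  const response = await fetch(INTEGRATION_API_URL, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${API_TOKEN}`, // placeholder: use your API's auth header
    },
    body: JSON.stringify({ query: RATE_LIMITS_QUERY }),
  });
  const { data } = await response.json();
  return data.rateLimits as RateLimit[];
}

// Usage: log the bucket that is closest to running out of tokens.
fetchRateLimits().then((limits) => {
  const tightest = limits.reduce((a, b) => (a.remainingQuota < b.remainingQuota ? a : b));
  console.log(`${tightest.type} (${tightest.intervalSeconds}): ${tightest.remainingQuota} tokens left`);
});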

Testing Rate Limits

No matter how good your integration is, it can still occasionally encounter the HTTP "429 Too Many Requests" status code and must handle it correctly. To see what such a response looks like and to simplify testing these scenarios, you can include a special header: X-Trigger-Rate-Limit-Error: true. The response will contain a Retry-After header with a date formatted according to RFC 2822 (https://www.rfc-editor.org/rfc/rfc2822.html), 10 seconds into the future. For example: Fri, 21 Mar 2025 19:15:55 GMT.
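For example, a test could force a rate-limited response and read the Retry-After header as in the sketch below, which reuses the INTEGRATION_API_URL and API_TOKEN placeholders introduced above.

async function triggerRateLimitError(): Promise<void> {
  const response = await fetch(INTEGRATION_API_URL, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${API_TOKEN}`,  // placeholder: use your API's auth header
      "X-Trigger-Rate-Limit-Error": "true",  // forces a 429 response for testing
    },
    body: JSON.stringify({ query: "query { rateLimits { type } }" }),
  });

  console.log(response.status); // 429
  const retryAfter = response.headers.get("Retry-After"); // e.g. "Fri, 21 Mar 2025 19:15:55 GMT"
  // The RFC 2822 date is about 10 seconds in the future; Date.parse understands it.
  const waitMs = retryAfter ? Math.max(0, Date.parse(retryAfter) - Date.now()) : 0;
  console.log(`Retry after ~${Math.ceil(waitMs / 1000)} s (${retryAfter})`);
}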

Avoiding hitting the Rate Limits

Follow industry best practices to avoid rate limits: reduce unnecessary requests, cache data, monitor errors, and respect backoff signals.

  • Architecture
    • Use the Integration API for asynchronous backend integrations only
      Never use the Integration API to serve data to a frontend website. Serving data to a frontend (even through a proxy) means the rate limits will very likely be hit during periods of high website traffic, with user-visible errors as a result. Use the Storefront API for serving frontends instead; it offers several orders of magnitude higher throughput and lower latency.
    • Ensure the data model is efficient
      An inefficient custom data model in Centra may force you to mutate the same data multiple times (e.g. updating a "hand wash only" icon file attached separately to each of 10,000 products, instead of using a single file shared between the products). Do not use Dynamic Attributes for data that could be normalized by using Mapped Attributes. This is especially important for translatable attributes. See more about attributes in Centra.
    • Use caching for data that your app uses often
      If you need to access some data frequently, cache it. Some data changes very seldom (e.g. markets, stores, countries, pricelists, the product catalog).
    • Subscribe to events to update your cached data
      Subscribe to events for cache invalidation rather than repeatedly polling the API by brute force. See more about the events.
    • Only mutate data that has changed
      Keep track of mutations that have already been made. Don’t attempt to mutate data that is already up to date in Centra.
  • Craft requests
    • Optimize your code to fetch only the data it needs
      Select only the necessary fields, especially for nested objects and lists. Sometimes you can use a different query to simplify the structure. Avoid deep nesting: even though GraphQL is flexible enough to query "everything" in one go, it is sometimes more efficient to issue additional queries instead.
    • Use the most comprehensive mutation for your task
      For some common tasks, the Integration API offers mutations that conveniently carry out multiple activities in just one mutation. For example, if syncing a product with a variant and sizes, use the mutation for that, rather than multiple mutations in sequence.
    • For batch jobs, use batch operations where available
      The Integration API offers some batch operations, geared for larger import jobs. Use those if available for batch tasks, and monitor the status.
  • Handle limiting
    • Smooth out the rate of requests
      Regulate the rate of requests to ensure a smooth distribution. This especially applies if you send requests asynchronously, which makes sharp load spikes more likely (and sudden spikes are more likely to get rate-limited).
    • Provide a great user experience while users wait for large operations
      Syncing a large amount of data to or from the Integration API takes time, whether you run multiple mutations in sequence or a batch job. Give your users clear information about the progress and status of large jobs, such as adding a new collection of products to Centra or loading an empty data warehouse with historical data.
    • Handle errors appropriately
      Requests that result in user errors (and warnings) should be handled appropriately in order to prevent spamming the API and to ensure your integration can recover gracefully after having been rate-limited.
    • Respect the backoff time
      When your integration gets rate-limited, the response is returned with HTTP status code 429 and contains a Retry-After HTTP header with the timestamp of when you should resume your requests. Your integration should wait until that time has passed before making any further requests. A minimal sketch of a wrapper that paces requests and honors Retry-After is shown after this list.
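The sketch below shows one way to combine request pacing with Retry-After handling in TypeScript, reusing the INTEGRATION_API_URL and API_TOKEN placeholders from the earlier sketches; the pacing interval and retry cap are arbitrary values chosen for this example, not values recommended by Centra.

const MIN_SPACING_MS = 250; // pace requests instead of sending them in bursts

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function callIntegrationApi(body: {
  query: string;
  operationName?: string;
  variables?: Record<string, unknown>;
}): Promise<unknown> {
  for (let attempt = 1; attempt <= 5; attempt++) {
    const response = await fetch(INTEGRATION_API_URL, {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${API_TOKEN}`, // placeholder: use your API's auth header
      },
      body: JSON.stringify(body),
    });

    if (response.status === 429) {
      // Rate limited: wait until the time given in Retry-After before resuming.
      const retryAfter = response.headers.get("Retry-After");
      const waitMs = retryAfter ? Math.max(0, Date.parse(retryAfter) - Date.now()) : 10_000;
      await sleep(waitMs);
      continue;
    }

    await sleep(MIN_SPACING_MS); // simple smoothing between consecutive requests
    return response.json();
  }
  throw new Error("Still rate-limited after several attempts");
}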

Are you still hitting Rate Limits?

The rate limits have been carefully designed and should be sufficient for most use cases. Please get in touch with our support for guidance if you are struggling to stay within the rate limits.

Are you still in need of higher rate limits, despite following the guidelines? We are able to offer higher rate limits as an additional (paid) service.

Requesting an upgrade to rate limits

To meet increased operational needs, your team can request an upgrade to a higher rate limit tier through multiple convenient channels:

  • Reach out to your Client Success Manager or Centra Support

  • From the Centra AMS, go to Systems → API tokens, select Change Integration API Rate Limits → Request an upgrade (Note: you must have Full Access Administrator user rights)