# Rate Limits

Understand the rate limits and quotas for the CatLove AI API.

## Overview
Rate limits help ensure fair usage and protect the service from abuse. Limits are applied per API key and vary based on your subscription tier.
We measure limits in three ways:
- RPM - Requests per minute
- TPM - Tokens per minute
- RPD - Requests per day
## Rate Limits by Tier
| Tier | RPM | TPM | RPD |
|---|---|---|---|
| Free | 20 | 10,000 | 200 |
| Basic | 60 | 60,000 | 1,000 |
| Pro | 500 | 500,000 | 10,000 |
| Enterprise | Custom | Custom | Custom |
Need higher limits? Upgrade your plan or contact us for enterprise pricing.
## Rate Limit Headers
API responses include headers with rate limit information:

```
x-ratelimit-limit-requests: 60
x-ratelimit-limit-tokens: 60000
x-ratelimit-remaining-requests: 59
x-ratelimit-remaining-tokens: 59900
x-ratelimit-reset-requests: 1s
x-ratelimit-reset-tokens: 100ms
```

- `x-ratelimit-limit-*` - Your current limit
- `x-ratelimit-remaining-*` - Remaining quota
- `x-ratelimit-reset-*` - Time until reset
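You can read these headers off each response to decide when to slow down. A minimal sketch (the helper name and the warning thresholds are illustrative):

```python
def remaining_quota(headers: dict) -> tuple[int, int]:
    """Extract remaining request and token quota from response headers."""
    return (
        int(headers.get("x-ratelimit-remaining-requests", 0)),
        int(headers.get("x-ratelimit-remaining-tokens", 0)),
    )


# Example values taken from the headers shown above
headers = {
    "x-ratelimit-remaining-requests": "59",
    "x-ratelimit-remaining-tokens": "59900",
}

requests_left, tokens_left = remaining_quota(headers)
if requests_left < 5 or tokens_left < 1000:
    print("Approaching the rate limit; consider backing off")
```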
## Handling 429 Errors
When you exceed the rate limit, the API returns a `429` status code. Here's how to handle it:

```python
import time

from openai import OpenAI, RateLimitError

# Assumes the client is configured with your CatLove AI API key and base URL
client = OpenAI()


def make_request_with_backoff():
    while True:
        try:
            return client.chat.completions.create(...)
        except RateLimitError as e:
            # Honor the Retry-After header, defaulting to 60s if it is absent
            retry_after = int(e.response.headers.get("retry-after", 60))
            print(f"Rate limited. Waiting {retry_after}s...")
            time.sleep(retry_after)
```

## Best Practices
- Implement backoff - Use exponential backoff for retries
- Batch requests - Combine multiple operations when possible
- Cache responses - Cache results to reduce API calls
- Monitor usage - Track your usage in the dashboard
- Use streaming - Streaming responses don't count against RPM until complete
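The fixed-wait handler above can be generalized to exponential backoff with full jitter, as the first practice suggests. A sketch with illustrative parameter values:

```python
import random
import time


def with_backoff(fn, retriable=(Exception,), retries: int = 5,
                 base: float = 1.0, cap: float = 60.0):
    """Call fn(), retrying on retriable errors with exponential backoff and full jitter."""
    for attempt in range(retries):
        try:
            return fn()
        except retriable:
            if attempt == retries - 1:
                raise
            # Full jitter: sleep a random fraction of the capped exponential delay
            time.sleep(random.uniform(0, min(cap, base * 2 ** attempt)))
```

With the `openai` client you would pass `retriable=(RateLimitError,)` and wrap the API call in a lambda or partial.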
## Token vs Request Limits
Both request and token limits apply. A single request with many tokens can exhaust your TPM limit even if you have remaining RPM quota.
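To make this concrete, here is a small sketch using the Basic-tier numbers from the table above (the per-request token sizes are a hypothetical workload):

```python
# Basic-tier limits from the table above
RPM_LIMIT = 60
TPM_LIMIT = 60_000


def max_requests_per_minute(tokens_per_request: int) -> int:
    """Return how many requests of this size fit in one minute under both limits."""
    by_tokens = TPM_LIMIT // tokens_per_request
    return min(RPM_LIMIT, by_tokens)


# Small requests are bounded by RPM; large requests by TPM
print(max_requests_per_minute(500))    # 60  (RPM-bound)
print(max_requests_per_minute(5_000))  # 12  (TPM-bound)
```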