429 Rate Limit Errors on GPT-4.1

Lollop 0 Reputation points
2025-05-03T22:32:50.3333333+00:00

I am getting 429 Rate Limit errors on an Azure OpenAI gpt-4.1 resource; the details for this resource, as shown in Azure AI Foundry, are:

Rate Limit: 721,000 TPM

Requests: 721 RPM

However, requests appear to be capped at 30K TPM for some reason, as the error below shows:

status_code: 429, model_name: gpt-4.1, body: {'message': 'Request too large for gpt-4.1 in organization org-<snip> on tokens per min (TPM): Limit 30000, Requested 42638. The input or output tokens must be reduced in order to run successfully. Visit https://platform.openai.com/account/rate-limits to learn more.', 'type': 'tokens', 'param': None, 'code': 'rate_limit_exceeded'}
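
For reference, here is a minimal sketch of how the two numbers in that message compare. It assumes the "Limit N, Requested M" wording stays as shown above. `parse_tpm_error` is a hypothetical helper name I made up for illustration, not part of any SDK.

```python
import re

# Relevant part of the 429 response body quoted above.
MESSAGE = (
    "Request too large for gpt-4.1 in organization org-<snip> on tokens per min (TPM): "
    "Limit 30000, Requested 42638. The input or output tokens must be reduced "
    "in order to run successfully."
)


def parse_tpm_error(message):
    """Return (limit, requested) parsed from a rate-limit message, or None.

    Assumes the 'Limit N, Requested M' wording seen in the error above.
    """
    match = re.search(r"Limit (\d+), Requested (\d+)", message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


limit, requested = parse_tpm_error(MESSAGE)
over_by = requested - limit  # tokens by which this one request exceeds the limit
print(f"limit={limit} requested={requested} over_by={over_by}")
```

Going by the numbers in the message, this single request (42,638 tokens) is larger than the whole 30,000 TPM limit being enforced, so retrying the same request with backoff would presumably hit the same error.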

Azure OpenAI Service
