Claude Code Usage Limits: Free Tier Restrictions

Are you building with Claude Code on the Free Tier? Whether you are experimenting with the powerful Claude Opus 4.5 or building rapid-fire applications with Claude Haiku, understanding your usage limits is critical to maintaining a stable application.

In this guide, we break down the specific Claude Code usage limits, including Requests Per Minute (RPM) and Tokens Per Minute (TPM), so you can manage your API capacity effectively and avoid hitting those frustrating rate limit errors.

Contents
Why Do These Limits Exist?
Claude Code Free Tier: Rate Limits by Model
- High-Intelligence Models (Opus & Sonnet)
- High-Speed Models (Haiku)
Additional Tool & Storage Limits
How to Handle Rate Limits
Conclusion

Why Do These Limits Exist?

Before diving into the numbers, it is important to understand why these limits are in place. According to Anthropic, rate limits serve two main purposes:

Mitigation against misuse: Preventing malicious actors from flooding the system.
Managing API capacity: Ensuring fair usage and stability for all users sharing the infrastructure.

Claude Code Free Tier: Rate Limits by Model

The Free Tier limits vary significantly depending on which model family you are accessing. Below is the detailed breakdown of the Requests Per Minute (RPM) and Tokens Per Minute (TPM) for the current model lineup.

High-Intelligence Models (Opus & Sonnet)

For tasks requiring complex reasoning, coding, and nuance, you will likely use the Opus or Sonnet series. Note that these powerful models have stricter throughput limits on the Free Tier.

Model	Requests / Min (RPM)	Input Tokens / Min (TPM)	Output Tokens / Min (TPM)
Claude Opus 4.5	5	10K	4K
Claude Sonnet 4 & 4.5	5	10K	4K
Claude Opus 4 & 4.1	5	10K	4K

Note on Context: All models listed above support a context window of up to 200K tokens. The Input TPM limit of 10K explicitly excludes cache reads, so tokens served from the prompt cache do not eat into your input allowance. Caching the context you reuse is therefore a direct way to raise your effective throughput.
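To make the cache-read exclusion concrete, here is a minimal sketch of the arithmetic. The field names mirror the token counts that the API is understood to report in a response's usage data, but treat them as assumptions, along with the assumption that cache writes still count toward the input limit. The request sizes are hypothetical.

```python
def counted_input_tokens(usage: dict) -> int:
    """Tokens assumed to count toward the Input TPM limit for one response.

    Assumption: uncached input and cache writes count; cache reads do not.
    """
    return usage.get("input_tokens", 0) + usage.get("cache_creation_input_tokens", 0)


# Hypothetical usage figures for two requests that share a 6,000-token document.
first = {"input_tokens": 500, "cache_creation_input_tokens": 6000, "cache_read_input_tokens": 0}
second = {"input_tokens": 500, "cache_creation_input_tokens": 0, "cache_read_input_tokens": 6000}

print(counted_input_tokens(first))   # 6500
print(counted_input_tokens(second))  # 500
```

Under these assumptions the first request pays for the document once, and each later request that reads it from the cache uses only 500 tokens of the 10K allowance.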

High-Speed Models (Haiku)

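If your application requires speed and efficiency, the Haiku series is the natural choice. On the Free Tier, Haiku 3 and Haiku 3.5 also offer more generous token limits, allowing for higher-volume data processing, while Haiku 4.x has the same limits as Opus and Sonnet.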

Model	Requests / Min (RPM)	Input Tokens / Min (TPM)	Output Tokens / Min (TPM)
Claude Haiku 4.x	5	10K	4K
Claude Haiku 3.5	5	25K	5K
Claude Haiku 3	5	25K	5K

While the RPM remains steady at 5 across the board, Haiku 3 and Haiku 3.5 allow 2.5 times as many input tokens per minute (25K versus 10K) as their high-intelligence counterparts, along with a slightly higher output limit (5K versus 4K).
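A quick way to see which limit binds first is to work out how many same-sized requests fit into one minute. The sketch below uses the numbers from the tables above. The `Limits` class and `max_requests_per_minute` helper are hypothetical names for illustration, and the calculation treats each limit as a simple per-minute budget with no cache reads, which may differ from how the limits are actually enforced.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    rpm: int
    input_tpm: int
    output_tpm: int


# Values copied from the tables above.
OPUS_SONNET = Limits(rpm=5, input_tpm=10_000, output_tpm=4_000)
HAIKU_3_5 = Limits(rpm=5, input_tpm=25_000, output_tpm=5_000)


def max_requests_per_minute(limits: Limits, input_tokens: int, output_tokens: int) -> int:
    """Largest number of same-sized requests that fits under all three limits."""
    return min(
        limits.rpm,
        limits.input_tpm // input_tokens,
        limits.output_tpm // output_tokens,
    )


# A request with 3,000 input tokens and 500 output tokens:
print(max_requests_per_minute(OPUS_SONNET, 3_000, 500))  # 3
print(max_requests_per_minute(HAIKU_3_5, 3_000, 500))    # 5
```

For this request size, the 10K input limit caps Opus and Sonnet at three requests per minute, while Haiku 3.5 is limited only by the 5 RPM ceiling.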

Additional Tool & Storage Limits

Beyond the raw model usage, Claude Code imposes limits on specific tools and storage capabilities to ensure performance.

1. Batch Requests

If you are processing large volumes of data asynchronously, you need to be aware of the batching limit.

Limit: 5 Batch requests per minute.
Scope: This applies across all models.

2. Web Search Tool

For agents and applications that browse the web for real-time information:

Limit: 30 uses per second.
Scope: This applies across all models.
Observation: This is a relatively high limit compared to model generation, allowing for aggressive information retrieval.

3. Files API Storage

If you are uploading documents for RAG (Retrieval-Augmented Generation) or analysis:

Total Storage: 100 GB.
Scope: This limit is calculated across your entire organization.

How to Handle Rate Limits

If you exceed these Free Tier limits, you will likely encounter a 429 Too Many Requests error. Here is how to handle it:

Implement Exponential Backoff: Do not retry immediately. Wait a few seconds, then retry, increasing the wait time with every subsequent failure.
Optimize Prompts: Since Input TPM is a major constraint (10K for most models), keep your prompts concise.
Utilize Prompt Caching: As noted in the limits, cache reads are excluded from the Input TPM count. Caching repetitive context (like system instructions or large docs) is the best way to get more out of the Free Tier.
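The exponential backoff advice above can be sketched as follows. This is a minimal illustration, not a definitive implementation: `RateLimitError` is a hypothetical stand-in for whatever exception your HTTP client or SDK raises on a 429, and `call_with_backoff`, the delay values, and the jitter factor are assumptions chosen for the example.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the error your client raises on HTTP 429."""


def call_with_backoff(request, max_retries=5, base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call request(), retrying on RateLimitError with exponentially growing waits."""
    for attempt in range(max_retries + 1):
        try:
            return request()
        except RateLimitError:
            if attempt == max_retries:
                raise  # Out of retries: surface the error to the caller.
            # 1s, 2s, 4s, ... capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Add up to 10% random jitter so parallel clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
```

Pass any zero-argument callable that performs the request, for example `call_with_backoff(lambda: send_prompt("hello"))`, where `send_prompt` is your own function. The `sleep` parameter exists only so the waiting can be replaced in tests.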

Conclusion

Navigating Claude Code usage limits is all about choosing the right model for the right task. Use Haiku 3.5 when you need to process more tokens, and save Opus 4.5 for low-volume, high-complexity queries.

If you find yourself constantly hitting the 5 RPM or 10K input TPM ceiling, it may be time to consider upgrading from the Free Tier to a paid plan for higher throughput.
