Unlimited Claude & Codex — No Token Limits

Introduction

The notokenlimit.com API lets you integrate frontier AI models into your own application using an endpoint that is fully compatible with the OpenAI Chat Completions format. Access is private and must be approved by an administrator.

API-Key authentication (Authorization: Bearer ntl_live_...).
Per-user quotas (per-minute, daily, monthly).
Admin-controlled model allow-list.
Full audit trail: every request is logged (metadata only, no content).
Automatic and manual IP blocking.

Quick start

1. Create an account on notokenlimit.com.
2. Request API access from an administrator.
3. Open the Developer panel and generate a key.
4. Make your first request (see Examples).

bash

curl -X POST https://notokenlimit.com/api/v1/chat/completions \
  -H "Authorization: Bearer ntl_live_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4.6",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

Authentication

Every request requires the Authorization header with your API key:

http

Authorization: Bearer ntl_live_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Keys always start with ntl_live_.
The plaintext key is shown only ONCE, right after creation.
We only store a SHA-256 hash of the key in our database.
If you lose a key, you must revoke it and generate a new one.
Never embed the key on the client side (frontend, mobile). Always use a backend.
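
The rules above can be enforced in a small server-side helper. This is a minimal sketch, not part of any official client: the function names are hypothetical, and the only facts taken from this page are the `ntl_live_` prefix, the `Authorization: Bearer` header, and the `NTL_KEY` environment variable used in the Examples section.

```python
import os

KEY_PREFIX = "ntl_live_"  # documented prefix shared by all keys


def build_auth_headers(api_key: str) -> dict:
    """Return request headers for a key, rejecting obviously malformed values."""
    key = api_key.strip()
    # A key that lacks the prefix, or is nothing but the prefix, cannot be valid.
    if not key.startswith(KEY_PREFIX) or len(key) == len(KEY_PREFIX):
        raise ValueError(f"API key must start with {KEY_PREFIX!r}")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }


def headers_from_env(var_name: str = "NTL_KEY") -> dict:
    """Read the key from the backend's environment instead of source code."""
    return build_auth_headers(os.environ[var_name])
```

Failing early on a malformed key avoids spending a request on a guaranteed 401 invalid_key response.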

Endpoints

Method	Path	Description
POST	/api/v1/chat/completions	Generate a chat completion. OpenAI-compatible format.
POST	/api/v1/messages	Anthropic Messages API, for Claude Code and the Anthropic SDK.
GET	/api/v1/models	List the models available to your account.

Base URL: https://notokenlimit.com

Limits & quotas

Each account has individually configurable quotas. When a quota is exceeded, the server responds with HTTP 429 and an error code that identifies which window was hit.

Window	Error code	Meaning
60 seconds	rate_limit_minute	Too many requests in a minute
24 hours	quota_day	Daily quota exhausted
30 days	quota_month	Monthly quota exhausted

Quota response headers are not returned yet. Check the Developer panel for live usage.
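
Because no quota headers are returned, a client can only avoid rate_limit_minute errors by counting its own requests. The sketch below is one way to do that with a sliding 60-second window. It is an illustration under assumptions: the class name is hypothetical, the per-minute limit must be copied from your Developer panel, and the server may count its window differently (for example, in fixed minutes).

```python
import time
from collections import deque
from typing import Optional


class SlidingWindowLimiter:
    """Client-side guard allowing at most max_requests per window seconds."""

    def __init__(self, max_requests: int, window: float = 60.0):
        self.max_requests = max_requests
        self.window = window
        self._sent = deque()  # timestamps of requests still inside the window

    def try_acquire(self, now: Optional[float] = None) -> bool:
        """Return True and record the request if one may be sent now."""
        now = time.monotonic() if now is None else now
        # Forget requests that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window:
            self._sent.popleft()
        if len(self._sent) >= self.max_requests:
            return False
        self._sent.append(now)
        return True
```

Call try_acquire() before each request and wait briefly when it returns False. Daily and monthly quotas are better tracked in the Developer panel, since a local counter is lost when the process restarts.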

Models

The models available to your account are approved by an administrator. You can list them on your panel or by calling GET /api/v1/models.

If you try to use a model that is not allowed, you will receive 403 model_not_allowed.
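
A client can fetch the allow-list once and validate model IDs locally before sending a request. The sketch below uses only the standard library. The response shape is an assumption: this page does not show the body of GET /api/v1/models, so the code assumes the OpenAI list format, `{"data": [{"id": ...}, ...]}`. The function names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://notokenlimit.com"


def extract_model_ids(payload: dict) -> list:
    """Pull model IDs out of a models response (assumed OpenAI list shape)."""
    return [item["id"] for item in payload.get("data", []) if "id" in item]


def fetch_model_ids(api_key: str, timeout: float = 30.0) -> list:
    """Call GET /api/v1/models and return the IDs this account may use."""
    req = urllib.request.Request(
        f"{BASE_URL}/api/v1/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return extract_model_ids(json.load(resp))
```

Checking `model in fetch_model_ids(key)` before a call lets the application show a clear message instead of surfacing a 403 model_not_allowed from the API.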

Live model catalog

The web version of this page lists the model IDs the API currently accepts. To get the same list programmatically, call GET /api/v1/models.

Errors

Error bodies follow the OpenAI format:

json

{
  "error": {
    "type": "rate_limit_error",
    "code": "rate_limit_minute",
    "message": "Rate limit: 60/min"
  }
}

HTTP	Code	Cause
400	invalid_messages	Malformed messages
400	missing_model	Missing model field
400	unknown_model	Unknown model
401	missing_auth	Missing Authorization header
401	invalid_key	Invalid or unknown key
401	key_revoked	Key revoked
401	key_expired	Key expired
403	access_disabled	API access not enabled
403	suspended	Access suspended
403	ip_blocked	Your IP is blocked
403	model_not_allowed	Model not allowed for your account
429	rate_limit_minute	Too many requests per minute
429	quota_day	Daily quota reached
429	quota_month	Monthly quota reached
502	upstream_error	Provider failure (retry)
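
Client code is easier to write when the error body is turned into a structured exception. The following is a minimal sketch based on the JSON shape and the table above. The class and function names are hypothetical, and treating only rate_limit_minute and upstream_error as retryable is a reading of the table, not a documented guarantee.

```python
import json


class ApiError(Exception):
    """Structured view of an error response from the API."""

    def __init__(self, status: int, code: str, message: str):
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code
        self.message = message

    @property
    def retryable(self) -> bool:
        # Per-minute limits clear quickly and upstream failures are marked
        # "retry" in the table; every other code needs a change first.
        return self.code in {"rate_limit_minute", "upstream_error"}


def parse_error(status: int, body: str) -> ApiError:
    """Build an ApiError from a response body, tolerating non-JSON bodies."""
    try:
        err = json.loads(body).get("error", {})
    except (ValueError, AttributeError):
        err = {}
    if not isinstance(err, dict):
        err = {}
    return ApiError(status, err.get("code", "unknown"),
                    err.get("message", body[:200]))
```

With requests, this could be used as `raise parse_error(resp.status_code, resp.text)` whenever `resp.ok` is false.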

Security

✓ Keys are stored only as one-way SHA-256 hashes. Only the public prefix is kept for display.
✓ All traffic is encrypted in transit (HTTPS/TLS).
✓ Logs store metadata (model, status, latency, tokens, IP), never the message content.
✓ Malicious IPs can be blocked in real time from the admin panel.
✓ Each key can have an optional expiry date.
✓ Administrators can suspend a user's access at any time.
✓ Treat keys like passwords: they are never returned after creation.
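
To illustrate why a lost key cannot be recovered, here is a sketch of hash-only key storage. It is not the service's actual implementation: the key length, the display-prefix length, and all function names are assumptions made for the example. Only the `ntl_live_` prefix and the use of SHA-256 come from this page.

```python
import hashlib
import hmac
import secrets

KEY_PREFIX = "ntl_live_"
DISPLAY_CHARS = 4  # hypothetical: random characters kept for display


def generate_key() -> str:
    """Create a new plaintext key; it would be shown to the user once."""
    return KEY_PREFIX + secrets.token_hex(20)  # length is an assumption


def to_stored_record(key: str) -> dict:
    """Keep only a one-way hash plus a short prefix for the panel."""
    return {
        "hash": hashlib.sha256(key.encode("utf-8")).hexdigest(),
        "display_prefix": key[: len(KEY_PREFIX) + DISPLAY_CHARS],
    }


def verify(presented: str, record: dict) -> bool:
    """Hash the presented key and compare in constant time."""
    digest = hashlib.sha256(presented.encode("utf-8")).hexdigest()
    return hmac.compare_digest(digest, record["hash"])
```

The stored record is enough to check a presented key but not to reconstruct it, which is why the only remedy for a lost key is to revoke it and generate a new one.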

Examples

cURL

bash

curl -X POST https://notokenlimit.com/api/v1/chat/completions \
  -H "Authorization: Bearer $NTL_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4.6",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Explain entropy in one sentence."}
    ],
    "max_tokens": 256
  }'

JavaScript (fetch)

javascript

const res = await fetch("https://notokenlimit.com/api/v1/chat/completions", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${process.env.NTL_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "claude-sonnet-4.6",
    messages: [{ role: "user", content: "Hello world" }],
  }),
});

if (!res.ok) {
  const err = await res.json();
  throw new Error(err.error?.message ?? "API error");
}
const data = await res.json();
console.log(data.choices[0].message.content);

Python (requests)

python

import os, requests

resp = requests.post(
    "https://notokenlimit.com/api/v1/chat/completions",
    headers={
        "Authorization": f"Bearer {os.environ['NTL_KEY']}",
        "Content-Type": "application/json",
    },
    json={
        "model": "claude-sonnet-4.6",
        "messages": [{"role": "user", "content": "Hello"}],
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])

Drop-in with the OpenAI SDK

You can use the official OpenAI SDK pointed at our endpoint:

python

from openai import OpenAI
import os

client = OpenAI(
    base_url="https://notokenlimit.com/api/v1",
    api_key=os.environ["NTL_KEY"],
)

resp = client.chat.completions.create(
    model="claude-sonnet-4.6",
    messages=[{"role": "user", "content": "Hello"}],
)
print(resp.choices[0].message.content)

FAQ

How do I request access?

Create an account on notokenlimit.com, then ask an administrator to enable API access for it.

Do you support SSE streaming?

Yes. Pass "stream": true to /api/v1/chat/completions (OpenAI-style chunks) or /api/v1/messages (Anthropic-style events).
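
For the OpenAI-style endpoint, a stream can be consumed line by line. The parser below is a sketch that assumes the stream follows the usual OpenAI conventions, which this page implies but does not spell out: each event is a `data: {json}` line, text arrives in `choices[].delta.content`, and the stream ends with `data: [DONE]`. The function name is hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # assumed end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text
```

With requests, the input could come from `requests.post(..., stream=True)` followed by `resp.iter_lines(decode_unicode=True)`. The Anthropic-style endpoint uses a different event format and would need its own parser.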

Can I use it with coding agents, Claude tools or the Anthropic/OpenAI SDKs?

Yes. We expose an Anthropic-compatible endpoint at /api/v1/messages and an OpenAI-compatible one at /api/v1/chat/completions, both with streaming AND tool/function calling. Point Claude Code or the Anthropic SDK at ANTHROPIC_BASE_URL=https://notokenlimit.com/api with your ntl_live_ key as x-api-key, or any OpenAI-compatible tool at base_url=https://notokenlimit.com/api/v1. See the Clients & IDEs section for per-tool setup.

What if I lose a key?

Revoke it from the panel and create a new one. Lost keys cannot be recovered because we only store the hash.

Do you store my messages?

No. We only store metadata: timestamp, model, status, latency, estimated tokens, and IP. Never the text.

Can I raise my quotas?

Yes, contact an administrator. Limits are configurable per user.

Clients & IDEs

Your ntl_live_ key works as a drop-in replacement in any OpenAI- or Anthropic-compatible tool. There are two universal modes, Anthropic-style and OpenAI-style, and both support streaming and tool/function calling.

Claude Code · Anthropic SDK · Claude Agent SDK

Point Claude Code or the Anthropic SDK at our Anthropic endpoint. Tool/function calling works.

bash

export ANTHROPIC_BASE_URL=https://notokenlimit.com/api
export ANTHROPIC_API_KEY=ntl_live_YOUR_KEY
export ANTHROPIC_MODEL=claude-opus-4.8
claude
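
The base URL stops at /api because Anthropic clients append /v1/messages themselves. For callers that do not use an SDK, the sketch below builds the equivalent raw request with the standard library. Several details are assumptions: the `anthropic-version` header and the required `max_tokens` field come from the Anthropic Messages format rather than from this page, and the function name is hypothetical.

```python
import json
import urllib.request

ANTHROPIC_BASE_URL = "https://notokenlimit.com/api"


def build_messages_request(api_key: str, model: str, prompt: str,
                           max_tokens: int = 256) -> urllib.request.Request:
    """Build (but do not send) a POST to the Anthropic-style endpoint."""
    body = {
        "model": model,
        "max_tokens": max_tokens,  # required by the Anthropic Messages format
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{ANTHROPIC_BASE_URL}/v1/messages",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "x-api-key": api_key,
            "anthropic-version": "2023-06-01",  # assumption: sent by Anthropic SDKs
            "content-type": "application/json",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen(req, timeout=60)` sends it. In the Anthropic format the reply text is normally found under `content[0].text`, which should be confirmed against a real response.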

OpenAI SDK (Python / JS)

Point the OpenAI SDK (or any OpenAI-compatible library) at our base URL — it appends /chat/completions automatically.

python

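from openai import OpenAI

# base_url stops at /api/v1; the SDK appends /chat/completions itself.
client = OpenAI(
    base_url="https://notokenlimit.com/api/v1",
    api_key="ntl_live_YOUR_KEY",  # placeholder; load the real key from the environment
)
resp = client.chat.completions.create(
    model="claude-opus-4.8",
    messages=[{"role": "user", "content": "Hello"}],
    # Tool / function calling is supported: pass a non-empty tools=[...] list.
    # An empty list is left out here because OpenAI-style APIs may reject it.
)
print(resp.choices[0].message.content)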
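
Tool/function calling needs a non-empty tool list and a step that reads the tool calls back out of the response. The sketch below shows both in the OpenAI format. The `get_weather` tool and the helper names are hypothetical, and the response shape (`choices[0].message.tool_calls` with JSON-encoded arguments) is assumed from the OpenAI-compatibility claim on this page.

```python
import json


def weather_tool() -> dict:
    """A hypothetical tool definition in the OpenAI function-calling format."""
    return {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }


def build_tool_request(model: str, question: str) -> dict:
    """Request body for /api/v1/chat/completions with one tool attached."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": question}],
        "tools": [weather_tool()],
        "tool_choice": "auto",
    }


def extract_tool_calls(response: dict) -> list:
    """Return (name, arguments) pairs requested by the model, if any."""
    message = response["choices"][0]["message"]
    calls = []
    for call in message.get("tool_calls") or []:
        fn = call["function"]
        calls.append((fn["name"], json.loads(fn["arguments"])))
    return calls
```

The request body can be sent with any of the HTTP examples on this page, or unpacked into the SDK call as `client.chat.completions.create(**build_tool_request(...))`.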

Antigravity · Cline · Cursor · Roo Code · Continue · Kilo Code

In the model/provider settings, add a custom "OpenAI Compatible" provider with these values:

http

Provider type : OpenAI Compatible
Base URL      : https://notokenlimit.com/api/v1
API Key       : ntl_live_YOUR_KEY
Model ID      : claude-opus-4.8

GitHub Copilot & others

Any tool that lets you set a custom OpenAI base URL works the same way. For tools locked to official providers (e.g. GitHub Copilot's stock setup), run a local OpenAI-compatible proxy pointing here.

Use any model id from GET /api/v1/models (e.g. claude-opus-4.8, gpt-5.5, gemini-3.1-pro). Whatever you set as the model in your tool is sent as the model field.

Every request is audited (model, status, latency, IP — never your content). Unusual volume or a brand-new IP automatically alerts our staff and can trigger an IP block, so keep your key server-side.
