Introduction
The notokenlimit.com API lets you integrate frontier AI models into your own application using an endpoint that is fully compatible with the OpenAI Chat Completions format. Access is private and must be approved by an administrator.
- API-Key authentication (Authorization: Bearer ntl_live_...).
- Per-user quotas (per-minute, daily, monthly).
- Admin-controlled model allow-list.
- Full audit trail: every request is logged, without its content.
- Automatic and manual IP blocking.
Quick start
1. Create an account on notokenlimit.com.
2. Request API access from an administrator.
3. Open the Developer panel and generate a key.
4. Make your first request (see Examples).
curl -X POST https://notokenlimit.com/api/v1/chat \
-H "Authorization: Bearer ntl_live_YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "claude-sonnet-4.6",
"messages": [{"role": "user", "content": "Hello"}]
}'
Authentication
Every request requires the Authorization header with your API key:
Authorization: Bearer ntl_live_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
- Keys always start with ntl_live_.
- The plaintext key is shown only ONCE, right after creation.
- We only store a SHA-256 hash of the key in our database.
- If you lose a key, you must revoke it and generate a new one.
- Never embed the key on the client side (frontend, mobile). Always use a backend.
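The hash-only storage described in the list above can be illustrated with a short sketch. This is not notokenlimit.com's actual implementation: the helper names `hash_key` and `verify_key` are hypothetical, and the only facts taken from the documentation are the `ntl_live_` prefix and the use of SHA-256.

```python
import hashlib
import hmac

KEY_PREFIX = "ntl_live_"  # prefix documented above


def hash_key(plaintext_key: str) -> str:
    """Return the hex SHA-256 digest a server could store instead of the key."""
    return hashlib.sha256(plaintext_key.encode("utf-8")).hexdigest()


def verify_key(presented_key: str, stored_hash: str) -> bool:
    """Check a presented key against a stored digest (hypothetical helper)."""
    if not presented_key.startswith(KEY_PREFIX):
        return False  # cheap rejection before hashing
    # compare_digest runs in constant time, so it does not leak how much matched
    return hmac.compare_digest(hash_key(presented_key), stored_hash)
```

Because only the digest is kept, a lost plaintext key cannot be recovered from the database, which is why revoking and regenerating is the only option.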
Endpoints
- /api/v1/chat/completions
- /api/v1/messages
- /api/v1/models
All paths are relative to https://notokenlimit.com.
Limits & quotas
Each account has individually configurable quotas. When exceeded, the server responds with HTTP 429 and a specific error code.
| Window | Error code | Meaning |
|---|---|---|
| 60 seconds | rate_limit_minute | Too many requests in a minute |
| 24 hours | quota_day | Daily quota exhausted |
| 30 days | quota_month | Monthly quota exhausted |
Quota response headers are not returned yet; check the Developer panel for live usage.
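Since no quota headers are returned, a client has to decide how to react from the error code alone. The following is a minimal sketch of one possible policy, assuming the error body shape shown in the Errors section. The names `retry_delay_for` and `call_with_retry` are hypothetical, and treating the daily and monthly codes as non-retryable is this sketch's own choice, not documented behavior.

```python
import time
from typing import Optional

# Seconds to wait per 429 error code; None means "do not retry automatically".
# The 60-second value mirrors the per-minute window in the table above.
RETRY_DELAY = {
    "rate_limit_minute": 60.0,
    "quota_day": None,
    "quota_month": None,
}


def retry_delay_for(status: int, body: dict) -> Optional[float]:
    """Return seconds to wait before retrying, or None if retrying will not help."""
    if status != 429:
        return None
    code = body.get("error", {}).get("code")
    return RETRY_DELAY.get(code)


def call_with_retry(send, max_attempts=3, sleep=time.sleep):
    """Call send() -> (status, body), retrying only per-minute rate limits."""
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        delay = retry_delay_for(status, body)
        if delay is None or attempt == max_attempts:
            return status, body
        sleep(delay)
```

Here `send` is any zero-argument callable that performs the request and returns the HTTP status with the parsed JSON body, which keeps the retry policy independent of the HTTP library in use.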
Models
The models available to your account are approved by an administrator. You can list them in the Developer panel or by calling GET /api/v1/models.
If you try to use a model that is not allowed, you will receive 403 model_not_allowed.
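A minimal sketch of the GET /api/v1/models call, using only the standard library. The response shape is an assumption: the code expects an OpenAI-style list (`{"data": [{"id": ...}]}`), which the documentation does not spell out. The function names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://notokenlimit.com/api/v1"


def extract_model_ids(payload: dict) -> list:
    """Pull model IDs out of an OpenAI-style list response (assumed shape)."""
    return [item["id"] for item in payload.get("data", [])]


def list_models(api_key: str) -> list:
    """GET /api/v1/models and return the model IDs allowed for this key."""
    req = urllib.request.Request(
        f"{BASE_URL}/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return extract_model_ids(json.load(resp))
```

Checking a model ID against the result of `list_models(os.environ["NTL_KEY"])` before sending a chat request is one way to avoid a 403 model_not_allowed response.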
Errors
Error bodies follow the OpenAI format:
{
"error": {
"type": "rate_limit_error",
"code": "rate_limit_minute",
"message": "Rate limit: 60/min"
}
}
| HTTP | Code | Cause |
|---|---|---|
| 400 | invalid_messages | Malformed messages |
| 400 | missing_model | Missing model field |
| 400 | unknown_model | Unknown model |
| 401 | missing_auth | Missing Authorization header |
| 401 | invalid_key | Invalid or unknown key |
| 401 | key_revoked | Key revoked |
| 401 | key_expired | Key expired |
| 403 | access_disabled | API access not enabled |
| 403 | suspended | Access suspended |
| 403 | ip_blocked | Your IP is blocked |
| 403 | model_not_allowed | Model not allowed for your account |
| 429 | rate_limit_minute | Too many requests per minute |
| 429 | quota_day | Daily quota reached |
| 429 | quota_month | Monthly quota reached |
| 502 | upstream_error | Provider failure (retry) |
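One way to consume the error body and the table above is to wrap them in a single exception type. This is a sketch only: `ApiError`, `error_from_response`, and the suggested actions per status are hypothetical and reflect this example's own policy, not anything the API prescribes.

```python
# Suggested client reaction per HTTP status in the table (this sketch's policy).
ACTION_BY_STATUS = {
    400: "fix the request",
    401: "check or replace the API key",
    403: "contact an administrator",
    429: "wait or reduce usage",
    502: "retry",
}


class ApiError(Exception):
    """Hypothetical exception carrying the fields of the documented error body."""

    def __init__(self, status, err_type, code, message):
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.type = err_type
        self.code = code
        self.message = message

    @property
    def action(self):
        """Suggested reaction, with a fallback for statuses not in the table."""
        return ACTION_BY_STATUS.get(self.status, "inspect the response")


def error_from_response(status: int, body: dict) -> ApiError:
    """Build an ApiError from an HTTP status and a parsed JSON error body."""
    err = body.get("error", {})
    return ApiError(status, err.get("type"), err.get("code"), err.get("message"))
```

Callers can then branch on `err.code` for specific cases (for example `key_revoked`) and fall back to `err.action` for everything else.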
Security
Examples
cURL
curl -X POST https://notokenlimit.com/api/v1/chat \
-H "Authorization: Bearer $NTL_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "claude-sonnet-4.6",
"messages": [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Explain entropy in one sentence."}
],
"max_tokens": 256
}'
JavaScript (fetch)
const res = await fetch("https://notokenlimit.com/api/v1/chat", {
method: "POST",
headers: {
"Authorization": `Bearer ${process.env.NTL_KEY}`,
"Content-Type": "application/json",
},
body: JSON.stringify({
model: "claude-sonnet-4.6",
messages: [{ role: "user", content: "Hello world" }],
}),
});
if (!res.ok) {
const err = await res.json();
throw new Error(err.error?.message ?? "API error");
}
const data = await res.json();
console.log(data.choices[0].message.content);
Python (requests)
import os, requests
resp = requests.post(
"https://notokenlimit.com/api/v1/chat",
headers={
"Authorization": f"Bearer {os.environ['NTL_KEY']}",
"Content-Type": "application/json",
},
json={
"model": "claude-sonnet-4.6",
"messages": [{"role": "user", "content": "Hello"}],
},
timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
Drop-in with the OpenAI SDK
You can use the official OpenAI SDK pointed at our endpoint:
from openai import OpenAI
import os
client = OpenAI(
base_url="https://notokenlimit.com/api/v1",
api_key=os.environ["NTL_KEY"],
)
resp = client.chat.completions.create(
model="claude-sonnet-4.6",
messages=[{"role": "user", "content": "Hello"}],
)
print(resp.choices[0].message.content)
FAQ
How do I request access?
Do you support SSE streaming?
Can I use it with coding agents, Claude tools or the Anthropic/OpenAI SDKs?
What if I lose a key?
Do you store my messages?
Can I raise my quotas?
Clients & IDEs
Your ntl_live_ key works as a drop-in credential in any OpenAI- or Anthropic-compatible tool. There are two universal modes, both with streaming and tool/function calling:
Claude Code · Anthropic SDK · Claude Agent SDK
Point Claude Code or the Anthropic SDK at our Anthropic endpoint. Tool/function calling works.
export ANTHROPIC_BASE_URL=https://notokenlimit.com/api
export ANTHROPIC_API_KEY=ntl_live_YOUR_KEY
export ANTHROPIC_MODEL=claude-opus-4.8
claude
OpenAI SDK (Python / JS)
Point the OpenAI SDK (or any OpenAI-compatible library) at our base URL — it appends /chat/completions automatically.
from openai import OpenAI
client = OpenAI(
base_url="https://notokenlimit.com/api/v1",
api_key="ntl_live_YOUR_KEY",
)
resp = client.chat.completions.create(
model="claude-opus-4.8",
messages=[{"role": "user", "content": "Hello"}],
tools=[], # tool / function calling supported
)
Antigravity · Cline · Cursor · Roo Code · Continue · Kilo Code
In the model/provider settings, add a custom "OpenAI Compatible" provider with these values:
Provider type : OpenAI Compatible
Base URL : https://notokenlimit.com/api/v1
API Key : ntl_live_YOUR_KEY
Model ID : claude-opus-4.8
GitHub Copilot & others
Any tool that lets you set a custom OpenAI base URL works the same way. For tools locked to official providers (e.g. GitHub Copilot's stock setup), run a local OpenAI-compatible proxy pointing here.
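For clients written by hand rather than through an SDK, the streaming mode mentioned at the top of this section might be consumed along these lines. The wire format here is an assumption based on OpenAI compatibility: streaming is requested with `"stream": true`, each event arrives as a `data: {json}` line carrying `choices[].delta.content`, and the stream ends with `data: [DONE]`. The function name is hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_content_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE lines (assumed wire format)."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # OpenAI-style end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text
```

The function takes any iterable of decoded lines, so it can be fed from an HTTP library's line iterator or from a recorded transcript when testing.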