OpenAI-compatible inference API
All API requests require an API key. Keys use the `ic_` prefix. Set yours as an environment variable:

```
IDLECLOUD_API_KEY=ic_...
```
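As a minimal sketch, the key can be read from that environment variable rather than hard-coded (the variable name follows the docs above; the fallback placeholder is illustrative only):

```python
import os

# Read the API key from the environment rather than hard-coding it.
# Falls back to a placeholder so the snippet runs without configuration.
api_key = os.environ.get("IDLECLOUD_API_KEY", "ic_your_api_key_here")

# Keys are expected to carry the ic_ prefix; fail fast on obvious mistakes.
assert api_key.startswith("ic_"), "IdleCloud API keys start with 'ic_'"
```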
```python
from idlecloud import IdleCloud

api_key = 'ic_your_api_key_here'

# Create a client instance
client = IdleCloud(api_key=api_key)

# Make a request
response = client.chat.completions.create(
    model="gpt-oss-20b",  # Model is required
    messages=[
        {"role": "user", "content": "Hello!"}
    ],
    stream=True
)

# Print the response as it streams in
for chunk in response:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```
```
pip install idlecloud
```
```python
import asyncio
from idlecloud import AsyncIdleCloud

async def main():
    api_key = 'ic_your_api_key_here'
    client = AsyncIdleCloud(api_key=api_key)
    response = await client.chat.completions.create(
        model="gpt-oss-20b",
        messages=[{"role": "user", "content": "Hello!"}],
        stream=True
    )
    async for chunk in response:
        if chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="", flush=True)

asyncio.run(main())
```
Base URL: `https://api.idlecloud.ai/v1`
```
POST /v1/chat/completions
Authorization: Bearer ic_your_api_key_here
Content-Type: application/json
```

```json
{
  "model": "gpt-oss-20b",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"}
  ],
  "temperature": 0.7,
  "max_tokens": 150,
  "stream": false
}
```
```json
{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "created": 1733256789,
  "model": "gpt-oss-20b",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "The capital of France is Paris."
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 23,
    "completion_tokens": 8,
    "total_tokens": 31,
    "completion_tokens_details": {
      "reasoning_tokens": 0
    }
  }
}
```
```
Content-Type: text/event-stream

data: {"id":"chatcmpl-...","object":"chat.completion.chunk",...}
data: {"id":"chatcmpl-...","object":"chat.completion.chunk",...}
data: [DONE]
```
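If you are not using an SDK, the event stream can be parsed by hand. This is a minimal sketch assuming only the `data:` / `[DONE]` framing shown above; the chunk payloads are illustrative, and a real client would read lines from the HTTP response body:

```python
import json

def parse_sse_chunks(lines):
    """Yield parsed JSON payloads from an SSE event stream, stopping at [DONE]."""
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines and comments
        payload = line[len("data: "):].strip()
        if payload == "[DONE]":
            break
        yield json.loads(payload)

# Sample transcript shaped like the stream above (contents are illustrative)
stream = [
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"delta":{"content":"lo"}}]}',
    "data: [DONE]",
]
text = "".join(c["choices"][0]["delta"]["content"] for c in parse_sse_chunks(stream))
print(text)  # -> Hello
```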
```bash
curl https://api.idlecloud.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ic_your_api_key" \
  -d '{
    "model": "gpt-oss-20b",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```
- `gpt-oss-20b-safeguard` - Suited to a broad range of moderation tasks
| Parameter | Type | Required | Description |
|---|---|---|---|
| `model` | string | Yes | Model to use (e.g., `"gpt-oss-20b"`) |
| `messages` | array | Yes | List of message objects with `role` and `content` |
| `temperature` | number | No | Sampling temperature (0-2, default: 1) |
| `max_tokens` | integer | No | Maximum tokens to generate |
| `stream` | boolean | No | Enable streaming (default: false) |
| `top_p` | number | No | Nucleus sampling (0-1) |
| `frequency_penalty` | number | No | Reduce repetition (-2 to 2) |
| `presence_penalty` | number | No | Encourage new topics (-2 to 2) |
| `stop` | string/array | No | Stop sequences |
```json
{
  "error": {
    "message": "Invalid API key provided",
    "type": "invalid_request_error",
    "code": "invalid_api_key",
    "param": null
  }
}
```
| Code | Meaning |
|---|---|
| 200 | Success |
| 400 | Bad request (invalid parameters) |
| 401 | Unauthorized (invalid API key) |
| 404 | Model not found |
| 429 | Rate limit exceeded |
| 500 | Internal server error |
| 503 | Service unavailable (no miners available) |
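When wrapping the API, it can help to treat 429, 500, and 503 as transient and everything else as fatal. A sketch of that classification (the helper is hypothetical, not part of any SDK):

```python
# Status codes from the table above that are worth retrying with backoff:
# rate limits and transient server/capacity errors.
RETRYABLE = {429, 500, 503}

def should_retry(status_code):
    """Return True for transient errors; False for client errors like 400/401/404."""
    return status_code in RETRYABLE

print(should_retry(429), should_retry(401))  # -> True False
```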
- `invalid_api_key` - API key is invalid or missing
- `model_not_found` - Requested model doesn't exist
- `rate_limit_exceeded` - Too many requests
- `insufficient_quota` - Not enough credits
- `invalid_request_error` - Invalid request parameters

Rate limit information is returned in response headers:

```
X-RateLimit-Limit-Requests: 10
X-RateLimit-Remaining-Requests: 9
X-RateLimit-Reset-Requests: 2025-12-05T17:30:00Z
```
```python
from time import sleep

from openai import RateLimitError

try:
    response = client.chat.completions.create(
        model="gpt-oss-20b",
        messages=[{"role": "user", "content": "Hello!"}],
        stream=True
    )
except RateLimitError as e:
    # Honor the server's Retry-After header, defaulting to 60 seconds
    retry_after = int(e.response.headers.get("Retry-After", 60))
    sleep(retry_after)
    response = client.chat.completions.create(
        model="gpt-oss-20b",
        messages=[{"role": "user", "content": "Hello!"}],
        stream=True
    )
```
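The single retry above gives up if the second attempt is also rate-limited. A more robust pattern is exponential backoff; here is a generic sketch (the helper and the stand-in `flaky` function are illustrative, not SDK APIs):

```python
import time

def retry_with_backoff(fn, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying transient failures with exponential backoff.

    In real use, catch only transient errors such as openai.RateLimitError;
    Exception is used here so the sketch runs without the SDK installed.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts, propagate the error
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...

# Demo with a stand-in that fails twice before succeeding (no network needed)
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("rate limited")
    return "ok"

result = retry_with_backoff(flaky, sleep=lambda s: None)
print(result, calls["n"])  # -> ok 3
```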
The usage object includes:

- `prompt_tokens`: Input tokens
- `completion_tokens`: Output tokens (including reasoning)
- `total_tokens`: Sum of prompt and completion tokens
- `completion_tokens_details.reasoning_tokens`: Tokens used for reasoning (gpt-oss-20b)

```json
{
  "usage": {
    "prompt_tokens": 78,
    "completion_tokens": 116,
    "total_tokens": 194,
    "completion_tokens_details": {
      "reasoning_tokens": 98
    }
  }
}
```
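Because billing is token-based, it can be useful to accumulate usage across calls. A minimal sketch over plain usage dicts shaped like the object above (the `tally` helper is hypothetical, not an SDK function):

```python
def tally(usages):
    """Sum token counts across a sequence of usage dicts."""
    totals = {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0}
    for u in usages:
        for key in totals:
            totals[key] += u[key]
    return totals

# Two calls with the usage shown above would total 2 * 194 = 388 tokens
usage = {"prompt_tokens": 78, "completion_tokens": 116, "total_tokens": 194}
print(tally([usage, usage])["total_tokens"])  # -> 388
```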
```python
from openai import OpenAI

client = OpenAI(
    api_key="ic_...",  # Use IdleCloud API key
    base_url="https://api.idlecloud.ai/v1"  # Point to IdleCloud
)

# Everything else is identical
response = client.chat.completions.create(
    model="gpt-oss-20b",
    messages=[{"role": "user", "content": "Hello!"}]
)
```
```python
from idlecloud import IdleCloud

api_key = 'ic_your_api_key_here'
client = IdleCloud(api_key=api_key)

# Same interface as OpenAI
response = client.chat.completions.create(
    model="gpt-oss-20b",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True
)

for chunk in response:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```
- Set `max_tokens` to limit costs
- Lower `temperature` for deterministic tasks
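Applied to a request, the tips above might look like this (the prompt and values are illustrative only):

```python
# Request parameters reflecting the best practices above
params = {
    "model": "gpt-oss-20b",
    "messages": [{"role": "user", "content": "Reply with exactly one word: ping"}],
    "max_tokens": 16,   # cap output length to bound cost
    "temperature": 0,   # minimize sampling randomness for deterministic tasks
}
# Send with: client.chat.completions.create(**params)
print(params["max_tokens"], params["temperature"])  # -> 16 0
```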