
PureRouter Public API

The PureRouter API allows you to interact with PureAI’s intelligent LLM routing service. This documentation provides details about the public endpoints available for use in your applications.

Authentication

All endpoints require authentication via an API key passed in the x-router-key header:
# Authentication header example
x-router-key: sk_your_router_key_here
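In code, that header can be built once and reused across requests. A minimal Python sketch (the router_headers helper is ours, not part of any official SDK):

```python
def router_headers(router_key):
    """Headers PureRouter expects on every request (x-router-key auth)."""
    return {
        "Content-Type": "application/json",
        "x-router-key": router_key,
    }


headers = router_headers("sk_your_router_key_here")
```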

Public Endpoints

The PureRouter public API offers the following endpoints:

Router - Intelligent Routing (/v1/infer)

Sends a query to be automatically routed to the most suitable model based on the selected profile. Example with curl:
curl -i -X POST \
  -H "Content-Type: application/json" \
  -H "x-router-key: sk_your_router_key_here" \
  -d '{
    "prompt": "Tell a long and detailed story about a dragon that learned to program in Python.",
    "profile": "balanced",
    "max_tokens": 500,
    "temperature": 0.7,
    "stream": false
  }' \
  https://api.purerouter-api.com/v1/infer
Example with curl (streaming):
curl -i -X POST \
  --no-buffer \
  -H "Content-Type: application/json" \
  -H "x-router-key: sk_your_router_key_here" \
  -d '{
    "prompt": "Tell a long and detailed story about a dragon that learned to program in Python.",
    "profile": "economy",
    "stream": true
  }' \
  https://api.purerouter-api.com/v1/infer
Example with Python:
import requests

url = "https://api.purerouter-api.com/v1/infer"
headers = {
    "Content-Type": "application/json",
    "x-router-key": "sk_your_router_key_here"
}

payload = {
    "prompt": "Tell a long and detailed story about a dragon that learned to program in Python.",
    "profile": "balanced",
    "max_tokens": 500,
    "temperature": 0.7,
    "stream": False
}

response = requests.post(url, json=payload, headers=headers)
print(response.json())
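The curl streaming example above has a Python counterpart. A hedged sketch using requests with stream=True, parsing the Server-Sent Events format documented later on this page (extract_text and stream_infer are illustrative names, not SDK functions):

```python
import json

import requests


def extract_text(sse_line):
    """Return the text fragment from one 'data: {...}' SSE line, or None."""
    if not sse_line.startswith("data: "):
        return None
    payload = sse_line[len("data: "):]
    if payload == "[DONE]":
        return None
    return json.loads(payload)["choices"][0]["text"]


def stream_infer(prompt, router_key, profile="balanced"):
    """Yield text fragments from a streaming /v1/infer call."""
    response = requests.post(
        "https://api.purerouter-api.com/v1/infer",
        json={"prompt": prompt, "profile": profile, "stream": True},
        headers={"Content-Type": "application/json", "x-router-key": router_key},
        stream=True,
    )
    response.raise_for_status()
    for raw in response.iter_lines(decode_unicode=True):
        if raw:
            text = extract_text(raw)
            if text is not None:
                yield text
```

Consume the generator with `for chunk in stream_infer("...", "sk_your_router_key_here"): print(chunk, end="", flush=True)`.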

Deployments - Invoke Specific Model (/v1/deployments/{deployment_id}/invoke)

Sends a request to a specific model through its deployment ID. Example with curl:
curl -i -X POST \
  -H "Content-Type: application/json" \
  -H "x-router-key: sk_your_router_key_here" \
  -d '{
    "prompt": "Tell a brief story about PureAI.",
    "max_tokens": 250,
    "temperature": 0.8,
    "stream": false
  }' \
  https://api.purerouter-api.com/v1/deployments/{deployment_id}/invoke
Example with curl (streaming):
curl -i -X POST \
  --no-buffer \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -H "x-router-key: sk_your_router_key_here" \
  -d '{
    "prompt": "Tell a brief story about PureAI.",
    "max_tokens": 250,
    "temperature": 0.8,
    "stream": true
  }' \
  https://api.purerouter-api.com/v1/deployments/{deployment_id}/invoke
Example with curl (alternative format):
curl -i -X POST \
  -H "Content-Type: application/json" \
  -H "x-router-key: sk_your_router_key_here" \
  -d '{
    "inputs": "Tell a brief story about PureAI.",
    "parameters": {
      "max_new_tokens": 250,
      "temperature": 0.8
    }
  }' \
  https://api.purerouter-api.com/v1/deployments/{deployment_id}/invoke
Example with Python:
import requests

deployment_id = "your_deployment_id_here"
url = f"https://api.purerouter-api.com/v1/deployments/{deployment_id}/invoke"
headers = {
    "Content-Type": "application/json",
    "x-router-key": "sk_your_router_key_here"
}

payload = {
    "prompt": "Tell a brief story about PureAI.",
    "max_tokens": 250,
    "temperature": 0.8,
    "stream": False
}

response = requests.post(url, json=payload, headers=headers)
print(response.json())
Example with Python (alternative format):
import requests

deployment_id = "your_deployment_id_here"
url = f"https://api.purerouter-api.com/v1/deployments/{deployment_id}/invoke"
headers = {
    "Content-Type": "application/json",
    "x-router-key": "sk_your_router_key_here"
}

payload = {
    "inputs": "Tell a brief story about PureAI.",
    "parameters": {
        "max_new_tokens": 250,
        "temperature": 0.8
    }
}

response = requests.post(url, json=payload, headers=headers)
print(response.json())

Request Parameters

Common Parameters

  • prompt (string, required): The input text prompt for the model
  • max_tokens (integer, optional): Maximum number of tokens to generate
  • temperature (float, optional): Controls randomness (0.0 to 1.0). Lower = more deterministic, higher = more creative
  • top_p (float, optional): Nucleus sampling threshold (0.0 to 1.0)
  • stream (boolean, optional): Whether to stream the response. Default: false
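These constraints can be enforced client-side before a request is sent; a small sketch (build_payload is a hypothetical helper, and the ranges come from the descriptions above):

```python
def build_payload(prompt, max_tokens=None, temperature=None, top_p=None, stream=False):
    """Assemble a request body, enforcing the documented 0.0-1.0 ranges."""
    if not prompt:
        raise ValueError("prompt is required")
    payload = {"prompt": prompt, "stream": stream}
    for name, value in (
        ("max_tokens", max_tokens),
        ("temperature", temperature),
        ("top_p", top_p),
    ):
        if value is not None:
            if name in ("temperature", "top_p") and not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be between 0.0 and 1.0")
            payload[name] = value
    return payload
```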

Router-specific Parameters (/v1/infer)

  • profile (string, optional): Routing profile. Options: "economy", "balanced", "quality". Default: "balanced"

Deployment-specific Parameters (/v1/deployments/{deployment_id}/invoke)

  • deployment_id (string, required): The unique identifier of the deployment to invoke

Alternative Format for Deployments

Some deployments also support an alternative request format:
  • inputs (string, required): The input text prompt for the model (alternative to prompt)
  • parameters (object, optional): Configuration parameters wrapped in a parameters object
    • max_new_tokens (integer, optional): Maximum number of new tokens to generate (alternative to max_tokens)
    • temperature (float, optional): Controls randomness (0.0 to 1.0)
    • top_p (float, optional): Nucleus sampling threshold (0.0 to 1.0)
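When a deployment only accepts this alternative shape, the standard body can be translated mechanically. A sketch of that mapping (to_alternative_format is our name; the field correspondence follows the two parameter lists above):

```python
def to_alternative_format(payload):
    """Convert a prompt/max_tokens-style body to the inputs/parameters style."""
    parameters = {}
    if "max_tokens" in payload:
        parameters["max_new_tokens"] = payload["max_tokens"]
    for key in ("temperature", "top_p"):
        if key in payload:
            parameters[key] = payload[key]
    alt = {"inputs": payload["prompt"]}
    if parameters:
        alt["parameters"] = parameters
    return alt
```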

Response Examples

Response from /v1/infer endpoint

{
  "id": "resp_7a9b3c2d1e",
  "model": "gpt-4-turbo",
  "choices": [
    {
      "text": "Once upon a time, in a mystical land far beyond the mountains, there lived a dragon named Pyrex who was unlike any other dragon in the realm. While his fellow dragons spent their days hoarding gold and breathing fire, Pyrex was fascinated by the strange glowing rectangles that humans carried around...",
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 142,
    "total_tokens": 167
  },
  "provider": "openai",
  "model_id": "gpt-4-turbo",
  "latency_ms": 1250,
  "profile": "balanced"
}

Response from /v1/deployments/{deployment_id}/invoke endpoint

{
  "id": "resp_8b2c4d3e5f",
  "model": "llama-3.1-8b",
  "choices": [
    {
      "text": "PureAI was born from a simple yet powerful vision: to democratize access to artificial intelligence and make it accessible to everyone, regardless of their technical background. Founded by a team of passionate engineers and researchers, PureAI started as a small project in a garage but quickly grew into a revolutionary platform that would change how people interact with AI technology.",
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 68,
    "total_tokens": 80
  }
}

Streaming Response Format

When stream: true is set, responses are sent as Server-Sent Events (SSE):
data: {"choices": [{"text": "Once"}]}

data: {"choices": [{"text": " upon"}]}

data: {"choices": [{"text": " a"}]}

data: {"choices": [{"text": " time"}]}

data: [DONE]
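Reassembling the completion is a matter of concatenating the text field of each data: event until [DONE]; a sketch assuming the event shape shown above (assemble_stream is an illustrative helper, not an SDK function):

```python
import json


def assemble_stream(sse_lines):
    """Join the text fragments from SSE lines, stopping at [DONE]."""
    parts = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        parts.append(json.loads(payload)["choices"][0]["text"])
    return "".join(parts)
```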

Routing Profiles

Choose the profile that best fits your needs:
  • economy - Cost-optimized routing, uses cheaper models when possible
  • balanced - Balance between cost and quality (default)
  • quality - Prioritizes response quality over cost

Error Handling

The PureRouter API returns standard HTTP status codes to indicate the success or failure of a request. In case of error, the response body will contain detailed information about the problem.

Common Status Codes

  • 200 OK: The request was successful
  • 400 Bad Request: The request contains invalid parameters or is malformed
  • 401 Unauthorized: Authentication failure (invalid or missing router key)
  • 404 Not Found: The requested resource was not found (e.g., deployment not found)
  • 500 Internal Server Error: Internal server error

Error Response Example

{
  "error": {
    "message": "Invalid or expired router key",
    "type": "authentication_error",
    "param": "x-router-key",
    "code": "invalid_router_key"
  }
}
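Client code can surface these fields directly; a minimal sketch that renders an error body as a one-line message (format_error is a hypothetical helper; the field names are taken from the example above):

```python
def format_error(body):
    """Render an error response body as a one-line, human-readable message."""
    err = body.get("error", {})
    return "{} ({}): {}".format(
        err.get("type", "error"),
        err.get("code", "unknown"),
        err.get("message", ""),
    )
```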

Rate Limits

API requests are subject to rate limiting based on your subscription plan. Rate limit information is included in response headers:
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 999
X-RateLimit-Reset: 1640995200
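These headers can drive simple client-side throttling; a sketch that computes how long to wait before retrying (seconds_until_reset is our helper; X-RateLimit-Reset is assumed to be a Unix timestamp, as the example value suggests):

```python
import time


def seconds_until_reset(headers, now=None):
    """Seconds to wait before retrying once X-RateLimit-Remaining hits 0."""
    if int(headers.get("X-RateLimit-Remaining", 1)) > 0:
        return 0.0
    reset = int(headers.get("X-RateLimit-Reset", 0))
    now = time.time() if now is None else now
    return max(0.0, reset - now)
```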