
PureRouter Public API

The PureRouter API allows you to interact with PureAI’s intelligent LLM routing service. This documentation provides details about the public endpoints available for use in your applications.

Authentication

All endpoints require authentication via an API key passed in the x-router-key header:
# Authentication header example
x-router-key: sk_your_router_key_here
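In code, that header can be built once and reused across requests. A minimal Python sketch (the router_headers helper is ours, not part of any official SDK):

```python
def router_headers(router_key):
    """Headers PureRouter expects on every request (x-router-key auth)."""
    return {
        "Content-Type": "application/json",
        "x-router-key": router_key,
    }


headers = router_headers("sk_your_router_key_here")
```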

Public Endpoints

The PureRouter public API offers the following endpoints:

Router - Intelligent Routing (/v1/infer)

Sends a query to be automatically routed to the most suitable model based on the selected profile. Example with curl:
curl -i -X POST \
  -H "Content-Type: application/json" \
  -H "x-router-key: sk_your_router_key_here" \
  -d '{
    "prompt": "Tell a long and detailed story about a dragon that learned to program in Python.",
    "profile": "balanced",
    "max_tokens": 500,
    "temperature": 0.7,
    "stream": false
  }' \
  https://api.purerouter-api.com/v1/infer
Example with curl (streaming):
curl -i -X POST \
  --no-buffer \
  -H "Content-Type: application/json" \
  -H "x-router-key: sk_your_router_key_here" \
  -d '{
    "prompt": "Tell a long and detailed story about a dragon that learned to program in Python.",
    "profile": "economy",
    "stream": true
  }' \
  https://api.purerouter-api.com/v1/infer
Example with Python:
import requests

url = "https://api.purerouter-api.com/v1/infer"
headers = {
    "Content-Type": "application/json",
    "x-router-key": "sk_your_router_key_here"
}

payload = {
    "prompt": "Tell a long and detailed story about a dragon that learned to program in Python.",
    "profile": "balanced",
    "max_tokens": 500,
    "temperature": 0.7,
    "stream": False
}

response = requests.post(url, json=payload, headers=headers)
print(response.json())
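The curl streaming example above has a Python counterpart. A hedged sketch using requests with stream=True, parsing the Server-Sent Events format documented later on this page (extract_text and stream_infer are illustrative names, not SDK functions):

```python
import json

import requests


def extract_text(sse_line):
    """Return the text fragment from one 'data: {...}' SSE line, or None."""
    if not sse_line.startswith("data: "):
        return None
    payload = sse_line[len("data: "):]
    if payload == "[DONE]":
        return None
    return json.loads(payload)["choices"][0]["text"]


def stream_infer(prompt, router_key, profile="balanced"):
    """Yield text fragments from a streaming /v1/infer call."""
    response = requests.post(
        "https://api.purerouter-api.com/v1/infer",
        json={"prompt": prompt, "profile": profile, "stream": True},
        headers={"Content-Type": "application/json", "x-router-key": router_key},
        stream=True,
    )
    response.raise_for_status()
    for raw in response.iter_lines(decode_unicode=True):
        if raw:
            text = extract_text(raw)
            if text is not None:
                yield text
```

Consume the generator with `for chunk in stream_infer("...", "sk_your_router_key_here"): print(chunk, end="", flush=True)`.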

Deployments - Invoke Specific Model (/v1/deployments/{deployment_id}/invoke)

Sends a request to a specific model through its deployment ID. Example with curl:
curl -i -X POST \
  -H "Content-Type: application/json" \
  -H "x-router-key: sk_your_router_key_here" \
  -d '{
    "prompt": "Tell a brief story about PureAI.",
    "max_tokens": 250,
    "temperature": 0.8,
    "stream": false
  }' \
  https://api.purerouter-api.com/v1/deployments/{deployment_id}/invoke
Example with curl (streaming):
curl -i -X POST \
  --no-buffer \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -H "x-router-key: sk_your_router_key_here" \
  -d '{
    "prompt": "Tell a brief story about PureAI.",
    "max_tokens": 250,
    "temperature": 0.8,
    "stream": true
  }' \
  https://api.purerouter-api.com/v1/deployments/{deployment_id}/invoke
Example with curl (alternative format):
curl -i -X POST \
  -H "Content-Type: application/json" \
  -H "x-router-key: sk_your_router_key_here" \
  -d '{
    "inputs": "Tell a brief story about PureAI.",
    "parameters": {
      "max_new_tokens": 250,
      "temperature": 0.8
    }
  }' \
  https://api.purerouter-api.com/v1/deployments/{deployment_id}/invoke
Example with Python:
import requests

deployment_id = "your_deployment_id_here"
url = f"https://api.purerouter-api.com/v1/deployments/{deployment_id}/invoke"
headers = {
    "Content-Type": "application/json",
    "x-router-key": "sk_your_router_key_here"
}

payload = {
    "prompt": "Tell a brief story about PureAI.",
    "max_tokens": 250,
    "temperature": 0.8,
    "stream": False
}

response = requests.post(url, json=payload, headers=headers)
print(response.json())
Example with Python (alternative format):
import requests

deployment_id = "your_deployment_id_here"
url = f"https://api.purerouter-api.com/v1/deployments/{deployment_id}/invoke"
headers = {
    "Content-Type": "application/json",
    "x-router-key": "sk_your_router_key_here"
}

payload = {
    "inputs": "Tell a brief story about PureAI.",
    "parameters": {
        "max_new_tokens": 250,
        "temperature": 0.8
    }
}

response = requests.post(url, json=payload, headers=headers)
print(response.json())

Request Parameters

Common Parameters

  • prompt (string, required): The input text prompt for the model
  • max_tokens (integer, optional): Maximum number of tokens to generate
  • temperature (float, optional): Controls randomness (0.0 to 1.0). Lower = more deterministic, higher = more creative
  • top_p (float, optional): Nucleus sampling threshold (0.0 to 1.0)
  • stream (boolean, optional): Whether to stream the response. Default: false
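These constraints can be enforced client-side before a request is sent; a small sketch (build_payload is a hypothetical helper, and the ranges come from the descriptions above):

```python
def build_payload(prompt, max_tokens=None, temperature=None, top_p=None, stream=False):
    """Assemble a request body, enforcing the documented 0.0-1.0 ranges."""
    if not prompt:
        raise ValueError("prompt is required")
    payload = {"prompt": prompt, "stream": stream}
    for name, value in (
        ("max_tokens", max_tokens),
        ("temperature", temperature),
        ("top_p", top_p),
    ):
        if value is not None:
            if name in ("temperature", "top_p") and not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be between 0.0 and 1.0")
            payload[name] = value
    return payload
```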

Router-specific Parameters (/v1/infer)

  • profile (string, optional): Routing profile. Options: "economy", "balanced", "quality". Default: "balanced"

Deployment-specific Parameters (/v1/deployments/{deployment_id}/invoke)

  • deployment_id (string, required): The unique identifier of the deployment to invoke

Alternative Format for Deployments

Some deployments also support an alternative request format:
  • inputs (string, required): The input text prompt for the model (alternative to prompt)
  • parameters (object, optional): Configuration parameters wrapped in a parameters object
    • max_new_tokens (integer, optional): Maximum number of new tokens to generate (alternative to max_tokens)
    • temperature (float, optional): Controls randomness (0.0 to 1.0)
    • top_p (float, optional): Nucleus sampling threshold (0.0 to 1.0)
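When a deployment only accepts this alternative shape, the standard body can be translated mechanically. A sketch of that mapping (to_alternative_format is our name; the field correspondence follows the two parameter lists above):

```python
def to_alternative_format(payload):
    """Convert a prompt/max_tokens-style body to the inputs/parameters style."""
    parameters = {}
    if "max_tokens" in payload:
        parameters["max_new_tokens"] = payload["max_tokens"]
    for key in ("temperature", "top_p"):
        if key in payload:
            parameters[key] = payload[key]
    alt = {"inputs": payload["prompt"]}
    if parameters:
        alt["parameters"] = parameters
    return alt
```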

Response Examples

Response from /v1/infer endpoint

{
  "id": "resp_7a9b3c2d1e",
  "model": "gpt-4-turbo",
  "choices": [
    {
      "text": "Once upon a time, in a mystical land far beyond the mountains, there lived a dragon named Pyrex who was unlike any other dragon in the realm. While his fellow dragons spent their days hoarding gold and breathing fire, Pyrex was fascinated by the strange glowing rectangles that humans carried around...",
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 142,
    "total_tokens": 167
  },
  "provider": "openai",
  "model_id": "gpt-4-turbo",
  "latency_ms": 1250,
  "profile": "balanced"
}

Response from /v1/deployments/{deployment_id}/invoke endpoint

{
  "id": "resp_8b2c4d3e5f",
  "model": "llama-3.1-8b",
  "choices": [
    {
      "text": "PureAI was born from a simple yet powerful vision: to democratize access to artificial intelligence and make it accessible to everyone, regardless of their technical background. Founded by a team of passionate engineers and researchers, PureAI started as a small project in a garage but quickly grew into a revolutionary platform that would change how people interact with AI technology.",
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 68,
    "total_tokens": 80
  }
}

Streaming Response Format

When stream: true is set, responses are sent as Server-Sent Events (SSE):
data: {"choices": [{"text": "Once"}]}

data: {"choices": [{"text": " upon"}]}

data: {"choices": [{"text": " a"}]}

data: {"choices": [{"text": " time"}]}

data: [DONE]
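Reassembling the completion is a matter of concatenating the text field of each data: event until [DONE]; a sketch assuming the event shape shown above (assemble_stream is an illustrative helper, not an SDK function):

```python
import json


def assemble_stream(sse_lines):
    """Join the text fragments from SSE lines, stopping at [DONE]."""
    parts = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        parts.append(json.loads(payload)["choices"][0]["text"])
    return "".join(parts)
```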

Routing Profiles

Choose the profile that best fits your needs:
  • economy - Cost-optimized routing, uses cheaper models when possible
  • balanced - Balance between cost and quality (default)
  • quality - Prioritizes response quality over cost

Error Handling

The PureRouter API returns standard HTTP status codes to indicate the success or failure of a request. In case of error, the response body will contain detailed information about the problem.

Common Status Codes

  • 200 OK: The request was successful
  • 400 Bad Request: The request contains invalid parameters or is malformed
  • 401 Unauthorized: Authentication failure (invalid or missing router key)
  • 404 Not Found: The requested resource was not found (e.g., deployment not found)
  • 500 Internal Server Error: Internal server error

Error Response Example

{
  "error": {
    "message": "Invalid or expired router key",
    "type": "authentication_error",
    "param": "x-router-key",
    "code": "invalid_router_key"
  }
}
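Client code can surface these fields directly; a minimal sketch that renders an error body as a one-line message (format_error is a hypothetical helper; the field names are taken from the example above):

```python
def format_error(body):
    """Render an error response body as a one-line, human-readable message."""
    err = body.get("error", {})
    return "{} ({}): {}".format(
        err.get("type", "error"),
        err.get("code", "unknown"),
        err.get("message", ""),
    )
```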

Rate Limits

API requests are subject to rate limiting based on your subscription plan. Rate limit information is included in response headers:
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 999
X-RateLimit-Reset: 1640995200
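These headers can drive simple client-side throttling; a sketch that computes how long to wait before retrying (seconds_until_reset is our helper; X-RateLimit-Reset is assumed to be a Unix timestamp, as the example value suggests):

```python
import time


def seconds_until_reset(headers, now=None):
    """Seconds to wait before retrying once X-RateLimit-Remaining hits 0."""
    if int(headers.get("X-RateLimit-Remaining", 1)) > 0:
        return 0.0
    reset = int(headers.get("X-RateLimit-Reset", 0))
    now = time.time() if now is None else now
    return max(0.0, reset - now)
```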