PureRouter Public API
The PureRouter API allows you to interact with PureAI’s intelligent LLM routing service. This documentation provides details about the public endpoints available for use in your applications.
Authentication
All endpoints require authentication via an API key sent in the x-router-key header.
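A minimal Python sketch of attaching the key to a request might look like the following. Only the x-router-key header name comes from this documentation; the helper name build_headers and the JSON Content-Type are assumptions made for illustration.

```python
def build_headers(api_key: str) -> dict[str, str]:
    """Return request headers carrying the router key.

    build_headers is a hypothetical helper, not part of the PureRouter API.
    """
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    return {
        "x-router-key": api_key,             # header name from the documentation
        "Content-Type": "application/json",  # assumption: JSON request bodies
    }
```

Failing early on an empty key keeps an avoidable 401 response from reaching the network.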
Public Endpoints
The PureRouter public API offers the following endpoints:
Router - Intelligent Routing (/v1/infer)
Sends a query to be automatically routed to the most suitable model based on the selected profile. Example with curl:
Deployments - Invoke Specific Model (/v1/deployments//invoke)
Sends a request to a specific model through its deployment ID. Example with curl:
Request Parameters
Common Parameters
- prompt (string, required): The input text prompt for the model
- max_tokens (integer, optional): Maximum number of tokens to generate
- temperature (float, optional): Controls randomness (0.0 to 1.0). Lower = more deterministic, higher = more creative
- top_p (float, optional): Nucleus sampling threshold (0.0 to 1.0)
- stream (boolean, optional): Whether to stream the response. Default: false
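The common parameters can be checked on the client before a request is sent. The sketch below is one possible way to do that in Python. The function name build_common_params is hypothetical, and the rule that max_tokens must be positive is an assumption; the 0.0 to 1.0 ranges and the stream default come from the list above.

```python
from typing import Any, Optional


def build_common_params(
    prompt: str,
    max_tokens: Optional[int] = None,
    temperature: Optional[float] = None,
    top_p: Optional[float] = None,
    stream: bool = False,
) -> dict[str, Any]:
    """Validate the common parameters and return a request body dict."""
    if not isinstance(prompt, str) or not prompt:
        raise ValueError("prompt is required and must be a non-empty string")
    body: dict[str, Any] = {"prompt": prompt, "stream": stream}
    if max_tokens is not None:
        if max_tokens <= 0:  # assumption: only positive limits make sense
            raise ValueError("max_tokens must be positive")
        body["max_tokens"] = max_tokens
    # temperature and top_p share the documented 0.0 to 1.0 range.
    for name, value in (("temperature", temperature), ("top_p", top_p)):
        if value is not None:
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be between 0.0 and 1.0")
            body[name] = value
    return body
```

Optional parameters that are left unset are omitted from the body, so the service applies its own defaults.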
Router-specific Parameters (/v1/infer)
- profile (string, optional): Routing profile. Options: “economy”, “balanced”, “quality”. Default: “balanced”
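As a rough illustration, a routed request could be prepared with Python's standard library as shown below. The base URL is a placeholder, and the use of POST with a JSON body is an assumption, since this page does not state either. The function name build_infer_request is also hypothetical. The request object is only constructed here, not sent.

```python
import json
import urllib.request

BASE_URL = "https://router.example.invalid"  # placeholder, not the real host
PROFILES = ("economy", "balanced", "quality")  # options listed in the docs


def build_infer_request(
    api_key: str,
    prompt: str,
    profile: str = "balanced",
    base_url: str = BASE_URL,
) -> urllib.request.Request:
    """Build (but do not send) a request for the /v1/infer endpoint."""
    if profile not in PROFILES:
        raise ValueError(f"profile must be one of {PROFILES}")
    payload = json.dumps({"prompt": prompt, "profile": profile}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v1/infer",
        data=payload,
        headers={
            "x-router-key": api_key,
            "Content-Type": "application/json",  # assumption: JSON body
        },
        method="POST",  # assumption: the endpoint accepts POST
    )
```

Passing the returned object to urllib.request.urlopen would send it once base_url points at the actual service.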
Deployment-specific Parameters (/v1/deployments//invoke)
- deployment_id (string, required): The unique identifier of the deployment to invoke
Alternative Format for Deployments
Some deployments also support an alternative request format:
- inputs (string, required): The input text prompt for the model (alternative to prompt)
- parameters (object, optional): Configuration parameters wrapped in a parameters object
  - max_new_tokens (integer, optional): Maximum number of new tokens to generate (alternative to max_tokens)
  - temperature (float, optional): Controls randomness (0.0 to 1.0)
  - top_p (float, optional): Nucleus sampling threshold (0.0 to 1.0)
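The mapping between the two request formats can be sketched as a small conversion function. The name to_alternative_format is hypothetical. The field names follow the lists above, and the stream flag is dropped because the alternative format does not mention it.

```python
from typing import Any


def to_alternative_format(body: dict[str, Any]) -> dict[str, Any]:
    """Convert a common-format body into the alternative deployment format."""
    if "prompt" not in body:
        raise KeyError("prompt is required")
    parameters: dict[str, Any] = {}
    # max_tokens is renamed; temperature and top_p keep their names.
    if "max_tokens" in body:
        parameters["max_new_tokens"] = body["max_tokens"]
    for name in ("temperature", "top_p"):
        if name in body:
            parameters[name] = body[name]
    converted: dict[str, Any] = {"inputs": body["prompt"]}
    if parameters:  # the parameters object is optional, so omit it when empty
        converted["parameters"] = parameters
    return converted
```

For example, a body with prompt, max_tokens, and temperature becomes an inputs string plus a parameters object holding max_new_tokens and temperature.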
 
Response Examples
Response from /v1/infer endpoint
Response from /v1/deployments//invoke endpoint
Streaming Response Format
When stream: true is set, responses are sent as Server-Sent Events (SSE).
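Because the payload of each PureRouter event is not shown on this page, the sketch below handles only the generic SSE framing: data lines are collected until a blank line ends the event. The function name iter_sse_data is hypothetical, and the content of each yielded string is left for the caller to interpret.

```python
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each Server-Sent Event in a line stream."""
    data_parts: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_parts:
                yield "\n".join(data_parts)
                data_parts = []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if field == "data":
            # A single leading space after the colon is not part of the value.
            data_parts.append(value[1:] if value.startswith(" ") else value)
    if data_parts:
        # Lenient choice: the SSE specification discards an unterminated
        # final event, but this sketch yields it anyway.
        yield "\n".join(data_parts)
```

The function accepts any iterable of text lines, so it can be fed decoded lines from an HTTP response or a plain list in tests.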
Routing Profiles
Choose the profile that best fits your needs:
- economy: Cost-optimized routing, uses cheaper models when possible
- balanced: Balance between cost and quality (default)
- quality: Prioritizes response quality over cost
Error Handling
The PureRouter API returns standard HTTP status codes to indicate the success or failure of a request. In case of error, the response body will contain detailed information about the problem.
Common Status Codes
- 200 OK: The request was successful
- 400 Bad Request: The request contains invalid parameters or is malformed
- 401 Unauthorized: Authentication failure (invalid or missing router key)
- 404 Not Found: The requested resource was not found (e.g., deployment not found)
- 500 Internal Server Error: The server encountered an unexpected error while processing the request
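One way a client might turn these status codes into exceptions is sketched below. The names PureRouterError and check_status are hypothetical. The structure of the error body is not described on this page, so it is carried along as raw text rather than parsed.

```python
STATUS_MEANINGS = {
    400: "invalid or malformed request",
    401: "invalid or missing router key",
    404: "resource not found (for example, an unknown deployment)",
    500: "internal server error",
}


class PureRouterError(Exception):
    """Hypothetical exception type for non-success responses."""

    def __init__(self, status: int, detail: str) -> None:
        super().__init__(f"HTTP {status}: {detail}")
        self.status = status
        self.detail = detail


def check_status(status: int, body_text: str = "") -> None:
    """Raise PureRouterError unless the status code indicates success."""
    if 200 <= status < 300:
        return
    meaning = STATUS_MEANINGS.get(status, "unexpected status")
    # The error body format is undocumented here, so include it verbatim.
    detail = f"{meaning}; response body: {body_text}" if body_text else meaning
    raise PureRouterError(status, detail)
```

Callers can branch on the status attribute, for instance prompting for a new key on 401 while reporting 404 as a configuration problem.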