PureRouter Quickstart

PureRouter is fully independent of PureCPP: you can use either product without the other.

Installation

Install PureRouter using pip:
pip install purerouter
Note: For detailed installation instructions and requirements, see the installation page.

Basic Configuration

PureRouter supports both synchronous and asynchronous operations. Choose the approach that best fits your application:
# Synchronous client
from purerouter import PureRouter
from purerouter.types import InferRequest, InvokeRequest

client = PureRouter(
    router_key="your-api-key-here",
    base_url="https://api.purerouter-api.com",
    timeout=300.0
)

# Asynchronous client
from purerouter import AsyncPureRouter

async_client = AsyncPureRouter(
    router_key="your-api-key-here", 
    base_url="https://api.purerouter-api.com",
    timeout=300.0
)
Note: You need to obtain a valid API key from the PureAI platform. This key allows access to the different routing profiles and deployments configured in your account.

Synchronous Examples

Router Inference (Sync)

Use automatic routing to select the best model for your query:
from purerouter import PureRouter
from purerouter.types import InferRequest

def main():
    client = PureRouter(
        router_key="sk_...",
        base_url="https://api.purerouter-api.com",
        timeout=300.0
    )

    req = InferRequest(
        prompt="Summarize machine learning in 3 sentences.",
        profile="economy",  # Options: economy, balanced, quality
        stream=False
    )

    response = client.router.infer(req)
    print("--- RESPONSE ---")
    print(f"output: {response.output_text}")
    print(f"provider: {response.provider}")
    print(f"model_id: {response.model_id}")
    print(f"latency: {response.latency_ms} ms")
    print(f"usage: {response.usage}")
    print(f"profile: {response.profile}")

if __name__ == "__main__":
    main()

Direct Deployment Call (Sync)

Call a specific model directly using its deployment ID:
from purerouter import PureRouter
from purerouter.types import InvokeRequest

def main():
    client = PureRouter(
        router_key="sk_...",
        base_url="https://api.purerouter-api.com",
        timeout=300.0
    )
    deployment_id = ""  # set this to one of your deployment IDs

    req = InvokeRequest(
        prompt="Tell a brief story about PureAI.",
        max_tokens=250,
        temperature=0.8
    )

    result = client.deployments.invoke(deployment_id, req)
    text = result["choices"][0]["text"]
    print(text)

if __name__ == "__main__":
    main()

Asynchronous Examples

Router Inference (Async)

For high-performance applications, use async operations:
import asyncio
from purerouter import AsyncPureRouter
from purerouter.types import InferRequest

async def main():
    client = AsyncPureRouter(
        router_key="sk_...",
        base_url="https://api.purerouter-api.com",
        timeout=300.0
    )

    req = InferRequest(
        prompt="Summarize machine learning in 3 sentences.",
        profile="economy",
        stream=False
    )

    response = await client.router.ainfer(req)
    print("--- RESPONSE ---")
    print(f"output: {response.output_text}")
    print(f"provider: {response.provider}")
    print(f"model_id: {response.model_id}")
    print(f"latency: {response.latency_ms} ms")
    print(f"usage: {response.usage}")
    print(f"profile: {response.profile}")

asyncio.run(main())

Direct Deployment Call (Async)

import asyncio
from purerouter import AsyncPureRouter
from purerouter.types import InvokeRequest

async def main():
    client = AsyncPureRouter(
        router_key="sk_...",
        base_url="https://api.purerouter-api.com",
        timeout=300.0
    )
    deployment_id = ""  # set this to one of your deployment IDs

    req = InvokeRequest(
        prompt="Tell a brief story about PureAI.",
        max_tokens=250,
        temperature=0.8,
        stream=False
    )

    result = await client.deployments.ainvoke(deployment_id, req)
    text = result["choices"][0]["text"]
    print(text)

asyncio.run(main())

Streaming Examples

Router Streaming

Get real-time responses as they’re generated:
import asyncio
import json
import sys
from purerouter import AsyncPureRouter
from purerouter.types import InferRequest

async def main():
    client = AsyncPureRouter(
        router_key="sk_...",
        base_url="https://api.purerouter-api.com",
        timeout=300.0
    )

    req = InferRequest(
        prompt="Tell a long, detailed story about a dragon who learned Python.",
        profile="economy",
        stream=True
    )

    print("\n--- STREAM START ---\n")
    async for ev in client.router.astream(req):
        line = ev.data.strip()
        if not line:
            continue
        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            sys.stdout.write(line)
            sys.stdout.flush()
            continue

        if obj.get("event") == "start":
            continue
        elif obj.get("event") == "end":
            print("\n\n--- STREAM END ---")
            print(f"latency: {obj.get('latency_ms')} ms")
            print(f"usage: {obj.get('usage')}")
            print(f"cost: {obj.get('cost_usd')}")
            break
        elif "token" in obj:
            sys.stdout.write(obj["token"])
            sys.stdout.flush()

asyncio.run(main())
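
The event-handling logic inside the loop above can be factored into a small, independently testable helper. The sketch below is illustrative only (the `parse_router_event` name is our own, not part of the purerouter SDK): it classifies one raw stream line as a token to print, the final stats event, or a line to skip.

```python
import json

def parse_router_event(data: str):
    """Classify one raw router stream line.

    Returns ("token", text), ("end", stats_dict), or ("skip", None).
    Illustrative helper; not part of the purerouter SDK.
    """
    line = data.strip()
    if not line:
        return ("skip", None)
    try:
        obj = json.loads(line)
    except json.JSONDecodeError:
        # Non-JSON lines are passed through as raw text.
        return ("token", line)
    event = obj.get("event")
    if event == "start":
        return ("skip", None)
    if event == "end":
        # The end event carries latency_ms, usage, and cost_usd.
        return ("end", obj)
    if "token" in obj:
        return ("token", obj["token"])
    return ("skip", None)
```

With this in place, the `async for` body reduces to a dispatch on the returned kind, and the parsing rules can be unit-tested without a live stream.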

Deployment Streaming

Stream responses from a specific deployment:
import asyncio
import json
import sys
from purerouter import AsyncPureRouter
from purerouter.types import InvokeRequest

async def main():
    client = AsyncPureRouter(
        router_key="sk_...",
        base_url="https://api.purerouter-api.com",
        timeout=300.0
    )
    deployment_id = ""  # set this to one of your deployment IDs

    req = InvokeRequest(
        prompt="Hi.",
        max_tokens=250,
        temperature=0.8,
        stream=True
    )

    final = []
    async for ev in client.deployments.astream(deployment_id, req):
        line = (ev.data or "").strip()
        if line.startswith("data:"):
            line = line[len("data:"):].strip()
        if not line:
            continue
        if line == "[DONE]":
            break  # end-of-stream sentinel

        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            sys.stdout.write(line)
            sys.stdout.flush()
            final.append(line)
            continue

        choices = obj.get("choices") or []
        if choices and isinstance(choices[0], dict) and "text" in choices[0]:
            tok = choices[0]["text"] or ""
            sys.stdout.write(tok)
            sys.stdout.flush()
            final.append(tok)

    print(f"\n\n--- STREAM END ({len(final)} chunks) ---")

asyncio.run(main())
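
The per-line handling here can likewise be pulled out into a standalone function. This is a sketch (the `extract_deployment_token` name and `DONE` sentinel are our own, not part of the SDK): it strips the SSE `data:` prefix, detects the `[DONE]` marker, and pulls the token out of the OpenAI-style `choices[0]["text"]` field.

```python
import json

# Sentinel distinguishing end-of-stream from an ordinary skipped line.
DONE = object()

def extract_deployment_token(data: str):
    """Return the text token carried by one deployment stream line,
    None for lines to skip, or DONE at end of stream.
    Illustrative helper; not part of the purerouter SDK.
    """
    line = (data or "").strip()
    if line.startswith("data:"):
        line = line[len("data:"):].strip()
    if not line:
        return None
    if line == "[DONE]":
        return DONE
    try:
        obj = json.loads(line)
    except json.JSONDecodeError:
        return line  # pass raw text through unchanged
    choices = obj.get("choices") or []
    if choices and isinstance(choices[0], dict) and "text" in choices[0]:
        return choices[0]["text"] or ""
    return None
```

The calling loop then becomes: break on `DONE`, skip `None`, write everything else.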

Routing Profiles

Choose the profile that best fits your needs:
  • economy - Cost-optimized routing, uses cheaper models when possible
  • balanced - Balance between cost and quality
  • quality - Prioritizes response quality over cost
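
Since the profile is just a string passed on `InferRequest`, you can centralize the choice in your own code. A minimal sketch, assuming your app knows its cost and quality requirements per request (the `pick_profile` helper and its rules are our own convention, not part of the SDK):

```python
def pick_profile(cost_sensitive: bool, quality_critical: bool) -> str:
    """Map application requirements to a PureRouter routing profile.

    Hypothetical convention; adapt the rules to your own needs.
    Quality-critical requests win over cost sensitivity.
    """
    if quality_critical:
        return "quality"
    if cost_sensitive:
        return "economy"
    return "balanced"
```

The returned string is then passed as `profile=` when constructing an `InferRequest`.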

Workflow

  1. Sign up for the PureAI platform - Register and configure your LLM provider API keys
  2. Get your router API key - Generate an API key on the platform to use the service
  3. Install the Python library - Add the purerouter package to your project
  4. Choose your approach - Select sync/async and streaming based on your needs
  5. Choose the routing profile - Select between economy, balanced, or quality
  6. Implement routing - Integrate PureRouter into your application flow
  7. Optional: Deploy models - Deploy open-source models and call them directly by deployment ID

Next Steps