Models & Providers
The Lunar SDK provides access to models from multiple providers. You can list available models, check provider status, and force specific providers for your requests.

Listing Models
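The exact client surface may differ between SDK versions; the sketch below assumes an OpenAI-style client named `Lunar` with a `models.list()` method and an API key in the `LUNAR_API_KEY` environment variable:

```python
import os

from lunar import Lunar  # assumed package and client name

client = Lunar(api_key=os.environ["LUNAR_API_KEY"])

# List every model currently available through the gateway.
models = client.models.list()
for model in models.data:
    print(model.id, model.owned_by)
```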
Model Object
| Field | Type | Description |
|---|---|---|
| id | str | Model identifier |
| object | str | Always "model" |
| created | int | Unix timestamp |
| owned_by | str | Provider/owner name |
Listing Providers
See which providers are available for a specific model:
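There is no single standard call for this; the sketch below assumes a hypothetical `client.models.providers(...)` helper that returns objects with the fields documented in the table that follows:

```python
from lunar import Lunar

client = Lunar()  # in this sketch, the key is read from LUNAR_API_KEY

# Fetch the providers configured for one model (hypothetical helper).
providers = client.models.providers(model="gpt-4o")
for provider in providers:
    status = "enabled" if provider.enabled else "disabled"
    print(f"{provider.id} ({provider.type}): {status}")
```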
Provider Object

| Field | Type | Description |
|---|---|---|
| id | str | Provider identifier |
| type | str | Provider type (primary, backup) |
| enabled | bool | Whether the provider is enabled |
| params | dict | Provider-specific parameters |
Force Specific Provider
Use the `provider/model` syntax to route requests to a specific provider:
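Assuming the same OpenAI-style `chat.completions.create(...)` surface as above, forcing a provider is just a matter of the model prefix:

```python
from lunar import Lunar

client = Lunar()

# The "openai/" prefix pins this request to the OpenAI provider.
response = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```

Without the prefix (`model="gpt-4o"`), the API is free to pick any enabled provider for the model.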
When to Force Providers
| Scenario | Recommendation |
|---|---|
| General use | Let the API choose (no prefix) |
| Testing specific provider | Use provider prefix |
| Compliance requirements | Use provider prefix |
| Comparing providers | Use provider prefix |
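For the comparison case, a short sketch under the same assumed client shape, timing the identical prompt across two providers:

```python
import time

from lunar import Lunar

client = Lunar()
prompt = [{"role": "user", "content": "Summarize TCP in one sentence."}]

# Run the same request against two providers and compare latency/output.
for model in ("openai/gpt-4o", "anthropic/claude-sonnet-4"):
    start = time.perf_counter()
    response = client.chat.completions.create(model=model, messages=prompt)
    elapsed = time.perf_counter() - start
    print(f"{model} ({elapsed:.2f}s): {response.choices[0].message.content[:80]}")
```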
Common Providers
| Provider | Prefix | Highlights |
|---|---|---|
| OpenAI | openai/ | GPT-4o, GPT-5, o1, o3 reasoning models |
| Anthropic | anthropic/ | Claude 3, Claude 4, Claude 4.5 series |
| Google | google/ | Gemini 2.0, 2.5, 3 series |
| Mistral | mistral/ | Mistral, Magistral, Codestral |
| DeepSeek | deepseek/ | Chat and reasoning models |
| Perplexity | perplexity/ | Sonar search models |
| Groq | groq/ | Ultra-fast inference (Llama, Qwen) |
| Cerebras | cerebras/ | Fast inference (Llama, Qwen) |
| Cohere | cohere/ | Command R models |
| AWS Bedrock | bedrock/ | Nova, Llama, Mistral |
Async Usage
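Assuming the SDK follows the common sync/async split, an async client would mirror the synchronous surface with awaitable calls; the `AsyncLunar` name is an assumption:

```python
import asyncio

from lunar import AsyncLunar  # assumed async client name

async def main() -> None:
    client = AsyncLunar()

    # Same calls as the sync client, but awaitable.
    models = await client.models.list()
    print([m.id for m in models.data])

    response = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Ping?"}],
    )
    print(response.choices[0].message.content)

asyncio.run(main())
```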
Finding the Right Model
Model Selection Guidelines
| Need | Recommended Models |
|---|---|
| Low cost | gpt-4o-mini, claude-3-haiku, llama-3.1-8b |
| Fast responses | claude-3-haiku, gpt-4o-mini |
| Complex reasoning | gpt-4o, claude-3-7-sonnet, o3 |
| Code generation | gpt-4o, claude-sonnet-4, codestral-latest |
| Large context | claude-3-7-sonnet (200K), gpt-4o (128K), gemini-2.5-pro (1M) |
| Research | sonar-deep-research, o3-deep-research |
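To shortlist candidates programmatically, you can filter the model catalog; this sketch reuses the assumed `models.list()` call, and the `find_models` helper is purely illustrative:

```python
from lunar import Lunar

client = Lunar()

def find_models(keyword: str) -> list[str]:
    # Return all model IDs containing the keyword.
    return [m.id for m in client.models.list().data if keyword in m.id]

print(find_models("mini"))   # low-cost / fast candidates
print(find_models("sonar"))  # search and research candidates
```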
Custom Deployments
Deploy your own models on PureAI’s GPU infrastructure. When you create a deployment, a unique model ID is generated that you can use directly with the Lunar SDK.

PureAI supports all text generation models compatible with vLLM. Deploy any open-source model from Hugging Face or your own fine-tuned models.
How It Works
- Create a deployment on the PureAI Console
- Select your model and GPU tier
- Get your unique model ID (e.g., `DeepSeek-R1-Distill-Llama-8B`)
- Use `pureai/{model-id}` in your requests, as shown below
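Once the deployment is live, requests look the same as for any other model; assuming the OpenAI-style client from earlier, only the `pureai/` prefix and your deployment's model ID are specific here:

```python
from lunar import Lunar

client = Lunar()

# Route the request to your own deployment via the pureai/ prefix.
response = client.chat.completions.create(
    model="pureai/DeepSeek-R1-Distill-Llama-8B",  # your deployment's model ID
    messages=[{"role": "user", "content": "Hello from my deployment!"}],
)
print(response.choices[0].message.content)
```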
GPU Tiers & Pricing
GPU XS - Small Models (7B-13B)

GPU XS (g6.xlarge)
| Spec | Value |
|---|---|
| GPU | NVIDIA L4 |
| VRAM | 24GB |
| vCPUs | 4 |
| RAM | 16GB |
| Storage | 250GB NVMe |
| Network | 10 Gbps |
| Spot Price | ~$0.20/h |
| Model Size | 7B - 13B |

GPU XS 2x (g6.2xlarge)

| Spec | Value |
|---|---|
| GPU | NVIDIA L4 |
| VRAM | 24GB |
| vCPUs | 8 |
| RAM | 32GB |
| Storage | 450GB NVMe |
| Network | 15 Gbps |
| Spot Price | ~$0.35/h |
| Model Size | 7B - 13B |
GPU S - Medium Models (13B-34B)

GPU S (g6e.xlarge)
| Spec | Value |
|---|---|
| GPU | NVIDIA L40S |
| VRAM | 48GB |
| vCPUs | 4 |
| RAM | 16GB |
| Storage | 250GB NVMe |
| Network | 10 Gbps |
| Spot Price | ~$0.60/h |
| Model Size | 13B - 34B |

GPU S 2x (g6e.2xlarge)

| Spec | Value |
|---|---|
| GPU | NVIDIA L40S |
| VRAM | 48GB |
| vCPUs | 8 |
| RAM | 32GB |
| Storage | 450GB NVMe |
| Network | 15 Gbps |
| Spot Price | ~$1.00/h |
| Model Size | 13B - 34B |
GPU M - Large Models INT4 (30B-70B)

GPU M (g5.12xlarge)
| Spec | Value |
|---|---|
| GPU | 4x NVIDIA A10G |
| VRAM | 96GB |
| vCPUs | 48 |
| RAM | 192GB |
| Storage | 3.8TB NVMe |
| Network | 40 Gbps |
| Spot Price | ~$1.80/h |
| Model Size | 30B - 70B INT4 |

GPU M 2x (g5.24xlarge)

| Spec | Value |
|---|---|
| GPU | 4x NVIDIA A10G |
| VRAM | 96GB |
| vCPUs | 96 |
| RAM | 384GB |
| Storage | 3.8TB NVMe |
| Network | 50 Gbps |
| Spot Price | ~$3.00/h |
| Model Size | 30B - 70B INT4 |

GPU M 4x (g5.48xlarge)

| Spec | Value |
|---|---|
| GPU | 8x NVIDIA A10G |
| VRAM | 192GB |
| vCPUs | 192 |
| RAM | 768GB |
| Storage | 7.6TB NVMe |
| Network | 100 Gbps |
| Spot Price | ~$5.00/h |
| Model Size | 30B - 70B INT4 |
GPU L - Large Models FP16 (70B)

GPU L (g6e.12xlarge)
| Spec | Value |
|---|---|
| GPU | 4x NVIDIA L40S |
| VRAM | 192GB |
| vCPUs | 48 |
| RAM | 384GB |
| Storage | 3.8TB NVMe |
| Network | 40 Gbps |
| Spot Price | ~$3.50/h |
| Model Size | 70B FP16 |

GPU L 2x (g6e.24xlarge)

| Spec | Value |
|---|---|
| GPU | 4x NVIDIA L40S |
| VRAM | 192GB |
| vCPUs | 96 |
| RAM | 768GB |
| Storage | 3.8TB NVMe |
| Network | 50 Gbps |
| Spot Price | ~$6.00/h |
| Model Size | 70B FP16 |

GPU L 4x (g6e.48xlarge)

| Spec | Value |
|---|---|
| GPU | 8x NVIDIA L40S |
| VRAM | 384GB |
| vCPUs | 192 |
| RAM | 1536GB |
| Storage | 7.6TB NVMe |
| Network | 100 Gbps |
| Spot Price | ~$10.00/h |
| Model Size | 70B FP16 |
GPU XL - Extra Large Models (70B-180B)

GPU XL (p4d.24xlarge)
| Spec | Value |
|---|---|
| GPU | 8x NVIDIA A100 (40GB) |
| VRAM | 320GB |
| vCPUs | 96 |
| RAM | 1152GB |
| Storage | 8TB NVMe |
| Network | 400 Gbps EFA |
| Spot Price | ~$12.00/h |
| Model Size | 70B - 180B |

GPU XL 80GB (p4de.24xlarge)

| Spec | Value |
|---|---|
| GPU | 8x NVIDIA A100 (80GB) |
| VRAM | 640GB |
| vCPUs | 96 |
| RAM | 1152GB |
| Storage | 8TB NVMe |
| Network | 400 Gbps EFA |
| Spot Price | ~$18.00/h |
| Model Size | 70B - 180B+ |
GPU XXL - Frontier Models (405B)

GPU XXL H100 (p5.48xlarge)
| Spec | Value |
|---|---|
| GPU | 8x NVIDIA H100 (80GB) |
| VRAM | 640GB |
| vCPUs | 192 |
| RAM | 2048GB |
| Storage | 8TB NVMe |
| Network | 3200 Gbps EFA v2 |
| Spot Price | ~$20.00/h |
| Model Size | 405B FP8 |

GPU XXL H200 (p5e.48xlarge)

| Spec | Value |
|---|---|
| GPU | 8x NVIDIA H200 (141GB) |
| VRAM | 1128GB |
| vCPUs | 192 |
| RAM | 2048GB |
| Storage | 8TB NVMe |
| Network | 3200 Gbps EFA v2 |
| Spot Price | ~$30.00/h |
| Model Size | 405B FP16 |
Tier Selection Guide
| Model Size | Recommended Tier | Example Models |
|---|---|---|
| 7B - 13B | GPU XS | Llama 3.1 8B, Mistral 7B, Qwen 7B |
| 13B - 34B | GPU S | CodeLlama 34B, Mixtral 8x7B |
| 30B - 70B INT4 | GPU M | Llama 3.1 70B (INT4), DeepSeek 67B |
| 70B FP16 | GPU L | Llama 3.1 70B (FP16), Qwen 72B |
| 70B - 180B | GPU XL | Falcon 180B, DBRX |
| 405B | GPU XXL | Llama 3.1 405B |
Available Models
OpenAI
GPT-4o Series
- `gpt-4o` - Latest multimodal flagship model
- `gpt-4o-mini` - Fast and cost-effective
- `gpt-4o-2024-05-13` - Specific dated version
- `gpt-4o-search-preview` - With web search
- `gpt-4o-mini-search-preview` - Mini with web search

GPT-4.1 Series
- `gpt-4.1` - Enhanced GPT-4
- `gpt-4.1-mini` - Smaller variant
- `gpt-4.1-nano` - Fastest variant

GPT-5 Series
- `gpt-5` - Next generation model
- `gpt-5-pro` - Professional tier
- `gpt-5-mini` - Smaller variant
- `gpt-5-nano` - Fastest variant
- `gpt-5-chat-latest` - Latest chat version
- `gpt-5-codex` - Code specialized
- `gpt-5-search-api` - With search capabilities

o1 Series
- `o1` - Advanced reasoning model
- `o1-mini` - Smaller reasoning model
- `o1-pro` - Professional reasoning

o3 Series
- `o3` - Latest reasoning model
- `o3-mini` - Smaller variant
- `o3-deep-research` - Deep research capabilities

o4 Series
- `o4-mini` - Compact reasoning
- `o4-mini-deep-research` - With deep research

Codex
- `codex-mini-latest` - Code generation
Anthropic
Claude 4.5 Series
- `claude-opus-4-5` - Most capable, complex tasks
- `claude-sonnet-4-5` - Balanced performance
- `claude-haiku-4-5` - Fast and efficient

Claude 4 Series
- `claude-opus-4` - High capability
- `claude-opus-4-1` - Updated variant
- `claude-sonnet-4` - Balanced model

Claude 3.7 Series
- `claude-3-7-sonnet` - Latest Sonnet (200K context)

Claude 3.5 Series
- `claude-3-5-haiku` - Fast responses

Claude 3 Series
- `claude-3-opus` - Most capable v3
- `claude-3-haiku` - Fast and affordable
Google (Gemini)
Gemini 3 Preview
- `gemini-3-pro-preview` - Next gen professional
- `gemini-3-flash-preview` - Next gen fast

Gemini 2.5 Series
- `gemini-2.5-pro` - Professional tier (1M context)
- `gemini-2.5-flash` - Fast responses
- `gemini-2.5-flash-lite` - Lightweight version

Gemini 2.0 Series
- `gemini-2.0-flash` - Fast multimodal
- `gemini-2.0-flash-lite` - Lightweight version
Mistral
Core Models
- `mistral-large-latest` - Most capable
- `mistral-medium-latest` - Balanced
- `mistral-small-latest` - Fast and efficient

Magistral Series
- `magistral-medium-latest` - Reasoning medium
- `magistral-small-latest` - Reasoning small

Codestral
- `codestral-latest` - Code specialized

Open Models
- `open-mistral-nemo` - Open source variant
DeepSeek
- `deepseek-chat` - General chat model
- `deepseek-reasoner` - Advanced reasoning
Perplexity (Sonar)
- `sonar` - Base search model
- `sonar-pro` - Enhanced search
- `sonar-reasoning` - With reasoning
- `sonar-reasoning-pro` - Pro reasoning
- `sonar-deep-research` - In-depth research
Groq
Llama Models
- `groq-llama-3.1-8b-instant` - Ultra fast 8B
- `groq-llama-3.3-70b-versatile` - Versatile 70B
- `groq-llama-4-scout` - Scout model
- `groq-llama-4-maverick` - Maverick model

Qwen Models
- `groq-qwen3-32b` - Qwen 32B

GPT OSS Models
- `groq-gpt-oss-20b` - Open source 20B
- `groq-gpt-oss-120b` - Open source 120B

Kimi Models
- `groq-kimi-k2-instruct` - Kimi K2
Cerebras
Llama Models
- `cerebras-llama-3.1-8b` - Fast 8B
- `cerebras-llama-3.3-70b` - Fast 70B

Qwen Models
- `cerebras-qwen-3-32b` - Qwen 32B
- `cerebras-qwen-3-235b` - Qwen 235B

Other Models
- `cerebras-gpt-oss-120b` - GPT OSS 120B
- `cerebras-zai-glm-4.6` - ZAI GLM
Cohere
- `cohere-command-r` - Command R base
- `cohere-command-r-plus` - Enhanced Command R
- `cohere-command-r7b` - Lightweight 7B
AWS Bedrock
Amazon Nova
- `amazon-nova-micro` - Smallest, fastest
- `amazon-nova-lite` - Lightweight
- `amazon-nova-pro` - Professional
- `amazon-nova-premier` - Most capable

Llama Models
- `llama-3-1-8b` - Llama 3.1 8B
- `llama-3-1-70b` - Llama 3.1 70B
- `llama-3-3-70b` - Llama 3.3 70B

Mistral Models
- `mistral-large-bedrock` - Mistral Large
- `mixtral-8x7b` - Mixtral MoE