
What is OpenRouter? A Guide with Practical Examples


What is OpenRouter?

OpenRouter is a platform that gives developers access to hundreds of large language models (LLMs) through a single API. Instead of maintaining, and juggling between, separate integrations for OpenAI, Anthropic, Mistral, Google, and dozens of other AI companies, we connect to one endpoint, authenticate once, and get access to more than 400 models.

OpenRouter isn’t an AI model itself. It’s a platform layer that connects to the expanding ecosystem of AI models. Think of it as a universal remote control for AI: instead of switching between remotes for your TV, sound system, and lights, OpenRouter gives you one interface that controls them all.

But why do we need OpenRouter in the first place?

Why does OpenRouter exist?

The answer lies in the transformation of the AI landscape over the past 18 months, or what OpenRouter CEO Alex Atallah calls the “Cambrian explosion” of AI models.

Do you remember the state of AI just five years ago? We had a few dominant players like Google with research breakthroughs, OpenAI building GPT-3, and a handful of academic efforts. The choice was limited, and APIs were manageable. But today, the landscape has shattered into a thousand pieces.

For developers, this creates an impossible trade-off: use the best model for every task, but manage fragmented APIs and spiraling costs. OpenRouter bridges this gap by:

  • Offering a single API to access hundreds of models from one endpoint.
  • Simplifying billing, authentication, and usage tracking.
  • Letting developers compare models, route requests automatically, and set fallbacks.
  • Reducing time, cost, and complexity in multi-model development.

In short, OpenRouter exists to unify the scattered AI ecosystem and make working with multiple LLMs as easy as calling one API.

So how does OpenRouter route your requests across hundreds of AI models?

OpenRouter architecture: How it works

When we send a request, it doesn’t go directly to OpenAI or Anthropic. Instead, it flows through OpenRouter’s infrastructure, which decides where to send it, optimizes for our preferences, and returns a unified response. Here’s the architecture of OpenRouter:

Layer 1: Client layer (your application)

Your code makes a single API call to OpenRouter’s unified endpoint. You send a prompt, specify a model (or let OpenRouter choose), and include any parameters like temperature, max tokens, or streaming preferences. The request looks identical to an OpenAI API call.
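For example, a minimal request body might look like the sketch below (the model name and parameter values are illustrative):

# A minimal chat-completion payload; the shape matches OpenAI's API.
payload = {
    "model": "openai/gpt-4o",  # or "openrouter/auto" to let OpenRouter choose
    "messages": [{"role": "user", "content": "Hello!"}],
    "temperature": 0.7,  # optional sampling parameters
    "max_tokens": 100
}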

Layer 2: Routing layer

This is where OpenRouter’s magic happens. When your request arrives, the routing layer analyzes it and makes a decision in milliseconds. It considers:

  • Your preferences: Did you specify a particular model, or are you asking OpenRouter to optimize?
  • Cost: Which provider offers the best price for this request?
  • Speed: Which endpoint has the lowest latency right now?
  • Availability: Is your preferred model available, or should we use a fallback?
  • Performance history: Based on past requests, which provider handles this type of task best?

The routing layer consults real-time data about provider uptime, rate limits, and performance metrics. OpenRouter’s edge-based architecture adds only ~25 milliseconds of overhead while providing automatic failover across 50+ cloud providers and global edge deployment.

Layer 3: Provider layer (OpenAI, Anthropic, Mistral, etc.)

Once the routing layer decides, your request is sent to the chosen provider. This could be OpenAI’s servers, Anthropic’s infrastructure, Mistral’s API, or any of the other 400+ models in OpenRouter’s catalog. From the provider’s perspective, they’re receiving a normal API request. They process it and return the result.

Layer 4: Response layer (unified return)

The provider’s response comes back to OpenRouter, which normalizes it into a standard format and returns it to your application.

[Image: Diagram showing OpenRouter architecture with client, routing, provider, and response layers.]
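To make Layer 4 concrete, a normalized response body looks roughly like the sketch below. It follows the OpenAI chat-completion schema; the field values here are illustrative, and exact fields can vary slightly by provider.

# Sketch of a normalized OpenRouter response (fields abridged)
{
    "id": "gen-...",                   # request identifier
    "model": "openai/gpt-4o",          # the model that actually served the request
    "choices": [
        {"message": {"role": "assistant", "content": "..."}}
    ],
    "usage": {"prompt_tokens": 12, "completion_tokens": 34, "total_tokens": 46}
}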

OpenRouter also supports fallbacks, so if a provider is unavailable or slow, another model automatically handles the request. Behind the scenes, OpenRouter optimizes latency and load balancing, routing traffic dynamically to ensure speed and reliability.

Next, let’s walk through making your first API call with OpenRouter, step by step.

Make your first OpenRouter API call

Now let’s bring everything together. We’ll connect to OpenRouter, send a prompt, read the response, and handle common issues with clean, practical code. Let’s start by setting up the OpenRouter account.

Setting up the OpenRouter account

Before sending requests, you need an OpenRouter account, an API key, and a model selection.

1. Visit the official OpenRouter website and sign in (or sign up) using your email ID.

2. On the dashboard, navigate to the “Keys” option.

[Image: OpenRouter dashboard highlighting the Keys section.]

3. Select the “Create API Key” option.

[Image: The “Create API Key” button on the OpenRouter website.]

4. Give the API key a name and select the “Create” option.

[Image: Naming a new API key before creating it.]

5. You’ll get your API key in a dialog box. Copy this key and keep it secure.

[Image: Dialog box displaying the newly generated API key.]
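Before moving on, store the key somewhere safer than your source code. The environment variable name OPENROUTER_API_KEY below is our own convention, not an OpenRouter requirement; any name works as long as your code reads the same one:

import os

# Set it in your shell first, e.g.: export OPENROUTER_API_KEY="sk-or-..."
API_KEY = os.environ["OPENROUTER_API_KEY"]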

Make a basic OpenRouter API call

Let’s start with a basic request. This Python code sends a prompt to OpenRouter and gets a response.

import requests
import os

# Set your API key (use environment variables in production)
API_KEY = os.getenv("OPENROUTER_API_KEY", "your-api-key-here")
BASE_URL = "https://openrouter.ai/api/v1"

def make_basic_call():
    headers = {
        "Authorization": f"Bearer {API_KEY}",
        "HTTP-Referer": "https://yourapp.com",  # Optional: helps with attribution
        "X-Title": "My First OpenRouter App",   # Optional: shows up in OpenRouter dashboard
    }
    payload = {
        "model": "openai/gpt-4o",
        "messages": [
            {
                "role": "user",
                "content": "Explain quantum computing in one sentence."
            }
        ],
        "temperature": 0.7,
        "max_tokens": 100
    }
    response = requests.post(
        f"{BASE_URL}/chat/completions",
        headers=headers,
        json=payload
    )
    result = response.json()
    if response.status_code == 200:
        print("Response:", result["choices"][0]["message"]["content"])
        print(f"Tokens used: {result['usage']['total_tokens']}")
    else:
        print(f"Error: {result}")

# Run it
make_basic_call()

In this code:

  • We’re sending a request to the same endpoint OpenAI uses (OpenRouter handles the routing)
  • The model parameter specifies GPT-4o
  • The request structure is identical to OpenAI’s API
  • The response includes the generated text and token usage

A sample output for this code is as follows:

[Image: Sample output from the basic OpenRouter API call.]

Using the OpenRouter auto-router feature

Instead of specifying a model, we can let OpenRouter choose based on our criteria. This is where intelligent routing shines. This can be done by setting the model parameter to openrouter/auto:

import requests

API_KEY = "your-api-key-here"
BASE_URL = "https://openrouter.ai/api/v1"

def auto_route_by_cost():
    headers = {
        "Authorization": f"Bearer {API_KEY}",
        "HTTP-Referer": "https://yourapp.com"
    }
    payload = {
        "model": "openrouter/auto",  # Let OpenRouter decide
        "routing": {
            "strategy": "cost"  # Pick the cheapest model
        },
        "messages": [
            {
                "role": "user",
                "content": "Write a product description for a smart water bottle."
            }
        ],
        "temperature": 0.7,
        "max_tokens": 150
    }
    response = requests.post(
        f"{BASE_URL}/chat/completions",
        headers=headers,
        json=payload
    )
    result = response.json()
    if response.status_code == 200:
        print("Response:", result["choices"][0]["message"]["content"])
        # OpenRouter includes which model was used
        print(f"Model used: {result.get('model')}")
        print("Cost optimized: ✓")
    else:
        print(f"Error: {result}")

auto_route_by_cost()

In this code:

  • "model": "openrouter/auto" tells OpenRouter to choose
  • "strategy": "cost" means pick the cheapest model that can handle the request
  • "strategy": "speed" means pick the fastest model

A sample output for this code is:

[Image: Sample auto-router output showing which model OpenRouter selected.]

Streaming responses for real-time output

For longer responses, streaming shows results in real time instead of waiting for the complete response. We’ll set the stream parameter to True:

from openai import OpenAI

client = OpenAI(
    api_key="your-api-key",
    base_url="https://openrouter.ai/api/v1"
)

response = client.chat.completions.create(
    model="openai/gpt-5",
    messages=[
        {"role": "user", "content": "Write a detailed explanation of how neural networks learn"}
    ],
    stream=True
)

for chunk in response:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="")

This code creates an OpenAI client pointing to OpenRouter, sends a streaming request to GPT-5, and prints each token as it arrives in real time.
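If you also need the full text after streaming finishes, a common pattern is to accumulate the chunks as they arrive. A small sketch building on the response object above:

# Print tokens as they stream, while also collecting the full reply
collected = []
for chunk in response:
    delta = chunk.choices[0].delta.content
    if delta is not None:
        print(delta, end="", flush=True)
        collected.append(delta)

full_reply = "".join(collected)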

Setting up fallback chains

If your primary model fails, OpenRouter automatically tries your fallbacks. This ensures your application stays online.

import requests

# API_KEY and headers as defined in the earlier examples
ENDPOINT = "https://openrouter.ai/api/v1/chat/completions"

data = {
    "model": "primary-model",  # e.g., your preferred model's ID
    "route": {
        "fallbacks": ["secondary-model", "tertiary-model"]
    },
    "messages": [
        {"role": "user", "content": "Generate a marketing slogan for an AI tool."}
    ],
    "max_tokens": 50
}

response = requests.post(ENDPOINT, headers=headers, json=data)
print(response.json()["choices"][0]["message"]["content"])

This sets up a fallback chain where OpenRouter tries the primary model first, then automatically falls back to secondary and tertiary models if the primary fails.

Error handling in OpenRouter

Production applications need robust error handling. Here’s how to handle common issues gracefully:

import time
import requests

# ENDPOINT, headers, and data as defined in the earlier examples
for attempt in range(3):
    try:
        response = requests.post(ENDPOINT, headers=headers, json=data, timeout=10)
        print(response.json()["choices"][0]["message"]["content"])
        break
    except requests.exceptions.RequestException as e:
        print(f"Attempt {attempt + 1} failed: {e}")
        time.sleep(2)
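One failure mode worth handling explicitly in production is rate limiting (HTTP 429). The sketch below retries with exponential backoff; the helper name and retry policy are our own choices, so adapt them to your needs:

import time
import requests

def post_with_backoff(url, headers, data, max_retries=5):
    """POST with exponential backoff on rate limits and network errors (a sketch)."""
    for attempt in range(max_retries):
        try:
            response = requests.post(url, headers=headers, json=data, timeout=30)
        except requests.exceptions.RequestException as e:
            print(f"Network error on attempt {attempt + 1}: {e}")
        else:
            if response.status_code == 200:
                return response.json()
            if response.status_code != 429:
                response.raise_for_status()  # non-rate-limit error: give up early
            print(f"Rate limited on attempt {attempt + 1}, backing off...")
        time.sleep(2 ** attempt)  # 1s, 2s, 4s, 8s, ...
    raise RuntimeError("All retries exhausted")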

You now have a complete toolkit to build production-ready applications with OpenRouter, from basic API calls and intelligent routing to streaming responses, fallback chains, and robust error handling.

Now let’s explore what happens when you want to work with data like images, PDFs, and multimodal AI.

Exploring the multimodal capabilities of OpenRouter

The OpenRouter platform supports multimodal models that can process images, PDFs, and other document types in addition to text. So, we can build applications that analyze photos, extract text from documents, generate images, and perform OCR through the same unified API.

Analyzing images

We’ll send an image URL to OpenRouter and get a detailed analysis back. The model will examine the image and respond to your questions about it:

import requests

API_KEY = "your-api-key-here"
BASE_URL = "https://openrouter.ai/api/v1"

def analyze_image_from_url():
    headers = {
        "Authorization": f"Bearer {API_KEY}",
        "HTTP-Referer": "https://yourapp.com"
    }
    payload = {
        "model": "openai/gpt-4o",
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": "What's in this image? Describe it in detail."
                    },
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": "https://example.com/image.jpg"
                        }
                    }
                ]
            }
        ],
        "max_tokens": 300
    }
    response = requests.post(
        f"{BASE_URL}/chat/completions",
        headers=headers,
        json=payload
    )
    result = response.json()
    print(result["choices"][0]["message"]["content"])

analyze_image_from_url()

Base64-encoded images

For local images or sensitive data, encode the image as base64 and send it directly in the request. Here’s how that looks in JavaScript (Node.js):

const fs = require('fs');

// Read a local image and encode it as base64
const imageBuffer = fs.readFileSync('path/to/image.jpg');
const base64Image = imageBuffer.toString('base64');

const payload = {
    model: "openai/gpt-4o",
    messages: [{
        role: "user",
        content: [
            { type: "text", text: "What's in this image?" },
            { type: "image_url", image_url: { url: `data:image/jpeg;base64,${base64Image}` } }
        ]
    }],
    max_tokens: 300
};
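If you’re following along in Python instead, the equivalent request looks like this (it reuses the headers and BASE_URL setup from the earlier examples, and the image path is a placeholder):

import base64
import requests

# headers and BASE_URL as defined in the earlier examples

# Read a local image and encode it as base64
with open("path/to/image.jpg", "rb") as f:
    base64_image = base64.b64encode(f.read()).decode("utf-8")

payload = {
    "model": "openai/gpt-4o",
    "messages": [{
        "role": "user",
        "content": [
            {"type": "text", "text": "What's in this image?"},
            {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{base64_image}"}}
        ]
    }],
    "max_tokens": 300
}

response = requests.post(f"{BASE_URL}/chat/completions", headers=headers, json=payload)
print(response.json()["choices"][0]["message"]["content"])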

PDF document analysis

Extract and analyze content from PDF documents by sending the file as base64.

import base64
import requests

# headers and BASE_URL as defined in the earlier examples
pdf_base64 = base64.b64encode(open('document.pdf', 'rb').read()).decode('utf-8')

payload = {
    "model": "openai/gpt-4o",
    "messages": [{
        "role": "user",
        "content": [
            { "type": "text", "text": "Summarize this PDF in 3 bullet points." },
            { "type": "document", "source": { "type": "base64", "media_type": "application/pdf", "data": pdf_base64 } }
        ]
    }],
    "max_tokens": 300
}

response = requests.post(f"{BASE_URL}/chat/completions", headers=headers, json=payload)
print(response.json()["choices"][0]["message"]["content"])

Image generation

Generate images from text descriptions using providers like DALL-E or Stable Diffusion on OpenRouter.

payload = {
    "model": "openai/dall-e-3",
    "prompt": "A futuristic city with flying cars and neon lights",
    "size": "1024x1024",
    "n": 1
}

response = requests.post(f"{BASE_URL}/images/generations", headers=headers, json=payload)
print(response.json()["data"][0]["url"])

Multimodal capabilities give you the power to build AI applications that understand the world beyond text. Now let’s see how OpenRouter compares to other solutions in the market.

OpenRouter vs. LiteLLM: Key differences

Both OpenRouter and LiteLLM solve the same core problem: unified access to multiple LLM providers through a single API. However, they take different approaches.

Feature | OpenRouter | LiteLLM
Primary use | Hosted, unified API for multiple LLM providers | Open-source library/proxy you self-host to reach multiple providers
Model access | 400+ models from 60+ providers through one account | 100+ providers, using your own API key for each
Routing / auto selection | Yes, auto-router selects the best model per request | Yes, via its self-configured Router
Multimodal support | Yes, depends on provider | Yes, depends on provider
Billing and usage | Centralized billing with a platform fee | Paid directly to each provider; no platform fee
Latency | Slight overhead due to hosted routing | Low; runs inside your own infrastructure
Ideal for | Developers comparing multiple models, startups, enterprises | Teams wanting self-hosted control over keys, routing, and data
Fallbacks / reliability | Yes, fallback models supported | Yes, configurable fallbacks

In short, OpenRouter is a hosted service that unifies hundreds of models across providers behind one account and one bill. LiteLLM, on the other hand, is an open-source library and proxy that you run yourself, calling each provider directly with your own API keys. Choosing between them depends on whether your priority is model variety and managed simplicity (OpenRouter) or self-hosted control over keys, routing, and data (LiteLLM).

So how are developers and companies actually using OpenRouter in real-world projects?

Real-world use cases of OpenRouter

OpenRouter enables practical applications across startups, enterprises, research, and integrations. Here are some of its use cases:

  • Startups: Quickly prototype AI-powered products using multiple LLMs, paying via a single dashboard and experimenting with different providers without rewriting code.

  • Enterprise teams: Centralize billing, enforce usage caps, and manage multiple departments accessing different models seamlessly.

  • Researchers: Compare model outputs for experiments or benchmarks easily, testing performance and accuracy across providers.

  • Integrators and developers: Build complex products like chatbots, virtual assistants, or agent-based systems that leverage multiple models for text, image, etc.

  • Open-source projects: Incorporate OpenRouter into AI libraries or frameworks to give users flexible model options out-of-the-box.

By abstracting the complexity of multiple providers, OpenRouter allows teams to focus on building innovative AI applications instead of managing integrations.

Conclusion

OpenRouter is a unified API and marketplace that gives developers access to hundreds of AI models from multiple providers through a single interface. It simplifies managing multiple APIs, offering features like auto-routing, fallback models, streaming responses, and multimodal support for images, PDFs, and documents. With centralized billing, usage tracking, and provider selection, OpenRouter enables startups, enterprises, researchers, and developers to build AI-powered applications without worrying about integrations.

If you want to explore further, check out Codecademy’s Learn How to Build AI Agents course.

Frequently asked questions

1. What is the difference between OpenAI and OpenRouter?

OpenAI is a provider of specific AI models like GPT-4 and ChatGPT, while OpenRouter is a unified API and marketplace that gives access to multiple providers’ models, including OpenAI’s, through a single endpoint.

2. How do OpenRouter credits work?

OpenRouter uses a credit system to track API usage across different models and providers. Each request consumes credits based on the model and request type, simplifying billing and usage tracking compared to managing multiple provider accounts separately.
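As a back-of-the-envelope illustration, suppose a model charges $5 per million input tokens and $15 per million output tokens (hypothetical prices; check each model’s page for current rates). The cost of one request can then be estimated like this:

# Hypothetical per-token pricing, for illustration only
INPUT_PRICE_PER_M = 5.00    # USD per 1M prompt tokens
OUTPUT_PRICE_PER_M = 15.00  # USD per 1M completion tokens

prompt_tokens = 1_200
completion_tokens = 400

cost = (prompt_tokens / 1_000_000) * INPUT_PRICE_PER_M \
    + (completion_tokens / 1_000_000) * OUTPUT_PRICE_PER_M
print(f"Estimated cost: ${cost:.4f}")  # ≈ $0.0120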

3. What are the benefits of using OpenRouter?

The following are the benefits of OpenRouter:

  • Access hundreds of models from dozens of providers via a single API

  • Auto-routing selects the best model based on cost, speed, or reliability

  • Multimodal support for text, images, PDFs, and documents

  • Centralized billing and usage monitoring

  • Fallbacks for improved reliability

4. Is OpenRouter AI API free?

OpenRouter is free to sign up and experiment with, but usage typically involves model-specific costs plus a small platform fee (usually around 5%). The free tier allows limited access for testing and learning purposes.

5. What are the three types of routing in OpenRouter?

  • Manual routing: You specify exactly which model to use for each request.

  • Auto-routing: OpenRouter automatically selects the optimal model based on performance, cost, or speed.

  • Fallback routing: If the primary model fails, a secondary model is used to ensure reliability.
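For reference, here is roughly how the three styles map onto the request parameters used earlier in this article (a sketch; the model names are placeholders):

# Manual routing: name the model explicitly
payload_manual = {"model": "openai/gpt-4o", "messages": [...]}

# Auto-routing: let OpenRouter pick
payload_auto = {"model": "openrouter/auto", "messages": [...]}

# Fallback routing: primary model plus ordered fallbacks
payload_fallback = {
    "model": "primary-model",
    "route": {"fallbacks": ["secondary-model"]},
    "messages": [...],
}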

Codecademy Team

The Codecademy Team, composed of experienced educators and tech experts, is dedicated to making tech skills accessible to all. We empower learners worldwide with expert-reviewed content that develops and enhances the technical skills needed to advance and succeed in their careers.

Meet the full team

Learn more on Codecademy

  • Explore OpenAI’s API and learn how to write more effective generative AI prompts that help improve your results.
    • Beginner Friendly
      < 1 hour
  • Leverage the OpenAI API within your Python code. Learn to import OpenAI modules, use chat completion methods, and craft effective prompts.
    • With Certificate
    • Intermediate
      1 hour
  • Excel in OpenAI APIs using Python. Discover API key authentication, access to completions APIs via endpoints, model configurations, and control of creativity and response length.
    • Beginner Friendly
      2 hours