March 26, 2026 · 12 min read

OpenAI vs Claude vs Gemini — AI API Comparison for Developers

A practical comparison of OpenAI, Claude, and Gemini APIs for developers — pricing, context windows, code generation quality, function calling, vision, speed, and rate limits with code examples.


If you're integrating an LLM into a product, you have three serious options: OpenAI's GPT models, Anthropic's Claude, and Google's Gemini. Each has real strengths, real weaknesses, and pricing structures that change every few months. The marketing from all three companies is predictably useless for making an engineering decision.

This is a practical comparison based on building with all three APIs. Pricing, context windows, code quality, function calling, latency, rate limits — the things that actually matter when you're writing the integration code.

Quick Comparison Table

| Feature | OpenAI (GPT-4.1) | Claude (Opus 4) | Gemini (2.5 Pro) |
| --- | --- | --- | --- |
| Max context | 1M tokens | 200K (1M for Opus 4) | 1M tokens |
| Input price (per 1M tokens) | $2.00 | $15.00 | $1.25 - $2.50 |
| Output price (per 1M tokens) | $8.00 | $75.00 | $10.00 |
| Cached input price | $0.50 | $1.50 | $0.315 |
| Vision | Yes | Yes | Yes (+ video) |
| Function calling | Yes | Yes (tool_use) | Yes |
| Streaming | Yes | Yes | Yes |
| Batch API | Yes | Yes (Message Batches) | No |
| Fine-tuning | Yes | No | Yes |
| Time to first token | ~300ms | ~400ms | ~250ms |
| Rate limit (free tier) | 500 RPM | 50 RPM | 1500 RPM |
Prices as of March 2026. These change frequently — check the pricing pages directly.

Calling Each API — Code Examples

OpenAI

import OpenAI from "openai";

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

// Basic completion
const response = await openai.chat.completions.create({
  model: "gpt-4.1",
  messages: [
    { role: "system", content: "You are a senior TypeScript developer." },
    { role: "user", content: "Write a debounce function with TypeScript generics." },
  ],
  temperature: 0.2,
  max_tokens: 1024,
});

console.log(response.choices[0].message.content);

// Streaming
const stream = await openai.chat.completions.create({
  model: "gpt-4.1",
  messages: [{ role: "user", content: "Explain React Server Components." }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || "");
}

Claude (Anthropic)

import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});

// Basic completion
const response = await anthropic.messages.create({
  model: "claude-opus-4-20250514",
  max_tokens: 1024,
  system: "You are a senior TypeScript developer.",
  messages: [
    { role: "user", content: "Write a debounce function with TypeScript generics." },
  ],
});

// Claude returns content blocks, not a plain string
const text = response.content
  .filter((block) => block.type === "text")
  .map((block) => block.text)
  .join("");

console.log(text);

// Streaming
const stream = anthropic.messages.stream({
  model: "claude-opus-4-20250514",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Explain React Server Components." }],
});

for await (const event of stream) {
  if (event.type === "content_block_delta" && event.delta.type === "text_delta") {
    process.stdout.write(event.delta.text);
  }
}

Gemini

import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

// Basic completion
const response = await ai.models.generateContent({
  model: "gemini-2.5-pro",
  contents: "Write a debounce function with TypeScript generics.",
  config: {
    systemInstruction: "You are a senior TypeScript developer.",
    temperature: 0.2,
    maxOutputTokens: 1024,
  },
});

console.log(response.text);

// Streaming
const streamResponse = await ai.models.generateContentStream({
  model: "gemini-2.5-pro",
  contents: "Explain React Server Components.",
});

for await (const chunk of streamResponse) {
  process.stdout.write(chunk.text ?? "");
}

Function Calling / Tool Use

All three support structured function calling. The implementations differ enough that you'll need model-specific code.

OpenAI Function Calling

const response = await openai.chat.completions.create({
  model: "gpt-4.1",
  messages: [
    { role: "user", content: "What's the weather in Tokyo and London?" },
  ],
  tools: [
    {
      type: "function",
      function: {
        name: "get_weather",
        description: "Get current weather for a city",
        parameters: {
          type: "object",
          properties: {
            city: { type: "string", description: "City name" },
            units: { type: "string", enum: ["celsius", "fahrenheit"] },
          },
          required: ["city"],
        },
      },
    },
  ],
  tool_choice: "auto",
});

// OpenAI may return multiple tool calls in one response
const toolCalls = response.choices[0].message.tool_calls;
for (const call of toolCalls ?? []) {
  const args = JSON.parse(call.function.arguments);
  console.log(`Call ${call.function.name}(${JSON.stringify(args)})`);
}

Claude Tool Use

const response = await anthropic.messages.create({
  model: "claude-opus-4-20250514",
  max_tokens: 1024,
  tools: [
    {
      name: "get_weather",
      description: "Get current weather for a city",
      input_schema: {
        type: "object",
        properties: {
          city: { type: "string", description: "City name" },
          units: { type: "string", enum: ["celsius", "fahrenheit"] },
        },
        required: ["city"],
      },
    },
  ],
  messages: [
    { role: "user", content: "What's the weather in Tokyo and London?" },
  ],
});

// Claude returns tool_use content blocks
for (const block of response.content) {
  if (block.type === "tool_use") {
    console.log(`Call ${block.name}(${JSON.stringify(block.input)})`);
    // block.id is used to send results back
  }
}

Gemini Function Calling

const response = await ai.models.generateContent({
  model: "gemini-2.5-pro",
  contents: "What's the weather in Tokyo and London?",
  config: {
    tools: [
      {
        functionDeclarations: [
          {
            name: "get_weather",
            description: "Get current weather for a city",
            parameters: {
              type: "object",
              properties: {
                city: { type: "string", description: "City name" },
                units: { type: "string", enum: ["celsius", "fahrenheit"] },
              },
              required: ["city"],
            },
          },
        ],
      },
    ],
  },
});

// Gemini returns function calls in parts
for (const part of response.candidates?.[0]?.content?.parts ?? []) {
  if (part.functionCall) {
    console.log(`Call ${part.functionCall.name}(${JSON.stringify(part.functionCall.args)})`);
  }
}

The API shapes are different, but the capability is equivalent. OpenAI's is the most mature. Claude's tool_use blocks integrate naturally with its content block architecture. Gemini's function declarations are verbose but work reliably.

Code Generation Quality

This matters if you're building a coding assistant, autocomplete, or any developer tool. After testing hundreds of prompts across various projects on codeup.dev, here's what holds up:

OpenAI GPT-4.1: Consistently solid at generating boilerplate and standard patterns. Follows instructions well. Sometimes produces code that looks correct but has subtle logical bugs — always review carefully. Excellent at refactoring existing code.

Claude Opus 4: Strongest at understanding complex codebases and multi-file changes. Tends to produce more idiomatic code that a senior dev would actually write. Better at explaining trade-offs in code decisions. Occasionally over-engineers solutions when a simple approach would work.

Gemini 2.5 Pro: Surprisingly good at long-context code tasks. If you feed it an entire repo (thanks to the 1M token window), it maintains coherence better than you'd expect. Weaker on niche framework-specific code compared to the other two. Strongest when you give it lots of context.

For pure coding tasks, the ranking shifts depending on what you're doing:

| Task | Best Model | Notes |
| --- | --- | --- |
| Boilerplate generation | GPT-4.1 | Fast, follows templates well |
| Complex refactoring | Claude Opus 4 | Best at understanding intent |
| Whole-repo analysis | Gemini 2.5 Pro | 1M context advantage |
| Bug fixing | Claude Opus 4 | Strong at reasoning about edge cases |
| API integration code | GPT-4.1 | Largest training set of API examples |
| Algorithm implementation | Tie | All three are competent |
| CSS/UI generation | GPT-4.1 | Slightly more consistent styling |

Vision Capabilities

All three handle image input. Gemini also handles video, which the others don't.

// OpenAI — image input
const response = await openai.chat.completions.create({
  model: "gpt-4.1",
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "What UI framework is this screenshot using?" },
        {
          type: "image_url",
          image_url: { url: "data:image/png;base64,..." },
        },
      ],
    },
  ],
});

// Claude — image input
const response = await anthropic.messages.create({
  model: "claude-opus-4-20250514",
  max_tokens: 1024,
  messages: [
    {
      role: "user",
      content: [
        {
          type: "image",
          source: { type: "base64", media_type: "image/png", data: "..." },
        },
        { type: "text", text: "What UI framework is this screenshot using?" },
      ],
    },
  ],
});

// Gemini — image input
const response = await ai.models.generateContent({
  model: "gemini-2.5-pro",
  contents: [
    {
      parts: [
        { inlineData: { mimeType: "image/png", data: "..." } },
        { text: "What UI framework is this screenshot using?" },
      ],
    },
  ],
});

For screenshot-to-code tasks, all three work. Claude is marginally better at understanding complex UI layouts. Gemini's video understanding opens unique use cases (analyzing screen recordings, tutorial videos).

Pricing Deep Dive

Cost calculation for a real scenario — a customer support chatbot processing 10,000 conversations/day, averaging 2,000 input tokens and 500 output tokens per conversation:

Daily token usage: 20M input + 5M output

| Provider | Model | Daily Cost | Monthly Cost |
| --- | --- | --- | --- |
| OpenAI | GPT-4.1 | $80 | $2,400 |
| Anthropic | Claude Sonnet 4 | $90 | $2,700 |
| Anthropic | Claude Opus 4 | $675 | $20,250 |
| Google | Gemini 2.5 Pro | $75 | $2,250 |
| OpenAI | GPT-4.1 mini | $12 | $360 |
| Google | Gemini 2.5 Flash | $6 | $180 |
The cost gap between flagship and mid-tier models is massive. For most production use cases, the mid-tier models (GPT-4.1 mini, Claude Sonnet, Gemini Flash) deliver 90% of the capability at 10-20% of the cost. Reserve the flagship models for tasks where quality genuinely matters more than cost — code review, complex analysis, creative work.

Caching matters. If your prompts share a long system prompt or repeated context, all three providers offer prompt caching at steep discounts (50-75% off input pricing). Architect your prompts with caching in mind.
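The per-model numbers above come straight from the price table: tokens divided by one million, times the per-million rate, summed over input and output. A small helper (illustrative, not from any SDK) makes the arithmetic explicit:

```typescript
// Illustrative helper: cost = (tokens / 1M) * price-per-1M, input + output.
function dailyCostUSD(
  inputTokens: number,
  outputTokens: number,
  inputPricePerM: number,
  outputPricePerM: number,
): number {
  return (
    (inputTokens / 1_000_000) * inputPricePerM +
    (outputTokens / 1_000_000) * outputPricePerM
  );
}

// GPT-4.1: 20M input at $2.00/1M + 5M output at $8.00/1M = $40 + $40
console.log(dailyCostUSD(20_000_000, 5_000_000, 2.0, 8.0)); // 80
// Gemini 2.5 Pro: 20M at $1.25/1M + 5M at $10.00/1M = $25 + $50
console.log(dailyCostUSD(20_000_000, 5_000_000, 1.25, 10.0)); // 75
```

Run your own traffic estimates through this before committing to a flagship model; the monthly column is just the daily figure times 30.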

Rate Limits and Reliability

| Provider | Free Tier RPM | Tier 1 RPM | Tier 5 RPM |
| --- | --- | --- | --- |
| OpenAI | 500 | 3,500 | 10,000 |
| Anthropic | 50 | 1,000 | 4,000 |
| Google | 1,500 | 2,000 | |
Anthropic's rate limits are the most restrictive, especially on the free tier. If you're building a high-throughput application, budget for a higher tier from day one.

In terms of reliability, all three have occasional outages. OpenAI has the most users and the most visible outages. Anthropic tends to be more stable but has had capacity issues during peak demand. Google's infrastructure means Gemini rarely goes fully down, but latency spikes happen.
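Whichever provider you pick, plan for 429s and transient 5xx responses. A minimal retry wrapper with exponential backoff and jitter might look like this — a sketch, not tied to any one SDK; it assumes thrown errors may carry a numeric `status` field (OpenAI's and Anthropic's SDK error classes do), so adapt the check to your client:

```typescript
// Minimal retry with exponential backoff and jitter.
// Assumes errors may expose a numeric `status` (HTTP code) — an assumption;
// check your SDK's error shape before relying on it.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 5,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      const status = (err as { status?: number }).status;
      // Only retry rate limits (429) and server errors (5xx)
      const retryable = status === 429 || (status !== undefined && status >= 500);
      if (!retryable) throw err;
      // 500ms, 1s, 2s, ... scaled by random jitter to avoid thundering herds
      const delay = baseDelayMs * 2 ** attempt * (0.5 + Math.random() / 2);
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}

// Usage: const res = await withRetry(() => openai.chat.completions.create({ ... }));
```

The jitter matters: if every client backs off on the same schedule, they all retry at once and hit the rate limit again.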

Structured Output

Need the model to return valid JSON every time? All three support it, with different approaches.

// OpenAI — response_format with JSON schema
const response = await openai.chat.completions.create({
  model: "gpt-4.1",
  messages: [{ role: "user", content: "List 3 programming languages with pros and cons" }],
  response_format: {
    type: "json_schema",
    json_schema: {
      name: "languages",
      schema: {
        type: "object",
        properties: {
          languages: {
            type: "array",
            items: {
              type: "object",
              properties: {
                name: { type: "string" },
                pros: { type: "array", items: { type: "string" } },
                cons: { type: "array", items: { type: "string" } },
              },
              required: ["name", "pros", "cons"],
            },
          },
        },
        required: ["languages"],
      },
    },
  },
});

// Claude — tool_use trick for structured output
// Claude doesn't have a dedicated JSON mode, but you can use a tool
// and extract the structured input
const response = await anthropic.messages.create({
  model: "claude-opus-4-20250514",
  max_tokens: 1024,
  tools: [
    {
      name: "output_languages",
      description: "Output the list of languages with pros and cons",
      input_schema: {
        type: "object",
        properties: {
          languages: {
            type: "array",
            items: {
              type: "object",
              properties: {
                name: { type: "string" },
                pros: { type: "array", items: { type: "string" } },
                cons: { type: "array", items: { type: "string" } },
              },
              required: ["name", "pros", "cons"],
            },
          },
        },
        required: ["languages"],
      },
    },
  ],
  tool_choice: { type: "tool", name: "output_languages" },
  messages: [
    { role: "user", content: "List 3 programming languages with pros and cons" },
  ],
});

// Gemini — responseSchema
const response = await ai.models.generateContent({
  model: "gemini-2.5-pro",
  contents: "List 3 programming languages with pros and cons",
  config: {
    responseMimeType: "application/json",
    responseSchema: {
      type: "object",
      properties: {
        languages: {
          type: "array",
          items: {
            type: "object",
            properties: {
              name: { type: "string" },
              pros: { type: "array", items: { type: "string" } },
              cons: { type: "array", items: { type: "string" } },
            },
            required: ["name", "pros", "cons"],
          },
        },
      },
      required: ["languages"],
    },
  },
});

OpenAI's JSON schema mode is the most reliable — it guarantees valid JSON matching your schema. Gemini's responseSchema works similarly. Claude requires the tool_use workaround, which works consistently but feels like a hack.

Which One Should You Pick?

Use OpenAI if:
  • You need the broadest ecosystem (most tutorials, most libraries, most community support)
  • Fine-tuning is important for your use case
  • You want the most predictable API behavior (they've had the longest to iron out edge cases)
  • You're building on Azure (Azure OpenAI gives you dedicated capacity)
Use Claude if:
  • Code quality is your primary concern (strongest at complex reasoning and coding)
  • You need long, coherent multi-turn conversations
  • Safety and alignment matter for your product (Anthropic leads here)
  • You're building developer tools or code assistants
Use Gemini if:
  • Cost is a primary concern (cheapest for high-volume applications)
  • You need the largest context window at the lowest price
  • Video understanding is part of your use case
  • You're already in the Google Cloud ecosystem
  • You want the most generous free tier for prototyping
The honest answer: for most production applications, start with the cheapest model that meets your quality bar, implement provider-agnostic abstractions in your code, and swap models as pricing and capabilities shift. The AI API landscape changes every quarter. Building your application around a single provider's API shape is a mistake you'll regret.
// Build a simple abstraction layer
interface LLMProvider {
  complete(params: {
    model: string;
    system: string;
    messages: { role: string; content: string }[];
    maxTokens: number;
  }): Promise<string>;
}

// Implement for each provider, swap via config
// This is what we do for all LLM-powered features on codeup.dev

Lock in the abstraction. Let the providers compete on price and quality. Switch when it makes sense.
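To make the pattern concrete, here's a self-contained sketch: the interface repeated from above, a stub provider, and a feature function that only ever sees the interface. All names besides `LLMProvider` are illustrative; in production each adapter would wrap one of the SDK calls shown earlier (openai.chat.completions.create, anthropic.messages.create, or ai.models.generateContent).

```typescript
// Repeating the interface from above so this example stands alone
interface LLMProvider {
  complete(params: {
    model: string;
    system: string;
    messages: { role: string; content: string }[];
    maxTokens: number;
  }): Promise<string>;
}

// A stub provider — swap in a real adapter that wraps an SDK call.
// (Illustrative only; it just echoes the last user message.)
class MockProvider implements LLMProvider {
  async complete(params: {
    model: string;
    system: string;
    messages: { role: string; content: string }[];
    maxTokens: number;
  }): Promise<string> {
    const last = params.messages[params.messages.length - 1];
    return `echo: ${last.content}`;
  }
}

// Feature code depends only on LLMProvider, so switching vendors is a
// config change, not a rewrite.
async function summarize(provider: LLMProvider, text: string): Promise<string> {
  return provider.complete({
    model: "mid-tier-model", // resolved per provider via config
    system: "Summarize the user's text in one sentence.",
    messages: [{ role: "user", content: text }],
    maxTokens: 256,
  });
}
```

A side benefit: the stub makes your LLM-dependent features testable without burning tokens or hitting rate limits.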
