Basic comparison between LLM providers
demodomain · February 13, 2025
I’ve created this table to help me remember the features and restrictions of each provider.
The information applies when accessing the models via the API; the “retail-facing” (web) products often differ.
It isn’t an exhaustive list. Use the links within the table to get the full and latest information directly from the providers.
NOTE: Currently I’m only using Google Vertex to gain access to Imagen 3. For text responses, I use the Gemini API.
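For reference, here is a minimal sketch of the kind of text-only call meant by “the Gemini API” above. The model name, the v1beta endpoint version and the GEMINI_API_KEY environment variable are assumptions; check the Gemini API reference linked in the table below for current details.

```python
import os
import requests

# Minimal text-only request to the Gemini API generateContent endpoint.
# Model name and endpoint version are assumptions; verify against the Gemini API docs.
API_KEY = os.environ["GEMINI_API_KEY"]  # assumed env var holding your API key
MODEL = "gemini-2.0-flash"

url = (
    "https://generativelanguage.googleapis.com/v1beta/"
    f"models/{MODEL}:generateContent?key={API_KEY}"
)
payload = {"contents": [{"parts": [{"text": "Explain the difference between RPM and TPM limits."}]}]}

resp = requests.post(url, json=payload, timeout=60)
resp.raise_for_status()

# The generated text sits under candidates -> content -> parts.
print(resp.json()["candidates"][0]["content"]["parts"][0]["text"])
```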
Text / Multi-modal Models:
| | OpenAI | Google Gemini | Google Gemini via Vertex | Perplexity | Anthropic Claude 3.5 Sonnet | DeepSeek |
| --- | --- | --- | --- | --- | --- | --- |
| Knowledge cut-off | December 2023 | November 2023 | Same as Gemini | “Chat” models: December 2023; “Online” models: real-time | April 2024 | Up for debate |
| Context window (tokens), “up to” input / output | GPT models: 128k / 16k; ‘o’ models: 200k / 100k (https://platform.openai.com/docs/models) | 2 million / 8k; Flash: 1 million / 8k (https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models) | Same as Gemini | 200k / 8k; reasoning models: 127,072 / 8k (https://docs.perplexity.ai/guides/model-cards) | 200k / 8k (https://docs.anthropic.com/en/docs/about-claude/models) | 128k / ? |
| Pre-cache documents | 100GB total storage per organisation (10GB max per user); 512MB (or 2 million tokens for text / document files) single-file size limit; 20MB per image; 50MB per .csv / spreadsheet, depending on the size of each row. Uploads are retained until you choose to delete them. | 20GB total storage; 2GB single-file size limit; retained for 48 hours. | Same as Gemini | No pre-caching available, but you can add files to a prompt: 25MB single-file size limit, 4 files max | No pre-caching available, but you can add files to a prompt: PDFs and images only (text files / documents not allowed!); 5MB single-file size limit, 100MB max across all files. | None. |
| Inputs other than text | Base64-encoded files; files on any publicly-accessible URL; documents; URL (ID) of a file pre-uploaded via the API | Base64-encoded files; URL (ID) of a file pre-uploaded via the API | Same as Gemini | ? | PDFs, images | Text only |
| Outputs | Images (?), Excel files (?) | Text only | Text only | Images (Tier 2 only) | Text only | Text only |
| Other features | Assistants: up to 20 files per Assistant, and unlimited Assistants. Via Assistants, the chat history and linked documents are maintained by OpenAI for up to 30 days. | Context caching: you must upload a minimum of 32,768 tokens (about 128 pages of text) in a single request and can specify how long it is cached for; you can then reference the cache in your prompt. Now has web-search features, but I haven’t looked at those yet. | Vertex is a “platform”, whereas Gemini is just a language model with an API. I’m yet to properly test the “data stores”, “agents”, and “grounding” features of Vertex, but I do want to try them out. | Returns citations; returns follow-up questions | Their API is really quite limited; “Projects” is not available via the API. | |
| File formats allowed | https://platform.openai.com/docs/assistants/tools/code-interpreter | https://ai.google.dev/gemini-api/docs/document-processing | | “Plain text, code, or PDFs, as well as images”; for images: “JPEG, HEIF, PNG, PDF” | Base64 files in PDF, JPG, PNG, WEBP formats only | None. |
| API pricing in USD per million tokens, input / output (worked example below the table) | GPT-4o: $2.50 / $10; o1: $15 / $60 (https://platform.openai.com/docs/pricing) | Gemini 2.0 Flash: $0.10 / $0.40; no pricing for the other Gemini 2.0 models yet (https://ai.google.dev/pricing#2_0flash) | Gemini 2.0 Flash: $0.15 / $0.60; no pricing for the other Gemini 2.0 models yet (https://cloud.google.com/vertex-ai/generative-ai/pricing) | Sonar Pro: $3 / $15 plus $5 per 1,000 searches; Sonar Reasoning Pro: $2 / $8 plus $5 per 1,000 searches (https://docs.perplexity.ai/guides/pricing) | $3 / $15 (https://www.anthropic.com/pricing#anthropic-api) | $0.27 / $1.10; reasoning model: $0.55 / $2.19 (https://api-docs.deepseek.com/quick_start/pricing) |
| API limits | Depends on tier: anywhere between 500 RPM (10k TPM) and 10k RPM (150 million TPM) (https://platform.openai.com/docs/guides/rate-limits#usage-tiers) | 2k RPM / 4 million TPM (https://ai.google.dev/gemini-api/docs/rate-limits#paid-tier-1) | Complicated, but roughly 4 million TPM (https://cloud.google.com/vertex-ai/generative-ai/docs/quotas) | Depends on tier: 150 to 1,000 RPM (https://docs.perplexity.ai/guides/usage-tiers) | Depends on tier: anywhere between 50 RPM (20k TPM) and 4k RPM (400k TPM) (https://docs.anthropic.com/en/api/rate-limits) | |
| URL: retail product | https://chatgpt.com/ | https://gemini.google.com/app | None | https://www.perplexity.ai/ | https://claude.ai/ | https://chat.deepseek.com/ |
| URL: API documentation / API reference | Docs: https://platform.openai.com/docs/concepts; Reference: https://platform.openai.com/docs/api-reference/introduction | Docs: https://ai.google.dev/gemini-api/docs; Reference: https://ai.google.dev/api | Docs: https://cloud.google.com/vertex-ai/docs; Reference: https://cloud.google.com/vertex-ai/docs/reference/rest | Docs: https://docs.perplexity.ai/guides/getting-started; Reference: https://docs.perplexity.ai/api-reference/chat-completions | Docs: https://docs.anthropic.com/en/home; Reference: https://docs.anthropic.com/en/api/getting-started | Docs: https://api-docs.deepseek.com/; Reference: https://api-docs.deepseek.com/api/deepseek-api |
| URL: playground | https://platform.openai.com/playground/chat | https://aistudio.google.com/app/prompts | https://console.cloud.google.com/vertex-ai | https://labs.perplexity.ai/ | https://console.anthropic.com/workbench/ | None |
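To make the pricing row concrete, here is a rough cost estimate for a hypothetical monthly workload, using the per-million-token prices copied from the table above. The prices will drift, and Perplexity’s per-search fee and any caching or batch discounts are ignored, so treat this as a sketch only.

```python
# Rough monthly cost estimate:
#   cost = (input_tokens / 1e6) * input_price + (output_tokens / 1e6) * output_price
# Prices (USD per million tokens, input / output) are copied from the table above.
PRICES = {
    "OpenAI GPT-4o":        (2.50, 10.00),
    "Gemini 2.0 Flash":     (0.10, 0.40),
    "Vertex Gemini Flash":  (0.15, 0.60),
    "Perplexity Sonar Pro": (3.00, 15.00),
    "Claude 3.5 Sonnet":    (3.00, 15.00),
    "DeepSeek (chat)":      (0.27, 1.10),
}

input_tokens = 2_000_000   # hypothetical monthly input volume
output_tokens = 500_000    # hypothetical monthly output volume

for name, (in_price, out_price) in PRICES.items():
    cost = (input_tokens / 1e6) * in_price + (output_tokens / 1e6) * out_price
    print(f"{name:<22} ~${cost:,.2f} / month")
```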
Image Generation Models:
| | Stable Diffusion 3 | DALL-E 3 | Flux Dev | SDXL Lightning (4-step) | Google Imagen 3 | Flux Pro |
| --- | --- | --- | --- | --- | --- | --- |
| Prompt with reference image | No | No | Yes | No | Yes | No |
| Output megapixels | Up to 1.05 MP (1024 x 1024) | Up to 1.84 MP (1792 x 1024) | Up to 1.05 MP (1024 x 1024) | Up to 1.64 MP (1280 x 1280) | Up to 1.05 MP (1024 x 1024) | Up to 2.07 MP (1440 x 1440) |
| Number of output images | 1 – 4 | 1 | 1 – 4 | 1 – 4 | 1 – ? | 1 |
| Rewrites your prompts | No | Yes | No | No | Likely? | Optional |
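Since Vertex is only being used here for Imagen 3, below is a rough sketch of what that request can look like as a raw Vertex predict call. The model ID (imagen-3.0-generate-001), the region, the parameter names and the bytesBase64Encoded response field are assumptions based on the general Vertex prediction API pattern; verify them against the Vertex documentation linked in the table above before relying on this.

```python
import base64
import google.auth
import google.auth.transport.requests
import requests

PROJECT_ID = "my-gcp-project"      # hypothetical GCP project ID
LOCATION = "us-central1"           # assumed region
MODEL = "imagen-3.0-generate-001"  # assumed model ID; check the Vertex model list

# Use Application Default Credentials to obtain a bearer token.
credentials, _ = google.auth.default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())

url = (
    f"https://{LOCATION}-aiplatform.googleapis.com/v1/projects/{PROJECT_ID}"
    f"/locations/{LOCATION}/publishers/google/models/{MODEL}:predict"
)
payload = {
    "instances": [{"prompt": "a watercolour map of Australia"}],
    "parameters": {"sampleCount": 1, "aspectRatio": "1:1"},
}

resp = requests.post(
    url,
    json=payload,
    headers={"Authorization": f"Bearer {credentials.token}"},
    timeout=120,
)
resp.raise_for_status()

# Each prediction is assumed to carry the image as base64-encoded bytes.
image_bytes = base64.b64decode(resp.json()["predictions"][0]["bytesBase64Encoded"])
with open("imagen_output.png", "wb") as f:
    f.write(image_bytes)
```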