Categories: Artificial Intelligence, Generative AI Tools & Strategies

Basic comparison between LLM providers

I created this table to help me remember the features and restrictions of each provider.

The information applies when accessing the models via the API. The “retail-facing” (web) products often differ.

It’s not an exhaustive list. Use the links in the table to get the full and latest information directly from the providers.

NOTE: Currently I’m only using Google Vertex to gain access to Imagen 3. For text responses, I use the Gemini API.

Text / Multi-modal Models:

	OpenAI	Google Gemini	Google Gemini via Vertex	Perplexity	Anthropic Claude 3.5 Sonnet	DeepSeek
Knowledge Cut-off	December 2023	November 2023	<–same	“Chat” models: December 2023 “Online” models: “real-time”	April 2024	Up for debate
Context Window (tokens). “Up to…” input / output	https://platform.openai.com/docs/models GPT models: 128k / 16k ‘o’ models: 200k / 100k	https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models 2 million / 8k Flash: 1 million / 8k	<–same	https://docs.perplexity.ai/guides/model-cards 200k / 8k Reasoning models: 27,072 / 8k	https://docs.anthropic.com/en/docs/about-claude/models 200k / 8k	128k / ?
Pre-cache Documents	100GB total storage per organisation (10GB max per user) 512MB (or 2 million tokens for text / document files) single file size limit 20MB per image, 50MB per .csv / spreadsheet, depending on the size of each row. Uploads are retained until you choose to delete them.	20GB total storage 2GB single file size limit retained for 48 hours.	<–same	No pre-caching available but you can add files to a prompt…25MB single file size limit, 4 files max	No pre-caching available but you can add files to a prompt… PDFs and images (text files / documents not allowed!). 5MB single file size limit, 100MB all files max.	None.
Inputs other than text	Base64 encoded files, files on any publicly accessible URL. Documents, URL (ID) to a file pre-uploaded via the API	Base64 encoded files, URL (ID) to a file pre-uploaded via the API	<–same	?	PDF, images	Text only
Outputs	images (?), Excel files (?)	Text only	Text only	images (Tier 2 only)	Text only	Text only
Other features	Assistants, where you can add up to 20 files per assistant but you can have unlimited Assistants. Via Assistants, the chat history and linked documents are maintained by OpenAI for up to 30 days.	Context Caching. You must upload a minimum of 32,768 tokens (about 128 pages of text) in a single request and you can specify how long it’s cached for. You can then reference the cache in your prompt. Now has web search features but I haven’t looked at that yet.	Vertex is a “platform” whereas Gemini is just a language model with an API. I’m yet to properly test the “data stores”, “agents”, and “grounding” features of Vertex.	Returns citations, returns follow-up questions	Their API is really quite limited. “Projects” is not available via the API.
File formats allowed	https://platform.openai.com/docs/assistants/tools/code-interpreter	https://ai.google.dev/gemini-api/docs/document-processing		“plain text, code, or PDFs, as well as images” and for images… “JPEG, HEF, PNG, PDF.”	Base64 files in PDF, JPG, PNG, WEBP formats only	None.
API pricing in USD, per million tokens; input / output	https://platform.openai.com/docs/pricing GPT-4o: $2.50 / $10 o1: $15 / $60	https://ai.google.dev/pricing#2_0flash (No pricing for Gemini 2.0 yet) Gemini 2.0 Flash: $0.10 / $0.40	https://cloud.google.com/vertex-ai/generative-ai/pricing (No pricing for Gemini 2.0 yet) Gemini 2.0 Flash: $0.15 / $0.60	https://docs.perplexity.ai/guides/pricing Sonar Pro: $3.00 / $15.00 / $5 per 1,000 searches Sonar Pro Reasoning: $2 / $8 / $5 per 1,000 searches	https://www.anthropic.com/pricing#anthropic-api $3 / $15	https://api-docs.deepseek.com/quick_start/pricing $0.27 / $1.10 Reasoning model: $0.55 / $2.19
API limits	https://platform.openai.com/docs/guides/rate-limits#usage-tiers All depends on Tier. Anywhere between 500 RPM (10k TPM) and 10k RPM (150 million TPM)	https://ai.google.dev/gemini-api/docs/rate-limits#paid-tier-1 2k RPM / 4 million TPM	https://cloud.google.com/vertex-ai/generative-ai/docs/quotas Complicated, but let’s say 4 million TPM	https://docs.perplexity.ai/guides/usage-tiers Depends on Tier. 150 -> 1,00 RPM	https://docs.anthropic.com/en/api/rate-limits Depends on Tier. Anywhere between 50 RPM (20k TPM) and 4k RPM (400k TPM)
URL: Retail-product	https://chatgpt.com/	https://gemini.google.com/app	None	https://www.perplexity.ai/	https://claude.ai/	https://chat.deepseek.com/
URL: API documentation URL: API reference	https://platform.openai.com/docs/concepts https://platform.openai.com/docs/api-reference/introduction	https://ai.google.dev/gemini-api/docs https://ai.google.dev/api	https://cloud.google.com/vertex-ai/docs https://cloud.google.com/vertex-ai/docs/reference/rest	https://docs.perplexity.ai/guides/getting-started https://docs.perplexity.ai/api-reference/chat-completions	https://docs.anthropic.com/en/home https://docs.anthropic.com/en/api/getting-started	https://api-docs.deepseek.com/ https://api-docs.deepseek.com/api/deepseek-api
URL: Playground	https://platform.openai.com/playground/chat	https://aistudio.google.com/app/prompts	https://console.cloud.google.com/vertex-ai	https://labs.perplexity.ai/	https://console.anthropic.com/workbench/	None.
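
To make the pricing row easier to compare, here is a minimal sketch that converts the per-million-token prices into the cost of a single request. The prices are copied from the table above, so they are a snapshot and will go stale. The dictionary keys and the `estimate_cost` function are names made up for this illustration and are not part of any provider's SDK. Perplexity is left out because its pricing also includes a per-search fee that this simple formula does not cover.

```python
# Prices in USD per million tokens as (input, output), copied from the table above.
# Treat them as a snapshot; check the providers' pricing pages for current values.
PRICES = {
    "openai-gpt": (2.50, 10.00),
    "openai-o1": (15.00, 60.00),
    "gemini-2.0-flash": (0.10, 0.40),
    "vertex-gemini-2.0-flash": (0.15, 0.60),
    "claude-3.5-sonnet": (3.00, 15.00),
    "deepseek": (0.27, 1.10),
    "deepseek-reasoning": (0.55, 2.19),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request (hypothetical helper)."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Example workload: a 100k-token prompt with a 2k-token response.
    for model in PRICES:
        cost = estimate_cost(model, input_tokens=100_000, output_tokens=2_000)
        print(f"{model:<26} ${cost:.4f}")
```

Output tokens cost several times more than input tokens with every provider listed, so a long-prompt, short-answer workload is priced very differently from a short-prompt, long-answer one.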

Image Generation Models:

	Stable Diffusion 3	DALL E 3	Flux Dev	Sdxl/lightning-4step	Google Imagen 3	Flux Pro
Prompt with reference image	No	No	Yes	No	Yes	No
Output megapixels	Up to 1.05mp (1024 x 1024)	Up to 1.84mp (1792 x 1024)	Up to 1.05mp (1024 x 1024)	Up to 1.64mp (1280 x 1280)	Up to 1.05mp (1024 x 1024)	Up to 2.07mp (1440 x 1440)
Number of output images	1 – 4	1	1 – 4	1 – 4	1 – ?	1
Re-writes your prompts	No	Yes	No	No	likely?	Optional
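
The megapixel figures in the table are simply width times height divided by one million. A small sketch to reproduce them from the maximum dimensions listed above; the `megapixels` helper and the `MAX_SIZES` dictionary are illustrative names, not anything taken from the image APIs.

```python
def megapixels(width: int, height: int) -> float:
    """Convert pixel dimensions to megapixels (1 MP = 1,000,000 pixels)."""
    return width * height / 1_000_000


# Maximum output dimensions as listed in the table above.
MAX_SIZES = {
    "Stable Diffusion 3": (1024, 1024),
    "DALL E 3": (1792, 1024),
    "Flux Dev": (1024, 1024),
    "Sdxl/lightning-4step": (1280, 1280),
    "Google Imagen 3": (1024, 1024),
    "Flux Pro": (1440, 1440),
}

if __name__ == "__main__":
    for name, (width, height) in MAX_SIZES.items():
        print(f"{name:<22} {width} x {height} = {megapixels(width, height):.2f} MP")
```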


Published by

demodomain

12 months ago
