Basic comparison between LLM providers
demodomain · February 13, 2025
I’ve created this table to help me remember the features and restrictions of each provider.
The information applies when accessing the models via the API; the “retail-facing” (web) products often differ.
It isn’t an exhaustive list. Use the links within the table to get the full and latest information directly from the providers.
NOTE: Currently I’m only using Google Vertex to gain access to Imagen 3. For text responses, I use the Gemini API.
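For reference, here is a minimal sketch of the kind of text-only call meant by “the Gemini API” above. The model name, the v1beta endpoint version and the GEMINI_API_KEY environment variable are assumptions; check the Gemini API reference linked in the table below for current details.

```python
import os
import requests

# Minimal text-only request to the Gemini API generateContent endpoint.
# Model name and endpoint version are assumptions; verify against the Gemini API docs.
API_KEY = os.environ["GEMINI_API_KEY"]  # assumed env var holding your API key
MODEL = "gemini-2.0-flash"

url = (
    "https://generativelanguage.googleapis.com/v1beta/"
    f"models/{MODEL}:generateContent?key={API_KEY}"
)
payload = {"contents": [{"parts": [{"text": "Explain the difference between RPM and TPM limits."}]}]}

resp = requests.post(url, json=payload, timeout=60)
resp.raise_for_status()

# The generated text sits under candidates -> content -> parts.
print(resp.json()["candidates"][0]["content"]["parts"][0]["text"])
```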
Text / Multi-modal Models:
| | OpenAI | Google Gemini | Google Gemini via Vertex | Perplexity | Anthropic Claude 3.5 Sonnet | DeepSeek |
| --- | --- | --- | --- | --- | --- | --- |
| Knowledge cut-off | December 2023 | November 2023 | Same as Gemini | “Chat” models: December 2023; “Online” models: real-time | April 2024 | Up for debate |
| Context window (tokens), “up to” input / output | GPT models: 128k / 16k; ‘o’ models: 200k / 100k (https://platform.openai.com/docs/models) | 2 million / 8k; Flash: 1 million / 8k (https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models) | Same as Gemini | 200k / 8k; reasoning models: 127,072 / 8k (https://docs.perplexity.ai/guides/model-cards) | 200k / 8k (https://docs.anthropic.com/en/docs/about-claude/models) | 128k / ? |
| Pre-cache documents | 100GB total storage per organisation (10GB max per user); 512MB (or 2 million tokens for text / document files) single-file size limit; 20MB per image; 50MB per .csv / spreadsheet, depending on the size of each row. Uploads are retained until you choose to delete them. | 20GB total storage; 2GB single-file size limit; retained for 48 hours. | Same as Gemini | No pre-caching available, but you can add files to a prompt: 25MB single-file size limit, 4 files max | No pre-caching available, but you can add files to a prompt: PDFs and images only (text files / documents not allowed!); 5MB single-file size limit, 100MB max across all files. | None. |
| Inputs other than text | Base64-encoded files; files on any publicly-accessible URL; documents; URL (ID) of a file pre-uploaded via the API | Base64-encoded files; URL (ID) of a file pre-uploaded via the API | Same as Gemini | ? | PDFs, images | Text only |
| Outputs | Images (?), Excel files (?) | Text only | Text only | Images (Tier 2 only) | Text only | Text only |
| Other features | Assistants: up to 20 files per Assistant, and unlimited Assistants. Via Assistants, the chat history and linked documents are maintained by OpenAI for up to 30 days. | Context caching: you must upload a minimum of 32,768 tokens (about 128 pages of text) in a single request and can specify how long it is cached for; you can then reference the cache in your prompt. Now has web-search features, but I haven’t looked at those yet. | Vertex is a “platform”, whereas Gemini is just a language model with an API. I’m yet to properly test the “data stores”, “agents”, and “grounding” features of Vertex, but I do want to try them out. | Returns citations; returns follow-up questions | Their API is really quite limited; “Projects” is not available via the API. | |
| File formats allowed | https://platform.openai.com/docs/assistants/tools/code-interpreter | https://ai.google.dev/gemini-api/docs/document-processing | | “Plain text, code, or PDFs, as well as images”; for images: “JPEG, HEIF, PNG, PDF” | Base64 files in PDF, JPG, PNG, WEBP formats only | None. |
| API pricing in USD per million tokens, input / output (worked example below the table) | GPT-4o: $2.50 / $10; o1: $15 / $60 (https://platform.openai.com/docs/pricing) | Gemini 2.0 Flash: $0.10 / $0.40; no pricing for the other Gemini 2.0 models yet (https://ai.google.dev/pricing#2_0flash) | Gemini 2.0 Flash: $0.15 / $0.60; no pricing for the other Gemini 2.0 models yet (https://cloud.google.com/vertex-ai/generative-ai/pricing) | Sonar Pro: $3 / $15 plus $5 per 1,000 searches; Sonar Reasoning Pro: $2 / $8 plus $5 per 1,000 searches (https://docs.perplexity.ai/guides/pricing) | $3 / $15 (https://www.anthropic.com/pricing#anthropic-api) | $0.27 / $1.10; reasoning model: $0.55 / $2.19 (https://api-docs.deepseek.com/quick_start/pricing) |
| API limits | Depends on tier: anywhere between 500 RPM (10k TPM) and 10k RPM (150 million TPM) (https://platform.openai.com/docs/guides/rate-limits#usage-tiers) | 2k RPM / 4 million TPM (https://ai.google.dev/gemini-api/docs/rate-limits#paid-tier-1) | Complicated, but roughly 4 million TPM (https://cloud.google.com/vertex-ai/generative-ai/docs/quotas) | Depends on tier: 150 to 1,000 RPM (https://docs.perplexity.ai/guides/usage-tiers) | Depends on tier: anywhere between 50 RPM (20k TPM) and 4k RPM (400k TPM) (https://docs.anthropic.com/en/api/rate-limits) | |
| URL: retail product | https://chatgpt.com/ | https://gemini.google.com/app | None | https://www.perplexity.ai/ | https://claude.ai/ | https://chat.deepseek.com/ |
| URL: API documentation / API reference | Docs: https://platform.openai.com/docs/concepts; Reference: https://platform.openai.com/docs/api-reference/introduction | Docs: https://ai.google.dev/gemini-api/docs; Reference: https://ai.google.dev/api | Docs: https://cloud.google.com/vertex-ai/docs; Reference: https://cloud.google.com/vertex-ai/docs/reference/rest | Docs: https://docs.perplexity.ai/guides/getting-started; Reference: https://docs.perplexity.ai/api-reference/chat-completions | Docs: https://docs.anthropic.com/en/home; Reference: https://docs.anthropic.com/en/api/getting-started | Docs: https://api-docs.deepseek.com/; Reference: https://api-docs.deepseek.com/api/deepseek-api |
| URL: playground | https://platform.openai.com/playground/chat | https://aistudio.google.com/app/prompts | https://console.cloud.google.com/vertex-ai | https://labs.perplexity.ai/ | https://console.anthropic.com/workbench/ | None |
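To make the pricing row concrete, here is a rough cost estimate for a hypothetical monthly workload, using the per-million-token prices copied from the table above. The prices will drift, and Perplexity’s per-search fee and any caching or batch discounts are ignored, so treat this as a sketch only.

```python
# Rough monthly cost estimate:
#   cost = (input_tokens / 1e6) * input_price + (output_tokens / 1e6) * output_price
# Prices (USD per million tokens, input / output) are copied from the table above.
PRICES = {
    "OpenAI GPT-4o":        (2.50, 10.00),
    "Gemini 2.0 Flash":     (0.10, 0.40),
    "Vertex Gemini Flash":  (0.15, 0.60),
    "Perplexity Sonar Pro": (3.00, 15.00),
    "Claude 3.5 Sonnet":    (3.00, 15.00),
    "DeepSeek (chat)":      (0.27, 1.10),
}

input_tokens = 2_000_000   # hypothetical monthly input volume
output_tokens = 500_000    # hypothetical monthly output volume

for name, (in_price, out_price) in PRICES.items():
    cost = (input_tokens / 1e6) * in_price + (output_tokens / 1e6) * out_price
    print(f"{name:<22} ~${cost:,.2f} / month")
```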
Image Generation Models:
| | Stable Diffusion 3 | DALL-E 3 | Flux Dev | SDXL Lightning (4-step) | Google Imagen 3 | Flux Pro |
| --- | --- | --- | --- | --- | --- | --- |
| Prompt with reference image | No | No | Yes | No | Yes | No |
| Output megapixels | Up to 1.05 MP (1024 x 1024) | Up to 1.84 MP (1792 x 1024) | Up to 1.05 MP (1024 x 1024) | Up to 1.64 MP (1280 x 1280) | Up to 1.05 MP (1024 x 1024) | Up to 2.07 MP (1440 x 1440) |
| Number of output images | 1 – 4 | 1 | 1 – 4 | 1 – 4 | 1 – ? | 1 |
| Rewrites your prompts | No | Yes | No | No | Likely? | Optional |
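Since Vertex is only being used here for Imagen 3, below is a rough sketch of what that request can look like as a raw Vertex predict call. The model ID (imagen-3.0-generate-001), the region, the parameter names and the bytesBase64Encoded response field are assumptions based on the general Vertex prediction API pattern; verify them against the Vertex documentation linked in the table above before relying on this.

```python
import base64
import google.auth
import google.auth.transport.requests
import requests

PROJECT_ID = "my-gcp-project"      # hypothetical GCP project ID
LOCATION = "us-central1"           # assumed region
MODEL = "imagen-3.0-generate-001"  # assumed model ID; check the Vertex model list

# Use Application Default Credentials to obtain a bearer token.
credentials, _ = google.auth.default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())

url = (
    f"https://{LOCATION}-aiplatform.googleapis.com/v1/projects/{PROJECT_ID}"
    f"/locations/{LOCATION}/publishers/google/models/{MODEL}:predict"
)
payload = {
    "instances": [{"prompt": "a watercolour map of Australia"}],
    "parameters": {"sampleCount": 1, "aspectRatio": "1:1"},
}

resp = requests.post(
    url,
    json=payload,
    headers={"Authorization": f"Bearer {credentials.token}"},
    timeout=120,
)
resp.raise_for_status()

# Each prediction is assumed to carry the image as base64-encoded bytes.
image_bytes = base64.b64decode(resp.json()["predictions"][0]["bytesBase64Encoded"])
with open("imagen_output.png", "wb") as f:
    f.write(image_bytes)
```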