| OpenAI | Google Gemini | Google Gemini via Vertex | Perplexity | Anthropic Claude 3.5 Sonnet | DeepSeek |
Knowledge Cut-off | December 2023 | November 2023 | <–same | “Chat” models: December 2023; “Online” models: “real-time” | April 2024 | Up for debate |
Context Window (tokens). “Up to…” input / output | https://platform.openai.com/docs/models
GPT models: 128k / 16k
‘o’ models: 200k / 100k | https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models
2 million / 8k
Flash: 1 million / 8k | <–same | https://docs.perplexity.ai/guides/model-cards
200k / 8k
Reasoning models: 27,072 / 8k | https://docs.anthropic.com/en/docs/about-claude/models
200k / 8k | 128k / ? |
Pre-cache Documents | 100GB total storage per organisation (10GB max per user). Single file size limit: 512MB (or 2 million tokens for text / document files), 20MB per image, and 50MB per .csv / spreadsheet, depending on the size of each row. Uploads are retained until you choose to delete them. | 20GB total storage; 2GB single file size limit; files are retained for 48 hours. | <–same | No pre-caching available, but you can add files to a prompt… 25MB single file size limit, 4 files max. | No pre-caching available, but you can add files to a prompt… PDFs and images only (text files / documents not allowed!). 5MB single file size limit, 100MB max across all files. | None. |
Inputs other than text | Base64-encoded files, files on any publicly accessible URL, documents, URL (ID) to a file pre-uploaded via the API | Base64-encoded files, URL (ID) to a file pre-uploaded via the API | <–same | ? | PDF, images | Text only |
Outputs | images (?), Excel files (?) | Text only | Text only | images (Tier 2 only) | Text only | Text only |
Other features | Assistants: you can add up to 20 files per Assistant, and there is no limit on the number of Assistants.
Via Assistants, the chat history and linked documents are maintained by OpenAI for up to 30 days. | Context Caching. You must upload a minimum of 32,768 tokens (about 128 pages of text) in a single request and you can specify how long it’s cached for. You can then reference the cache in your prompt.
Now has web search features but I haven’t looked at that yet. | Vertex is a “platform” whereas Gemini is just a language model with an API.
I’m yet to properly test the “data stores”, “agents”, and “grounding” features of Vertex. | Returns citations and follow-up questions. | Their API is really quite limited. “Projects” is not available via the API. | |
File formats allowed | https://platform.openai.com/docs/assistants/tools/code-interpreter | https://ai.google.dev/gemini-api/docs/document-processing | | “plain text, code, or PDFs, as well as images” and for images… “JPEG, HEF, PNG, PDF.” | Base64 files in PDF, JPG, PNG, WEBP formats only | None. |
API pricing in USD, per million tokens; input / output | https://platform.openai.com/docs/pricing
GPT-4o: $2.50 / $10
o1: $15 / $60 | https://ai.google.dev/pricing#2_0flash
(No pricing for Gemini 2.0 yet)
Gemini 2.0 Flash: $0.10 / $0.40 | https://cloud.google.com/vertex-ai/generative-ai/pricing
(No pricing for Gemini 2.0 yet)
Gemini 2.0 Flash: $0.15 / $0.60 | https://docs.perplexity.ai/guides/pricing
Sonar Pro: $3.00 / $15.00, plus $5 per 1,000 searches
Sonar Pro Reasoning: $2 / $8, plus $5 per 1,000 searches | https://www.anthropic.com/pricing#anthropic-api
$3 / $15 | https://api-docs.deepseek.com/quick_start/pricing
$0.27 / $1.10
Reasoning model: $0.55 / $2.19 |
API limits | https://platform.openai.com/docs/guides/rate-limits#usage-tiers
All depends on Tier. Anywhere between 500 RPM (10k TPM) and 10k RPM (150 million TPM)
| https://ai.google.dev/gemini-api/docs/rate-limits#paid-tier-1
2k RPM / 4 million TPM | https://cloud.google.com/vertex-ai/generative-ai/docs/quotas
Complicated, but let’s say 4 million TPM | https://docs.perplexity.ai/guides/usage-tiers
Depends on Tier. 150 -> 1,000 RPM | https://docs.anthropic.com/en/api/rate-limits
Depends on Tier. Anywhere between 50 RPM (20k TPM) and 4k RPM (400k TPM) | |
URL: Retail-product | https://chatgpt.com/ | https://gemini.google.com/app | None | https://www.perplexity.ai/ | https://claude.ai/ | https://chat.deepseek.com/ |
URL: API documentation
URL: API reference | https://platform.openai.com/docs/concepts
https://platform.openai.com/docs/api-reference/introduction | https://ai.google.dev/gemini-api/docs
https://ai.google.dev/api | https://cloud.google.com/vertex-ai/docs
https://cloud.google.com/vertex-ai/docs/reference/rest | https://docs.perplexity.ai/guides/getting-started
https://docs.perplexity.ai/api-reference/chat-completions | https://docs.anthropic.com/en/home
https://docs.anthropic.com/en/api/getting-started | https://api-docs.deepseek.com/
https://api-docs.deepseek.com/api/deepseek-api |
URL: Playground | https://platform.openai.com/playground/chat | https://aistudio.google.com/app/prompts | https://console.cloud.google.com/vertex-ai | https://labs.perplexity.ai/ | https://console.anthropic.com/workbench/ | None. |
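
The pricing row quotes USD per million tokens, so the cost of one request is (input tokens × input price + output tokens × output price) / 1,000,000. Here is a minimal Python sketch of that arithmetic, assuming the prices in the table are still current. The dictionary labels and the `estimate_cost` helper are illustrative names, not part of any provider's SDK, and Perplexity's extra $5 per 1,000 searches is left out.

```python
from typing import Dict, Tuple

# USD per million tokens as (input, output), copied from the pricing row above.
# Labels are illustrative. Perplexity's per-search fee is not modelled here.
PRICES: Dict[str, Tuple[float, float]] = {
    "OpenAI GPT": (2.50, 10.00),
    "OpenAI o1": (15.00, 60.00),
    "Gemini 2.0 Flash": (0.10, 0.40),
    "Gemini 2.0 Flash via Vertex": (0.15, 0.60),
    "Perplexity Sonar Pro": (3.00, 15.00),
    "Claude 3.5 Sonnet": (3.00, 15.00),
    "DeepSeek": (0.27, 1.10),
    "DeepSeek reasoning": (0.55, 2.19),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD of a single request."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: a 100,000-token prompt with a 2,000-token reply.
    workload = (100_000, 2_000)
    # List the models from cheapest to most expensive for this workload.
    for model in sorted(PRICES, key=lambda m: estimate_cost(m, *workload)):
        print(f"{model}: ${estimate_cost(model, *workload):.4f}")
```

For that workload, Claude 3.5 Sonnet comes to (100,000 × 3 + 2,000 × 15) / 1,000,000 = $0.33. Input tokens dominate the bill whenever large documents are sent with each prompt, which is where the pre-caching and context-caching features in the table start to matter.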