Basic comparison between LLM providers

I’ve put this comparison together to help me remember the features and restrictions of each provider.

The information applies to accessing the models via the API. The “retail-facing” (web) products often differ.

It’s not an exhaustive list. Use the links below to get the full / latest information directly from the providers.

NOTE: Currently I’m only using Google Vertex to gain access to Imagen 3. For text responses, I use the Gemini API.
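For reference, here’s a minimal sketch of the kind of text request I mean, assuming the Gemini API’s v1beta generateContent REST endpoint; the model name and the GEMINI_API_KEY environment variable are placeholders, so check the API reference linked below for current details.

```python
# Minimal sketch: a text-only request to the Gemini API (not Vertex).
# Assumes the v1beta REST endpoint and an API key in GEMINI_API_KEY.
import os
import requests

API_KEY = os.environ["GEMINI_API_KEY"]   # assumed env var name
MODEL = "gemini-1.5-flash"               # placeholder; use whichever model you're on

url = (
    "https://generativelanguage.googleapis.com/v1beta/"
    f"models/{MODEL}:generateContent?key={API_KEY}"
)
body = {"contents": [{"parts": [{"text": "Explain the difference between RPM and TPM limits."}]}]}

resp = requests.post(url, json=body, timeout=60)
resp.raise_for_status()
# The reply text sits under candidates -> content -> parts in the JSON response.
print(resp.json()["candidates"][0]["content"]["parts"][0]["text"])
```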

Text / Multi-modal Models:

Providers compared: OpenAI, Google Gemini, Google Gemini via Vertex, Perplexity, Anthropic Claude 3.5 Sonnet, DeepSeek.

Knowledge cut-off:
- OpenAI: December 2023
- Google Gemini: November 2023
- Gemini via Vertex: same as Gemini
- Perplexity: “Chat” models: December 2023; “Online” models: “real-time”
- Anthropic Claude 3.5 Sonnet: April 2024
- DeepSeek: up for debate
Context window (tokens), “up to…” input / output:
- OpenAI: GPT models: 128k / 16k; ‘o’ models: 200k / 100k. See https://platform.openai.com/docs/models
- Google Gemini: 2 million / 8k; Flash: 1 million / 8k. See https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models
- Gemini via Vertex: same as Gemini
- Perplexity: 200k / 8k; reasoning models: 27,072 / 8k. See https://docs.perplexity.ai/guides/model-cards
- Anthropic Claude 3.5 Sonnet: 200k / 8k. See https://docs.anthropic.com/en/docs/about-claude/models
- DeepSeek: 128k / ?
Pre-cache documents:
- OpenAI: 100GB total storage per organisation (10GB max per user); 512MB (or 2 million tokens for text / document files) single-file size limit; 20MB per image, 50MB per .csv / spreadsheet, depending on the size of each row. Uploads are retained until you choose to delete them.
- Google Gemini: 20GB total storage; 2GB single-file size limit; retained for 48 hours.
- Gemini via Vertex: same as Gemini
- Perplexity: no pre-caching available, but you can add files to a prompt… 25MB single-file size limit, 4 files max.
- Anthropic Claude 3.5 Sonnet: no pre-caching available, but you can add files to a prompt… PDFs and images only (text files / documents not allowed!); 5MB single-file size limit, 100MB max across all files.
- DeepSeek: none.
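As a concrete example of the pre-caching above, here’s a hedged sketch of uploading a document via the Gemini Files API, assuming the google-generativeai Python SDK; the API key and file name are placeholders.

```python
# Sketch: "pre-caching" a document with the Gemini Files API.
# Uploaded files are retained for roughly 48 hours (see the list above).
import google.generativeai as genai

genai.configure(api_key="YOUR_GEMINI_API_KEY")  # placeholder key

# Upload once, then reference the returned file handle in later prompts.
doc = genai.upload_file(path="annual_report.pdf")  # hypothetical file name

model = genai.GenerativeModel("gemini-1.5-flash")
response = model.generate_content([doc, "List the key figures in this report."])
print(response.text)
```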
Inputs other than text:
- OpenAI: Base64-encoded files; files on any publicly accessible URL; documents; the URL (ID) of a file pre-uploaded via the API
- Google Gemini: Base64-encoded files; the URL (ID) of a file pre-uploaded via the API
- Gemini via Vertex: same as Gemini
- Perplexity: ?
- Anthropic Claude 3.5 Sonnet: PDFs, images
- DeepSeek: text only
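To illustrate the Base64 route, here’s a sketch using the official openai Python SDK and the Chat Completions content-parts format; the model name and the local image file are placeholder assumptions.

```python
# Sketch: sending a Base64-encoded image alongside a text prompt.
import base64
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

with open("chart.png", "rb") as f:          # hypothetical local image
    b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="gpt-4o",                          # any vision-capable model
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What does this chart show?"},
            {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{b64}"}},
        ],
    }],
)
print(response.choices[0].message.content)
```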
Outputs:
- OpenAI: images (?), Excel files (?)
- Google Gemini: text only
- Gemini via Vertex: text only
- Perplexity: images (Tier 2 only)
- Anthropic Claude 3.5 Sonnet: text only
- DeepSeek: text only
Other features:
- OpenAI: Assistants, where you can add up to 20 files per assistant, and you can have unlimited Assistants. Via Assistants, the chat history and linked documents are maintained by OpenAI for up to 30 days.
- Google Gemini: context caching. You must upload a minimum of 32,768 tokens (about 128 pages of text) in a single request, and you can specify how long it’s cached for; you can then reference the cache in your prompt. Gemini now has web-search features, but I haven’t looked at those yet.
- Gemini via Vertex: Vertex is a “platform”, whereas Gemini is just a language model with an API. I do want to try out the “data stores”, “agents”, and “grounding” features of Vertex, but I’m yet to test them properly.
- Perplexity: returns citations and follow-up questions.
- Anthropic Claude 3.5 Sonnet: their API is really quite limited; “Projects” is not available via the API.
- DeepSeek: (none noted)
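Here’s a sketch of the Gemini context-caching feature mentioned above, assuming the google-generativeai SDK’s caching module; the model version, file name and TTL are placeholders, and the cached content still has to meet the minimum token count.

```python
# Sketch: Gemini context caching. You pay to store the cache for its TTL,
# but later prompts reference it instead of re-sending the tokens.
import datetime
import google.generativeai as genai
from google.generativeai import caching

genai.configure(api_key="YOUR_GEMINI_API_KEY")  # placeholder key

big_doc = genai.upload_file(path="contract_bundle.pdf")  # hypothetical large file

cache = caching.CachedContent.create(
    model="models/gemini-1.5-flash-001",   # caching needs an explicit model version
    display_name="contract-bundle",
    contents=[big_doc],
    ttl=datetime.timedelta(hours=2),       # how long Google keeps the cache
)

# Build a model bound to the cache, then query it as usual.
model = genai.GenerativeModel.from_cached_content(cached_content=cache)
print(model.generate_content("Which clauses mention termination?").text)
```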
File formats allowed:
- OpenAI: see https://platform.openai.com/docs/assistants/tools/code-interpreter
- Google Gemini: see https://ai.google.dev/gemini-api/docs/document-processing
- Gemini via Vertex: (not noted)
- Perplexity: “plain text, code, or PDFs, as well as images”, and for images… “JPEG, HEF, PNG, PDF.”
- Anthropic Claude 3.5 Sonnet: Base64 files in PDF, JPG, PNG, WEBP formats only
- DeepSeek: none.
API pricing (USD, per million tokens; input / output):
- OpenAI: GPT-4: $2.50 / $10; o1: $15 / $60. See https://platform.openai.com/docs/pricing
- Google Gemini: Gemini 2.0 Flash: $0.10 / $0.40 (no pricing for other Gemini 2.0 models yet). See https://ai.google.dev/pricing#2_0flash
- Gemini via Vertex: Gemini 2.0 Flash: $0.15 / $0.60 (no pricing for other Gemini 2.0 models yet). See https://cloud.google.com/vertex-ai/generative-ai/pricing
- Perplexity: Sonar Pro: $3.00 / $15.00, plus $5 per 1,000 searches; Sonar Pro Reasoning: $2 / $8, plus $5 per 1,000 searches. See https://docs.perplexity.ai/guides/pricing
- Anthropic Claude 3.5 Sonnet: $3 / $15. See https://www.anthropic.com/pricing#anthropic-api
- DeepSeek: $0.27 / $1.10; reasoning model: $0.55 / $2.19. See https://api-docs.deepseek.com/quick_start/pricing
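As a quick worked example of what those per-million-token prices mean per request, here’s a tiny calculator; the figures are copied from the list above and will go stale, so treat them as illustrative only.

```python
# Sketch: rough per-request cost arithmetic from per-million-token prices.
PRICES = {                       # (input, output) USD per 1M tokens, copied from above
    "gpt-4": (2.50, 10.00),
    "claude-3.5-sonnet": (3.00, 15.00),
    "deepseek": (0.27, 1.10),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for a single request."""
    in_price, out_price = PRICES[model]
    return input_tokens / 1_000_000 * in_price + output_tokens / 1_000_000 * out_price

# e.g. a 20k-token prompt with a 1k-token reply on Claude 3.5 Sonnet:
print(f"${request_cost('claude-3.5-sonnet', 20_000, 1_000):.4f}")  # ≈ $0.0750
```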
API limits:
- OpenAI: all depends on tier; anywhere between 500 RPM (10k TPM) and 10k RPM (150 million TPM). See https://platform.openai.com/docs/guides/rate-limits#usage-tiers
- Google Gemini: 2k RPM / 4 million TPM. See https://ai.google.dev/gemini-api/docs/rate-limits#paid-tier-1
- Gemini via Vertex: complicated, but let’s say 4 million TPM. See https://cloud.google.com/vertex-ai/generative-ai/docs/quotas
- Perplexity: depends on tier; 150 to 1,000 RPM. See https://docs.perplexity.ai/guides/usage-tiers
- Anthropic Claude 3.5 Sonnet: depends on tier; anywhere between 50 RPM (20k TPM) and 4k RPM (400k TPM). See https://docs.anthropic.com/en/api/rate-limits
- DeepSeek: (none noted)
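Since all of these limits boil down to “back off when you’re throttled”, here’s a generic retry sketch; the endpoint, headers and payload are placeholders for whichever provider you’re calling, and it assumes the provider signals throttling with HTTP 429.

```python
# Sketch: retry with exponential backoff when an API returns HTTP 429.
import time
import requests

def post_with_backoff(url, headers, payload, max_retries=5):
    delay = 1.0
    for attempt in range(max_retries):
        resp = requests.post(url, headers=headers, json=payload, timeout=60)
        if resp.status_code != 429:
            resp.raise_for_status()   # fail fast on other errors
            return resp.json()
        # Honour Retry-After if the provider sends it, otherwise back off exponentially.
        wait = float(resp.headers.get("Retry-After", delay))
        time.sleep(wait)
        delay *= 2
    raise RuntimeError("Still rate-limited after retries")
```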
URL (retail product):
- OpenAI: https://chatgpt.com/
- Google Gemini: https://gemini.google.com/app
- Gemini via Vertex: none
- Perplexity: https://www.perplexity.ai/
- Anthropic Claude: https://claude.ai/
- DeepSeek: https://chat.deepseek.com/
URL (API documentation and API reference):
- OpenAI: docs https://platform.openai.com/docs/concepts; reference https://platform.openai.com/docs/api-reference/introduction
- Google Gemini: docs https://ai.google.dev/gemini-api/docs; reference https://ai.google.dev/api
- Gemini via Vertex: docs https://cloud.google.com/vertex-ai/docs; reference https://cloud.google.com/vertex-ai/docs/reference/rest
- Perplexity: docs https://docs.perplexity.ai/guides/getting-started; reference https://docs.perplexity.ai/api-reference/chat-completions
- Anthropic Claude: docs https://docs.anthropic.com/en/home; reference https://docs.anthropic.com/en/api/getting-started
- DeepSeek: docs https://api-docs.deepseek.com/; reference https://api-docs.deepseek.com/api/deepseek-api
URL (playground):
- OpenAI: https://platform.openai.com/playground/chat
- Google Gemini: https://aistudio.google.com/app/prompts
- Gemini via Vertex: https://console.cloud.google.com/vertex-ai
- Perplexity: https://labs.perplexity.ai/
- Anthropic Claude: https://console.anthropic.com/workbench/
- DeepSeek: none


Image Generation Models:

Models compared: Stable Diffusion 3, DALL·E 3, Flux Dev, SDXL Lightning 4-step, Google Imagen 3, Flux Pro.

Prompt with reference image:
- Stable Diffusion 3: no
- DALL·E 3: no
- Flux Dev: yes
- SDXL Lightning 4-step: no
- Google Imagen 3: yes
- Flux Pro: no

Output megapixels:
- Stable Diffusion 3: up to 1.05MP (1024 x 1024)
- DALL·E 3: up to 1.84MP (1792 x 1024)
- Flux Dev: up to 1.05MP (1024 x 1024)
- SDXL Lightning 4-step: up to 1.64MP (1280 x 1280)
- Google Imagen 3: up to 1.05MP (1024 x 1024)
- Flux Pro: up to 2.07MP (1440 x 1440)

Number of output images:
- Stable Diffusion 3: 1 – 4
- DALL·E 3: 1
- Flux Dev: 1 – 4
- SDXL Lightning 4-step: 1 – 4
- Google Imagen 3: 1 – ?
- Flux Pro: 1

Re-writes your prompts:
- Stable Diffusion 3: no
- DALL·E 3: yes
- Flux Dev: no
- SDXL Lightning 4-step: no
- Google Imagen 3: likely?
- Flux Pro: optional
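Since Imagen 3 is the model I actually use (via Vertex, per the note at the top), here’s a hedged sketch of an image request, assuming the vertexai Python SDK; the project ID, region and model ID are placeholders, so check the Vertex docs for the current Imagen model name.

```python
# Sketch: generating an image with Imagen 3 via Vertex AI.
import vertexai
from vertexai.preview.vision_models import ImageGenerationModel

vertexai.init(project="my-gcp-project", location="us-central1")  # placeholders

model = ImageGenerationModel.from_pretrained("imagen-3.0-generate-001")  # assumed model ID
images = model.generate_images(
    prompt="A watercolour painting of a lighthouse at dusk",
    number_of_images=1,          # the API accepts a small number per request
    aspect_ratio="1:1",          # 1024 x 1024, matching the list above
)
images[0].save(location="lighthouse.png")
```

The SDK picks up Google Cloud application-default credentials, so you’ll need a project with Vertex AI enabled before this will run.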
