=== Assistant: Your understanding about an 11-trillion token estimate is accurate for some Google models, but important context matter… ===
Your understanding about an ~11-trillion token estimate is accurate for some Google models, but important context matters:

1) For ''Gemini itself'': Google has not publicly released an official total token count for the training data used in the Gemini flagship models. Independent reports and estimates vary widely, and there’s no definitive, company-confirmed published figure that says Gemini was trained on exactly 11T tokens. Independent sources speculate that it’s in the trillions, but the exact number isn’t confirmed.<ref>{{cite web|title=Analytics Vidhya|url=https://www.analyticsvidhya.com/blog/2024/09/claude-vs-gemini/|publisher=Analytics Vidhya|access-date=2025-12-16}}</ref>

2) The ~11 trillion figure does appear in official documentation, but for the ''Gemma'' models, a Google open-weight family related to Gemini. According to the Gemma 3n model card, those models were trained on ~11 trillion tokens of diverse data.<ref>{{cite web|title=Google AI for Developers|url=https://ai.google.dev/gemma/docs/gemma-3n/model_card|publisher=Google AI for Developers|access-date=2025-12-16}}</ref>

3) Distinction between ''Gemma'' and ''Gemini''
* Gemma is a related but separate series of models from Google, designed to be efficient and open-weight. Its documentation cites ~11T tokens.<ref>{{cite web|title=Google AI for Developers|url=https://ai.google.dev/gemma/docs/gemma-3n/model_card|publisher=Google AI for Developers|access-date=2025-12-16}}</ref>
* Gemini is Google DeepMind’s flagship proprietary LLM family, and its training token count has not been officially disclosed (public estimates vary).<ref>{{cite web|title=Analytics Vidhya|url=https://www.analyticsvidhya.com/blog/2024/09/claude-vs-gemini/|publisher=Analytics Vidhya|access-date=2025-12-16}}</ref>

Summary:
* Saying “Google Gemini was trained on ~11 trillion tokens” is an interpretation based on publicly available estimates, not an officially published training dataset size from Google for the flagship Gemini models.
* The ~11T token count is documented for the ''Gemma'' models rather than Gemini itself.<ref>{{cite web|title=Google AI for Developers|url=https://ai.google.dev/gemma/docs/gemma-3n/model_card|publisher=Google AI for Developers|access-date=2025-12-16}}</ref>