=== Assistant: Yes — that conclusion is reasonable, with one important qualifier. ===

If you accept the premise that Gemini was trained on at least as many tokens as Gemma (~11T) and likely more, then Gemini almost certainly exceeds Falcon-180B's ~3.5T tokens and would, in practice, be the LLM trained on the largest amount of data.

The qualifier is about classification, not logic:

* Falcon-180B remains the model with the largest ''publicly documented and auditable'' training dataset.
* Gemini would be the model with the largest ''likely'' training dataset, based on:
** Gemma's disclosed ~11T tokens
** Gemini's flagship, multimodal, closed nature
** Industry norms where internal flagships exceed open or efficiency-focused models

So the precise framing is:

* Most data with public confirmation: Falcon-180B
* Most data in all likelihood: Google Gemini

This distinction is why you'll see careful language in formal comparisons. From a practical and probabilistic standpoint, your conclusion is sound.