Openai/67f4c52f-e4a4-8013-ade0-211e0a4a3b52
==== Professional performance ====
The professionalism of a deep research feature is judged mainly by its accuracy and quality on highly specialized topics such as medicine, law, and business. A series of recent evaluations and benchmark results can serve as a reference:
* Complex, cross-disciplinary questions: On the high-difficulty benchmark known as "Humanity's Last Exam" (expert-level questions across more than 100 subjects, including mathematics, science, history, and literature), OpenAI's deep research agent scored '''26.6%''', setting a new record<ref>{{cite web|title=zhuanlan.zhihu.com|url=https://zhuanlan.zhihu.com/p/21049103180#:~:text=%E4%B8%80%E5%91%A8AI%E5%8A%A8%E6%80%81%E7%82%B9%E8%AF%84%EF%BC%9Ao3,%E3%80%82%E5%9C%A8GAIA|publisher=zhuanlan.zhihu.com|access-date=2025-11-15}}</ref>. This is far above the 9.4% of DeepSeek R1, another leading model, and shows how strong ChatGPT deep research is at complex reasoning<ref>{{cite web|title=zhuanlan.zhihu.com|url=https://zhuanlan.zhihu.com/p/21049103180#:~:text=%E4%B8%80%E5%91%A8AI%E5%8A%A8%E6%80%81%E7%82%B9%E8%AF%84%EF%BC%9Ao3,%E3%80%82%E5%9C%A8GAIA|publisher=zhuanlan.zhihu.com|access-date=2025-11-15}}</ref>. Perplexity's Deep Research scored '''21.1%''' on the same test, second only to OpenAI and ahead of most other models (for example, Gemini Thinking at 6.2% and Grok-2 at 3.8%)<ref>{{cite web|title=tw.news.yahoo.com|url=https://tw.news.yahoo.com/perplexity%E7%99%BC%E5%B8%83deep-research-%E5%A4%A9%E5%8F%AF%E5%85%8D%E8%B2%BB%E6%9F%A55%E6%AC%A1-%E8%B7%9Fopenai-google-041226919.html#:~:text=Perplexity%20AI%20%E4%B9%9F%E5%BC%B7%E8%AA%BF%E6%97%97%E4%B8%8BDeep%20Research%E5%9C%A8Humanity%E2%80%99s%20Last,4o%20%283.3%25%29%EF%BC%8C%E5%83%85%E8%90%BD%E5%BE%8C%E6%96%BCOpenAI%E7%9A%84Deep%20Research%20%2826.6%25%29%E3%80%82|publisher=tw.news.yahoo.com|access-date=2025-11-15}}</ref>. This suggests that although Perplexity's model stack may rely on third parties (such as the OpenAI API), multi-round retrieval and reasoning bring it close to GPT-4-level performance on expert questions. By comparison, Gemini and Grok currently score low on this kind of difficult, broad test, which implies that their deep reasoning still has room to improve<ref>{{cite web|title=tw.news.yahoo.com|url=https://tw.news.yahoo.com/perplexity%E7%99%BC%E5%B8%83deep-research-%E5%A4%A9%E5%8F%AF%E5%85%8D%E8%B2%BB%E6%9F%A55%E6%AC%A1-%E8%B7%9Fopenai-google-041226919.html#:~:text=Perplexity%20AI%20%E4%B9%9F%E5%BC%B7%E8%AA%BF%E6%97%97%E4%B8%8BDeep%20Research%E5%9C%A8Humanity%E2%80%99s%20Last,4o%20%283.3%25%29%EF%BC%8C%E5%83%85%E8%90%BD%E5%BE%8C%E6%96%BCOpenAI%E7%9A%84Deep%20Research%20%2826.6%25%29%E3%80%82|publisher=tw.news.yahoo.com|access-date=2025-11-15}}</ref>.
* Factual accuracy: On the SimpleQA benchmark, which consists of several thousand factual questions, Perplexity Deep Research reached 93.9% accuracy, far ahead of other models<ref>{{cite web|title=tw.news.yahoo.com|url=https://tw.news.yahoo.com/perplexity%E7%99%BC%E5%B8%83deep-research-%E5%A4%A9%E5%8F%AF%E5%85%8D%E8%B2%BB%E6%9F%A55%E6%AC%A1-%E8%B7%9Fopenai-google-041226919.html#:~:text=match%20at%20L251%20%E4%BD%86%E5%9C%A8%E6%B8%AC%E8%A9%A6%E4%BA%8B%E5%AF%A6%E6%80%A7%E7%9A%84%E6%95%B8%E5%8D%83%E5%80%8B%E5%95%8F%E9%A1%8C%E7%B5%84%E6%88%90%E7%9A%84SimpleQA%E5%9F%BA%E6%BA%96%E6%B8%AC%E8%A9%A6%E4%B8%AD%EF%BC%8CPerplexity%20Deep,%E7%9A%84%E6%BA%96%E7%A2%BA%E5%BA%A6%EF%BC%8C%E9%81%A0%E8%B6%85%E5%85%B6%E4%BB%96%E9%A0%98%E5%85%88%E6%A8%A1%E5%9E%8B%E7%9A%84%E8%A1%A8%E7%8F%BE%E3%80%82|publisher=tw.news.yahoo.com|access-date=2025-11-15}}</ref>. This reflects its practice of citing reliable sources and checking facts strictly, which suits professional question answering that needs precise factual support. ChatGPT is also regarded as strong on factual accuracy, but no comparable figure is given for it, so Perplexity's number is the only data point used here. ChatGPT deep research also backs its answers with cited sources, which lowers the risk of fabricated content. Because Gemini can search and integrate large amounts of material, it should in theory also be reasonably accurate on factual questions, but its model (Gemini 2.0 Flash experimental) favors fast reasoning and its knowledge coverage may be less complete than GPT-4's, so it may give insufficient or wrong information in very obscure fields. Grok is still being improved and is limited by model scale; it is somewhat less accurate where rigorous facts matter and needs cited material to corroborate its answers. As a result, on highly specialized tasks such as medical diagnosis and legal analysis, ChatGPT deep research can often give fairly authoritative, well-supported answers (GPT-4 is reported to have passed exams such as the United States Medical Licensing Examination), while Perplexity ensures that answers are verifiable, concise, and direct. Gemini's performance on professional questions is improving, but how authoritative it is still needs more real-world testing. Because Grok's underlying model is relatively new and its training data leans toward social content, its depth of understanding on professional questions is limited, and highly specialized work (such as close reading of legal provisions or reviews of the medical literature) may fall short of expert level. With future versions such as Grok-3, however, its professional capability is expected to improve substantially<ref>{{cite web|title=finance.sina.com.cn|url=https://finance.sina.com.cn/stock/usstock/c/2025-01-04/doc-inecuqcz4016325.shtml#:~:text=%E9%A9%AC%E6%96%AF%E5%85%8B%E7%A7%B0Grok%203%E5%8D%B3%E5%B0%86%E6%8E%A8%E5%87%BA%EF%BC%9A%E5%B7%B2%E5%AE%8C%E6%88%90%E9%A2%84%E8%AE%AD%E7%BB%83%EF%BC%8C%E8%AE%A1%E7%AE%97%E9%87%8F%E6%AF%94Grok%202%E9%AB%98%E5%8D%81%E5%80%8D%20%E8%BF%91%E6%9C%9F%EF%BC%8CGrok%20%E6%8E%A8%E5%87%BA%E4%BA%86%E4%B8%A4%E4%B8%AA%E9%A2%9D%E5%A4%96%E7%9A%84%E5%8A%9F%E8%83%BD%E6%9D%A5%E8%BF%9B%E4%B8%80%E6%AD%A5%E5%A2%9E%E5%BC%BA%E8%BF%99%E7%A7%8D%E4%BD%93%E9%AA%8C%EF%BC%9A%E7%BD%91%E9%A1%B5%E6%90%9C%E7%B4%A2%E5%92%8C%E5%BC%95%E7%94%A8%E3%80%82%E7%9B%AE%E5%89%8DGrok,%E5%88%A9%E7%94%A8%E6%9D%A5%E8%87%AAX%20%E7%9A%84%E5%B8%96%E5%AD%90%E5%92%8C%E6%9D%A5%E8%87%AA%E6%9B%B4%E5%B9%BF%E6%B3%9B%E4%BA%92%E8%81%94%E7%BD%91%E7%9A%84%E7%BD%91%E9%A1%B5%EF%BC%8C%E5%8F%AF%E4%B8%BA%E7%94%A8%E6%88%B7%E7%9A%84%E6%9F%A5%E8%AF%A2%E6%8F%90%E4%BE%9B%E5%8F%8A%E6%97%B6%E4%B8%94|publisher=finance.sina.com.cn|date=2025-01-04|access-date=2025-11-15}}</ref>.
* Output style and credibility: In professional settings, the wording and argumentation of an answer also matter. ChatGPT deep research reports are usually clearly structured and well layered, much like a research report written by a person, and can cover background, current state, analysis, and conclusions, which professional readers find easy to work with. Perplexity's output is comparatively concise and direct: it highlights the key points and attaches sources, so readers can get the conclusion quickly and then check the details. Gemini's reports are described as comprehensive, multi-page documents<ref>{{cite web|title=gemini.google|url=https://gemini.google/overview/deep-research/?hl=zh-CN#:~:text=%E5%80%9F%E5%8A%A9%20Gemini%20%E4%B8%AD%E7%9A%84%20Deep%20Research%EF%BC%8C%E5%BF%AB%E9%80%9F%E4%BA%86%E8%A7%A3%E5%90%84%E7%A7%8D%E9%A2%86%E5%9F%9F%E7%9A%84%E5%86%85%E5%AE%B9%E3%80%82%E8%BF%99%E4%B8%80%E6%99%BA%E8%83%BD%E4%BD%93%E5%8A%9F%E8%83%BD%E5%8F%AF%E6%9B%BF%E4%BD%A0%E8%87%AA%E5%8A%A8%E6%B5%8F%E8%A7%88%E5%A4%9A%E8%BE%BE%E6%95%B0%E7%99%BE%E4%B8%AA%E7%BD%91%E7%AB%99%E3%80%81%E5%88%86%E6%9E%90%E6%90%9C%E5%AF%BB%E5%88%B0%E7%9A%84%E7%BB%93%E6%9E%9C%E5%B9%B6%E7%94%9F%E6%88%90%E5%86%85%E5%AE%B9%E4%B8%B0%E5%AF%8C%E7%9A%84%E5%A4%9A%E9%A1%B5%E6%8A%A5,%E5%91%8A%EF%BC%8C%E8%BF%98%E8%83%BD%E5%B0%86%E8%BF%99%E4%BA%9B%E6%8A%A5%E5%91%8A%E8%BD%AC%E6%8D%A2%E4%B8%BA%E7%94%9F%E5%8A%A8%E6%9C%89%E8%B6%A3%E7%9A%84%E6%92%AD%E5%AE%A2%E5%BC%8F%E5%AF%B9%E8%AF%9D%E3%80%82|publisher=gemini.google|access-date=2025-11-15}}</ref> that combine "in-depth detail and original insight"<ref>{{cite web|title=gemini.google|url=https://gemini.google/overview/deep-research/?hl=zh-CN#:~:text=%E6%8A%A5%E5%91%8A|publisher=gemini.google|access-date=2025-11-15}}</ref>. Its style is therefore probably closer to a consulting report or research brief, which appeals to both business and academic users. Grok's answers on professional topics are usually short, partly because its model is not yet as expansive as GPT-4 and partly, perhaps, because it tends to cite social media content (and professional discussion on social platforms has limited depth). On credibility, ChatGPT and Perplexity earn more user trust because they provide multiple reliable citations. Gemini is backed by Google's brand and search technology, so the credibility of its sources is also fairly well assured. Grok cites posts on X, and the cited content is sometimes less authoritative than academic or news sources, so users need to judge it for themselves. Overall, all four platforms are improving quickly in professional domains: OpenAI and Perplexity currently lead, Google follows closely, and Grok, as a latecomer, keeps learning and improving through its free-access strategy<ref>{{cite web|title=hixx.ai|url=https://www.hixx.ai/zh/blog/innovations-and-research/grok-2-free-to-use#:~:text=xAl%20%E5%9C%A8%E8%BF%87%E5%8E%BB%E5%87%A0%E5%91%A8%E4%B8%80%E7%9B%B4%E5%9C%A8%E6%82%84%E6%82%84%E6%B5%8B%E8%AF%95%20Grok,2%20%E9%80%9F%E5%BA%A6%E6%9B%B4%E5%BF%AB%E3%80%81%E6%9B%B4%E5%87%86%E7%A1%AE%EF%BC%8C%E7%8E%B0%E5%9C%A8%E8%BF%98%E5%8C%85%E5%90%AB%E5%9B%BE%E5%83%8F%E7%94%9F%E6%88%90%E7%AD%89%E6%96%B0%E5%8A%9F%E8%83%BD%E3%80%82|publisher=hixx.ai|access-date=2025-11-15}}</ref><ref>{{cite web|title=news.sohu.com|url=https://news.sohu.com/a/870897622_121924584#:~:text=%E5%9C%A8%E4%BC%97%E5%A4%9A%E7%94%A8%E6%88%B7%E4%B8%AD%EF%BC%8C%E5%9B%B4%E7%BB%95%E6%AD%A4%E5%8A%9F%E8%83%BD%E7%9A%84%E8%AE%A8%E8%AE%BA%E4%B9%9F%E6%97%A5%E6%B8%90%E5%8D%87%E6%B8%A9%E3%80%82%E7%94%A8%E6%88%B7%E5%AF%B9DeepResearch%E7%9A%84%E9%AB%98%E6%95%88%E8%83%BD%E6%8F%90%E4%BE%9B%E4%BA%86%E7%A7%AF%E6%9E%81%E7%9A%84%E5%8F%8D%E9%A6%88%EF%BC%8C%E5%90%8C%E6%97%B6%E4%B9%9F%E5%AF%B9%E5%85%B6%E6%AF%8F%E6%9C%88%E7%9A%84%E4%BD%BF%E7%94%A8%E6%AC%A1%E6%95%B0%E9%99%90%E5%88%B6%E6%8F%90%E5%87%BA%E4%BA%86%E8%B4%A8%E7%96%91%E3%80%82%E5%85%B6%E4%BB%96%E7%94%A8%E6%88%B7%E5%88%99%E5%BC%80%E5%A7%8B%E6%9C%9F%E5%BE%85%E6%9C%AA%E6%9D%A5%20%E9%80%9A%E8%BF%87API%E6%8E%A5%E5%8F%A3%E8%BF%9B%E4%B8%80%E6%AD%A5%E6%8E%A5%E5%85%A5%E8%BF%99%E4%B8%80%E5%BC%BA%E5%A4%A7%E5%8A%9F%E8%83%BD%E3%80%82%E6%95%B4%E4%BD%93%E6%9D%A5%E7%9C%8B%EF%BC%8CDeepResearch%E5%B1%95%E7%8E%B0%E5%87%BA%E7%9A%84%E5%BC%BA%E5%A4%A7%E8%83%BD%E5%8A%9B%E5%92%8C%E6%BD%9C%E5%9C%A8%E7%9A%84%E5%B8%82%E5%9C%BA%E9%9C%80%E6%B1%82%EF%BC%8C%E9%A2%84%E7%A4%BA%E7%9D%80%E5%AE%83%E5%9C%A8%E6%9C%AA%E6%9D%A5%E7%9A%84%E4%BD%BF%E7%94%A8%E5%9C%BA%E6%99%AF%E4%B8%AD%E5%B0%86%E6%89%BF%E8%BD%BD%E6%9B%B4%E5%A4%9A%E5%8F%AF%E8%83%BD%E6%80%A7%E3%80%82|publisher=news.sohu.com|access-date=2025-11-15}}</ref>.