Openai/696ebbd6-ffb0-800d-a276-29c0e2ae02aa
This conversation was exported from ChatGPT[1]
gpt-5-2
User: please check this reddit thread: https://www.reddit.com/r/ChatGPT/comments/1qhhkp2/is_chatgpt_getting_worse_and_worse_for_anyone_else/
Assistant: I couldn’t fetch that specific Reddit page directly (error 429), but available Reddit search results cover the same topic and indicate what people are discussing. Here’s a structured summary of the common themes from that discussion and similar “ChatGPT is getting worse” posts:
Users are reporting subjective declines in ChatGPT’s performance, with examples such as:
- Tasks that were easier before now require more hand-holding — people say simple prompts now need repeated clarifications. Reddit[2]
- Responses feel more superficial, shorter, or less detailed compared with earlier versions. Reddit[3]
- Inaccuracy or errors in basic outputs, including forgetting context or not following instructions well. Reddit[4]
- Some users report canceling their subscriptions over perceived quality drops. Reddit[5]
- Feedback that newer model behavior feels “neutered” or less capable appears in multiple threads. Reddit[6]
These represent user perceptions shared on Reddit, not verified performance degradation.
These Reddit threads express subjective experience and opinion, not measured performance metrics:
Reasonable inferences based on the posts:
- People notice differences after model updates and interpret that as “worse.” Reddit[7]
- Users tend to compare current behavior with their memory of earlier models, a comparison that is prone to recency bias.
Unsupported claims (no direct evidence from Reddit or official sources):
- Any claim that ChatGPT objectively got worse in factual accuracy or reasoning across all use cases.
- Assertions about specific internal causes (e.g., throttling compute for profit) — these are speculation from users. Reddit[8]
There are documented model changes that could affect user experience:
- GPT-5 rollout has faced criticism for inconsistent quality, with some users saying responses feel inferior to GPT-4o. Wikipedia[9]
- OpenAI publicly acknowledged issues with early GPT-5 behavior and committed to fixes. Wikipedia[10]
These are documented, reported responses, not Reddit opinion.
AI models like ChatGPT have intrinsic limitations that can give the impression of declining quality:
- Hallucinations — AI can confidently give false information. This is a known limitation of language models. Wikipedia[11]
- Context loss — longer conversations or complex instructions can degrade output consistency. Reddit[12]
These phenomena are consistent with known AI limitations, not necessarily a decline in model capability.
If your goal is to determine whether ChatGPT is objectively “getting worse”:
- Define specific metrics you care about (e.g., factual accuracy, reasoning with provided data, mathematics).
- Design controlled prompts and test across time (e.g., monthly benchmark prompts).
- Compare responses from multiple model versions, if available.
- Cross-check outputs with ground-truth sources (manual or automated).
This separates user perception from empirical performance change.
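As a rough illustration of the steps above, here is a minimal Python sketch that logs dated benchmark runs and computes accuracy per period against a ground-truth table. Everything in it is an assumption made for illustration: the `Run` record, the `GROUND_TRUTH` entries, the `model-a` label, and the exact-match scoring rule. Real tasks would likely need a more forgiving comparison than exact match.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    month: str       # when the prompt was run, e.g. "2025-11"
    model: str       # label for the model version under test (hypothetical)
    prompt_id: str   # key into GROUND_TRUTH
    answer: str      # the model's final answer, extracted by hand or by script


# Hypothetical ground truth for two benchmark prompts.
GROUND_TRUTH = {"math-1": "42", "fact-1": "Paris"}


def is_correct(run: Run) -> bool:
    """Exact match after trimming whitespace and ignoring case."""
    expected = GROUND_TRUTH[run.prompt_id]
    return run.answer.strip().lower() == expected.strip().lower()


def accuracy_by_period(runs):
    """Return {(month, model): fraction of correct answers}."""
    counts = defaultdict(lambda: [0, 0])  # [correct, total]
    for run in runs:
        bucket = counts[(run.month, run.model)]
        bucket[0] += int(is_correct(run))
        bucket[1] += 1
    return {key: correct / total for key, (correct, total) in counts.items()}


# Made-up example data: the same two prompts run in two different months.
runs = [
    Run("2025-11", "model-a", "math-1", "42"),
    Run("2025-11", "model-a", "fact-1", "paris "),
    Run("2025-12", "model-a", "math-1", "41"),
    Run("2025-12", "model-a", "fact-1", "Paris"),
]
print(accuracy_by_period(runs))
```

With only two prompts per month the numbers mean very little; the point is that the same prompts, scored the same way, give a comparison that does not depend on memory.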
- Reddit users feel ChatGPT is worse; that’s a community-reported perception, not an objective measurement.
- Some model changes (like GPT-5 behavior) could change experience in ways users dislike; this is documented.
- Known limitations of language models explain why performance may feel worse in some tasks.
There is insufficient evidence to claim that ChatGPT is inherently or universally “getting worse” in an empirical sense. What Reddit threads show is that user experience varies and people often interpret changes negatively. Perceptual decline does not equal actual reduction in capability.
If you want, I can extract specific comment excerpts from that Reddit thread (then analyze which complaints are most common and which are overgeneralizations).
gpt-5-2-thinking
User: C4 first and audit AI after
Assistant:
Bottom line
Insufficient data to fully “check the thread” end-to-end. Reddit blocked direct access (HTTP 429 Too Many Requests) when opening the actual post, so I can only validate the small excerpts visible in search snippets, not the full comments or context. N/A[13]
C4: Audit of the thread (based only on accessible excerpts)
① Verifier: Facts vs interpretation vs unsupported
Verified facts (from accessible excerpts)
- The Reddit post titled “Is ChatGPT getting worse and worse for anyone else?” exists at the provided URL. Reddit[14]
- The OP states (excerpt): tasks that were easy now require “a lot of handholding,” and “some things it won’t even do” anymore. Reddit[15]
- A commenter claims (excerpt) there is a “mass exodus away from ChatGPT.” Reddit[16]
- I attempted to open the thread and was blocked with 429 Too Many Requests. N/A[17]
Reasonable inferences (clearly labeled)
- I’m assuming that the thread contains the usual mix of personal anecdotes and causal theories, because the accessible excerpts already show subjective experience (“handholding”) and a macro-claim (“mass exodus”). This is consistent with other “getting worse” threads visible in search results. Reddit[18]
Unsupported claims (cannot be validated from excerpts)
- “ChatGPT is objectively getting worse” (in accuracy/capability overall). The excerpts provide experience reports, not controlled measurement. Reddit[19]
- “Mass exodus” as a measurable market/user metric. The excerpt provides no data source. Reddit[20]
- Any specific internal-cause explanation like “they’re cutting compute to make more money” is speculation unless backed by verifiable internal evidence (the “enshitification/less computing” claim appears as an opinion-style post elsewhere). Reddit[21]
② Bias Detector: Likely bias in the thread content
What shows up in these excerpts / related threads:
- Negativity/complaint selection bias: people with bad experiences post more often than people whose workflow is fine. Reddit[22]
- Recency bias / change sensitivity: “it used to be better” claims often correlate with perceived shifts after updates (e.g., “after Oct 20”). Reddit[23]
- Attribution bias: jumping from “my outputs got worse” → “company is throttling compute” without evidence. Reddit[24]
- Group reinforcement: threads can converge on a favored narrative (“nerfed”) because it’s socially sticky, not because it’s proven. Reddit[25]
No commercial persuasion detected in the accessible excerpts (just typical subreddit venting/speculation).
③ Fact Tracker: What can/can’t be verified, and what would verify it
Can be verified from excerpts
- Users report: more handholding, shorter answers, more errors (across multiple threads). Reddit[26]
Cannot be verified from excerpts
- Trend magnitude (“worse and worse” globally), timing, root cause, subscription-tier effects, or model/version identity for each reporter.
What data would make the thread’s central claim testable
- A fixed prompt suite + scoring rubric run across time/model versions (same settings), logging:
  - instruction-following adherence
  - factual accuracy against ground truth
  - refusal rates
  - verbosity/coverage
- If people claim “after X date,” they need before/after samples with the same task, not vibes. Reddit[27]
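Two of the logged quantities above, refusal rate and verbosity, can be estimated from saved responses with very little code. The sketch below is only an assumption-laden illustration: `REFUSAL_MARKERS` is a hypothetical phrase list, the 80-character window is arbitrary, and phrase matching will miss some refusals and misflag some normal answers, so spot-check its output by hand.

```python
# Hypothetical phrases that often open a refusal; tune for your own logs.
REFUSAL_MARKERS = ("i can't", "i cannot", "i'm unable", "i won't")


def is_refusal(response: str) -> bool:
    """Crude check: does the opening of the response contain a refusal phrase?"""
    # Normalize curly apostrophes so both "can't" and "can’t" match.
    head = response.strip().lower().replace("\u2019", "'")[:80]
    return any(marker in head for marker in REFUSAL_MARKERS)


def summarize(responses):
    """Return the refusal rate and mean word count for a batch of responses."""
    if not responses:
        raise ValueError("need at least one response")
    n = len(responses)
    refusals = sum(is_refusal(r) for r in responses)
    words = sum(len(r.split()) for r in responses)
    return {"refusal_rate": refusals / n, "mean_words": words / n}


# Made-up before/after samples for the same two prompts.
before = ["Here is the full answer with steps.", "Sure, the result is 4."]
after = ["I can\u2019t help with that.", "4"]

print(summarize(before))  # {'refusal_rate': 0.0, 'mean_words': 6.0}
print(summarize(after))   # {'refusal_rate': 0.5, 'mean_words': 3.0}
```

Running the same function on "before" and "after" batches of the same prompts is what turns an "after X date" claim into something checkable.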
④ Output Auditor: Clean statement of what this thread supports
- The accessible content supports only this: some users perceive degraded usefulness (more handholding / refusals). Reddit[28]
- The accessible content does not support: objective degradation, “mass exodus,” or internal compute-throttling explanations.
AI audit (after C4): what the thread is actually “measuring” (and why it’s weak)
What Reddit anecdotes are good for
- Finding failure modes worth reproducing (e.g., “needs handholding,” “skips steps”). Reddit[29]
- Identifying common perception clusters (shorter answers, more errors). Reddit[30]
What Reddit anecdotes are garbage for
- Proving “the model got worse” as a general claim.
- Diagnosing causes (“compute throttling,” “nerfed for profit”) without independent evidence. Reddit[31]
Practical next steps (if you want an actual answer, not a bonfire of vibes)
- Reproduce: take 5–10 prompts you used “before,” run them now with fixed constraints (same format, same temperature/settings if available).
- Score with a simple rubric (0–2 each): instruction adherence, correctness, completeness, unwanted refusals.
- Triangulate by running the same suite on at least one other model/version (if selectable).
- Log: paste prompt+response pairs; you can then quantify “handholding” as # of retries needed.
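The rubric and retry count described above could be recorded along these lines. This is a sketch under stated assumptions, not a standard tool: the `Trial` class, the criterion names, and the convention that 2 on `no_unwanted_refusal` means "no unwanted refusal occurred" are all made up for illustration, and the scores themselves still come from a human grader.

```python
from dataclasses import dataclass
from statistics import mean

# Four criteria, each scored 0-2 by hand (names are hypothetical).
CRITERIA = (
    "instruction_adherence",
    "correctness",
    "completeness",
    "no_unwanted_refusal",  # 2 = no unwanted refusal, 0 = refused outright
)


@dataclass
class Trial:
    prompt_id: str
    scores: dict   # criterion name -> 0, 1, or 2
    retries: int   # follow-up prompts needed, used as a "handholding" measure

    def total(self) -> int:
        """Sum the rubric scores, rejecting missing or out-of-range values."""
        for name in CRITERIA:
            if self.scores.get(name) not in (0, 1, 2):
                raise ValueError(f"{self.prompt_id}: bad score for {name}")
        return sum(self.scores[name] for name in CRITERIA)


def summarize_trials(trials):
    """Average rubric total and retries over a set of trials."""
    return {
        "mean_total": mean(t.total() for t in trials),
        "max_total": 2 * len(CRITERIA),
        "mean_retries": mean(t.retries for t in trials),
    }


# Made-up scores for two prompts.
trials = [
    Trial("p1", dict.fromkeys(CRITERIA, 2), retries=0),
    Trial("p2", {"instruction_adherence": 1, "correctness": 1,
                 "completeness": 2, "no_unwanted_refusal": 0}, retries=3),
]
print(summarize_trials(trials))
```

Scoring the same prompts on a second model or at a later date and comparing `mean_total` and `mean_retries` gives a small but concrete before/after comparison.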
If you paste the top post + 10–20 comments (text), I can do a full C4 on the actual thread contents. Right now, Reddit gave us less access than a nightclub bouncer on a power trip. N/A[32]
IA flags: IA:SCOPE-LOCKED(thread excerpts only) | IA:UNCERTAINTY=HIGH(full thread inaccessible) | IA:ASSUMPTIONS=EXPLICIT | IA:SOURCES=WEB | IA:NEXT_STEPS=PROVIDED
References
- ↑ "Reddit ChatGPT Feedback". ChatGPT. Retrieved 2026-01-20.
- ↑ "Reddit". reddit.com. Retrieved 2026-01-20.
- ↑ "Reddit". reddit.com. Retrieved 2026-01-20.
- ↑ "Reddit". reddit.com. Retrieved 2026-01-20.
- ↑ "Reddit". reddit.com. Retrieved 2026-01-20.
- ↑ "Reddit". reddit.com. Retrieved 2026-01-20.
- ↑ "Reddit". reddit.com. Retrieved 2026-01-20.
- ↑ "Reddit". reddit.com. Retrieved 2026-01-20.
- ↑ "Wikipedia". en.wikipedia.org. Retrieved 2026-01-20.
- ↑ "Wikipedia". en.wikipedia.org. Retrieved 2026-01-20.
- ↑ "Wikipedia". en.wikipedia.org. Retrieved 2026-01-20.
- ↑ "Reddit". reddit.com. Retrieved 2026-01-20.
- ↑ "N/A". chatgpt.com. Retrieved 2026-01-20.
- ↑ "Reddit". reddit.com. Retrieved 2026-01-20.
- ↑ "Reddit". reddit.com. Retrieved 2026-01-20.
- ↑ "Reddit". reddit.com. Retrieved 2026-01-20.
- ↑ "N/A". chatgpt.com. Retrieved 2026-01-20.
- ↑ "Reddit". reddit.com. Retrieved 2026-01-20.
- ↑ "Reddit". reddit.com. Retrieved 2026-01-20.
- ↑ "Reddit". reddit.com. Retrieved 2026-01-20.
- ↑ "Reddit". reddit.com. Retrieved 2026-01-20.
- ↑ "Reddit". reddit.com. Retrieved 2026-01-20.
- ↑ "Reddit". reddit.com. Retrieved 2026-01-20.
- ↑ "Reddit". reddit.com. Retrieved 2026-01-20.
- ↑ "Reddit". reddit.com. Retrieved 2026-01-20.
- ↑ "Reddit". reddit.com. Retrieved 2026-01-20.
- ↑ "Reddit". reddit.com. Retrieved 2026-01-20.
- ↑ "Reddit". reddit.com. Retrieved 2026-01-20.
- ↑ "Reddit". reddit.com. Retrieved 2026-01-20.
- ↑ "Reddit". reddit.com. Retrieved 2026-01-20.
- ↑ "Reddit". reddit.com. Retrieved 2026-01-20.
- ↑ "N/A". chatgpt.com. Retrieved 2026-01-20.