OpenAI GPT-5 coding tests by ZDNET: why GPT-4o still wins (for now)

Published: 2025-08-12 Category: AI News

ZDNET tested OpenAI’s GPT-5 on hands-on programming tasks and found it underperformed compared to prior ChatGPT models, prompting a temporary return to GPT-4o for coding. Below is a neutral summary of ZDNET’s findings (testing and screenshots by ZDNET’s David Gewirtz).

Key takeaways from ZDNET’s tests

GPT-5 failed two of the four coding tests on the first attempt.
Earlier OpenAI releases typically scored near-perfect on the same suite.
With the legacy model option enabled on paid tiers, users can switch back to GPT-4o.

New in-chat code editor glitch
ZDNET noted a new Edit button for generated code. Edits appeared to save, but the session failed to restore properly, forcing a rerun of the original prompt.

Test 1: WordPress plugin
GPT-5 produced a single-file plugin that rendered a UI and updated counts but mislabeled “Lines to randomize” and, when Randomize was clicked, redirected to tools.php instead of returning results. After ZDNET requested a full replacement file (rather than a one-line patch), the revised plugin worked and correctly separated duplicates. Because the initial output did not function, ZDNET marked this test as a fail.

Test 2: String function rewrite
Asked to allow dollars and cents (without extra validation requirements), GPT-5 delivered a minimal, targeted rewrite. It added no unnecessary checks, exactly as requested, so ZDNET scored this a pass.
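
The function ZDNET used is not reproduced in this summary, so the following is only a minimal Python sketch of the kind of change described: a validator loosened to accept an optional cents part, with no further checks layered on. The name is_valid_amount and the exact pattern are assumptions made for illustration.

```python
import re

# Assumed pattern (not ZDNET's): one or more digits, optionally followed
# by a decimal point and one or two digits of cents. Nothing else is checked.
_AMOUNT_RE = re.compile(r"\d+(\.\d{1,2})?")


def is_valid_amount(text: str) -> bool:
    """Return True if text looks like dollars with an optional cents part."""
    return _AMOUNT_RE.fullmatch(text.strip()) is not None
```

Under these assumptions, inputs such as "12" and "12.50" are accepted, while "12.345" and "abc" are rejected. The point of the test was restraint: the rewrite changes only what was asked for.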

Test 3: Finding a subtle WordPress bug
This trial required arcane knowledge of how WordPress filters pass data. Like GPT-4/4o, GPT-5 reasoned correctly and articulated the fix. Pass.
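
For readers unfamiliar with the mechanism: a WordPress filter passes a value through each registered callback in turn, and each callback must return the (possibly modified) value so the next one receives it. This summary does not describe the specific bug in ZDNET's test, so the Python sketch below only models that general behavior. The add_filter and apply_filters functions here are simplified stand-ins named after their WordPress counterparts, not the real PHP API, and the hook names are hypothetical.

```python
from collections import defaultdict

# Hook name -> list of (priority, callback) pairs. A simplified model only.
_filters = defaultdict(list)


def add_filter(hook, callback, priority=10):
    """Register a callback on a hook; lower priority numbers run first."""
    _filters[hook].append((priority, callback))


def apply_filters(hook, value):
    """Pass value through every callback on the hook, in priority order."""
    for _, callback in sorted(_filters[hook], key=lambda item: item[0]):
        # Whatever the callback returns becomes the input to the next one.
        value = callback(value)
    return value


def shout(title):
    return title.upper()


def add_bang(title):
    return title + "!"


def forgetful(title):
    title + "?"  # result is discarded, so the function returns None


add_filter("demo_title", shout)
add_filter("demo_title", add_bang, priority=20)
add_filter("broken_title", forgetful)
```

In this model, apply_filters("demo_title", "hello") yields "HELLO!", while apply_filters("broken_title", "hello") yields None because the callback never returns its value. Mistakes of that general kind are easy to miss without knowing how data flows through the chain.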

Test 4: Cross-environment script (Keyboard Maestro + AppleScript + Chrome)
GPT-5 handled the Keyboard Maestro references but stumbled on AppleScript specifics. It treated AppleScript string comparisons as case-sensitive, although AppleScript ignores case by default unless a considering case block is used, and it referenced an undefined searchTerm variable. These errors produced runtime failures. Fail.

Model fallback now available
After broad user pushback, ZDNET observed a “Show legacy models” option in ChatGPT settings on paid tiers, which allows a switch back to GPT-4o for coding work.

Bottom line
ZDNET concluded that GPT-5’s deep-reasoning potential is promising, but for day-to-day coding it is sticking with GPT-4o until GPT-5’s reliability improves.

About ZDNET

ZDNET is a business technology publication from Ziff Davis that helps professionals make smarter decisions about software, hardware, and AI. It covers enterprise IT, cybersecurity, cloud, data, and developer tools, alongside consumer tech that affects work, from laptops and phones to smart devices.

The site blends news, analysis, and opinion with hands-on product reviews, buying guides, and step-by-step how-tos. Its editors and contributing experts test devices, evaluate services, and compare competing platforms to highlight practical benefits, trade-offs, and total cost considerations. ZDNET’s coverage is designed for IT leaders, founders, developers, and decision-makers who need clear explanations, real-world benchmarks, and actionable guidance.

Regular columns and newsletters track long-term trends—such as automation, security, and cloud economics—while quick guides help readers deploy tools and troubleshoot common issues. With an emphasis on clarity, repeatable testing, and transparent recommendations, ZDNET aims to translate fast-moving innovation into reliable advice that saves time, reduces risk, and improves outcomes.
