ZDNET’s key takeaways
- ChatGPT Codex wrote code and saved me time.
- It also created a serious bug, but it was able to recover.
- Codex is still based on the GPT-4 LLM architecture.
Well, vibe coding this isn’t. I found the experience to be slow, cumbersome, stressful, and incomplete. But it all worked out in the end.
ChatGPT Codex is ChatGPT’s agentic tool dedicated to code writing and modification. It can access your GitHub repository, make changes, and issue pull requests. You can then review the results and decide whether or not to incorporate them.
My primary development project is a PHP and JavaScript-based WordPress plugin for site security. There’s a main plugin available for free, and some add-on plugins that enhance the capabilities of the core plugin. My private development repo contains all of this, as well as some maintenance plugins I rely on for user support.
This repo contains 431 files. This is the first time I’ve tried to get an AI to work across my entire ecosystem of plugins in a private repository. I previously used Jules to add a feature to the core plugin, but because it only had access to the core plugin’s open source repository, it couldn’t take into account the entire ecosystem of products.
Earlier last week, I decided to give ChatGPT Codex a run at my code. Then this happened.
GPT-5 launched
On Thursday, GPT-5 slammed into the AI world like a freight train. Initially, OpenAI tried to force everyone to use the new model. Subsequently, they added legacy model support when many of their customers went ballistic.
I ran GPT-5 against my set of programming tests, and it failed half of them. So, I was particularly curious about whether Codex still supported the GPT-4 architecture or would force developers into GPT-5.
However, when I queried Codex five days after GPT-5 launched, the AI responded that it was still based on “OpenAI’s GPT-4 architecture.”
I took two things from that:
- OpenAI isn’t ready to move Codex coding to GPT-5 (which, recall, failed half my tests).
- The results, conclusions, and screenshots I took of my Codex tests are still valid, since Codex is still based on GPT-4.
With that, here is the result of my still-very-much-not-GPT-5 look at ChatGPT Codex.
Getting began
My first step was asking ChatGPT Codex to examine the codebase. I used the Ask mode of Codex, which does analysis, but doesn’t actually change any code.
I hoped for something as deep and comprehensive as the analysis I received from ChatGPT Deep Research a few months ago, but instead, I received a much less complete analysis.
I found a more effective approach was to ask Codex to do a quick security audit and let me know if there were any issues. Here’s how I prompted it.
Identify any serious security concerns. Ignore plugins Anyone With Link, License Fixer, and Settings Nuker. Anyone With Link is in the very early stages of coding, and is not ready for code review. License Fixer and Settings Nuker are specialty plugins that do not need a security audit.
Codex identified three main areas for improvement.
All three areas were valid, although I’m not prepared to modify the serialization data structure at this time, because I’m saving that for a complete preferences overhaul. The $_POST complaint is managed, but with a different approach than Codex noticed.
The third area, the nonce and cross-site request forgery (CSRF) risk, was something worth changing right away. While access to the user interface for the plugin is assumed to be determined by login role, the plugins themselves don’t explicitly check that the person submitting the plugin settings for action is allowed to do so.
That’s what I decided to ask Codex to fix.
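To make that gap concrete, here is a minimal sketch of the two checks a settings handler generally needs: a capability check on the submitting user and a nonce check on the submitted form. It is written in Python for illustration rather than in the plugin’s PHP, and it is not the code Codex produced. Every name in it (SECRET_KEY, ROLE_CAPABILITIES, handle_settings_post, and the rest) is hypothetical, and the 12-hour window is an assumed value. In WordPress itself, functions such as current_user_can() and wp_verify_nonce() fill these roles.

```python
import hashlib
import hmac
import time

# Hypothetical values for illustration; a real site would load its secret from configuration.
SECRET_KEY = b"replace-with-a-random-secret"
NONCE_LIFETIME = 12 * 60 * 60  # length of one time window in seconds (assumed)

# Hypothetical role-to-capability map.
ROLE_CAPABILITIES = {
    "administrator": {"manage_options"},
    "subscriber": set(),
}


def _sign(user_id, action, tick):
    """Derive a short token bound to a user, an action, and a time window."""
    message = f"{tick}|{action}|{user_id}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()[:20]


def create_nonce(user_id, action, now=None):
    """Create the token that gets embedded in the settings form."""
    current = time.time() if now is None else now
    return _sign(user_id, action, int(current // NONCE_LIFETIME))


def verify_nonce(nonce, user_id, action, now=None):
    """Accept tokens from the current or the previous time window only."""
    current = time.time() if now is None else now
    tick = int(current // NONCE_LIFETIME)
    return any(
        hmac.compare_digest(nonce.encode(), _sign(user_id, action, t).encode())
        for t in (tick, tick - 1)
    )


def handle_settings_post(user, form):
    """Reject the request unless the user is allowed AND the nonce is valid."""
    if "manage_options" not in ROLE_CAPABILITIES.get(user["role"], set()):
        return "forbidden: user lacks capability"
    if not verify_nonce(form.get("nonce", ""), user["id"], "save_settings"):
        return "forbidden: bad or missing nonce"
    return "settings saved"
```

The two checks guard against different problems. The capability check stops a logged-in but unprivileged user from submitting settings, while the nonce stops a forged request that a third-party page tricks an administrator’s browser into sending.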
Fixing the code
Next up, I told Codex to make fixes in the code. I changed the setting from Ask mode to Code mode so the AI would actually attempt changes. As with ChatGPT Agent, Codex spins up a virtual terminal to do some of its work.
When the process completed, Codex showed a diff (the difference between original and to-be-modified code).
I was heartened to see that the changes were quite surgical. Codex didn’t try to rewrite large sections of the plugin; it just modified the small areas that needed improvement.
In a few areas, it dug in and changed a few more lines, but those changes were still quite specific to the original prompt.
At one point, I was curious to know why it added a new foreach loop to iterate over an array, so I asked.
I got back a fairly clear response on its reasoning. It made sense, so I moved on, continuing to review Codex’s proposed changes.
All told, Codex proposed making changes to nine separate files. Once I was satisfied with the changes, I clicked Create PR. That creates a pull request, which is how any GitHub user suggests changes to a codebase. Once the PR is created, the project owner (me, in this case) has the option to approve those changes, which adds them into the actual code.
It’s a good mechanism, and Codex does a clean job of working within GitHub’s environment.
Once I was convinced the changes were good, I merged Codex’s work back into the main codebase.
Houston, we have a problem
I brought the changes down from GitHub to my test machine and tried to run the now-modified plugin. Wait for it…
Yeah. That’s not what’s supposed to happen. To be fair, I’ve generated my own share of error screens just like that, so I can’t really get angry at the AI.
Instead, I took a screenshot of the error and passed it to Codex, along with a prompt telling Codex, “Selective Content plugin now fails after making changes you suggested. Here are the errors.”
It took the AI three minutes to suggest a fix, which it presented to me in a new diff.
I merged that change into the codebase, once again brought it down to my test server, and it worked. Crisis averted.
No vibe, no flow
When I’m not in a rush and I have the time, coding can provide a very pleasant state of mind. I get into a kind of flow with the language, the machine, and what seems like a connection between my fingers and the computer’s CPU. Not only is it a lot of fun, but it can also be emotionally transcendent.
Working with ChatGPT Codex was not fun. It wasn’t hateful. It just wasn’t fun. It felt more like exchanging emails with a particularly recalcitrant contractor than having a meeting of the minds with a coding buddy.
Codex provided its responses in about 10 or 15 minutes, whereas the same code would probably have taken me a few hours.
Would I have created the same bug as Codex? Probably not. As part of the process of thinking through that algorithm, I most likely would have avoided the mistake Codex made. But I undoubtedly would have created a few more bugs based on mistyping or syntax errors.
To be fair, had I introduced the same bug as Codex did, it would have taken me considerably longer than three minutes to find and fix it. Add another hour or so at least.
So Codex did the job, but I wasn’t in flow. Normally, when I code and I’m inside a particular file or subsystem, I do a lot of work in that area. It’s like cleaning day. If you’re cleaning one part of the bathroom, you might as well clean all of it.
But Codex clearly works best with small, simple instructions. Give it one class of change, and work through that one change before introducing new factors. Like I said, it does work and it is a useful tool. But using it definitely felt like more of a chore than programming normally does, even though it saved me a lot of time.
I don’t have tangible test results, but after testing Google’s Jules in May and ChatGPT’s Codex now, I get the impression that Jules is able to get a deeper understanding of the code. At this point, I can’t really support that assertion with a lot of data; it’s just an impression.
I’m going to try running another project through Jules. It will be interesting to see if Codex changes much once OpenAI feels safe enough to incorporate GPT-5. Let’s keep in mind that OpenAI eats its own dog food with Codex, meaning it uses Codex to build its code. They might have seen the same iffy results I found in my tests. They might be waiting until GPT-5 has baked for a bit longer.
Have you tried using AI coding tools like ChatGPT Codex or Google’s Jules in your development workflow? What kinds of tasks did you throw at them? How well did they perform? Did you feel like the process helped you work more efficiently? Did it slow you down and take you out of your coding flow?
Do you prefer giving your tools small, surgical jobs, or are you looking for an agent that can handle big-picture architecture and reasoning? Let us know in the comments below.