AI Powered Engineering: Issue #14
Opus 4.5, ChatGPT 5.2, or Gemini 3: the heat is on, plus why I'm so excited about parallel agents (and you should be too)
Hello and welcome to the 14th issue of AI Powered Engineering.
With so many new foundational model updates coming out, I decided to wait on putting out a new issue until everyone had a fair chance to release. And well, now here we are, and all the cards are on the table: we have three super-powerful foundational models, all with major improvements in both coding and planning, and more confusion than ever about which to use.
Couple this with the fact that both Cursor and Windsurf introduced their own models, which are also pretty darn good, and I think it's safe to say we're in the midst of model chaos.
At the same time, with this chaos comes great capability. There has never been a better time in history to build production-grade software with the help of AI. Note: this should not be confused with vibe-coding production-grade software; I still don't think that's possible, and it won't be for quite some time.
There's a reason my newsletter is called "AI Powered Engineering": this is not a newsletter for people who don't know how to code and want to vibe code software. Not that there's anything wrong with that, but my newsletter has a very specific target market: existing software engineers and data scientists who want to leverage AI to write better, more robust, production-grade code.
And if this sounds like you, then you're likely struggling to figure out whether you should be using Claude Opus 4.5, ChatGPT 5.2, Gemini 3, or some combination of the three. My hope is that by the end of this issue, you'll have a clearer path forward.
And with that, let's get to the good stuff. It's my last newsletter of the year, so let's rock 🤘
Claude Opus 4.5 vs. ChatGPT 5.2
To start, I want to talk about the two models that I think deserve the limelight right now: Claude Opus 4.5 and ChatGPT 5.2. I think both represent the absolute cutting edge when it comes to foundational models writing code.
You'll notice I'm not mentioning Gemini 3 here, and that's because I honestly do think that Claude and ChatGPT are in a different league. I'll just lay it out there and say it.
Of course, this doesn’t mean Gemini 3 doesn’t deserve accolades, but I’ll leave that for the next article. Right now, I want to focus on what I see as the real meat on the bones, which is these two juggernauts.
At a high level, here's the key strength of each:
- ChatGPT 5.2: better at multi‑file changes, tricky bug repros, and “implement this feature across the service” prompts
- Claude Opus 4.5: extremely strong at getting single tasks right with fewer corrections, and at sticking to the spec
And now to go a bit deeper, here’s a table you can use to evaluate which is the best tool for the job, based on what you’re looking for.
| Use case | Better choice | Why it tends to win |
|---|---|---|
| Large refactors across a repo | ChatGPT 5.2 | Stronger multi‑file reasoning and SWE‑Bench-style performance. |
| New feature from product spec | ChatGPT 5.2 | Good at planning, generating coherent, reviewable implementations. |
| Autonomously iterating until tests pass | Claude Opus 4.5 | Agent‑style workflows and persistent debugging shine. |
| Explaining code/teaching juniors | Claude Opus 4.5 | More balanced, conceptual explanations of coding principles. |
| Complex math/algorithms | ChatGPT 5.2 | Stronger abstract reasoning and math benchmarks. |
| Quick, surgical bugfixes | Tie, slight Opus edge | Opus users report very reliable real‑world patches; 5.2 is also excellent. |
One note worth making: I do recommend you spend $100/mo for Claude and $200/mo for ChatGPT if you want to use either of these models at full horsepower. If you're using a $20/mo plan for either or both, you might find yourself disappointed. And that's because it's only $20; what do you expect?
I do always find it interesting when someone refuses to spend more than $20/mo and then claims that the model sucks. You get what you pay for, and ChatGPT 5.2 Pro absolutely cooks, but you have to pay $200/mo to be in the kitchen with it.
Benchmarks aren't everything: going deeper with Claude Opus 4.5, ChatGPT 5.2, and Gemini 3
Okay, now it's time to give Gemini some love again, because it is a very impressive model. But like I said above, I personally think the choice software engineers face at this exact moment in time is between Claude and ChatGPT when it comes to actually producing code.
But Gemini, well it’s a badass model, and I personally use it for quick planning quite a bit, and with Gemini Deep Research now out in the wild, it can really go deep.
Still, I can't do these three models justice in a comparison, and luckily I don't have to, since Nate Jones did a much better job than I ever would.
Seriously, if there’s one video you watch to compare these three models, this is the one.
Diving into the data: senior engineers accept more agent output than juniors
There's a common misconception about how software engineers (not vibe coders) use AI, and I hear it a lot. The general thinking is that junior engineers now rely on AI a ton while senior engineers are still writing code by hand more often than not.
But that's not what the data shows; in fact, it's the opposite. Here's the data Eric from Cursor shared in late November.

As you can see, junior engineers actually accept the least agent output of any category. This makes sense to me, and here's why: I think most junior engineers (once again, not vibe coders, but actual software engineers with four-year CS degrees) want to know how their code works. When you're just starting out in the software world, the last thing you want is to end up owning a bunch of code you don't understand.
Eric shared a few more nuggets in his tweet from the data they’re seeing:
This tweet has a ton of replies, and there are some interesting ones to read if you have the time. The key takeaway is that while vibe coders are accepting 100% of what an agent outputs, junior software engineers are not.
Parallel Agents - what I’m most excited about in 2026
Since this is my last newsletter of the year, I thought I may as well share what I'm most excited about in 2026. And well, I kinda gave it away: it's parallel agents. I'm seeing more and more super-senior engineers leverage parallel agents, and they're seeing really strong results.
The tweet above shows a pretty jaw-dropping image of 24 Claude Code Opus agents running in parallel. In many ways, this is like spinning up a 24-person engineering team, with each engineer laser-focused on one specific task or area of expertise.
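If you're curious what the mechanics can look like, here's a minimal shell sketch of the fan-out pattern behind setups like this: each task gets its own isolated checkout so agents can't clobber each other's files, and everything runs as background jobs. The task names are made up, and the actual agent invocation is stubbed out with an echo (the `git worktree` and agent CLI lines are commented placeholders, not a prescribed setup).

```shell
#!/bin/sh
# Hypothetical sketch of the parallel-agent fan-out pattern.
# Each task would get its own git worktree so agents stay isolated;
# the agent CLI itself is stubbed with an echo for illustration.
tasks="fix-auth-bug add-rate-limiter update-docs"
log=$(mktemp)

for task in $tasks; do
  (
    # A real setup might look like (assumption, commented out):
    #   git worktree add "../wt-$task" -b "agent/$task"
    #   cd "../wt-$task" && <your-agent-cli> "work on $task"
    echo "agent started: $task" >> "$log"
  ) &   # run each agent in its own background subshell
done
wait    # block until every agent process has exited
echo "all $(wc -l < "$log" | tr -d ' ') agents finished"
```

The key design choice is the one-worktree-per-agent isolation: the `&`/`wait` part is trivial, but without separate checkouts, parallel agents editing the same working tree will trip over each other's uncommitted changes.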
At GitHub Universe this year, I went to a really interesting talk about how our role as engineering leaders will change over the next year as our teams become a balance of humans and agents.
A year ago, this would have sounded crazy - now, not so crazy.
If you want to read more about how to get parallel agents working for you, here are a few articles I recommend taking a look at:
Okay, and that's it for the year. As you all know, this newsletter is completely free: I have no sponsors, no ebook to sell you, no course for you to sign up for, and no podcast to pitch you.
I'm an engineer and CTO who is just crazy excited about what's happening in the world of AI Powered Engineering, and I like writing. So here we are, 14 issues in and still going strong.
If you like this newsletter, all I can ask is that you share it with more people who you think would enjoy it.
Thanks for reading, happy holidays, and I’ll see you next year!

Secret Bonus Video 👀
This section has been a hit - it’s a fun reward for people who read the whole thing so I’m just going to keep it rocking, enjoy!