I’m going through several examples that compare OpenAI’s new flagship model to their old one because I keep hearing and seeing so much negative feedback.
On one hand, I can’t really blame people for being upset. The marketing for GPT-5 was a little (read: a lot) out of control, so anything short of perfect AGI or ASI (or whatever the next level might be; AMI, artificial multiverse intelligence?) would be a letdown. From that angle, it has been.
On the other hand, GPT-5 solves a lot of problems that people have been reporting. Hopefully, some of these comparisons can be useful for your day-to-day use of ChatGPT.
As you saw in the title, this is another reasoning example. One of the main things it illustrates is hitting an exact word count. Like the other examples, this comparison is fully generated by ChatGPT (I want to see what it says it does so I can compare that to my tests of what it actually does):
"Write a short, coherent story in exactly 80 words about a time traveler who visits ancient Egypt, must hide their identity, and discovers a personal connection to the Pharaoh. Include a twist at the end."
The traveler stepped into ancient Egypt, cloaked in local garments to blend in.
Wandering the markets, they caught sight of the Pharaoh, whose face mirrored their own.
Heart racing, they avoided notice, fearful of changing history.
In secret, they learned the Pharaoh’s habits and kindness.
Only later did they realize a carving in the palace depicted them beside the ruler — as an equal, not a stranger.
Disguised in linen, the traveler slipped through Thebes unnoticed.
The Pharaoh’s gaze caught theirs — the same eyes, the same scar above the brow.
Alarmed, they posed as a humble scribe, quietly chronicling the court.
Whispers hinted the ruler kept a mysterious twin hidden from the world.
One moonlit night, the Pharaoh revealed the truth:
“You vanish tomorrow, as you did once before… yet you always return.”
History, it seemed, remembered them.
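If you want to check an "exactly N words" claim yourself instead of trusting what the model says, a few lines of Python are enough. This is only a minimal sketch, not how ChatGPT counts internally. The helper names `count_words` and `check_word_count` are mine, and the definition of a "word" is an assumption: runs of letters or digits, with hyphenated words and words containing an apostrophe counted as one, and free-standing dashes or ellipses not counted at all.

```python
import re

# Assumption: a "word" is a run of letters/digits, optionally joined by
# an apostrophe (straight or curly) or a hyphen, e.g. "Pharaoh’s", "day-to-day".
WORD_PATTERN = re.compile(r"[^\W_]+(?:['’\-][^\W_]+)*")


def count_words(text: str) -> int:
    """Count words in text, ignoring punctuation that stands on its own."""
    return len(WORD_PATTERN.findall(text))


def check_word_count(text: str, target: int) -> tuple[int, int]:
    """Return (actual count, difference from target); 0 difference means exact."""
    actual = count_words(text)
    return actual, actual - target


if __name__ == "__main__":
    # One line from the second story above, used only as sample input.
    sample = "“You vanish tomorrow, as you did once before… yet you always return.”"
    actual, diff = check_word_count(sample, target=80)
    print(f"words: {actual}, off by: {diff}")
```

Paste a full response into `sample` and set `target` to the number you asked for. A plain `text.split()` would also work for rough checks, but it counts a dash surrounded by spaces as a word, which is why the sketch uses a pattern instead.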