2025-03-26 08:12:00
www.gsmarena.com
Released about a year ago, OpenAI’s GPT-4o has been refined and improved with new features. The latest is Image Generation – the AI model can generate high-quality, detailed images and can follow your natural language instructions to modify them until you get just the image you were picturing in your head.
You know how older AI models struggled with text – if you asked them to generate a sign, at best you got a sign with gibberish words, and at worst you got squiggles that weren’t even letters. But check this out:
GPT-4o can create images with perfectly legible text
Image generation typically starts with a text prompt, and you refine the image by rewriting that prompt. GPT-4o works differently – you ask it for an image, tell it what to change, then ask for further changes and so on until you get the result you want. Here are some examples:
Generating and modifying an image through plain English
You can follow the Source link below to examine the prompts that created these images. Note that OpenAI did some cherry-picking – a lot of the images are “best of 2” or even “best of 8”, meaning the model needed a few tries to get it right. Still, the results look quite impressive and the UI is as simple as it gets.
Here is another example. GPT-4o can start from scratch or it can modify an image you give it. Here, the user gives it a photo of a cat and asks the AI to add a detective hat and monocle. The user then keeps refining the image, turning it into something that could pass for a screenshot from an RPG.
Prototyping a cat detective RPG
You can start with multiple images too and integrate elements from each image into the final result. OpenAI says that GPT-4o is great at following detailed instructions – it can manipulate 10-20 different objects in a scene without getting tripped up (other AI models can only handle 5-8 objects, says the company).
GPT-4o is not perfect, and OpenAI is the first to admit it. It sometimes crops images at the bottom, hallucinations are still an issue, scenes with more than 10-20 objects can trip it up, and rendering text in non-Latin scripts still needs work, among other limitations.
Examples of GPT-4o getting it wrong
Finally, here are some video demonstrations showing off GPT-4o’s new image generation skills: