ChatGPT’s new picture mannequin turned my article into handwriting
Abstract created by Sensible Solutions AI
In abstract:
- PCWorld examined ChatGPT’s new Pictures 2.0 mannequin, which demonstrates outstanding accuracy in rendering textual content inside AI-generated pictures, together with handwritten kinds.
- The upgraded mannequin is now out there to all customers and introduces enhanced capabilities like internet looking out, infographic creation, and multi-language help together with non-Latin scripts.
- Pictures 2.0’s improved textual content rendering opens sensible purposes for creating catalogs, storyboards, and detailed technical documentation with good textual accuracy.
Picture-generation fashions have a protracted historical past of bungling textual content. However whereas garbled letters was once a transparent AI inform, ChatGPT’s new image-generation device is one of the best I’ve ever seen at rendering textual content.
I requested ChatGPT’s Pictures 2.0 mannequin (out there now to all ChatGPT customers, together with these on the free tier) to take some textual content from a current story of mine and render it in pencil on a yellow authorized pad and, properly, it appears just about good to me:
Ben Patterson/Foundry
I additionally prompted it to create an infographic about AI tokens, instructing it first to look the net for correct data and to make use of a serif font in a panorama 3:2 side ratio. Right here’s what I acquired:

Ben Patterson/Foundry
Then I tasked Pictures 2.0 with creating one other infographic, this time detailing the varied Raspberry Pi fashions full with specs and different particulars:

Ben Patterson/Foundry
Lastly, I requested the mannequin to take a snapshot of me poolside and create a summer time lookbook of outfits, starring me:

Ben Patterson/Foundry
OpenAI says Pictures 2.0 is its first image-generation mannequin with “pondering” capabilities, that means it will possibly cease and ponder a picture immediate earlier than diving proper in.
With regards to textual content, Pictures 2.0 helps a wide range of languages, together with Japanese, Korean, Chinese language, Hindi, Bengali, and others that make use of non-Latin textual content.
It may possibly additionally search the net for real-time data earlier than rendering pictures, in addition to create a number of pictures in a single shot, good for rendering catalog pictures, comicbook-style panels, and storyboards.
OpenAI guarantees that Pictures 2.0 will ship an “unprecedented stage of specificity and constancy,” that means (hopefully) that it’ll do a greater job at immediate adherence–that’s, creating pictures that observe your prompts to the letter.
With this stage of accuracy, Pictures 2.0 might provide a solution to the query I’ve lengthy requested about image-generating fashions: What are they good for, except for creating goofy memes or creepy deepfakes? What’s the precise, sensible software?
Close to-instant typesetting, infographic creation, and catalog rendering may very well be a number of the options, though fixing a typo would require utterly re-rendering the picture.
It’s additionally doable that the extra you experiment with Pictures 2.0 (I’ve solely been enjoying with it for an hour or so), the extra the rendered pictures could look same-y, which is why you’d seemingly want a talented human prompter with an eye fixed for design on the helm.

