November 2, 2024

TechNewsInsight


It has been pointed out that the reason image-generation AI struggles with “text output” resembles “mysterious kanji tattoos drawn by artists who can't read Japanese” – GIGAZINE

When using image-generation AI such as Stable Diffusion or DALL-E 3, you tend to run into problems like “vague patterns are output instead of letters” and “even short words come out spelled differently.” A discussion about why image-generation AI is so bad at “text output” heated up on the social news site Hacker News.

Ask HN: Why can't image generation models spell? | Hacker News
https://news.ycombinator.com/item?id=39727376

Here is an example of asking an image-generation AI to produce an image that contains text. Giving Image Creator, which is powered by DALL-E 3, the prompt “An image of the exterior of a ramen shop with the name ‘Ramen Fantasy’ written on it” did not yield the phrase “Ramen Fantasy.” Instead of the correct spelling, the misspelled word “RAIMEN” and puzzling kanji-like patterns were output.


Japanese characters in the prompt appear to be translated into English before processing, so the prompt was changed to “Image of the exterior of a ramen shop with ‘Ramen Eater’ written on it” to create an image containing English words. The generated result is below: “Eater” came out as “eater.”
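For anyone who wants to reproduce this kind of test programmatically, the following is a minimal sketch that assumes the OpenAI Python SDK and its DALL-E 3 image endpoint; the article itself used Image Creator, so the client, model name, and parameters here are assumptions rather than what GIGAZINE actually ran.

```python
# Minimal sketch: reproducing the "text in image" test via the OpenAI Images API.
# Assumptions: the `openai` package is installed and OPENAI_API_KEY is set in the
# environment; the article used Image Creator, so this is only an approximation.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

result = client.images.generate(
    model="dall-e-3",
    prompt='An image of the exterior of a ramen shop with "Ramen Eater" written on it',
    size="1024x1024",
    n=1,
)

# The API returns a URL to the generated image; the signage in the result often
# comes back misspelled, which is the behavior discussed in the Hacker News thread.
print(result.data[0].url)
```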


The problem of image-generation AI being unable to output text correctly seems to bother users around the world. A post on the social news site Hacker News that asked, “I wanted to create an image that included my son's name, but the name was misspelled, even though it is only five letters long. Why do image-generation models misspell words?” drew many comments.

Gwern Branwen, who is well versed in AI, cited reasons why image-generation AI is bad at rendering characters, such as “many image-generation models have not learned text well” and “prompts are encoded in a way that does not take character output into account.”
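The second point is about prompt encoding: text-to-image models typically receive subword token IDs rather than individual letters, so a word's spelling is never explicitly represented. The sketch below illustrates this with the tiktoken library, used here purely as an example of a BPE-style tokenizer and not as the actual encoder of DALL-E 3 or Stable Diffusion.

```python
# Illustration of why models don't "see" spelling: BPE-style tokenizers split a
# prompt into subword chunks, not letters. tiktoken is used only as an example
# of such a tokenizer, not as the encoder of any particular image model.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

prompt = "Ramen shop sign that says Ramen Fantasy"
token_ids = enc.encode(prompt)

# Decode each token separately to show the chunks the model actually receives.
for tid in token_ids:
    print(tid, repr(enc.decode([tid])))

# The output groups letters into chunks such as " Ramen" or " Fantasy", so the
# model never receives an explicit letter-by-letter spelling of the word.
```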


In addition, Barking Cat took the position that “the training data for image-generation AI does not include enough textual information,” explaining it with the analogy that “if an English-speaking tattoo artist who does not know Japanese at all carves a tattoo that includes kanji, they may know what kanji look like, but because they do not know how to write them, they can end up drawing funny tattoos.”

(Photo by Pablo Manriquez)

Developers of image-generation AI models are also aware of the problem of being unable to render text well, and research and development to improve generation accuracy is under way. For example, “Stable Diffusion 3,” announced in February 2024, is promoted for its ability to output text accurately.
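As a rough sketch of trying a newer model on this task yourself, the following assumes the Hugging Face diffusers library and the publicly released Stable Diffusion 3 Medium checkpoint; the model ID, sampler settings, and prompt are assumptions for illustration, not details from the article.

```python
# Rough sketch: generating a text-bearing image with Stable Diffusion 3 via
# Hugging Face diffusers. Assumptions: a diffusers version with SD3 support is
# installed, the model license has been accepted, and a CUDA GPU is available.
import torch
from diffusers import StableDiffusion3Pipeline

pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers",
    torch_dtype=torch.float16,
).to("cuda")

image = pipe(
    prompt='A ramen shop exterior with a sign that reads "Ramen Fantasy"',
    num_inference_steps=28,
    guidance_scale=7.0,
).images[0]

image.save("ramen_fantasy.png")
```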

High-quality image-generation AI “Stable Diffusion 3” announced, achieving high precision in “rendering specified text” and “drawing multiple subjects,” areas where image-generation AI has been weak – GIGAZINE

