Google makes the first public release of its text-to-image AI model Imagen

Google Imagen produces images from text, much like DALL-E 2.  However, even with the announcement that a very limited form of Imagen is being added to its AI Test Kitchen app, Google is still keeping the AI model largely out of the public's hands.

What is “Imagen”?

Imagen is a diffusion model that converts text to images.  Users simply input text, and the AI system uses a frozen text encoder to turn the text into embeddings, which are then mapped into images.  Imagen was developed by Google Brain, Google's deep-learning artificial intelligence research team.  With a deep understanding of language, Imagen can produce high-resolution, photorealistic images.
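To make the "text, then embeddings, then image" flow more concrete, here is a toy Python sketch.  It is not Imagen's code and calls no Google API.  Every name in it (frozen_text_encoder, target_from_embedding, generate) is hypothetical, the "encoder" is just a hash of the prompt, and a simple pull toward a fixed target stands in for the trained neural network that a real diffusion model uses to remove noise.

```python
import hashlib
import random
from typing import List

EMBED_DIM = 8      # hypothetical embedding size, chosen only for illustration
IMAGE_PIXELS = 16  # hypothetical 4x4 grayscale "image"


def frozen_text_encoder(prompt: str) -> List[float]:
    """Stand-in for a frozen text encoder.

    "Frozen" means its weights never change, so the same text always maps
    to the same embedding. A hash is used here purely as a deterministic
    placeholder; a real encoder is a large pretrained language model.
    """
    digest = hashlib.sha256(prompt.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:EMBED_DIM]]


def target_from_embedding(embedding: List[float]) -> List[float]:
    """Toy replacement for what a trained denoiser has learned:
    the image that this embedding should steer generation toward."""
    return [embedding[i % EMBED_DIM] for i in range(IMAGE_PIXELS)]


def generate(prompt: str, steps: int = 50, seed: int = 0) -> List[float]:
    """Start from random noise and refine it step by step,
    conditioned on the text embedding."""
    rng = random.Random(seed)
    embedding = frozen_text_encoder(prompt)
    target = target_from_embedding(embedding)

    # Pure Gaussian noise is the starting point.
    image = [rng.gauss(0.0, 1.0) for _ in range(IMAGE_PIXELS)]

    for _ in range(steps):
        # Each step removes a fraction of the remaining noise.
        image = [x + 0.2 * (t - x) for x, t in zip(image, target)]
    return image


if __name__ == "__main__":
    pixels = generate("a Christmas-themed city")
    print([round(p, 3) for p in pixels])
```

The point of the sketch is the shape of the pipeline: the text is encoded once by a component that is never retrained, and the image emerges gradually from noise under the guidance of that encoding.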

Interactions with Imagen

City Dreamer

In City Dreamer, users can build an AI-generated city around any theme they like.  For example, if they enter “Christmas”, Imagen will create a city whose sample buildings and infrastructure carry Christmas-related elements.  The result resembles what you would see in SimCity.

Wobble

In Wobble, users create a monster by selecting the material it is made of, such as clay, felt, marzipan or rubber, and an outfit for it to wear.  Imagen generates the monster, and users can then name it and interact with it.

Limitations of Imagen

The AI Test Kitchen was launched earlier this year to let Google beta test some of its AI technologies.  The version of Imagen added to the AI Test Kitchen app comes with constraints.  Compared to DALL-E’s public beta, it offers little freedom, as users cannot request whatever they like.  Josh Woodward, senior director of product management at Google, explained that the aim is to collect early feedback on the technology and find out how people will break it.  For that reason, Google controls how users interact with the system.

Challenges of Imagen

1. Misuse of the technology

If Imagen becomes a commercial product with full public access in the future, some users may abuse it if they are given total freedom and control.

2. Reliance on datasets

With demand for training data surging, tech companies largely train text-to-image AI models on datasets scraped from the web, which are biased and unsorted.  Such data may negatively affect the images the AI model produces.

Whether Imagen will be developed into an AI model for public use is still under discussion.  In the meantime, users can be among the first to try it through City Dreamer and Wobble in the AI Test Kitchen app, which is free to download on Android and iOS.

References: The Verge, DPReview
