Google makes its first public release of its text-to-image AI model, Imagen

Google Imagen produces images from text prompts, much like DALL-E 2. However, by announcing that only a very limited form of Imagen will be added to its AI Test Kitchen app, Google is still keeping the full model out of the public’s hands.

What is “Imagen”?

Imagen is a diffusion model that converts text to images. Users simply input text, and the system uses a frozen text encoder (one whose weights stay fixed while the image model is trained) to turn the text into embeddings, which are then mapped into images. Imagen was developed by Google Brain, Google’s deep-learning artificial intelligence research team. Thanks to its deep understanding of language, Imagen can produce high-resolution, photorealistic images.
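To make the flow described above more concrete, here is a toy sketch in Python. It is not Imagen's code and not how Google implements anything: the "encoder" is a hash function, the "denoiser" is a hand-written stand-in for a trained network, and every name and size (frozen_text_encoder, toy_denoiser, generate, EMBED_DIM, IMAGE_SIZE) is an assumption made up for illustration. It only shows the order of operations: text goes through a fixed encoder, the resulting embedding conditions a denoiser, and an image emerges from noise over repeated steps.

```python
import hashlib
import random
from typing import List

EMBED_DIM = 8   # hypothetical embedding size, chosen for illustration
IMAGE_SIZE = 4  # hypothetical 4x4 "image", stored as 16 flat values


def frozen_text_encoder(prompt: str) -> List[float]:
    """Stand-in for a frozen (non-trainable) text encoder.

    Deterministically maps a prompt to a fixed-length vector in [0, 1).
    A real system would use a large pretrained language model here.
    """
    digest = hashlib.sha256(prompt.encode("utf-8")).digest()
    return [digest[i] / 256 for i in range(EMBED_DIM)]


def toy_denoiser(noisy: List[float], embedding: List[float]) -> List[float]:
    """Stand-in for the learned denoising network.

    Returns a guess at the clean image given the text embedding. In this
    toy, the "clean image" is just the embedding tiled across the pixels.
    """
    return [embedding[i % EMBED_DIM] for i in range(len(noisy))]


def generate(prompt: str, steps: int = 20, seed: int = 0) -> List[float]:
    """Start from random noise and move toward the denoiser's guess."""
    rng = random.Random(seed)
    embedding = frozen_text_encoder(prompt)
    image = [rng.gauss(0.0, 1.0) for _ in range(IMAGE_SIZE * IMAGE_SIZE)]
    for step in range(steps):
        predicted_clean = toy_denoiser(image, embedding)
        # Blend toward the prediction; later steps trust it more, and the
        # final step (weight 1.0) lands exactly on the denoiser's guess.
        weight = (step + 1) / steps
        image = [(1 - weight) * x + weight * p
                 for x, p in zip(image, predicted_clean)]
    return image


if __name__ == "__main__":
    pixels = generate("a city decorated for Christmas")
    print(len(pixels), "pixel values generated")
```

In a real diffusion model the denoiser is a large neural network trained on image and text pairs, and the blending follows a carefully designed noise schedule; the loop above only mirrors the overall shape of that process.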

Interactions with Imagen

City Dreamer

Inside City Dreamer, users can build an AI-generated city around whatever theme they like. For example, if they enter “Christmas”, Imagen creates a city whose sample buildings and infrastructure carry Christmas-related elements. The result is similar to what you would see in SimCity.

Wobble

In Wobble, users create a monster by selecting the material it is made of, such as clay, felt, marzipan or rubber, along with an outfit of their choice. Imagen generates the monster, and users can then name it and interact with it.
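The article does not say how Wobble turns these selections into a request for Imagen. One plausible approach, sketched below purely as an assumption, is to assemble the prompt from fixed menu choices rather than free text. The material options come from the description above; the outfit options, the prompt wording and the function name build_monster_prompt are hypothetical.

```python
# Material options named in the article.
ALLOWED_MATERIALS = ("clay", "felt", "marzipan", "rubber")
# Hypothetical outfit options, assumed here only for illustration.
ALLOWED_OUTFITS = ("raincoat", "tuxedo", "scarf")


def build_monster_prompt(material: str, outfit: str) -> str:
    """Build a text prompt from menu choices instead of free-form input."""
    material = material.strip().lower()
    outfit = outfit.strip().lower()
    if material not in ALLOWED_MATERIALS:
        raise ValueError(f"unsupported material: {material!r}")
    if outfit not in ALLOWED_OUTFITS:
        raise ValueError(f"unsupported outfit: {outfit!r}")
    # The prompt wording is an assumption, not Google's actual template.
    return f"a friendly monster made of {material}, wearing a {outfit}"


if __name__ == "__main__":
    print(build_monster_prompt("Marzipan", "scarf"))
```

Restricting input to a known set of options is one simple way for an app to control what the model is asked to draw.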

Limitations of Imagen

AI Test Kitchen was launched earlier this year to let Google beta test some of its AI technologies. The version of Imagen added to the AI Test Kitchen app comes with constraints: compared with DALL-E’s public beta, it offers little freedom, because users cannot request whatever they like. Josh Woodward, senior director of product management at Google, explained that the aim is to collect early feedback on the technology and find out how people will break it. For that reason, Google tightly controls how users interact with the system.

Challenges of Imagen

1. Misuse of the technology

If Imagen becomes a commercial product with full public access in the future, some users may abuse it if they are given total freedom and control over what it generates.

2. Reliance on web-scraped datasets

With the surging demand for data to train text-to-image AI models, tech companies rely heavily on datasets scraped from the web, which are often biased and unfiltered. Such data may negatively affect the images the AI model produces.

Whether Imagen will be developed into an AI model for public use is still under discussion. In the meantime, users can try it through City Dreamer and Wobble in AI Test Kitchen, which is free to download on Android and iOS.

References: The Verge, DPReview
