Google Cloud launches Google Veo and Imagen 3

Google Cloud announced that it has begun offering Google Veo, its latest video generation model. In addition, the high-quality image generation model Imagen 3 is being released to Google Cloud users on Vertex AI.

Now in private preview on Vertex AI, Veo makes it easy for businesses to generate high-quality videos from simple text and image prompts. Google Cloud is the first hyperscaler to offer an image-to-video model, which helps businesses transform their existing creative assets into dynamic visuals. The technology opens up new possibilities for creative expression and streamlines video production workflows.

Imagen 3 will be available to all Vertex AI users starting next week. It generates Google's most realistic, highest-quality images to date from simple text prompts, surpassing previous versions of Imagen in detail, lighting and artifact reduction. Businesses can create high-quality images that reflect their own brand style and logos for use in marketing, advertising and product design.

Vertex AI provides an integrated platform that makes it easy to customize these models, evaluate their performance, and deploy them on any major infrastructure. In line with Google’s AI principles, the development and deployment of Veo and Imagen 3 on Vertex AI prioritize safety and responsibility, incorporating safeguards such as watermarking, safety filters, and data governance.

Google Veo: The most powerful video generation model now available in Vertex AI

Developed by Google DeepMind, Veo quickly generates high-quality, high-resolution videos from text or image prompts in a variety of cinematic and visual styles. With a deep understanding of natural language and visual semantics, it generates videos that closely match the prompts. Veo on Vertex AI creates coherent footage in which people, animals and objects move naturally throughout the shot.

Below are examples of video generation with Veo on Vertex AI.

Image to video

Veo generates videos from existing or AI-generated images. Below is an example of how Veo uses images generated with Imagen 3 (top two) and real images (bottom two) to create short video clips.

[Figure: 241204_GoogleCloud_02]

Text to video

Below is an example of how Veo uses text to create short video clips.

[Figure: 241204_GoogleCloud_0]

Veo on Vertex AI enables businesses to generate high-quality videos from simple text and image prompts, reducing production time, lowering costs, and enabling rapid prototyping and refinement of video content. Veo acts as a companion to human creativity: creators can offload the tedious, repetitive tasks of video production to AI and focus on higher-level creative work.

Customers such as Agoda are using Google AI models including Veo, Gemini and Imagen to streamline production and significantly reduce production time. Whether marketers are crafting social media posts, sales teams are building presentations or creative teams are exploring new concepts, Veo streamlines workflows and opens up new possibilities in visual storytelling.

Imagen 3: Highest quality image generation model generally available on Vertex AI

Imagen 3 is said to be Google’s highest-quality text-to-image model, producing more detailed, realistic and lifelike images than ever before, with significantly fewer visual artifacts than previous models.

Starting next week, all Google Cloud users will have access to Imagen 3 on Vertex AI, which allows them to generate high-resolution images from simple text prompts.

[Figure: 241204_GoogleCloud_04]

[Figure: 241204_GoogleCloud_05]

Additionally, image editing and customization features will become generally available to users on the allowlist, that is, users approved for access.

Imagen 3 provides tools for editing images with text prompts. For example, you can update the background, edit selected parts of an image (mask-based editing), and upscale the image resolution. Below are some examples of editing features.

[Figure: 241204_GoogleCloud_06]
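
For readers unfamiliar with the term, the idea behind mask-based editing can be sketched in a few lines of Python. This is a conceptual illustration only, not Imagen 3's implementation or the Vertex AI API: the function name `apply_masked_edit`, the grayscale pixel lists, and the sample data are all hypothetical. A mask marks which pixels may change; newly generated content is taken only inside the mask, and the original image is preserved everywhere else.

```python
from typing import List

# Hypothetical types for this sketch: a tiny grayscale image as nested lists.
Image = List[List[int]]   # pixel values 0-255
Mask = List[List[int]]    # 1 = region to edit, 0 = region to keep


def apply_masked_edit(original: Image, generated: Image, mask: Mask) -> Image:
    """Take pixels from `generated` where the mask is 1, else keep `original`."""
    if not (len(original) == len(generated) == len(mask)):
        raise ValueError("original, generated and mask must have the same height")
    result: Image = []
    for orig_row, gen_row, mask_row in zip(original, generated, mask):
        if not (len(orig_row) == len(gen_row) == len(mask_row)):
            raise ValueError("rows must have the same width")
        # Per pixel: edited content inside the mask, untouched content outside.
        result.append([g if m else o for o, g, m in zip(orig_row, gen_row, mask_row)])
    return result


# Illustrative data: a 2x3 image, replacement content, and a mask
# selecting the right-hand region for editing.
original = [[10, 10, 10],
            [10, 10, 10]]
generated = [[99, 99, 99],
             [99, 99, 99]]
mask = [[0, 1, 1],
        [0, 0, 1]]

edited = apply_masked_edit(original, generated, mask)
print(edited)  # [[10, 99, 99], [10, 10, 99]]
```

In a real product the replacement content would come from the model, guided by the text prompt, and the blending at the mask boundary would be far more sophisticated; the sketch only shows why a mask keeps the rest of the image unchanged.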

It also allows for custom image generation incorporating brand, style and product features, streamlining the advertising and marketing production process and expanding creative possibilities.

Google: Enterprise Safety and Security

Digital Watermarking
- Using Google DeepMind’s SynthID, Google embeds invisible watermarks in generated images and videos to reduce the risk of misinformation.
Safety Filter
- Veo and Imagen 3 are equipped with safeguards to prevent the generation of harmful content and adhere to Google’s AI principles. Google continues to invest in new technologies.
Data Governance
- Customer data will not be used for training and will only be processed in accordance with customer instructions.
Copyright Indemnification
- Google takes an industry-first approach to mitigating copyright risk for generative AI services.