Announcement_1 | Xingchen Wan

We present VISTA and Maestro, two self-improving multimodal generation agents for text-to-video and text-to-image generation, respectively.

🚨 Google just dropped the most advanced self-improving video AI ever built.

It’s called VISTA, and it literally rewrites its own prompts to make every new generation better than the last.

No retraining. No fine-tuning. Just pure test-time self-reflection.

Here’s how it works:… pic.twitter.com/WyhX9uur9l
— Louis Gleeson (@aigleeson) October 24, 2025