Google DeepMind Sets Ambitious Goals with New AI Team to Simulate the Physical World
2 days ago
Google is setting out to develop advanced AI models capable of simulating the physical world. The initiative will be led by Tim Brooks, who co-led development of OpenAI’s video generator, Sora, before joining Google DeepMind in October. Brooks announced his role and the new team’s mission in a recent post on X, signaling Google’s commitment to advancing generative AI technologies.
Brooks revealed DeepMind’s ambitious plans to create massive generative models that simulate the complexities of the real world. His new team, which sits within DeepMind, is actively recruiting for the effort. According to job listings shared by Brooks, the team will collaborate closely with Google’s Gemini, Veo, and Genie teams. It aims to address critical open problems and scale AI models to unprecedented levels of compute.
Gemini is Google’s flagship AI model suite, used for tasks such as image analysis and text generation. Veo is Google’s proprietary video generation model, and Genie is Google’s world model, designed to simulate games and 3D environments in real time. The latest Genie model, unveiled in December, can generate diverse, fully playable 3D worlds, pushing the boundaries of interactive media.
Brooks emphasizes the critical role of scaling AI training on video and multimodal data in the journey toward Artificial General Intelligence (AGI). AGI, a long-standing aspiration in the field, refers to AI systems capable of performing any task a human can. According to Brooks, world models will drive advancements in various domains, including visual reasoning, simulation, planning for embodied agents, and real-time interactive entertainment. His team is tasked with developing tools for real-time interactive generation and exploring the integration of its models with multimodal systems like Gemini.
The potential of world models extends far beyond entertainment. Several tech companies and startups, including Fei-Fei Li’s World Labs, Israeli upstart Decart, and Odyssey, are racing to build them. They envision applications ranging from interactive video games and movies to realistic simulation environments for training robots. However, the advent of world models has raised significant concerns within the creative industry. Game studios like Activision Blizzard have reportedly used AI to cut costs, increase productivity, and address staffing challenges, amid widespread layoffs. A 2024 study commissioned by the Animation Guild predicts that over 100,000 jobs in the U.S. film, television, and animation sectors will be disrupted by AI technologies by 2026.
Despite these concerns, some startups, like Odyssey, have pledged to collaborate with creative professionals rather than replace them. Whether Google will adopt a similar approach remains uncertain, but the implications for the creative and tech industries are profound.
The issue of copyright presents another significant hurdle for world models. Some of these models are reportedly trained on snippets of video game playthroughs, potentially exposing developers to legal risks if unlicensed content is used. Google, which owns YouTube, asserts that it has the right to train its AI models on videos available on its platform under YouTube’s terms of service. However, the company has not clarified which videos it uses for training, leaving questions about copyright compliance.
The world watches closely as Google DeepMind assembles its new team and sets its sights on these ambitious goals. Generative AI models capable of simulating the physical world would represent a technological leap forward, but they also pose critical ethical, legal, and societal challenges that must be navigated with care.