NVIDIA Cosmos 3: First Open Physical AI Omnimodel Cuts Training Cycles to Days
Krasa AI
2026-06-04
NVIDIA unveiled Cosmos 3 at Computex 2026 — the world's first fully open foundation model that combines vision reasoning, world generation, and action prediction in a single system. Jensen Huang positioned it as the missing link for the next wave of robots, autonomous vehicles, and embodied agents. The model is free to download, comes in two sizes today with a third on the way, and ships alongside a new industry coalition of robotics and world-model labs.
The launch lands as physical AI moves from research demos to commercial deployment. The companies building humanoid robots, warehouse fleets, and self-driving systems all need the same thing: a foundation model that understands the physical world well enough to generate training data, predict outcomes, and drive policy decisions. Until now, every team has been building its own.
Why this matters
Training a robot in the real world is slow and expensive. Every new task, environment, or hardware revision means another round of data collection, simulation tuning, and policy training that can run into months. Synthetic data — video and sensor traces generated by AI — has been the obvious shortcut, but the quality has been mixed and the tools fragmented.
Cosmos 3 takes a swing at unifying that stack. The same model generates physically plausible video, predicts what an actuator should do next, and reasons over visual scenes in language. NVIDIA claims it can cut physical AI training and evaluation cycles "from months to days." For a category racing to commercial scale, that compression is the entire game.
The other shift is openness. Cosmos 3 is released as an open omnimodel, not gated behind an API. Robotics teams can fine-tune it on their own hardware, inspect the weights, and ship derivative models. That puts it in direct competition with the closed physical-AI stacks emerging from Google DeepMind, Tesla, and Figure — and it gives independent labs a credible base model to build on for the first time.
What was announced
Cosmos 3 is built on a mixture-of-transformers architecture that natively handles text, images, video, ambient sound, and actions. That single set of weights does three things: it generates synthetic worlds (the simulation half), it predicts actions given a goal (the policy half), and it reasons about what's happening in a scene (the vision half). Previous Cosmos releases split these capabilities across separate models — Cosmos 3 fuses them.
Two sizes shipped at launch. Cosmos 3 Super is a 32-billion-parameter model targeted at training pipelines and offline synthetic data generation. Cosmos 3 Nano is an 8-billion-parameter model for smaller compute budgets and faster iteration. A third variant, Cosmos 3 Edge, is coming for real-time inference at the edge — meaning the model can run on the robot itself rather than streaming back to a data center.
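To make the size gap concrete, here is a back-of-the-envelope sketch of how much memory the weights alone would occupy at the two announced parameter counts. The precisions listed (fp32, bf16, int8) are illustrative assumptions, not formats the announcement confirms, and the function names are hypothetical. The estimate ignores activations, caches, and optimizer state, so real requirements will be higher.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# Parameter counts come from the article; the precisions are assumptions
# chosen for illustration, not confirmed release formats.

MODELS_BILLIONS = {"Cosmos 3 Super": 32, "Cosmos 3 Nano": 8}
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "int8": 1}  # assumed storage formats


def weight_memory_gb(params_billions: float, bytes_per_param: float) -> float:
    """Memory for the weights only, in GB (1 GB = 1e9 bytes).

    Billions of parameters times bytes per parameter gives gigabytes directly.
    """
    return params_billions * bytes_per_param


def footprint_table() -> dict:
    """Weight footprint for every (model, precision) pair."""
    return {
        name: {p: weight_memory_gb(n, b) for p, b in BYTES_PER_PARAM.items()}
        for name, n in MODELS_BILLIONS.items()
    }


if __name__ == "__main__":
    for name, row in footprint_table().items():
        cells = ", ".join(f"{p}: {gb:.0f} GB" for p, gb in row.items())
        print(f"{name}: {cells}")
```

Under these assumptions the 32-billion-parameter model needs four times the weight memory of the 8-billion-parameter one at any given precision, which is consistent with the article's framing of Nano as the option for smaller compute budgets.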
NVIDIA also launched the Cosmos Coalition, a group of world-model labs and robotics companies committed to building openly on the platform. Founding members include Agile Robots, Black Forest Labs, Generalist, LTX, Runway, and Skild AI. The coalition is structured to share fine-tuned models, evaluation benchmarks, and training data across members — the kind of cooperative pattern that Hugging Face brought to language models and that physical AI has mostly lacked.
Industry impact
The robotics and autonomous-vehicle stack just got a credible open base model, which changes the calculus for every team in the space. Smaller labs that couldn't afford to train a foundation world model now get one. Larger players who built closed stacks have to decide whether to keep investing or fine-tune Cosmos 3 instead.
NVIDIA's commercial angle is straightforward: the more physical AI training happens, the more H100s, Blackwells, and Vera Rubins it sells. Opening the model is the loss leader. The infrastructure it runs on is the business. The Cosmos Coalition members are also major NVIDIA hardware customers: Skild AI, Agile Robots, and Black Forest Labs all train on NVIDIA GPUs.
The competitive read for Google, Meta, and Tesla is more complicated. Each has invested heavily in its own physical world models. Cosmos 3 doesn't render those efforts obsolete, but it does mean the proprietary moat just got narrower. If Cosmos 3 becomes the default base for academic robotics research, the talent pipeline starts to flow toward NVIDIA's stack as a matter of course.
Expert perspectives
Industry analysts at Computex flagged the omnimodel framing as the headline. Most foundation models for robotics focus either on perception or on action; Cosmos 3 is the first widely available release that treats them as one problem. The mixture-of-transformers design also signals where NVIDIA thinks scaling laws for physical AI are heading: bigger unified models, not specialist pipelines stitched together.
Reactions from robotics teams on X were mostly enthusiastic about the open weights, with the caveat that real-world performance on specific hardware platforms is what will matter. Cosmos 1 and Cosmos 2 were strong on benchmarks but uneven in actual fleet deployments. Cosmos 3's real test is whether the Coalition members ship products that work.
What's next
Cosmos 3 Super and Cosmos 3 Nano are available now for download under an open license. The Coalition is open to additional partners, with NVIDIA signaling that more labs will join in the coming months. Cosmos 3 Edge is "coming soon" without a specific date, but it's the variant that matters most for shipped robotics products — real-time inference on-device is what gets the model from training pipelines into deployed systems.
Watch for fine-tuned variants from Coalition members over the next few months. Black Forest Labs has hinted at a video-focused fork, and Skild AI is expected to ship a humanoid-control derivative. The pattern Cosmos 3 sets — open base model, fine-tunes from specialists, shared evaluation — is what NVIDIA needs to establish before the closed competitors release their own platforms.
Bottom line
Cosmos 3 is the most credible swing yet at a single foundation model for physical AI, and the open release plus a multi-lab coalition gives it more momentum than any prior physical-AI launch. If you're building robots, autonomous vehicles, or any embodied AI product, this is the new default base model to evaluate against. If you're holding a closed alternative, the timeline to prove differentiation just shortened.