
Google releases Gemini Robotics On-Device, its first local robot model
Google has unveiled Gemini Robotics On-Device, its first solution to combine computer vision, language understanding and physical action in a single local package, freeing robots from constant dependence on cloud computing!
What sets the new model apart is its versatility: it works with both humanoid platforms and industrial dual-arm manipulators. The system can also carry out demanding bimanual operations, from handling small objects to assembling structures and moving items around.
Learning efficiency is just as impressive: the model needs only about 100 demonstrations to master a new task! Although the system was initially trained only on ALOHA robot data with natural-language instructions, it was able to transfer that knowledge to diverse robotic platforms.
Alongside the model, Google released the Gemini Robotics SDK, a toolkit that lets developers adapt the model to their own tasks.
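To make that workflow a little more concrete, here is a minimal sketch of how a small set of teleoperated demonstrations (on the order of 100 episodes) might be organized and serialized for task-specific fine-tuning. Every name and the JSONL layout below are illustrative assumptions, not the actual Gemini Robotics SDK API.

```python
from dataclasses import dataclass, field
from typing import List
import json

# Hypothetical structures, NOT the real Gemini Robotics SDK API:
# a sketch of how teleoperated demonstrations paired with language
# instructions could be packaged for fine-tuning an on-device model.

@dataclass
class Step:
    """One timestep of a demonstration: what the robot saw and did."""
    camera_frames: List[str]       # paths to RGB images from the robot's cameras
    joint_positions: List[float]   # proprioceptive state (e.g. 14 values for a bi-arm setup)
    action: List[float]            # commanded joint targets for this step

@dataclass
class Demonstration:
    """A single teleoperated episode paired with a language instruction."""
    instruction: str               # e.g. "fold the towel in half"
    steps: List[Step] = field(default_factory=list)

def build_finetune_set(demos: List[Demonstration], out_path: str) -> None:
    """Serialize ~100 demonstrations into a JSONL file a fine-tuning job could consume."""
    with open(out_path, "w") as f:
        for demo in demos:
            record = {
                "instruction": demo.instruction,
                "steps": [
                    {
                        "camera_frames": s.camera_frames,
                        "joint_positions": s.joint_positions,
                        "action": s.action,
                    }
                    for s in demo.steps
                ],
            }
            f.write(json.dumps(record) + "\n")

if __name__ == "__main__":
    demo = Demonstration(
        instruction="place the block in the bowl",
        steps=[Step(camera_frames=["frame_0000.png"],
                    joint_positions=[0.0] * 14,
                    action=[0.01] * 14)],
    )
    build_finetune_set([demo], "demos.jsonl")
```

The point of the sketch is simply that each demonstration pairs a short language instruction with a sequence of observations and actions; with only about a hundred such episodes, the model is adapted to a new task.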
Fully local operation opens up big possibilities for robots working where connectivity is unstable or where response latency must be minimal. This could be the start of a new era of truly independent robots!