Killing It with Robots

While much of the world is preoccupied with chatbots and large language models (LLMs), Google DeepMind is in a league of its own. The company is hell-bent on building robots, and it is acing it.

The company has unveiled RoboCat, a self-improving robotic agent that learns to perform a variety of tasks across different robotic arms and generates new training data to improve itself. It can pick up a new task with as few as 100 demonstrations, reducing the need for human-supervised training.

RoboCat is built on Gato (Spanish for “cat”), a generalist agent that processes language, images, and actions. Gato was introduced in May 2022 with the aim of moving beyond text outputs towards a multimodal, multi-embodiment generalist. After a first round of training based on Gato, the researchers put RoboCat through a “self-improvement” training cycle on tasks it had never seen before.

The training process includes:

Obtaining a substantial number of demonstrations, ranging from 100 to 1000, for a new task or robot by utilising a human-controlled robotic arm.
Employing the collected demonstrations to fine-tune RoboCat specifically for this new task or arm, resulting in the development of a specialised spin-off agent.
Enabling the spin-off agent to further improve its performance on the new task or arm by practising it approximately 10,000 times, generating additional training data in the process.
Integrating both the demonstration data and the self-generated data into RoboCat’s existing training dataset to enhance its overall training data.
Utilising the updated training dataset to train a new version of RoboCat, incorporating the newly acquired knowledge and experience from the specialised spin-off agent.
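
The five steps above can be read as one loop over the training dataset. The following is a minimal sketch of that loop, not DeepMind's implementation: the names `Episode`, `Agent`, `fine_tune`, `practise`, and `self_improvement_cycle` are hypothetical, and "training" is reduced to bookkeeping over a list of episodes so that the flow of data is easy to follow.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Episode:
    """One recorded attempt at a task (hypothetical stand-in for real robot data)."""
    task: str
    source: str  # "human_demo" or "self_generated"


@dataclass
class Agent:
    """Toy agent: just a version number and the dataset it was trained on."""
    version: int
    dataset: List[Episode] = field(default_factory=list)


def fine_tune(generalist: Agent, demos: List[Episode]) -> Agent:
    """Fine-tune on the demonstrations to get a specialised spin-off agent."""
    return Agent(version=generalist.version, dataset=generalist.dataset + demos)


def practise(specialist: Agent, task: str, n_rollouts: int) -> List[Episode]:
    """The spin-off agent practises the task and logs every attempt as new data."""
    return [Episode(task=task, source="self_generated") for _ in range(n_rollouts)]


def self_improvement_cycle(generalist: Agent, task: str,
                           demos: List[Episode], n_rollouts: int = 10_000) -> Agent:
    """Run one cycle: fine-tune, practise, merge the data, train a new version."""
    specialist = fine_tune(generalist, demos)
    generated = practise(specialist, task, n_rollouts)
    # Merge demonstrations and self-generated data into the existing dataset.
    merged = generalist.dataset + demos + generated
    # "Train" the next version of the generalist on the enlarged dataset.
    return Agent(version=generalist.version + 1, dataset=merged)


if __name__ == "__main__":
    base = Agent(version=0)
    demos = [Episode("stack_blocks", "human_demo") for _ in range(100)]
    updated = self_improvement_cycle(base, "stack_blocks", demos)
    print(updated.version, len(updated.dataset))
```

Each call returns a new agent rather than modifying the old one, so the cycle can be repeated for further tasks by feeding the returned agent back in as the next generalist.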

Google DeepMind has clearly shown that it wants to move beyond traditional language processing models and towards building agents that can carry out tasks. It is one of the few companies trying to make AI useful in the physical world, rather than just building chatbots.

In other work, described in the research paper Agile Catching with Whole-Body MPC and Blackbox Policy Learning, the company shows a robot catching objects thrown at high speed. The best part about this research is that it does not rely on foundation models such as language models to achieve the task. With just tracking and interception, the robot catches the balls thrown at it.

Google previewed its vision and capabilities earlier this year with PaLM-E, an embodied multimodal language model that performs real-world tasks based on vision and language. Before that, with RT-1, Google Research used transformers for real-world control. This shows that Google bringing DeepMind back into its ecosystem was indeed a match made in heaven.

Standing Apart

The only close competitor to Google DeepMind in robotics is Boston Dynamics. The company has been upping its game ever since it released Spot, and its humanoid ‘Atlas’ is also in the making. This is not to say that no other company is chasing the robotic dream. Elon Musk’s Tesla unveiled Optimus last year, but it is still a work in progress and little has been seen of it since. The outlook there is far less optimistic.

OpenAI once had a robotics division, which built a robotic arm that could solve a Rubik’s Cube. But the company shut the division down in 2021. Now it has decided to bet on robotics again, investing in a Norway-based startup called 1X.

In 2021, when OpenAI shut its robotics division, Google DeepMind took a huge step towards building more general robots. In a research blog, the company introduced vision-based robotic manipulation built around RGB-Stacking, aimed at enabling robots to understand the world and the objects around them.

Microsoft, on the other hand, is still in its ChatGPT hangover. In February, the company extended ChatGPT’s capabilities to robotic arms, drones, and home assistant robots, calling the research ‘ChatGPT for Robotics’.

Interestingly, the company has a robotics lab called AI Lab Projects, where it is experimenting with AI and robots together to automate a range of tasks. For this, the lab has Paul-E, a 7 degree motion collaborative robot with embedded vision and high-resolution force control. Still, this research is nowhere close to the scale of DeepMind’s investment in the field.

The debate over whether embodiment is required for AGI may well go on forever, and Google DeepMind’s deep research into integrating language models into machines is very much a part of it.

The post Killing It with Robots appeared first on Analytics India Magazine.
