In a notable advance for robotics and artificial intelligence (AI), NVIDIA, a leader in accelerated computing, and UC San Diego have collaborated to develop an innovative robotic teleoperation system called AnyTeleop. AnyTeleop lets operators control robots remotely to accomplish tasks from a distance, which could change how we interact with robots: users might explore museums virtually, perform maintenance in hard-to-reach places, or participate in events remotely in a more interactive way.
Most teleoperation systems available today are tailored to specific robots and environments, which limits their applicability in diverse real-world settings. AnyTeleop, by contrast, is a versatile solution that can be deployed across a broad range of scenarios: a computer-vision-based teleoperation system that allows various robotic arms and hands to be operated remotely for different manual tasks.
Dieter Fox, senior director of robotics research at NVIDIA, elaborated on the company’s vision: “At NVIDIA, our primary focus is on researching how humans can instruct robots to perform tasks. The conventional approach involved humans guiding the robot, but this method faced two significant challenges. First, training state-of-the-art models required numerous demonstrations. Second, the setups often involved expensive equipment or sensing hardware and were designed exclusively for a specific robot or deployment environment.”
Fox and his team aimed to overcome these obstacles by developing a teleoperation system that is affordable, easy to deploy, and adaptable across different tasks, environments, and robotic systems. To train and evaluate the system, they teleoperated both virtual robots in simulation and real robots in physical environments, eliminating the need to purchase and assemble a large fleet of robots.
“AnyTeleop is a vision-based teleoperation system that enables humans to control dexterous robotic hand-arm systems using their hands,” Fox explained. “The system tracks human hand poses from single or multiple cameras and then retargets them to control the fingers of a multi-fingered robot hand. The wrist point is used to control the robot arm motion with a CUDA-powered motion planner.”
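Fox’s description suggests a pipeline with three stages: estimate hand keypoints from camera images, retarget the finger keypoints to robot-hand joint angles, and plan arm motion toward the tracked wrist. The Python sketch below is a minimal illustration of that pattern only; every name in it (detect_hand_pose, retarget_fingers, MotionPlanner), the 21-keypoint hand convention, and the joint counts are assumptions made for illustration, not AnyTeleop’s actual API.

```python
import numpy as np

NUM_HAND_KEYPOINTS = 21  # a common hand-pose convention; assumed here


def detect_hand_pose(frame: np.ndarray) -> np.ndarray:
    """Stand-in for a learned hand-pose estimator.

    A real system would run a vision model on the camera frame; this stub
    returns a fixed (21, 3) pose so the sketch runs end to end.
    """
    return np.zeros((NUM_HAND_KEYPOINTS, 3))


def retarget_fingers(keypoints: np.ndarray, num_hand_joints: int) -> np.ndarray:
    """Map human finger keypoints to robot-hand joint angles.

    Real retargeting solves an optimization that matches fingertip positions
    across different hand geometries; this stub returns a neutral pose.
    """
    return np.zeros(num_hand_joints)


class MotionPlanner:
    """Placeholder for a collision-aware arm motion planner.

    AnyTeleop's planner is CUDA-accelerated; this stand-in does no planning.
    """

    def plan_to_wrist_target(self, wrist_position: np.ndarray) -> np.ndarray:
        # Would compute joint targets that move the end effector toward the
        # tracked wrist; here, a neutral configuration for a 7-DoF arm.
        return np.zeros(7)


def teleop_step(frame: np.ndarray, planner: MotionPlanner):
    """One control cycle: camera frame in, arm and hand commands out."""
    keypoints = detect_hand_pose(frame)          # (21, 3) hand keypoints
    wrist = keypoints[0]                         # the wrist point drives the arm
    hand_cmd = retarget_fingers(keypoints, 16)   # the fingers drive the hand
    arm_cmd = planner.plan_to_wrist_target(wrist)
    return arm_cmd, hand_cmd


frame = np.zeros((480, 640, 3), dtype=np.uint8)  # stand-in camera frame
arm_cmd, hand_cmd = teleop_step(frame, MotionPlanner())
```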
What sets AnyTeleop apart from previous teleoperation systems is its compatibility with different robot arms, robot hands, camera configurations, and various simulated or real-world environments. It can be used whether the operator is right next to the robot or at a distant location.
The AnyTeleop platform also facilitates the collection of human demonstration data (i.e., data representing the movements and actions that humans perform when executing specific manual tasks). This data could be utilized to train robots to autonomously complete different tasks.
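To make the idea of demonstration data concrete, the sketch below logs per-timestep arm and hand commands into a simple trajectory record. This schema (DemoStep, Demonstration, and the 7- and 16-joint counts) is invented for illustration and is not AnyTeleop’s actual data format.

```python
from dataclasses import dataclass, field

import numpy as np


@dataclass
class DemoStep:
    """One timestep of a teleoperated demonstration (hypothetical schema)."""
    timestamp: float
    arm_joints: np.ndarray   # commanded arm joint positions
    hand_joints: np.ndarray  # commanded hand joint positions


@dataclass
class Demonstration:
    """A full demonstration trajectory for one manual task."""
    task_name: str
    steps: list = field(default_factory=list)

    def log(self, t: float, arm: np.ndarray, hand: np.ndarray) -> None:
        # Copy the arrays so later in-place updates don't corrupt the record.
        self.steps.append(DemoStep(t, arm.copy(), hand.copy()))


# Example: record one entry per control cycle while a human teleoperates,
# here for an assumed 7-DoF arm and 16-joint hand.
demo = Demonstration(task_name="pick_and_place")
demo.log(0.0, np.zeros(7), np.zeros(16))
```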
“The major breakthrough of AnyTeleop is its generalizable and easily deployable design,” Fox said. “One potential application is to deploy virtual environments and virtual robots in the cloud, allowing edge users with entry-level computers and cameras (like an iPhone or PC) to teleoperate them. This could ultimately revolutionize the data pipeline for researchers and industrial developers teaching robots new skills.”
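In the cloud deployment Fox envisions, the edge device would only need to capture camera frames and stream lightweight hand-pose data to the remote simulator. Purely to make that split concrete, here is a toy sketch of an edge client sending keypoints upstream; the JSON-over-UDP wire format, endpoint, and port are invented for illustration and say nothing about how AnyTeleop is actually deployed.

```python
import json
import socket

import numpy as np

# Hypothetical address of the cloud-hosted simulator; a real deployment
# would use its actual host and port.
SERVER_ADDR = ("127.0.0.1", 9000)


def send_keypoints(sock: socket.socket, keypoints: np.ndarray) -> None:
    """Serialize one frame of (21, 3) hand keypoints and send it upstream."""
    payload = json.dumps({"keypoints": keypoints.tolist()}).encode("utf-8")
    sock.sendto(payload, SERVER_ADDR)


sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
send_keypoints(sock, np.zeros((21, 3)))
```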
In preliminary tests, AnyTeleop outperformed an existing teleoperation system that was designed for a specific robot, even when evaluated on that same robot. This underscores its potential as a tool for enhancing teleoperation applications.
NVIDIA plans to release an open-source version of the AnyTeleop system soon, enabling research teams worldwide to test it and apply it to their robots. This promising new platform could significantly contribute to the scaling up of teleoperation systems while also facilitating the collection of training data for robotic manipulators.
Looking ahead, Fox said: “We now plan to use the collected data to explore further robot learning. One notable focus going forward is how to overcome the domain gaps when transferring robot models from simulation to the real world.”