TIAGo NLUI

A Natural Language Interface for the TIAGo Robot


Tech Stack

Python
ROS
Ollama
PyTorch
Linux

Overview

The TIAGo Robot NLUI (Natural Language User Interface) Framework is a software system designed to enable natural language interaction with the TIAGo robot developed by PAL Robotics. This research project was conducted under the supervision of Prof. Erez Karpas in the Cognitive Robotics Lab at the Technion. The framework translates spoken commands into executable robot actions, allowing the robot to solve complex tasks efficiently using task and motion planning.

System Workflow

The framework operates in several stages, integrating speech recognition, natural language processing, and robotics:

  • Speech-to-Text Conversion: The user speaks into a microphone, and their command is converted into text using a speech recognition module.
  • PDDL Problem Generation: The text command is processed by a fine-tuned large language model, which generates a PDDL (Planning Domain Definition Language) problem. This problem is based on the robot's predefined domain, which includes its capabilities, known objects, and their states.
  • Environment and Domain Knowledge: The robot uses data from its LiDAR scans to maintain a map of its environment. It also tracks the current state of objects and its own position.
  • Plan Generation: The generated PDDL problem, which specifies the initial state and the desired goal, is solved using the Fast Downward planning system. This produces a sequence of actions for the robot to execute.
  • Task Execution: The robot executes the planned actions, such as navigating, grasping objects, and interacting with appliances, to complete the user's command.
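The stages above can be sketched as a simple pipeline. This is a minimal illustration, not the project's actual API: every function body here is a stub, and the PDDL text and action names are invented placeholders for what the ASR module, the fine-tuned model, and the planner would produce.

```python
def speech_to_text(audio_path: str) -> str:
    """Stub for the ASR stage; a real system would transcribe the audio file."""
    return "put the cup on the table"

def generate_pddl_problem(command: str, domain: str) -> str:
    """Stub for the LLM stage: map a natural-language command to a PDDL
    problem grounded in the robot's domain (hypothetical predicates shown)."""
    return (
        "(define (problem user-task) (:domain {})\n"
        "  (:objects cup table shelf - object)\n"
        "  (:init (at-robot base) (on cup shelf))\n"
        "  (:goal (on cup table)))".format(domain)
    )

def plan(problem_pddl: str) -> list[str]:
    """Stub for the planning stage; a real system would run Fast Downward."""
    return ["(navigate base shelf)", "(grasp cup)",
            "(navigate shelf table)", "(place cup table)"]

def run_pipeline(audio_path: str) -> list[str]:
    """Wire the stages together: audio -> text -> PDDL problem -> plan."""
    command = speech_to_text(audio_path)
    problem = generate_pddl_problem(command, domain="tiago")
    return plan(problem)  # each action would be dispatched to a ROS action server

print(run_pipeline("command.wav"))
```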

Technical Details

The framework integrates the following technologies and methodologies:

  • Speech Recognition: Converts spoken input into text using an automatic speech recognition (ASR) library.
  • Large Language Model: A fine-tuned transformer-based model processes the text and generates PDDL problem definitions based on the robot's domain. The domain includes actions such as navigating, grasping, and interacting, defined in accordance with the robot's hardware capabilities.
  • LiDAR Mapping: The robot uses LiDAR to scan and map its environment, creating a representation of the workspace, which includes object positions and boundaries.
  • Planning Algorithm: The Fast Downward planning system solves the PDDL problem using heuristic search methods to generate an optimized sequence of actions.
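As one concrete sketch of the planning stage, Fast Downward can be driven from Python via its `fast-downward.py` driver script, which writes the resulting plan to a `sas_plan` file. The driver path and search configuration below are assumptions to adapt to the local installation; this is not necessarily how the project invokes the planner.

```python
import subprocess
from pathlib import Path

def fast_downward_argv(domain: str, problem: str,
                       search: str = "astar(lmcut())") -> list[str]:
    """Build the Fast Downward command line. The driver script location
    ("./fast-downward.py") is an assumption; adjust to your install."""
    return ["./fast-downward.py", domain, problem, "--search", search]

def solve(domain: str, problem: str) -> list[str]:
    """Run the planner and return the action lines from the sas_plan file
    it writes on success. Plan lines are parenthesised ground actions;
    the trailing cost line starts with ';' and is filtered out."""
    subprocess.run(fast_downward_argv(domain, problem), check=True)
    lines = Path("sas_plan").read_text().splitlines()
    return [line for line in lines if not line.startswith(";")]
```

Separating command construction (`fast_downward_argv`) from execution keeps the invocation testable without a planner installed.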

Applications

The framework is designed for scenarios where natural language commands can simplify human-robot interaction. Examples include:

  • Assistance in homes and offices, such as fetching items or operating appliances.
  • Laboratory and industrial tasks where flexibility and adaptability are required.

Impact

The TIAGo Robot NLUI Framework bridges the gap between natural language and robotic planning, making it easier for non-experts to interact with complex robotic systems. By leveraging advanced NLP and planning algorithms, this project demonstrates the potential for robots to operate autonomously in dynamic environments, adapting to user-defined tasks in real time.

View on GitHub