Proj No. | A1098-251 |
Title | Virtual Robot Manipulation using Language-Guided 3D Value Maps |
Summary | Traditional robotic manipulation methods often depend on manually designed motion primitives, which limit adaptability to new tasks. With recent advances in large language models (LLMs), it is now possible to generate task-specific affordances and constraints directly from free-form language instructions. This project aims to implement and evaluate a fully virtual system in which a simulated robot executes manipulation tasks based on LLM-generated 3D value maps. The framework will enable the robot to understand and perform a variety of object manipulation tasks using language-guided motion planning, without additional task-specific training.

Project Scope
This project focuses on developing and testing a language-driven robotic manipulation system in a virtual simulation environment. The key phases are:
1. Language-Guided Affordance and Constraint Mapping:
o Utilize a large language model (e.g., GPT-4) to infer object affordances and constraints from natural language instructions.
o Integrate a vision-language model (e.g., CLIP, OWL-ViT) to extract object properties and scene context from simulated RGB-D images.
o Generate structured 3D voxel-based value maps encoding regions of high relevance for task execution (a minimal voxel-map sketch appears at the end of this listing).
2. Virtual Motion Planning with 3D Value Maps:
o Use the generated value maps to synthesize end-effector trajectories for robotic manipulation.
o Implement model-based motion planning (e.g., Model Predictive Control) to ensure real-time adaptability (see the planning sketch at the end of this listing).
o Evaluate how effectively the value maps guide a simulated robotic arm across different tasks.
3. Simulation-Based Experimentation and Evaluation:
o Implement the entire framework in a physics-based simulator (e.g., Isaac Gym, MuJoCo, or PyBullet); a minimal simulation-loop sketch appears at the end of this listing.
o Conduct virtual experiments on common robotic tasks such as grasping, placing, pushing, and opening doors/drawers.
o Analyze the framework's generalization across different virtual environments and task settings.

Expected Outcomes
- A fully functional virtual framework for robotic manipulation using LLM-generated 3D value maps.
- A pipeline for translating natural language commands into executable motion plans.
- Evaluation results on the robustness and adaptability of language-based robotic motion planning.
- Insights into how language models can enhance task generalization in robotic manipulation.

Required Resources
- Hardware: high-performance computing resources (a GPU-enabled workstation for simulation).
- Software & Libraries: PyTorch, TensorFlow, ROS (Robot Operating System), OpenCV, MuJoCo/PyBullet/Isaac Gym for simulation, and reinforcement learning toolkits.

Candidate Requirements
- Strong programming skills in Python and familiarity with deep learning frameworks (PyTorch/TensorFlow).
- Knowledge of reinforcement learning, motion planning, and robotics simulation.
- Experience with robotic simulation tools (e.g., PyBullet, MuJoCo, Isaac Gym).
- Interest in natural language processing and vision-language integration. |
Supervisor | Prof Xie Lihua (Loc: S2 > S2 B2C > S2 B2C 94, Ext: +65 67904524) |
Co-Supervisor | - |
RI Co-Supervisor | - |
Lab | Internet of Things Laboratory (Loc: S1-B4c-14, ext: 5470/5475) |
Single/Group | Single |
Area | Intelligent Systems and Control Engineering |
ISP/RI/SMP/SCP? | - |
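The sketches below are illustrative only; all names, parameters, and coordinates are assumptions, not part of the project specification. First, a minimal sketch of the phase-1 voxel value map, assuming the LLM/VLM pipeline has already localized a target point (e.g., a drawer handle) and a point to avoid in world coordinates; the class name, workspace bounds, grid resolution, and Gaussian spread are hypothetical choices.

```python
# Minimal voxel value map sketch (hypothetical names and parameters).
import numpy as np

class VoxelValueMap:
    def __init__(self, bounds_min, bounds_max, resolution=50):
        self.bounds_min = np.asarray(bounds_min, dtype=float)
        self.bounds_max = np.asarray(bounds_max, dtype=float)
        # One scalar value per voxel; higher = more task-relevant.
        self.values = np.zeros((resolution,) * 3)
        # Precompute the world-frame center of every voxel.
        axes = [np.linspace(lo, hi, resolution)
                for lo, hi in zip(self.bounds_min, self.bounds_max)]
        self.centers = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1)

    def add_affordance(self, point, sigma=0.05, weight=1.0):
        """Add a Gaussian 'attraction' bump around a target point."""
        d2 = np.sum((self.centers - np.asarray(point)) ** 2, axis=-1)
        self.values += weight * np.exp(-d2 / (2 * sigma ** 2))

    def add_constraint(self, point, sigma=0.05, weight=1.0):
        """Subtract a Gaussian 'avoidance' bump around a point to avoid."""
        self.add_affordance(point, sigma=sigma, weight=-weight)

    def best_voxel(self):
        """World coordinates of the current highest-value voxel."""
        idx = np.unravel_index(np.argmax(self.values), self.values.shape)
        return self.centers[idx]

# Example: "open the drawer, but stay away from the vase".
vmap = VoxelValueMap(bounds_min=[-0.5, -0.5, 0.0], bounds_max=[0.5, 0.5, 1.0])
vmap.add_affordance([0.3, 0.0, 0.4])   # drawer handle (from the VLM)
vmap.add_constraint([0.1, 0.2, 0.4])   # vase to avoid (from the LLM constraint)
print(vmap.best_voxel())
```

Encoding affordances as positive bumps and constraints as negative ones keeps the downstream planner a pure value maximizer, and the grid can be rebuilt whenever the scene changes.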
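For phase 2, a minimal sketch of planning over such a map: a sampling-based, receding-horizon loop that stands in for full Model Predictive Control. Here `value_fn`, the step size, sample count, and horizon are illustrative assumptions; in the actual framework the score would come from the voxel map above and the plan would be recomputed after every simulator step.

```python
# Sampling-based receding-horizon planning sketch (illustrative parameters).
import numpy as np

def plan_step(position, value_fn, step=0.02, n_samples=64, rng=None):
    """Pick the next waypoint by sampling small displacements and
    keeping the one with the highest value-map score."""
    rng = rng or np.random.default_rng(0)
    # Random unit directions scaled to the step size.
    dirs = rng.normal(size=(n_samples, 3))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    candidates = position + step * dirs
    scores = np.array([value_fn(c) for c in candidates])
    return candidates[np.argmax(scores)]

def plan_trajectory(start, value_fn, horizon=50):
    """Greedy rollout, re-deciding one step at a time the way an MPC
    loop would replan after each simulator step."""
    traj = [np.asarray(start, dtype=float)]
    for _ in range(horizon):
        traj.append(plan_step(traj[-1], value_fn))
    return np.stack(traj)

# Example with a toy value function peaked at a target point.
target = np.array([0.3, 0.0, 0.4])
value_fn = lambda pos: -np.sum((pos - target) ** 2)
traj = plan_trajectory(start=[0.0, 0.0, 0.2], value_fn=value_fn)
print(traj[-1])  # final waypoint should approach the target
```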
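For phase 3, a minimal sketch of the simulation loop in PyBullet, driving a KUKA iiwa arm (an asset that ships with pybullet_data) through Cartesian waypoints such as those the planner above would produce; the waypoints and settle time are placeholders.

```python
# PyBullet simulation-loop sketch (placeholder waypoints).
import pybullet as p
import pybullet_data

p.connect(p.DIRECT)  # use p.GUI instead for a visual debug window
p.setAdditionalSearchPath(pybullet_data.getDataPath())
p.setGravity(0, 0, -9.81)
p.loadURDF("plane.urdf")
robot = p.loadURDF("kuka_iiwa/model.urdf", useFixedBase=True)

EE_LINK = 6                      # end-effector link index of the iiwa model
NUM_JOINTS = p.getNumJoints(robot)

waypoints = [(0.4, 0.0, 0.6), (0.4, 0.2, 0.4)]  # e.g., from the value map
for target in waypoints:
    # Inverse kinematics gives joint targets for the Cartesian waypoint.
    joint_targets = p.calculateInverseKinematics(robot, EE_LINK, target)
    p.setJointMotorControlArray(
        robot,
        jointIndices=list(range(NUM_JOINTS)),
        controlMode=p.POSITION_CONTROL,
        targetPositions=joint_targets,
    )
    for _ in range(240):  # settle for ~1 s at the default 240 Hz timestep
        p.stepSimulation()

print(p.getLinkState(robot, EE_LINK)[0])  # reached end-effector position
p.disconnect()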