Proj No. | A3227-251 |
Title | Egocentric Computer Vision with Wearable Cameras |
Summary | Motivation & Objective: Cameras worn by humans can capture visual information from unique perspectives, offering exciting opportunities to analyze human behaviors and interactions in natural settings. For instance, mounting cameras on babies' heads allows us to study how infants develop their visual recognition abilities, an innovative approach to developmental psychology. Likewise, cameras worn on adults' heads let us train AI models to understand how humans interact with real-world objects using their hands. Mounting cameras on our hands allows us to explore more detailed hand movements and interactions with objects, potentially helping robotic arms acquire human motor skills. The objective of this project is to explore computer vision techniques with wearable cameras, which have numerous potential applications in augmented reality, virtual reality, robotics, etc.

Description of Project: This is an open-ended project, and students are welcome to pursue relevant topics based on their own interests. However, we are currently considering the following topics:
- Investigate publicly available models trained on videos collected from infants' points of view.
- Recognize the camera wearer's actions from the temporal changes in hand and object poses.
- Collect hand-centric videos using cameras mounted on human hands.
- Estimate hand poses and beyond (e.g., what the person is doing) from hand-centric videos.

What you will learn & do: In this project, students will complete the following tasks:
- Machine Learning and Computer Vision: Train state-of-the-art computer vision models, optimize them properly, evaluate the results, and potentially improve the models.
- Data Analysis: Carefully analyze the data, assess experimental results, and provide insights from the data.
- Data Collection (optional): While we have plenty of public data, students who are interested are welcome to mount cameras, collect videos of their own activity, and analyze them.

Required Skills: Students should be proficient in Python coding for data analysis (NumPy, Pandas, etc.), computer vision (OpenCV), and deep learning (PyTorch) on GPU servers running Ubuntu; a minimal example of such a frame-processing pipeline is sketched below this listing. Interested candidates can email their CVs to me (bihan.wen@ntu.edu.sg). Only qualified candidates will be notified. |
Supervisor | A/P Wen Bihan (Loc: S2 > S2 B2B > S2 B2B 54, Ext: +65 67904708) |
Co-Supervisor | - |
RI Co-Supervisor | - |
Lab | Centre for Information Sciences & Systems (CISS) (Loc: S2-B4b-05) |
Single/Group: | Single |
Area: | Digital Media Processing and Computer Engineering |
ISP/RI/SMP/SCP?: |
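As a rough illustration of the tooling named in the Required Skills (Python, OpenCV, PyTorch), the sketch below decodes an egocentric video frame by frame and extracts per-frame features with a pretrained backbone. This is a minimal sketch under stated assumptions, not part of the project brief: the file name egocentric_clip.mp4 is hypothetical, and the ResNet-18 backbone is just one possible choice.

```python
# Minimal sketch: per-frame feature extraction from an egocentric (wearable-camera)
# video using OpenCV for decoding and a pretrained torchvision backbone in PyTorch.
# The video path and the ResNet-18 choice are illustrative assumptions only.

import cv2
import torch
import torchvision.models as models
import torchvision.transforms as T

VIDEO_PATH = "egocentric_clip.mp4"  # hypothetical example file

# Pretrained ImageNet backbone used purely as a frame-level feature extractor
# (weights enum requires torchvision >= 0.13).
weights = models.ResNet18_Weights.DEFAULT
backbone = models.resnet18(weights=weights)
backbone.fc = torch.nn.Identity()  # drop the classifier head, keep 512-d features
backbone.eval()

preprocess = T.Compose([
    T.ToPILImage(),
    T.Resize(256),
    T.CenterCrop(224),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

features = []
cap = cv2.VideoCapture(VIDEO_PATH)
with torch.no_grad():
    while True:
        ok, frame_bgr = cap.read()
        if not ok:
            break
        # OpenCV decodes frames as BGR; convert to RGB before the transforms.
        frame_rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
        x = preprocess(frame_rgb).unsqueeze(0)  # shape (1, 3, 224, 224)
        features.append(backbone(x).squeeze(0))
cap.release()

# Stack into a (num_frames, 512) tensor of per-frame features.
if features:
    video_features = torch.stack(features)
    print(video_features.shape)
```

Temporal models, such as those for recognizing the camera wearer's actions from changes over time (one of the listed topics), could then be trained on top of per-frame feature or pose sequences like the one produced here.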