Proj No. | A3219-251 |
Title | Foundation Model for Urban Sensing and Modeling |
Summary | Sensors in today’s urban environments, ranging from surveillance cameras to satellites and drones, provide vast amounts of data, which enables the development of machine-learning techniques for urban modelling and analysis. While these techniques work, challenges associated with the dynamic and complex nature of urban environments remain under-addressed and have hindered further advances in urban intelligence. ‘Multi-layer’ sensing is one promising solution: instead of relying on a restricted set of data sources, it integrates information from diverse sensors to overcome the limitations of single data modalities, so a more comprehensive understanding of urban environments is expected. Recent progress in artificial intelligence (AI) for multimodal data fusion in geoscience applications has shown its effectiveness in mitigating the influence of noise and occlusion. In this project, the student will work with our team to develop a foundational vision-language model (VLM) for ‘multi-layer’ sensing in urban environments. We aim to incorporate domain knowledge (i.e., urban sensing data) into foundational VLMs through a specialized learning paradigm, so that intensive data annotation can be sidestepped. The student needs to have a strong interest in AI and image processing. Experience in Python and PyTorch is preferred. Interested candidates can email their CVs to me (bihan.wen@ntu.edu.sg). Only qualified candidates will be notified. |
Supervisor | A/P Wen Bihan (Loc:S2 > S2 B2B > S2 B2B 54, Ext: +65 67904708) |
Co-Supervisor | - |
RI Co-Supervisor | - |
Lab | Centre for Information Sciences & System (CISS) (Loc: S2-B4b-05) |
Single/Group: | Single |
Area: | Digital Media Processing and Computer Engineering |
ISP/RI/SMP/SCP?: |