Source author record

Guofa Li

Guofa Li appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Computer Vision; Artificial Intelligence; Computer Science and Game Theory

Catalog footprint

What is connected

3 works

3 topics

4 close collaborators

preprint · 2023 · arXiv

GPT-4 Enhanced Multimodal Grounding for Autonomous Driving: Leveraging Cross-Modal Attention with Large Language Models

In the field of autonomous vehicles (AVs), accurately discerning commander intent and executing linguistic commands within a visual context presents a significant challenge. This paper introduces a sophisticated encoder-decoder framework, developed to address visual grounding in AVs. Our Context-Aware Visual Grounding (CAVG) model is an advanced system that integrates five core encoders, including Text, Image, Context, and Cross-Modal, with a Multimodal decoder. This integration enables the CAVG model to capture contextual semantics and to learn human emotional features, augmented by state-of-the-art Large Language Models (LLMs) including GPT-4. The architecture of CAVG is reinforced by multi-head cross-modal attention mechanisms and a Region-Specific Dynamic (RSD) layer for attention modulation. This design enables the model to efficiently process and interpret a range of cross-modal inputs, yielding a comprehensive understanding of the correlation between verbal commands and the corresponding visual scenes. Empirical evaluations on the Talk2Car dataset, a real-world benchmark, demonstrate that CAVG establishes new standards in prediction accuracy and operational efficiency. Notably, the model exhibits exceptional performance even with limited training data, ranging from 50% to 75% of the full dataset. This feature highlights its effectiveness and potential for deployment in practical AV applications. Moreover, CAVG has shown remarkable robustness and adaptability in challenging scenarios, including long-text command interpretation, low-light conditions, ambiguous command contexts, inclement weather conditions, and densely populated urban environments. The code for the proposed model is available on our GitHub.

preprint · 2022 · arXiv

Applications of Game Theory in Vehicular Networks: A Survey

In the Internet of Things (IoT) era, vehicles and other intelligent components in an intelligent transportation system (ITS) are connected, forming Vehicular Networks (VNs) that provide efficient and secure traffic and ubiquitous access to various applications. However, as the number of nodes in ITS increases, it is challenging to satisfy a large and varied number of service requests with different Quality of Service and security requirements in highly dynamic VNs. Intelligent nodes in VNs can compete or cooperate for limited network resources to achieve either individual or group objectives. Game Theory (GT), a theoretical framework designed for strategic interactions among rational decision-makers sharing scarce resources, can be used to model and analyze individual or group behaviors of communicating entities in VNs. This paper primarily surveys the recent developments of GT in solving various challenges of VNs. This survey starts with an introduction to the background of VNs. A review of GT models studied in VNs is then presented, including their basic concepts, classifications, and applicable vehicular issues. After discussing the requirements of VNs and the motivation for using GT, a comprehensive literature review on GT applications in dealing with the challenges of current VNs is provided. Furthermore, recent contributions of GT to VNs integrated with diverse emerging 5G technologies are surveyed. Finally, the lessons learned are given, and several key research challenges and possible solutions for applying GT in VNs are outlined.

preprint · 2020 · arXiv

A Spontaneous Driver Emotion Facial Expression (DEFE) Dataset for Intelligent Vehicles

In this paper, we introduce a new dataset, the driver emotion facial expression (DEFE) dataset, for the analysis of spontaneous driver emotions. The dataset includes facial expression recordings from 60 participants during driving. After watching a selected video-audio clip to elicit a specific emotion, each participant completed the driving tasks in the same driving scenario and rated their emotional responses during the driving processes in terms of dimensional emotion and discrete emotion. We also conducted classification experiments to recognize the scales of arousal, valence, and dominance, as well as the emotion category and intensity, to establish baseline results for the proposed dataset. In addition, this paper compared and discussed the differences in facial expressions between driving and non-driving scenarios. The results show that there were significant differences in the presence of facial action units (AUs) between driving and non-driving scenarios, indicating that human emotional expressions in driving scenarios were different from those in other life scenarios. Therefore, publishing a human emotion dataset specifically for drivers is necessary for traffic safety improvement. The proposed dataset will be publicly available so that researchers worldwide can use it to develop and examine their driver emotion analysis methods. To the best of our knowledge, this is currently the only public driver facial expression dataset.