Source author record

Bufang Yang

Bufang Yang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Human-Computer Interaction Artificial Intelligence Distributed, Parallel, and Cluster Computing

Catalog footprint

What is connected

3works

3topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

EgoLog: Ego-Centric Fine-Grained Daily Log with Ubiquitous Wearables

Despite advances in human activity recognition (HAR) with different modalities, a precise, robust, and accurate daily log system is not yet available. Current solutions primarily rely on controlled, lab-based data collection, which limits their real-world applicability. The challenges towards a fine-grained daily log are 1) contextual awareness, 2) spatial awareness, and 3) effective fusion of multi-modal sensor data. To solve them, we propose EgoLog, which integrates effective audio-IMU fusion for daily log with ubiquitous wearables. Our approach first fuses audio and IMU data from two perspectives: temporal understanding and spatial understanding. We extract scenario-level features and aggregate them in the time dimension, while using motion compensation to enhance the performance of sound source localization. The knowledge obtained from these steps is then integrated into a multi-modal HAR framework. Here, the scenario provides prior knowledge, and the spatial location helps differentiate the user from the background. Furthermore, we integrate a LLM to enhance scenario recognition through logical reasoning. The knowledge derived from the LLM is subsequently transferred back to the local device to enable efficient, on-device inference. Evaluated on both public and self-collected dataset, EgoLog achieves effective multimodal fusion for both activity and scenraio recognition, outperforms the baseline by 12% and 15%, respectively.

preprint2026arXiv

Pro$^2$Assist: Continuous Step-Aware Proactive Assistance with Multimodal Egocentric Perception for Long-Horizon Procedural Tasks

Procedural tasks with multiple ordered steps are ubiquitous in daily life. Recent advances in multimodal large language models (MLLMs) have enabled personal assistants that support daily activities. However, existing systems primarily provide reactive guidance triggered by user queries, or limited proactive assistance for isolated short-term events rather than long-horizon procedural tasks. In this work, we introduce Pro$^2$Assist, a step-aware proactive assistant that continuously tracks fine-grained task progress and reasons over the user's evolving state to provide timely assistance throughout tasks. Pro$^2$Assist leverages multimodal data from augmented reality (AR) glasses to achieve motion-based perception. It then extracts step-oriented procedural context from multi-scale temporal dynamics and task-specific expert knowledge. Based on both sensory input and procedural context, Pro$^2$Assist performs continuous reasoning to infer user needs and display timely assistance on AR glasses. We evaluate Pro$^2$Assist using a dataset curated from public sources and a real-world dataset collected on our testbed with AR glasses. Extensive evaluations show that Pro$^2$Assist outperforms the best-performing baselines by over 21% in procedural action understanding accuracy, and it achieves up to 2.29x the proactive timing accuracy of baselines. A user study with 20 participants further shows that 90% find Pro$^2$Assist useful, indicating its effectiveness for real-world procedural assistance.

preprint2025arXiv

PerCache: Predictive Hierarchical Cache for RAG Applications on Mobile Devices

Retrieval-augmented generation (RAG) has been extensively used as a de facto paradigm in various large language model (LLM)-driven applications on mobile devices, such as mobile assistants leveraging personal emails or meeting records. However, due to the lengthy prompts and the resource constraints, mobile RAG systems exhibit significantly high response latency. On this issue, one promising approach is to reuse intermediate computational results across different queries to eliminate redundant computation. But most existing approaches, such as KV cache reuse and semantic cache reuse, are designed for cloud settings and perform poorly, overlooking the distinctive characteristics of mobile RAG. We propose PerCache, a novel hierarchical cache solution designed for reducing end-to-end latency of personalized RAG applications on mobile platforms. PerCache adopts a hierarchical architecture that progressively matches similar queries and QKV cache to maximize the reuse of intermediate results at different computing stages. To improve cache hit rate, PerCache applies a predictive method to populate cache with queries that are likely to be raised in the future. In addition, PerCache can adapt its configurations to dynamic system loads, aiming at maximizing the caching utility with minimal resource consumption. We implement PerCache on top of an existing mobile LLM inference engine with commodity mobile phones. Extensive evaluations show that PerCache can surpass the best-performing baseline by 34.4% latency reduction across various applications and maintain optimal latency performance under dynamic resource changes.