Source author record

Yuan Liang

Yuan Liang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher: unclaimed source record

Computer Vision, Computation and Language, Artificial Intelligence, Human-Computer Interaction, eess.IV, Cryptography and Security, econ.GN, Multimedia, q-fin.EC

Catalog footprint

What is connected

13 works

9 topics

4 close collaborators

preprint 2026 arXiv

Topology Matters: Measuring Memory Leakage in Multi-Agent LLMs

Graph topology is a fundamental determinant of memory leakage in multi-agent LLM systems, yet its effects remain poorly quantified. We introduce MAMA (Multi-Agent Memory Attack), a framework that measures how network structure shapes leakage. MAMA operates on synthetic documents containing labeled Personally Identifiable Information (PII) entities, from which we generate sanitized task instructions. We execute a two-phase protocol: Engram (seeding private information into a target agent's memory) and Resonance (multi-round interaction where an attacker attempts extraction). Over 10 rounds, we measure leakage as exact-match recovery of ground-truth PII from attacker outputs. We evaluate six canonical topologies (complete, ring, chain, tree, star, star-ring) across $n\in\{4,5,6\}$, attacker-target placements, and base models. Results are consistent: denser connectivity, shorter attacker-target distance, and higher target centrality increase leakage; most leakage occurs in early rounds and then plateaus; model choice shifts absolute rates but preserves topology ordering; spatiotemporal/location attributes leak more readily than identity credentials or regulated identifiers. We distill practical guidance for system design: favor sparse or hierarchical connectivity, maximize attacker-target separation, and restrict hub/shortcut pathways via topology-aware access control.
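As a rough illustration of the quantities the abstract refers to (topology, attacker-target distance, connectivity density, exact-match recovery), here is a minimal Python sketch. It is not the MAMA implementation. The function names, the heap-ordered binary tree, and the reading of "star-ring" as a hub linked to a ring of leaves are all assumptions made for illustration.

```python
from collections import deque


def build_topology(kind, n):
    """Return an undirected adjacency dict over agents 0..n-1 (hypothetical helper)."""
    if kind == "complete":
        edges = {(i, j) for i in range(n) for j in range(i + 1, n)}
    elif kind == "chain":
        edges = {(i, i + 1) for i in range(n - 1)}
    elif kind == "ring":
        edges = {(i, (i + 1) % n) for i in range(n)}
    elif kind == "star":
        edges = {(0, i) for i in range(1, n)}
    elif kind == "tree":
        # Assumption: a binary tree in heap order (parent of i is (i - 1) // 2).
        edges = {((i - 1) // 2, i) for i in range(1, n)}
    elif kind == "star-ring":
        # Assumption: hub 0 linked to every leaf, leaves 1..n-1 joined in a ring.
        edges = {(0, i) for i in range(1, n)}
        edges |= {(i, i % (n - 1) + 1) for i in range(1, n)}
    else:
        raise ValueError(f"unknown topology: {kind}")
    adj = {i: set() for i in range(n)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def distance(adj, src, dst):
    """Shortest-path hop count between two agents via breadth-first search."""
    hops = {src: 0}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return hops[node]
        for nxt in adj[node]:
            if nxt not in hops:
                hops[nxt] = hops[node] + 1
                queue.append(nxt)
    return None  # dst is unreachable from src


def density(adj):
    """Fraction of possible undirected edges that are present."""
    n = len(adj)
    edge_count = sum(len(neighbors) for neighbors in adj.values()) / 2
    return edge_count / (n * (n - 1) / 2)


def leakage_rate(pii_values, attacker_outputs):
    """Share of ground-truth PII strings found verbatim in any attacker output."""
    hits = sum(any(value in out for out in attacker_outputs) for value in pii_values)
    return hits / len(pii_values)
```

Under these assumptions, a chain of five agents places its end nodes four hops apart, while a complete graph places every pair one hop apart, which is the kind of contrast the reported results describe.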

preprint 2023 arXiv

Panacea or Placebo? Exploring Causal Effects of Nonlocal Vehicle Driving Restriction Policies on Traffic Congestion Using Difference-in-differences Approach

Car dependence has been threatening transportation sustainability as it contributes to congestion and associated externalities. In response, various transport policies that restrict the use of private vehicles have been implemented. However, empirical evaluations of such policies have been limited. To assess these policies' benefits and costs, it is imperative to accurately evaluate how such policies affect traffic conditions. In this study, we compile a data set of the floating-vehicle-based traffic performance index at refined spatio-temporal resolution to examine the effects of a recent nonlocal vehicle driving restriction policy in Shanghai, one of the most populous cities in the world. Specifically, we explore whether and how the policy impacted traffic speeds in the short term by employing a quasi-experimental difference-in-differences modeling approach. We find that: (1) In the first month, the policy led to an increase of the network-level traffic speed by 1.47% (0.352 km/h) during evening peak hours (17:00-19:00) but had no significant effects during morning peak hours (7:00-9:00). (2) The policy also helped improve the network-level traffic speed in some unrestricted hours (6:00, 12:00, 14:00, and 20:00), although the impact was marginal. (3) The short-term effects of the policy exhibited heterogeneity across traffic analysis zones: the lower the metro station density, the greater the effects were. We conclude that driving restrictions for nonlocal vehicles alone may not significantly reduce congestion, and their effects can differ both temporally and spatially. However, they can have potential side effects such as increased purchase and usage of new energy vehicles, owners of which can obtain a local Shanghai license plate for free.
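The basic difference-in-differences idea can be sketched in a few lines of Python. This is only the textbook two-group, two-period estimator, not the study's actual model. The function name and the speed values are hypothetical, and the abstract does not say how the comparison group was constructed.

```python
def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Two-by-two difference-in-differences on group means.

    Each argument is a list of observed speeds (km/h) for one group and period.
    The estimate is the treated group's change minus the control group's change.
    """
    def mean(values):
        return sum(values) / len(values)

    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical evening-peak speeds, invented purely for illustration.
effect = did_estimate(
    treated_pre=[24.0, 24.2],
    treated_post=[24.5, 24.7],
    control_pre=[30.0, 30.2],
    control_post=[30.1, 30.3],
)
```

As a sanity check on the reported figures, a gain of 0.352 km/h that corresponds to 1.47% implies a baseline network speed of roughly 24 km/h (0.352 / 0.0147).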

preprint 2022 arXiv

An Explore of Virtual Reality for Awareness of the Climate Change Crisis: A Simulation of Sea Level Rise

Virtual Reality (VR) technology has been shown to achieve remarkable results in multiple fields. Because VR is an immersive medium, it can serve as a high-quality educational tool, potentially offering higher bandwidth than other media such as text, pictures and videos. This short paper illustrates the development of a climate change educational awareness application for virtual reality that simulates virtual scenes of local scenery and sea level rise until 2100 using prediction data. The paper also reports on the in-progress work of porting the system to Augmented Reality (AR) and on future work to evaluate the system.

preprint 2022 arXiv

ConCL: Concept Contrastive Learning for Dense Prediction Pre-training in Pathology Images

Detecting and segmenting objects within whole slide images is essential in computational pathology workflows. Self-supervised learning (SSL) is appealing for such annotation-heavy tasks. Despite the extensive benchmarks in natural images for dense tasks, such studies are, unfortunately, absent in current works for pathology. Our paper intends to narrow this gap. We first benchmark representative SSL methods for dense prediction tasks in pathology images. Then, we propose concept contrastive learning (ConCL), an SSL framework for dense pre-training. We explore how ConCL performs with concepts provided by different sources and ultimately propose a simple dependency-free concept generating method that does not rely on external segmentation algorithms or saliency detection models. Extensive experiments demonstrate the superiority of ConCL over previous state-of-the-art SSL methods across different settings. Along our exploration, we distill several important and intriguing components contributing to the success of dense pre-training for pathology images. We hope this work could provide useful data points and encourage the community to conduct ConCL pre-training for problems of interest. Code is available.

preprint 2022 arXiv

RAAT: Relation-Augmented Attention Transformer for Relation Modeling in Document-Level Event Extraction

In the document-level event extraction (DEE) task, event arguments always scatter across sentences (across-sentence issue) and multiple events may lie in one document (multi-event issue). In this paper, we argue that the relation information of event arguments is of great significance for addressing the above two issues, and propose a new DEE framework which can model the relation dependencies, called Relation-augmented Document-level Event Extraction (ReDEE). More specifically, this framework features a novel and tailored transformer, named Relation-augmented Attention Transformer (RAAT). RAAT is scalable to capture multi-scale and multi-amount argument relations. To further leverage relation information, we introduce a separate event relation prediction task and adopt a multi-task learning method to explicitly enhance event extraction performance. Extensive experiments demonstrate the effectiveness of the proposed method, which can achieve state-of-the-art performance on two public datasets. Our code is available at https://github.com/TencentYoutuResearch/RAAT.

preprint 2022 arXiv

SocAoG: Incremental Graph Parsing for Social Relation Inference in Dialogues

Inferring social relations from dialogues is vital for building emotionally intelligent robots that interpret human language better and act accordingly. We model the social network as an And-or Graph, named SocAoG, to enforce the consistency of relations among a group and to leverage attributes as inference cues. Moreover, we formulate a sequential structure prediction task, and propose an $α$-$β$-$γ$ strategy to incrementally parse SocAoG for dynamic inference upon any incoming utterance: (i) an $α$ process predicting attributes and relations conditioned on the semantics of dialogues, (ii) a $β$ process updating the social relations based on related attributes, and (iii) a $γ$ process updating individuals' attributes based on interpersonal social relations. Empirical results on DialogRE and MovieGraph show that our model infers social relations more accurately than the state-of-the-art methods. Moreover, the ablation study shows the three processes complement each other, and the case study demonstrates the dynamic relational inference.

preprint 2022 arXiv

Towards Socially Intelligent Agents with Mental State Transition and Human Utility

Building a socially intelligent agent involves many challenges. One of them is to track the agent's mental state transition and teach the agent to make decisions guided by its values like a human. Towards this end, we propose to incorporate mental state simulation and value modeling into dialogue agents. First, we build a hybrid mental state parser that extracts information from both the dialogue and event observations and maintains a graphical representation of the agent's mind; meanwhile, the transformer-based value model learns human preferences from the human value dataset, ValueNet. Empirical results show that the proposed model attains state-of-the-art performance on the dialogue/action/emotion prediction task in the fantasy text-adventure game dataset, LIGHT. We also show example cases to demonstrate: (i) how the proposed mental state parser can assist the agent's decision by grounding on context such as locations and objects, and (ii) how the value model can help the agent make decisions based on its personal priorities.

preprint 2021 arXiv

Atlas-aware ConvNet for Accurate yet Robust Anatomical Segmentation

Convolutional networks (ConvNets) have achieved promising accuracy for various anatomical segmentation tasks. Despite the success, these methods can be sensitive to data appearance variations. Considering the large variability of scans caused by artifacts, pathologies, and scanning setups, robust ConvNets are vital for clinical applications, yet they have not been fully explored. In this paper, we propose to mitigate the challenge by enabling ConvNets' awareness of the underlying anatomical invariances among imaging scans. Specifically, we introduce a fully convolutional Constraint Adoption Module (CAM) that incorporates probabilistic atlas priors as explicit constraints for predictions over a locally connected Conditional Random Field (CRF), which effectively reinforces the anatomical consistency of the labeling outputs. We design the CAM to be flexible enough to boost various ConvNets, and compact enough to be co-optimized with ConvNets for the fusion parameters that lead to optimal performance. We show that the advantage of such atlas prior fusion is two-fold on two brain parcellation tasks. First, our models achieve state-of-the-art accuracy among ConvNet-based methods on both datasets, by significantly reducing structural abnormalities of predictions. Second, we can largely boost the robustness of existing ConvNets, as shown by: (i) testing on scans with synthetic pathologies, and (ii) training and evaluation on scans of different scanning setups across datasets. Our method can be easily adopted by existing ConvNets through fine-tuning with CAM plugged in, for accuracy and robustness boosts.

preprint 2021 arXiv

Exploring Instance-Level Uncertainty for Medical Detection

The ability of deep learning to predict with uncertainty is recognized as key for its adoption in clinical routines. Moreover, according to empirical evidence, modelling uncertainty has enabled performance gains. While previous work has widely discussed uncertainty estimation in segmentation and classification tasks, its application to bounding-box-based detection has been limited, mainly due to the challenge of bounding box alignment. In this work, we explore augmenting a 2.5D detection CNN with two different bounding-box-level (or instance-level) uncertainty estimates, i.e., predictive variance and Monte Carlo (MC) sample variance. Experiments are conducted for lung nodule detection on the LUNA16 dataset, a task where significant semantic ambiguities can exist between nodules and non-nodules. Results show that our method improves the evaluation score from 84.57% to 88.86% by utilizing a combination of both types of variances. Moreover, we show the generated uncertainty enables superior operating points compared to using the probability threshold only, and can further boost the performance to 89.52%. Example nodule detections are visualized to further illustrate the advantages of our method.
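To make the two uncertainty estimates concrete, the following Python sketch computes both for a single detection. It is an illustration under assumptions, not the paper's method. It assumes the boxes from repeated stochastic forward passes have already been aligned to the same instance, and that each pass also outputs its own variance estimate. The function name and inputs are hypothetical, and how the paper combines the two variances is not stated in the abstract.

```python
from statistics import fmean, pvariance


def instance_uncertainty(sampled_scores, predicted_variances):
    """Summarize uncertainty for one detected instance.

    sampled_scores: confidence scores for the same aligned box across
        several stochastic forward passes (hypothetical input).
    predicted_variances: the variance the network itself predicted on
        each of those passes (hypothetical input).
    """
    return {
        # Averaged score across the stochastic passes.
        "score": fmean(sampled_scores),
        # MC sample variance: spread of the scores across passes.
        "mc_variance": pvariance(sampled_scores),
        # Predictive variance: average of the network's own variance outputs.
        "predictive_variance": fmean(predicted_variances),
    }


summary = instance_uncertainty([0.8, 0.6, 0.7], [0.02, 0.04, 0.03])
```

A downstream filter could then use the two variances alongside the score when choosing an operating point, instead of thresholding the probability alone.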

preprint 2021 arXiv

Oral-3D: Reconstructing the 3D Bone Structure of Oral Cavity from 2D Panoramic X-ray

Panoramic X-ray (PX) provides a 2D picture of the patient's mouth in a panoramic view to help dentists observe the invisible disease inside the gum. However, it provides limited 2D information compared with cone-beam computed tomography (CBCT), another dental imaging method that generates a 3D picture of the oral cavity but with a higher radiation dose and a higher price. Consequently, it is of great interest to reconstruct the 3D structure from a 2D X-ray image, which could greatly broaden the application of X-ray imaging in dental surgeries. In this paper, we propose a framework, named Oral-3D, to reconstruct the 3D oral cavity from a single PX image and prior information of the dental arch. Specifically, we first train a generative model to learn the cross-dimension transformation from 2D to 3D. Then we restore the shape of the oral cavity with a deformation module using the dental arch curve, which can be obtained simply by taking a photo of the patient's mouth. Notably, Oral-3D can restore both the density of bony tissues and the curved mandible surface. Experimental results show that Oral-3D can efficiently and effectively reconstruct the 3D oral structure and show critical information in clinical applications, e.g., tooth pulling and dental implants. To the best of our knowledge, we are the first to explore this domain transformation problem between these two imaging methods.

preprint 2021 arXiv

T-Net: Learning Feature Representation with Task-specific Supervision for Biomedical Image Analysis

The encoder-decoder network is widely used to learn deep feature representations from pixel-wise annotations in biomedical image analysis. Under this structure, the performance profoundly relies on the effectiveness of feature extraction achieved by the encoding network. However, few models have considered adapting the attention of the feature extractor to different kinds of tasks. In this paper, we propose a novel training strategy that adapts the attention of the feature extractor according to different tasks for effective representation learning. Specifically, the framework, named T-Net, consists of an encoding network supervised by task-specific attention maps and a posterior network that takes in the learned features to predict the corresponding results. The attention map is obtained by transforming pixel-wise annotations according to the specific task, and is used as the supervision to regularize the feature extractor to focus on different locations of the recognition object. To show the effectiveness of our method, we evaluate T-Net on two different tasks, i.e., segmentation and localization. Extensive results on three public datasets (BraTS-17, MoNuSeg and IDRiD) indicate the effectiveness and efficiency of our proposed supervision method, especially over the conventional encoding-decoding network.

preprint 2020 arXiv

OralCam: Enabling Self-Examination and Awareness of Oral Health Using a Smartphone Camera

Due to a lack of medical resources or oral health awareness, oral diseases are often left unexamined and untreated, affecting a large population worldwide. With the advent of low-cost, sensor-equipped smartphones, mobile apps offer a promising possibility for promoting oral health. However, to the best of our knowledge, no mobile health (mHealth) solutions can directly support a user to self-examine their oral health condition. This paper presents OralCam, the first interactive app that enables end-users' self-examination of five common oral conditions (diseases or early disease signals) by taking smartphone photos of one's oral cavity. OralCam allows a user to annotate additional information (e.g., living habits, pain, and bleeding) to augment the input image, and presents the output hierarchically, probabilistically and with visual explanations to help a lay user understand examination results. Developed on our in-house dataset that consists of 3,182 oral photos annotated by dental experts, our deep learning based framework achieved an average detection sensitivity of 0.787 over five conditions with high localization accuracy. In a week-long in-the-wild user study (N=18), most participants had no trouble using OralCam and interpreting the examination results. Two expert interviews further validate the feasibility of OralCam for promoting users' awareness of oral health.

preprint 2020 arXiv

OralViewer: 3D Demonstration of Dental Surgeries for Patient Education with Oral Cavity Reconstruction from a 2D Panoramic X-ray

Patients' understanding of forthcoming dental surgeries is required by patient-centered care and helps reduce fear and anxiety. Due to the gap in expertise between patients and dentists, conventional techniques of patient education are usually not effective for explaining surgical steps. In this paper, we present OralViewer, the first interactive application that enables dentists' demonstration of dental surgeries in 3D to promote patients' understanding. OralViewer takes a single 2D panoramic dental X-ray to reconstruct patient-specific 3D teeth structures, which are then assembled with registered gum and jaw bone models for complete oral cavity modeling. During the demonstration, OralViewer enables dentists to show surgery steps with virtual dental instruments that can animate effects on a 3D model in real-time. A technical evaluation shows our deep learning based model achieves a mean Intersection over Union (IoU) of 0.771 for 3D teeth reconstruction. A patient study with 12 participants shows OralViewer can improve patients' understanding of surgeries. An expert study with 3 board-certified dentists further verifies the clinical validity of our system.
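For readers unfamiliar with the metric, the sketch below shows one common way to compute Intersection over Union on 3D occupancy data and average it across cases. It is an assumption-laden illustration, not the paper's evaluation code. Representing each reconstruction as a set of occupied voxel coordinates, and the function names, are choices made here for clarity.

```python
def voxel_iou(predicted, ground_truth):
    """IoU between two sets of occupied (x, y, z) voxel coordinates."""
    predicted, ground_truth = set(predicted), set(ground_truth)
    union = predicted | ground_truth
    if not union:
        return 1.0  # Convention assumed here: two empty volumes match perfectly.
    return len(predicted & ground_truth) / len(union)


def mean_iou(pairs):
    """Average IoU over (predicted, ground_truth) pairs, e.g. one pair per case."""
    scores = [voxel_iou(pred, truth) for pred, truth in pairs]
    return sum(scores) / len(scores)


# Tiny hypothetical volumes: two shared voxels out of four distinct ones.
pred = {(0, 0, 0), (0, 0, 1), (0, 1, 0)}
truth = {(0, 0, 0), (0, 0, 1), (1, 0, 0)}
score = voxel_iou(pred, truth)
```

A value of 1.0 means the predicted and reference volumes coincide exactly, and 0.0 means they share no voxels.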
