Source author record

Xueping Peng

Xueping Peng appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Artificial Intelligence, Machine Learning, Computation and Language, Information Retrieval

Catalog footprint

What is connected

4 works

4 topics

4 close collaborators

Preprint, 2026, arXiv

MIPO: Mutual Integration of Patient Journey and Medical Ontology for Healthcare Representation Learning

Representation learning on electronic health records (EHRs) plays a vital role in downstream medical prediction tasks. Although natural language processing techniques, such as recurrent neural networks and self-attention, have been adapted for learning medical representations from hierarchical, time-stamped EHR data, they often struggle when either general or task-specific data are limited. Recent efforts have attempted to mitigate this challenge by incorporating medical ontologies (i.e., knowledge graphs) into self-supervised tasks like diagnosis prediction. However, two main issues remain: (1) small and uniform ontologies that lack diversity for robust learning, and (2) insufficient attention to the critical contexts or dependencies underlying patient journeys, which could further enhance ontology-based learning. To address these gaps, we propose MIPO (Mutual Integration of Patient Journey and Medical Ontology), a robust end-to-end framework that employs a Transformer-based architecture for representation learning. MIPO emphasizes task-specific representation learning through a sequential diagnosis prediction task, while also incorporating an ontology-based disease-typing task. A graph-embedding module is introduced to integrate information from patient visit records, thus alleviating data insufficiency. This setup creates a mutually reinforcing loop in which the patient-journey embedding and the ontology embedding benefit from each other. We validate MIPO on two real-world benchmark datasets, showing that it consistently outperforms baseline methods under both sufficient and limited data conditions. Furthermore, the resulting diagnosis embeddings offer improved interpretability, underscoring the promise of MIPO for real-world healthcare applications.
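The abstract above names two training tasks (sequential diagnosis prediction and ontology-based disease typing) but does not say how they are combined. One common way to train two tasks together is a weighted sum of their losses, sketched below. Everything here is an assumption for illustration, not MIPO's published objective: the function names, the single-label cross-entropy, and the `typing_weight` parameter are all hypothetical.

```python
import math


def cross_entropy(probs, target_index):
    """Negative log-likelihood of the target class under a probability vector."""
    return -math.log(probs[target_index])


def joint_loss(diagnosis_probs, diagnosis_target,
               typing_probs, typing_target, typing_weight=0.5):
    """Weighted sum of a main-task loss and an auxiliary-task loss.

    Assumption: the weighted-sum form and `typing_weight` are illustrative
    choices, not taken from the MIPO paper.
    """
    main = cross_entropy(diagnosis_probs, diagnosis_target)
    auxiliary = cross_entropy(typing_probs, typing_target)
    return main + typing_weight * auxiliary


# Hypothetical predicted distributions for one visit.
loss = joint_loss([0.7, 0.2, 0.1], 0, [0.6, 0.4], 1)
print(loss)
```

In a scheme like this, the weight would control how strongly the auxiliary typing task shapes the shared representation relative to the main prediction task.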

Preprint, 2022, arXiv

Aspect-driven User Preference and News Representation Learning for News Recommendation

News recommender systems help users efficiently find the news that interests them among a large volume of articles. Most existing news recommender systems learn topic-level representations of users and news, and neglect the more informative aspect-level features that would support more accurate recommendation. As a result, their recommendation performance is limited. To address this deficiency, we propose a novel Aspect-driven News Recommender System (ANRS) built on aspect-level user preference and news representation learning. Here, a news aspect is fine-grained semantic information expressed by a set of related words, which indicates a specific aspect described by the news. In ANRS, a news aspect-level encoder and a user aspect-level encoder are devised to learn fine-grained aspect-level representations of the user's preferences and of news characteristics, respectively; these are fed into a click predictor to estimate the probability of the user clicking the candidate news. Extensive experiments on the commonly used real-world dataset MIND demonstrate the superiority of our method over representative and state-of-the-art methods.
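The abstract above describes a click predictor that takes a user representation and a news representation and outputs a click probability, without specifying its form. A minimal sketch of one common choice, a dot product passed through a sigmoid, is shown here. The scoring function and the names `click_probability` and `rank_candidates` are assumptions for illustration and are not taken from ANRS.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def click_probability(user_vec, news_vec):
    """Score a (user, news) pair as sigmoid(dot product). Illustrative only."""
    score = sum(u * n for u, n in zip(user_vec, news_vec))
    return sigmoid(score)


def rank_candidates(user_vec, candidates):
    """Sort (news_id, news_vec) pairs by descending predicted click probability."""
    return sorted(
        candidates,
        key=lambda item: click_probability(user_vec, item[1]),
        reverse=True,
    )


# Hypothetical 2-dimensional aspect-level vectors.
user = [1.0, 0.0]
candidates = [("n1", [-1.0, 0.0]), ("n2", [2.0, 0.0]), ("n3", [0.0, 5.0])]
print([news_id for news_id, _ in rank_candidates(user, candidates)])
```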

Preprint, 2020, arXiv

Self-Attention Enhanced Patient Journey Understanding in Healthcare System

Understanding patients' journeys in the healthcare system is a fundamental prerequisite task for a broad range of AI-based healthcare applications. This task aims to learn an informative representation that comprehensively encodes the hidden dependencies among medical events and their inner entities; the encoded outputs can then greatly benefit downstream application-driven tasks. A patient journey is a sequence of electronic health records (EHRs) over time that is organized at multiple levels: patient, visits and medical codes. The key challenge of patient journey understanding is to design an effective encoding mechanism that can properly handle this multi-level structured data, in which temporally ordered visits each contain a set of medical codes. This paper proposes a novel self-attention mechanism that can simultaneously capture the contextual and temporal relationships hidden in patient journeys. A multi-level self-attention network (MusaNet) is specifically designed to learn representations of patient journeys, which are typically long sequences of activities. MusaNet is trained end-to-end using training data derived from EHRs. We evaluated the efficacy of our method on two medical application tasks with real-world benchmark datasets. The results demonstrate that the proposed MusaNet produces higher-quality representations than state-of-the-art baseline methods. The source code is available at https://github.com/xueping/MusaNet.
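The patient, visits and medical codes hierarchy described above can be illustrated with a two-level pooling sketch: codes are pooled into a visit vector, and visit vectors are pooled into a patient vector. This is a simplified stand-in that uses attention pooling with fixed query vectors; it is not the MusaNet architecture, and it omits the temporal component. The names `attention_pool` and `encode_journey`, the query vectors and the toy embeddings are all hypothetical.

```python
import math


def attention_pool(vectors, query):
    """Softmax-weighted average of vectors, weighted by dot product with a query."""
    scores = [sum(q * x for q, x in zip(query, v)) for v in vectors]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(vectors[0])
    return [sum(w * v[k] for w, v in zip(weights, vectors)) for k in range(dim)]


def encode_journey(journey, code_embeddings, code_query, visit_query):
    """Pool codes into visit vectors, then visits into one patient vector."""
    visit_vectors = [
        attention_pool([code_embeddings[code] for code in visit], code_query)
        for visit in journey
    ]
    return attention_pool(visit_vectors, visit_query)


# Hypothetical toy data: two visits, three medical codes, 2-d embeddings.
embeddings = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [1.0, 1.0]}
journey = [["A", "B"], ["C"]]
print(encode_journey(journey, embeddings, [0.0, 0.0], [0.0, 0.0]))
```

With all-zero queries the attention weights are uniform, so each level reduces to a plain average; in a trained model the queries would be learned parameters.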

Preprint, 2019, arXiv

Temporal Self-Attention Network for Medical Concept Embedding

In longitudinal electronic health records (EHRs), a patient's event records are distributed over a long period of time, and the temporal relations between events carry domain knowledge that benefits prediction tasks such as inpatient mortality prediction. Medical concept embedding is a feature extraction method that transforms a set of medical concepts with specific time stamps into a vector, which is then fed into a supervised learning algorithm. The quality of the embedding largely determines the learning performance on medical data. In this paper, we propose a medical concept embedding method that applies a self-attention mechanism to represent each medical concept. We propose a novel attention mechanism that captures the contextual information and temporal relationships between medical concepts. A lightweight neural network, the "Temporal Self-Attention Network (TeSAN)", is then proposed to learn medical concept embeddings based solely on the proposed attention mechanism. To test the effectiveness of the proposed method, we conducted clustering and prediction tasks on two public EHR datasets, comparing TeSAN against five state-of-the-art embedding methods. The experimental results demonstrate that the proposed TeSAN model is superior to all the compared methods. To the best of our knowledge, this work is the first to exploit temporal self-attentive relations between medical events.
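To make the idea of attention over both context and time concrete, here is a minimal sketch in which each concept attends to the others with a score that combines content similarity and a penalty on the time gap between events. The linear gap penalty and the `decay` parameter are assumptions chosen for simplicity; the abstract does not specify TeSAN's attention formula, and this is not it.

```python
import math


def softmax(values):
    """Softmax with max subtraction for numerical stability."""
    top = max(values)
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_self_attention(vectors, timestamps, decay=0.1):
    """Self-attention whose scores are reduced by the time gap between events.

    Assumption: score(i, j) = scaled dot product - decay * |t_i - t_j|.
    This penalty form is illustrative, not the TeSAN formulation.
    """
    dim = len(vectors[0])
    outputs = []
    for vec_i, t_i in zip(vectors, timestamps):
        scores = []
        for vec_j, t_j in zip(vectors, timestamps):
            content = sum(a * b for a, b in zip(vec_i, vec_j)) / math.sqrt(dim)
            scores.append(content - decay * abs(t_i - t_j))
        weights = softmax(scores)
        outputs.append(
            [sum(w * v[k] for w, v in zip(weights, vectors)) for k in range(dim)]
        )
    return outputs


# Hypothetical concept vectors with event times in days.
concepts = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
days = [0.0, 1.0, 30.0]
for row in temporal_self_attention(concepts, days):
    print(row)
```

A larger `decay` makes each concept attend mostly to events close to it in time, while `decay=0` reduces the sketch to ordinary scaled dot-product self-attention without learned projections.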