Researcher profile

Xiaoyu Lin

Xiaoyu Lin contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 19 - UnverifiedVerification L1Unclaimed author
5works
0followers
3topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

5 published item(s)

preprint2026arXiv

Breaking the Quality-Privacy Tradeoff in Tabular Data Generation via In-Context Learning

Tabular data synthesis aims to generate high-quality data while preserving privacy. However, we find that existing tabular generative models exhibit a clear tradeoff in the small-data regime: improving data quality typically comes at the cost of increased memorization of training samples, thereby weakening privacy protection. This tradeoff arises because small training sets make it difficult for dataset-specific generative models to distinguish generalizable structure from sample-specific patterns. To address this, we propose DiffICL, which formulates tabular data generation as an in-context learning problem. Instead of fitting each dataset from scratch,DiffICL leverages pretrained structural priors learned from a large collection of datasets, enabling it to infer data distributions from limited context rather than memorizing individual samples. We evaluate DiffICL on 14 real-world datasets. Results show that DiffICL improves both data quality and privacy, and generate synthetic data that provides effective data augmentation. Our findings suggest that the quality-privacy tradeoff can be improved through better training paradigms.

preprint2022arXiv

DSR: Towards Drone Image Super-Resolution

Despite achieving remarkable progress in recent years, single-image super-resolution methods are developed with several limitations. Specifically, they are trained on fixed content domains with certain degradations (whether synthetic or real). The priors they learn are prone to overfitting the training configuration. Therefore, the generalization to novel domains such as drone top view data, and across altitudes, is currently unknown. Nonetheless, pairing drones with proper image super-resolution is of great value. It would enable drones to fly higher covering larger fields of view, while maintaining a high image quality. To answer these questions and pave the way towards drone image super-resolution, we explore this application with particular focus on the single-image case. We propose a novel drone image dataset, with scenes captured at low and high resolutions, and across a span of altitudes. Our results show that off-the-shelf state-of-the-art networks witness a significant drop in performance on this different domain. We additionally show that simple fine-tuning, and incorporating altitude awareness into the network's architecture, both improve the reconstruction performance.

preprint2022arXiv

Towards Robust Drone Vision in the Wild

The past few years have witnessed the burst of drone-based applications where computer vision plays an essential role. However, most public drone-based vision datasets focus on detection and tracking. On the other hand, the performance of most existing image super-resolution methods is sensitive to the dataset, specifically, the degradation model between high-resolution and low-resolution images. In this thesis, we propose the first image super-resolution dataset for drone vision. Image pairs are captured by two cameras on the drone with different focal lengths. We collect data at different altitudes and then propose pre-processing steps to align image pairs. Extensive empirical studies show domain gaps exist among images captured at different altitudes. Meanwhile, the performance of pretrained image super-resolution networks also suffers a drop on our dataset and varies among altitudes. Finally, we propose two methods to build a robust image super-resolution network at different altitudes. The first feeds altitude information into the network through altitude-aware layers. The second uses one-shot learning to quickly adapt the super-resolution model to unknown altitudes. Our results reveal that the proposed methods can efficiently improve the performance of super-resolution networks at varying altitudes.

preprint2021arXiv

Learning degraded image classification with restoration data fidelity

Learning-based methods especially with convolutional neural networks (CNN) are continuously showing superior performance in computer vision applications, ranging from image classification to restoration. For image classification, most existing works focus on very clean images such as images in Caltech-256 and ImageNet datasets. However, in most realistic scenarios, the acquired images may suffer from degradation. One important and interesting problem is to combine image classification and restoration tasks to improve the performance of CNN-based classification networks on degraded images. In this report, we explore the influence of degradation types and levels on four widely-used classification networks, and the use of a restoration network to eliminate the degradation's influence. We also propose a novel method leveraging a fidelity map to calibrate the image features obtained by pre-trained classification networks. We empirically demonstrate that our proposed method consistently outperforms the pre-trained networks under all degradation levels and types with additive white Gaussian noise (AWGN), and it even outperforms the re-trained networks for degraded images under low degradation levels. We also show that the proposed method is a model-agnostic approach that benefits different classification networks. Our results reveal that the proposed method is a promising solution to mitigate the effect caused by image degradation.

preprint2020arXiv

ADER: Adaptively Distilled Exemplar Replay Towards Continual Learning for Session-based Recommendation

Session-based recommendation has received growing attention recently due to the increasing privacy concern. Despite the recent success of neural session-based recommenders, they are typically developed in an offline manner using a static dataset. However, recommendation requires continual adaptation to take into account new and obsolete items and users, and requires "continual learning" in real-life applications. In this case, the recommender is updated continually and periodically with new data that arrives in each update cycle, and the updated model needs to provide recommendations for user activities before the next model update. A major challenge for continual learning with neural models is catastrophic forgetting, in which a continually trained model forgets user preference patterns it has learned before. To deal with this challenge, we propose a method called Adaptively Distilled Exemplar Replay (ADER) by periodically replaying previous training samples (i.e., exemplars) to the current model with an adaptive distillation loss. Experiments are conducted based on the state-of-the-art SASRec model using two widely used datasets to benchmark ADER with several well-known continual learning techniques. We empirically demonstrate that ADER consistently outperforms other baselines, and it even outperforms the method using all historical data at every update cycle. This result reveals that ADER is a promising solution to mitigate the catastrophic forgetting issue towards building more realistic and scalable session-based recommenders.