Researcher profile

Yi Dai

Yi Dai contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 15 - UnverifiedVerification L1Unclaimed author
3works
0followers
4topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

3 published item(s)

preprint2026arXiv

FL-Sailer: Efficient and Privacy-Preserving Federated Learning for Scalable Single-Cell Epigenetic Data Analysis via Adaptive Sampling

Single-cell ATAC-seq (scATAC-seq) enables high-resolution mapping of chromatin accessibility, yet privacy regulations and data size constraints hinder multi-institutional sharing. Federated learning (FL) offers a privacy-preserving alternative, but faces three fundamental barriers in scATAC-seq analysis: ultra-high dimensionality, extreme sparsity, and severe cross-institutional heterogeneity. We propose FL-Sailer, the first FL framework designed for scATAC-seq data. FL-Sailer integrates two key innovations: (i) adaptive leverage score sampling, which selects biologically interpretable features while reducing dimensionality by 80%, and (ii) an invariant VAE architecture, which disentangles biological signals from technical confounders via mutual information minimization. We provide a convergence guarantee, showing that FL-Sailer converges to an approximate solution of the original high-dimensional problem with bounded error. Extensive experiments on synthetic and real epigenomic datasets demonstrate that FL-Sailer not only enables previously infeasible multi-institutional collaborations but also surpasses centralized methods by leveraging adaptive sampling as an implicit regularizer to suppress technical noise. Our work establishes that federated learning, when tailored to domain-specific challenges, can become a superior paradigm for collaborative epigenomic research.

preprint2021arXiv

Patterns for Representing Knowledge Graphs to Communicate Situational Knowledge of Service Robots

Service robots are envisioned to be adaptive to their working environment based on situational knowledge. Recent research focused on designing visual representation of knowledge graphs for expert users. However, how to generate an understandable interface for non-expert users remains to be explored. In this paper, we use knowledge graphs (KGs) as a common ground for knowledge exchange and develop a pattern library for designing KG interfaces for non-expert users. After identifying the types of robotic situational knowledge from the literature, we present a formative study in which participants used cards to communicate the knowledge for given scenarios. We iteratively coded the results and identified patterns for representing various types of situational knowledge. To derive design recommendations for applying the patterns, we prototyped a lab service robot and conducted Wizard-of-Oz testing. The patterns and recommendations could provide useful guidance in designing knowledge-exchange interfaces for robots.

preprint2020arXiv

Addressing the Grid-size Sensitivity Issue in Large-eddy Simulations of Stable Boundary Layers

In this study, we have identified certain fundamental limitations of a mixing length parameterization used in a popular turbulent kinetic energy-based subgrid-scale model. Replacing this parameterization with a more physically realistic one significantly improves the overall quality of the large-eddy simulations (LESs) of stable boundary layers. For the range of grid sizes considered here (specifically, 1 m -- 12.5 m), the revision dramatically reduces the grid-size sensitivity of the simulations. Most importantly, the revised scheme allows us to reliably estimate the first- and second-order statistics of a well-known LES intercomparison case, even with a coarse grid-size of O(10 m).