Source author record

Yihan Lin

Yihan Lin appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Machine Learning Quantitative Methods Robotics Artificial Intelligence Biological Physics Cell Behavior Computer Vision Molecular Networks Neural and Evolutionary Computing

Catalog footprint

What is connected

6works

9topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

From Pixels to Tokens: A Systematic Study of Latent Action Supervision for Vision-Language-Action Models

Latent actions serve as an intermediate representation that enables consistent modeling of vision-language-action (VLA) models across heterogeneous datasets. However, approaches to supervising VLAs with latent actions are fragmented and lack a systematic comparison. This work structures the study of latent action supervision from two perspectives: (i) regularizing the trajectory via image-based latent actions, and (ii) unifying the target space with action-based latent actions. Under a unified VLA baseline, we instantiate and compare four representative integration strategies. Our results reveal a formulation-task correspondence: image-based latent actions benefit long-horizon reasoning and scene-level generalization, whereas action-based latent actions excel at complex motor coordination. Furthermore, we find that directly supervising the VLM with discrete latent action tokens yields the most effective performance. Finally, our experiments offer initial insights into the benefits of latent action supervision in mixed-data, suggesting a promising direction for VLA training. Code is available at https://github.com/RUCKBReasoning/From_Pixels_to_Tokens.

preprint2026arXiv

Mind Dreamer: Untethering Imagination via Active Latent Intervention on Latent Manifolds

Model-Based Reinforcement Learning (MBRL) leverages latent imagination for sample efficiency, yet remains constrained by Historical Tethering: imagination is typically initialized from observed states. This creates a learning asymmetry, where the world model's manifold discovery outpaces the policy's sparse-reward optimization. We propose Mind Dreamer (MD), a framework that operationalizes Active Latent Intervention (ALI) to transcend Markovian continuity. MD reformulates discovery as the minimization of a global Relay Manifold Expected Free Energy (R-EFE); by sampling initial states from a learned generator $s_0 \sim p_{gen}(\cdot)$ rather than the historical buffer, MD utilizes an adversarial generator to synthesize non-continuous latent jumps to epistemic blind spots that are physically plausible yet cognitively challenging. To resolve the credit assignment paradox across these spatial ruptures, we derive the Relay Value Function (RVF) and Relay Uncertainty Function (RUF). These potentials treat synthesized anchors as counterfactual intermediary states, propagating pragmatic and epistemic value through a principled Bellman-style formulation. Notably, we prove that uncertainty propagation across discontinuities necessitates a quadratic discount $γ^2$, establishing a formal epistemic horizon. Theoretically, MD approximates a variance-minimizing importance sampler that expands the manifold's spectral gap, reducing the hitting time to critical bottleneck states. Empirically, MD achieves a 1.67$\times$ average speedup over DreamerV3 on DeepMind Control Suite, reaching 8.8$\times$ in sparse-reward tasks.

preprint2022arXiv

Rethinking Pretraining as a Bridge from ANNs to SNNs

Spiking neural networks (SNNs) are known as a typical kind of brain-inspired models with their unique features of rich neuronal dynamics, diverse coding schemes and low power consumption properties. How to obtain a high-accuracy model has always been the main challenge in the field of SNN. Currently, there are two mainstream methods, i.e., obtaining a converted SNN through converting a well-trained Artificial Neural Network (ANN) to its SNN counterpart or training an SNN directly. However, the inference time of a converted SNN is too long, while SNN training is generally very costly and inefficient. In this work, a new SNN training paradigm is proposed by combining the concepts of the two different training methods with the help of the pretrain technique and BP-based deep SNN training mechanism. We believe that the proposed paradigm is a more efficient pipeline for training SNNs. The pipeline includes pipeS for static data transfer tasks and pipeD for dynamic data transfer tasks. SOTA results are obtained in a large-scale event-driven dataset ES-ImageNet. For training acceleration, we achieve the same (or higher) best accuracy as similar LIF-SNNs using 1/10 training time on ImageNet-1K and 2/5 training time on ES-ImageNet and also provide a time-accuracy benchmark for a new dataset ES-UCF101. These experimental results reveal the similarity of the functions of parameters between ANNs and SNNs and also demonstrate the various potential applications of this SNN training pipeline.

preprint2020arXiv

LIAF-Net: Leaky Integrate and Analog Fire Network for Lightweight and Efficient Spatiotemporal Information Processing

Spiking neural networks (SNNs) based on Leaky Integrate and Fire (LIF) model have been applied to energy-efficient temporal and spatiotemporal processing tasks. Thanks to the bio-plausible neuronal dynamics and simplicity, LIF-SNN benefits from event-driven processing, however, usually faces the embarrassment of reduced performance. This may because in LIF-SNN the neurons transmit information via spikes. To address this issue, in this work, we propose a Leaky Integrate and Analog Fire (LIAF) neuron model, so that analog values can be transmitted among neurons, and a deep network termed as LIAF-Net is built on it for efficient spatiotemporal processing. In the temporal domain, LIAF follows the traditional LIF dynamics to maintain its temporal processing capability. In the spatial domain, LIAF is able to integrate spatial information through convolutional integration or fully-connected integration. As a spatiotemporal layer, LIAF can also be used with traditional artificial neural network (ANN) layers jointly. Experiment results indicate that LIAF-Net achieves comparable performance to Gated Recurrent Unit (GRU) and Long short-term memory (LSTM) on bAbI Question Answering (QA) tasks, and achieves state-of-the-art performance on spatiotemporal Dynamic Vision Sensor (DVS) datasets, including MNIST-DVS, CIFAR10-DVS and DVS128 Gesture, with much less number of synaptic weights and computational overhead compared with traditional networks built by LSTM, GRU, Convolutional LSTM (ConvLSTM) or 3D convolution (Conv3D). Compared with traditional LIF-SNN, LIAF-Net also shows dramatic accuracy gain on all these experiments. In conclusion, LIAF-Net provides a framework combining the advantages of both ANNs and SNNs for lightweight and efficient spatiotemporal information processing.

preprint2014arXiv

Scaling laws governing stochastic growth and division of single bacterial cells

Uncovering the quantitative laws that govern the growth and division of single cells remains a major challenge. Using a unique combination of technologies that yields unprecedented statistical precision, we find that the sizes of individual Caulobacter crescentus cells increase exponentially in time. We also establish that they divide upon reaching a critical multiple ($\approx$1.8) of their initial sizes, rather than an absolute size. We show that when the temperature is varied, the growth and division timescales scale proportionally with each other over the physiological temperature range. Strikingly, the cell-size and division-time distributions can both be rescaled by their mean values such that the condition-specific distributions collapse to universal curves. We account for these observations with a minimal stochastic model that is based on an autocatalytic cycle. It predicts the scalings, as well as specific functional forms for the universal curves. Our experimental and theoretical analysis reveals a simple physical principle governing these complex biological processes: a single temperature-dependent scale of cellular time governs the stochastic dynamics of growth and division in balanced growth conditions.

preprint2012arXiv

Phase resetting reveals network dynamics underlying a bacterial cell cycle

Genomic and proteomic methods yield networks of biological regulatory interactions but do not provide direct insight into how those interactions are organized into functional modules, or how information flows from one module to another. In this work we introduce an approach that provides this complementary information and apply it to the bacterium Caulobacter crescentus, a paradigm for cell-cycle control. Operationally, we use an inducible promoter to express the essential transcriptional regulatory gene ctrA in a periodic, pulsed fashion. This chemical perturbation causes the population of cells to divide synchronously, and we use the resulting advance or delay of the division times of single cells to construct a phase resetting curve. We find that delay is strongly favored over advance. This finding is surprising since it does not follow from the temporal expression profile of CtrA and, in turn, simulations of existing network models. We propose a phenomenological model that suggests that the cell-cycle network comprises two distinct functional modules that oscillate autonomously and couple in a highly asymmetric fashion. These features collectively provide a new mechanism for tight temporal control of the cell cycle in C. crescentus. We discuss how the procedure can serve as the basis for a general approach for probing network dynamics, which we term chemical perturbation spectroscopy (CPS).