Source author record

Yu Gu

Yu Gu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Catalog footprint

What is connected

41 works

23 topics

4 close collaborators

preprint, 2026, arXiv

Long-Chain Reasoning Distillation via Adaptive Prefix Alignment

Large Language Models (LLMs) have demonstrated remarkable reasoning capabilities, particularly in solving complex mathematical problems. Recent studies show that distilling long reasoning trajectories can effectively enhance the reasoning performance of small-scale student models. However, teacher-generated reasoning trajectories are often excessively long and structurally complex, making them difficult for student models to learn. This mismatch leads to a gap between the provided supervision signal and the learning capacity of the student model. To address this challenge, we propose Prefix-ALIGNment distillation (P-ALIGN), a framework that fully exploits teacher CoTs for distillation through adaptive prefix alignment. Specifically, P-ALIGN adaptively truncates teacher-generated reasoning trajectories by determining whether the remaining suffix is concise and sufficient to guide the student model. Then, P-ALIGN leverages the teacher-generated prefix to supervise the student model, encouraging effective prefix alignment. Experiments on multiple mathematical reasoning benchmarks demonstrate that P-ALIGN outperforms all baselines by over 3%. Further analysis indicates that the prefixes constructed by P-ALIGN provide more effective supervision signals, while avoiding the negative impact of redundant and uncertain reasoning components. All code is available at https://github.com/NEUIR/P-ALIGN.

preprint, 2026, arXiv

Pause and Reflect: Conformal Aggregation for Chain-of-Thought Reasoning

Chain-of-thought (CoT) reasoning with self-consistency improves performance by aggregating multiple sampled reasoning paths. In this setting, correctness is no longer tied to a single reasoning trace but to the aggregation rule over a pool of candidate paths, making aggregation uncertainty the central challenge. This issue is critical where confidently incorrect answers are far more costly than abstentions. We introduce a conformal procedure for CoT reasoning that directly addresses aggregation uncertainty. Our approach replaces majority voting with weighted score aggregation over reasoning paths and calibrates an abstention rule using conformal risk control. This approach leads to finite-sample guarantees on the confident-error rate, that is, the probability that the system answers and is wrong. We further identify score separability as the key condition under which abstention provably improves selective accuracy, and derive closed-form expressions that predict accuracy gains from calibration data alone. The method operates entirely at inference time and requires no retraining. Across four benchmarks, four open-source models, and three score classes, realized confident-error rates are consistent with the prescribed targets up to calibration-split and test-set variability. Our method achieves 90.1% selective accuracy on GSM8K by abstaining on less than 5% of problems, compared with 82% accuracy under the majority-voting baseline.
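The abstention idea in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the normalized-score confidence, and the threshold search are assumptions made for illustration. The calibration step uses the standard conformal risk control bound for a loss bounded by 1, picking the smallest threshold whose adjusted empirical confident-error rate stays at or below the target alpha.

```python
from collections import defaultdict
import math


def aggregate(paths):
    """Weighted score aggregation (hypothetical helper).

    paths: list of (answer, score) pairs, one per sampled reasoning path.
    Returns the top-scoring answer and its share of the total score.
    """
    totals = defaultdict(float)
    for answer, score in paths:
        totals[answer] += score
    best = max(totals, key=totals.get)
    return best, totals[best] / sum(totals.values())


def calibrate_threshold(cal, alpha):
    """Pick an abstention threshold from calibration data (illustrative).

    cal: list of (confidence, is_correct) pairs.
    The system answers only when confidence >= threshold, so the loss
    "answered and wrong" can only shrink as the threshold grows.
    Returns the smallest candidate threshold lam satisfying
    (errors(lam) + 1) / (n + 1) <= alpha, or infinity (always abstain).
    """
    n = len(cal)
    candidates = sorted({c for c, _ in cal}) + [math.inf]
    for lam in candidates:
        errors = sum(1 for c, ok in cal if c >= lam and not ok)
        if (errors + 1) / (n + 1) <= alpha:
            return lam
    return math.inf


def answer_or_abstain(paths, lam):
    """Return the aggregated answer, or None to signal abstention."""
    best, confidence = aggregate(paths)
    return best if confidence >= lam else None
```

At test time, `answer_or_abstain` would be called with the threshold obtained from a held-out calibration split; the choice of score function is left open here.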

preprint, 2026, arXiv

Revealing the Attention Floating Mechanism in Masked Diffusion Models

Masked diffusion models (MDMs), which leverage bidirectional attention and a denoising process, are narrowing the performance gap with autoregressive models (ARMs). However, their internal attention mechanisms remain under-explored. This paper investigates the attention behaviors in MDMs, revealing the phenomenon of Attention Floating. Unlike ARMs, where attention converges to a fixed sink, MDMs exhibit dynamic, dispersed attention anchors that shift across denoising steps and layers. Further analysis reveals its Shallow Structure-Aware, Deep Content-Focused attention mechanism: shallow layers utilize floating tokens to build a global structural framework, while deeper layers allocate more capability toward capturing semantic content. Empirically, this distinctive attention pattern provides a mechanistic explanation for the strong in-context learning capabilities of MDMs, allowing them to double the performance of ARMs on knowledge-intensive tasks. All code and datasets are available at https://github.com/NEUIR/Attention-Floating.

preprint, 2026, arXiv

What-If Analysis of Large Language Models: Explore the Game World Using Proactive Thinking

LLMs struggle with decision-making in high-stakes environments like MOBA games, primarily due to a lack of proactive reasoning and limited understanding of complex game dynamics. To address this, we propose What-if Analysis LLM (WiA-LLM), a framework that trains an LLM as an explicit, language-based world model. Instead of representing the environment in latent vectors, WiA-LLM uses natural language to simulate how the game state evolves over time in response to candidate actions, and provides textual justifications for these predicted outcomes. WiA-LLM is trained in two stages: supervised fine-tuning on human-like reasoning traces, followed by reinforcement learning with outcome-based rewards based on the alignment between predicted and actual future states. In the Honor of Kings (HoK) environment, WiA-LLM attains 74.2% accuracy (up 27% vs. the base model) in forecasting game-state changes. In addition, WiA-LLM demonstrates strategic behavior more closely aligned with expert players than purely reactive LLMs, indicating enhanced foresight and expert-like decision-making.

preprint, 2024, arXiv

Multichannel AV-wav2vec2: A Framework for Learning Multichannel Multi-Modal Speech Representation

Self-supervised speech pre-training methods have developed rapidly in recent years and have been shown to be very effective for many near-field single-channel speech tasks. However, far-field multichannel speech processing suffers from the scarcity of labeled multichannel data and complex ambient noises. The efficacy of self-supervised learning for far-field multichannel and multi-modal speech processing has not been well explored. Considering that visual information helps to improve speech recognition performance in noisy scenes, in this work we propose a multichannel multi-modal speech self-supervised learning framework AV-wav2vec2, which utilizes video and multichannel audio data as inputs. First, we propose a multi-path structure to process multichannel audio streams and a visual stream in parallel, with intra- and inter-channel contrastive losses as training targets to fully exploit the spatiotemporal information in multichannel speech data. Second, based on contrastive learning, we use additional single-channel audio data, which is trained jointly to improve the performance of speech representation. Finally, we use a Chinese multichannel multi-modal dataset in real scenarios to validate the effectiveness of the proposed method on audio-visual speech recognition (AVSR), automatic speech recognition (ASR), visual speech recognition (VSR) and audio-visual speaker diarization (AVSD) tasks.

preprint, 2023, arXiv

Fluctuations of the winding number of a directed polymer on a cylinder

We prove a central limit theorem for the winding number of a directed polymer on a cylinder, which is equivalent to proving the Gaussian fluctuations of the endpoint of the directed polymer in a spatially periodic environment.

preprint, 2022, arXiv

A Framework for Controlling Multi-Robot Systems Using Bayesian Optimization and Linear Combination of Vectors

We propose a general framework for creating parameterized control schemes for decentralized multi-robot systems. A variety of tasks can be seen in the decentralized multi-robot literature, each with many possible control schemes. For several of them, the agents choose control velocities using algorithms that extract information from the environment and combine that information in meaningful ways. From this basic formation, a framework is proposed that classifies each robot's measurement information as sets of relevant scalars and vectors and creates a linear combination of the measured vector sets. Along with an optimizable parameter set, the scalar measurements are used to generate the coefficients for the linear combination. With this framework and Bayesian optimization, we can create effective control systems for several multi-robot tasks, including cohesion and segregation, pattern formation, and searching/foraging.
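The linear-combination idea described above might look roughly like the following sketch. The abstract does not say how coefficients are generated from the scalars and parameters, so the affine map `bias + gain * scalar` is purely an assumption, as are the function and argument names. The Bayesian optimization step that would tune `params` is not shown.

```python
def control_velocity(vectors, scalars, params):
    """Combine measured 2D vectors into one control velocity (illustrative).

    vectors: list of (x, y) measured vectors, e.g. direction to a neighbor.
    scalars: list of scalar measurements, one per vector.
    params:  list of (bias, gain) pairs, the tunable parameter set.
    Each coefficient is an ASSUMED affine function of its scalar.
    """
    vx = vy = 0.0
    for (x, y), s, (bias, gain) in zip(vectors, scalars, params):
        coeff = bias + gain * s  # assumed coefficient model, not from the paper
        vx += coeff * x
        vy += coeff * y
    return vx, vy
```

An outer optimizer would evaluate a task-level cost for each candidate `params` and keep the best-performing set.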

preprint, 2022, arXiv

Combining deep learning and crowdsourcing geo-images to predict housing quality in rural China

Housing quality is an essential proxy for regional wealth, security and health. Understanding the distribution of housing quality is crucial for unveiling rural development status and providing policy proposals. However, present rural house quality data depends heavily on top-down, time-consuming surveys at the national or provincial level and fails to unpack housing quality at the village level. To fill the gap between accurately depicting rural housing quality conditions and deficient data, we collect massive rural images and invite users to assess their housing quality at scale. Furthermore, a deep learning framework is proposed to automatically and efficiently predict housing quality based on crowdsourced rural images.

preprint, 2022, arXiv

Gaussian fluctuations of replica overlap in directed polymers

In this short note, we prove a central limit theorem for a type of replica overlap of the Brownian directed polymer in a Gaussian random environment, in the low temperature regime and in all dimensions. The proof relies on a superconcentration result for the KPZ equation driven by a spatially mollified noise, which is inspired by the recent work of Chatterjee.

preprint, 2022, arXiv

HTGN-BTW: Heterogeneous Temporal Graph Network with Bi-Time-Window Training Strategy for Temporal Link Prediction

With the development of temporal networks such as E-commerce networks and social networks, the issue of temporal link prediction has attracted increasing attention in recent years. The Temporal Link Prediction task of WSDM Cup 2022 expects a single model that can work well on two kinds of temporal graphs simultaneously, which have quite different characteristics and data properties, to predict whether a link of a given type will occur between two given nodes within a given time span. Our team, named "nothing here", regards this task as a link prediction task in heterogeneous temporal networks and proposes a generic model, i.e., Heterogeneous Temporal Graph Network (HTGN), to solve this temporal link prediction task with unfixed time intervals and diverse link types. That is, HTGN can adapt to the heterogeneity of links and the prediction with unfixed time intervals within an arbitrary given time period. To train the model, we design a Bi-Time-Window training strategy (BTW) which has two kinds of mini-batches from two kinds of time windows. As a result, for the final test, we achieved an AUC of 0.662482 on dataset A, an AUC of 0.906923 on dataset B, and won 2nd place with an average T-score of 0.628942.

preprint, 2022, arXiv

Mass Testing and Characterization of 20-inch PMTs for JUNO

The main goal of the JUNO experiment is to determine the neutrino mass ordering using a 20 kt liquid-scintillator detector. Its key feature is an excellent energy resolution of at least 3% at 1 MeV, for which its instruments need to meet a certain quality and thus have to be fully characterized. More than 20,000 20-inch PMTs have been received and assessed by JUNO after a detailed testing program which began in 2017 and lasted about four years. Based on this mass characterization and a set of specific requirements, the good quality of all accepted PMTs was confirmed. This paper presents the performed testing procedure with the designed testing systems as well as the statistical characteristics of all 20-inch PMTs intended to be used in the JUNO experiment, covering more than fifteen performance parameters including the photocathode uniformity. This constitutes the largest sample of 20-inch PMTs ever produced and studied in detail to date, i.e., 15,000 of the newly developed 20-inch MCP-PMTs from Northern Night Vision Technology Co. (NNVT) and 5,000 dynode PMTs from Hamamatsu Photonics K.K. (HPK).

preprint, 2022, arXiv

Proprioceptive Slip Detection for Planetary Rovers in Perceptually Degraded Extraterrestrial Environments

Slip detection is of fundamental importance for the safety and efficiency of rovers driving on the surface of extraterrestrial bodies. Current planetary rover slip detection systems rely on visual perception on the assumption that sufficient visual features can be acquired in the environment. However, visual-based methods are prone to suffer in perceptually degraded planetary environments with dominant low terrain features such as regolith, glacial terrain, salt-evaporites, and poor lighting conditions such as dark caves and permanently shadowed regions. Relying only on visual sensors for slip detection also requires additional computational power and reduces the rover traversal rate. This paper answers the question of how to detect wheel slippage of a planetary rover without depending on visual perception. In this respect, we propose a slip detection system that obtains its information from a proprioceptive localization framework that is capable of providing reliable, continuous, and computationally efficient state estimation over hundreds of meters. This is accomplished by using zero velocity update, zero angular rate update, and non-holonomic constraints as pseudo-measurement updates on an inertial navigation system framework. The proposed method is evaluated on actual hardware and field-tested in a planetary-analog environment. The method achieves greater than 92% slip detection accuracy for distances around 150 m using only an IMU and wheel encoders.
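The zero velocity update mentioned above can be illustrated in one dimension. This is a textbook scalar Kalman measurement update with the pseudo-measurement "velocity equals zero", not the paper's filter: the real system works on a full inertial navigation state, and the function name and noise values are assumptions.

```python
def zero_velocity_update(v_est, p_var, r_var):
    """Scalar Kalman update with the pseudo-measurement z = 0 (illustrative).

    v_est: current velocity estimate.
    p_var: variance of that estimate.
    r_var: variance assigned to the "rover is stationary" pseudo-measurement.
    Returns the corrected velocity estimate and its reduced variance.
    """
    gain = p_var / (p_var + r_var)        # Kalman gain for H = 1
    v_new = v_est + gain * (0.0 - v_est)  # pull the estimate toward zero
    p_new = (1.0 - gain) * p_var          # uncertainty shrinks after the update
    return v_new, p_new
```

A small `r_var` expresses high trust that the rover is truly stopped, so accumulated velocity error is almost fully removed at each stop.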

preprint, 2021, arXiv

A forward-backward SDE from the 2D nonlinear stochastic heat equation

We consider a nonlinear stochastic heat equation in spatial dimension $d=2$, forced by a white-in-time multiplicative Gaussian noise with spatial correlation length $\varepsilon>0$ but divided by a factor of $\sqrt{\log\varepsilon^{-1}}$. We impose a condition on the Lipschitz constant of the nonlinearity so that the problem is in the "weak noise" regime. We show that, as $\varepsilon\downarrow0$, the one-point distribution of the solution converges, with the limit characterized in terms of the solution to a forward-backward stochastic differential equation (FBSDE). We also characterize the limiting multipoint statistics of the solution, when the points are chosen on appropriate scales, in similar terms. Our approach is new even for the linear case, in which the FBSDE can be solved explicitly and we recover results of Caravenna, Sun, and Zygouras (Ann. Appl. Probab. 27(5):3050--3112, 2017).

preprint, 2021, arXiv

A quenched local limit theorem for stochastic flows

We consider a particle undergoing Brownian motion in Euclidean space of any dimension, forced by a Gaussian random velocity field that is white in time and smooth in space. We show that conditional on the velocity field, the quenched density of the particle after a long time can be approximated pointwise by the product of a deterministic Gaussian density and a spacetime-stationary random field $U$. If the velocity field is additionally assumed to be incompressible, then $U\equiv 1$ almost surely and we obtain a local central limit theorem.

preprint, 2021, arXiv

Beyond I.I.D.: Three Levels of Generalization for Question Answering on Knowledge Bases

Existing studies on question answering on knowledge bases (KBQA) mainly operate with the standard i.i.d. assumption, i.e., training distribution over questions is the same as the test distribution. However, i.i.d. may be neither reasonably achievable nor desirable on large-scale KBs because 1) true user distribution is hard to capture and 2) randomly sampling training examples from the enormous space would be highly data-inefficient. Instead, we suggest that KBQA models should have three levels of built-in generalization: i.i.d., compositional, and zero-shot. To facilitate the development of KBQA models with stronger generalization, we construct and release a new large-scale, high-quality dataset with 64,331 questions, GrailQA, and provide evaluation settings for all three levels of generalization. In addition, we propose a novel BERT-based KBQA model. The combination of our dataset and model enables us to thoroughly examine and demonstrate, for the first time, the key role of pre-trained contextual embeddings like BERT in the generalization of KBQA.

preprint, 2021, arXiv

ByteSing: A Chinese Singing Voice Synthesis System Using Duration Allocated Encoder-Decoder Acoustic Models and WaveRNN Vocoders

This paper presents ByteSing, a Chinese singing voice synthesis (SVS) system based on duration allocated Tacotron-like acoustic models and WaveRNN neural vocoders. Unlike conventional SVS models, the proposed ByteSing employs Tacotron-like encoder-decoder structures as the acoustic models, in which the CBHG models and recurrent neural networks (RNNs) are explored as encoders and decoders respectively. Meanwhile, an auxiliary phoneme duration prediction model is utilized to expand the input sequence, which can enhance the model's controllability, stability and tempo prediction accuracy. WaveRNN neural vocoders are also adopted to further improve the voice quality of synthesized songs. Both objective and subjective experimental results show that the SVS method proposed in this paper can produce quite natural, expressive and high-fidelity songs by improving the pitch and spectrogram prediction accuracy, and that the models using an attention mechanism achieve the best performance.

preprint, 2021, arXiv

INSQ: An Influential Neighbor Set Based Moving kNN Query Processing System

We revisit the moving k nearest neighbor (MkNN) query, which computes one's k nearest neighbor set and maintains it while on the move. Existing MkNN algorithms are mostly safe region based, which lack efficiency due to either computing small safe regions with a high recomputation frequency or computing larger safe regions but with a high cost for each computation. In this demonstration, we showcase a system named INSQ that adopts a novel algorithm called the Influential Neighbor Set (INS) algorithm to process the MkNN query in both two-dimensional Euclidean space and road networks. This algorithm uses a small set of safe guarding objects instead of safe regions. As long as the current k nearest neighbors are closer to the query object than the safe guarding objects are, the current k nearest neighbors stay valid and no recomputation is required. Meanwhile, the region defined by the safe guarding objects is the largest possible safe region. This means that the recomputation frequency is also minimized and hence, the INS algorithm achieves high overall query processing efficiency.
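The validity condition stated in this abstract can be sketched for the Euclidean case as follows. How the safe guarding objects are selected is not described in the abstract, so they are simply passed in here; the function name is hypothetical.

```python
import math


def knn_still_valid(query, knn, guards):
    """Check the validity condition described above (illustrative).

    query:  current position of the moving query object, e.g. (x, y).
    knn:    the current k nearest neighbors.
    guards: the safe guarding objects (their selection is not shown here).
    Returns True while every current neighbor is at least as close to the
    query as every guarding object, so no recomputation is needed.
    """
    farthest_neighbor = max(math.dist(query, p) for p in knn)
    nearest_guard = min(math.dist(query, g) for g in guards)
    return farthest_neighbor <= nearest_guard
```

A moving client would call this check at each position update and recompute the kNN set, together with a fresh guard set, only when it returns False.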

preprint, 2021, arXiv

JUNO Physics and Detector

The Jiangmen Underground Neutrino Observatory (JUNO) is a 20 kton liquid scintillator (LS) detector located 700 m underground. An excellent energy resolution and a large fiducial volume offer exciting opportunities for addressing many important topics in neutrino and astro-particle physics. With 6 years of data, the neutrino mass ordering can be determined at 3-4 sigma and three oscillation parameters can be measured to a precision of 0.6% or better by detecting reactor antineutrinos. With 10 years of data, DSNB could be observed at 3-sigma; a lower limit of the proton lifetime of $8.34\times10^{33}$ years (90% C.L.) can be set by searching for $p\to\bar{\nu}K^+$; detection of solar neutrinos would shed new light on the solar metallicity problem and examine the vacuum-matter transition region. A core-collapse supernova at 10 kpc would lead to ~5000 IBD and ~2000 (300) all-flavor neutrino-proton (electron) scattering events. Geo-neutrinos can be detected with a rate of ~400 events/year. We also summarize the final design of the JUNO detector and the key R&D achievements. All 20-inch PMTs have been tested. The average photon detection efficiency is 28.9% for the 15,000 MCP PMTs and 28.1% for the 5,000 dynode PMTs, higher than the JUNO requirement of 27%. Together with the >20 m attenuation length of LS, we expect a yield of 1345 p.e. per MeV and an effective energy resolution of $3.02\%/\sqrt{E\,(\mathrm{MeV})}$ in simulations. The underwater electronics is designed to have a loss rate <0.5% in 6 years. With degassing membranes and a micro-bubble system, the radon concentration in the 35-kton water pool could be lowered to <10 mBq/m^3. Acrylic panels of radiopurity <0.5 ppt U/Th are produced. The 20-kton LS will be purified onsite. Singles in the fiducial volume can be controlled to ~10 Hz. The JUNO experiment also features a double calorimeter system with 25,600 3-inch PMTs, a LS testing facility OSIRIS, and a near detector TAO.
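As a rough reading aid for the resolution figure quoted above, the scaling can be evaluated at two sample energies and compared with the pure counting-statistics estimate from the quoted light yield. The sample energies are chosen for illustration only, and the symbols $\sigma_E$ and $N_{\mathrm{p.e.}}$ are notation introduced here, not taken from the abstract.

```latex
\[
\frac{\sigma_E}{E} = \frac{3.02\%}{\sqrt{E/\mathrm{MeV}}}
\quad\Longrightarrow\quad
\left.\frac{\sigma_E}{E}\right|_{1\,\mathrm{MeV}} = 3.02\%,
\qquad
\left.\frac{\sigma_E}{E}\right|_{4\,\mathrm{MeV}} = \frac{3.02\%}{2} = 1.51\%
\]
\[
\frac{1}{\sqrt{N_{\mathrm{p.e.}}}} = \frac{1}{\sqrt{1345}} \approx 2.73\%
\quad \text{at } 1\,\mathrm{MeV}
\]
```

The quoted effective resolution is somewhat larger than the counting-statistics value alone, which would be consistent with additional contributions being folded into the simulated figure; the abstract does not itemize them.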

preprint, 2021, arXiv

Search Planning of a UAV/UGV Team with Localization Uncertainty in a Subterranean Environment

We present a waypoint planning algorithm for an unmanned aerial vehicle (UAV) that is teamed with an unmanned ground vehicle (UGV) for the task of search and rescue in a subterranean environment. The UAV and UGV are teamed such that the localization of the UAV is conducted on the UGV via the multi-sensor fusion of a fish-eye camera, 3D LIDAR, ranging radio, and a laser altimeter. Likewise, the trajectory planning of the UAV is conducted on the UGV, which is assumed to have a 3D map of the environment (e.g., from Simultaneous Localization and Mapping). The goal of the planning algorithm is to satisfy the mission's exploration criteria while reducing the localization error of the UAV by evaluating the belief space for potential exploration routes. The presented algorithm is evaluated in a relevant simulation environment where the planning algorithm is shown to be effective at reducing the localization errors of the UAV.

preprint, 2020, arXiv

BeSense: Leveraging WiFi Channel Data and Computational Intelligence for Behavior Analysis

Ever-evolving informatics technology has gradually bound humans and computers closely together. Understanding user behavior has become a key enabler in many fields such as sedentary-related healthcare, human-computer interaction (HCI) and affective computing. Traditional sensor-based and vision-based user behavior analysis approaches are obtrusive in general, hindering their usage in the real world. Therefore, in this article, we first introduce the WiFi signal as a new source, instead of sensors and vision, for unobtrusive user behavior analysis. Then we design BeSense, a contactless behavior analysis system leveraging signal processing and computational intelligence over WiFi channel state information (CSI). We prototype BeSense on commodity low-cost WiFi devices and evaluate its performance in real-world environments. Experimental results have verified its effectiveness in recognizing user behaviors.

preprint, 2020, arXiv

BrePartition: Optimized High-Dimensional kNN Search with Bregman Distances

Bregman distances (also known as Bregman divergences) are widely used in machine learning, speech recognition and signal processing, and kNN searches with Bregman distances have become increasingly important with the rapid advances of multimedia applications. Data in multimedia applications such as images and videos are commonly transformed into spaces with hundreds of dimensions. Such high-dimensional spaces have posed significant challenges for existing kNN search algorithms with Bregman distances, which could only handle data of medium dimensionality (typically less than 100). This paper addresses the urgent problem of high-dimensional kNN search with Bregman distances. We propose a novel partition-filter-refinement framework. Specifically, we propose an optimized dimensionality partitioning scheme to solve several non-trivial issues. First, an effective bound from each partitioned subspace to obtain exact kNN results is derived. Second, we conduct an in-depth analysis of the optimized number of partitions and devise an effective strategy for partitioning. Third, we design an efficient integrated index structure for all the subspaces together to accelerate the search processing. Moreover, we extend our exact solution to an approximate version by a trade-off between the accuracy and efficiency. Experimental results on four real-world datasets and two synthetic datasets show the clear advantage of our method in comparison to state-of-the-art algorithms.
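For readers unfamiliar with Bregman distances, the following sketch shows the standard definition, D_f(x, y) = f(x) - f(y) - <grad f(y), x - y>, with two common generator functions and a brute-force kNN baseline. It does not implement the partition-filter-refinement framework from the abstract. The helper names are hypothetical, and because Bregman distances are asymmetric, placing the query in the second argument is an assumption.

```python
import math


def bregman(f, grad_f, x, y):
    """Bregman distance D_f(x, y) = f(x) - f(y) - <grad f(y), x - y>."""
    inner = sum(g * (xi - yi) for g, xi, yi in zip(grad_f(y), x, y))
    return f(x) - f(y) - inner


def sq_norm(x):
    """Generator f(x) = ||x||^2, which yields squared Euclidean distance."""
    return sum(v * v for v in x)


def sq_norm_grad(x):
    return [2.0 * v for v in x]


def neg_entropy(x):
    """Generator f(x) = sum x_i log x_i (x_i > 0), yielding generalized KL."""
    return sum(v * math.log(v) for v in x)


def neg_entropy_grad(x):
    return [math.log(v) + 1.0 for v in x]


def brute_force_knn(f, grad_f, query, data, k):
    """Exact kNN by scanning all points; a baseline, not the paper's method."""
    return sorted(data, key=lambda p: bregman(f, grad_f, p, query))[:k]
```

An index-based method such as the one described in the abstract aims to return the same exact result while avoiding the full scan.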

preprint, 2020, arXiv

Cooperative Navigation Using Pairwise Communication with Ranging and Magnetic Anomaly Measurements

The problem of cooperative localization for a small group of Unmanned Aerial Vehicles (UAVs) in a GNSS denied environment is addressed in this paper. The presented approach contains two sequential steps: first, an algorithm called cooperative ranging localization, formulated as an Extended Kalman Filter (EKF), estimates each UAV's relative pose inside the group using inter-vehicle ranging measurements; second, an algorithm named cooperative magnetic localization, formulated as a particle filter, estimates each UAV's global pose through matching the group's magnetic anomaly measurements to a given magnetic anomaly map. In this study, each UAV is assumed to only perform a ranging measurement and data exchange with one other UAV at any point in time. A simulator is developed to evaluate the algorithms with magnetic anomaly maps acquired from airborne geophysical survey. The simulation results show that the average estimated position error of a group of 16 UAVs is approximately 20 meters after flying about 180 kilometers in 1 hour. Sensitivity analysis shows that the algorithms can tolerate large variations of velocity, yaw rate, and magnetic anomaly measurement noises. Additionally, the UAV group shows improved position estimation robustness with both high and low resolution maps as more UAVs are added into the group.

preprint, 2020, arXiv

Feasibility and physics potential of detecting $^8$B solar neutrinos at JUNO

The Jiangmen Underground Neutrino Observatory (JUNO) features a 20 kt multi-purpose underground liquid scintillator sphere as its main detector. Some of JUNO's features make it an excellent experiment for $^8$B solar neutrino measurements, such as its low-energy threshold, its high energy resolution compared to water Cherenkov detectors, and its much larger target mass compared to previous liquid scintillator detectors. In this paper we present a comprehensive assessment of JUNO's potential for detecting $^8$B solar neutrinos via the neutrino-electron elastic scattering process. A reduced 2 MeV threshold on the recoil electron energy is found to be achievable assuming the intrinsic radioactive background $^{238}$U and $^{232}$Th in the liquid scintillator can be controlled to $10^{-17}$ g/g. With ten years of data taking, about 60,000 signal and 30,000 background events are expected. This large sample will enable an examination of the distortion of the recoil electron spectrum that is dominated by the neutrino flavor transformation in the dense solar matter, which will shed new light on the tension between the measured electron spectra and the predictions of the standard three-flavor neutrino oscillation framework. If $\Delta m^{2}_{21}=4.8\times10^{-5}\ (7.5\times10^{-5})$ eV$^{2}$, JUNO can provide evidence of neutrino oscillation in the Earth at about the $3\sigma$ ($2\sigma$) level by measuring the non-zero signal rate variation with respect to the solar zenith angle. Moreover, JUNO can simultaneously measure $\Delta m^2_{21}$ using $^8$B solar neutrinos to a precision of 20% or better depending on the central value and to sub-percent precision using reactor antineutrinos. A comparison of these two measurements from the same detector will help elucidate the current tension between the value of $\Delta m^2_{21}$ reported by solar neutrino experiments and the KamLAND experiment.

preprint, 2020, arXiv

Fluctuations of a nonlinear stochastic heat equation in dimensions three and higher

We study the solution to a nonlinear stochastic heat equation in $d\geq 3$. The equation is driven by a Gaussian multiplicative noise that is white in time and smooth in space. For a small coupling constant, we prove (i) the solution converges to the stationary distribution in large time; (ii) the diffusive scale fluctuations are described by the Edwards-Wilkinson equation.

preprint, 2020, arXiv

TAO Conceptual Design Report: A Precision Measurement of the Reactor Antineutrino Spectrum with Sub-percent Energy Resolution

The Taishan Antineutrino Observatory (TAO, also known as JUNO-TAO) is a satellite experiment of the Jiangmen Underground Neutrino Observatory (JUNO). A ton-level liquid scintillator detector will be placed at about 30 m from a core of the Taishan Nuclear Power Plant. The reactor antineutrino spectrum will be measured with sub-percent energy resolution, to provide a reference spectrum for future reactor neutrino experiments, and to provide a benchmark measurement to test nuclear databases. A spherical acrylic vessel containing 2.8 tons of gadolinium-doped liquid scintillator will be viewed by 10 m^2 of Silicon Photomultipliers (SiPMs) of >50% photon detection efficiency with almost full coverage. The photoelectron yield is about 4500 per MeV, an order of magnitude higher than that of any existing large-scale liquid scintillator detector. The detector operates at -50 °C to lower the dark noise of SiPMs to an acceptable level. The detector will measure about 2000 reactor antineutrinos per day, and is designed to be well shielded from cosmogenic backgrounds and ambient radioactivities to have about 10% background-to-signal ratio. The experiment is expected to start operation in 2022.

preprint, 2019, arXiv

Improved Planetary Rover Inertial Navigation and Wheel Odometry Performance through Periodic Use of Zero-Type Constraints

We present an approach to enhance wheeled planetary rover dead-reckoning localization performance by leveraging the use of zero-type constraint equations in the navigation filter. Without external aiding, inertial navigation solutions inherently exhibit cubic error growth. Furthermore, for planetary rovers that are traversing diverse types of terrain, wheel odometry is often unreliable for use in localization, due to wheel slippage. For current Mars rovers, computer vision-based approaches are generally used whenever there is a high possibility of positioning error; however, these strategies require additional computational power and energy resources and significantly slow down the rover traverse speed. To this end, we propose a navigation approach that compensates for the high likelihood of odometry errors by providing a reliable navigation solution that leverages non-holonomic vehicle constraints as well as state-aware pseudo-measurement updates (e.g., zero velocity and zero angular rate) during periodic stops. With this approach, computationally expensive visual-based corrections could be performed less often. Experimental tests that compare against GPS-based localization are used to demonstrate the accuracy of the proposed approach. The source code, post-processing scripts, and example datasets associated with the paper are published in a public repository.

preprint, 2016, arXiv

High order correctors and two-scale expansions in stochastic homogenization

In this paper, we study high order correctors in stochastic homogenization. We consider elliptic equations in divergence form on $\mathbb{Z}^d$, with the random coefficients constructed from i.i.d. random variables. We prove moment bounds on the high order correctors and their gradients under dimensional constraints. This implies the existence of stationary correctors and stationary gradients in high dimensions. As an application, we prove a two-scale expansion of the solutions to the random PDE, which identifies the first and higher order random fluctuations in a strong sense.

preprint2016arXiv

Near-Optimal Disjoint-Path Facility Location Through Set Cover by Pairs

In this paper we consider two special cases of the "cover-by-pairs" optimization problem that arise when we need to place facilities so that each customer is served by two facilities that reach it by disjoint shortest paths. These problems arise in a network traffic monitoring scheme proposed by Breslau et al. and have potential applications to content distribution. The "set-disjoint" variant applies to networks that use the OSPF routing protocol, and the "path-disjoint" variant applies when MPLS routing is enabled, making better solutions possible at the cost of greater operational expense. Although we can prove that no polynomial-time algorithm can guarantee good solutions for either version, we are able to provide heuristics that do very well in practice on instances with real-world network structure. Fast implementations of the heuristics, made possible by exploiting mathematical observations about the relationship between the network instances and the corresponding instances of the cover-by-pairs problem, allow us to perform an extensive experimental evaluation of the heuristics and what the solutions they produce tell us about the effectiveness of the proposed monitoring scheme. For the set-disjoint variant, we validate our claim of near-optimality via a new lower-bounding integer programming formulation. Although computing this lower bound requires solving the NP-hard Hitting Set problem and can underestimate the optimal value by a linear factor in the worst case, it can be computed quickly by CPLEX, and it equals the optimal solution value for all the instances in our extensive testbed.

preprint2016arXiv

On generalized Gaussian free fields and stochastic homogenization

We study a generalization of the notion of Gaussian free field (GFF). Although the extension seems minor, we first show that a generalized GFF does not satisfy the spatial Markov property, unless it is a classical GFF. In stochastic homogenization, the scaling limit of the corrector is a possibly generalized GFF described in terms of an "effective fluctuation tensor" that we denote by $\mathsf{Q}$. We prove an expansion of $\mathsf{Q}$ in the regime of small ellipticity ratio. This expansion shows that the scaling limit of the corrector is not necessarily a classical GFF, and in particular does not necessarily satisfy the Markov property.

preprint2016arXiv

Stationary patterns and their selection mechanism of Urban crime models with heterogeneous near-repeat victimization effect

In this paper, we study two PDEs that generalize the urban crime model proposed by Short et al. [Math. Models Methods Appl. Sci., 18 (2008), pp. 1249-1267]. Our modifications are made under the assumption of spatial heterogeneity of both the near-repeat victimization effect and the dispersal strategy of criminal agents. We investigate pattern formation in the reaction-advection-diffusion systems with nonlinear diffusion over multi-dimensional bounded domains subject to homogeneous Neumann boundary conditions. It is shown that the positive homogeneous steady state loses its stability as the intrinsic near-repeat victimization rate $ε$ decreases and spatially nonconstant solutions emerge through bifurcation. Moreover, we find the wavemode selection mechanism through rigorous stability analysis of these nontrivial patterns, which shows that the only stable pattern must have the wavenumber that maximizes the bifurcation value. Based on this wavemode selection mechanism, we are able to precisely predict the formation of stable aggregates of the house attractiveness and criminal population density, at least when the diffusion rate $ε$ is around the principal bifurcation value. Our theoretical results also suggest that large domains support more stable aggregates than small domains. Finally, we perform extensive numerical simulations over 1D intervals and 2D squares to illustrate and verify our theoretical findings. Our numerics also include some interesting phenomena such as the merging of two interior spikes and the emergence of new spikes. These nontrivial solutions can model the well-observed aggregation phenomenon in urban criminal activities.

preprint2015arXiv

A central limit theorem for fluctuations in one dimensional stochastic homogenization

In this paper, we analyze the random fluctuations in a one dimensional stochastic homogenization problem and prove a central limit result, i.e., the first order fluctuations can be described by a Gaussian process that solves an SPDE with additive spatial white noise. Using a probabilistic approach, we obtain a precise error decomposition up to the first order, which helps to decompose the limiting Gaussian process, with one of the components corresponding to the corrector obtained by a formal two scale expansion.

preprint2015arXiv

Magnetic rotational spectroscopy for probing rheology of nanoliter droplets and thin films

In-situ characterization of minute amounts of complex fluids is a challenge. Magnetic Rotational Spectroscopy (MRS) with submicron probes offers flexibility and accuracy, providing the desired spatial and temporal resolution in the characterization of nanoliter droplets and thin films when other methods fall short. MRS analyzes distinct features of the in-plane rotation of a magnetic probe when its magnetic moment makes a full revolution following an external rotating magnetic field. The probe demonstrates a distinguishable movement which changes from rotation to tumbling to trembling as the frequency of rotation of the driving magnetic field changes. In practice, MRS has been used in the analysis of gelation of thin polymer films, ceramic precursors, and nanoliter droplets of insect biofluids. MRS is a young field, but it has many potential applications requiring rheological characterization of scarcely available, chemically reacting complex fluids.

preprint2015arXiv

Pointwise two-scale expansion for parabolic equations with random coefficients

We investigate the first-order correction in the homogenization of linear parabolic equations with random coefficients. In dimension $3$ and higher and for coefficients having a finite range of dependence, we prove a pointwise version of the two-scale expansion. A similar expansion is derived for elliptic equations in divergence form. The result is surprising, since it was not expected to be true without further symmetry assumptions on the law of the coefficients.

preprint2015arXiv

Scaling limit of fluctuations in stochastic homogenization

We investigate the global fluctuations of solutions to elliptic equations with random coefficients in the discrete setting. In dimension $d\geq 3$ and for i.i.d. coefficients, we show that after a suitable scaling, these fluctuations converge to a Gaussian field that locally resembles a (generalized) Gaussian free field. The paper begins with a heuristic derivation of the result, which can be read independently and was obtained jointly with Scott Armstrong.

preprint2015arXiv

The random Schrödinger equation: homogenization in time-dependent potentials

We analyze the solutions of the Schrödinger equation with low frequency initial data and a time-dependent weakly random potential. We prove a homogenization result for the low frequency component of the wave field. We also show that the dynamics generates a non-trivial energy in the high frequencies, which do not homogenize: the high frequency component of the wave field remains random and the evolution of its energy is described by a kinetic equation. The transition from the homogenization of the low frequencies to the random limit of the high frequencies is illustrated by understanding the size of the small random fluctuations of the low frequency component.

preprint2015arXiv

The random Schrödinger equation: slowly decorrelating time-dependent potentials

We analyze the weak-coupling limit of the random Schrödinger equation with low frequency initial data and a slowly decorrelating random potential. For the probing signal with a sufficiently long wavelength, we prove a homogenization result, that is, the properly compensated wave field admits a deterministic limit in the "very low" frequency regime. The limit is "anomalous" in the sense that the solution behaves as $\exp(-Dt^{s})$ with $s>1$ rather than the "usual" $\exp(-Dt)$ homogenized behavior when the random potential is rapidly decorrelating. Unlike in rapidly decorrelating potentials, as we decrease the wavelength of the probing signal, stochasticity appears in the asymptotic limit: there exists a critical scale depending on the random potential which separates the deterministic and stochastic regimes.

preprint2014arXiv

An invariance principle for Brownian motion in random scenery

We prove an invariance principle for Brownian motion in Gaussian or Poissonian random scenery by the method of characteristic functions. Annealed asymptotic limits are derived in all dimensions, with a focus on the case of dimension $d=2$, which is the main new contribution of the paper.

preprint2014arXiv

Fluctuations of Parabolic Equations with Large Random Potentials

In this paper, we present a fluctuation analysis of a class of parabolic equations with large, highly oscillatory, random potentials around the homogenization limit. With a Feynman-Kac representation, the Kipnis-Varadhan method, and a quantitative martingale central limit theorem, we derive the asymptotic distribution of the rescaled error between heterogeneous and homogenized solutions under different assumptions in dimension $d\geq 3$. The results depend highly on whether a stationary corrector exists.

preprint2014arXiv

Homogenization of Parabolic Equations with Large Time-dependent Random Potential

This paper concerns the homogenization problem of a parabolic equation with large, time-dependent, random potentials in high dimensions $d\geq 3$. Depending on the competition between temporal and spatial mixing of the randomness, the homogenization procedure turns out to be different. We characterize the difference by proving the corresponding weak convergence of Brownian motion in random scenery. When the potential depends on the spatial variable macroscopically, we prove convergence to an SPDE.

preprint2014arXiv

Weak Convergence Approach for Parabolic Equations with Large, Highly Oscillatory, Random Potential

This paper concerns the macroscopic behavior of solutions to parabolic equations with large, highly oscillatory, random potential. When the correlation function of the random potential satisfies a specific integrability condition, we show that the random solution converges, as the correlation length of the medium tends to zero, to the deterministic solution of a homogenized equation in dimension $d\geq3$. Our derivation is based on a Feynman-Kac probabilistic representation and the Kipnis-Varadhan method applied to weak convergence of Brownian motions in random sceneries. For sufficiently mixing coefficients, we also provide an optimal rate of convergence to the homogenized limit using a quantitative martingale central limit theorem. As soon as the above integrability condition fails, the solution is expected to remain stochastic in the limit of a vanishing correlation length. For a large class of potentials given as functionals of Gaussian fields, we show the convergence of solutions to stochastic partial differential equations (SPDE) with multiplicative noise. The Feynman-Kac representation and the corresponding weak convergence of Brownian motions in random sceneries allow us to explain the transition from deterministic to stochastic limits as a function of the correlation function of the random potential.

preprint2012arXiv

Radiative Transport Limit of Dirac Equations with Random Electromagnetic Field

This paper concerns the kinetic limit of the Dirac equation with random electromagnetic field. We give a detailed mathematical analysis of the radiative transport limit for the phase space energy density of solutions to the Dirac equation. Our derivation is based on a martingale method and a perturbed test function expansion. This requires the electromagnetic field to be a space-time random field. The main mathematical tool in the derivation of the kinetic limit is the matrix-valued Wigner transform of the vector-valued Dirac solution. The major novelty compared to the scalar (Schrödinger) case is the proof of convergence of cross-modes to 0 weakly in space and almost surely in probability. The propagating modes are shown to converge in an appropriate strong sense to their deterministic limit.

41 published item(s)

Long-Chain Reasoning Distillation via Adaptive Prefix Alignment

Pause and Reflect: Conformal Aggregation for Chain-of-Thought Reasoning

Revealing the Attention Floating Mechanism in Masked Diffusion Models

What-If Analysis of Large Language Models: Explore the Game World Using Proactive Thinking

Multichannel AV-wav2vec2: A Framework for Learning Multichannel Multi-Modal Speech Representation

Fluctuations of the winding number of a directed polymer on a cylinder

A Framework for Controlling Multi-Robot Systems Using Bayesian Optimization and Linear Combination of Vectors

Combining deep learning and crowdsourcing geo-images to predict housing quality in rural China

Gaussian fluctuations of replica overlap in directed polymers

HTGN-BTW: Heterogeneous Temporal Graph Network with Bi-Time-Window Training Strategy for Temporal Link Prediction

Mass Testing and Characterization of 20-inch PMTs for JUNO

Proprioceptive Slip Detection for Planetary Rovers in Perceptually Degraded Extraterrestrial Environments

A forward-backward SDE from the 2D nonlinear stochastic heat equation

A quenched local limit theorem for stochastic flows

Beyond I.I.D.: Three Levels of Generalization for Question Answering on Knowledge Bases

ByteSing: A Chinese Singing Voice Synthesis System Using Duration Allocated Encoder-Decoder Acoustic Models and WaveRNN Vocoders

INSQ: An Influential Neighbor Set Based Moving kNN Query Processing System

JUNO Physics and Detector

Search Planning of a UAV/UGV Team with Localization Uncertainty in a Subterranean Environment

BeSense: Leveraging WiFi Channel Data and Computational Intelligence for Behavior Analysis

BrePartition: Optimized High-Dimensional kNN Search with Bregman Distances

Cooperative Navigation Using Pairwise Communication with Ranging and Magnetic Anomaly Measurements

Feasibility and physics potential of detecting $^8$B solar neutrinos at JUNO

Fluctuations of a nonlinear stochastic heat equation in dimensions three and higher

TAO Conceptual Design Report: A Precision Measurement of the Reactor Antineutrino Spectrum with Sub-percent Energy Resolution

Improved Planetary Rover Inertial Navigation and Wheel Odometry Performance through Periodic Use of Zero-Type Constraints

High order correctors and two-scale expansions in stochastic homogenization

Near-Optimal Disjoint-Path Facility Location Through Set Cover by Pairs

On generalized Gaussian free fields and stochastic homogenization

Stationary patterns and their selection mechanism of Urban crime models with heterogeneous near-repeat victimization effect

A central limit theorem for fluctuations in one dimensional stochastic homogenization

Magnetic rotational spectroscopy for probing rheology of nanoliter droplets and thin films

Pointwise two-scale expansion for parabolic equations with random coefficients

Scaling limit of fluctuations in stochastic homogenization

The random Schrödinger equation: homogenization in time-dependent potentials

The random Schrödinger equation: slowly decorrelating time-dependent potentials

An invariance principle for Brownian motion in random scenery

Fluctuations of Parabolic Equations with Large Random Potentials

Homogenization of Parabolic Equations with Large Time-dependent Random Potential

Weak Convergence Approach for Parabolic Equations with Large, Highly Oscillatory, Random Potential

Radiative Transport Limit of Dirac Equations with Random Electromagnetic Field