Source author record

Le He

Le He appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Information Theory, Machine Learning, math.IT, Artificial Intelligence, physics.optics

Catalog footprint

What is connected

4 works

5 topics

4 close collaborators


preprint · 2026 · arXiv

Generative Actor-Critic with Soft Bridge Policies

Expressive generative policies such as diffusion and flow models are appealing for MaxEnt online reinforcement learning because of their ability to model multimodal and highly non-Gaussian action distributions. However, training effective soft generative policies faces two obstacles that often arise together. First, marginal action densities are often unavailable, so existing methods typically rely on entropy bounds, heuristic proxies or approximations. Second, iterative shared-parameter samplers raise inference cost and require backpropagation through time over repeated network evaluations, increasing memory cost and destabilizing policy optimization. These obstacles motivate us to seek a generative policy that exposes a tractable MaxEnt objective while requiring only a single sampled actor forward pass for action generation. To this end, we propose soft generative actor-critic (SoftGAC), whose actor defines a stochastic bridge from a fixed base latent to a terminal action latent in pre-tanh space. This structured bridge allows us to lift the MaxEnt objective as an analytically tractable path-wise relative-entropy objective against a high-entropy reference process. In practical finite-step implementation, this relative entropy reduces exactly to sampled transition control energy and thus provides principled soft regularization. Moreover, we keep the single-pass actor lightweight by using small step-specific bridge transitions, each evaluated only once per sampled action, while maintaining a parameter budget comparable to strong actor baselines. Extensive experiments on challenging continuous-control benchmarks show that SoftGAC attains higher or competitive returns than strong generative policy baselines, including diffusion and flow-matching policies, while staying in the low-latency regime of one-pass actors and showing considerable improvements in the compute-return tradeoff.

preprint · 2022 · arXiv

Towards Optimally Efficient Search with Deep Learning for Large-Scale MIMO Systems

This paper investigates the optimal signal detection problem with a particular interest in large-scale multiple-input multiple-output (MIMO) systems. The problem is NP-hard and can be solved optimally by searching for the shortest path on the decision tree. Unfortunately, the existing optimal search algorithms often involve prohibitively high complexity, which makes them infeasible in large-scale MIMO systems. To address this issue, we propose a general heuristic search algorithm, namely, the hyper-accelerated tree search (HATS) algorithm. The proposed algorithm employs a deep neural network (DNN) to estimate the optimal heuristic, and then uses the estimated heuristic to speed up the underlying memory-bounded search algorithm. This idea is inspired by the fact that the underlying heuristic search algorithm reaches the optimal efficiency with the optimal heuristic function. Simulation results show that the proposed algorithm reaches almost the optimal bit error rate (BER) performance in large-scale systems, while the memory size can be bounded. Meanwhile, it visits nearly the fewest tree nodes. This indicates that the proposed algorithm reaches almost the optimal efficiency in practical scenarios, and is therefore applicable to large-scale systems. The code for this paper is available at https://github.com/skypitcher/hats.
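The tree-search formulation in this abstract can be illustrated with a minimal sketch. It is not the HATS implementation: the DNN heuristic is replaced by a pluggable `heuristic` callable that defaults to zero, the search is plain best-first rather than memory-bounded, and the symbols are assumed to be real-valued BPSK (-1, +1). The function names `qr_gram_schmidt` and `best_first_detect` are hypothetical. After a QR decomposition of the channel matrix, the distance ||y - Hx||^2 splits into non-negative per-layer terms, so the first leaf popped from the priority queue is the minimum-distance candidate when the heuristic never overestimates the remaining cost.

```python
import heapq
import itertools


def qr_gram_schmidt(H):
    """Reduced QR via modified Gram-Schmidt.

    H is a list of rows and is assumed to have full column rank.
    Returns Q as a list of orthonormal columns and R as an upper-triangular
    list of rows.
    """
    m, n = len(H), len(H[0])
    cols = [[H[i][j] for i in range(m)] for j in range(n)]
    Q = []
    R = [[0.0] * n for _ in range(n)]
    for j in range(n):
        v = cols[j][:]
        for k in range(j):
            R[k][j] = sum(q * a for q, a in zip(Q[k], v))
            v = [a - R[k][j] * q for a, q in zip(v, Q[k])]
        R[j][j] = sum(a * a for a in v) ** 0.5
        Q.append([a / R[j][j] for a in v])
    return Q, R


def best_first_detect(H, y, symbols=(-1.0, 1.0),
                      heuristic=lambda depth, cost: 0.0):
    """Best-first search over the detection tree.

    `heuristic(depth, cost)` is a hypothetical stand-in for a learned
    estimate of the remaining cost; the zero default is admissible.
    Returns the detected symbol vector and the number of expanded nodes.
    """
    n = len(H[0])
    Q, R = qr_gram_schmidt(H)
    z = [sum(q * yi for q, yi in zip(col, y)) for col in Q]
    tie = itertools.count()  # unique tiebreaker so tuples never compare further
    heap = [(0.0, next(tie), 0.0, ())]  # (g + h, tiebreak, g, fixed suffix of x)
    visited = 0
    while heap:
        _, _, g, suffix = heapq.heappop(heap)
        visited += 1
        depth = len(suffix)
        if depth == n:
            return list(suffix), visited
        row = n - 1 - depth  # symbols are fixed from the last layer upward
        for s in symbols:
            cand = (s,) + suffix  # holds x[row], x[row+1], ..., x[n-1]
            resid = z[row] - sum(R[row][row + k] * cand[k]
                                 for k in range(depth + 1))
            g_new = g + resid * resid
            f_new = g_new + heuristic(depth + 1, g_new)
            heapq.heappush(heap, (f_new, next(tie), g_new, cand))
    raise ValueError("empty search tree")
```

A better heuristic leaves the returned vector unchanged, provided it stays admissible, but reduces `visited`, which is the quantity the abstract refers to when it speaks of visiting nearly the fewest tree nodes.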

preprint · 2021 · arXiv

Learning based signal detection for MIMO systems with unknown noise statistics

This paper aims to devise a generalized maximum likelihood (ML) estimator to robustly detect signals with unknown noise statistics in multiple-input multiple-output (MIMO) systems. In practice, there is little or even no statistical knowledge on the system noise, which in many cases is non-Gaussian, impulsive and not analyzable. Existing detection methods have mainly focused on specific noise models, which are not robust enough with unknown noise statistics. To tackle this issue, we propose a novel ML detection framework to effectively recover the desired signal. Our framework is a fully probabilistic one that can efficiently approximate the unknown noise distribution through a normalizing flow. Importantly, this framework is driven by an unsupervised learning approach, where only the noise samples are required. To reduce the computational complexity, we further present a low-complexity version of the framework, by utilizing an initial estimation to reduce the search space. Simulation results show that our framework outperforms other existing algorithms in terms of bit error rate (BER) in non-analytical noise environments, while it can reach the ML performance bound in analytical noise environments. The code of this paper is available at https://github.com/skypitcher/manfe.
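The detection rule described in this abstract, choosing the candidate signal whose residual is most likely under the noise model, can be sketched as follows. This is an illustration under stated assumptions, not the paper's framework: the learned normalizing flow is replaced by a fixed, hand-written log-density (`laplace_log_density`, a hypothetical stand-in for a heavy-tailed noise model), the search is exhaustive, and the symbols are real-valued BPSK. All function names are hypothetical.

```python
import itertools
import math


def laplace_log_density(residual, scale=0.5):
    """Log-density of i.i.d. Laplace noise (stand-in for a learned flow)."""
    return sum(-abs(r) / scale - math.log(2.0 * scale) for r in residual)


def gaussian_log_density(residual, sigma=0.5):
    """Log-density of i.i.d. Gaussian noise, for comparison."""
    return sum(-r * r / (2.0 * sigma ** 2)
               - 0.5 * math.log(2.0 * math.pi * sigma ** 2)
               for r in residual)


def generalized_ml_detect(H, y, log_density, symbols=(-1.0, 1.0)):
    """Exhaustive ML detection: argmax over x of log p_noise(y - H x)."""
    n = len(H[0])
    best_x, best_ll = None, -math.inf
    for cand in itertools.product(symbols, repeat=n):
        residual = [yi - sum(h * x for h, x in zip(row, cand))
                    for row, yi in zip(H, y)]
        ll = log_density(residual)
        if ll > best_ll:
            best_x, best_ll = cand, ll
    return best_x, best_ll


# Toy example: x = (1, -1) was sent and one receive antenna saw an impulse.
H = [[1.0, 0.5], [0.3, 1.0], [0.8, -0.6]]
y = [0.55, 1.1, 1.35]
x_laplace, _ = generalized_ml_detect(H, y, laplace_log_density)
x_gauss, _ = generalized_ml_detect(H, y, gaussian_log_density)
```

In this toy case the heavy-tailed model recovers the transmitted vector while the Gaussian model does not, which is the kind of mismatch the abstract attributes to detectors built for a specific noise model.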

preprint · 2016 · arXiv

Light Drag Effect of Vacuum Tube Versus Light Propagation in Stationary Vacuum Tube with Moving Source and Receiver

We present a new way to examine the principle of relativity of Special Relativity. According to the principle of relativity, the light dragging by moving media and the light propagation in stationary media with moving source and receiver should be two totally equivalent phenomena. We select a vacuum tube with two glass rods at its two ends as the optical medium. The length of the middle vacuum cell is L, and the thicknesses of the glass rods with refractive index n are D1 and D2. The light drag effect of the vacuum tube moving with speed v is a first-order effect, delta t = 2(n-1)(D1+D2)v/c^2, which is independent of L because vacuum does not produce a drag effect. As predicted by the principle of relativity, the change of the light propagation time interval with a stationary vacuum tube and moving source and receiver must be the same, i.e., delta tau = delta t = 2(n-1)(D1+D2)v/c^2. However, all analyses have shown that the change of the propagation time interval delta tau is caused by the motion of the receiver during the light propagation in the vacuum tube. Thus, the contribution of the glass rods to delta tau is 2n(D1+D2)v/c^2, not the 2(n-1)(D1+D2)v/c^2 found in delta t. Importantly, the contribution of the vacuum cell to delta tau is 2Lv/c^2, not zero as in delta t. Our analyses are solid in optics. Genuine tests of the prediction of the principle of relativity can be conducted by experiments with two atomic clocks, or by experiments with fiber Sagnac interferometers.
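To make the sizes of the quantities in this abstract concrete, the short sketch below evaluates the two expressions exactly as the abstract states them. It only reproduces the arithmetic of the authors' claim and does not assess it. The numeric values (n = 1.5, D1 = D2 = 0.1 m, L = 1 m, v = 10 m/s) and the function names are illustrative assumptions, not values from the paper.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def drag_delay(n, d1, d2, v):
    """delta t = 2(n-1)(D1+D2)v/c^2, the light drag expression in the abstract."""
    return 2.0 * (n - 1.0) * (d1 + d2) * v / C ** 2


def receiver_motion_delay(n, d1, d2, length, v):
    """delta tau as the abstract argues it: glass-rod term plus vacuum-cell term."""
    glass = 2.0 * n * (d1 + d2) * v / C ** 2
    vacuum = 2.0 * length * v / C ** 2
    return glass + vacuum


# Illustrative (assumed) parameters.
n, d1, d2, length, v = 1.5, 0.1, 0.1, 1.0, 10.0
delta_t = drag_delay(n, d1, d2, v)
delta_tau = receiver_motion_delay(n, d1, d2, length, v)
gap = delta_tau - delta_t  # algebraically 2(D1 + D2 + L)v/c^2
print(f"delta t   = {delta_t:.3e} s")
print(f"delta tau = {delta_tau:.3e} s")
print(f"gap       = {gap:.3e} s")
```

The difference between the two expressions simplifies to 2(D1 + D2 + L)v/c^2, so under the abstract's claim the discrepancy depends on the total optical path length and not on the refractive index.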