Source author record

Hao Yu

Hao Yu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Catalog footprint

What is connected

41 works

35 topics

4 close collaborators

preprint · 2026 · arXiv

A Unified Pair-GRPO Family: From Implicit to Explicit Preference Constraints for Stable and General RL Alignment

Large language model (LLM) alignment via reinforcement learning from human preferences (RLHF) suffers from unstable policy updates, ambiguous gradient directions, poor interpretability, and high gradient variance in mainstream pairwise preference learning paradigms. To systematically address these limitations, we establish a unified theoretical framework for preference-based RL optimization centered on the Pair-GRPO family, comprising two tightly coupled variants: Soft-Pair-GRPO and Hard-Pair-GRPO. Soft-Pair-GRPO is a minimal modification of Group Relative Policy Optimization (GRPO) that replaces group-normalized scalar rewards with binary pairwise preference rewards, retaining GRPO's clipped surrogate and KL-regularized structure. We prove a critical gradient equivalence theorem: under first-order Taylor expansion around the current policy, Soft-Pair-GRPO's gradient is a positive scalar multiple of standard GRPO's gradient, explaining its empirical stability despite discarding continuous reward magnitudes. Building on this foundation, we propose Hard-Pair-GRPO, an advanced variant introducing explicit local probability constraints and constrained KL-fitting optimization to further suppress gradient noise and global policy drift. We provide comprehensive theoretical guarantees for both variants, including monotonic policy improvement, deterministic gradient direction, gradient-variance reduction, and dynamic step-size convergence. Extensive experiments on standard LLM alignment benchmarks (HH-RLHF, UltraFeedback) and the MuJoCo continuous control task HalfCheetah-v4 demonstrate that our Pair-GRPO family consistently outperforms state-of-the-art baselines in alignment quality, human preference win rate, training stability, and generalization to general reinforcement learning. Ablation studies validate the critical contributions of each core component.
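To make the contrast in this abstract concrete, here is a minimal sketch of the two reward signals it mentions. The group-normalized advantage follows the commonly described GRPO form (reward minus group mean, divided by group standard deviation). The abstract does not say how the binary pairwise reward is constructed, so `pairwise_binary_rewards`, its +1/-1 coding, and the tie handling are assumptions made purely for illustration, not the paper's definition.

```python
from statistics import mean, pstdev


def grpo_advantages(rewards, eps=1e-8):
    """Group-normalized advantages: (r - group mean) / (group std + eps).

    Uses the population standard deviation; implementations differ on this
    detail, so treat it as an assumption.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def pairwise_binary_rewards(reward_a, reward_b):
    """Hypothetical binary signal for one response pair.

    The preferred response gets +1 and the other -1; a tie gives 0 to both.
    Only the ordering of the two scores is used, so reward magnitudes are
    discarded.
    """
    if reward_a > reward_b:
        return (1.0, -1.0)
    if reward_a < reward_b:
        return (-1.0, 1.0)
    return (0.0, 0.0)


if __name__ == "__main__":
    print(grpo_advantages([1.0, 2.0, 3.0]))
    print(pairwise_binary_rewards(0.9, 0.2))
```

The sketch only illustrates the difference between a magnitude-sensitive signal and an ordering-only signal; the clipped surrogate and KL terms named in the abstract are not modeled here.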

preprint · 2023 · arXiv

Thick branes in Born-Infeld determinantal gravity in Weitzenböck spacetime

By adopting the idea of Born-Infeld electromagnetism, the Born-Infeld determinantal gravity in Weitzenböck spacetime provides a way to smooth the Big Bang singularity at the classical level. We consider a thick braneworld scenario in the higher-dimensional extension of this gravity, and investigate the torsion effects on the brane structure and gravitational perturbation. For three particular parameter choices, analytic domain wall solutions are obtained. They share a similar brane configuration in which the brane becomes thinner as the spacetime torsion gets stronger. For each model, the massless graviton is localized on the brane with a localization width that decreases as the spacetime torsion strengthens, while the massive gravitons propagate in the bulk and contribute a correction term proportional to $1/(kr)^{3}$ to the Newtonian potential. A sparsity constraint on the fundamental 5-dimensional gravitational scale is estimated from the gravitational experiment. Moreover, the parameter ranges in which the Kaluza-Klein gravitons are tachyon-free are analyzed.

preprint · 2022 · arXiv

Hamilton Cycles In Primitive Graphs of Order $2rs$

After long-term efforts, it was recently proved in \cite{DKM2} that, except for the Petersen graph, every connected vertex-transitive graph of order $rs$ has a Hamilton cycle, where $r$ and $s$ are primes. A natural next step is to solve the hamiltonian problem for connected vertex-transitive graphs of order $2rs$. This problem is far from trivial, as it is still unsolved even for $r=3$. In this paper, it is shown that, except for the Coxeter graph, every connected vertex-transitive graph of order $2rs$ contains a Hamilton cycle, provided the automorphism group acts primitively on vertices.

preprint · 2022 · arXiv

Leveraging Affect Transfer Learning for Behavior Prediction in an Intelligent Tutoring System

In this work, we propose a video-based transfer learning approach for predicting problem outcomes of students working with an intelligent tutoring system (ITS). By analyzing a student's face and gestures, our method predicts the outcome of a student answering a problem in an ITS from a video feed. Our work is motivated by the reasoning that the ability to predict such outcomes enables tutoring systems to adjust interventions, such as hints and encouragement, and to ultimately yield improved student learning. We collected a large labeled dataset of student interactions with an intelligent online math tutor consisting of 68 sessions, where 54 individual students solved 2,749 problems. The dataset is public and available at https://www.cs.bu.edu/faculty/betke/research/learning/ . Working with this dataset, our transfer-learning challenge was to design a representation in the source domain of pictures obtained "in the wild" for the task of facial expression analysis, and transferring this learned representation to the task of human behavior prediction in the domain of webcam videos of students in a classroom environment. We developed a novel facial affect representation and a user-personalized training scheme that unlocks the potential of this representation. We designed several variants of a recurrent neural network that models the temporal structure of video sequences of students solving math problems. Our final model, named ATL-BP for Affect Transfer Learning for Behavior Prediction, achieves a relative increase in mean F-score of 50% over the state-of-the-art method on this new dataset.

preprint · 2022 · arXiv

Multi-Scale Context-Guided Lumbar Spine Disease Identification with Coarse-to-fine Localization and Classification

Accurate and efficient lumbar spine disease identification is crucial for clinical diagnosis. However, existing deep learning models with millions of parameters often fail to learn from only hundreds or dozens of medical images. These models also ignore the contextual relationship between adjacent objects, such as between vertebrae and intervertebral discs. This work introduces a multi-scale context-guided network with coarse-to-fine localization and classification, named CCF-Net, for lumbar spine disease identification. Specifically, during learning, we divide the localization objective into two parallel tasks, coarse and fine, which are more straightforward and effectively reduce the number of parameters and the computational cost. The experimental results show that the coarse-to-fine design has the potential to achieve high performance with fewer parameters and less data. Moreover, the multi-scale context-guided module improves performance by 6.45% and 5.51% with ResNet18 and ResNet50, respectively. Our code is available at https://github.com/czifan/CCFNet.pytorch.

preprint · 2022 · arXiv

Topological superconductivity in a topological insulator

Topological superconductivity is an exotic quantum phenomenon that couples nontrivial topological order with superconductivity. A direct route to topological superconductors is to induce superconductivity in well-established topological insulators. The topological insulating states in the highly efficient thermoelectric materials Bi$_2$Te$_3$ and Bi$_2$Se$_3$ and their alloy Bi$_{2}$Te$_{3-x}$Se$_{x}$ have been established from angle-resolved photoemission and transport experiments. Superconductivity has also been observed in these popular topological insulators through the application of pressure, chemical dopants, and heterostructures. However, the experiments, which mainly focus on Bi$_{2}$Se$_3$ doped by metals, have not provided consistent evidence to support topological superconductivity. Here we carry out a systematic high-pressure study on the topological insulator Bi$_{2}$Te$_{2.7}$Se$_{0.3}$ to provide convincing evidence for the expected topological superconductivity. Four phases with different structures are found upon compression. The topological surface state is identified throughout the initial phase, and superconductivity is found to coexist with this state once the compressed material has passed the electronic topological transition; three other superconducting phases without topological character follow. For these superconducting phases, we observe that the upper critical field scales with temperature with a critical exponent of $2/3$ for the first phase, which has the topological surface state, and $1$ for the rest. These observations support the realization of topological superconductivity in the initial phase according to the theoretically proposed critical field measure. This work also points to a large pool of candidates and a new direction for finding topological superconductors among topological thermoelectric materials.

preprint · 2022 · arXiv

Training Vision Transformers with Only 2040 Images

Vision Transformers (ViTs) are emerging as an alternative to convolutional neural networks (CNNs) for visual recognition. They achieve competitive results with CNNs, but the lack of the typical convolutional inductive bias makes them more data-hungry than common CNNs. They are often pretrained on JFT-300M or at least ImageNet, and few works study training ViTs with limited data. In this paper, we investigate how to train ViTs with limited data (e.g., 2040 images). We give theoretical analyses showing that our method (based on parametric instance discrimination) is superior to other methods in that it can capture both feature alignment and instance similarities. We achieve state-of-the-art results when training from scratch on 7 small datasets under various ViT backbones. We also investigate the transfer ability of small datasets and find that representations learned from small datasets can even improve large-scale ImageNet training.

preprint · 2021 · arXiv

High-performance quantum entanglement generation via cascaded second-order nonlinear processes

In this paper, we demonstrate the generation of high-performance entangled photon pairs in different degrees of freedom from a single fiber-pigtailed periodically poled LiNbO$_3$ (PPLN) waveguide. We utilize cascaded second-order nonlinear optical processes, i.e., second-harmonic generation (SHG) and spontaneous parametric down conversion (SPDC), to generate photon pairs. Previously, the performance of the photon pairs was degraded by Raman noise photons from the fiber pigtails. Here, by integrating the PPLN waveguide with noise-rejecting filters, we obtain a coincidence-to-accidental ratio (CAR) higher than 52,600 with photon-pair generation and detection rates of 52.3 kHz and 3.5 kHz, respectively. Energy-time, frequency-bin and time-bin entanglement is prepared by coherently superposing correlated two-photon states in the respective degrees of freedom. The energy-time entangled two-photon states achieve a maximum value of the CHSH-Bell inequality of $S=2.708\pm0.024$ with a two-photon interference visibility of $95.74\pm0.86$%. The frequency-bin entangled two-photon states achieve a fidelity of $97.56\pm1.79$% with a spatial quantum beating visibility of $96.85\pm2.46$%. The time-bin entangled two-photon states achieve a maximum value of the CHSH-Bell inequality of $S=2.595\pm0.037$ and a quantum tomographic fidelity of $89.07\pm4.35$%. Our results provide a potential candidate for a quantum light source in quantum photonics.

preprint · 2021 · arXiv

Mixup Without Hesitation

Mixup linearly interpolates pairs of examples to form new samples, which is easy to implement and has been shown to be effective in image classification tasks. However, there are two drawbacks in mixup: one is that more training epochs are needed to obtain a well-trained model; the other is that mixup requires tuning a hyper-parameter to gain appropriate capacity but that is a difficult task. In this paper, we find that mixup constantly explores the representation space, and inspired by the exploration-exploitation dilemma in reinforcement learning, we propose mixup Without hesitation (mWh), a concise, effective, and easy-to-use training algorithm. We show that mWh strikes a good balance between exploration and exploitation by gradually replacing mixup with basic data augmentation. It can achieve a strong baseline with less training time than original mixup and without searching for optimal hyper-parameter, i.e., mWh acts as mixup without hesitation. mWh can also transfer to CutMix, and gain consistent improvement on other machine learning and computer vision tasks such as object detection. Our code is open-source and available at https://github.com/yuhao318/mwh
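The interpolation step described in the first sentence of this abstract can be sketched as follows. This is a minimal illustration of plain mixup on list-valued inputs and one-hot labels, assuming the usual choice of drawing the mixing weight from a Beta(alpha, alpha) distribution; the function name `mixup_pair` and the default `alpha=1.0` are illustrative choices, and the mWh schedule that gradually replaces mixup with basic augmentation is not reproduced here because the abstract does not specify it.

```python
import random


def mixup_pair(x_i, y_i, x_j, y_j, alpha=1.0, rng=random):
    """Linearly interpolate two examples and their (one-hot) labels.

    lam is drawn from Beta(alpha, alpha); the same lam mixes inputs and labels.
    """
    lam = rng.betavariate(alpha, alpha)
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_i, x_j)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_i, y_j)]
    return x_mix, y_mix, lam


if __name__ == "__main__":
    # Two toy 2-feature examples with one-hot labels for classes 0 and 1.
    x, y, lam = mixup_pair([0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0])
    print(lam, x, y)
```

Because the labels are mixed with the same weight as the inputs, the mixed label still sums to one when both source labels are one-hot.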

preprint · 2021 · arXiv

On the correspondence between energy conservation and energy-momentum tensor conservation in cosmology

The correspondence between the thermodynamic energy equation satisfied by a closed co-moving volume and the conservation equation satisfied by the energy-momentum tensor of the matter inside the co-moving volume is extended to a more general system with an arbitrary cosmological horizon and a heat source. The energy of the system consisting of a cosmological horizon and its internal matter could be conserved by defining a surface energy on the horizons. Therefore, energy conservation and energy-momentum tensor conservation can always be consistent for such a system. On the other hand, from the perspective of classical thermodynamics, one can define an effective pressure at the cosmological horizon to guarantee that the thermodynamic energy equation inside the horizon is consistent with the energy-momentum tensor conservation equation of the matter inside the horizon. These systems can satisfy the generalized second law of thermodynamics under appropriate conditions. The definitions of the surface energy and the effective pressure are extended to the gravity theory with non-minimal coupling between geometry and matter, in which geometry could be regarded as a heat source.

preprint · 2021 · arXiv

Spectrally multiplexed heralded single photon source at telecom-band

A heralded single photon source (HSPS) is an important way to generate genuine single photons, with the advantages of experimental simplicity and versatility. However, an HSPS intrinsically suffers from a trade-off between the heralded single photon rate and the single photon purity. To overcome this, one can apply multiplexing technology in different degrees of freedom to enhance the performance of the HSPS. Here, by employing spectral multiplexing and active feed-forward spectral manipulation, we demonstrate an HSPS in the 1.5 μm telecom band. Our experimental results show that spectral multiplexing effectively erases the frequency correlation of the pair source and significantly improves the heralded single photon rate while keeping $g^{(2)}(0)$ as low as $0.0006\pm0.0001$. The Hong-Ou-Mandel interference between the heralded single photons and photons from an independent weak coherent source indicates high indistinguishability. Our results pave the way for scalable HSPS by spectral multiplexing towards deterministic single photon emission.

preprint · 2020 · arXiv

A Low Complexity Algorithm with $O(\sqrt{T})$ Regret and $O(1)$ Constraint Violations for Online Convex Optimization with Long Term Constraints

This paper considers online convex optimization over a complicated constraint set, which typically consists of multiple functional constraints and a set constraint. The conventional online projection algorithm (Zinkevich, 2003) can be difficult to implement due to the potentially high computation complexity of the projection operation. In this paper, we relax the functional constraints by allowing them to be violated at each round but still requiring them to be satisfied in the long term. This type of relaxed online convex optimization (with long term constraints) was first considered in Mahdavi et al. (2012). That prior work proposes an algorithm to achieve $O(\sqrt{T})$ regret and $O(T^{3/4})$ constraint violations for general problems and another algorithm to achieve an $O(T^{2/3})$ bound for both regret and constraint violations when the constraint set can be described by a finite number of linear constraints. A recent extension in Jenatton et al. (2016) can achieve $O(T^{\max\{\theta,1-\theta\}})$ regret and $O(T^{1-\theta/2})$ constraint violations where $\theta\in (0,1)$. The current paper proposes a new simple algorithm that yields improved performance in comparison to prior works. The new algorithm achieves an $O(\sqrt{T})$ regret bound with $O(1)$ constraint violations.
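For context, the conventional online projection baseline mentioned in this abstract can be sketched as below. This is not the paper's new algorithm: it is a minimal illustration of online projected gradient descent, assuming a feasible set that is a Euclidean ball (so the projection is cheap) and a step size of $1/\sqrt{t}$. The function names and the ball-shaped set are illustrative assumptions; the abstract's point is precisely that projection becomes expensive when the set is defined by many functional constraints.

```python
import math


def project_to_ball(x, radius=1.0):
    """Euclidean projection of x onto the ball of the given radius."""
    norm = math.sqrt(sum(v * v for v in x))
    if norm <= radius:
        return list(x)
    return [v * radius / norm for v in x]


def online_projected_gradient(grad_fns, x0, radius=1.0):
    """Play one point per round, then take a projected gradient step.

    grad_fns[t-1](x) returns the gradient of the round-t loss at x.
    Returns the list of points played in each round.
    """
    x = project_to_ball(x0, radius)
    played = []
    for t, grad in enumerate(grad_fns, start=1):
        played.append(x)
        step = 1.0 / math.sqrt(t)  # assumed step-size schedule
        g = grad(x)
        x = project_to_ball([xi - step * gi for xi, gi in zip(x, g)], radius)
    return played


if __name__ == "__main__":
    # Every round uses the loss ||x - c||^2 with c outside the unit ball.
    c = [2.0, 0.0]
    grad = lambda x: [2.0 * (xi - ci) for xi, ci in zip(x, c)]
    points = online_projected_gradient([grad] * 50, [0.0, 0.0])
    print(points[-1])
```

In this toy run every played point stays inside the unit ball, which is the property the relaxed long-term-constraint setting gives up in exchange for avoiding the projection.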

preprint · 2020 · arXiv

A Very Simple Estimate Of Rational Homological Dimension Of Moduli Spaces Of Riemann Surfaces With Boundary And Marked Points

The moduli spaces of compact and connected Riemann surfaces have been a central topic in modern mathematics in recent years. Thus their homological dimensions become important invariants. Motivated by the emerging mathematical counterparts of open-closed string theory, we give an estimate of the rational homological dimension of the moduli spaces of Riemann surfaces with possible boundary and marked points (which can lie both in the interior and on the boundary). We hope it will have applications in open-closed theory, for example, open-closed Gromov-Witten theory, in the future.

preprint · 2020 · arXiv

Constraint on the radius of five-dimensional dS spacetime with GW170817 and GRB 170817A

The recent detection of the gravitational wave (GW) event GW170817 and its electromagnetic counterpart GRB 170817A, produced by a binary neutron star (NS) merger, is a new milestone of multimessenger astronomy. The time interval between these two signals has attracted widespread attention from physicists. In the braneworld scenario, GWs could propagate through the bulk while electromagnetic waves (EMWs) are bound to the brane, i.e., our Universe. Therefore, GWs and EMWs may follow different paths. If GWs and EMWs originate simultaneously from the same source on the brane, they are expected to arrive at the observer successively. Consequently, the time delay between GW170817 and GRB 170817A may carry information about the extra dimension. In this paper, we investigate this phenomenon in the context of a five-dimensional dS ($\text{dS}_5$) spacetime. We first study two special Universe models, i.e., the de Sitter and Einstein-de Sitter models, and calculate the gravitational horizon radius in each case. For the real Universe, we then consider the $Λ$CDM model. Our results show that for the de Sitter model of the Universe, the $\text{dS}_5$ radius could not contribute to the time delay. With the observational data, we constrain the $\text{dS}_5$ radius to $\ell\gtrsim7.5\times10^{2}\,\text{Tpc}$ for the Einstein-de Sitter model and $\ell\gtrsim2.4\times10^{3}\,\text{Tpc}$ for the $Λ$CDM model. After considering the uncertainty in the source redshift and the time-lags given by different astrophysical processes of the binary NS merger, we find that our constraints are not sensitive to the redshift in the range of (0.005, 0.01) and the time-lag in the range of (-100s, 1.734s).

preprint · 2020 · arXiv

Control of microswimmers by spiral nematic vortices: transition from individual to collective motion and contraction, expansion, and stable circulation of bacterial swirls

Active systems composed of self-propelled units show fascinating transitions from Brownian-like dynamics to collective coherent motion. Swirling of swimming bacteria is a spectacular example. This study demonstrates that a nematic liquid crystal environment patterned as a spiral vortex controls the individual-to-collective transition in bacterial swirls and defines whether they expand or shrink. In dilute dispersions, the bacteria swim along open spiral trajectories, following the pre-imposed molecular orientation. The trajectories are nonpolar. As their concentration exceeds some threshold, the bacteria condense into unipolar circular swirls resembling stable limit cycles. This collective circular motion is controlled by the spiral angle that defines the splay-to-bend ratio of the background director. Vortices with dominating splay shrink the swirls towards the center, while vortices with dominating bend expand them to the periphery. 45° spiraling vortices with splay-bend parity produce the most stable swirls. All the dynamic scenarios are explained by hydrodynamic interactions of bacteria mediated by the patterned passive nematic environment and by the coupling between the concentration and orientation. The acquired knowledge of how to control individual and collective motion of microswimmers by a nematic environment can help in the development of microscopic mechanical systems.

preprint · 2020 · arXiv

Dynamical mixing between $2^3S_1$ and $1^3D_1$ charmed mesons

In the charmed $D$ and $D_s$ meson sector, the matrix of a Hamiltonian in a quark potential model is computed in the $2^3S_1$ and $1^3D_1$ subspace. The masses of four mixed states of $2^3S_1$ and $1^3D_1$, denoted $D^*_1(2635)$, $D^*_1(2739)$, $D^*_{s1}(2715)$ and $D^*_{s1}(2805)$, are obtained. It is an off-diagonal part of the spin-orbit tensor interaction that causes the mixing between the $2^3S_1$ and $1^3D_1$ states. The mixing angles between the $2^3S_1$ and $1^3D_1$ states are tiny. Under the mixing, a $^3P_0$ model is employed to compute the hadronic decay widths of all OZI-allowed decay channels of the four mixed states. The two light mixed states $D^*_1(2635)$ and $D^*_{s1}(2715)$ are close in mass to $D^*_J(2600)$ and $D^*_{s1}(2700)$, while the two heavy mixed states $D^*_1(2739)$ and $D^*_{s1}(2805)$ are lighter in mass than $D(2750)$ and $D^*_{s1}(2860)$. The mixing angles obtained from dynamical interaction are inconsistent with the mixing angles obtained from hadronic decay. Based on mass spectra and hadronic decay analyses, $D^*_J(2600)$, $D(2750)$, $D^*_{s1}(2700)$ and $D^*_{s1}(2860)$ cannot be the mixed states of $2^3S_1$ and $1^3D_1$ at the small mixing angles. The inconsistency implies that $D^*_1(2760)$ and $D^*_{s1}(2860)$ have not been properly resolved from present experimental data, or there exist large unknown off-diagonal interactions that result in large mixing angles.

preprint · 2020 · arXiv

HOTCAKE: Higher Order Tucker Articulated Kernels for Deeper CNN Compression

Emerging edge computing has prompted immense interest in compacting neural networks without sacrificing much accuracy. In this regard, low-rank tensor decomposition constitutes a powerful tool to compress convolutional neural networks (CNNs) by decomposing the 4-way kernel tensor into multi-stage smaller ones. Building on top of Tucker-2 decomposition, we propose a generalized Higher Order Tucker Articulated Kernels (HOTCAKE) scheme comprising four steps: input channel decomposition, guided Tucker rank selection, higher order Tucker decomposition and fine-tuning. By subjecting each CONV layer to HOTCAKE, a highly compressed CNN model with graceful accuracy trade-off is obtained. Experiments show HOTCAKE can compress even pre-compressed models and produce state-of-the-art lightweight networks.

preprint · 2020 · arXiv

Particle and entropy production in the Running Vacuum Universe

We study particle production and the corresponding entropy increase in the context of cosmology with dynamical vacuum. We focus on the particular form that has been called "running vacuum model" (RVM), which is known to furnish a successful description of the overall current observations at a competitive level with the concordance $Λ$CDM model. It also provides an elegant global explanation of the cosmic history from a non-singular initial state in the very early universe up to our days and further into the final de Sitter era. The model has no horizon problem and provides an alternative explanation for the early inflation and its graceful exit, as well as a powerful mechanism for generating the large entropy of the current universe. The energy-momentum tensor of matter is generally non-conserved in such context owing to particle creation or annihilation. We analyze general thermodynamical aspects of particle and entropy production in the RVM. We first study the entropy of particles in the comoving volume during the early universe and late universe. Then, in order to obtain a more physical interpretation, we pay attention to the entropy contribution from the cosmological apparent horizon, its interior and its surface. On combining the inner volume entropy with the entropy on the horizon, we elucidate with detailed calculations whether the evolution of the entropy of the RVM universe satisfies the Generalized Second Law of Thermodynamics. We find it is so and we prove that the essential reason for it is the existence of a positive cosmological constant.

preprint · 2020 · arXiv

Polar jets of swimming bacteria condensed by a patterned liquid crystal

Active matter exhibits remarkable collective behavior in which flows, continuously generated by active particles, are intertwined with the orientational order of these particles. The relationship remains poorly understood as the activity and order are difficult to control independently. Here we demonstrate important facets of this interplay by exploring dynamics of swimming bacteria in a liquid crystalline environment with pre-designed periodic splay and bend in molecular orientation. The bacteria are expelled from the bend regions and condense into polar jets that propagate and transport cargo unidirectionally along the splay regions. The bacterial jets remain stable even when the local concentration exceeds the threshold of bending instability in a non-patterned system. Collective polar propulsion and different role of bend and splay are explained by an advection-diffusion model and by numerical simulations that treat the system as a two-phase active nematic. The ability of prepatterned liquid crystalline medium to streamline the chaotic movements of swimming bacteria into polar jets that can carry cargo along a predesigned trajectory opens the door for potential applications in cell sorting, microscale delivery and soft microrobotics.

preprint · 2020 · arXiv

Topology control of human cells monolayer by liquid crystal elastomer

Biological cells in living tissues form dynamic patterns with local orientational order and topological defects. Here we demonstrate an approach to produce cell monolayer with the predesigned orientational patterns using human dermal fibroblast cells (HDF) placed onto a photoaligned liquid crystal elastomer (LCE). The alignment of cells is caused by anisotropic swelling of the substrates in contact with the aqueous cell growth medium. The patterns predesigned in the LCE cause a strong spatial variation of cell phenotype (evidenced by shape variations), their surface density and number density fluctuations. The concentration of cells is significantly higher near the cores of positive-strength defects as compared to negative-strength defects. Unbinding of defect pairs intrinsic to active matter is suppressed by anisotropic surface anchoring. The geometry of arrays allows one to estimate the elastic and surface anchoring characteristics of the tissues. The demonstrated patterned LCE approach could be used to control the collective behavior of cells in living tissues, cell differentiation, and tissue morphogenesis.

preprint · 2020 · arXiv

Vibration-induced actuation of droplets on microstructured surfaces

When a liquid droplet impacts a vibrated micro-structured surface with asymmetric topology, the droplet moves horizontally while bouncing. The effect is observed when the liquid is in contact with a low surface energy (e.g. hydrophobic) surface and over a wide amplitude and frequency range. We propose that the direction of droplet motion is set by a force exerted by the unbalanced vapor flow between the solid and the liquid, which arises from the asymmetric geometry. We observe the levitation and movement dynamics of the droplet impacting a vibrated micro-structured surface to reveal the processes responsible for the transitional regime between the moving, unmoved, and broken droplet as the vibration amplitude and frequency increase. Based on the insight provided by the experiment and on the analysis of the kinetic energy of the droplet, we develop a quantitative model for the dynamic movement and its dependence on the vibration characteristics.

preprint · 2019 · arXiv

Thick brane in reduced Horndeski theory

Horndeski theory is the most general scalar-tensor theory retaining second-order field equations, although the action includes higher-order terms. This is achieved by a special choice of coupling constants. In this paper, we investigate thick brane system in reduced Horndeski theory, especially the effect of the non-minimal derivative coupling on thick brane. First, the equations of motion are presented and a set of analytic background solutions are obtained. Then, to investigate the stability of the background scalar profile, we present a novel canonically normalized method, and show that although the original background scalar field is unstable, the canonical one is stable. The stability of the thick brane under tensor perturbation is also considered. It is shown that the tachyon is absent and the graviton zero mode can be localized on the brane. The localized graviton zero mode recovers the four-dimensional Newtonian potential and the presence of the non-minimal derivative coupling results in a splitting of its wave function. The correction of the massive graviton KK modes to the Newtonian potential is also analyzed briefly.

preprint · 2016 · arXiv

A Binary Convolutional Encoder-decoder Network for Real-time Natural Scene Text Processing

In this paper, we develop a binary convolutional encoder-decoder network (B-CEDNet) for natural scene text processing (NSTP). It converts a text image to a class-distinguished salience map that reveals the categorical, spatial and morphological information of characters. Existing solutions are either memory-consuming or run-time-consuming, so they cannot be applied to real-time applications on resource-constrained devices such as advanced driver assistance systems. The developed network can process multiple regions containing characters in a single forward pass, and is trained to have binary weights and binary feature maps, which lead to both a remarkable inference run-time speedup and a memory usage reduction. By training with over 200,000 synthetic scene text images (size of $32\times128$), it can achieve $90\%$ and $91\%$ pixel-wise accuracy on the ICDAR-03 and ICDAR-13 datasets. It consumes only $4.59\ \text{ms}$ of inference run-time on GPU with a small network size of 2.14 MB, which is up to $8\times$ faster and $96\%$ smaller than its full-precision version.

preprint · 2016 · arXiv

A Chessboard Model of Human Brain and One Application on Memory Capacity

The famous claim that we only use about 10% of the brain's capacity has recently been challenged. Researchers argue that we are likely to use the whole brain, against the 10% claim. Some evidence and results from relevant studies and experiments related to memory in the field of neuroscience lead to the conclusion that if the remaining 90% of the brain were not used, then many neural pathways would degenerate. What is memory? How does the brain function? What would be the limit of memory capacity? This article provides a model built upon the physiological and neurological characteristics of the human brain, which could give some theoretical support and scientific explanation for some phenomena. It may not only have theoretical significance in neuroscience, but could also be practically useful in bridging the gap between natural and machine intelligence.

preprint2016arXiv

A Primal-Dual Type Algorithm with the $O(1/t)$ Convergence Rate for Large Scale Constrained Convex Programs

This paper considers large scale constrained convex programs, which are usually not solvable by interior point methods or other Newton-type methods due to the prohibitive computation and storage complexity for Hessians and matrix inversions. Instead, large scale constrained convex programs are often solved by gradient based methods or decomposition based methods. The conventional primal-dual subgradient method, also known as the Arrow-Hurwicz-Uzawa subgradient method, is a low complexity algorithm with an $O(1/\sqrt{t})$ convergence rate, where $t$ is the number of iterations. If the objective and constraint functions are separable, the Lagrangian dual type method can decompose a large scale convex program into multiple parallel small scale convex programs. The classical dual gradient algorithm is an example of Lagrangian dual type methods and has convergence rate $O(1/\sqrt{t})$. Recently, a new Lagrangian dual type algorithm with faster $O(1/t)$ convergence was proposed in Yu and Neely (2015). However, if the objective or constraint functions are not separable, each iteration of the Lagrangian dual type method in Yu and Neely (2015) requires solving a large scale unconstrained convex program, which can have huge complexity. This paper proposes a new primal-dual type algorithm, which involves only simple gradient updates at each iteration and has an $O(1/t)$ convergence rate.
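For orientation, here is a minimal sketch of the conventional primal-dual (Arrow-Hurwicz-Uzawa style) iteration mentioned in the abstract, applied to a one-dimensional toy problem. It is not the new algorithm the paper proposes, whose update rule is not given here. The function name `primal_dual_subgradient`, the step size, and the toy problem (minimize $x^2$ subject to $1 - x \le 0$, with solution $x = 1$ and multiplier $2$) are assumptions chosen for illustration.

```python
def primal_dual_subgradient(grad_f, g, grad_g, x0, step, iters):
    """Iterate on: minimize f(x) subject to g(x) <= 0, for scalar x."""
    x, lam = x0, 0.0
    x_sum = 0.0
    for _ in range(iters):
        # Primal step: gradient descent on the Lagrangian f(x) + lam * g(x).
        x_next = x - step * (grad_f(x) + lam * grad_g(x))
        # Dual step: projected gradient ascent keeps the multiplier nonnegative.
        lam = max(0.0, lam + step * g(x))
        x = x_next
        x_sum += x
    # Return the last iterate, the multiplier, and the running average of x.
    return x, lam, x_sum / iters

# Toy problem (hypothetical): minimize x^2 subject to 1 - x <= 0.
x_last, lam_last, x_avg = primal_dual_subgradient(
    grad_f=lambda x: 2.0 * x,
    g=lambda x: 1.0 - x,
    grad_g=lambda x: -1.0,
    x0=0.0,
    step=0.01,
    iters=20000,
)
print(x_last, lam_last, x_avg)
```

Each iteration uses only gradient evaluations, which is the low per-iteration cost the abstract refers to. The averaged iterate is returned as well because convergence rates for this family of methods are typically stated for averages.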

preprint2016arXiv

A Probabilistic Sample Path Convergence Time Analysis of Drift-Plus-Penalty Algorithm for Stochastic Optimization

This paper considers the problem of minimizing the time average of a controlled stochastic process subject to multiple time average constraints on other related processes. The probability distribution of the random events in the system is unknown to the controller. A typical application is time average power minimization subject to network throughput constraints for different users in a network with time varying channel conditions. We show that with probability at least $1-2\delta$, the classical drift-plus-penalty algorithm provides a sample path $\mathcal{O}(\varepsilon)$ approximation to optimality with a convergence time $\mathcal{O}\left(\frac{1}{\varepsilon^2}\max\left\{\log^2\frac{1}{\varepsilon}\log\frac{2}{\delta},~\log^3\frac{2}{\delta}\right\}\right)$, where $\varepsilon>0$ is a parameter related to the algorithm. When there is only one constraint, we further show that the convergence time can be improved to $\mathcal{O}\left(\frac{1}{\varepsilon^2}\log^2\frac{1}{\delta}\right)$.

preprint2016arXiv

Current-Driven Domain Wall Motion: Velocity, Current and Phase Transition

The relation between domain wall motion and the intensity of the driving current is examined in a phenomenological theory in which the kinetic energy is expanded as a polynomial series in the current density, as in the Landau theory of phase transitions. The velocity depends on the current density as a square root, which reduces to a linear dependence when the current is much higher than the critical value. The theoretical result is consistent with several previous experiments and can also explain the change of the critical current in the presence of temperature. The role that temperature plays in the dynamics of domain wall motion is also discussed. The phase transition theory in terms of current density is employed to explain the critical behavior of domain wall motion.

preprint2016arXiv

Dispersion and Scaling Law of Dynamic Hysteresis Based on the Landau-Lifshitz-Gilbert Model

Hysteresis dispersion under a varying external field Hex is investigated through numerical simulations based on the Landau-Lifshitz-Gilbert (LLG) equation, indicating that the energy dissipation can be determined by W(η) = A (f, H0). A linear relation between the hysteresis area and the magnitude of the external field is found. The evolution of hysteresis under an oscillating external field is also investigated.

preprint2016arXiv

Dual flows in hyperbolic space and de Sitter space

We consider contracting flows in $(n+1)$-dimensional hyperbolic space and expanding flows in $(n+1)$-dimensional de Sitter space. When the flow hypersurfaces are strictly convex we relate the contracting hypersurfaces and the expanding hypersurfaces by the Gauss map. The contracting hypersurfaces shrink to a point $x_0$ in finite time while the expanding hypersurfaces converge to the maximal slice $\{\tau=0\}$. After rescaling by the same scale factor, the rescaled contracting hypersurfaces converge to a unit geodesic sphere, while the rescaled expanding hypersurfaces converge to the slice $\{\tau=-1\}$ exponentially fast in $C^\infty(\mathbb{S}^n)$.

preprint2016arXiv

Gravitational resonances on $f(R)$-brane

In this paper, we investigate various $f(R)$-brane models and compare their gravitational resonance structures with those of the corresponding general relativity (GR)-branes. Starting from some known GR-brane solutions, we derive thick $f(R)$-brane solutions such that the metric, scalar field, and scalar potential coincide with those of the corresponding GR-branes. We find that for branes generated by a single or several canonical scalar fields, there is no obvious distinction between the GR-branes and the corresponding $f(R)$-branes in terms of gravitational resonance structure. Then we discuss the branes generated by K-fields. In this case, there could exist huge differences between GR-branes and $f(R)$-branes.

preprint2015arXiv

Black phosphorus as a new broadband saturable absorber for infrared passively Q-switched fiber lasers

Black phosphorus (BP), with its enticing electric and optical properties, is intensely researched in the field of optoelectronics. In this paper, Q-switched pulses at 1550 nm and 2 μm wavelengths are obtained by inserting a bulk-structured BP based saturable absorber (SA) into an erbium-doped fiber laser (EDFL) and a thulium/holmium-doped fiber laser (THDFL), respectively. The BP-SA was prepared by depositing powdered BP material onto the flat side of a side-polished single mode fiber. Q-switched 1550 nm pulses with widths tuned from 9.35 to 31 μs were obtained for the EDFL. For the THDFL, a wavelength range of over 100 nm, from 1832 to 1935 nm, could be covered by adjusting the pump power. To the best of our knowledge, these results demonstrate the broadband saturable absorption property of BP and verify, for the first time, BP as a new two-dimensional material for applications in saturable absorption devices.

preprint2015arXiv

Mid-infrared ultra-short mode-locked fiber laser utilizing topological insulator Bi2Te3 nano-sheets as the saturable absorber

The newly emergent two-dimensional topological insulators (TIs) have shown unique electronic and optical properties, such as good thermal management, a high nonlinear refractive index and an ultrafast relaxation time. Their narrow energy band gaps suggest optical absorption extending further into the mid-infrared region and the potential to serve as very broadband light modulators ranging from the visible to the mid-infrared region. In this paper, a mid-infrared mode-locked fluoride fiber laser with TI Bi2Te3 nano-sheets as the saturable absorber is presented. Continuous wave lasing, Q-switched and continuous-wave mode-locking (CW-ML) operations of the laser are observed sequentially by increasing the pump power. The observed CW-ML pulse train has a pulse repetition rate of 10.4 MHz, a pulse width of ~6 ps, and a center wavelength of 2830 nm. The maximum achievable pulse energy is 8.6 nJ with average power up to 90 mW. This work clearly demonstrates the promising applications of two-dimensional TIs for ultra-short laser operation and nonlinear optics in the mid-infrared region.
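The quoted pulse energy can be cross-checked against the average power and repetition rate using the standard relation for a pulse train, energy per pulse = average power / repetition rate. The short calculation below is only a sanity check under that relation; the variable names are chosen for this example.

```python
average_power_w = 90e-3       # 90 mW average power, from the abstract
repetition_rate_hz = 10.4e6   # 10.4 MHz repetition rate, from the abstract

# Energy per pulse of a regular pulse train = average power / repetition rate.
pulse_energy_j = average_power_w / repetition_rate_hz
print(f"{pulse_energy_j * 1e9:.2f} nJ")
```

The result is about 8.65 nJ, in agreement with the 8.6 nJ stated in the abstract.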

preprint2015arXiv

Tensor perturbations of Palatini $f(\mathcal{R})$-branes

We investigate the thick brane model in Palatini $f(\mathcal{R})$ gravity. The brane is generated by a real scalar field with a scalar potential. We solve the system analytically and obtain a series of thick brane solutions for the $f(\mathcal{R})=\mathcal{R}+\alpha\mathcal{R}^2$-brane model. It is shown that tensor perturbations of the metric are stable for $df({\mathcal{R}})/d{\mathcal{R}}>0$. For nonconstant curvature solutions, the graviton zero mode can be localized on the brane, which indicates that four-dimensional gravity can be recovered on the brane. The mass spectrum of graviton KK modes and their corrections to the Newtonian potential are also discussed.

preprint2015arXiv

The structure of $f(R)$-brane model

Recently, a family of interesting analytical brane solutions was found in $f(R)$ gravity with $f(R)=R+\alpha R^2$ in Ref. [Phys. Lett. B 729, 127 (2014)]. In these solutions, inner brane structure can be turned on by tuning the value of the parameter $\alpha$. In this paper, we investigate how the parameter $\alpha$ affects the localization and the quasilocalization of the tensorial gravitons around these solutions. It is found that, in a range of $\alpha$, although the brane has an inner structure, there is no graviton resonance. However, in some other regions of the parameter space, although the brane has no internal structure, the effective potential for the graviton KK modes has a singular structure, and there exists a series of graviton resonant modes. The contribution of the massive graviton KK modes to Newton's law of gravity is briefly discussed.

preprint2014arXiv

Study of the material photon and electron background and the liquid argon detector veto efficiency of the CDEX-10 experiment

The China Dark Matter Experiment (CDEX) is located at the China Jinping Underground Laboratory (CJPL) and aims to directly detect the WIMP flux with high sensitivity in the low mass region. Here we present a study of the predicted photon and electron backgrounds, including the background contribution of the structure materials of the germanium detector, the passive shielding materials, and the intrinsic radioactivity of the liquid argon that serves as an anti-Compton active shielding detector. A detailed geometry is modeled and the background contribution is simulated with the GEANT4 program, based on the measured radioactivities of all possible components. Then the photon and electron background level in the energy region of interest ($<10^{-2}$ events kg$^{-1}$ day$^{-1}$ keV$^{-1}$ (cpkkd)) is predicted based on Monte Carlo simulations. The simulated result is consistent with the design goal of the CDEX-10 experiment, 0.1 cpkkd, which shows that the active and passive shield design of CDEX-10 is effective and feasible.

preprint2013arXiv

Duality Codes and the Integrality Gap Bound for Index Coding

This paper considers a base station that delivers packets to multiple receivers through a sequence of coded transmissions. All receivers overhear the same transmissions. Each receiver may already have some of the packets as side information, and requests another subset of the packets. This problem is known as the index coding problem and can be represented by a bipartite digraph. An integer linear program is developed that provides a lower bound on the minimum number of transmissions required for any coding algorithm. Conversely, its linear programming relaxation is shown to provide an upper bound that is achievable by a simple form of vector linear coding. Thus, the information theoretic optimum is bounded by the integrality gap between the integer program and its linear relaxation. In the special case when the digraph has a planar structure, the integrality gap is shown to be zero, so that exact optimality is achieved. Finally, for non-planar problems, an enhanced integer program is constructed that provides a smaller integrality gap. The dual of this problem corresponds to a more sophisticated partial clique coding strategy that time-shares between Reed-Solomon erasure codes. This work illuminates the relationship between index coding, duality, and integrality gaps between integer programs and their linear relaxations.
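The smallest instance of the index coding problem can be sketched as follows: two receivers each hold the packet the other one wants, so a single XOR-coded broadcast serves both. This is a textbook illustration of coded transmission with side information, not the duality or partial clique codes developed in the paper. The packet contents and the helper `xor_bytes` are hypothetical.

```python
def xor_bytes(a, b):
    """Bitwise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

# Hypothetical packets held by the base station.
packets = {"p1": b"\x0f\xa5", "p2": b"\x3c\x5a"}

# Receiver A wants p1 and already has p2 as side information.
# Receiver B wants p2 and already has p1 as side information.
coded = xor_bytes(packets["p1"], packets["p2"])  # one broadcast instead of two

# Each receiver removes its own side information from the coded packet.
decoded_at_a = xor_bytes(coded, packets["p2"])
decoded_at_b = xor_bytes(coded, packets["p1"])
print(decoded_at_a == packets["p1"], decoded_at_b == packets["p2"])
```

Without coding, the base station would need two transmissions here; with the XOR it needs one, which is the kind of saving the lower and upper bounds in the abstract quantify for general side-information patterns.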

preprint2013arXiv

Introduction of the CDEX experiment

Weakly Interacting Massive Particles (WIMPs) are candidates for dark matter in our universe. To date, no direct interaction of WIMPs with nuclei has been observed. The experimentally obtained exclusion limits on the spin-independent WIMP-nucleon cross section are about $10^{-7}$ pb in the high mass region and only $10^{-5}$ pb in the low mass region. The China Jin-Ping Underground Laboratory (CJPL) is the deepest underground lab in the world and provides a very promising environment for direct observation of dark matter. The China Dark Matter Experiment (CDEX) aims to directly detect the WIMP flux with high sensitivity in the low mass region. Both CJPL and CDEX have made remarkable progress in the past two years. CDEX employs a point-contact germanium semiconductor detector (PCGe) whose detection threshold is less than 300 eV. We report the results of measurements of the muon flux, radioactivity monitoring and radon concentration carried out at CJPL, and describe the structure and performance of the 1 kg PCGe detector CDEX-1 and the 10 kg detector array CDEX-10, including the detectors, electronics, shielding and cooling systems. Finally, we discuss the physics goals of the CDEX-1, CDEX-10 and future CDEX-1T detectors.

preprint2013arXiv

The CDEX-1 1 kg Point-Contact Germanium Detector for Low Mass Dark Matter Searches

The CDEX Collaboration has been established for direct detection of light dark matter particles, using ultra-low energy threshold p-type point-contact germanium detectors, at the China JinPing Underground Laboratory (CJPL). The first 1 kg point-contact germanium detector with a sub-keV energy threshold has been tested in a passive shielding system located in CJPL. The outputs from both the point-contact p+ electrode and the outside n+ electrode make it possible to scan the lower energy range below 1 keV and at the same time to detect the higher energy range up to 3 MeV. The outputs from both the p+ and n+ electrodes may also provide a more powerful method for signal discrimination in dark matter experiments. Some key parameters, including energy resolution, dead time, decay times of internal X-rays, and system stability, have been tested and measured. The results show that the 1 kg point-contact germanium detector, together with its shielding system and electronics, can run smoothly with good performance. This detector system will be deployed for dark matter search experiments.

preprint2012arXiv

Stress in Spin-Valve Nanopillars due to Spin Transfer

We report a mechanical effect in spin-valve nanopillars due to spin transfer. A polarized current carrying electron spins transfers torque to the local magnetization and leads to magnetic switching of the free layer. As in the classical Einstein-de Haas effect, conservation of angular momentum requires the free layer to offset the change in angular momentum, so a mechanical rotation occurs. The free layer is not free standing, so the mechanical angular momentum appears as stress and strain. We study the effect of spin-induced stress in a nanopillar device with in-plane magnetization. Our calculations show that the stress in the device depends on frequency and on the length-to-thickness ratio, and is about 1 MPa at GHz frequencies. It is concluded that the stress owing to spin transfer is much less than the internal stress of the film and does not damage the device.

preprint2010arXiv

Game Theoretical Power Control for Open-Loop Overlaid Network MIMO Systems with Partial Cooperation

Network MIMO is considered to be a key solution for next generation wireless systems in breaking the interference bottleneck in cellular systems. In MIMO systems, an open-loop transmission scheme is used to support mobile stations (MSs) with high mobility because the base stations (BSs) do not need to track the fast varying channel fading. In this paper, we consider an open-loop network MIMO system with $K$ BSs serving $K$ private MSs and $M^c$ common MSs based on a novel partial cooperation overlaying scheme. Exploiting the heterogeneous path gains between the private MSs and the common MSs, each of the $K$ BSs serves a private MS non-cooperatively and the $K$ BSs also serve the $M^c$ common MSs cooperatively. The proposed scheme does not require closed loop instantaneous channel state information feedback, which is highly desirable for high mobility users. Furthermore, we formulate the long-term distributive power allocation problem between the private MSs and the common MSs at each of the $K$ BSs using a partial cooperative game. We show that the long-term power allocation game has a unique Nash Equilibrium (NE) but the standard best response update may not always converge to the NE. As a result, we propose a low-complexity distributive long-term power allocation algorithm which relies only on the local long-term channel statistics and has a provable convergence property. Through numerical simulations, we show that the proposed open-loop SDMA scheme with long-term distributive power allocation can achieve significant performance advantages over the other reference baseline schemes.

preprint2010arXiv

Rank-Constrained Schur-Convex Optimization with Multiple Trace/Log-Det Constraints

Rank-constrained optimization problems have received increasing interest recently, because many optimization problems in communications and signal processing applications can be cast as rank-constrained optimization problems. However, due to the non-convex nature of rank constraints, a systematic solution to general rank-constrained problems has remained open for a long time. In this paper, we focus on a rank-constrained optimization problem with a Schur-convex/concave objective function and multiple trace/log-determinant constraints. We first derive a structural result on the optimal solution of the rank-constrained problem using majorization theory. Based on the solution structure, we transform the rank-constrained problem into an equivalent problem with a unitary constraint. After that, we derive an iterative projected steepest descent algorithm which converges to a locally optimal solution. Furthermore, we show that in some special cases we can derive a closed-form globally optimal solution. The numerical results show the superior performance of our proposed technique over the baseline schemes.

41 published item(s)

A Unified Pair-GRPO Family: From Implicit to Explicit Preference Constraints for Stable and General RL Alignment

Thick branes in Born-Infeld determinantal gravity in Weitzenböck spacetime

Hamilton Cycles In Primitive Graphs of Order $2rs$

Leveraging Affect Transfer Learning for Behavior Prediction in an Intelligent Tutoring System

Multi-Scale Context-Guided Lumbar Spine Disease Identification with Coarse-to-fine Localization and Classification

Topological superconductivity in a topological insulator

Training Vision Transformers with Only 2040 Images

High-performance quantum entanglement generation via cascaded second-order nonlinear processes

Mixup Without Hesitation

On the correspondence between energy conservation and energy-momentum tensor conservation in cosmology

Spectrally multiplexed heralded single photon source at telecom-band

A Low Complexity Algorithm with $O(\sqrt{T})$ Regret and $O(1)$ Constraint Violations for Online Convex Optimization with Long Term Constraints

A Very Simple Estimate Of Rational Homological Dimension Of Moduli Spaces Of Riemann Surfaces With Boundary And Marked Points

Constraint on the radius of five-dimensional dS spacetime with GW170817 and GRB 170817A

Control of microswimmers by spiral nematic vortices: transition from individual to collective motion and contraction, expansion, and stable circulation of bacterial swirls

Dynamical mixing between $2^3S_1$ and $1^3D_1$ charmed mesons

HOTCAKE: Higher Order Tucker Articulated Kernels for Deeper CNN Compression

Particle and entropy production in the Running Vacuum Universe

Polar jets of swimming bacteria condensed by a patterned liquid crystal

Topology control of human cells monolayer by liquid crystal elastomer

Vibration-induced actuation of droplets on microstructured surfaces

Thick brane in reduced Horndeski theory

A Binary Convolutional Encoder-decoder Network for Real-time Natural Scene Text Processing

A Chessboard Model of Human Brain and One Application on Memory Capacity

A Primal-Dual Type Algorithm with the $O(1/t)$ Convergence Rate for Large Scale Constrained Convex Programs

A Probabilistic Sample Path Convergence Time Analysis of Drift-Plus-Penalty Algorithm for Stochastic Optimization

Current-Driven Domain Wall Motion: Velocity, Current and Phase Transition

Dispersion and Scaling Law of Dynamic Hysteresis Based on the Landau-Lifshitz-Gilbert Model

Dual flows in hyperbolic space and de Sitter space

Gravitational resonances on $f(R)$-brane

Black phosphorus as a new broadband saturable absorber for infrared passively Q-switched fiber lasers

Mid-infrared ultra-short mode-locked fiber laser utilizing topological insulator Bi2Te3 nano-sheets as the saturable absorber

Tensor perturbations of Palatini $f(\mathcal{R})$-branes

The structure of $f(R)$-brane model

Study of the material photon and electron background and the liquid argon detector veto efficiency of the CDEX-10 experiment

Duality Codes and the Integrality Gap Bound for Index Coding

Introduction of the CDEX experiment

The CDEX-1 1 kg Point-Contact Germanium Detector for Low Mass Dark Matter Searches

Stress in Spin-Valve Nanopillars due to Spin Transfer

Game Theoretical Power Control for Open-Loop Overlaid Network MIMO Systems with Partial Cooperation

Rank-Constrained Schur-Convex Optimization with Multiple Trace/Log-Det Constraints