Source author record

Huan Li

Huan Li appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Catalog footprint

What is connected

27 works

22 topics

4 close collaborators

preprint, 2026, arXiv

Forcing-KV: Hybrid KV Cache Compression for Efficient Autoregressive Video Diffusion Models

Autoregressive (AR) video diffusion models adopt a streaming generation framework, enabling long-horizon video generation with real-time responsiveness, as exemplified by the Self Forcing training paradigm. However, existing AR video diffusion models still suffer from significant attention complexity and severe memory overhead due to the redundant key-value (KV) caches across historical frames, which limits scalability. In this paper, we tackle this challenge by introducing KV cache compression into autoregressive video diffusion. We observe that attention heads in mainstream AR diffusion models exhibit markedly distinct attention patterns and functional roles that remain stable across samples and denoising steps. Building on our empirical study of head-wise functional specialization, we divide the attention heads into two categories: static heads, which focus on transitions across autoregressive chunks and intra-frame fidelity, and dynamic heads, which govern inter-frame motion and consistency. We then propose Forcing-KV, a hybrid KV cache compression strategy that performs structured static pruning for static heads and dynamic pruning based on segment-wise similarity for dynamic heads. While maintaining output quality, our method achieves a generation speed of over 29 frames per second on a single NVIDIA H200 GPU along with 30% cache memory reduction, delivering up to 1.35x and 1.50x speedups on LongLive and Self Forcing at 480P resolution, and further scaling to 2.82x speedup at 1080P resolution. Code and demo videos are provided at https://zju-jiyicheng.github.io/Forcing-KV-Page.

preprint, 2026, arXiv

The grip of grammar on meaning uncertainty: cross-linguistic evidence, neural correlates, and clinical relevance

Isolated word meanings are inherently uncertain. This uncertainty reduces when they are combined and anchored in context. We propose that grammar compresses meaning uncertainty cross-linguistically, which is reflected in the brain and selectively disrupted in disorders. Compression was operationalized as the relative difference between non-contextual surprisal estimated from lexical frequency and contextual surprisal from grammar-sensitive models. In narratives from 20 languages, contextual surprisal reduced frequency-based surprisal. This reduction closely tracked the surprisal cost of reversing word order, and scaled with richer, non-redundant lexis as organized by more complex but optimal dependency structure. During fMRI, surprisal and its reduction explained BOLD activity for comprehension and production in overlapping but distinct regions. Uncertainty reduction was significantly attenuated in aphasia, dementia, and schizophrenia, but remained intact where the primary deficit is not language. These findings position uncertainty reduction via grammar as a foundational concept that illuminates principles, brain basis, and disruptions of language.
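As a rough illustration of the compression measure described in this abstract, the sketch below computes surprisal in bits and the relative reduction from frequency-based to contextual surprisal. The exact normalization used in the paper is not given here, so the formula `(S_freq - S_ctx) / S_freq` and the function names `surprisal_bits` and `relative_compression` are assumptions made for illustration only.

```python
import math
from typing import Sequence


def surprisal_bits(p: float) -> float:
    """Surprisal of an event with probability p, in bits: -log2(p)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(p)


def relative_compression(freq_probs: Sequence[float],
                         ctx_probs: Sequence[float]) -> float:
    """Relative reduction of total surprisal when context is available.

    freq_probs: per-word probabilities from lexical frequency alone.
    ctx_probs:  per-word probabilities from a context-sensitive model.
    Assumed definition (hypothetical): (S_freq - S_ctx) / S_freq.
    """
    if len(freq_probs) != len(ctx_probs):
        raise ValueError("both sequences must cover the same words")
    s_freq = sum(surprisal_bits(p) for p in freq_probs)
    s_ctx = sum(surprisal_bits(p) for p in ctx_probs)
    if s_freq == 0.0:
        raise ValueError("frequency-based surprisal is zero; ratio undefined")
    return (s_freq - s_ctx) / s_freq
```

A positive value means context lowered the total surprisal; a value near zero means context added little beyond word frequency.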

preprint, 2022, arXiv

Bridging the Gap: Commonality and Differences between Online and Offline COVID-19 Data

With the onset of the COVID-19 pandemic, news outlets and social media have become central tools for disseminating and consuming information. Because of their ease of access, users seek COVID-19-related information from online social media (i.e., online news) and news outlets (i.e., offline news). Online and offline news are often connected, sharing common topics while each has unique, different topics. A gap between these two news sources can lead to misinformation propagation. For instance, according to the Guardian, most COVID-19 misinformation comes from users on social media. Without fact-checking social media news, misinformation can lead to health threats. In this paper, we focus on the novel problem of bridging the gap between online and offline data by monitoring their common and distinct topics generated over time. We employ Twitter (online) and local news (offline) data for a time span of two years. Using online matrix factorization, we analyze and study online and offline COVID-19-related data differences and commonalities. We design experiments to show how online and offline data are linked together and what trends they follow.

preprint, 2022, arXiv

Long-range transport of 2D excitons with acoustic waves

Excitons are elementary optical excitations in semiconductors. The ability to manipulate and transport these quasiparticles would enable excitonic circuits and devices for quantum photonic technologies. Recently, interlayer excitons in 2D semiconductors have emerged as a promising candidate for engineering excitonic devices due to their long lifetime, large exciton binding energy, and gate tunability. However, the charge-neutral nature of the excitons leads to weak response to the in-plane electric field and thus inhibits transport beyond the diffusion length. Here, we demonstrate the directional transport of interlayer excitons in bilayer WSe2 driven by the propagating potential traps induced by surface acoustic waves (SAW). We show that at 100 K, the SAW-driven excitonic transport is activated above a threshold acoustic power and reaches 20 μm, a distance at least ten times longer than the diffusion length and only limited by the device size. Temperature-dependent measurement reveals the transition from the diffusion-limited regime at low temperature to the acoustic field-driven regime at elevated temperature. Our work shows that acoustic waves are an effective, contact-free means to control exciton dynamics and transport, promising for realizing 2D materials-based excitonic devices such as exciton transistors, switches, and transducers up to room temperature.

preprint, 2022, arXiv

On Regularity Lemma and Barriers in Streaming and Dynamic Matching

We present a new approach for finding matchings in dense graphs by building on Szemerédi's celebrated Regularity Lemma. This allows us to obtain non-trivial albeit slight improvements over longstanding bounds for matchings in streaming and dynamic graphs. In particular, we establish the following results for $n$-vertex graphs:

* A deterministic single-pass streaming algorithm that finds a $(1-o(1))$-approximate matching in $o(n^2)$ bits of space. This constitutes the first single-pass algorithm for this problem in sublinear space that improves over the $\frac{1}{2}$-approximation of the greedy algorithm.
* A randomized fully dynamic algorithm that with high probability maintains a $(1-o(1))$-approximate matching in $o(n)$ worst-case update time per each edge insertion or deletion. The algorithm works even against an adaptive adversary. This is the first $o(n)$ update-time dynamic algorithm with approximation guarantee arbitrarily close to one.

Given the use of regularity lemma, the improvement obtained by our algorithms over trivial bounds is only by some $(\log^*{n})^{\Theta(1)}$ factor. Nevertheless, in each case, they show that the "right" answer to the problem is not what is dictated by the previous bounds. Finally, in the streaming model, we also present a randomized $(1-o(1))$-approximation algorithm whose space can be upper bounded by the density of certain Ruzsa-Szemerédi (RS) graphs. While RS graphs by now have been used extensively to prove streaming lower bounds, ours is the first to use them as an upper bound tool for designing improved streaming algorithms.
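For context, the greedy baseline that this abstract compares against can be sketched in a few lines. This is the standard single-pass greedy maximal matching, which gives a 1/2-approximation; it is not the paper's regularity-lemma algorithm, and the function name `greedy_streaming_matching` is a hypothetical choice for illustration.

```python
from typing import Hashable, Iterable, List, Set, Tuple

Edge = Tuple[Hashable, Hashable]


def greedy_streaming_matching(edge_stream: Iterable[Edge]) -> List[Edge]:
    """Single pass over the edges: keep an edge iff both endpoints are free.

    The result is a maximal matching, so its size is at least half the
    size of a maximum matching. Only the set of matched vertices is stored.
    """
    matched: Set[Hashable] = set()
    matching: List[Edge] = []
    for u, v in edge_stream:
        # Skip self-loops and edges touching an already matched vertex.
        if u != v and u not in matched and v not in matched:
            matching.append((u, v))
            matched.update((u, v))
    return matching
```

On the path a-b-c-d, the arrival order matters: if the middle edge (b, c) arrives first, greedy keeps only that edge, while the maximum matching has two edges. This is the factor-of-two gap that the baseline cannot close.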

preprint, 2022, arXiv

Sublinear Algorithms for Hierarchical Clustering

Hierarchical clustering over graphs is a fundamental task in data mining and machine learning with applications in domains such as phylogenetics, social network analysis, and information retrieval. Specifically, we consider the recently popularized objective function for hierarchical clustering due to Dasgupta. Previous algorithms for (approximately) minimizing this objective function require linear time/space complexity. In many applications the underlying graph can be massive in size making it computationally challenging to process the graph even using a linear time/space algorithm. As a result, there is a strong interest in designing algorithms that can perform global computation using only sublinear resources. The focus of this work is to study hierarchical clustering for massive graphs under three well-studied models of sublinear computation which focus on space, time, and communication, respectively, as the primary resources to optimize: (1) (dynamic) streaming model where edges are presented as a stream, (2) query model where the graph is queried using neighbor and degree queries, (3) MPC model where the graph edges are partitioned over several machines connected via a communication channel. We design sublinear algorithms for hierarchical clustering in all three models above. At the heart of our algorithmic results is a view of the objective in terms of cuts in the graph, which allows us to use a relaxed notion of cut sparsifiers to do hierarchical clustering while introducing only a small distortion in the objective function. Our main algorithmic contributions are then to show how cut sparsifiers of the desired form can be efficiently constructed in the query model and the MPC model. We complement our algorithmic results by establishing nearly matching lower bounds that rule out the possibility of designing better algorithms in each of these models.
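To make the objective concrete, the sketch below evaluates Dasgupta's cost for a given hierarchy: each edge contributes its weight times the number of leaves under the lowest common ancestor of its endpoints, which is the cluster at which the edge is cut. The tree encoding (nested tuples with non-tuple leaf labels), the weight dictionary keyed by `frozenset`, and the name `dasgupta_cost` are assumptions for illustration; this brute-force evaluation is not one of the sublinear algorithms from the paper.

```python
from typing import Dict, FrozenSet, Hashable, List, Tuple, Union

Tree = Union[Hashable, tuple]  # a leaf label, or a tuple of subtrees


def dasgupta_cost(tree: Tree, weights: Dict[FrozenSet, float]) -> float:
    """Sum over edges {u, v} of w(u, v) * (#leaves under lca(u, v))."""

    def walk(node: Tree) -> Tuple[List[Hashable], float]:
        if not isinstance(node, tuple):
            return [node], 0.0  # a single leaf splits no pair
        groups: List[List[Hashable]] = []
        cost = 0.0
        for child in node:
            leaves, child_cost = walk(child)
            groups.append(leaves)
            cost += child_cost
        size = sum(len(g) for g in groups)
        # Pairs whose endpoints fall in different children are cut here.
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                for u in groups[i]:
                    for v in groups[j]:
                        cost += weights.get(frozenset((u, v)), 0.0) * size
        return [leaf for g in groups for leaf in g], cost

    return walk(tree)[1]
```

A hierarchy that separates heavy edges only near the leaves scores lower than one that cuts them at the root, which is why the objective can be read in terms of cuts in the graph.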

preprint, 2022, arXiv

Variance Reduced EXTRA and DIGing and Their Optimal Acceleration for Strongly Convex Decentralized Optimization

We study stochastic decentralized optimization for the problem of training machine learning models with large-scale distributed data. We extend the widely used EXTRA and DIGing methods with variance reduction (VR), and propose two methods: VR-EXTRA and VR-DIGing. The proposed VR-EXTRA requires $O((κ_s+n)\log\frac{1}ε)$ stochastic gradient evaluations and $O((κ_b+κ_c)\log\frac{1}ε)$ communication rounds to reach precision $ε$, which are the best complexities among the non-accelerated gradient-type methods, where $κ_s$ and $κ_b$ are the stochastic condition number and batch condition number for strongly convex and smooth problems, respectively, $κ_c$ is the condition number of the communication network, and $n$ is the sample size on each distributed node. The proposed VR-DIGing has a slightly higher communication cost of $O((κ_b+κ_c^2)\log\frac{1}ε)$. Our stochastic gradient computation complexities are the same as those of single-machine VR methods, such as SAG, SAGA, and SVRG, and our communication complexities match those of EXTRA and DIGing, respectively. To further speed up the convergence, we also propose the accelerated VR-EXTRA and VR-DIGing with both the optimal $O((\sqrt{nκ_s}+n)\log\frac{1}ε)$ stochastic gradient computation complexity and $O(\sqrt{κ_bκ_c}\log\frac{1}ε)$ communication complexity. Our stochastic gradient computation complexity is also the same as that of single-machine accelerated VR methods, such as Katyusha, and our communication complexity matches that of accelerated full batch decentralized methods, such as MSDA.

preprint, 2020, arXiv

AIBench: An Agile Domain-specific Benchmarking Methodology and an AI Benchmark Suite

Domain-specific software and hardware co-design is encouraging as it is much easier to achieve efficiency for fewer tasks. Agile domain-specific benchmarking speeds up the process as it provides not only relevant design inputs but also relevant metrics and tools. Unfortunately, modern workloads like Big data, AI, and Internet services dwarf the traditional ones in terms of code size, deployment scale, and execution path, and hence raise serious benchmarking challenges. This paper proposes an agile domain-specific benchmarking methodology. Together with seventeen industry partners, we identify ten important end-to-end application scenarios, among which sixteen representative AI tasks are distilled as the AI component benchmarks. We propose the permutations of essential AI and non-AI component benchmarks as end-to-end benchmarks. An end-to-end benchmark is a distillation of the essential attributes of an industry-scale application. We design and implement a highly extensible, configurable, and flexible benchmark framework, on the basis of which we propose the guideline for building end-to-end benchmarks, and present the first end-to-end Internet service AI benchmark. The preliminary evaluation shows the value of our benchmark suite, AIBench, against MLPerf and TailBench for hardware and software designers, micro-architectural researchers, and code developers. The specifications, source code, testbed, and results are publicly available from the web site http://www.benchcouncil.org/AIBench/index.html.

preprint, 2020, arXiv

An Efficient PTAS for Stochastic Load Balancing with Poisson Jobs

We give the first polynomial-time approximation scheme (PTAS) for the stochastic load balancing problem when the job sizes follow Poisson distributions. This improves upon the 2-approximation algorithm due to Goel and Indyk (FOCS'99). Moreover, our approximation scheme is an efficient PTAS that has a running time double exponential in $1/ε$ but nearly-linear in $n$, where $n$ is the number of jobs and $ε$ is the target error. Previously, a PTAS (not efficient) was only known for jobs that obey exponential distributions (Goel and Indyk, FOCS'99). Our algorithm relies on several probabilistic ingredients including some (seemingly) new results on scaling and the so-called "focusing effect" of maximum of Poisson random variables which might be of independent interest.

preprint, 2020, arXiv

Decentralized Accelerated Gradient Methods With Increasing Penalty Parameters

In this paper, we study the communication and (sub)gradient computation costs in distributed optimization and give a sharp complexity analysis for the proposed distributed accelerated gradient methods. We present two algorithms based on the framework of the accelerated penalty method with increasing penalty parameters. Our first algorithm is for smooth distributed optimization and it obtains the near optimal $O\left(\sqrt{\frac{L}{ε(1-σ_2(W))}}\log\frac{1}ε\right)$ communication complexity and the optimal $O\left(\sqrt{\frac{L}ε}\right)$ gradient computation complexity for $L$-smooth convex problems, where $σ_2(W)$ denotes the second largest singular value of the weight matrix $W$ associated with the network and $ε$ is the target accuracy. When the problem is $μ$-strongly convex and $L$-smooth, our algorithm has the near optimal $O\left(\sqrt{\frac{L}{μ(1-σ_2(W))}}\log^2\frac{1}ε\right)$ complexity for communications and the optimal $O\left(\sqrt{\frac{L}μ}\log\frac{1}ε\right)$ complexity for gradient computations. Our communication complexities are only worse by a factor of $\left(\log\frac{1}ε\right)$ than the lower bounds for the smooth distributed optimization. As far as we know, our method is the first to achieve both communication and gradient computation lower bounds up to an extra logarithm factor for smooth distributed optimization. Our second algorithm is designed for non-smooth distributed optimization and it achieves both the optimal $O\left(\frac{1}{ε\sqrt{1-σ_2(W)}}\right)$ communication complexity and $O\left(\frac{1}{ε^2}\right)$ subgradient computation complexity, which match the communication and subgradient computation complexity lower bounds for non-smooth distributed optimization.

preprint, 2020, arXiv

Revisiting EXTRA for Smooth Distributed Optimization

EXTRA is a popular method for decentralized distributed optimization and has broad applications. This paper revisits EXTRA. First, we give a sharp complexity analysis for EXTRA with the improved $O\left(\left(\frac{L}μ+\frac{1}{1-σ_2(W)}\right)\log\frac{1}{ε(1-σ_2(W))}\right)$ communication and computation complexities for $μ$-strongly convex and $L$-smooth problems, where $σ_2(W)$ is the second largest singular value of the weight matrix $W$. When the strong convexity is absent, we prove the $O\left(\left(\frac{L}ε+\frac{1}{1-σ_2(W)}\right)\log\frac{1}{1-σ_2(W)}\right)$ complexities. Then, we use the Catalyst framework to accelerate EXTRA and obtain the $O\left(\sqrt{\frac{L}{μ(1-σ_2(W))}}\log\frac{L}{μ(1-σ_2(W))}\log\frac{1}ε\right)$ communication and computation complexities for strongly convex and smooth problems and the $O\left(\sqrt{\frac{L}{ε(1-σ_2(W))}}\log\frac{1}{ε(1-σ_2(W))}\right)$ complexities for non-strongly convex ones. Our communication complexities of the accelerated EXTRA are only worse by the factors of $\left(\log\frac{L}{μ(1-σ_2(W))}\right)$ and $\left(\log\frac{1}{ε(1-σ_2(W))}\right)$ from the lower complexity bounds for strongly convex and non-strongly convex problems, respectively.

preprint, 2020, arXiv

Structural transition, metallization and superconductivity in quasi 2D layered PdS$_2$ under compression

Based on first-principles simulations and calculations, we explore the evolution of crystal structure, electronic structure and transport properties of quasi 2D layered PdS$_2$ under uniaxial stress and hydrostatic pressure. The coordination of the Pd ions plays crucial roles in the structural transition, electronic structure and transport properties of PdS$_2$. An interesting ferroelastic phase transition with lattice reorientation is revealed under uniaxial compressive stress, which originates from the bond reconstructions of the unusual PdS$_4$ square-planar coordination. By contrast, the layered structure transforms to 3D cubic pyrite-type structure under hydrostatic pressure. In contrast to the experimentally proposed coexistence of layered PdS$_2$-type structure with cubic pyrite-type structure at intermediate pressure range, we predict that the compression-induced intermediate phase shows the same structural symmetry as the ambient phase, except for sharply contracted interlayer distances. The coordination environments of the Pd ions change from square-planar to distorted octahedra in the intermediate phase, which results in bandwidth broadening and orbital-selective metallization. In addition, the superconductivity comes from the topological nodal-line states protected by the cubic pyrite-type structure. The strong correlations between structural transition, electronic structure and transport properties in PdS$_2$ provide a platform to study the fundamental physics of the interplay between crystal structure and transport behavior, and the competition between diverse phases.

preprint, 2019, arXiv

Valence transition in topological Kondo insulator

We investigate the valence transition in three-dimensional topological Kondo insulators through slave-boson analysis of the periodic Anderson model. By including the effect of intra-atomic Coulomb correlation $U_{fc}$ between conduction and local electrons, we find a first-order valence transition from Kondo region to mixed valence upon ascending of local level above a critical $U_{fc}$, and this valence transition usually occurs very close to or simultaneously with a topological transition. Near the parameter region of zero-temperature valence transition, a rise of temperature can generate a thermal valence transition from mixed valence to Kondo region, accompanied by a first-order topological transition. Remarkably, above a critical $U_{fc}$ which is considerably smaller than that generating the paramagnetic valence transition, the original continuous antiferromagnetic transition becomes a first-order one, at which a discontinuous valence shift takes place. Upon increased $U_{fc}$, the paramagnetic valence transition approaches and then converges with the first-order antiferromagnetic transition, leaving a significant valence shift on the magnetic boundary. The continuous antiferromagnetic transition, first-order antiferromagnetic transition, paramagnetic valence transition and topological transitions are all summarized in a global phase diagram. Our proposed exotic transition processes can help to understand the thermal valence variation as well as the valence shift around the pressure-induced magnetic transition in topological Kondo insulator candidates and in other heavy-fermion systems.

preprint, 2016, arXiv

Optomechanical measurement of photon spin angular momentum and optical torque in integrated photonic devices

Photons carry linear momentum, and spin angular momentum when circularly or elliptically polarized. During light-matter interaction, transfer of linear momentum leads to optical forces, while angular momentum transfer induces optical torque. Optical forces including radiation pressure and gradient forces have long been utilized in optical tweezers and laser cooling. In nanophotonic devices optical forces can be significantly enhanced, leading to unprecedented optomechanical effects in both classical and quantum regimes. In contrast, to date, the angular momentum of light and the optical torque effect remain unexplored in integrated photonics. Here, we demonstrate the measurement of the spin angular momentum of photons propagating in a birefringent waveguide and the use of optical torque to actuate rotational motion of an optomechanical device. We show that the sign and magnitude of the optical torque are determined by the photon polarization states that are synthesized on the chip. Our study reveals the mechanical effect of photon's polarization degree of freedom and demonstrates its control in integrated photonic devices. Exploiting optical torque and optomechanical interaction with photon angular momentum can lead to torsional cavity optomechanics and optomechanical photon spin-orbit coupling, as well as applications such as optomechanical gyroscope and torsional magnetometry.

preprint, 2016, arXiv

Phase diagram of Kondo-Heisenberg model on honeycomb lattice with geometrical frustration

We calculated the phase diagram of the Kondo-Heisenberg model on a two-dimensional honeycomb lattice with both nearest-neighbor and next-nearest-neighbor antiferromagnetic spin exchanges, to investigate the interplay between RKKY and Kondo interactions in the presence of magnetic frustration. Using a mean-field decoupling technique in the slave-fermion representation, we derived the zero-temperature phase diagram as a function of Kondo coupling $J_k$ and frustration strength $Q$. The geometrical frustration can destroy the magnetic order, driving the original antiferromagnetic (AF) phase to a non-magnetic valence bond state (VBS). In addition, we found two distinct VBS. As $J_k$ is increased, a phase transition from AF to Kondo paramagnetic (KP) phase occurs, without the intermediate phase in which AF order coexists with Kondo screening, as found in square lattice systems. In the KP phase, the enhancement of frustration weakens the Kondo screening effect, resulting in a phase transition from KP to VBS. We also found a process to recover the AF order from VBS by increasing $J_k$ in a wide range of frustration strength. Our work may provide deeper understanding of the phase transitions in heavy-fermion materials, particularly for those exhibiting triangular frustration.

preprint, 2016, arXiv

Strategies for Searching Video Content with Text Queries or Video Examples

The large number of user-generated videos uploaded to the Internet every day has led to many commercial video search engines, which mainly rely on text metadata for search. However, metadata is often lacking for user-generated videos, so these videos are unsearchable by current search engines. Content-based video retrieval (CBVR) tackles this metadata-scarcity problem by directly analyzing the visual and audio streams of each video. CBVR encompasses multiple research topics, including low-level feature design, feature fusion, semantic detector training and video search/reranking. We present novel strategies in these topics to enhance CBVR in both accuracy and speed under different query inputs, including pure textual queries and query by video examples. Our proposed strategies have been incorporated into our submission for the TRECVID 2014 Multimedia Event Detection evaluation, where our system outperformed other submissions in both text queries and video example queries, thus demonstrating the effectiveness of our proposed approaches.

preprint, 2015, arXiv

Acousto-optic modulation of a photonic crystal nanocavity with Lamb waves in microwave K band

Integrating nanoscale electromechanical transducers and nanophotonic devices can potentially enable new acousto-optic devices to reach unprecedented high frequencies and modulation efficiency. Here, we demonstrate acousto-optic modulation of a photonic crystal nanocavity using Lamb waves with frequency up to 19 GHz, reaching the microwave K band. The devices are fabricated in a suspended aluminum nitride membrane. Excitation of acoustic waves is achieved with interdigital transducers with periods as small as 300 nm. Confining both the acoustic wave and the optical wave within the thickness of the membrane leads to higher acousto-optic modulation efficiency in the new devices than that obtained in previous surface acoustic wave devices. Our system demonstrates a novel scalable optomechanical platform where strong acousto-optic coupling between cavity-confined photons and high frequency traveling phonons can be explored.

preprint, 2015, arXiv

Anomalous behavior of trapping in extended dendrimers with a perfect trap

Compact and extended dendrimers are two important classes of dendritic polymers. The impact of the underlying structure of compact dendrimers on dynamical processes has been much studied, yet the relation between the dynamical and structural properties of extended dendrimers remains not well understood. In this paper, we study the trapping problem in extended dendrimers with generation-dependent segment lengths, which is different from that of compact dendrimers where the length of the linear segments is fixed. We first consider a particular case that the deep trap is located at the central node, and derive an exact formula for the average trapping time (ATT) defined as the average of the source-to-trap mean first passage time over all starting points. Then, using the obtained result we deduce a closed-form expression for the ATT to an arbitrary trap node, based on which we further obtain an explicit solution to the ATT corresponding to the trapping issue with the trap uniformly distributed in the polymer systems. We show that the trap location has a substantial influence on the trapping efficiency measured by the ATT, which increases with the shortest distance from the trap to the central node, a phenomenon similar to that for compact dendrimers. In contrast to this resemblance, the leading terms of ATTs for the three trapping problems differ drastically between extended and compact dendrimers, with the trapping processes in the extended dendrimers being less efficient than in compact dendrimers.

preprint, 2015, arXiv

Fast Proximal Linearized Alternating Direction Method of Multiplier with Parallel Splitting

The Augmented Lagrangian Method (ALM) and Alternating Direction Method of Multiplier (ADMM) have been powerful optimization methods for general convex programming subject to linear constraints. We consider the convex problem whose objective consists of a smooth part and a nonsmooth but simple part. We propose the Fast Proximal Augmented Lagrangian Method (Fast PALM) which achieves the convergence rate $O(1/K^2)$, compared with $O(1/K)$ by the traditional PALM. In order to further reduce the per-iteration complexity and handle the multi-block problem, we propose the Fast Proximal ADMM with Parallel Splitting (Fast PL-ADMM-PS) method. It also partially improves the rate related to the smooth part of the objective function. Experimental results on both synthesized and real world data demonstrate that our fast methods significantly improve the previous PALM and ADMM.

preprint, 2015, arXiv

Nanophotonic cavity optomechanics with propagating phonons in microwave Ku band

Sideband-resolved coupling between multiple photonic nanocavities and propagating mechanical waves in the microwave Ku band is demonstrated. Coherent and strong photon-phonon interaction is manifested with optomechanically induced transparency and absorption, and phase-coherent interaction in multiple cavities. Inside an echo chamber it is shown that a phonon pulse can interact with an embedded nanocavity multiple times. Our device provides a scalable platform to optomechanically couple phonons and photons for microwave photonics and quantum photonics.

preprint, 2015, arXiv

Phase evolution of the two-dimensional Kondo lattice model near half-filling

Within a mean-field approximation, the ground state and finite temperature phase diagrams of the two-dimensional Kondo lattice model have been carefully studied as functions of the Kondo coupling $J$ and the conduction electron concentration $n_{c}$. In addition to the conventional hybridization between local moments and itinerant electrons, a staggered hybridization is proposed to characterize the interplay between the antiferromagnetism and the Kondo screening effect. As a result, a heavy fermion antiferromagnetic phase is obtained and separated from the pure antiferromagnetic ordered phase by a first-order Lifshitz phase transition, while a continuous phase transition exists between the heavy fermion antiferromagnetic phase and the Kondo paramagnetic phase. We have developed an efficient theory to calculate these phase boundaries. As $n_{c}$ decreases from half-filling, the region of the heavy fermion antiferromagnetic phase shrinks and finally disappears at a critical point $n_{c}^{*}=0.8228$, leaving a first-order critical line between the pure antiferromagnetic phase and the Kondo paramagnetic phase for $n_{c}<n_{c}^{*}$. In the half-filling limit, a finite temperature phase diagram is also determined on the Kondo coupling and temperature ($J$-$T$) plane. Notably, as the temperature is increased, the region of the heavy fermion antiferromagnetic phase is reduced continuously, and finally converges to a single point, together with the pure antiferromagnetic phase and the Kondo paramagnetic phase. The phase diagrams with such a triple point may account for the observed phase transitions in related heavy fermion materials.

preprint, 2014, arXiv

Linearized Alternating Direction Method with Parallel Splitting and Adaptive Penalty for Separable Convex Programs in Machine Learning

Many problems in machine learning and other fields can be (re)formulated as linearly constrained separable convex programs. In most of the cases, there are multiple blocks of variables. However, the traditional alternating direction method (ADM) and its linearized version (LADM, obtained by linearizing the quadratic penalty term) are for the two-block case and cannot be naively generalized to solve the multi-block case. So there is great demand for extending the ADM based methods to the multi-block case. In this paper, we propose LADM with parallel splitting and adaptive penalty (LADMPSAP) to solve multi-block separable convex programs efficiently. When all the component objective functions have bounded subgradients, we obtain convergence results that are stronger than those of ADM and LADM, e.g., allowing the penalty parameter to be unbounded and proving the sufficient and necessary conditions for global convergence. We further propose a simple optimality measure and reveal the convergence rate of LADMPSAP in an ergodic sense. For programs with extra convex set constraints, with refined parameter estimation we devise a practical version of LADMPSAP for faster convergence. Finally, we generalize LADMPSAP to handle programs with more difficult objective functions by linearizing part of the objective function as well. LADMPSAP is particularly suitable for sparse representation and low-rank recovery problems because its subproblems have closed form solutions and the sparsity and low-rankness of the iterates can be preserved during the iteration. It is also highly parallelizable and hence well suited to parallel or distributed computing. Numerical experiments testify to the advantages of LADMPSAP in speed and numerical accuracy.

preprint · 2014 · arXiv

Optomechanical photon shuttling between photonic cavities

Mechanical motion of photonic devices driven by optical forces provides a profound means of coupling between optical fields. The current focus of these optomechanical effects has been on cavity optomechanics systems in which co-localized optical and mechanical modes interact strongly to enable wave-mixing between photons and phonons and backaction cooling of mechanical modes. Alternatively, extended mechanical modes can also induce strong nonlocal effects on propagating optical fields or multiple localized optical modes at distances. Here, we demonstrate a novel multi-cavity optomechanical device: a "photon see-saw", in which torsional optomechanical motion can shuttle photons between two photonic crystal nanocavities. The resonance frequencies of the two cavities, one on each side of the see-saw, are modulated anti-symmetrically by the device's rotation. Pumping photons into one cavity excites optomechanical self-oscillation, which strongly modulates the inter-cavity coupling and shuttles photons to the other, empty cavity during every oscillation cycle in a well-regulated fashion.

preprint · 2012 · arXiv

D-wave superconductivity induced by short-range antiferromagnetic correlations in the two-dimensional Kondo lattice model

Possible heavy fermion superconductivity is carefully reexamined in the two-dimensional Kondo lattice model with an antiferromagnetic Heisenberg superexchange between local magnetic moments. In order to establish an effective mean-field theory in the limit of the paramagnetic heavy Fermi liquid and near the half-filling case, we find that the spinon singlet pairing from the local antiferromagnetic short-range correlations can reduce the ground state energy substantially. In the presence of the Kondo screening effect, Cooper pairing between the conduction electrons is induced. Depending on the ratio of the Heisenberg and the Kondo exchange couplings, the resulting superconducting state is characterized by either a d-wave nodal or a d-wave nodeless state, and a continuous phase transition exists between these two states. These results are related to some quasi-two-dimensional heavy fermion superconductors.

preprint · 2012 · arXiv

Flexible and tunable silicon photonic circuits on plastic substrates

Flexible microelectronics has shown tremendous promise in a broad spectrum of applications, especially those that cannot be addressed by conventional microelectronics in rigid materials and constructions [1-3]. These unconventional yet important applications range from flexible consumer electronics to conformal sensor arrays and biomedical devices. A recent successful paradigm shift in implementing flexible electronics is to physically transfer and bond highly integrated devices made in high-quality, crystalline semiconductor materials onto plastic materials [4-8]. Here we demonstrate a flexible form of silicon photonics on plastic substrates using the transfer-and-bond fabrication method. Photonic circuits including interferometers and resonators have been transferred onto flexible plastic substrates with preserved functionalities and performance. By mechanically deforming the flexible substrates, the optical characteristics of the devices can be tuned reversibly over a remarkably large range. The demonstration of the new flexible photonic system based on the silicon-on-plastic (SOP) material platform could open the door to a plethora of novel applications, including tunable photonics, optomechanical sensors, and bio-mechanical and bio-photonic probes.

preprint · 2012 · arXiv

Optical absorption in graphene integrated on silicon waveguides

To fully utilize graphene's remarkable optical properties for optoelectronic applications, it needs to be integrated in planar photonic systems. Here, we demonstrate integration of graphene on silicon photonic circuits and precise measurement of the optical absorption coefficient in a graphene/waveguide hybrid structure. A method based on Mach-Zehnder interferometry is employed to achieve high measurement precision and consistency, yielding a maximal absorption coefficient of 0.2 dB/μm when graphene is located directly on top of the waveguide. The results agree with a theoretical model utilizing the universal ac conductivity of graphene. Our work provides an important guide for the design and optimization of integrated graphene optoelectronic devices.

preprint · 2012 · arXiv

The fundamental Diagram of Pedestrian Model with Slow Reaction

Slow-to-start models are classical cellular automata models for simulating vehicle traffic. However, to our knowledge, the slow-to-start effect has not been considered in modeling pedestrian dynamics. We verify the similar behavior of pedestrians and vehicles, and propose a new lattice gas (LG) model, called the slow reaction (SR) model, to describe pedestrians' delayed reaction in single-file movement. We simulate and reproduce Seyfried's field experiments at the research centre Jülich, and use their empirical data to validate our SR model. We compare the SR model with the standard LG model. We test different values of the slow-reaction probability $p_s$ in the SR model and find that the simulation data for $p_s=0.3$ fit the empirical data best. The RMS error of the mean velocity in the SR model is smaller than that in the standard LG model. In the range $p_s=0.1$ to $0.3$, our simulated fundamental diagram between velocity and density coincides with the field experiments. The distribution of individual velocities in the fundamental diagram of the SR model agrees with the empirical data better than that of the standard LG model. In addition, we observe stop-and-go waves and phase separation in pedestrian flow in simulation. The SR model reproduces the uneven distribution of interspaces, which the standard LG model does not. The SR model can reproduce the evolution of spatio-temporal structures of pedestrian flow with higher fidelity to Seyfried's experiments than the standard LG model.

27 published item(s)

Forcing-KV: Hybrid KV Cache Compression for Efficient Autoregressive Video Diffusion Models

The grip of grammar on meaning uncertainty: cross-linguistic evidence, neural correlates, and clinical relevance

Bridging the Gap: Commonality and Differences between Online and Offline COVID-19 Data

Long-range transport of 2D excitons with acoustic waves

On Regularity Lemma and Barriers in Streaming and Dynamic Matching

Sublinear Algorithms for Hierarchical Clustering

Variance Reduced EXTRA and DIGing and Their Optimal Acceleration for Strongly Convex Decentralized Optimization

AIBench: An Agile Domain-specific Benchmarking Methodology and an AI Benchmark Suite

An Efficient PTAS for Stochastic Load Balancing with Poisson Jobs

Decentralized Accelerated Gradient Methods With Increasing Penalty Parameters

Revisiting EXTRA for Smooth Distributed Optimization

Structural transition, metallization and superconductivity in quasi 2D layered PdS$_2$ under compression

Valence transition in topological Kondo insulator

Optomechanical measurement of photon spin angular momentum and optical torque in integrated photonic devices

Phase diagram of Kondo-Heisenberg model on honeycomb lattice with geometrical frustration

Strategies for Searching Video Content with Text Queries or Video Examples

Acousto-optic modulation of a photonic crystal nanocavity with Lamb waves in microwave K band

Anomalous behavior of trapping in extended dendrimers with a perfect trap

Fast Proximal Linearized Alternating Direction Method of Multiplier with Parallel Splitting

Nanophotonic cavity optomechanics with propagating phonons in microwave Ku band

Phase evolution of the two-dimensional Kondo lattice model near half-filling

Linearized Alternating Direction Method with Parallel Splitting and Adaptive Penalty for Separable Convex Programs in Machine Learning

Optomechanical photon shuttling between photonic cavities

D-wave superconductivity induced by short-range antiferromagnetic correlations in the two-dimensional Kondo lattice model

Flexible and tunable silicon photonic circuits on plastic substrates

Optical absorption in graphene integrated on silicon waveguides

The fundamental Diagram of Pedestrian Model with Slow Reaction