Source author record

Jinming Liu

Jinming Liu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Computer Vision eess.IV physics.app-ph physics.med-ph quant-ph Computation and Language cond-mat.mes-hall cond-mat.mtrl-sci physics.atom-ph

Catalog footprint

What is connected

11 works

9 topics

4 close collaborators

preprint · 2026 · arXiv

An Efficient Streaming Video Understanding Framework with Agentic Control

Streaming video requires handling dynamic information density under strict latency budgets. Yet, existing methods typically employ static strategies, such as fixed memory compression or reliance on a single model, forcing a trade-off: fast models fail on complex queries, while always-on heavy models violate real-time constraints and overcomplicate simple queries. Rather than fixing these decisions upfront, we propose R3-Streaming (Remember, Respond, Reason), which formulates streaming video understanding as a cascaded control problem: for each query, the system compresses memory, judges response readiness, and routes computation sequentially, so that each downstream decision builds on progressively refined information states. To optimize this pipeline, we introduce an age-aware forgetting policy for memory compression, as aggressively compressing historical frames can yield substantial performance gains. For compute routing, we propose TB-GRPO, a target-balanced reinforcement learning objective that routes hard queries to a stronger model while preventing mode collapse. Extensive evaluations demonstrate that R3-Streaming achieves state-of-the-art results among streaming MLLMs, reaching 57.92 on OVO-Bench and 76.36 on StreamingBench, while reducing visual token usage by 95 to 96 percent.

preprint · 2026 · arXiv

Generation Navigator: A State-Aware Agentic Framework for Image Generation

Despite rapid advances in text-to-image generation, faithfully realizing user intent remains challenging, often requiring manual multi-turn trial and error. To automate this process, existing systems rely on either simple prompt rewriting or closed-loop agents driven by hand-crafted rules, rather than learning to adapt actions to the evolving generation process. In this paper, we reformulate image generation as a state-conditioned action-making problem and propose Generation Navigator, a multi-turn T2I agent that learns to dynamically steer the generation trajectory and output the next action. However, training this agent via reinforcement learning introduces a critical credit assignment challenge: naively rewarding a trajectory based solely on a single state assigns equal credit to all actions in the rollout, ignores the quality dynamics across turns, and fails to distinguish actions that improve the trajectory from those that degrade it or waste turns without progress. We resolve this with PRE-GRPO (Peak-Retention-Efficiency Group Relative Policy Optimization), a trajectory-level reinforcement learning objective that explicitly rewards discovering a high-quality image (Peak), avoiding subsequent quality degradation across turns (Retention), and minimizing unnecessary turns (Efficiency). Experiments show substantial improvements across benchmarks, reaching a WISE score of 0.90 and 79.06% reasoning accuracy on T2I-ReasonBench.
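
The abstract names the three reward terms but gives no formula, so the following is only a minimal sketch of how a peak, retention and efficiency reward could be combined and then normalized within a rollout group. The function names, the weights `w_peak`, `w_ret`, `w_eff`, and the exact form of each term are assumptions made for illustration, not the paper's definition. The group normalization follows the usual GRPO pattern of subtracting the group mean and dividing by the group standard deviation.

```python
from statistics import mean, pstdev


def pre_reward(scores, w_peak=1.0, w_ret=0.5, w_eff=0.1):
    """Hypothetical trajectory reward built from per-turn quality scores.

    scores: quality of the image produced at each turn (higher is better).
    The three terms and their weights are illustrative assumptions.
    """
    peak = max(scores)                        # Peak: best image found in the rollout
    retention = scores[-1] - peak             # Retention: <= 0, penalizes ending below the peak
    first_peak_turn = scores.index(peak)
    wasted_turns = len(scores) - 1 - first_peak_turn  # Efficiency: turns spent after the peak
    return w_peak * peak + w_ret * retention - w_eff * wasted_turns


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of rollouts (GRPO-style)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Two rollouts for the same prompt: the second reaches the same peak with no wasted turn.
rollouts = [[0.2, 0.8, 0.8], [0.2, 0.8]]
rewards = [pre_reward(s) for s in rollouts]
advantages = group_relative_advantages(rewards)
```

Under these assumed weights, a rollout that stops as soon as it reaches its best image scores higher than one that spends an extra turn without improving, which is the behavior the Efficiency term is described as encouraging.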

preprint · 2026 · arXiv

Speak While Watching: Unleashing TRUE Real-Time Video Understanding Capability of Multimodal Large Language Models

Multimodal Large Language Models (MLLMs) have achieved strong performance across many tasks, yet most systems remain limited to offline inference, requiring complete inputs before generating outputs. Recent streaming methods reduce latency by interleaving perception and generation, but still enforce a sequential perception-generation cycle, limiting real-time interaction. In this work, we target a fundamental bottleneck that arises when extending MLLMs to real-time video understanding: the global positional continuity constraint imposed by standard positional encoding schemes. While natural in offline inference, this constraint tightly couples perception and generation, preventing effective input-output parallelism. To address this limitation, we propose a parallel streaming framework that relaxes positional continuity through three designs: Overlapped, Group-Decoupled, and Gap-Isolated. These designs enable simultaneous perception and generation, allowing the model to process incoming inputs while producing responses in real time. Extensive experiments reveal that Group-Decoupled achieves the best efficiency-performance balance, maintaining high fluency and accuracy while significantly reducing latency. We further show that the proposed framework yields up to 2x acceleration under balanced perception-generation workloads, establishing a principled pathway toward speak-while-watching real-time systems. We make all our code publicly available: https://github.com/EIT-NLP/Speak-While-Watching.

preprint · 2022 · arXiv

Element Doping Enhanced Charge-to-Spin Conversion Efficiency in Amorphous PtSn4 Dirac Semimetal

Topological semimetals (TSs) are promising candidates for low-power spin-orbit torque (SOT) devices due to their large charge-to-spin conversion efficiency. Here, we investigated the charge-to-spin conversion efficiency of amorphous PtSn4 (5 nm)/CoFeB (2.5-12.5 nm) layered structures prepared by magnetron sputtering at room temperature. The charge-to-spin ratio of the PtSn4/CoFeB bilayers, characterized by spin-torque ferromagnetic resonance (ST-FMR), was 0.08. This ratio can be further increased to 0.14 by introducing dopants, such as Al and CoSi, into PtSn4. The dopants can also decrease (Al doping) or increase (CoSi doping) the resistivity of PtSn4. This work proposes a way to enhance the spin-orbit coupling (SOC) in amorphous TSs with dopants.

preprint · 2022 · arXiv

Learned Lossless Image Compression With Combined Autoregressive Models And Attention Modules

Lossless image compression is an essential research field in image compression. Recently, learning-based image compression methods have achieved impressive performance compared with traditional lossless methods such as WebP, JPEG2000, and FLIF. However, many effective lossy compression methods can still be applied to lossless compression. Therefore, in this paper, we explore methods widely used in lossy compression and apply them to lossless compression. Inspired by the impressive performance of the Gaussian mixture model (GMM) in lossy compression, we build a lossless network architecture with a GMM. In addition, motivated by the success of attention modules and autoregressive models, we use attention modules and add an extra autoregressive model for raw images in our network architecture to boost performance. Experimental results show that our approach outperforms most classical lossless compression methods and existing learning-based methods.
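
To make the role of a GMM in lossless coding concrete, here is a minimal sketch of a discretized Gaussian mixture used as the probability model for an 8-bit pixel. It is a generic illustration, not the paper's network: in a learned codec the mixture weights, means and scales would be predicted per pixel, whereas here they are hard-coded assumptions, and the function names and the `floor` clamp are hypothetical. The ideal code length of a symbol under an entropy coder is its negative log2 probability.

```python
import math


def gaussian_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def mixture_cdf(x, weights, means, sigmas):
    """CDF of a Gaussian mixture; weights are assumed to sum to 1."""
    return sum(w * gaussian_cdf(x, m, s) for w, m, s in zip(weights, means, sigmas))


def gmm_pmf(value, weights, means, sigmas, lo=0, hi=255):
    """Probability of an integer pixel value under the discretized mixture.

    Each value owns the bin [value - 0.5, value + 0.5]; the two edge bins
    absorb the tails so that the probabilities over lo..hi sum to 1.
    """
    upper = 1.0 if value >= hi else mixture_cdf(value + 0.5, weights, means, sigmas)
    lower = 0.0 if value <= lo else mixture_cdf(value - 0.5, weights, means, sigmas)
    return upper - lower


def code_length_bits(value, weights, means, sigmas, floor=1e-12):
    """Ideal code length in bits; the floor (an assumption) avoids log2(0)."""
    return -math.log2(max(gmm_pmf(value, weights, means, sigmas), floor))


# Illustrative parameters only.
weights, means, sigmas = [0.6, 0.4], [100.0, 180.0], [10.0, 20.0]
bits_likely = code_length_bits(100, weights, means, sigmas)
bits_unlikely = code_length_bits(250, weights, means, sigmas)
```

A pixel value near a mixture mean costs fewer bits than one far in the tail, so a sharper and better-centered predicted mixture yields a smaller file.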

preprint · 2022 · arXiv

Memory-Efficient Learned Image Compression with Pruned Hyperprior Module

Learned Image Compression (LIC) has become increasingly popular in recent years. Hyperprior-module-based LIC models have achieved remarkable rate-distortion performance. However, the memory cost of these LIC models is too large to deploy them on many devices, especially portable or edge devices. The parameter scale is directly linked with memory cost. In our research, we found that the hyperprior module is not only highly over-parameterized, but its latent representation also contains redundant information. Therefore, we propose a novel pruning method named ERHP to efficiently reduce the memory cost of the hyperprior module while improving network performance. The experiments show our method is effective, reducing the parameters of the whole model by at least 22.6% while achieving better rate-distortion performance.

preprint · 2022 · arXiv

Room temperature spin-orbit torque efficiency in sputtered low-temperature superconductor delta-TaN

In the course of searching for promising topological materials for applications in future topological electronics, we evaluated spin-orbit torques (SOTs) in high-quality sputtered $δ$-TaN/Co20Fe60B20 devices through spin-torque ferromagnetic resonance (ST-FMR) and spin pumping measurements. From the ST-FMR characterization we observed a significant linewidth modulation in the magnetic Co20Fe60B20 layer attributed to the charge-to-spin conversion generated from the $δ$-TaN layer. Remarkably, the spin-torque efficiency determined from ST-FMR and spin pumping measurements is as large as $Θ = 0.034$ and $0.031$, respectively. These values are over two times larger than for $α$-Ta, but almost five times lower than for $β$-Ta, which can be attributed to the low room temperature electrical resistivity of $\sim 74$ μΩ cm in $δ$-TaN. A large spin diffusion length of at least $\sim 8$ nm is estimated, which is comparable to the spin diffusion length in pure Ta. Comprehensive experimental analysis, together with density functional theory calculations, indicates that the origin of the pronounced SOT effect in $δ$-TaN can be mostly related to a significant contribution from the Berry curvature associated with the presence of a topologically nontrivial electronic band structure in the vicinity of the Fermi level (EF). Through additional detailed theoretical analysis, we also found that an isostructural allotrope of the superconducting $δ$-TaN phase, the simple hexagonal structure $θ$-TaN, has larger Berry curvature, and that, together with its expected reasonable charge conductivity, it can also be a promising candidate for exploring a generation of spin-orbit torque magnetic random access memory as a cheap, temperature-stable, and highly efficient spin current source.

preprint · 2020 · arXiv

Magnetic Particle Spectroscopy: A Short Review of Applications

Magnetic particle spectroscopy (MPS), also called magnetization response spectroscopy, is a novel measurement tool derived from magnetic particle imaging (MPI). It can be interpreted as a zero-dimensional version of an MPI scanner. MPS was primarily designed for characterizing superparamagnetic iron oxide nanoparticles (SPIONs) regarding their applicability for MPI. In recent years, it has evolved into an independent, versatile, highly sensitive, inexpensive platform for biological and biomedical assays, cell labeling and tracking, and blood analysis. MPS has also developed into an auxiliary tool for magnetic imaging and hyperthermia by providing high-resolution spatial and temporal mappings of temperature and viscosity. Furthermore, other MPS-based applications are being explored, such as magnetic fingerprints for product tracking and identification in supply chains. A variety of novel MPS-based applications are being reported and demonstrated by many groups. In this short review, we highlight some of the representative applications based on the MPS platform, thereby providing a roadmap of this technology and our insights for researchers in this area.

preprint · 2020 · arXiv

Scalability and high-efficiency of an $(n+1)$-qubit Toffoli gate sphere via blockaded Rydberg atoms

The Toffoli gate, a basic building block for reversible quantum computation, has shown great potential for improving the error-tolerance rate in quantum communication. However, the current route to creating a Toffoli gate requires implementing sequential single- and two-qubit gates, and is therefore limited by longer operation times and lower average fidelity. We develop a new theoretical protocol to construct a universal $(n+1)$-qubit Toffoli gate sphere based on the Rydberg blockade mechanism, by constraining the behavior of one central target atom with $n$ surrounding control atoms. Its merit lies in the use of only five $π$ pulses, independent of the control atom number $n$, which leads to an overall gate time as fast as $\sim 125$ ns and an average fidelity approaching 0.999. The maximal number of control atoms can be up to $n=46$, determined by the spherical diameter, which is equal to the blockade radius, as well as by the nearest-neighbor spacing between two trapped-atom lattices. Taking $n=2,3,4$ as examples, we compare the gate performance with experimentally accessible parameters and confirm that the gate errors are mainly attributable to the imperfect blockade strength, spontaneous atomic loss, and imperfect ground-state preparation. In contrast to a one-dimensional-array configuration, the spherical atomic sample remarkably preserves a high-fidelity output as $n$ increases, shedding light on the study of scalable quantum simulation and entanglement with multiple neutral atoms.
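
For readers unfamiliar with the gate itself, the logical action of an $(n+1)$-qubit Toffoli gate on computational basis states can be sketched in a few lines: the target bit is flipped only when all $n$ control bits are 1. This illustrates the gate's truth table only and says nothing about the Rydberg pulse sequence; the function names are hypothetical.

```python
from itertools import product


def toffoli(bits):
    """Apply an (n+1)-qubit Toffoli gate to a computational basis state.

    bits: tuple of 0/1 values; the last entry is the target qubit and all
    preceding entries are controls.
    """
    *controls, target = bits
    if all(controls):        # every control is 1, so flip the target
        target ^= 1
    return (*controls, target)


def truth_table(n):
    """Map each of the 2**(n+1) basis states to its image under the gate."""
    return {state: toffoli(state) for state in product((0, 1), repeat=n + 1)}


# n = 2 controls gives the standard three-qubit Toffoli (CCNOT) gate.
table = truth_table(2)
changed = [state for state, image in table.items() if state != image]
```

Regardless of $n$, only the two basis states with all controls set to 1 are exchanged, and applying the gate twice returns every state to itself.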

preprint · 2020 · arXiv

Single-core Or Multi-core? A Mini Review on Magnetic Nanoparticles for Magnetic Particle Spectroscopy-based Bioassays

Magnetic particle spectroscopy (MPS) is a technology that derives from magnetic particle imaging (MPI) and thrives as a standalone platform for many biological and biomedical applications, benefiting from the facile preparation and chemical modification of magnetic nanoparticles (MNPs). In recent years, MPS has been reported extensively in the literature as a versatile platform for different bioassay purposes using artificially designed MNPs, where the MNPs serve as magnetic tracers and the surface-functionalized reagents (e.g., antibodies, aptamers, peptides) serve as tiny probes capturing target analytes from biofluid samples. The biochemical complexes on MNP surfaces can be tailored for different bioassay requirements, while the design of the MNPs has received less attention for MPS-based bioassays. For MNPs in most bioassay applications, superparamagnetism is a prerequisite to avoid agglomerates and false magnetic signals. Single- and multi-core superparamagnetic nanoparticles (SPMNPs) are prevalently used in MPS-based bioassays. In this mini review, we compare the pros and cons of different MPS platforms realizing volumetric- and surface-based bioassays with single- and multi-core nanoparticles, respectively.

preprint · 2009 · arXiv

Optical rotation of heavy hole spins by non-Abelian geometrical means

A non-Abelian geometric method is proposed for rotating heavy hole spins in a singly positively charged quantum dot in the Voigt geometry. The key ingredient is the delay-dependent non-Abelian geometric phase, which is produced by the nonadiabatic transition between the two degenerate dark states. We demonstrate that, by controlling the pump, Stokes and driving fields, rotations about the $y$- and $z$-axes by arbitrary angles can be realized with high fidelity. Fast initialization and readout of the heavy hole spin state are also possible.

Institution

Affiliation not imported yet

This author record came from a source that does not expose affiliation metadata. Once the author claims the profile or we enrich the record from another provider, this section will link to the concrete institution.

Topic footprint