Researcher profile

Yue Feng

Yue Feng contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
10 works
0 followers
10 topics
4 close collaborators

Published work

10 published items

preprint | 2026 | arXiv

Hi-GaTA: Hierarchical Gated Temporal Aggregation Adapter for Surgical Video Report Generation

Automated, clinician-grade assessment reports for surgical procedures could reduce documentation burden and provide objective feedback, yet remain challenging due to the difficulty of aligning dense spatio-temporal video representations with language-based reasoning and the scarcity of high-quality, privacy-preserving datasets. To address this gap, we establish a benchmark comprising 214 high-quality simulated surgical videos paired with surgeon-authored evaluation reports. Building on this resource, we propose a Perception-Alignment-Reasoning framework for surgical video report generation, featuring Hi-GaTA, a novel lightweight temporal adapter that efficiently compresses long video sequences into compact, LLM-compatible visual prefix tokens through short-to-long-range temporal aggregation. For robust visual perception, we pretrain Sur40k, a surgical-specific ViViT-style video encoder on 40,000 minutes of public surgical videos to capture fine-grained spatio-temporal procedural priors. Hi-GaTA employs a temporal pyramid with text-conditioned dual cross-attention, and improves multi-scale consistency through cross-level gated fusion and an increasing-depth strategy. Finally, we fine-tune the LLM backbone using LoRA to enable coherent and stylistically consistent surgical report generation under limited supervision. Experiments show our approach achieves the best overall performance, with consistent gains over strong Multimodal Large Language Model (MLLM) baselines. Ablation studies further validate the effectiveness of each proposed component.

preprint | 2024 | arXiv

Augmented Subspace Scheme for Eigenvalue Problem by Weak Galerkin Finite Element Method

This study proposes a class of augmented subspace schemes for the weak Galerkin (WG) finite element method used to solve eigenvalue problems. The augmented subspace is built with the conforming linear finite element space defined on the coarse mesh and the eigenfunction approximations in the WG finite element space defined on the fine mesh. Based on this augmented subspace, solving the eigenvalue problem in the fine WG finite element space can be reduced to the solution of the linear boundary value problem in the same WG finite element space and a low-dimensional eigenvalue problem in the augmented subspace. The proposed augmented subspace techniques have a second-order convergence rate with respect to the coarse mesh size, as demonstrated by the accompanying error estimates. Finally, a few numerical examples are provided to validate the proposed numerical techniques.

preprint | 2022 | arXiv

ASSIST: Towards Label Noise-Robust Dialogue State Tracking

The MultiWOZ 2.0 dataset has greatly boosted the research on dialogue state tracking (DST). However, substantial noise has been discovered in its state annotations. Such noise brings about huge challenges for training DST models robustly. Although several refined versions, including MultiWOZ 2.1-2.4, have been published recently, there are still lots of noisy labels, especially in the training set. Besides, it is costly to rectify all the problematic annotations. In this paper, instead of improving the annotation quality further, we propose a general framework, named ASSIST (lAbel noiSe-robuSt dIalogue State Tracking), to train DST models robustly from noisy labels. ASSIST first generates pseudo labels for each sample in the training set by using an auxiliary model trained on a small clean dataset, then puts the generated pseudo labels and vanilla noisy labels together to train the primary model. We show the validity of ASSIST theoretically. Experimental results also demonstrate that ASSIST improves the joint goal accuracy of DST by up to $28.16\%$ on MultiWOZ 2.0 and $8.41\%$ on MultiWOZ 2.4, compared to using only the vanilla noisy labels.
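The idea of training on pseudo labels and noisy labels together can be illustrated with a small loss function. This is a minimal sketch under stated assumptions, not the paper's implementation: the convex combination of the two label sources, the weight `alpha`, and the function name `combined_label_loss` are all hypothetical choices made for illustration.

```python
import numpy as np

def combined_label_loss(logits, noisy_idx, pseudo_idx, alpha=0.5):
    """Cross-entropy against a mix of pseudo and noisy labels (illustrative).

    logits:     (n_samples, n_classes) unnormalized scores of the primary model
    noisy_idx:  class indices from the original (possibly noisy) annotations
    pseudo_idx: class indices predicted by an auxiliary model
    alpha:      assumed weight on the pseudo labels, between 0 and 1
    """
    logits = np.asarray(logits, dtype=float)
    # Numerically stable log-softmax over the class dimension.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

    # Build the mixed target distribution from the two one-hot label sets.
    eye = np.eye(logits.shape[1])
    target = alpha * eye[np.asarray(pseudo_idx)] + (1.0 - alpha) * eye[np.asarray(noisy_idx)]

    # Mean cross-entropy between the mixed targets and the model distribution.
    return float(-(target * log_probs).sum(axis=1).mean())
```

With `alpha=0` this reduces to ordinary training on the noisy labels, and with `alpha=1` it trusts the auxiliary model entirely; intermediate values hedge between the two sources.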

preprint | 2022 | arXiv

Dynamic Schema Graph Fusion Network for Multi-Domain Dialogue State Tracking

Dialogue State Tracking (DST) aims to keep track of users' intentions during the course of a conversation. In DST, modelling the relations among domains and slots is still an under-studied problem. Existing approaches that have considered such relations generally fall short in: (1) fusing prior slot-domain membership relations and dialogue-aware dynamic slot relations explicitly, and (2) generalizing to unseen domains. To address these issues, we propose a novel Dynamic Schema Graph Fusion Network (DSGFNet), which generates a dynamic schema graph to explicitly fuse the prior slot-domain membership relations and dialogue-aware dynamic slot relations. It also uses the schemata to facilitate knowledge transfer to new domains. DSGFNet consists of a dialogue utterance encoder, a schema graph encoder, a dialogue-aware schema graph evolving network, and a schema graph enhanced dialogue state decoder. Empirical results on benchmark datasets (i.e., SGD, MultiWOZ2.1, and MultiWOZ2.2) show that DSGFNet outperforms existing methods.

preprint | 2022 | arXiv

Enabling Massage Actions: An Interactive Parallel Robot with Compliant Joints

We propose a parallel massage robot with compliant joints based on the series elastic actuator (SEA), offering a unified force-position control approach. First, the kinematic and static force models are established to obtain the corresponding control variables. Then, a novel force-position control strategy is proposed to separately control the force and position along the normal direction of the surface and the displacement along the other two directions, without requiring a robot dynamics model. To evaluate its performance, we carry out a series of robotic massage experiments. The results demonstrate that the proposed massage manipulator achieves the desired forces and motion patterns of massage tasks and receives high user-experience scores.

preprint | 2022 | arXiv

Improved uniform error bounds of the time-splitting methods for the long-time (nonlinear) Schrödinger equation

We establish improved uniform error bounds for the time-splitting methods for the long-time dynamics of the Schrödinger equation with small potential and the nonlinear Schrödinger equation (NLSE) with weak nonlinearity. For the Schrödinger equation with small potential characterized by a dimensionless parameter $\varepsilon \in (0, 1]$ representing the amplitude of the potential, we employ the unitary flow property of the (second-order) time-splitting Fourier pseudospectral (TSFP) method in $L^2$-norm to prove a uniform error bound at $C(T)(h^m +τ^2)$ up to the long time $T_\varepsilon= T/\varepsilon$ for any $T>0$ and uniformly for $0<\varepsilon\le1$, where $h$ is the mesh size, $τ$ is the time step, $m \ge 2$ depends on the regularity of the exact solution, and $C(T) =C_0+C_1T$ grows at most linearly with respect to $T$ with $C_0$ and $C_1$ two positive constants independent of $T$, $\varepsilon$, $h$ and $τ$. Then, by introducing a new technique of regularity compensation oscillation (RCO), in which the high frequency modes are controlled by regularity and the low frequency modes are analyzed by phase cancellation and energy method, an improved uniform error bound at $O(h^{m-1} + \varepsilon τ^2)$ is established in $H^1$-norm for the long-time dynamics up to the time at $O(1/\varepsilon)$ of the Schrödinger equation with $O(\varepsilon)$-potential with $m \geq 3$, uniformly for $\varepsilon\in(0,1]$. Moreover, the RCO technique is extended to prove an improved uniform error bound at $O(h^{m-1} + \varepsilon^2τ^2)$ in $H^1$-norm for the long-time dynamics up to the time at $O(1/\varepsilon^2)$ of the cubic NLSE with $O(\varepsilon^2)$-nonlinearity strength, uniformly for $\varepsilon \in (0, 1]$. Extensions to the first-order and fourth-order time-splitting methods are discussed.
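The unitary flow property of the second-order time-splitting Fourier pseudospectral (TSFP) method can be seen in a few lines of NumPy. This is a minimal sketch, assuming a 1D periodic model problem $i\partial_t\psi = -\partial_{xx}\psi + \varepsilon V(x)\psi$ on $[0, 2\pi)$; the exact form of the equation, the potential $V(x) = \cos x$, the initial data, and the names `tsfp_step` and `run_demo` are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def tsfp_step(psi, k, V, eps, tau):
    """One Strang-splitting step for i psi_t = -psi_xx + eps * V(x) * psi (assumed model)."""
    # Half step of the potential part, solved exactly pointwise in physical space.
    half_potential = np.exp(-0.5j * eps * V * tau)
    psi = half_potential * psi
    # Full step of the kinetic part, solved exactly mode by mode in Fourier space.
    psi = np.fft.ifft(np.exp(-1j * k**2 * tau) * np.fft.fft(psi))
    # Second half step of the potential part.
    return half_potential * psi

def run_demo(eps=0.1, n=128, tau=0.01, T=1.0):
    """Integrate up to the long time T / eps and return the initial and final discrete L2 norms."""
    length = 2.0 * np.pi
    h = length / n                                 # mesh size
    x = np.arange(n) * h                           # periodic grid on [0, 2*pi)
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=h)       # integer wave numbers
    V = np.cos(x)                                  # assumed smooth potential

    psi = (1.0 + 0.5 * np.cos(x)).astype(complex)  # assumed initial data
    psi /= np.sqrt(h * np.sum(np.abs(psi) ** 2))   # normalize in the discrete L2 norm
    norm_start = np.sqrt(h * np.sum(np.abs(psi) ** 2))

    steps = int(round(T / eps / tau))              # long-time horizon T_eps = T / eps
    for _ in range(steps):
        psi = tsfp_step(psi, k, V, eps, tau)

    norm_end = np.sqrt(h * np.sum(np.abs(psi) ** 2))
    return norm_start, norm_end
```

Each substep multiplies by a factor of modulus one, either in physical or in Fourier space, so the discrete $L^2$ norm is preserved up to rounding error over the whole long-time interval.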

preprint | 2022 | arXiv

Learning to Execute Actions or Ask Clarification Questions

Collaborative tasks are ubiquitous activities where a form of communication is required in order to reach a joint goal. Collaborative building is one such task. We wish to develop an intelligent builder agent in a simulated building environment (Minecraft) that can build whatever users wish to build by just talking to the agent. In order to achieve this goal, such agents need to be able to take the initiative by asking clarification questions when further information is needed. Existing work on the Minecraft Corpus Dataset only learns to execute instructions, neglecting the importance of asking for clarifications. In this paper, we extend the Minecraft Corpus Dataset by annotating all builder utterances into eight types, including clarification questions, and propose a new builder agent model capable of determining when to ask or execute instructions. Experimental results show that our model achieves state-of-the-art performance on the collaborative building task with a substantial improvement. We also define two new tasks, the learning to ask task and the joint learning task. The latter consists of solving both the collaborative building and learning to ask tasks jointly.

preprint | 2020 | arXiv

Distinct Topological Surface States on the Two Terminations of MnBi$_4$Te$_7$

The recently discovered intrinsic magnetic topological insulator MnBi2Te4 has been met with unusual success in hosting emergent phenomena such as the quantum anomalous Hall effect and the axion insulator states. However, the surface-bulk correspondence of the Mn-Bi-Te family, composed of the superlattice-like MnBi2Te4/(Bi2Te3)n (n = 0, 1, 2, 3 ...) layered structure, remains intriguing but elusive. Here, by using scanning tunneling microscopy (STM) and angle-resolved photoemission spectroscopy (ARPES) techniques, we unambiguously assign the two distinct surface states of MnBi4Te7 (n = 1) to the quintuple-layer (QL) Bi2Te3 termination and the septuple-layer (SL) MnBi2Te4 termination, respectively. A comparison of the experimental observations with theoretical calculations reveals the diverging topological behaviors, especially the hybridization effect between magnetic and nonmagnetic layers, on the two terminations: a gap on the QL termination originating from the topological surface states of the QL hybridizing with the bands of the SL beneath, and a gapless Dirac-cone band structure on the SL termination with time-reversal symmetry. The quasi-particle interference patterns further confirm the topological nature of the surface states for both terminations, continuing far above the Fermi energy. The QL termination carries a spin-helical Dirac state with hexagonal warping, while at the SL termination, a strongly canted helical state from the surface lies between a pair of Rashba-split states from its neighboring layer. Our work elucidates an unprecedented hybridization effect between the building blocks of the topological surface states, and also reveals the termination-dependent time-reversal symmetry breaking in a magnetic topological insulator, rendering an ideal platform to realize the half-integer quantum Hall effect and relevant quantum phenomena.

preprint | 2020 | arXiv

Long time error analysis of the fourth-order compact finite difference methods for the nonlinear Klein-Gordon equation with weak nonlinearity

We present the fourth-order compact finite difference (4cFD) discretizations for the long time dynamics of the nonlinear Klein-Gordon equation (NKGE), while the nonlinearity strength is characterized by $\varepsilon^p$ with a constant $p \in \mathbb{N}^+$ and a dimensionless parameter $\varepsilon \in (0, 1]$. Based on analytical results of the life-span of the solution, rigorous error bounds of the 4cFD methods are carried out up to the time at $O(\varepsilon^{-p})$. We pay particular attention to how error bounds depend explicitly on the mesh size $h$ and time step $τ$ as well as the small parameter $\varepsilon \in (0, 1]$, which indicate that, in order to obtain 'correct' numerical solutions up to the time at $O(\varepsilon^{-p})$, the $\varepsilon$-scalability (or meshing strategy requirement) of the 4cFD methods should be taken as: $h = O(\varepsilon^{p/4})$ and $τ= O(\varepsilon^{p/2})$. It has better spatial resolution capacity than the classical second order central difference methods. By a rescaling in time, it is equivalent to an oscillatory NKGE whose solution propagates waves with wavelength at $O(1)$ in space and $O(\varepsilon^p)$ in time. It is straightforward to get the error bounds of the oscillatory NKGE in the fixed time. Finally, numerical results are provided to confirm our theoretical analysis.
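The meshing strategy $h = O(\varepsilon^{p/4})$, $τ = O(\varepsilon^{p/2})$ can be turned into a small helper for picking discretization parameters as $\varepsilon$ shrinks. This is a minimal sketch: the function name `scaled_mesh` and the reference values `h0` and `tau0` (the mesh size and time step one would use at $\varepsilon = 1$) are hypothetical, since the abstract fixes only the exponents and not the constants.

```python
def scaled_mesh(eps, p, h0=0.1, tau0=0.01):
    """Scale mesh size and time step with eps following h ~ eps^(p/4), tau ~ eps^(p/2).

    h0 and tau0 are assumed reference values at eps = 1; they are not given in the source.
    """
    if not 0.0 < eps <= 1.0:
        raise ValueError("eps must lie in (0, 1]")
    if p < 1:
        raise ValueError("p must be a positive integer")
    h = h0 * eps ** (p / 4)      # spatial mesh size
    tau = tau0 * eps ** (p / 2)  # time step
    return h, tau
```

For example, with `p = 2`, reducing `eps` from 1 to 1/16 shrinks the mesh size by a factor of 4 and the time step by a factor of 16.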

preprint | 2020 | arXiv

Uniform error bounds of an exponential wave integrator for the long-time dynamics of the nonlinear Klein-Gordon equation

We establish uniform error bounds of an exponential wave integrator Fourier pseudospectral (EWI-FP) method for the long-time dynamics of the nonlinear Klein-Gordon equation (NKGE) with a cubic nonlinearity whose strength is characterized by $\varepsilon^2$ with $\varepsilon \in (0, 1]$ a dimensionless parameter. When $0 < \varepsilon \ll 1$, the problem is equivalent to the long-time dynamics of the NKGE with small initial data (and $O(1)$ cubic nonlinearity), while the amplitude of the initial data (and the solution) is at $O(\varepsilon)$. For the long-time dynamics of the NKGE up to the time at $O(1/\varepsilon^{2})$, the resolution and error bounds of the classical numerical methods depend significantly on the small parameter $\varepsilon$, which causes severe numerical burdens as $\varepsilon \to 0^+$. The EWI-FP method is fully explicit, symmetric in time and has many superior properties in solving wave equations. By adapting the energy method combined with the method of mathematical induction, we rigorously carry out the uniform error bounds of the EWI-FP discretization at $O(h^{m_0} + \varepsilon^{2-β}τ^2)$ up to the time at $O(1/\varepsilon^β)$ with $0 \leq β\leq 2$, mesh size $h$, time step $τ$ and $m_0$ an integer depending on the regularity of the solution. By a rescaling in time, our results are straightforwardly extended to the error bounds and $\varepsilon$-scalability (or meshing strategy requirement) of the EWI-FP method for an oscillatory NKGE, whose solution propagates waves with wavelength at $O(1)$ and $O(\varepsilon^β)$ in space and time, respectively, and wave speed at $O(\varepsilon^{-β})$. Finally, extensive numerical results are reported to confirm our error estimates.