Researcher profile

Dawei Li

Dawei Li contributes to research discovery and scholarly infrastructure.

Researcher; Affiliation not imported; Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging; Verification L1; Unclaimed author
8 works
0 followers
11 topics
4 close collaborators

Published work

8 published item(s)

preprint, 2026, arXiv

Cornerstones or Stumbling Blocks? Deciphering the Rock Tokens in On-Policy Distillation

While recent work in Reinforcement Learning with Verifiable Rewards (RLVR) has shown that a small subset of critical tokens disproportionately drives reasoning gains, an analogous token-level understanding of On-Policy Distillation (OPD) remains largely unexplored. In this work, we investigate high-loss tokens, a token type that, as the most direct signal of student-teacher mismatch under OPD's per-token KL objective, should progressively diminish as training converges according to existing studies; however, our empirical analysis shows otherwise. Even after OPD training reaches apparent saturation, a substantial subset of tokens continues to exhibit persistently high loss; these tokens, which we term Rock Tokens, can account for up to 18% of the tokens in generated outputs. Our investigation reveals two startling paradoxes. First, despite their high occurrence frequency providing a disproportionately large share of total gradient norms, Rock Tokens themselves remain stagnant throughout training, resisting teacher-driven corrections. Second, through causal intervention, we find that these tokens provide negligible functional contribution to the model's actual reasoning performance. These findings suggest that a vast amount of optimization bandwidth is spent on structural and discourse residuals that the student model cannot or need not internalize. By deconstructing these dynamics, we demonstrate that strategically bypassing these "stumbling blocks" can significantly streamline the alignment process, challenging the necessity of uniform token weighting and offering a more efficient paradigm for large-scale model distillation.
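To make the idea of a per-token KL loss and "persistently high-loss" tokens concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the KL direction (student relative to teacher), the fixed threshold, the two-checkpoint comparison, and the function names `per_token_kl` and `persistently_high` are all assumptions chosen for illustration.

```python
import math


def per_token_kl(student, teacher, eps=1e-12):
    """Return KL(student || teacher) for each token position.

    `student` and `teacher` are lists of probability vectors over the same
    vocabulary, one vector per generated token. The KL direction is an
    assumption; the abstract only says "per-token KL objective".
    """
    losses = []
    for p_s, p_t in zip(student, teacher):
        # Terms with zero student probability contribute nothing to the sum.
        kl = sum(ps * math.log(ps / max(pt, eps))
                 for ps, pt in zip(p_s, p_t) if ps > 0.0)
        losses.append(kl)
    return losses


def persistently_high(early_losses, late_losses, threshold):
    """Indices whose loss exceeds `threshold` at both checkpoints.

    Hypothetical criterion: the abstract does not give the exact rule
    used to label a token as a Rock Token.
    """
    return [i for i, (a, b) in enumerate(zip(early_losses, late_losses))
            if a > threshold and b > threshold]
```

Under these assumptions, a token whose loss stays above the threshold at both an early and a late checkpoint would be flagged, while a token whose loss drops as training proceeds would not.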

preprint, 2022, arXiv

C3KG: A Chinese Commonsense Conversation Knowledge Graph

Existing commonsense knowledge bases often organize tuples in an isolated manner, which is deficient for commonsense conversational models to plan the next steps. To fill the gap, we curate a large-scale multi-turn human-written conversation corpus, and create the first Chinese commonsense conversation knowledge graph which incorporates both social commonsense knowledge and dialog flow information. To show the potential of our graph, we develop a graph-conversation matching approach, and benchmark two graph-grounded conversational tasks.

preprint, 2022, arXiv

Robust Coordinated Longitudinal Control of MAV Based on Energy State

A fixed-wing Miniature Air Vehicle (MAV) not only has coupled longitudinal motion but is also more susceptible to wind disturbance because of its lighter weight, which makes its altitude and airspeed controller design more challenging. Therefore, in this paper, an improved longitudinal control strategy based on energy state is proposed to address these issues. The control strategy uses a Linear Extended State Observer (LESO) to observe the energy states and the disturbance of the MAV, and then designs a Multiple-Input Multiple-Output (MIMO) controller based on a more coordinated Total Energy Control (TEC) strategy to control the airspeed and altitude of the MAV. The performance of this control strategy has been successfully verified in a Model-in-the-Loop (MIL) simulation with Simulink, and a comparative test against the classical TEC algorithm is carried out.

preprint, 2021, arXiv

A prognostic dynamic model applicable to infectious diseases providing easily visualized guides -- A case study of COVID-19 in the UK

A reasonable prediction of how an infectious disease spreads under different control strategies is an important reference point for policy makers. Here we established a dynamic transmission model in Python that allows disease control measures to be adjusted comprehensively. We classified government interventions into three categories and introduced three parameters to describe the key points in disease control: intraregional growth rate, interregional communication rate, and detection rate of infectors. Our simulation predicts that COVID-19 infection in the UK would be out of control in 73 days without any interventions; at the same time, herd immunity acquisition would begin from the epicentre. Once government interventions are introduced, a single intervention is effective in disease control but comes at huge expense, while combined interventions are more efficient; among these, increasing the detection number is crucial to a COVID-19 control strategy. In addition, we calculated the requirements for the most effective vaccination strategy based on the infection number in the real situation. Our model was programmed with iterative algorithms and visualized via cellular automata. It can be applied to similar epidemics in other regions if the basic parameters are supplied, and it is able to synthetically mimic the effect of multiple factors in infectious disease control.

preprint, 2021, arXiv

On a Faster $R$-Linear Convergence Rate of the Barzilai-Borwein Method

The Barzilai-Borwein (BB) method has demonstrated great empirical success in nonlinear optimization. However, the convergence speed of the BB method is not well understood, as the known convergence rate of the BB method for quadratic problems is much worse than that of the steepest descent (SD) method. Therefore, there is a large discrepancy between theory and practice. To shrink this gap, we prove that the BB method converges $R$-linearly at a rate of $1-1/κ$, where $κ$ is the condition number, for strongly convex quadratic problems. In addition, an example with the theoretical rate of convergence is constructed, indicating the tightness of our bound.
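For readers unfamiliar with the method, the following is a minimal sketch of the BB iteration on a strongly convex quadratic, written in plain Python. It is an illustration only, not code from the paper: it assumes a diagonal quadratic f(x) = 0.5 * sum(d_i * x_i^2), uses the "long" BB step alpha = (s·s)/(s·y), and the function name `bb_quadratic` and the first step size `alpha0` are arbitrary choices.

```python
def bb_quadratic(diag, x0, iters=100, alpha0=0.05, tol=1e-12):
    """Minimize f(x) = 0.5 * sum(d_i * x_i**2) with Barzilai-Borwein steps.

    `diag` holds the positive eigenvalues d_i, so the condition number is
    max(diag) / min(diag). Returns the final iterate and the gradient-norm
    history. `alpha0` is an arbitrary first step (an assumption).
    """
    x = list(x0)
    g = [d * xi for d, xi in zip(diag, x)]  # gradient of the quadratic
    alpha = alpha0
    history = []
    for _ in range(iters):
        gnorm = sum(gi * gi for gi in g) ** 0.5
        history.append(gnorm)
        if gnorm < tol:
            break
        # Gradient step with the current BB step size.
        x_new = [xi - alpha * gi for xi, gi in zip(x, g)]
        g_new = [d * xi for d, xi in zip(diag, x_new)]
        # s = change in iterate, y = change in gradient.
        s = [a - b for a, b in zip(x_new, x)]
        y = [a - b for a, b in zip(g_new, g)]
        x, g = x_new, g_new
        sy = sum(si * yi for si, yi in zip(s, y))
        if sy <= 0.0:  # only possible here if the step was numerically zero
            break
        alpha = sum(si * si for si in s) / sy  # long BB step
    return x, history
```

For example, `bb_quadratic([1.0, 4.0, 10.0], [1.0, 1.0, 1.0], iters=200)` runs the iteration on a problem with condition number 10. The returned gradient-norm history is typically nonmonotone, which is why the rate is stated as $R$-linear; it can be compared against the $(1-1/κ)^k$ envelope from the abstract.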

preprint, 2020, arXiv

An Improved Quadrature Voltage-Controlled Oscillator with Through-Silicon-Via Inductor in Three-dimensional Integrated Circuits

A low-power quadrature voltage-controlled oscillator (QVCO) design utilizing transformer-feedback and current-reuse techniques with increased frequency range is proposed in this paper. With increasing demand for QVCOs in on-chip applications, conventional spiral inductor based approaches for QVCOs have become a major bottleneck due to their large size. To address this concern, we propose to replace the conventional spiral inductor based approaches with a through-silicon-via (TSV) inductor based approach in three-dimensional integrated circuits (3D ICs). In addition, the proposed QVCO circuit can provide a higher frequency range of operation compared with conventional designs. Experimental results show that replacing conventional spiral transformers with TSV transformers yields up to a 3.9x reduction in metal resource consumption. The proposed QVCOs achieve a phase noise of -114 dBc/Hz @ 1 MHz and -111.2 dBc/Hz @ 1 MHz at a carrier of 2.5 GHz for the toroidal TSV transformer-based QVCO and the vertical spiral transformer-based QVCO, respectively. The power consumption is only 1.5 mW and 1.7 mW for the toroidal TSV transformer-based QVCO and the vertical spiral transformer-based QVCO, respectively, under a supply voltage of 0.7 V.

preprint, 2020, arXiv

Class-incremental Learning via Deep Model Consolidation

Deep neural networks (DNNs) often suffer from "catastrophic forgetting" during incremental learning (IL): an abrupt degradation of performance on the original set of classes when the training objective is adapted to a newly added set of classes. Existing IL approaches tend to produce a model that is biased towards either the old classes or the new classes, unless they are aided by exemplars of the old data. To address this issue, we propose a class-incremental learning paradigm called Deep Model Consolidation (DMC), which works well even when the original training data is not available. The idea is to first train a separate model only for the new classes, and then combine the two individual models trained on data of two distinct sets of classes (old classes and new classes) via a novel double distillation training objective. The two existing models are consolidated by exploiting publicly available unlabeled auxiliary data. This overcomes the potential difficulties due to the unavailability of the original training data. Compared to the state-of-the-art techniques, DMC demonstrates significantly better performance in image classification (CIFAR-100 and CUB-200) and object detection (PASCAL VOC 2007) in the single-headed IL setting.

preprint, 2020, arXiv

Polar Coupling Enabled Nonlinear Optical Filtering at MoS$_2$/Ferroelectric Heterointerfaces

Complex oxide heterointerfaces and van der Waals heterostructures present two versatile but intrinsically different platforms for exploring emergent quantum phenomena and designing new functionalities. The rich opportunity offered by the synergy between these two classes of materials, however, is yet to be charted. Here, we report an unconventional nonlinear optical filtering effect resulting from the interfacial polar alignment between monolayer MoS$_2$ and a neighboring ferroelectric oxide thin film. The second harmonic generation response at the heterointerface is either substantially enhanced or almost entirely quenched by an underlying ferroelectric domain wall depending on its chirality, and can be further tailored by the polar domains. Unlike the extensively studied coupling mechanisms driven by charge, spin, and lattice, the interfacial tailoring effect is solely mediated by the polar symmetry, as well explained via our density functional theory calculations, pointing to a new material strategy for the functional design of nanoscale reconfigurable optical applications.