Researcher profile

Qun Chen

Qun Chen contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 19 - Unverified · Verification L1 · Unclaimed author
5 works
0 followers
6 topics
4 close collaborators

Published work

5 published items

preprint · 2026 · arXiv

$ξ$-DPO: Direct Preference Optimization via Ratio Reward Margin

Reference-free preference optimization has emerged as an efficient alternative to reinforcement learning from human feedback, with Simple Preference Optimization (SimPO) demonstrating strong performance by eliminating the explicit reference model through a simple objective. However, the joint tuning of the hyperparameters $β$ and $γ$ in SimPO remains a central challenge. We argue that this difficulty arises because the margin formulation in SimPO is not easily interpretable across datasets with different reward gap structures. To better understand this issue, we conduct a comprehensive analysis of SimPO and find that $β$ implicitly controls sample filtering, while the effect of $γ$ depends on the reward gap structure of the dataset. Motivated by these observations, we propose $ξ$-DPO: direct preference optimization via ratio reward margin. We first reformulate the preference objective through an equivalent transformation, changing the optimization target from maximizing the likelihood of reward gaps to minimizing the distance between reward gaps and optimal margins. Then, we redefine the reward as a ratio between the chosen and rejected responses, which effectively cancels the effect of $β$ and yields a bounded and interpretable margin. This margin is called the ratio reward margin and is denoted by $ξ$. Unlike the margin $γ$ in SimPO, $ξ$ explicitly represents the desired relative separation between chosen and rejected responses and can be determined from the initial reward gap distribution, avoiding repeated trial-and-error tuning. ....
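For orientation, the SimPO baseline that this abstract analyzes can be sketched in a few lines. This is a minimal illustration of the published SimPO objective for a single preference pair, not the $ξ$-DPO objective, whose exact form is not given in the truncated abstract. The function name, argument names and default values of `beta` and `gamma` are assumptions chosen for illustration.

```python
import math


def simpo_loss(logp_chosen, logp_rejected, len_chosen, len_rejected,
               beta=2.0, gamma=0.5):
    """SimPO loss for one (chosen, rejected) pair.

    logp_* are summed token log-probabilities under the policy and
    len_* are the response lengths in tokens. The default beta and
    gamma are illustrative placeholders, not recommended settings.
    """
    # Length-normalized implicit rewards, scaled by beta.
    reward_chosen = beta * logp_chosen / len_chosen
    reward_rejected = beta * logp_rejected / len_rejected
    gap = reward_chosen - reward_rejected

    # -log(sigmoid(gap - gamma)) equals softplus(gamma - gap);
    # the form below avoids overflow for large |z|.
    z = gamma - gap
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))
```

Because `gamma` is subtracted from a gap that is itself scaled by `beta`, the same `gamma` means different things at different `beta` values and on datasets with different gap distributions, which is the coupling the abstract describes.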

preprint · 2022 · arXiv

Adaptive Deep Learning for Entity Resolution by Risk Analysis

State-of-the-art performance on entity resolution (ER) has been achieved by deep learning. However, deep models are usually trained on large quantities of accurately labeled training data and cannot be easily tuned towards a target workload. Unfortunately, in real scenarios, there may not be sufficient labeled training data, and even worse, its distribution usually differs from that of the target workload even when both come from the same domain. To alleviate these limitations, this paper proposes a novel risk-based approach that tunes a deep model towards a target workload based on the workload's particular characteristics. Built on recent advances in risk analysis for ER, the proposed approach first trains a deep model on labeled training data, and then fine-tunes it by minimizing its estimated misprediction risk on unlabeled target data. Our theoretical analysis shows that risk-based adaptive training can correct the label status of a mispredicted instance with a fairly good chance. We have also empirically validated the efficacy of the proposed approach on real benchmark data by a comparative study. Our extensive experiments show that it can considerably improve the performance of deep models. Furthermore, in the scenario of distribution misalignment, it can similarly outperform the state-of-the-art alternative of transfer learning by considerable margins. Using ER as a test case, we demonstrate that risk-based adaptive training is a promising approach potentially applicable to various challenging classification tasks.

preprint · 2022 · arXiv

Fundamental limit to the rectification of near-field heat flow: The potential of intrinsic semiconductor films

We derive the fundamental limit to near-field radiative thermal rectification mediated by an intrinsic semiconductor film within the framework of fluctuational electrodynamics. By leveraging the electromagnetic local density of states, we identify $ε''_H/ε''_L$ as an upper bound on the rectification magnitude, where $ε''_H$ and $ε''_L$ are respectively the imaginary parts of the film permittivity at high and low temperatures. This bound is tight and can be approached regardless of whether the film is suspended or supported. For intrinsic silicon the limit can in principle exceed $10^9$. Our work highlights the possibility of controlling heat flow as effectively as electric current, and offers guidelines to potentially achieve this goal.

preprint · 2021 · arXiv

Radiative thermal diode via hyperbolic metamaterials

Hyperbolic metamaterials (HMMs) support propagating waves with arbitrarily large wavevectors over broad spectral ranges, and are uniquely valuable for engineering radiative thermal transport in the near field. Here, by employing a rational design approach based on the electromagnetic local density of states, we demonstrate the ability of HMMs to substantially rectify radiative heat flow. Our idea is to establish a forward-biased scenario where the two HMM-based terminals of a thermal diode feature overlapped hyperbolic bands which result in a large heat current, and suppress the reverse heat flow by creating spectrally mismatched density of states as the temperature bias is flipped. As an example, we present a few high-performance thermal diodes by pairing HMMs made of polar dielectrics and metal-to-insulator transition (MIT) materials in the form of periodic nanowire arrays, and considering three representative kinds of substrates. Upon optimization, we theoretically achieve a rectification ratio of 324 at a 100 nm gap, which remains greater than 148 for larger gap sizes up to 1 μm over a wide temperature range. The maximum rectification represents an almost 1000-fold increase compared to a bulk diode using the same materials, and is twice that of state-of-the-art designs. Our work highlights the potential of HMMs for rectifying radiative heat flow, and may find applications in advanced thermal management and energy conversion systems.

preprint · 2020 · arXiv

Carleman estimates for a stochastic degenerate parabolic equation and applications to null controllability and an inverse random source problem

In this paper, we establish two Carleman estimates for a stochastic degenerate parabolic equation. The first is for the backward stochastic degenerate parabolic equation with a singular weight function. Combining this Carleman estimate with an approximation argument, we prove the null controllability of the forward stochastic degenerate parabolic equation with a gradient term. The second is for the forward stochastic degenerate parabolic equation with a regular weight function, based on which we obtain Lipschitz stability for the inverse problem of determining a random source that depends only on time in the forward stochastic degenerate parabolic equation.