Researcher profile

Yun Chen

Yun Chen contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 (Emerging) · Verification L1 · Unclaimed author
20 works
0 followers
16 topics
4 close collaborators

Published work

20 published items

preprint · 2026 · arXiv

FinSafetyBench: Evaluating LLM Safety in Real-World Financial Scenarios

Large language models (LLMs) are increasingly applied in financial scenarios. However, they may produce harmful outputs, including facilitating illegal activities or unethical behavior, posing serious compliance risks. To systematically evaluate LLM safety in finance, we propose FinSafetyBench, a bilingual (English-Chinese) red-teaming benchmark designed to test an LLM's refusal of requests that violate financial compliance. Grounded in real-world financial crime cases and ethics standards, the benchmark comprises 14 subcategories spanning financial crimes and ethical violations. Through extensive experiments on general-purpose and finance-specialized LLMs under three representative attack settings, we identify critical vulnerabilities that allow adversarial prompts to bypass compliance safeguards. Further analysis reveals stronger susceptibility in Chinese contexts and highlights the limitations of prompt-level defenses against sophisticated or implicit manipulation strategies.

preprint · 2023 · arXiv

Attention-based Interactive Disentangling Network for Instance-level Emotional Voice Conversion

Emotional Voice Conversion aims to manipulate speech according to a given emotion while preserving non-emotion components. Existing approaches cannot express fine-grained emotional attributes well. In this paper, we propose an Attention-based Interactive diseNtangling Network (AINN) that leverages instance-wise emotional knowledge for voice conversion. We introduce a two-stage pipeline to effectively train our network: Stage I utilizes inter-speech contrastive learning to model fine-grained emotion and intra-speech disentanglement learning to better separate emotion and content. In Stage II, we propose to regularize the conversion with a multi-view consistency mechanism. This technique helps us transfer fine-grained emotion and maintain speech content. Extensive experiments show that our AINN outperforms state-of-the-art methods in both objective and subjective metrics.

preprint · 2022 · arXiv

Angular dependency of spatial frequency modulation in diffusion media

An optical field will undergo coherent diffusion when it is mapped into atoms in thermal motion, e.g., in a slow or storage light process. As was demonstrated before, this diffusion effect is equivalent to a spatial low-pass filter attenuating the high spatial frequency (SF) components of the optical field. Here, employing electromagnetically induced transparency (EIT) based light storage in hot atomic vapor, we demonstrate that the angular deviation between the control and probe beams could be utilized as a degree of freedom to modulate the SF of the probe beam. The principle is to change the diffusion-induced low-pass filter into a band-pass filter, whose SF response can be tuned by varying the direction and magnitude of the angular deviation. Transverse multimode light fields, such as optical images and Laguerre-Gaussian modes, are utilized to study this SF modulation. Our findings could be broadly applied to the fields of quantum information processing, all-optical image manipulation and imaging through diffusive media.

preprint · 2022 · arXiv

Comparing the scalar-field dark energy models with recent observations

We investigate the general properties of a class of scalar-field dark energy models (i.e., $ϕ$CDM models) which behave like cosmological trackers at early times. In particular, we choose three $ϕ$CDM models with typical potentials, i.e., $V(ϕ)\propto ϕ^{-α}$ (inverse power-law (IPL) model), $V(ϕ)\propto \coth^αϕ$ (L-model) and $V(ϕ)\propto \cosh(αϕ)$ (Oscillatory tracker model), where the latter two models are based on the $α$-attractors originated from the study of inflation. These models, which reduce to the $Λ$CDM model with $α\to 0$, are studied and compared with the recent observations, including the Pantheon sample of type Ia supernovae (SNe Ia), baryon acoustic oscillations (BAO) measurements extracted from 6dFGS, BOSS and eBOSS, as well as the temperature and polarization anisotropy power spectra data of cosmic microwave background radiation (CMB) from Planck 2018 results. The observational constraints from the combined sample (SNe Ia + BAO + CMB) indicate that none of the three $ϕ$CDM models exclude the $Λ$CDM model at $68.3\%$ confidence level. We find that the CMB anisotropy data have obvious advantages in constraining the dark energy models compared with other cosmological probes, which is particularly evident in the L-model. Furthermore, we apply the Bayesian evidence to compare the $ϕ$CDM models and the $Λ$CDM model with the analysis of the combined sample. The concordance $Λ$CDM model is still the most supported one. In addition, among the three $ϕ$CDM models, the IPL model is the most competitive one, while the L-model/Oscillatory tracker model is moderately/strongly disfavored.

preprint · 2022 · arXiv

Deep Learning-based Link Configuration for Radar-aided Multiuser mmWave Vehicle-to-Infrastructure Communication

Configuring millimeter wave links following a conventional beam training protocol, such as the one proposed in the current cellular standard, introduces a large communication overhead, especially relevant in vehicular systems, where the channels are highly dynamic. In this paper, we propose the use of a passive radar array to sense automotive radar transmissions coming from multiple vehicles on the road, and a radar processing chain that provides information about a reduced set of candidate beams for the links between the road-infrastructure and each one of the vehicles. This prior information can be later leveraged by the beam training protocol to significantly reduce overhead. The radar processing chain estimates both the timing and chirp rates of the radar signals, isolates the individual signals by filtering out interfering radar chirps, and estimates the spatial covariance of each individual radar transmission. Then, a deep network is used to translate features of these radar spatial covariances into features of the communication spatial covariances, by learning the intricate mapping between radar and communication channels, in both line-of-sight and non-line-of-sight settings. The communication rates and outage probabilities of this approach are compared against exhaustive search and pure radar-aided beam training methods (without deep learning-based mapping), and evaluated on multi-user channels simulated by ray tracing. Results show that: (i) the proposed processing chain can reliably isolate the spatial covariances for individual radars, and (ii) the radar-to-communications translation strategy based on deep learning provides a significant improvement over pure radar-aided methods in both LOS and NLOS channels.

preprint · 2022 · arXiv

Direct Estimate of the Post-Newtonian Parameter and Cosmic Curvature from Galaxy-scale Strong Gravitational Lensing

Einstein's theory of general relativity (GR) has been precisely tested on solar system scales, but extragalactic tests are still poorly performed. In this work, we use a newly compiled sample of galaxy-scale strong gravitational lenses to test the validity of GR on kiloparsec scales. In order to solve the circularity problem caused by the preassumption of a specific cosmological model based on GR, we employ the distance sum rule in the Friedmann-Lemaître-Robertson-Walker metric to directly estimate the parameterized post-Newtonian (PPN) parameter $γ_{\rm PPN}$ and the cosmic curvature $Ω_k$ by combining observations of strong lensing and Type Ia supernovae. This is the first simultaneous measurement of $γ_{\rm PPN}$ and $Ω_k$ without any assumptions about the contents of the universe or the theory of gravity. Our results show that $γ_{\rm PPN}=1.11^{+0.11}_{-0.09}$ and $Ω_{k}=0.48^{+1.09}_{-0.71}$, indicating a strong degeneracy between the two quantities. The measured $γ_{\rm PPN}$, which is consistent with the prediction of 1 from GR, provides a precise extragalactic test of GR with a fractional accuracy better than 9.0%. If a prior of spatial flatness (i.e., $Ω_{k}=0$) is adopted, the PPN parameter constraint can be further improved to $γ_{\rm PPN}=1.07^{+0.07}_{-0.07}$, representing a precision of 6.5%. On the other hand, in the framework of GR (i.e., $γ_{\rm PPN}=1$), our results are still marginally compatible with zero curvature ($Ω_k=-0.12^{+0.48}_{-0.36}$), supporting no significant deviation from a flat universe.

preprint · 2022 · arXiv

Exploring Adversarial Robustness of Multi-Sensor Perception Systems in Self Driving

Modern self-driving perception systems have been shown to improve upon processing complementary inputs such as LiDAR with images. In isolation, 2D images have been found to be extremely vulnerable to adversarial attacks. Yet, there have been limited studies on the adversarial robustness of multi-modal models that fuse LiDAR features with image features. Furthermore, existing works do not consider physically realizable perturbations that are consistent across the input modalities. In this paper, we showcase practical susceptibilities of multi-sensor detection by placing an adversarial object on top of a host vehicle. We focus on physically realizable and input-agnostic attacks as they are feasible to execute in practice, and show that a single universal adversary can hide different host vehicles from state-of-the-art multi-modal detectors. Our experiments demonstrate that successful attacks are primarily caused by easily corrupted image features. Furthermore, we find that in modern sensor fusion methods which project image features into 3D, adversarial attacks can exploit the projection process to generate false positives across distant regions in 3D. Towards more robust multi-modal perception systems, we show that adversarial training with feature denoising can boost robustness to such attacks significantly. However, we find that standard adversarial defenses still struggle to prevent false positives which are also caused by inaccurate associations between 3D LiDAR points and 2D pixels.

preprint · 2022 · arXiv

Investigating the dynamical models of cosmology with recent observations and upcoming gravitational-wave data

We explore and compare the capabilities of the recent observations of standard cosmological probes and the future observations of gravitational-wave (GW) standard sirens on constraining cosmological parameters. The comparison is carried out in the frameworks of two typical dynamical models of cosmology, i.e., the $ω_0ω_a$CDM model with $ω(z) = ω_0 + ω_a z/(1+z)$, and the $ξ$-index model with $ρ_X \propto ρ_m a^ξ$, where $ω(z)$ is the dark energy equation of state, and $ρ_X$ and $ρ_m$ are the energy densities of dark energy and matter, respectively. In the cosmological analysis, the employed data sets include the recent observations of the standard cosmological probes, i.e., Type Ia supernovae (SNe Ia), baryon acoustic oscillation (BAO) and cosmic microwave background (CMB), and also the mock GW standard siren sample with 1000 merging neutron star events anticipated from the third-generation detectors. In the scenarios of both $ω_0ω_a$CDM and $ξ$-index models, it turns out that the mock GW sample can reduce the uncertainty of the Hubble constant $H_0$ by about 50% relative to that from the joint SNe+BAO+CMB sample; nevertheless, the SNe+BAO+CMB sample demonstrates better performance on limiting other parameters. Furthermore, the Bayesian evidence is applied to compare the dynamical models with the $Λ$CDM model. The Bayesian evidence computed from the SNe+BAO+CMB sample reveals that the $Λ$CDM model is the most supported one; moreover, the $ω_0ω_a$CDM model is more competitive than the $ξ$-index model.

preprint · 2022 · arXiv

Joint Initial Access and Localization in Millimeter Wave Vehicular Networks: a Hybrid Model/Data Driven Approach

High resolution compressive channel estimation provides information for vehicle localization when a hybrid mmWave MIMO system is considered. Complexity and memory requirements can, however, become a bottleneck when high accuracy localization is required. An additional challenge is the need for path order information to apply the appropriate geometric relationships between the channel path parameters and the vehicle, RSU and scatterer positions. In this paper, we propose a low complexity channel estimation strategy of the angle of departure and time difference of arrival based on multidimensional orthogonal matching pursuit. We also design a deep neural network that predicts the order of the channel paths so only the LoS and first order reflections are used for localization. Simulation results obtained with realistic vehicular channels generated by ray tracing show that sub-meter accuracy can be achieved for 50% of the users, without resorting to perfect synchronization assumptions or infeasible all-digital high resolution MIMO architectures.

preprint · 2022 · arXiv

LitMind Dictionary: An Open-Source Online Dictionary

Dictionaries can help language learners to learn vocabulary by providing definitions of words. Since traditional dictionaries present word senses as discrete items in predefined inventories, they fall short of flexibility, which is required in providing specific meanings of words in particular contexts. In this paper, we introduce the LitMind Dictionary (https://dictionary.litmind.ink), an open-source online generative dictionary that takes a word and context containing the word as input and automatically generates a definition as output. Incorporating state-of-the-art definition generation models, it supports not only Chinese and English, but also Chinese-English cross-lingual queries. Moreover, it has a user-friendly front-end design that can help users understand the query words quickly and easily. All the code and data are available at https://github.com/blcuicall/litmind-dictionary.

preprint · 2022 · arXiv

Multi-task unscented Kalman inversion (MUKI): a derivative-free joint inversion framework and its application to joint inversion of geophysical data

In geophysical joint inversion, gradient-based and Bayesian Markov Chain Monte Carlo (MCMC) sampling-based methods are widely used owing to their fast convergence or global optimality. However, these methods either require the computation of gradients and easily fall into local optimal solutions, or cost much time to carry out the millions of forward calculations in a huge sampling space. Different from these two methods, taking advantage of the recently developed unscented Kalman method in computational mathematics, we extend an iterative gradient-free Bayesian joint inversion framework, i.e., Multi-task unscented Kalman inversion (MUKI). In this new framework, information from various observations is incorporated, the model is iteratively updated in a derivative-free way, and a Gaussian approximation to the posterior distribution of the model parameters is obtained. We apply the MUKI to the joint inversion of receiver functions and surface wave dispersion, which is well-established and widely used to construct the crustal and upper mantle structure of the earth. Based on synthesized and real data, the tests demonstrate that MUKI can recover the model more efficiently than the gradient-based method and the Markov Chain Monte Carlo method, and it would be a promising approach to resolve the geophysical joint inversion problems.

preprint · 2022 · arXiv

Multitasking Framework for Unsupervised Simple Definition Generation

The definition generation task can help language learners by providing explanations for unfamiliar words. This task has attracted much attention in recent years. We propose a novel task of Simple Definition Generation (SDG) to help language learners and low literacy readers. A significant challenge of this task is the lack of learner's dictionaries in many languages, and therefore the lack of data for supervised training. We explore this task and propose a multitasking framework SimpDefiner that only requires a standard dictionary with complex definitions and a corpus containing arbitrary simple texts. We disentangle the complexity factors from the text by carefully designing a parameter sharing scheme between two decoders. By jointly training these components, the framework can generate both complex and simple definitions simultaneously. We demonstrate that the framework can generate relevant, simple definitions for the target words through automatic and manual evaluations on English and Chinese datasets. Our method outperforms the baseline model by a 1.77 SARI score on the English dataset, and raises the proportion of the low level (HSK level 1-3) words in Chinese definitions by 3.87%.

preprint · 2022 · arXiv

Towards Making the Most of Multilingual Pretraining for Zero-Shot Neural Machine Translation

This paper demonstrates that multilingual pretraining and multilingual fine-tuning are both critical for facilitating cross-lingual transfer in zero-shot translation, where the neural machine translation (NMT) model is tested on source languages unseen during supervised training. Following this idea, we present SixT+, a strong many-to-English NMT model that supports 100 source languages but is trained with a parallel dataset in only six source languages. SixT+ initializes the decoder embedding and the full encoder with XLM-R large and then trains the encoder and decoder layers with a simple two-stage training strategy. SixT+ achieves impressive performance on many-to-English translation. It significantly outperforms CRISS and m2m-100, two strong multilingual NMT systems, with an average gain of 7.2 and 5.0 BLEU respectively. Additionally, SixT+ offers a set of model parameters that can be further fine-tuned to other unsupervised tasks. We demonstrate that adding SixT+ initialization outperforms state-of-the-art explicitly designed unsupervised NMT models on Si<->En and Ne<->En by over 1.2 average BLEU. When applied to zero-shot cross-lingual abstractive summarization, it produces an average performance gain of 12.3 ROUGE-L over mBART-ft. We conduct detailed analyses to understand the key ingredients of SixT+, including multilinguality of the auxiliary parallel data, positional disentangled encoder, and the cross-lingual transferability of its encoder.

preprint · 2021 · arXiv

Few-Shot Domain Adaptation for Grammatical Error Correction via Meta-Learning

Most existing Grammatical Error Correction (GEC) methods based on sequence-to-sequence mainly focus on how to generate more pseudo data to obtain better performance. Little work addresses few-shot GEC domain adaptation. In this paper, we treat different GEC domains as different GEC tasks and propose to extend meta-learning to few-shot GEC domain adaptation without using any pseudo data. We exploit a set of data-rich source domains to learn the initialization of model parameters that facilitates fast adaptation on new resource-poor target domains. We adapt the GEC model to the first language (L1) of the second language learner. To evaluate the proposed method, we use nine L1s as source domains and five L1s as target domains. Experimental results on the L1 GEC domain adaptation dataset demonstrate that the proposed approach outperforms the multi-task transfer learning baseline by 0.50 $F_{0.5}$ score on average and enables us to effectively adapt to a new L1 domain with only 200 parallel sentences.

preprint · 2021 · arXiv

YACLC: A Chinese Learner Corpus with Multidimensional Annotation

A learner corpus collects language data produced by L2 learners, that is, second- or foreign-language learners. This resource is of great relevance for second language acquisition research, foreign-language teaching, and automatic grammatical error correction. However, there is little focus on learner corpora for Chinese as Foreign Language (CFL) learners. Therefore, we propose to construct a large-scale, multidimensional annotated Chinese learner corpus. To construct the corpus, we first obtain a large number of topic-rich texts generated by CFL learners. Then we design an annotation scheme including a sentence acceptability score as well as grammatical error and fluency-based corrections. We build a crowdsourcing platform to perform the annotation effectively (https://yaclc.wenmind.net). We name the corpus YACLC (Yet Another Chinese Learner Corpus) and release it as part of the CUGE benchmark (http://cuge.baai.ac.cn). By analyzing the original sentences and annotations in the corpus, we found that YACLC has a considerable size and very high annotation quality. We hope this corpus can further enhance the studies on Chinese International Education and Chinese automatic grammatical error correction.

preprint · 2020 · arXiv

A Deep Reinforcement Learning Approach to Efficient Drone Mobility Support

The growing deployment of drones in a myriad of applications relies on seamless and reliable wireless connectivity for safe control and operation of drones. Cellular technology is a key enabler for providing essential wireless services to flying drones in the sky. Existing cellular networks targeting terrestrial usage can support the initial deployment of low-altitude drone users, but there are challenges such as mobility support. In this paper, we propose a novel handover framework for providing efficient mobility support and reliable wireless connectivity to drones served by a terrestrial cellular network. Using tools from deep reinforcement learning, we develop a deep Q-learning algorithm to dynamically optimize handover decisions to ensure robust connectivity for drone users. Simulation results show that the proposed framework significantly reduces the number of handovers at the expense of a small loss in signal strength relative to the baseline case where a drone always connects to a base station that provides the strongest received signal strength.

preprint · 2020 · arXiv

Dictionary-based Data Augmentation for Cross-Domain Neural Machine Translation

Existing data augmentation approaches for neural machine translation (NMT) have predominantly relied on back-translating in-domain (IND) monolingual corpora. These methods suffer from issues associated with a domain information gap, which leads to translation errors for low frequency and out-of-vocabulary terminology. This paper proposes a dictionary-based data augmentation (DDA) method for cross-domain NMT. DDA synthesizes a domain-specific dictionary with general domain corpora to automatically generate a large-scale pseudo-IND parallel corpus. The generated pseudo-IND data can be used to enhance a general domain trained baseline. The experiments show that the DDA-enhanced NMT models demonstrate consistent significant improvements, outperforming the baseline models by 3.75-11.53 BLEU. The proposed method is also able to further improve the performance of the back-translation based and IND-finetuned NMT models. The improvement is associated with the enhanced domain coverage produced by DDA.

preprint · 2020 · arXiv

DSDNet: Deep Structured self-Driving Network

In this paper, we propose the Deep Structured self-Driving Network (DSDNet), which performs object detection, motion prediction, and motion planning with a single neural network. Towards this goal, we develop a deep structured energy based model which considers the interactions between actors and produces socially consistent multimodal future predictions. Furthermore, DSDNet explicitly exploits the predicted future distributions of actors to plan a safe maneuver by using a structured planning cost. Our sample-based formulation allows us to overcome the difficulty in probabilistic inference of continuous random variables. Experiments on a number of large-scale self driving datasets demonstrate that our model significantly outperforms the state-of-the-art.

preprint · 2020 · arXiv

Learning Lane Graph Representations for Motion Forecasting

We propose a motion forecasting model that exploits a novel structured map representation as well as actor-map interactions. Instead of encoding vectorized maps as raster images, we construct a lane graph from raw map data to explicitly preserve the map structure. To capture the complex topology and long range dependencies of the lane graph, we propose LaneGCN which extends graph convolutions with multiple adjacency matrices and along-lane dilation. To capture the complex interactions between actors and maps, we exploit a fusion network consisting of four types of interactions, actor-to-lane, lane-to-lane, lane-to-actor and actor-to-actor. Powered by LaneGCN and actor-map interactions, our model is able to predict accurate and realistic multi-modal trajectories. Our approach significantly outperforms the state-of-the-art on the large scale Argoverse motion forecasting benchmark.

preprint · 2020 · arXiv

PnPNet: End-to-End Perception and Prediction with Tracking in the Loop

We tackle the problem of joint perception and motion forecasting in the context of self-driving vehicles. Towards this goal we propose PnPNet, an end-to-end model that takes as input sequential sensor data, and outputs at each time step object tracks and their future trajectories. The key component is a novel tracking module that generates object tracks online from detections and exploits trajectory level features for motion forecasting. Specifically, the object tracks get updated at each time step by solving both the data association problem and the trajectory estimation problem. Importantly, the whole model is end-to-end trainable and benefits from joint optimization of all tasks. We validate PnPNet on two large-scale driving datasets, and show significant improvements over the state-of-the-art with better occlusion recovery and more accurate future prediction.