Researcher profile

Shuo Sun

Shuo Sun contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
16 works
0 followers
13 topics
4 close collaborators

Published work

16 published items

preprint, 2026, arXiv

Contact resistance and interfacial engineering: Advances in high-performance 2D-TMD based devices

The development of advanced electronic devices is contingent upon sustainable material development and pioneering research breakthroughs. Traditional semiconductor-based electronic technology faces constraints in material thickness scaling and energy efficiency. Atomically thin two-dimensional (2D) transition metal dichalcogenides (TMDs) have emerged as promising candidates for next-generation nanoelectronic and optoelectronic applications, boasting high electron mobility, mechanical strength, and a customizable band gap. Despite these merits, the Fermi level pinning effect introduces uncontrollable Schottky barriers at metal-2D-TMD contacts, so that barrier heights defy prediction by the Schottky-Mott rule. These barriers fundamentally lead to elevated contact resistance and limited current-delivery capability, impeding improvements in 2D-TMD transistors and integrated circuits. In this review, we succinctly outline the mechanism of Fermi level pinning and the peculiar contact resistance behavior at metal/2D-TMD interfaces. We then highlight recent advances in overcoming contact resistance in 2D-TMD devices, encompassing interface interaction and hybridization, van der Waals (vdW) contacts, prefabricated metal transfer, and charge-transfer doping. Finally, we discuss the remaining challenges and offer insights into future developmental prospects.
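
For orientation, the Schottky-Mott rule named in this abstract can be written in its standard textbook form. The symbols are not taken from the abstract and are assumptions of this note: $\Phi_M$ is the metal work function, $\chi$ the semiconductor electron affinity, $\Phi_{B,n}$ the electron Schottky barrier height, and $S$ the commonly used pinning factor.

```latex
\Phi_{B,n} = \Phi_M - \chi,
\qquad
S = \left| \frac{\mathrm{d}\Phi_{B,n}}{\mathrm{d}\Phi_M} \right|
```

In this notation, $S = 1$ corresponds to the ideal Schottky-Mott limit, while $S \to 0$ corresponds to strong Fermi level pinning, where the barrier height barely responds to the choice of contact metal.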

preprint, 2026, arXiv

Senses Wide Shut: A Representation-Action Gap in Omnimodal LLMs

When an omnimodal large language model accepts a question whose textual premise contradicts what it actually sees or hears, does the failure lie in perception or in action? Recent omnimodal models are positioned as perception-grounded agents that jointly process video, audio, and text, yet a basic form of grounding remains untested: catching a textual claim that conflicts with the model's own sensory input. We introduce IMAVB, a curated 500-clip benchmark of long-form movies with a 2x2 design crossing target modality (vision, audio) and premise condition (standard, misleading), which lets us measure conflict detection separately from ordinary multimodal comprehension. Across eight open-source omnimodal LLMs and Gemini 3.1 Pro, we document a Representation-Action Gap: hidden states reliably encode premise-perception mismatches even when the same models almost never reject the false claim in their outputs. Behaviorally, models fall into two failure modes: under-rejection, in which they answer misleading questions as if the false premise were true; and over-rejection, in which they reject more often but also reject standard questions, sacrificing ordinary comprehension accuracy. The gap is modality-asymmetric (audio grounding underperforms vision) and prompt-resistant across seven variants. As an initial diagnostic intervention, a probe-guided logit adjustment (PGLA) re-injects the encoded mismatch signal into decoding and consistently improves rejection behavior. Together, these results suggest the bottleneck for omnimodal grounding lies in translation, not perception.

preprint, 2026, arXiv

Single-Atom Tuning of Structural and Optoelectronic Properties in Halogenated Anthracene-Based Covalent Organic Frameworks

Strategies for tuning structural and (opto-)electronic properties are fundamental to the rational design of functional materials. Here, we present a molecular design approach for precisely modulating the optoelectronic properties of covalent organic frameworks (COFs) through single-atom halogen substitution on $π$-extended anthracene linkers. Using a Wurster-type tetratopic amine (W-NH$_2$) and a series of anthracene-based dialdehydes bearing H, Cl, Br, or I at the 2-position, a family of imine-linked COFs, W-A-X (X = H, Cl, Br, I), was synthesized, all displaying well-ordered porous structures. The halogen substituent strongly influences framework formation, with brominated COFs forming substantially larger crystalline domains than their chloro- and iodo-functionalized analogues. UV-vis absorption and photoluminescence measurements reveal a systematic redshift across the series $(\mathrm{H < Cl < Br < I})$, demonstrating that a single-atom modification effectively tunes the optical response. Time-dependent density functional theory calculations on both isolated fragments and extended COF models attribute these trends to halogen-induced changes in the COF band structure and provide a mechanistic understanding of how single-atom substitution influences the optoelectronic properties of the extended $π$-framework. Overall, this study establishes single-atom halogen substitution as a powerful and modular strategy for tailoring the structural and optical properties of anthracene-based COFs.

preprint, 2025, arXiv

FineFT: Efficient and Risk-Aware Ensemble Reinforcement Learning for Futures Trading

Futures are contracts obligating the exchange of an asset at a predetermined date and price. They are notable for their high leverage and liquidity and therefore thrive in the crypto market. Reinforcement learning (RL) has been widely applied in various quantitative tasks. However, most methods focus on the spot market and cannot be directly applied to the highly leveraged futures market because of two challenges. First, high leverage amplifies reward fluctuations, making training stochastic and difficult to converge. Second, prior works lack self-awareness of their capability boundaries, exposing them to the risk of significant loss when encountering new market states (e.g., a black swan event like COVID-19). To tackle these challenges, we propose Efficient and Risk-Aware Ensemble Reinforcement Learning for Futures Trading (FineFT), a novel three-stage ensemble RL framework with stable training and proper risk management. In stage I, ensemble Q-learners are selectively updated by ensemble TD errors to improve convergence. In stage II, we filter the Q-learners based on their profitability and train VAEs on market states to identify the capability boundaries of the learners. In stage III, we choose between the filtered ensemble and a conservative policy, guided by the trained VAEs, to maintain profitability and mitigate risk under new market states. Through extensive experiments on crypto futures in a high-fidelity, high-frequency trading environment with 5x leverage, we demonstrate that FineFT outperforms 12 SOTA baselines on 6 financial metrics, reducing risk by more than 40% while achieving superior profitability compared to the runner-up. Visualization of the selective update mechanism shows that different agents specialize in distinct market dynamics, and ablation studies confirm that routing with VAEs effectively reduces maximum drawdown and that selective updating improves convergence and performance.

preprint, 2022, arXiv

DeepScalper: A Risk-Aware Reinforcement Learning Framework to Capture Fleeting Intraday Trading Opportunities

Reinforcement learning (RL) techniques have shown great success in many challenging quantitative trading tasks, such as portfolio management and algorithmic trading. In particular, intraday trading is one of the most profitable and risky tasks because the intraday behavior of the financial market reflects billions in rapidly fluctuating capital. However, the vast majority of existing RL methods focus on relatively low-frequency trading scenarios (e.g., day-level) and fail to capture fleeting intraday investment opportunities due to two major challenges: 1) how to effectively train profitable RL agents for intraday investment decision-making, which involves a high-dimensional fine-grained action space; 2) how to learn a meaningful multi-modality market representation to understand the intraday behavior of the financial market at tick level. Motivated by the efficient workflow of professional human intraday traders, we propose DeepScalper, a deep reinforcement learning framework for intraday trading that tackles the above challenges. Specifically, DeepScalper includes four components: 1) a dueling Q-network with action branching to deal with the large action space of intraday trading for efficient RL optimization; 2) a novel reward function with a hindsight bonus to encourage RL agents to make trading decisions with a long-term horizon of the entire trading day; 3) an encoder-decoder architecture to learn a multi-modality temporal market embedding, which incorporates both macro-level and micro-level market information; 4) a risk-aware auxiliary task to strike a balance between maximizing profit and minimizing risk. Through extensive experiments on real-world market data spanning over three years on six financial futures, we demonstrate that DeepScalper significantly outperforms many state-of-the-art baselines in terms of four financial criteria.
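
The "dueling Q-network with action branching" named in this abstract can be illustrated with a generic sketch. It assumes the common mean-subtracted aggregation used in branching dueling architectures; the abstract does not state DeepScalper's exact formula, and the function name, the two example branches, and all numbers below are hypothetical.

```python
def branching_q_values(state_value, branch_advantages):
    """Combine one shared state value with per-branch advantages.

    For each action branch d: Q_d(s, a) = V(s) + A_d(s, a) - mean_a A_d(s, a).
    This is a generic dueling aggregation, assumed here for illustration.
    """
    q_values = []
    for advantages in branch_advantages:
        mean_advantage = sum(advantages) / len(advantages)
        q_values.append([state_value + a - mean_advantage for a in advantages])
    return q_values


# Hypothetical example: two branches with 2 and 3 discrete choices each.
q = branching_q_values(1.0, [[0.0, 2.0], [1.0, 2.0, 3.0]])

# Each branch picks its own greedy sub-action independently.
greedy_actions = [max(range(len(branch)), key=branch.__getitem__) for branch in q]
```

Because each branch is maximized separately, the number of network outputs grows with the sum of the branch sizes rather than their product, which is the usual motivation for branching in large action spaces.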

preprint, 2022, arXiv

Quantitative Stock Investment by Routing Uncertainty-Aware Trading Experts: A Multi-Task Learning Approach

Quantitative investment is a fundamental financial task that relies heavily on accurate stock prediction and profitable investment decision making. Although recent advances in deep learning (DL) have shown stellar performance in capturing trading opportunities in the stochastic stock market, we observe that the performance of existing DL methods is sensitive to random seeds and network initialization. To design more profitable DL methods, we analyze this phenomenon and find two major limitations of existing works. First, there is a noticeable gap between accurate financial predictions and profitable investment strategies. Second, investment decisions are made based on only one individual predictor without consideration of model uncertainty, which is inconsistent with the workflow in real-world trading firms. To tackle these two limitations, we first reformulate quantitative investment as a multi-task learning problem. We then propose AlphaMix, a novel two-stage mixture-of-experts (MoE) framework for quantitative investment that mimics the efficient bottom-up trading strategy design workflow of successful trading firms. In stage one, multiple independent trading experts are jointly optimized with an individual uncertainty-aware loss function. In stage two, we train neural routers (corresponding to the role of a portfolio manager) to dynamically deploy these experts on an as-needed basis. AlphaMix is also a universal framework that is applicable to various backbone network architectures with consistent performance gains. Through extensive experiments on long-term real-world data spanning over five years on two of the most influential financial markets (US and China), we demonstrate that AlphaMix significantly outperforms many state-of-the-art baselines in terms of four financial criteria.

preprint, 2022, arXiv

Value Functions for Depth-Limited Solving in Zero-Sum Imperfect-Information Games

We provide a formal definition of depth-limited games together with an accessible and rigorous explanation of the underlying concepts, both of which were previously missing in imperfect-information games. The definition works for an arbitrary extensive-form game and is not tied to any specific game-solving algorithm. Moreover, this framework unifies and significantly extends three approaches to depth-limited solving that previously existed in extensive-form games and multiagent reinforcement learning but were not known to be compatible. A key ingredient of these depth-limited games is the value function. Focusing on two-player zero-sum imperfect-information games, we show how to obtain optimal value functions and prove that public information provides both necessary and sufficient context for computing them. We provide a domain-independent encoding of the domains that allows for approximating value functions even by simple feed-forward neural networks, which are then able to generalize to unseen parts of the game. We use the resulting value network to implement a depth-limited version of counterfactual regret minimization. In three distinct domains, we show that the algorithm's exploitability is roughly linearly dependent on the value network's quality and that it is not difficult to train a value network with which depth-limited CFR's performance is as good as that of CFR with access to the full game.
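
As background for the counterfactual regret minimization (CFR) mentioned in this abstract: in its standard form, CFR chooses the strategy at each information set by regret matching. The sketch below shows only that generic building block. It does not reproduce the paper's depth-limited variant or its value network, and the function name and example regrets are hypothetical.

```python
def regret_matching(cumulative_regrets):
    """Return a strategy proportional to the positive cumulative regrets.

    If no action has positive regret, fall back to the uniform strategy.
    """
    positive = [max(r, 0.0) for r in cumulative_regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    n = len(cumulative_regrets)
    return [1.0 / n] * n


# Hypothetical cumulative regrets for three actions at one information set.
strategy = regret_matching([2.0, -1.0, 6.0])
```

Actions with negative cumulative regret receive zero probability, and the remaining probability mass is split in proportion to how much the player regrets not having played each action.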

preprint, 2020, arXiv

Development of Quantum InterConnects for Next-Generation Information Technologies

Just as classical information technology rests on a foundation built of interconnected information-processing systems, quantum information technology (QIT) must do the same. A critical component of such systems is the interconnect, a device or process that allows transfer of information between disparate physical media, for example, semiconductor electronics, individual atoms, light pulses in optical fiber, or microwave fields. While interconnects have been well engineered for decades in the realm of classical information technology, quantum interconnects (QuICs) present special challenges, as they must allow the transfer of fragile quantum states between different physical parts or degrees of freedom of the system. The diversity of QIT platforms (superconducting, atomic, solid-state color center, optical, etc.) that will form a quantum internet poses additional challenges. As quantum systems scale to larger size, the quantum interconnect bottleneck is imminent, and is emerging as a grand challenge for QIT. For these reasons, it is the position of the community represented by participants of the NSF workshop on Quantum Interconnects that accelerating QuIC research is crucial for sustained development of a national quantum science and technology program. Given the diversity of QIT platforms, materials used, applications, and infrastructure required, a convergent research program including partnership between academia, industry and national laboratories is required. This document is a summary from a U.S. National Science Foundation supported workshop held on 31 October - 1 November 2019 in Alexandria, VA. Attendees were charged to identify the scientific and community needs, opportunities, and significant challenges for quantum interconnects over the next 2-5 years.

preprint, 2020, arXiv

Generation of Tin-Vacancy Centers in Diamond via Shallow Ion Implantation and Subsequent Diamond Overgrowth

Group-IV color centers in diamond have garnered great interest for their potential as optically active solid-state spin qubits. Future utilization of such emitters requires the development of precise site-controlled emitter generation techniques that are compatible with high-quality nanophotonic devices. This task is more challenging for color centers with large group-IV impurity atoms, which are otherwise promising because of their predicted long spin coherence times without a dilution refrigerator. For example, when applied to the negatively charged tin-vacancy (SnV$^-$) center, conventional site-controlled color center generation methods either damage the diamond surface or yield bulk spectra with unexplained features. Here we demonstrate a novel method to generate site-controlled SnV$^-$ centers with clean bulk spectra. We shallowly implant Sn ions through a thin implantation mask and subsequently grow a layer of diamond via chemical vapor deposition. This method can be extended to other color centers and integrated with quantum nanophotonic device fabrication.

preprint, 2020, arXiv

Modeling Document Interactions for Learning to Rank with Regularized Self-Attention

Learning to rank is an important task that has been successfully deployed in many real-world information retrieval systems. Most existing methods compute relevance judgments of documents independently, without holistically considering the entire set of competing documents. In this paper, we explore modeling document interactions with self-attention based neural networks. Although self-attention networks have achieved state-of-the-art results in many NLP tasks, we find empirically that self-attention provides little benefit over a baseline neural learning to rank architecture. To improve the learning of self-attention weights, we propose simple yet effective regularization terms designed to model interactions between documents. Evaluations on publicly available Learning to Rank (LETOR) datasets show that training a self-attention network with our proposed regularization terms can significantly outperform existing learning to rank methods.

preprint, 2020, arXiv

Narrow-linewidth tin-vacancy centers in a diamond waveguide

Integrating solid-state quantum emitters with photonic circuits is essential for realizing large-scale quantum photonic processors. Negatively charged tin-vacancy (SnV$^-$) centers in diamond have emerged as promising candidates for quantum emitters because of their excellent optical and spin properties including narrow-linewidth emission and long spin coherence times. SnV$^-$ centers need to be incorporated in optical waveguides for efficient on-chip routing of the photons they generate. However, such integration has yet to be realized. In this Letter, we demonstrate the coupling of SnV$^-$ centers to a nanophotonic waveguide. We realize this device by leveraging our recently developed shallow ion implantation and growth method for generation of high-quality SnV$^-$ centers and the advanced quasi-isotropic diamond fabrication technique. We confirm the compatibility and robustness of these techniques through successful coupling of narrow-linewidth SnV$^-$ centers (as narrow as $36\pm2$ MHz) to the diamond waveguide. Furthermore, we investigate the stability of waveguide-coupled SnV$^-$ centers under resonant excitation. Our results are an important step toward SnV$^-$-based on-chip spin-photon interfaces, single-photon nonlinearity, and photon-mediated spin interactions.

preprint, 2020, arXiv

Rotating multistate boson stars

In this paper, we construct rotating boson stars composed of the coexisting states of two scalar fields, including the ground and first excited states. We show the coexisting phase with both the ground and first excited states for rotating multistate boson stars. In contrast to the solutions of the nodeless boson stars, the rotating boson stars with two states have two types of nodes, including the $^1S^2S$ state and the $^1S^2P$ state. Moreover, we explore the properties of the mass $M$ of rotating boson stars with two states as a function of the synchronized frequency $ω$, as well as the nonsynchronized frequency $ω_2$. Finally, we also study the dependence of the mass $M$ of rotating boson stars with two states on angular momentum for both the synchronized frequency $ω$ and the nonsynchronized frequency $ω_2$.

preprint, 2020, arXiv

Spectrally reconfigurable quantum emitters enabled by optimized fast modulation

The ability to shape photon emission facilitates strong photon-mediated interactions between disparate physical systems, thereby enabling applications in quantum information processing, simulation and communication. Spectral control in solid state platforms such as color centers, rare earth ions, and quantum dots is particularly attractive for realizing such applications on-chip. Here we propose the use of frequency-modulated optical transitions for spectral engineering of single photon emission. Using a scattering-matrix formalism, we find that a two-level system, when modulated faster than its optical lifetime, can be treated as a single-photon source with a widely reconfigurable photon spectrum that is amenable to standard numerical optimization techniques. To enable the experimental demonstration of this spectral control scheme, we investigate the Stark tuning properties of the silicon vacancy in silicon carbide, a color center with promise for optical quantum information processing technologies. We find that the silicon vacancy possesses excellent spectral stability and tuning characteristics, allowing us to probe its fast modulation regime, observe the theoretically-predicted two-photon correlations, and demonstrate spectral engineering. Our results suggest that frequency modulation is a powerful technique for the generation of new light states with unprecedented control over the spectral and temporal properties of single photons.

preprint, 2020, arXiv

Unsupervised Quality Estimation for Neural Machine Translation

Quality Estimation (QE) is an important component in making Machine Translation (MT) useful in real-world applications, as it aims to inform the user of the quality of the MT output at test time. Existing approaches require large amounts of expert-annotated data, computation, and time for training. As an alternative, we devise an unsupervised approach to QE where no training or access to additional resources besides the MT system itself is required. Unlike most current work, which treats the MT system as a black box, we explore useful information that can be extracted from the MT system as a by-product of translation. By employing methods for uncertainty quantification, we achieve very good correlation with human judgments of quality, rivalling state-of-the-art supervised QE models. To evaluate our approach, we collect the first dataset that enables work on both black-box and glass-box approaches to QE.
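
One simple way to picture a glass-box quality signal of the kind this abstract describes is the length-normalized log-probability that the MT system assigned to its own output. This is a minimal sketch under that assumption: the abstract does not list the specific indicators used, and the function name and probability values below are hypothetical.

```python
import math


def mean_token_log_prob(token_probs):
    """Average log-probability of the generated tokens.

    Higher (closer to 0) means the MT system was more confident in its output.
    Probabilities must lie in (0, 1].
    """
    if not token_probs:
        raise ValueError("token_probs must not be empty")
    return sum(math.log(p) for p in token_probs) / len(token_probs)


# Hypothetical per-token probabilities for two translations.
confident_score = mean_token_log_prob([0.9, 0.8, 0.95])
uncertain_score = mean_token_log_prob([0.4, 0.2, 0.5])
```

No reference translation or trained QE model is involved: the score is read directly from the decoder's output distribution, which is what makes the signal unsupervised.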

preprint, 2019, arXiv

Deforming charged black holes with dipolar differential rotation boundary

Motivated by recent studies of the novel asymptotically global AdS$_4$ black hole with a deforming horizon, we consider the action of Einstein-Maxwell gravity in AdS spacetime and construct charged deforming AdS black holes with a differential boundary. In contrast to deforming black holes without charge, there exists at least one value of the horizon for an arbitrary temperature. The extremum of temperature is determined by the charge $q$ and divides the range of temperature into several parts. Moreover, we use an isometric embedding in three-dimensional space to investigate the horizon geometry. We also study the entropy and quasinormal modes of deforming charged AdS black holes. Interestingly, we find two families of black hole solutions with different horizon radii for a fixed temperature, yet these two black holes have the same horizon geometry and entropy. Owing to the charge $q$, the phase diagram of entropy is more complicated.

preprint, 2019, arXiv

Weak cosmic censorship in Born-Infeld electrodynamics and bound on charge-to-mass ratio

We construct a class of counterexamples to cosmic censorship in four-dimensional Einstein-Born-Infeld theory with asymptotically anti-de Sitter boundary conditions, and investigate the effect of the Born-Infeld parameter $b$ in comparison with its counterpart in Einstein-Maxwell theory. When a charged massive scalar field is included in the action, we find that this class of counterexamples to cosmic censorship is removed if the charge of the scalar field is above the minimum value $q_{min}$. In particular, the minimum charge required to preserve cosmic censorship increases as the Born-Infeld parameter increases. We also show the lower bounds on the charge-to-mass ratio for different values of the Born-Infeld parameter.