Source author record

Shiyu Wang

Shiyu Wang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Machine Learning, quant-ph, Artificial Intelligence, Computation and Language, Computer Vision, Applications, Computation, eess.SP, eess.SY, math.ST, Methodology, q-fin.CP, q-fin.RM, Statistics Theory, Systems and Control

Catalog footprint

What is connected

14 works

15 topics

4 close collaborators


preprint, 2026, arXiv

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

General reasoning represents a long-standing and formidable challenge in artificial intelligence. Recent breakthroughs, exemplified by large language models (LLMs) and chain-of-thought prompting, have achieved considerable success on foundational reasoning tasks. However, this success is heavily contingent upon extensive human-annotated demonstrations, and models' capabilities are still insufficient for more complex problems. Here we show that the reasoning abilities of LLMs can be incentivized through pure reinforcement learning (RL), obviating the need for human-labeled reasoning trajectories. The proposed RL framework facilitates the emergent development of advanced reasoning patterns, such as self-reflection, verification, and dynamic strategy adaptation. Consequently, the trained model achieves superior performance on verifiable tasks such as mathematics, coding competitions, and STEM fields, surpassing its counterparts trained via conventional supervised learning on human demonstrations. Moreover, the emergent reasoning patterns exhibited by these large-scale models can be systematically harnessed to guide and enhance the reasoning capabilities of smaller models.

preprint, 2026, arXiv

Nested Spatio-Temporal Time Series Forecasting

Spatiotemporal forecasting is critical for real-world applications like traffic management, yet capturing reliable interactions remains challenging under noisy and non-stationary conditions. Existing methods primarily rely on historical spatial priors, often failing to account for evolving temporal correlations and suffering from systematic errors. In this work, we propose a nested forecasting framework that couples future macro-level regional trends with micro-level historical observations, enabling top-down guidance from abstract future representations for fine-grained forecasting. Specifically, we employ a spectral clustering-based approach to construct semantically coherent regions, providing both theoretical and empirical evidence that this representation effectively filters systematic noise while preserving essential trends. Building on this, we develop a progressive coarse-to-fine predictor to integrate these representative features into the inference process. This enables the model to leverage trend predictions to anticipate dynamic anomalies, such as periodic offsets, in advance. Furthermore, extensive experiments on multiple high-dimensional datasets demonstrate that our method consistently outperforms state-of-the-art baselines, validating the effectiveness of future macro-guided nested forecasting.

preprint, 2026, arXiv

TimeDistill: Efficient Long-Term Time Series Forecasting with MLP via Cross-Architecture Distillation

Transformer-based and CNN-based methods demonstrate strong performance in long-term time series forecasting. However, their high computational and storage requirements can hinder large-scale deployment. To address this limitation, we propose integrating lightweight MLP with advanced architectures using knowledge distillation (KD). Our preliminary study reveals different models can capture complementary patterns, particularly multi-scale and multi-period patterns in the temporal and frequency domains. Based on this observation, we introduce TimeDistill, a cross-architecture KD framework that transfers these patterns from teacher models (e.g., Transformers, CNNs) to MLP. Additionally, we provide a theoretical analysis, demonstrating that our KD approach can be interpreted as a specialized form of mixup data augmentation. TimeDistill improves MLP performance by up to 18.6%, surpassing teacher models on eight datasets. It also achieves up to 7X faster inference and requires 130X fewer parameters. Furthermore, we conduct extensive evaluations to highlight the versatility and effectiveness of TimeDistill.
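One simple way to see how a distillation objective can resemble training on mixed targets is sketched below. It assumes plain mean-squared-error terms and a single weight `alpha`; both are illustrative assumptions, not the loss or the analysis used in TimeDistill. Under those assumptions, the weighted sum of a ground-truth term and a teacher-matching term differs from the loss against the blended target `(1 - alpha) * y + alpha * teacher` only by a constant that does not depend on the student.

```python
def distill_loss(student, target, teacher, alpha):
    """(1 - alpha) * MSE(student, target) + alpha * MSE(student, teacher)."""
    n = len(student)
    supervised = sum((s - y) ** 2 for s, y in zip(student, target)) / n
    matching = sum((s - t) ** 2 for s, t in zip(student, teacher)) / n
    return (1 - alpha) * supervised + alpha * matching


def mixed_target_loss(student, target, teacher, alpha):
    """MSE against the blended target (1 - alpha) * y + alpha * teacher."""
    mixed = [(1 - alpha) * y + alpha * t for y, t in zip(target, teacher)]
    return sum((s - m) ** 2 for s, m in zip(student, mixed)) / len(student)


def constant_gap(target, teacher, alpha):
    """Student-independent gap: alpha * (1 - alpha) * MSE(target, teacher)."""
    n = len(target)
    return alpha * (1 - alpha) * sum((y - t) ** 2 for y, t in zip(target, teacher)) / n


# Hypothetical toy values, chosen only to exercise the identity.
y = [1.0, 2.0, 3.0]
teacher_pred = [1.2, 1.7, 3.4]
for student_pred in ([0.0, 0.0, 0.0], [1.1, 2.2, 2.9]):
    gap = distill_loss(student_pred, y, teacher_pred, 0.3) - mixed_target_loss(
        student_pred, y, teacher_pred, 0.3
    )
    print(round(gap, 12), round(constant_gap(y, teacher_pred, 0.3), 12))
```

Because the gap is the same for every student prediction, both objectives have the same minimizer in this simplified setting.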

preprint, 2025, arXiv

Selective Excitation of Superconducting Qubits with a Shared Control Line through Pulse Shaping

In conventional architectures of superconducting quantum computers, each qubit is connected to its own control line, leading to a commensurate increase in the number of microwave lines as the system scales. Frequency-multiplexed qubit control addresses this problem by enabling multiple qubits to share a single microwave line. However, it can cause unwanted excitation of non-target qubits, especially when the detuning between qubits is smaller than the pulse bandwidth. Here, we propose a selective-excitation-pulse (SEP) technique that suppresses unwanted excitations by shaping a drive pulse to create null points at non-target qubit frequencies. In a proof-of-concept experiment with three fixed-frequency transmon qubits, we demonstrate that the SEP technique achieves single-qubit gate fidelities comparable to those obtained with conventional Gaussian pulses while effectively suppressing unwanted excitations in non-target qubits. These results highlight the SEP technique as a promising tool for enhancing frequency-multiplexed qubit control.
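The idea of placing a spectral null at a non-target frequency can be illustrated with a textbook case, sketched below. This is not the SEP pulse from the paper: it assumes a rectangular envelope, whose spectrum vanishes at detunings that are integer multiples of 1/T, and a hypothetical 50 MHz detuning between the target and the non-target qubit. The helper `spectrum_at` is an illustrative name, not part of any library.

```python
import cmath
import math


def spectrum_at(envelope, duration, detuning_hz, steps=2000):
    """Magnitude of the envelope's Fourier integral at a given detuning (midpoint rule)."""
    dt = duration / steps
    total = 0j
    for k in range(steps):
        t = (k + 0.5) * dt
        total += envelope(t) * cmath.exp(-2j * math.pi * detuning_hz * t) * dt
    return abs(total)


def rectangular(t):
    """Constant-amplitude envelope (assumed shape for this sketch)."""
    return 1.0


detuning = 50e6            # hypothetical detuning to the non-target qubit, in Hz
duration = 1.0 / detuning  # pulse length that puts the first null at that detuning

on_target = spectrum_at(rectangular, duration, 0.0)
off_target = spectrum_at(rectangular, duration, detuning)
half_way = spectrum_at(rectangular, duration, detuning / 2)
print(on_target, off_target, half_way)
```

With this choice of duration the drive has essentially no spectral weight at the non-target frequency, while a qubit detuned by half that amount would still be driven noticeably. A shaped pulse aims for the same null without tying the pulse length to the detuning.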

preprint, 2024, arXiv

DeepSeek LLM: Scaling Open-Source Language Models with Longtermism

The rapid development of open-source large language models (LLMs) has been truly remarkable. However, the scaling law described in previous literature presents varying conclusions, which casts a dark cloud over scaling LLMs. We delve into the study of scaling laws and present our distinctive findings that facilitate scaling of large scale models in two commonly used open-source configurations, 7B and 67B. Guided by the scaling laws, we introduce DeepSeek LLM, a project dedicated to advancing open-source language models with a long-term perspective. To support the pre-training phase, we have developed a dataset that currently consists of 2 trillion tokens and is continuously expanding. We further conduct supervised fine-tuning (SFT) and Direct Preference Optimization (DPO) on DeepSeek LLM Base models, resulting in the creation of DeepSeek Chat models. Our evaluation results demonstrate that DeepSeek LLM 67B surpasses LLaMA-2 70B on various benchmarks, particularly in the domains of code, mathematics, and reasoning. Furthermore, open-ended evaluations reveal that DeepSeek LLM 67B Chat exhibits superior performance compared to GPT-3.5.

preprint, 2022, arXiv

A Meta Reinforcement Learning Approach for Predictive Autoscaling in the Cloud

Predictive autoscaling (autoscaling with workload forecasting) is an important mechanism that supports autonomous adjustment of computing resources in accordance with fluctuating workload demands in the Cloud. In recent works, Reinforcement Learning (RL) has been introduced as a promising approach to learn the resource management policies that guide scaling actions in the dynamic and uncertain cloud environment. However, RL methods face several challenges in steering predictive autoscaling, such as a lack of accuracy in decision-making, inefficient sampling, and significant variability in workload patterns that may cause policies to fail at test time. To this end, we propose an end-to-end predictive meta model-based RL algorithm that aims to allocate resources optimally to maintain a stable CPU utilization level. It takes a specially designed deep periodic workload prediction model as input and embeds the Neural Process to guide the learning of the optimal scaling actions over numerous application services in the Cloud. Our algorithm not only ensures the predictability and accuracy of the scaling strategy, but also enables the scaling decisions to adapt to changing workloads with high sample efficiency. Our method has achieved significant performance improvement compared to existing algorithms and has been deployed online at Alipay, supporting the autoscaling of applications for the world-leading payment platform.

preprint, 2022, arXiv

Global Regular Network for Writer Identification

Writer identification has practical applications for forgery detection and forensic science. Most models based on deep neural networks extract features from character images or sub-regions of character images, ignoring features contained in the page-region image. Our proposed global regular network (GRN) pays attention to these features. The GRN consists of two branches: one branch takes page handwriting as input to extract global features, and the other takes word handwriting as input to extract local features. Global features and local features merge in a global residual way to form the overall features of the handwriting. The proposed GRN makes two contributions: one is adding a branch to extract features contained in the page; the other is using a residual attention network to extract local features. Experiments demonstrate the effectiveness of both strategies. On the CVL dataset, our models achieve 99.98% top-1 accuracy and 100% top-5 accuracy with shorter training time and fewer network parameters, exceeding the state-of-the-art structure. The experiment shows the powerful ability of the network in the field of writer identification. The source code is available at https://github.com/wangshiyu001/GRN.

preprint, 2022, arXiv

Realization of fast all-microwave CZ gates with a tunable coupler

The development of high-fidelity two-qubit quantum gates is essential for digital quantum computing. Here, we propose and realize an all-microwave parametric Controlled-Z (CZ) gate by coupling strength modulation in a superconducting Transmon qubit system with tunable couplers. After numerically optimizing the design of the tunable coupler together with the control pulse, we experimentally realized a 100 ns CZ gate with a high fidelity of $99.38\% \pm 0.34\%$ and a control error of 0.1%. We note that our CZ gates are not affected by pulse distortion and do not need pulse correction, providing a solution for real-time pulse generation in a dynamic quantum feedback circuit. With the expectation of utilizing our all-microwave control scheme to reduce the number of control lines through frequency multiplexing in the future, our scheme draws a blueprint for highly integrable quantum hardware design.

preprint, 2022, arXiv

Sample Recycling for Nested Simulation with Application in Portfolio Risk Measurement

Nested simulation is a natural approach to tackle nested estimation problems in operations research and financial engineering. The outer-level simulation generates outer scenarios and the inner-level simulations are run in each outer scenario to estimate the corresponding conditional expectation. The resulting sample of conditional expectations is then used to estimate different risk measures of interest. Despite its flexibility, nested simulation is notorious for its heavy computational burden. We introduce a novel simulation procedure that reuses inner simulation outputs to improve efficiency and accuracy in solving nested estimation problems. We analyze the convergence rates of the bias, variance, and MSE of the resulting estimator. In addition, central limit theorems and variance estimators are presented, which lead to asymptotically valid confidence intervals for the nested risk measure of interest. We conduct numerical studies on two financial risk measurement problems. Our numerical studies show consistent results with the asymptotic analysis and show that the proposed approach outperforms the standard nested simulation and a state-of-the-art regression approach for nested estimation problems.
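The two-level structure described above can be sketched as follows. This is the standard nested simulation baseline, not the sample-recycling procedure proposed in the paper, and the toy model is an assumption made for illustration: the outer scenario is a standard normal risk factor `z`, and the inner loss is `z` plus standard normal noise, so the conditional expectation equals `z`. The risk measure estimated here is the probability that the conditional expected loss exceeds a threshold.

```python
import random


def nested_exceedance_probability(n_outer, n_inner, threshold, seed=0):
    """Standard nested simulation estimate of P(E[loss | scenario] > threshold)."""
    rng = random.Random(seed)
    exceed = 0
    for _ in range(n_outer):
        # Outer level: draw one scenario (toy risk factor, assumed standard normal).
        z = rng.gauss(0.0, 1.0)
        # Inner level: simulate losses conditional on that scenario.
        inner_losses = [z + rng.gauss(0.0, 1.0) for _ in range(n_inner)]
        # Estimate the conditional expectation by the inner sample mean.
        conditional_mean = sum(inner_losses) / n_inner
        if conditional_mean > threshold:
            exceed += 1
    return exceed / n_outer


estimate = nested_exceedance_probability(n_outer=2000, n_inner=50, threshold=1.0)
print(estimate)
```

The cost is `n_outer * n_inner` inner draws, and every inner sample is used for exactly one scenario. That cost is the computational burden the abstract refers to.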

preprint, 2021, arXiv

Experimental exploration of five-qubit quantum error correcting code with superconducting qubits

Quantum error correction is an essential ingredient for universal quantum computing. Despite tremendous experimental efforts in the study of quantum error correction, to date, there has been no demonstration in the realisation of universal quantum error correcting code, with the subsequent verification of all key features including the identification of an arbitrary physical error, the capability for transversal manipulation of the logical state, and state decoding. To address this challenge, we experimentally realise the $[\![5,1,3]\!]$ code, the so-called smallest perfect code that permits corrections of generic single-qubit errors. In the experiment, having optimised the encoding circuit, we employ an array of superconducting qubits to realise the $[\![5,1,3]\!]$ code for several typical logical states including the magic state, an indispensable resource for realising non-Clifford gates. The encoded states are prepared with an average fidelity of $57.1(3)\%$ while with a high fidelity of $98.6(1)\%$ in the code space. Then, the arbitrary single-qubit errors introduced manually are identified by measuring the stabilizers. We further implement logical Pauli operations with a fidelity of $97.2(2)\%$ within the code space. Finally, we realise the decoding circuit and recover the input state with an overall fidelity of $74.5(6)\%$, in total with $92$ gates. Our work demonstrates each key aspect of the $[\![5,1,3]\!]$ code and verifies the viability of experimental realization of quantum error correcting codes with superconducting qubits.
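The claim that stabilizer measurements identify any single-qubit error can be checked with a small classical sketch. It assumes the commonly used generator set for the $[\![5,1,3]\!]$ code (XZZXI and three of its cyclic shifts); the convention used in the experiment may differ. A syndrome bit is 1 when the error anticommutes with the corresponding generator.

```python
from itertools import product

# Assumed generator set: XZZXI and three of its cyclic shifts.
STABILIZERS = ["XZZXI", "IXZZX", "XIXZZ", "ZXIXZ"]


def anticommute(p, q):
    """Two single-qubit Paulis anticommute iff both are non-identity and differ."""
    return p != "I" and q != "I" and p != q


def syndrome(error):
    """Syndrome of a 5-character Pauli string such as 'IIXII'."""
    bits = []
    for stabilizer in STABILIZERS:
        flips = sum(anticommute(e, s) for e, s in zip(error, stabilizer))
        bits.append(flips % 2)
    return tuple(bits)


def single_qubit_errors():
    """All 15 weight-one Pauli errors on five qubits."""
    for qubit, pauli in product(range(5), "XYZ"):
        yield "I" * qubit + pauli + "I" * (4 - qubit)


table = {error: syndrome(error) for error in single_qubit_errors()}
for error, bits in table.items():
    print(error, bits)
print(len(set(table.values())))
```

The 15 single-qubit errors map to 15 distinct non-zero syndromes, which together with the trivial syndrome use all 16 values of four bits. This is the sense in which the code is called perfect.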

preprint, 2021, arXiv

FADACS: A Few-shot Adversarial Domain Adaptation Architecture for Context-Aware Parking Availability Sensing

Existing research on parking availability sensing mainly relies on extensive contextual and historical information. In practice, the availability of such information is a challenge as it requires continuous collection of sensory signals. In this study, we design an end-to-end transfer learning framework for parking availability sensing to predict parking occupancy in areas in which the parking data is insufficient to feed into data-hungry models. This framework overcomes two main challenges: 1) many real-world cases cannot provide enough data for most existing data-driven models, and 2) it is difficult to merge sensor data and heterogeneous contextual information due to the differing urban fabric and spatial characteristics. Our work adopts a widely-used concept, adversarial domain adaptation, to predict the parking occupancy in an area without abundant sensor data by leveraging data from other areas with similar features. In this paper, we utilise more than 35 million parking data records from sensors placed in two different cities, one a city centre and the other a coastal tourist town. We also utilise heterogeneous spatio-temporal contextual information from external resources, including weather and points of interest. We quantify the strength of our proposed framework in different cases and compare it to the existing data-driven approaches. The results show that the proposed framework is comparable to existing state-of-the-art methods and also provide some valuable insights on parking availability prediction.

preprint, 2021, arXiv

Floquet Prethermal Phase Protected by U(1) Symmetry on a Superconducting Quantum Processor

Periodically driven systems, or Floquet systems, exhibit many novel dynamics and interesting out-of-equilibrium phases of matter. Those phases arising with the quantum systems' symmetries, such as global $U(1)$ symmetry, can even show dynamical stability with symmetry-protection. Here we experimentally demonstrate a $U(1)$ symmetry-protected prethermal phase, via performing a digital-analog quantum simulation on a superconducting quantum processor. The dynamical stability of this phase is revealed by its robustness against external perturbations. We also find that the spin glass order parameter in this phase is stabilized by the interaction between the spins. Our work reveals a promising prospect in discovering emergent quantum dynamical phases with digital-analog quantum simulators.

preprint, 2020, arXiv

Emulating quantum teleportation of a Majorana zero mode qubit

Topological quantum computation based on anyons is a promising approach to achieve fault-tolerant quantum computing. The Majorana zero modes in the Kitaev chain are an example of non-Abelian anyons where braiding operations can be used to perform quantum gates. Here we perform a quantum simulation of topological quantum computing, by teleporting a qubit encoded in the Majorana zero modes of a Kitaev chain. The quantum simulation is performed by mapping the Kitaev chain to its equivalent spin version, and realizing the ground states in a superconducting quantum processor. The teleportation transfers the quantum state encoded in the spin-mapped version of the Majorana zero mode states between two Kitaev chains. The teleportation circuit is realized using only braiding operations, and can be achieved despite being restricted to Clifford gates for the Ising anyons. The Majorana encoding is a quantum error detecting code for phase flip errors, which is used to improve the average fidelity of the teleportation for six distinct states from $70.76 \pm 0.35 \% $ to $84.60 \pm 0.11 \%$, well beyond the classical bound in either case.

preprint, 2015, arXiv

Sequential Design for Computerized Adaptive Testing that Allows for Response Revision

In computerized adaptive testing (CAT), items (questions) are selected in real time based on the already observed responses, so that the ability of the examinee can be estimated as accurately as possible. This is typically formulated as a non-linear, sequential, experimental design problem with binary observations that correspond to the true or false responses. However, most items in practice are multiple-choice and dichotomous models do not make full use of the available data. Moreover, CAT has been heavily criticized for not allowing test-takers to review and revise their answers. In this work, we propose a novel CAT design that is based on the polytomous nominal response model and in which test-takers are allowed to revise their responses at any time during the test. We show that as the number of administered items goes to infinity, the proposed estimator is (i) strongly consistent for any item selection and revision strategy and (ii) asymptotically normal when the items are selected to maximize the Fisher information at the current ability estimate and the number of revisions is smaller than the number of items. We also present the findings of a simulation study that supports our asymptotic results.
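The item selection rule mentioned above, maximizing the Fisher information at the current ability estimate, can be sketched for the simpler dichotomous case. The two-parameter logistic (2PL) model used here is a stand-in chosen for brevity; the paper works with the polytomous nominal response model, and the item parameters below are hypothetical.

```python
import math


def prob_correct(theta, a, b):
    """2PL probability of a correct response (a: discrimination, b: difficulty)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def fisher_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta: a^2 * p * (1 - p)."""
    p = prob_correct(theta, a, b)
    return a * a * p * (1.0 - p)


def select_item(theta_hat, items, administered):
    """Index of the not-yet-administered item with maximal information at theta_hat."""
    candidates = [i for i in range(len(items)) if i not in administered]
    return max(candidates, key=lambda i: fisher_information(theta_hat, *items[i]))


# Hypothetical item bank: (discrimination, difficulty) pairs.
bank = [(1.0, -2.0), (1.0, 0.0), (1.0, 2.0)]
first = select_item(0.1, bank, administered=set())
second = select_item(0.1, bank, administered={first})
print(first, second)
```

With equal discriminations the rule picks the item whose difficulty is closest to the current estimate, so an examinee estimated near 0.1 first receives the item with difficulty 0.0.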
