Source author record

Heng Wang

Heng Wang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Catalog footprint

What is connected

27 works

20 topics

4 close collaborators

preprint · 2026 · arXiv

CoSER: A Comprehensive Literary Dataset and Framework for Training and Evaluating LLM Role-Playing and Persona Simulation

Role-playing language agents (RPLAs) have emerged as promising applications of large language models (LLMs). However, simulating established characters presents a challenging task for RPLAs, due to the lack of authentic character datasets and nuanced evaluation methods using such data. In this paper, we present CoSER, a collection of a high-quality dataset, open models, and an evaluation protocol towards effective RPLAs of established characters. The CoSER dataset covers 17,966 characters from 771 renowned books. It provides authentic dialogues with real-world intricacies, as well as diverse data types such as conversation setups, character experiences and internal thoughts. Drawing from acting methodology, we introduce given-circumstance acting for training and evaluating role-playing LLMs, where LLMs sequentially portray multiple characters in book scenes. Using our dataset, we develop CoSER 8B and CoSER 70B, i.e., advanced open role-playing LLMs built on LLaMA-3.1 models. Extensive experiments demonstrate the value of the CoSER dataset for RPLA training, evaluation and retrieval. Moreover, CoSER 70B exhibits state-of-the-art performance surpassing or matching GPT-4o on our evaluation and three existing benchmarks, i.e., achieving 75.80% and 93.47% accuracy on the InCharacter and LifeChoice benchmarks respectively.

preprint · 2026 · arXiv

Thinking Traps in Long Chain-of-Thought: A Measurable Study and Trap-Aware Adaptive Restart

Scaling test-time compute via Long Chain-of-Thought (Long-CoT) significantly enhances reasoning capabilities, yet extended generation does not guarantee correctness: after an early wrong commitment, models may keep elaborating a self-consistent but incorrect prefix. Through fine-grained trajectory analysis, we identify Thinking Traps, prefix-dominant deadlocks where later reflection, alternative attempts, or verification fails to revise the root error. On a curated subset of DAPO-MATH, 89% of failures exhibit such traps. To solve this problem, we introduce TAAR (Trap-Aware Adaptive Restart), a test-time control framework that trains a diagnostic policy to predict two signals from partial trajectories: a trap index for where to truncate and an escape probability for whether and how strongly to intervene. At inference time, TAAR truncates the trajectory before the predicted trap segment and adaptively restarts decoding; for severely trapped cases, it applies stronger perturbations, including higher-temperature resampling and an optional structured reboot suffix. Experiments on challenging mathematical and scientific reasoning benchmarks (AIME24, AIME25, GPQA-Diamond, HMMT25, BRUMO25) show that TAAR improves reasoning performance without fine-tuning base model parameters.

preprint · 2024 · arXiv

Can Language Models Solve Graph Problems in Natural Language?

Large language models (LLMs) are increasingly adopted for a variety of tasks with implicit graphical structures, such as planning in robotics, multi-hop question answering or knowledge probing, structured commonsense reasoning, and more. While LLMs have advanced the state-of-the-art on these tasks with structure implications, whether LLMs could explicitly process textual descriptions of graphs and structures, map them to grounded conceptual spaces, and perform structured operations remains underexplored. To this end, we propose NLGraph (Natural Language Graph), a comprehensive benchmark of graph-based problem solving designed in natural language. NLGraph contains 29,370 problems, covering eight graph reasoning tasks with varying complexity from simple tasks such as connectivity and shortest path up to complex problems such as maximum flow and simulating graph neural networks. We evaluate LLMs (GPT-3/4) with various prompting approaches on the NLGraph benchmark and find that 1) language models do demonstrate preliminary graph reasoning abilities, 2) the benefit of advanced prompting and in-context learning diminishes on more complex graph problems, while 3) LLMs are also (un)surprisingly brittle in the face of spurious correlations in graph and problem settings. We then propose Build-a-Graph Prompting and Algorithmic Prompting, two instruction-based approaches to enhance LLMs in solving natural language graph problems. Build-a-Graph and Algorithmic prompting improve the performance of LLMs on NLGraph by 3.07% to 16.85% across multiple tasks and settings, while how to solve the most complicated graph reasoning tasks in our setup with language models remains an open research question. The NLGraph benchmark and evaluation code are available at https://github.com/Arthur-Heng/NLGraph.
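As a rough illustration of what "graph problems in natural language" can look like, the sketch below renders an edge list as an English sentence and computes the ground-truth answer to a connectivity question with breadth-first search. The sentence template and the function names `describe_graph` and `is_connected` are assumptions made for this example; they are not taken from the NLGraph benchmark or its code.

```python
from collections import deque


def describe_graph(edges):
    """Render an undirected edge list as a plain-English description.

    The wording is a hypothetical template, not the benchmark's own.
    """
    parts = [f"an edge between node {u} and node {v}" for u, v in edges]
    return "In an undirected graph, there is " + ", ".join(parts) + "."


def is_connected(edges, source, target):
    """Return True if a path links source and target (breadth-first search)."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    seen, queue = {source}, deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return False


edges = [(0, 1), (1, 2), (3, 4)]
prompt = describe_graph(edges) + " Is there a path between node 0 and node 2?"
answer = is_connected(edges, 0, 2)
```

A language model would receive `prompt` as text, and its reply could then be scored against `answer`.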

preprint · 2023 · arXiv

One is All: Bridging the Gap Between Neural Radiance Fields Architectures with Progressive Volume Distillation

Neural Radiance Fields (NeRF) methods have proved effective as compact, high-quality and versatile representations for 3D scenes, and enable downstream tasks such as editing, retrieval, navigation, etc. Various neural architectures are vying for the core structure of NeRF, including the plain Multi-Layer Perceptron (MLP), sparse tensors, low-rank tensors, hashtables and their compositions. Each of these representations has its particular set of trade-offs. For example, the hashtable-based representations admit faster training and rendering but their lack of clear geometric meaning hampers downstream tasks like spatial-relation-aware editing. In this paper, we propose Progressive Volume Distillation (PVD), a systematic distillation method that allows any-to-any conversions between different architectures, including MLP, sparse or low-rank tensors, hashtables and their compositions. PVD consequently empowers downstream applications to optimally adapt the neural representations for the task at hand in a post hoc fashion. The conversions are fast, as distillation is progressively performed on different levels of volume representations, from shallower to deeper. We also employ special treatment of density to deal with its specific numerical instability problem. Empirical evidence is presented to validate our method on the NeRF-Synthetic, LLFF and TanksAndTemples datasets. For example, with PVD, an MLP-based NeRF model can be distilled from a hashtable-based Instant-NGP model 10 to 20 times faster than training the original NeRF from scratch, while achieving a superior level of synthesis quality. Code is available at https://github.com/megvii-research/AAAI2023-PVD.

preprint · 2023 · arXiv

Temporal Perceiving Video-Language Pre-training

Video-Language Pre-training models have recently significantly improved various multi-modal downstream tasks. Previous dominant works mainly adopt contrastive learning to achieve global feature alignment across modalities. However, the local associations between videos and texts are not modeled, restricting the pre-training models' generality, especially for tasks requiring the temporal video boundary for certain query texts. This work introduces a novel text-video localization pre-text task to enable fine-grained temporal and semantic alignment such that the trained model can accurately perceive temporal boundaries in videos given the text description. Specifically, text-video localization consists of moment retrieval, which predicts start and end boundaries in videos given the text description, and text localization which matches the subset of texts with the video features. To produce temporal boundaries, frame features in several videos are manually merged into a long video sequence that interacts with a text sequence. With the localization task, our method connects the fine-grained frame representations with the word representations and implicitly distinguishes representations of different instances in the single modality. Notably, comprehensive experimental results show that our method significantly improves the state-of-the-art performance on various benchmarks, covering text-to-video retrieval, video question answering, video captioning, temporal action localization and temporal moment retrieval. The code will be released soon.

preprint · 2022 · arXiv

Canonical Mean Filter for Almost Zero-Shot Multi-Task classification

The support set is key to providing a conditional prior for fast adaptation of the model in few-shot tasks. But the strict form of the support set makes its construction difficult in practical applications. Motivated by ANIL, we rethink the role of adaptation in the feature extractor of CNAPs, a state-of-the-art representative few-shot method. To investigate this role, we design the Almost Zero-Shot (AZS) task, which fixes the support set in place of the common scheme that provides a corresponding support set for the conditional prior of each task. The AZS experimental results suggest that adaptation contributes little in the feature extractor. However, CNAPs are not robust to randomly selected support sets and perform poorly on some datasets of Meta-Dataset because the simple mean operator yields scattered mean embeddings. To enhance the robustness of CNAPs, the Canonical Mean Filter (CMF) module is proposed to make the mean embeddings compact and stable in feature space by mapping the support sets into a canonical form. CMFs make CNAPs robust to any fixed support sets, even random matrices. This property allows CNAPs to remove the mean encoder and the parameter adaptation network at the test stage, while CNAP-CMF on AZS tasks maintains performance comparable to one-shot tasks. This leads to a large parameter reduction: 40.48% of parameters are dropped at the test stage. Also, CNAP-CMF outperforms CNAPs on one-shot tasks because it addresses unstable within-task performance. Classification performance, visualization and clustering results verify that CMFs make CNAPs better and simpler.

preprint · 2022 · arXiv

Direct observation of nodeless superconductivity and phonon modes in electron-doped copper oxide Sr$_{1-x}$Nd$_x$CuO$_2$

The microscopic understanding of high-temperature superconductivity in cuprates has been hindered by the apparent complexity of crystal structures in these materials. We used scanning tunneling microscopy and spectroscopy to study an electron-doped copper oxide compound Sr$_{1-x}$Nd$_x$CuO$_2$ that has only bare cations separating the CuO$_2$ planes and thus the simplest infinite-layer structure among all cuprate superconductors. Tunneling conductance spectra of the major CuO$_2$ planes in the superconducting state revealed direct evidence for a nodeless pairing gap, regardless of variation of its magnitude with the local doping of trivalent neodymium. Furthermore, three distinct bosonic modes are observed as multiple peak-dip-hump features outside the superconducting gaps and their respective energies depend little on the spatially varying gaps. Along with the bosonic modes with energies identical to those of the external, bending and stretching phonons of copper oxides, our findings indicate their origin from lattice vibrations rather than spin excitations.

preprint · 2022 · arXiv

High-throughput decoder of quasi-cyclic LDPC codes with limited precision for continuous-variable quantum key distribution systems

Secret key rates at the Mbps level have been demonstrated for continuous-variable quantum key distribution (CV-QKD) systems, but real-time postprocessing has not been possible because it is limited by the throughput of error correction decoding. In this paper, a high-throughput FPGA-based quasi-cyclic LDPC decoder is proposed and implemented to support Mbps real-time secret key rate generation for CV-QKD for the first time. A residual bit error correction algorithm is used to solve the problem of the high frame error rate (FER) caused by the limited precision of the decoder. Specifically, real-time high-speed decoding for CV-QKD systems with typical code rates of 0.2 and 0.1 is implemented on a commercial FPGA, and throughputs of 360.92 Mbps and 194.65 Mbps are achieved, respectively, which can support real-time secret key rates of 17.97 Mbps and 2.48 Mbps at typical transmission distances of 25 km and 50 km, respectively. The proposed method paves the way for high-rate real-time CV-QKD deployment in secure metropolitan area networks.

preprint · 2022 · arXiv

Open-World Instance Segmentation: Exploiting Pseudo Ground Truth From Learned Pairwise Affinity

Open-world instance segmentation is the task of grouping pixels into object instances without any pre-determined taxonomy. This is challenging, as state-of-the-art methods rely on explicit class semantics obtained from large labeled datasets, and out-of-domain evaluation performance drops significantly. Here we propose a novel approach for mask proposals, Generic Grouping Networks (GGNs), constructed without semantic supervision. Our approach combines a local measure of pixel affinity with instance-level mask supervision, producing a training regimen designed to make the model as generic as the data diversity allows. We introduce a method for predicting Pairwise Affinities (PA), a learned local relationship between pairs of pixels. PA generalizes very well to unseen categories. From PA we construct a large set of pseudo-ground-truth instance masks; combined with human-annotated instance masks we train GGNs and significantly outperform the SOTA on open-world instance segmentation on various benchmarks including COCO, LVIS, ADE20K, and UVO. Code is available on project website: https://sites.google.com/view/generic-grouping/.

preprint · 2022 · arXiv

Secure two-way fiber-optic time transfer against sub-ns asymmetric delay attack

Two-way fiber-optic time transfer is a promising precise time synchronization technique with sub-nanosecond accuracy. However, the asymmetric delay attack is a serious threat that cannot be prevented by any encryption method. In this paper, a dynamic-model-based scheme is proposed to defend against the sub-nanosecond asymmetric delay attack. A threshold is set according to the time difference estimated by a two-state clock model, in which the fixed frequency difference is excluded from the time difference, so that asymmetric delay attacks smaller than the time difference induced by the fixed frequency difference can be detected. Theoretical simulation and experimental demonstration are carried out to show the feasibility of the scheme. A two-way fiber-optic time transfer system with time stability of 24.5 ps, 3.98 ps, and 2.95 ps at averaging times of 1 s, 10 s, and 100 s is demonstrated experimentally under a sub-ns asymmetric delay attack. The proposed method provides a promising secure sub-ns precise time synchronization technique against asymmetric delay attacks.
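To make the threat concrete, the sketch below uses the textbook two-way offset estimate, which assumes equal delays in both directions, and shows that an extra one-way delay biases the estimate by half of the added asymmetry. It does not implement the paper's two-state clock model or its detection threshold. The function names, the picosecond values and the turnaround time are all assumptions chosen for illustration.

```python
def simulate_exchange(true_offset, delay_ab, delay_ba, t1=0.0, turnaround=1000.0):
    """Timestamps of one two-way exchange, in picoseconds.

    true_offset is clock B minus clock A. t1 and t4 are read on A's clock,
    t2 and t3 on B's clock.
    """
    t2 = t1 + delay_ab + true_offset  # arrival at B, read on B's clock
    t3 = t2 + turnaround              # B replies after a fixed turnaround
    t4 = t3 - true_offset + delay_ba  # arrival at A, read on A's clock
    return t1, t2, t3, t4


def estimate_offset(t1, t2, t3, t4):
    """Standard two-way estimate of the B-minus-A offset (assumes symmetric delays)."""
    return ((t2 - t1) - (t4 - t3)) / 2.0


# Symmetric link: the estimate recovers the true offset of 100 ps.
clean = estimate_offset(*simulate_exchange(100.0, 5000.0, 5000.0))

# Hypothetical attack: 400 ps of extra delay in the A-to-B direction only.
attacked = estimate_offset(*simulate_exchange(100.0, 5400.0, 5000.0))
bias = attacked - clean
```

Because both timestamps in each direction remain authentic, encrypting the exchanged messages does not remove this bias, which is why a model-based check on the estimated time difference is needed.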

preprint · 2022 · arXiv

Spatiality-guided Transformer for 3D Dense Captioning on Point Clouds

Dense captioning in 3D point clouds is an emerging vision-and-language task involving object-level 3D scene understanding. Apart from coarse semantic class prediction and bounding box regression as in traditional 3D object detection, 3D dense captioning aims at producing a further and finer instance-level label of natural language description on visual appearance and spatial relations for each scene object of interest. To detect and describe objects in a scene, following the spirit of neural machine translation, we propose a transformer-based encoder-decoder architecture, namely SpaCap3D, to transform objects into descriptions, where we especially investigate the relative spatiality of objects in 3D scenes and design a spatiality-guided encoder via a token-to-token spatial relation learning objective and an object-centric decoder for precise and spatiality-enhanced object caption generation. Evaluated on two benchmark datasets, ScanRefer and ReferIt3D, our proposed SpaCap3D outperforms the baseline method Scan2Cap by 4.94% and 9.61% in CIDEr@0.5IoU, respectively. Our project page with source code and supplementary files is available at https://SpaCap3D.github.io/ .

preprint · 2022 · arXiv

Towards Generalisable Audio Representations for Audio-Visual Navigation

In audio-visual navigation (AVN), an intelligent agent needs to navigate to a constantly sound-making object in complex 3D environments based on its audio and visual perceptions. While existing methods attempt to improve the navigation performance with precisely designed path planning or intricate task settings, none has improved the model generalisation on unheard sounds with task settings unchanged. We thus propose a contrastive learning-based method to tackle this challenge by regularising the audio encoder, where the sound-agnostic goal-driven latent representations can be learnt from various audio signals of different classes. In addition, we consider two data augmentation strategies to enrich the training sounds. We demonstrate that our designs can be easily equipped to existing AVN frameworks to obtain an immediate performance gain (13.4%$\uparrow$ in SPL on Replica and 12.2%$\uparrow$ in SPL on MP3D). Our project is available at https://AV-GeN.github.io/.

preprint · 2021 · arXiv

Single Neuron Segmentation using Graph-based Global Reasoning with Auxiliary Skeleton Loss from 3D Optical Microscope Images

One of the critical steps in improving accurate single neuron reconstruction from three-dimensional (3D) optical microscope images is the neuronal structure segmentation. However, these structures are always hard to segment because of poor image quality. Despite a series of attempts to apply convolutional neural networks (CNNs) to this task, noise and disconnected gaps remain hard to alleviate when the non-local features of graph-like tubular neural structures are neglected. Hence, we present an end-to-end segmentation network by jointly considering the local appearance and the global geometry traits through graph reasoning and a skeleton-based auxiliary loss. The evaluation results on the Janelia dataset from the BigNeuron project demonstrate that our proposed method exceeds the counterpart algorithms in performance.

preprint · 2020 · arXiv

Anomalous Spontaneous Symmetry Breaking in non-Hermitian Systems with Biorthogonal Z2-symmetry

Landau's spontaneous symmetry breaking theory is a fundamental theory that describes the collective behaviors in many-body systems. It is well known that for usual spontaneous symmetry breaking in Hermitian systems, the order-disorder phase transition with gap closing and spontaneous symmetry breaking occur at the same critical point. In this paper, we generalize Landau's spontaneous symmetry breaking theory to non-Hermitian (NH) many-body systems with biorthogonal Z2 symmetry and try to discover certain universal features. We find, surprisingly, that the effect of the NH terms splits the spontaneous biorthogonal Z2 symmetry breaking from a (biorthogonal) order-disorder phase transition with gap closing. The sudden change of similarity for two degenerate ground states indicates a new type of quantum phase transition without gap closing accompanied by spontaneous biorthogonal Z2 symmetry breaking. We take the NH transverse Ising model as an example to investigate the anomalous spontaneous symmetry breaking. The numerical results are consistent with the theoretical predictions.

preprint · 2020 · arXiv

From W-Net to CDGAN: Bi-temporal Change Detection via Deep Learning Techniques

Traditional change detection methods usually follow the image differencing, change feature extraction and classification framework, and their performance is limited by such simple image domain differencing and also the hand-crafted features. Recently, the success of deep convolutional neural networks (CNNs) has widely spread across the whole field of computer vision for their powerful representation abilities. In this paper, we therefore address the remote sensing image change detection problem with deep learning techniques. We firstly propose an end-to-end dual-branch architecture, termed as the W-Net, with each branch taking as input one of the two bi-temporal images as in the traditional change detection models. In this way, CNN features with more powerful representative abilities can be obtained to boost the final detection performance. Also, W-Net performs differencing in the feature domain rather than in the traditional image domain, which greatly alleviates loss of useful information for determining the changes. Furthermore, by reformulating change detection as an image translation problem, we apply the recently popular Generative Adversarial Network (GAN) in which our W-Net serves as the Generator, leading to a new GAN architecture for change detection which we call CDGAN. To train our networks and also facilitate future research, we construct a large scale dataset by collecting images from Google Earth and provide carefully manually annotated ground truths. Experiments show that our proposed methods can provide fine-grained change detection results superior to the existing state-of-the-art baselines.

preprint · 2020 · arXiv

Video Modeling with Correlation Networks

Motion is a salient cue to recognize actions in video. Modern action recognition models leverage motion information either explicitly by using optical flow as input or implicitly by means of 3D convolutional filters that simultaneously capture appearance and motion information. This paper proposes an alternative approach based on a learnable correlation operator that can be used to establish frame-to-frame matches over convolutional feature maps in the different layers of the network. The proposed architecture enables the fusion of this explicit temporal matching information with traditional appearance cues captured by 2D convolution. Our correlation network compares favorably with widely-used 3D CNNs for video modeling, and achieves competitive results over the prominent two-stream network while being much faster to train. We empirically demonstrate that correlation networks produce strong results on a variety of video datasets, and outperform the state of the art on four popular benchmarks for action recognition: Kinetics, Something-Something, Diving48 and Sports1M.

preprint · 2019 · arXiv

Tackling Challenges in Seebeck Coefficient Measurement of Ultra-High Resistance Samples with an AC Technique

Seebeck coefficient is a widely studied semiconductor property. Conventional Seebeck coefficient measurements are based on DC voltage measurement. Normally this is performed on samples with low resistances, below the level of a few Mohm. Meanwhile, certain semiconductors are highly intrinsic and resistive; many examples can be found in optical and photovoltaic materials. The hybrid halide perovskites that have gained extensive attention recently are a good example. Few credible studies exist on the Seebeck coefficient of CH3NH3PbI3, for example. We report here an AC-technique-based Seebeck coefficient measurement, which makes high-quality voltage measurements on samples with resistances up to 100 Gohm. This is achieved through a specifically designed setup to enhance sample isolation and reduce meter loading. As a demonstration, we performed a Seebeck coefficient measurement of a CH3NH3PbI3 thin film in the dark and found S = +550 microV/K. This property of the material has not been successfully studied before.

preprint · 2016 · arXiv

Creating arbitrary quantum vibrational states in a carbon nanotube

We theoretically study the creation of single- and multi-phonon Fock states and arbitrary superpositions of quantum phonon states in a nanomechanical carbon nanotube (CNT) resonator. In our model, a doubly clamped CNT resonator is initialized in the ground state and a single electron is trapped in a quantum dot which is formed by an electric gate potential and brought into the magnetic field of a micro-magnet. The preparation of arbitrary quantum phonon states is based on the coupling between the mechanical motion of the CNT and the electron spin which acts as a non-linearity. We assume that electrical driving pulses with different frequencies are applied on the system. The quantum information is transferred from the spin qubit to the mechanical motion by the spin-phonon coupling and the electron spin qubit can be reset by the single-electron spin resonance. We describe Wigner tomography which can be applied at the end to obtain the phase information of the prepared phonon states.

preprint · 2015 · arXiv

A robust and efficient video representation for action recognition

This paper introduces a state-of-the-art video representation and applies it to efficient action recognition and detection. We first propose to improve the popular dense trajectory features by explicit camera motion estimation. More specifically, we extract feature point matches between frames using SURF descriptors and dense optical flow. The matches are used to estimate a homography with RANSAC. To improve the robustness of homography estimation, a human detector is employed to remove outlier matches from the human body as human motion is not constrained by the camera. Trajectories consistent with the homography are considered as due to camera motion, and thus removed. We also use the homography to cancel out camera motion from the optical flow. This results in significant improvement on motion-based HOF and MBH descriptors. We further explore the recent Fisher vector as an alternative feature encoding approach to the standard bag-of-words histogram, and consider different ways to include spatial layout information in these encodings. We present a large and varied set of evaluations, considering (i) classification of short basic actions on six datasets, (ii) localization of such actions in feature-length movies, and (iii) large-scale recognition of complex events. We find that our improved trajectory features significantly outperform previous dense trajectories, and that Fisher vectors are superior to bag-of-words encodings for video recognition tasks. In all three tasks, we show substantial improvements over the state-of-the-art results.

preprint · 2015 · arXiv

Active Community Detection in Massive Graphs

A canonical problem in graph mining is the detection of dense communities. This problem is exacerbated for a graph with a large order and size -- the number of vertices and edges -- as many community detection algorithms scale poorly. In this work we propose a novel framework for detecting active communities that consist of the most active vertices in massive graphs. The framework is applicable to graphs having billions of vertices and hundreds of billions of edges. Our framework utilizes a parallelizable trimming algorithm based on a locality statistic to filter out inactive vertices, and then clusters the remaining active vertices via spectral decomposition on their similarity matrix. We demonstrate the validity of our method with synthetic Stochastic Block Model graphs, using Adjusted Rand Index as the performance metric. We further demonstrate its practicality and efficiency on a most recent real-world Hyperlink Web graph consisting of over 3.5 billion vertices and 128 billion edges.
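The abstract names the Adjusted Rand Index (ARI) as its performance metric. As a minimal sketch, the index can be computed from the contingency table of two labelings with the standard pair-counting formula. The function name `adjusted_rand_index` and the toy labels are assumptions for this example, and the code is not taken from the paper.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Pair-counting Adjusted Rand Index between two flat clusterings."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("both labelings must cover the same items")
    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        return 1.0  # zero or one item: the labelings trivially agree

    # Pairs of items that share a cluster in both, in the first, in the second.
    both = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    first = sum(comb(c, 2) for c in Counter(labels_true).values())
    second = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = first * second / total_pairs  # agreement expected by chance
    maximum = (first + second) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. a single cluster in both labelings
    return (both - expected) / (maximum - expected)


planted = [0, 0, 1, 1]
relabeled = [1, 1, 0, 0]    # same partition under different label names
interleaved = [0, 1, 0, 1]  # splits every planted community
```

The index is 1 for identical partitions regardless of label names, is close to 0 for random assignments, and can be negative when agreement is worse than chance.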

preprint · 2015 · arXiv

Concept Drift Detection for Streaming Data

Common statistical prediction models often require and assume stationarity in the data. However, in many practical applications, changes in the relationship of the response and predictor variables are regularly observed over time, resulting in the deterioration of the predictive performance of these models. This paper presents Linear Four Rates (LFR), a framework for detecting these concept drifts and subsequently identifying the data points that belong to the new concept (for relearning the model). Unlike conventional concept drift detection approaches, LFR can be applied to both batch and stream data; is not limited by the distribution properties of the response variable (e.g., datasets with imbalanced labels); is independent of the underlying statistical-model; and uses user-specified parameters that are intuitively comprehensible. The performance of LFR is compared to benchmark approaches using both simulated and commonly used public datasets that span the gamut of concept drift types. The results show LFR significantly outperforms benchmark approaches in terms of recall, accuracy and delay in detection of concept drifts across datasets.
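The abstract does not list the four rates that give LFR its name. The sketch below assumes they are the four confusion-matrix rates of a binary classifier (true positive rate, true negative rate, positive predictive value and negative predictive value) and only shows how they could be computed over a window of predictions. It does not reproduce the detection test itself, and `four_rates` is a hypothetical helper name.

```python
def four_rates(y_true, y_pred):
    """Confusion-matrix rates for binary labels (1 = positive, 0 = negative).

    Assumption: these four rates are the quantities a drift detector tracks.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(numerator, denominator):
        # An empty denominator means the rate is undefined for this window.
        return numerator / denominator if denominator else float("nan")

    return {
        "tpr": ratio(tp, tp + fn),
        "tnr": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Two hypothetical windows of the same stream, before and after a change.
before = four_rates([1, 1, 0, 0], [1, 1, 0, 0])
after = four_rates([1, 1, 0, 0], [1, 0, 0, 0])
```

Comparing `before` and `after` shows the true positive rate and the negative predictive value falling while the other two rates stay at 1, the kind of shift a rate-based detector would need to flag.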

preprint · 2015 · arXiv

Face Aging Effect Simulation using Hidden Factor Analysis Joint Sparse Representation

Face aging simulation has received increasing attention, yet it remains a challenge to generate convincing and natural age-progressed face images. In this paper, we present a novel approach to this issue by using hidden factor analysis joint sparse representation. In contrast to the majority of methods in the literature, which handle the facial texture as a whole, the proposed aging approach separately models the person-specific facial properties that tend to be stable over a relatively long period and the age-specific clues that change gradually over time. It then transforms only the age component to a target age group via sparse reconstruction, yielding aging effects, and finally combines it with the identity component to obtain the aged face. Experiments are carried out on three aging databases, and the results clearly demonstrate the effectiveness and robustness of the proposed method in rendering a face with aging effects. Additionally, a series of evaluations prove its validity with respect to identity preservation and aging effect generation.

preprint · 2015 · arXiv

Mechanically induced two-qubit gates and maximally entangled states for single electron spins in a carbon nanotube

We theoretically analyze a system where two electrons are trapped separately in two quantum dots on a suspended carbon nanotube (CNT), subject to external ac electric driving. An indirect mechanically-induced coupling of two distant single electron spins is induced by the interaction between the spins and the mechanical motion of the CNT. We show that a two-qubit iSWAP gate and arbitrary single-qubit gates can be obtained from the intrinsic spin-orbit coupling. Combining the iSWAP gate and single-qubit gates, maximally entangled states of two spins can be generated in a single step by varying the frequency and the strength of the external electric driving field. The spin-phonon coupling can be turned off by electrostatically shifting the electron wave function on the nanotube.

preprint · 2014 · arXiv

Mechanically induced spin resonance in a carbon nanotube

The electron spin is a promising qubit candidate for quantum computation and quantum information. Here we propose and analyze a mechanically-induced single electron spin resonance, which amounts to a rotation of the spin about the $x$-axis in a suspended carbon nanotube. The effect is based on the coupling between the spin and the mechanical degree of freedom due to the intrinsic curvature-induced spin-orbit coupling. A rotation about the $z$-axis is obtained by the off-resonant external electric driving field. Arbitrary-angle rotations of the single electron spin about any axis in the $x$-$z$ plane can be obtained with a single operation by varying the frequency and the strength of the external electric driving field. With multiple steps combining the rotations about the $x$- and $z$-axes, arbitrary-angle rotations about arbitrary axes can be constructed, which implies that any single-qubit gate of the electron spin qubit can be performed. We simulate the system numerically using a master equation with realistic parameters.
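The claim that rotations about the $x$- and $z$-axes suffice to build rotations about other axes can be checked numerically. The sketch below assumes the common convention $R_n(\theta) = \exp(-i\theta\sigma_n/2)$ and verifies one instance: a $y$-rotation obtained by sandwiching an $x$-rotation between two $z$-rotations. The helper names and the test angle are illustrative, and the paper's own pulse sequences are not reproduced here.

```python
import cmath
import math


def rot_x(theta):
    """Spin-1/2 rotation about x by angle theta: exp(-i * theta * sigma_x / 2)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -1j * s], [-1j * s, c]]


def rot_y(theta):
    """Spin-1/2 rotation about y by angle theta: exp(-i * theta * sigma_y / 2)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -s], [s, c]]


def rot_z(phi):
    """Spin-1/2 rotation about z by angle phi: exp(-i * phi * sigma_z / 2)."""
    return [[cmath.exp(-1j * phi / 2), 0], [0, cmath.exp(1j * phi / 2)]]


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def close(a, b, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


theta = 0.7  # arbitrary test angle in radians
# R_z(pi/2) * R_x(theta) * R_z(-pi/2) should equal R_y(theta).
composed = matmul(matmul(rot_z(math.pi / 2), rot_x(theta)), rot_z(-math.pi / 2))
matches = close(composed, rot_y(theta))
```

Since rotations about $x$, $y$ and $z$ generate all of SU(2), this three-step construction is one way to see why $x$- and $z$-rotations alone are enough for any single-qubit gate.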

preprint, 2014, arXiv

MICA: A fast short-read aligner that takes full advantage of Intel Many Integrated Core Architecture (MIC)

Background: Short-read aligners have recently gained a lot of speed by exploiting the massive parallelism of GPUs. A rising alternative to the GPU is Intel MIC; Tianhe-2, the supercomputer currently at the top of the TOP500 list, is built with 48,000 MIC boards to offer ~55 PFLOPS. The CPU-like architecture of MIC allows CPU-based software to be parallelized easily; however, the performance is often inferior to that of GPU counterparts, as an MIC board contains only ~60 cores (while a GPU board typically has over a thousand cores). Results: To better utilize MIC-enabled computers for NGS data analysis, we developed a new short-read aligner, MICA, that is optimized in view of MIC's limitations and the extra parallelism inside each MIC core. Experiments on aligning 150bp paired-end reads show that MICA using one MIC board is 4.9 times faster than BWA-MEM (using 6 cores of a top-end CPU), and slightly faster than SOAP3-dp (using a GPU). Furthermore, MICA's simplicity allows very efficient scale-up when multiple MIC boards are used in a node (3 cards give a 14.1-fold speedup over BWA-MEM). Summary: MICA can be readily used by MIC-enabled supercomputers for production purposes. We have tested MICA on Tianhe-2 with 90 WGS samples (17.47 Tera-bases), which can be aligned in an hour using fewer than 400 nodes. MICA has impressive performance even though the current MIC is at its initial stage of development (the next generation of MIC has been announced for release in late 2014).
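The "very efficient scale-up" claim can be made concrete with a small calculation from the two speedup figures quoted in the abstract. The sketch below assumes the usual definition of parallel scaling efficiency (measured multi-board speedup divided by ideal linear speedup); the function name is illustrative.

```python
def scaling_efficiency(speedup_one_board, speedup_n_boards, n_boards):
    """Measured speedup relative to ideal linear scaling across boards."""
    ideal = n_boards * speedup_one_board
    return speedup_n_boards / ideal

# Figures from the abstract: 4.9x with one MIC board, 14.1x with three,
# both measured against BWA-MEM on 6 CPU cores.
efficiency = scaling_efficiency(4.9, 14.1, 3)
print(f"{efficiency:.1%}")
```

Under this definition, three boards deliver roughly 96% of the ideal threefold gain over a single board.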

preprint, 2012, arXiv

Nanomechanical read-out of a single spin

The spin of a single electron in a suspended carbon nanotube can be read out by using its coupling to the nanomechanical motion of the nanotube. To show this, we consider a single electron confined within a quantum dot formed by the suspended carbon nanotube. The spin-orbit interaction induces a coupling between the spin and one of the bending modes of the suspended part of the nanotube. We calculate the response of the system to pulsed external driving of the mechanical motion using a Jaynes-Cummings model. To account for resonator damping, we solve a quantum master equation, with parameters comparable to those used in recent experiments, and show how information about the spin state of the system can be acquired by measuring its mechanical motion. The latter can be detected by observing the current through a nearby charge detector.
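For orientation, the textbook Jaynes-Cummings model without driving or damping already shows how the spin and one resonator mode exchange a single excitation. The sketch below evaluates the standard closed-form transfer probability in the one-excitation subspace, assuming $\hbar = 1$ and an interaction of the form $g(a\sigma_+ + a^\dagger\sigma_-)$. It is a simplified illustration, not the driven, damped master-equation calculation described in the abstract, and the parameter values are arbitrary.

```python
import math

def transfer_probability(g, detuning, t):
    """Probability that an excitation starting on the spin is found in the
    resonator mode after time t (undamped, undriven Jaynes-Cummings model)."""
    omega = math.sqrt(detuning ** 2 + 4 * g ** 2)   # generalized Rabi frequency
    return (4 * g ** 2 / omega ** 2) * math.sin(omega * t / 2) ** 2

g = 1.0  # coupling strength in arbitrary units (assumed value)

# On resonance the excitation is fully transferred at t = pi / (2 g).
p_resonant = transfer_probability(g, 0.0, math.pi / (2 * g))

# Off resonance the maximum transfer is reduced to 4 g^2 / (detuning^2 + 4 g^2).
detuning = 2.0
omega = math.sqrt(detuning ** 2 + 4 * g ** 2)
p_detuned_max = transfer_probability(g, detuning, math.pi / omega)

print(p_resonant, p_detuned_max)
```

Because the exchange depends on the spin state, the resonator's motion carries information about the spin, which is the basic idea behind a mechanical read-out.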

preprint, 2011, arXiv

Low Effective Mass Leading to High Thermoelectric Performance

Achieving a high Seebeck coefficient by creating a large density of states (DOS) around the Fermi level, through either electronic structure modification or manipulation of nanostructures, is commonly considered a route to advanced thermoelectrics. However, a large density of states due to flat bands leads to a large effective mass, which simultaneously decreases the mobility. In fact, the net effect of a high effective mass is a lower thermoelectric figure of merit when the carriers are predominantly scattered by acoustic phonons, according to the deformation potential theory of Bardeen and Shockley. We demonstrate the beneficial effect of a light effective mass leading to a high power factor in n-type thermoelectric PbTe, where doping and temperature can be used to tune the effective mass. This clear demonstration of the deformation potential theory applied to thermoelectrics shows that the guiding principle for band structure engineering should be a low effective mass along the transport direction.
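The trade-off described above can be sketched with the standard scaling relations for a single isotropic parabolic band with acoustic-phonon scattering. This is a simplified textbook form written under those assumptions, not the full multi-valley treatment a real material such as PbTe requires. Here $m^*$ is the effective mass, $\Xi$ the deformation potential, $C_l$ the longitudinal elastic constant, $n$ the carrier density, and $\mathrm{PF}$ the power factor.

```latex
% Deformation-potential (Bardeen-Shockley) mobility for acoustic-phonon scattering
\mu \;\propto\; \frac{C_l}{\Xi^{2}\,(m^{*})^{5/2}\,T^{3/2}}

% Power factor in terms of Seebeck coefficient S and conductivity \sigma = n e \mu
\mathrm{PF} \;=\; S^{2}\sigma \;=\; S^{2}\, n\, e\, \mu

% At optimized doping the attainable power factor scales with (m^*)^{3/2} \mu, so
\mathrm{PF}_{\max} \;\propto\; (m^{*})^{3/2}\,\mu \;\propto\; \frac{1}{m^{*}}
```

In this simplified picture, the gain in Seebeck coefficient from a heavier band is outweighed by the loss in mobility, which is why a lighter effective mass raises the attainable power factor.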

27 published items

CoSER: A Comprehensive Literary Dataset and Framework for Training and Evaluating LLM Role-Playing and Persona Simulation

Thinking Traps in Long Chain-of-Thought: A Measurable Study and Trap-Aware Adaptive Restart

Can Language Models Solve Graph Problems in Natural Language?

One is All: Bridging the Gap Between Neural Radiance Fields Architectures with Progressive Volume Distillation

Temporal Perceiving Video-Language Pre-training

Canonical Mean Filter for Almost Zero-Shot Multi-Task classification

Direct observation of nodeless superconductivity and phonon modes in electron-doped copper oxide Sr$_{1-x}$Nd$_x$CuO$_2$

High-throughput decoder of quasi-cyclic LDPC codes with limited precision for continuous-variable quantum key distribution systems

Open-World Instance Segmentation: Exploiting Pseudo Ground Truth From Learned Pairwise Affinity

Secure two-way fiber-optic time transfer against sub-ns asymmetric delay attack

Spatiality-guided Transformer for 3D Dense Captioning on Point Clouds

Towards Generalisable Audio Representations for Audio-Visual Navigation

Single Neuron Segmentation using Graph-based Global Reasoning with Auxiliary Skeleton Loss from 3D Optical Microscope Images

Anomalous Spontaneous Symmetry Breaking in non-Hermitian Systems with Biorthogonal Z2-symmetry

From W-Net to CDGAN: Bi-temporal Change Detection via Deep Learning Techniques

Video Modeling with Correlation Networks

Tackling Challenges in Seebeck Coefficient Measurement of Ultra-High Resistance Samples with an AC Technique

Creating arbitrary quantum vibrational states in a carbon nanotube

A robust and efficient video representation for action recognition

Active Community Detection in Massive Graphs

Concept Drift Detection for Streaming Data

Face Aging Effect Simulation using Hidden Factor Analysis Joint Sparse Representation

Mechanically induced two-qubit gates and maximally entangled states for single electron spins in a carbon nanotube

Mechanically induced spin resonance in a carbon nanotube

MICA: A fast short-read aligner that takes full advantage of Intel Many Integrated Core Architecture (MIC)

Nanomechanical read-out of a single spin

Low Effective Mass Leading to High Thermoelectric Performance