Source author record

Wei Tang

Wei Tang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Computer Vision, hep-ex, physics.ins-det, eess.IV, Emerging Technologies, quant-ph, Applications, Artificial Intelligence, Computation and Language, cond-mat.mtrl-sci, Cryptography and Security, Graphics, Information Retrieval, Information Theory, Machine Learning, math.IT, Networking and Internet Architecture, nucl-ex, Software Engineering

Catalog footprint

What is connected

16 works

19 topics

4 close collaborators

preprint, 2026, arXiv

Awaking Spatial Intelligence in Unified Multimodal Understanding and Generation

We present JoyAI-Image, a unified multimodal foundation model for visual understanding, text-to-image generation, and instruction-guided image editing. JoyAI-Image couples a spatially enhanced Multimodal Large Language Model (MLLM) with a Multimodal Diffusion Transformer (MMDiT), allowing perception and generation to interact through a shared multimodal interface. Around this architecture, we build a scalable training recipe that combines unified instruction tuning, long-text rendering supervision, spatially grounded data, and both general and spatial editing signals. This design gives the model broad multimodal capability while strengthening geometry-aware reasoning and controllable visual synthesis. Experiments across understanding, generation, long-text rendering, and editing benchmarks show that JoyAI-Image achieves state-of-the-art or highly competitive performance. More importantly, the bidirectional loop between enhanced understanding, controllable spatial editing, and novel-view-assisted reasoning enables the model to move beyond general visual competence toward stronger spatial intelligence. These results suggest a promising path for unified visual models in downstream applications such as vision-language-action systems and world models.

preprint, 2022, arXiv

Cutting Quantum Circuits to Run on Quantum and Classical Platforms

Quantum computing (QC) offers a new computing paradigm that has the potential to provide significant speedups over classical computing. Each additional qubit doubles the size of the computational state space available to a quantum algorithm. Such exponentially expanding reach underlies QC's power, but at the same time puts demanding requirements on quantum processing unit (QPU) hardware. On the other hand, purely classical simulations of quantum circuits on either a central processing unit (CPU) or a graphics processing unit (GPU) scale poorly, as they quickly become bottlenecked by runtime and memory. This paper introduces CutQC, a scalable hybrid computing approach that distributes a large quantum circuit onto quantum (QPU) and classical (CPU or GPU) platforms for co-processing. CutQC demonstrates evaluation of quantum circuits that are larger than the limit of QPU or classical simulation, and achieves much higher quantum circuit evaluation fidelity than large NISQ devices achieve in real-system runs.
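
The statement that each additional qubit doubles the state space can be made concrete with a small calculation. The sketch below is not taken from the paper; it assumes a dense state-vector simulation that stores one complex amplitude per basis state at 16 bytes each (double-precision complex), and the function name `statevector_bytes` is hypothetical.

```python
def statevector_bytes(num_qubits: int, bytes_per_amplitude: int = 16) -> int:
    """Memory needed to hold a dense state vector of `num_qubits` qubits.

    Assumption (not from the source): one complex amplitude per basis state,
    stored as two 64-bit floats, i.e. 16 bytes per amplitude.
    """
    if num_qubits < 0:
        raise ValueError("num_qubits must be non-negative")
    # An n-qubit register has 2**n basis states.
    return (2 ** num_qubits) * bytes_per_amplitude


if __name__ == "__main__":
    for n in (10, 20, 30, 40):
        gib = statevector_bytes(n) / 2 ** 30
        print(f"{n} qubits: {gib:.6f} GiB")
```

Under this assumption, adding one qubit always doubles the memory footprint, which is one way to see why purely classical simulation becomes memory-bound quickly.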

preprint, 2022, arXiv

Learning to Refactor Action and Co-occurrence Features for Temporal Action Localization

The main challenge of Temporal Action Localization is to retrieve subtle human actions from various co-occurring ingredients, e.g., context and background, in an untrimmed video. While prior approaches have achieved substantial progress by devising advanced action detectors, they still suffer from these co-occurring ingredients, which often dominate the actual action content in videos. In this paper, we explore two orthogonal but complementary aspects of a video snippet, i.e., the action features and the co-occurrence features. In particular, we develop a novel auxiliary task by decoupling these two types of features within a video snippet and recombining them to generate a new feature representation with more salient action information for accurate action localization. We term our method RefactorNet, which first explicitly factorizes the action content and regularizes its co-occurrence features, and then synthesizes a new action-dominated video representation. Extensive experimental results and ablation studies on THUMOS14 and ActivityNet v1.3 demonstrate that our new representation, combined with a simple action detector, can significantly improve the action localization performance.

preprint, 2022, arXiv

LibDB: An Effective and Efficient Framework for Detecting Third-Party Libraries in Binaries

Third-party libraries (TPLs) are reused frequently in software applications for reducing development cost. However, they could introduce security risks as well. Many TPL detection methods have been proposed to detect TPL reuse in Android bytecode or in source code. This paper focuses on detecting TPL reuse in binary code, which is a more challenging task. For a detection target in binary form, libraries may be compiled and linked to separate dynamic-link files or built into a fused binary that contains multiple libraries and project-specific code. This could result in fewer available code features and lower the effectiveness of feature engineering. In this paper, we propose a binary TPL reuse detection framework, LibDB, which can effectively and efficiently detect imported TPLs even in stripped and fused binaries. In addition to the basic and coarse-grained features (string literals and exported function names), LibDB utilizes function contents as a new type of feature. It embeds all functions in a binary file to low-dimensional representations with a trained neural network. It further adopts a function call graph-based comparison method to improve the accuracy of the detection. LibDB is able to support version identification of TPLs contained in the detection target, which is not considered by existing detection methods. To evaluate the performance of LibDB, we construct three datasets for binary-based TPL reuse detection. Our experimental results show that LibDB is more accurate and efficient than state-of-the-art tools on the binary TPL detection task and the version identification task. Our datasets and source code used in this work are anonymously available at https://github.com/DeepSoftwareAnalytics/LibDB.

preprint, 2022, arXiv

ScaleQC: A Scalable Framework for Hybrid Computation on Quantum and Classical Processors

A quantum processing unit (QPU) has to satisfy highly demanding quantity and quality requirements on its qubits to produce accurate results for problems at useful scales. Furthermore, classical simulations of quantum circuits generally do not scale. Instead, quantum circuit cutting techniques cut and distribute a large quantum circuit into multiple smaller subcircuits feasible for less powerful QPUs. However, the classical post-processing incurred from the cutting introduces runtime and memory bottlenecks. Our tool, called ScaleQC, addresses the bottlenecks by developing novel algorithmic techniques including (1) a quantum states merging framework that quickly locates the solution states of large quantum circuits; (2) an automatic solver that cuts complex quantum circuits to fit on less powerful QPUs; and (3) tensor-network-based post-processing that minimizes the classical overhead. Our experiments demonstrate both QPU requirement advantages over purely quantum platforms and runtime advantages over purely classical platforms for benchmarks up to 1000 qubits.

preprint, 2022, arXiv

Time-aware Path Reasoning on Knowledge Graph for Recommendation

Reasoning on knowledge graphs (KGs) has been studied for explainable recommendation due to its ability to provide explicit explanations. However, current KG-based explainable recommendation methods unfortunately ignore temporal information (such as purchase time, recommendation time, etc.), which may result in unsuitable explanations. In this work, we propose a novel Time-aware Path reasoning for Recommendation (TPRec for short) method, which leverages the potential of temporal information to offer better recommendation with plausible explanations. First, we present an efficient time-aware interaction relation extraction component to construct a collaborative knowledge graph with time-aware interactions (TCKG for short), and then introduce a novel time-aware path reasoning method for recommendation. We conduct extensive experiments on three real-world datasets. The results demonstrate that the proposed TPRec can successfully employ TCKG to achieve substantial gains and improve the quality of explainable recommendation.

preprint, 2020, arXiv

Combining Visible Light and Infrared Imaging for Efficient Detection of Respiratory Infections such as COVID-19 on Portable Device

Coronavirus Disease 2019 (COVID-19) has become a serious global epidemic in the past few months and caused huge losses to human society worldwide. For such a large-scale epidemic, early detection and isolation of potential virus carriers is essential to curb the spread of the epidemic. Recent studies have shown that one important feature of COVID-19 is the abnormal respiratory status caused by viral infections. During the epidemic, many people tend to wear masks to reduce the risk of getting sick. Therefore, in this paper, we propose a portable non-contact method to screen the health condition of people wearing masks through analysis of their respiratory characteristics. The device mainly consists of a FLIR One thermal camera and an Android phone. This may help identify potential COVID-19 patients in practical scenarios such as pre-inspection in schools and hospitals. In this work, we perform the health screening through the combination of the RGB and thermal videos obtained from the dual-mode camera and a deep learning architecture. We first develop a respiratory data capture technique for people wearing masks by using face recognition. Then, a bidirectional GRU neural network with an attention mechanism is applied to the respiratory data to obtain the health screening result. The results of validation experiments show that our model can identify respiratory health status with an accuracy of 83.7% on the real-world dataset. The abnormal respiratory data and part of the normal respiratory data were collected from Ruijin Hospital Affiliated to The Shanghai Jiao Tong University Medical School. Other normal respiratory data were obtained from healthy people around our researchers. This work demonstrates that the proposed portable and intelligent health screening device can be used as a pre-scan method for respiratory infections, which may help fight the current COVID-19 epidemic.

preprint, 2020, arXiv

MsCGAN: Multi-scale Conditional Generative Adversarial Networks for Person Image Generation

Synthesizing high-quality person images with arbitrary poses is challenging. In this paper, we propose a novel Multi-scale Conditional Generative Adversarial Network (MsCGAN), aiming to convert the input conditional person image to a synthetic image of any given target pose, whose appearance and texture are consistent with the input image. MsCGAN is a multi-scale adversarial network consisting of two generators and two discriminators. One generator transforms the conditional person image into a coarse image of the target pose globally, and the other enhances the detailed quality of the synthetic person image through a local reinforcement network. The outputs of the two generators are then merged into a synthetic, discriminant and high-resolution image. On the other hand, the synthetic image is downsampled to multiple resolutions as the input to multi-scale discriminator networks. The proposed multi-scale generators and discriminators, which handle different levels of visual features, benefit the synthesis of high-resolution person images with realistic appearance and texture. Experiments are conducted on the Market-1501 and DeepFashion datasets to evaluate the proposed model, and both qualitative and quantitative results demonstrate the superior performance of the proposed MsCGAN.

preprint, 2020, arXiv

Multi-label Zero-shot Classification by Learning to Transfer from External Knowledge

Multi-label zero-shot classification aims to predict multiple unseen class labels for an input image. It is more challenging than its single-label counterpart. On one hand, the unconstrained number of labels assigned to each image makes the model more easily overfit to those seen classes. On the other hand, there is a large semantic gap between seen and unseen classes in the existing multi-label classification datasets. To address these difficult issues, this paper introduces a novel multi-label zero-shot classification framework by learning to transfer from external knowledge. We observe that ImageNet is commonly used to pretrain the feature extractor and has a large and fine-grained label space. This motivates us to exploit it as external knowledge to bridge the seen and unseen classes and promote generalization. Specifically, we construct a knowledge graph including not only classes from the target dataset but also those from ImageNet. Since ImageNet labels are not available in the target dataset, we propose a novel PosVAE module to infer their initial states in the extended knowledge graph. Then we design a relational graph convolutional network (RGCN) to propagate information among classes and achieve knowledge transfer. Experimental results on two benchmark datasets demonstrate the effectiveness of the proposed approach.

preprint, 2020, arXiv

Scalable Double Regularization for 3D Nano-CT Reconstruction

Nano-CT (computerized tomography) has emerged as a non-destructive high-resolution cross-sectional imaging technique to effectively study the sub-$μ$m pore structure of shale, which is of fundamental importance to the evaluation and development of shale oil and gas. Nano-CT poses unique challenges to the inverse problem of reconstructing the 3D structure due to the lower signal-to-noise ratio (than Micro-CT) at the nano-scale, increased sensitivity to the misaligned geometry caused by the movement of object manipulator, limited sample size, and a larger volume of data at higher resolution. In this paper, we propose a scalable double regularization (SDR) method to utilize the entire dataset for simultaneous 3D structural reconstruction across slices through total variation regularization within slices and $L_1$ regularization between adjacent slices. SDR allows information borrowing both within and between slices, contrasting with the traditional methods that usually build on slice by slice reconstruction. We develop a scalable and memory-efficient algorithm by exploiting the systematic sparsity and consistent geometry induced by such Nano-CT data. We illustrate the proposed method using synthetic data and two Nano-CT imaging datasets of Jiulaodong (JLD) shale and Longmaxi (LMX) shale acquired in the Sichuan Basin. These numerical experiments show that the proposed method substantially outperforms selected alternatives both visually and quantitatively.
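
One way to read the double regularization described above is as a single objective over all slices. The form below is only a sketch under stated assumptions, not the paper's formulation: the symbols $x_k$ (image of slice $k$), $A_k$ (projection operator), $b_k$ (measured projections), $K$ (number of slices), and the weights $\lambda_1, \lambda_2$ are all hypothetical, as are the least-squares data term and the choice to apply the $L_1$ penalty to differences of adjacent slices.

```latex
\min_{x_1,\dots,x_K}\;
\sum_{k=1}^{K} \tfrac{1}{2}\,\lVert A_k x_k - b_k \rVert_2^2
\;+\; \lambda_1 \sum_{k=1}^{K} \mathrm{TV}(x_k)
\;+\; \lambda_2 \sum_{k=1}^{K-1} \lVert x_{k+1} - x_k \rVert_1
```

In this reading, the total variation term shares information within a slice, the $L_1$ term shares information between neighboring slices, and setting $\lambda_2 = 0$ recovers independent slice-by-slice reconstruction.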

preprint, 2016, arXiv

A 20-Liter Test Stand with Gas Purification for Liquid Argon Research

We describe the design of a 20-liter test stand constructed to study fundamental properties of liquid argon (LAr). This system utilizes a simple, cost-effective gaseous argon (GAr) purification to achieve high purity, which is necessary to study electron transport properties in LAr. An electron drift stack up to 25 cm in length is constructed to study electron drift, diffusion, and attachment at various electric fields. A gold photocathode and a pulsed laser are used as a bright electron source. The operational performance of this system is reported.

preprint, 2016, arXiv

Measurement of Longitudinal Electron Diffusion in Liquid Argon

We report the measurement of longitudinal electron diffusion coefficients in liquid argon for electric fields between 100 and 2000 V/cm with a gold photocathode as a bright electron source. The measurement principle, apparatus, and data analysis are described. Our results, which are consistent with previous measurements in the region between 100 and 350 V/cm [1], are systematically higher than the prediction of Atrazhev-Timoshkin [2], and represent the world's best measurement in the region between 350 and 2000 V/cm. The quantum efficiency of the gold photocathode, the drift velocity and longitudinal diffusion coefficients in gaseous argon are also presented.

preprint, 2015, arXiv

Recent Results from the Daya Bay Neutrino Experiment

The Daya Bay neutrino experiment updated its oscillation analysis results in 2015 with 621 days of data, which has 3.6 times more statistics than the previous publication in 2014. The relative $\bar{ν}_{e}$ rate and spectrum measurement between the near and far detectors yielded the best fit values of $\sin^{2}2θ_{13}$ = 0.084 $\pm$ 0.005 and $|Δm_{ee}^{2}|$ = (2.42 $\pm$ 0.11) $\times$ 10$^{-3}$ eV$^{2}$. This is currently the most precise measurement of $\sin^{2}2θ_{13}$ in the world. The measurement of $|Δm_{ee}^{2}|$ also has a precision that is comparable to the measurements from the MINOS and T2K experiments in 2014. Daya Bay also performed several other analyses, such as the search for a light sterile neutrino in the 3+1 neutrino framework and the measurements of the absolute reactor anti-neutrino flux and spectrum.
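
To give a feel for what the quoted best-fit values imply, the sketch below evaluates the commonly used short-baseline approximation of the electron antineutrino survival probability, $P \approx 1 - \sin^{2}2θ_{13}\,\sin^{2}(1.267\,Δm_{ee}^{2}\,L/E)$, with $L$ in meters, $E$ in MeV and $Δm_{ee}^{2}$ in eV$^{2}$. This is an illustration, not the experiment's analysis: it omits the slower solar oscillation term, the function name is hypothetical, and the 1600 m baseline and 4 MeV energy in the usage lines are illustrative assumptions rather than values from the abstract.

```python
import math


def survival_probability(
    baseline_m: float,
    energy_mev: float,
    sin2_2theta13: float = 0.084,   # best-fit value quoted in the abstract
    dm2_ee_ev2: float = 2.42e-3,    # best-fit value quoted in the abstract
) -> float:
    """Approximate electron antineutrino survival probability.

    Short-baseline approximation only: the solar (theta_12) term is omitted.
    Units: baseline in meters, energy in MeV, mass splitting in eV^2.
    """
    if energy_mev <= 0:
        raise ValueError("energy_mev must be positive")
    phase = 1.267 * dm2_ee_ev2 * baseline_m / energy_mev
    return 1.0 - sin2_2theta13 * math.sin(phase) ** 2


if __name__ == "__main__":
    # Illustrative inputs (assumed, not from the source).
    print(survival_probability(baseline_m=1600.0, energy_mev=4.0))
    print(survival_probability(baseline_m=0.0, energy_mev=4.0))
```

In this approximation the deficit can never exceed $\sin^{2}2θ_{13}$, so with the quoted best fit the survival probability stays above about 0.916.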

preprint, 2014, arXiv

Applications of Compressed Sensing in Communications Networks

This paper presents a tutorial on compressed sensing (CS) applications in communications networks. Shannon's sampling theorem states that to recover a signal, the sampling rate must be at least the Nyquist rate. CS is based on the surprising fact that to recover a signal that is sparse in certain representations, one can sample at a rate far below the Nyquist rate. Since its inception in 2006, CS has attracted much interest in the research community and found wide-ranging applications in astronomy, biology, communications, image and video processing, medicine, and radar. CS has also found successful applications in communications networks, where it has been applied to the detection and estimation of wireless signals, source coding, multi-access channels, data collection in sensor networks, and network monitoring. In many cases, CS was shown to bring performance gains on the order of 10X. We believe this is just the beginning of CS applications in communications networks, and the future will see even more fruitful applications of CS in our field.

preprint, 2012, arXiv

Measurement of the generalized form factors near threshold via $γ^* p \to nπ^+$ at high $Q^2$

We report the first extraction of the pion-nucleon multipoles near the production threshold for the $nπ^+$ channel at relatively high momentum transfer ($Q^2$ up to 4.2 $\rm{GeV^2}$). The dominance of the s-wave transverse multipole ($E_{0+}$), expected in this region, allowed us to access the generalized form factor $G_1$ within the light-cone sum rule (LCSR) framework as well as the axial form factor $G_A$. The data analyzed in this work were collected by the nearly $4π$ CEBAF Large Acceptance Spectrometer (CLAS) using a 5.754 $\rm{GeV}$ electron beam on a proton target. The differential cross section and the $π-N$-multipole $E_{0+}/G_D$ were measured using two different methods, the LCSR and a direct multipole fit. The results from the two methods are found to be consistent and almost $Q^2$ independent.

preprint, 2011, arXiv

Structural and Magnetic Properties of Sm(Co0.7Fe0.1Ni0.12Zr0.04B0.04)7.5 Melt Spun Isotropic and Anisotropic Ribbons

We have investigated the structural and magnetic properties of Sm(Co0.7Fe0.1Ni0.12Zr0.04B0.04)7.5 melt spun ribbons. Samples were arc melted and then melt spun at wheel speeds from 37 m/s to 55 m/s to obtain ribbons for powdering. Annealing was performed in an argon atmosphere for (30 to 75) min at (600 to 870) °C. In as-spun ribbons, the hexagonal SmCo7 crystal structure (TbCu7-type) has been determined from x-ray diffraction patterns, while fcc-Co has been identified as a secondary phase. After annealing, the 1:7 phase of the as-spun ribbons transforms into 2:17 and 1:5 phases. X-ray patterns for as-milled powders exhibit very broad peaks, making it difficult to identify a precise structure, but show the 1:7 structure after annealing at low temperature (650 °C). TEM analysis shows a homogeneous nanocrystalline microstructure with an average grain size of (30 to 80) nm. Coercivity values of (15 to 27) kOe are obtained from hysteresis loops traced up to a field of 5 T. The coercivity decreases as temperature increases, but it maintains values higher than 5 kOe at 380 °C. The maximum energy product at room temperature increases, to as high as 7.2 MGOe, for melt-spun isotropic ribbons produced at higher wheel speeds. Anisotropic ribbons have a maximum energy product close to 12 MGOe.
