Researcher profile

Junjie Wu

Junjie Wu contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
15works
0followers
14topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

15 published item(s)

preprint2026arXiv

Direct Detection of Type II-P Supernova Progenitors with the Euclid and CSST Surveys

A central goal in supernova (SN) research is to identify and characterize their progenitors. However, this is very difficult due to the limited archival images with sufficient depth and spatial resolution required for direct progenitor detection and due to the circumstellar dust which often biases the estimate of their intrinsic parameters. This field will be revolutionized by Euclid and the upcoming Chinese Space Station Survey Telescope (CSST), which conduct deep, wide-field, high-resolution and multi-band imaging surveys. We analyze their detection capability by comparing the model magnitudes of red supergiant (RSG) progenitors with the detection limits under different conditions, and we estimate the annual detection rates with Monte-Carlo simulations. We explore how to recover the intrinsic properties of SN progenitors with the help of radiation transfer calculations in circumstellar dust. We find the optical and near-infrared filters of the Euclid and CSST are highly effective for detecting RSG progenitors. We predict that archival images from the completed 2 surveys will enable $\lesssim13$ (or 24) progenitor detections per year within the mass range of 8--16 (or 8--25)M_\odot, an order of magnitude higher than the current detection rate of $\sim1$ detection per year. In the presence of circumstellar dust, the emerging spectral energy distribution (SED) of the progenitor is mainly affected by the optical depth and is almost independent of dust temperature in the Euclid and CSST filters. Our mock tests demonstrate that one can derive the progenitor mass and dust optical depth simultaneously by fitting the observed SED over the 11 filters of the 2 surveys while fixing the dust temperature to a typical value. Euclid and CSST will significantly enlarge the sample of direct progenitor detections with accurate mass measurements, which is crucial to resolve the long-standing RSG problem.

preprint2026arXiv

Language Hierarchization Provides the Optimal Solution to Human Working Memory Limits

Language is a uniquely human trait, conveying information efficiently by organizing word sequences in sentences into hierarchical structures. A central question persists: Why is human language hierarchical? In this study, we show that hierarchization optimally solves the challenge of our limited working memory capacity. We established a likelihood function that quantifies how well the average number of units according to the language processing mechanisms aligns with human working memory capacity (WMC) in a direct fashion. The maximum likelihood estimate (MLE) of this function, tehta_MLE, turns out to be the mean of units. Through computational simulations of symbol sequences and validation analyses of natural language sentences, we uncover that compared to linear processing, hierarchical processing far surpasses it in constraining the tehta_MLE values under the human WMC limit, along with the increase of sequence/sentence length successfully. It also shows a converging pattern related to children's WMC development. These results suggest that constructing hierarchical structures optimizes the processing efficiency of sequential language input while staying within memory constraints, genuinely explaining the universal hierarchical nature of human language.

preprint2026arXiv

OThink-R1: Intrinsic Fast/Slow Thinking Mode Switching for Over-Reasoning Mitigation

Human cognition operates through two complementary modes: fast intuitive thinking and slow deliberate thinking. Vanilla large language models (LLMs) predominantly follow the fast-thinking paradigm, producing immediate responses; while recent large reasoning models (LRMs) adopt slow-thinking strategies, generating detailed reasoning chains before arriving at answers. While LRMs often achieve higher accuracy, this comes at the cost of substantially increased token usage. To address this efficiency-accuracy trade-off, we propose OThink-R1, a hybrid reasoning framework that integrates both modes within a single LRM and enables automatic mode switching based on problem characteristics. We first identify three major patterns of essential and redundant reasoning trajectories in LRMs, which guide the design of an auxiliary LLM-based judge that adaptively determines when slow thinking is necessary. Leveraging the judge's decisions, we construct a hybrid fine-tuning dataset by pruning redundant reasoning to produce fast-thinking samples and retaining complete reasoning for slow-thinking samples. This dataset is then used to fine-tune LRMs, equipping them with inherent autonomous mode-selection capabilities. Extensive experiments on mathematical and question-answering benchmarks show that OThink-R1 reduces reasoning token usage significantly while maintaining competitive accuracy. The code is available at https://github.com/AgenticIR-Lab/OThink-R1.

preprint2026arXiv

SGDrive: Scene-to-Goal Hierarchical World Cognition for Autonomous Driving

Recent end-to-end autonomous driving approaches have leveraged Vision-Language Models (VLMs) to enhance planning capabilities in complex driving scenarios. However, VLMs are inherently trained as generalist models, lacking specialized understanding of driving-specific reasoning in 3D space and time. When applied to autonomous driving, these models struggle to establish structured spatial-temporal representations that capture geometric relationships, scene context, and motion patterns critical for safe trajectory planning. To address these limitations, we propose SGDrive, a novel framework that explicitly structures the VLM's representation learning around driving-specific knowledge hierarchies. Built upon a pre-trained VLM backbone, SGDrive decomposes driving understanding into a scene-agent-goal hierarchy that mirrors human driving cognition: drivers first perceive the overall environment (scene context), then attend to safety-critical agents and their behaviors, and finally formulate short-term goals before executing actions. This hierarchical decomposition provides the structured spatial-temporal representation that generalist VLMs lack, integrating multi-level information into a compact yet comprehensive format for trajectory planning. Extensive experiments on the NAVSIM benchmark demonstrate that SGDrive achieves state-of-the-art performance among camera-only methods on both PDMS and EPDMS, validating the effectiveness of hierarchical knowledge structuring for adapting generalist VLMs to autonomous driving.

preprint2024arXiv

Data Valuation for Vertical Federated Learning: A Model-free and Privacy-preserving Method

Vertical Federated learning (VFL) is a promising paradigm for predictive analytics, empowering an organization (i.e., task party) to enhance its predictive models through collaborations with multiple data suppliers (i.e., data parties) in a decentralized and privacy-preserving way. Despite the fast-growing interest in VFL, the lack of effective and secure tools for assessing the value of data owned by data parties hinders the application of VFL in business contexts. In response, we propose FedValue, a privacy-preserving, task-specific but model-free data valuation method for VFL, which consists of a data valuation metric and a federated computation method. Specifically, we first introduce a novel data valuation metric, namely MShapley-CMI. The metric evaluates a data party's contribution to a predictive analytics task without the need of executing a machine learning model, making it well-suited for real-world applications of VFL. Next, we develop an innovative federated computation method that calculates the MShapley-CMI value for each data party in a privacy-preserving manner. Extensive experiments conducted on six public datasets validate the efficacy of FedValue for data valuation in the context of VFL. In addition, we illustrate the practical utility of FedValue with a case study involving federated movie recommendations.

preprint2023arXiv

Language Model as an Annotator: Unsupervised Context-aware Quality Phrase Generation

Phrase mining is a fundamental text mining task that aims to identify quality phrases from context. Nevertheless, the scarcity of extensive gold labels datasets, demanding substantial annotation efforts from experts, renders this task exceptionally challenging. Furthermore, the emerging, infrequent, and domain-specific nature of quality phrases presents further challenges in dealing with this task. In this paper, we propose LMPhrase, a novel unsupervised context-aware quality phrase mining framework built upon large pre-trained language models (LMs). Specifically, we first mine quality phrases as silver labels by employing a parameter-free probing technique called Perturbed Masking on the pre-trained language model BERT (coined as Annotator). In contrast to typical statistic-based or distantly-supervised methods, our silver labels, derived from large pre-trained language models, take into account rich contextual information contained in the LMs. As a result, they bring distinct advantages in preserving informativeness, concordance, and completeness of quality phrases. Secondly, training a discriminative span prediction model heavily relies on massive annotated data and is likely to face the risk of overfitting silver labels. Alternatively, we formalize phrase tagging task as the sequence generation problem by directly fine-tuning on the Sequence-to-Sequence pre-trained language model BART with silver labels (coined as Generator). Finally, we merge the quality phrases from both the Annotator and Generator as the final predictions, considering their complementary nature and distinct characteristics. Extensive experiments show that our LMPhrase consistently outperforms all the existing competitors across two different granularity phrase mining tasks, where each task is tested on two different domain datasets.

preprint2022arXiv

An Unbiased Quantum Random Number Generator Based on Boson Sampling

It has been proven that Boson sampling is a much promising model of optical quantum computation, which has been applied to designing quantum computer successfully, such as "Jiuzhang". However, the meaningful randomness of Boson sampling results, whose correctness and significance were proved from a specific quantum mechanical distribution, has not been utilized or exploited. In this research, Boson sampling is applied to design a novel Quantum Random Number Generator (QRNG) by fully exploiting the randomness of Boson sampling results, and its prototype system is constructed with the programmable silicon photonic processor, which can generate uniform and unbiased random sequences and overcome the shortcomings of the existing discrete QRNGs such as source-related, high demand for the photon number resolution capability of the detector and slow self-detection generator speed. Boson sampling is implemented as a random entropy source, and random bit strings with satisfactory randomness and uniformity can be obtained after post-processing the sampling results. It is the first approach for applying the randomness of Boson sampling results to develop a practical prototype system for actual tasks, and the experiment results demonstrate the designed Boson sampling-based QRNG prototype system pass 15 tests of the NIST SP 800-22 statistical test component, which prove that Boson sampling has great potential for practical applications with desirable performance besides quantum advantage.

preprint2022arXiv

Large-scale full-programmable quantum walk and its applications

With photonics, the quantum computational advantage has been demonstrated on the task of boson sampling. Next, developing quantum-enhanced approaches for practical problems becomes one of the top priorities for photonic systems. Quantum walks are powerful kernels for developing new and useful quantum algorithms. Here we realize large-scale quantum walks using a fully programmable photonic quantum computing system. The system integrates a silicon quantum photonic chip, enabling the simulation of quantum walk dynamics on graphs with up to 400 vertices and possessing full programmability over quantum walk parameters, including the particle property, initial state, graph structure, and evolution time. In the 400-dimensional Hilbert space, the average fidelity of random entangled quantum states after the whole on-chip circuit evolution reaches as high as 94.29$\pm$1.28$\%$. With the system, we demonstrated exponentially faster hitting and quadratically faster mixing performance of quantum walks over classical random walks, achieving more than two orders of magnitude of enhancement in the experimental hitting efficiency and almost half of the reduction in the experimental evolution time for mixing. We utilize the system to implement a series of quantum applications, including measuring the centrality of scale-free networks, searching targets on Erdös-Rényi networks, distinguishing non-isomorphic graph pairs, and simulating the topological phase of higher-order topological insulators. Our work shows one feasible path for quantum photonics to address applications of practical interests in the near future.

preprint2022arXiv

Large-Scale Privacy-Preserving Network Embedding against Private Link Inference Attacks

Network embedding represents network nodes by a low-dimensional informative vector. While it is generally effective for various downstream tasks, it may leak some private information of networks, such as hidden private links. In this work, we address a novel problem of privacy-preserving network embedding against private link inference attacks. Basically, we propose to perturb the original network by adding or removing links, and expect the embedding generated on the perturbed network can leak little information about private links but hold high utility for various downstream tasks. Towards this goal, we first propose general measurements to quantify privacy gain and utility loss incurred by candidate network perturbations; we then design a PPNE framework to identify the optimal perturbation solution with the best privacy-utility trade-off in an iterative way. Furthermore, we propose many techniques to accelerate PPNE and ensure its scalability. For instance, as the skip-gram embedding methods including DeepWalk and LINE can be seen as matrix factorization with closed form embedding results, we devise efficient privacy gain and utility loss approximation methods to avoid the repetitive time-consuming embedding training for every candidate network perturbation in each iteration. Experiments on real-life network datasets (with up to millions of nodes) verify that PPNE outperforms baselines by sacrificing less utility and obtaining higher privacy protection.

preprint2021arXiv

Conversations Gone Alright: Quantifying and Predicting Prosocial Outcomes in Online Conversations

Online conversations can go in many directions: some turn out poorly due to antisocial behavior, while others turn out positively to the benefit of all. Research on improving online spaces has focused primarily on detecting and reducing antisocial behavior. Yet we know little about positive outcomes in online conversations and how to increase them-is a prosocial outcome simply the lack of antisocial behavior or something more? Here, we examine how conversational features lead to prosocial outcomes within online discussions. We introduce a series of new theory-inspired metrics to define prosocial outcomes such as mentoring and esteem enhancement. Using a corpus of 26M Reddit conversations, we show that these outcomes can be forecasted from the initial comment of an online conversation, with the best model providing a relative 24% improvement over human forecasting performance at ranking conversations for predicted outcome. Our results indicate that platforms can use these early cues in their algorithmic ranking of early conversations to prioritize better outcomes.

preprint2021arXiv

Variational quantum process tomography

Quantum process tomography is an experimental technique to fully characterize an unknown quantum process. Standard quantum process tomography suffers from exponentially scaling of the number of measurements with the increasing system size. In this work, we put forward a quantum machine learning algorithm which approximately encodes the unknown unitary quantum process into a relatively shallow depth parametric quantum circuit. We demonstrate our method by reconstructing the unitary quantum processes resulting from the quantum Hamiltonian evolution and random quantum circuits up to $8$ qubits. Results show that those quantum processes could be reconstructed with high fidelity, while the number of input states required are at least $2$ orders of magnitude less than required by the standard quantum process tomography.

preprint2020arXiv

A Matlab Toolbox for Feature Importance Ranking

More attention is being paid for feature importance ranking (FIR), in particular when thousands of features can be extracted for intelligent diagnosis and personalized medicine. A large number of FIR approaches have been proposed, while few are integrated for comparison and real-life applications. In this study, a matlab toolbox is presented and a total of 30 algorithms are collected. Moreover, the toolbox is evaluated on a database of 163 ultrasound images. To each breast mass lesion, 15 features are extracted. To figure out the optimal subset of features for classification, all combinations of features are tested and linear support vector machine is used for the malignancy prediction of lesions annotated in ultrasound images. At last, the effectiveness of FIR is analyzed according to performance comparison. The toolbox is online (https://github.com/NicoYuCN/matFIR). In our future work, more FIR methods, feature selection methods and machine learning classifiers will be integrated.

preprint2020arXiv

Robustness study of noisy annotation in deep learning based medical image segmentation

Partly due to the use of exhaustive-annotated data, deep networks have achieved impressive performance on medical image segmentation. Medical imaging data paired with noisy annotation are, however, ubiquitous, but little is known about the effect of noisy annotation on deep learning-based medical image segmentation. We studied the effects of noisy annotation in the context of mandible segmentation from CT images. First, 202 images of Head and Neck cancer patients were collected from our clinical database, where the organs-at-risk were annotated by one of 12 planning dosimetrists. The mandibles were roughly annotated as the planning avoiding structure. Then, mandible labels were checked and corrected by a physician to get clean annotations. At last, by varying the ratios of noisy labels in the training data, deep learning-based segmentation models were trained, one for each ratio. In general, a deep network trained with noisy labels had worse segmentation results than that trained with clean labels, and fewer noisy labels led to better segmentation. When using 20% or less noisy cases for training, no significant difference was found on the prediction performance between the models trained by noisy or clean. This study suggests that deep learning-based medical image segmentation is robust to noisy annotations to some extent. It also highlights the importance of labeling quality in deep learning

preprint2020arXiv

Sample caching Markov chain Monte Carlo approach to boson sampling simulation

Boson sampling is a promising candidate for quantum supremacy. It requires to sample from a complicated distribution, and is trusted to be intractable on classical computers. Among the various classical sampling methods, the Markov chain Monte Carlo method is an important approach to the simulation and validation of boson sampling. This method however suffers from the severe sample loss issue caused by the autocorrelation of the sample sequence. Addressing this, we propose the sample caching Markov chain Monte Carlo method that eliminates the correlations among the samples, and prevents the sample loss at the meantime, allowing more efficient simulation of boson sampling. Moreover, our method can be used as a general sampling framework that can benefit a wide range of sampling tasks, and is particularly suitable for applications where a large number of samples are taken.

preprint2020arXiv

Variational Quantum Circuits for Quantum State Tomography

Quantum state tomography is a key process in most quantum experiments. In this work, we employ quantum machine learning for state tomography. Given an unknown quantum state, it can be learned by maximizing the fidelity between the output of a variational quantum circuit and this state. The number of parameters of the variational quantum circuit grows linearly with the number of qubits and the circuit depth, so that only polynomial measurements are required, even for highly-entangled states. After that, a subsequent classical circuit simulator is used to transform the information of the target quantum state from the variational quantum circuit into a familiar format. We demonstrate our method by performing numerical simulations for the tomography of the ground state of a one-dimensional quantum spin chain, using a variational quantum circuit simulator. Our method is suitable for near-term quantum computing platforms, and could be used for relatively large-scale quantum state tomography for experimentally relevant quantum states.