Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
20 works
0 followers
18 topics
4 close collaborators

Published work

20 published item(s)

preprint 2026 arXiv

FlexSQL: Flexible Exploration and Execution Make Better Text-to-SQL Agents

Text-to-SQL over large analytical databases requires navigating complex schemas, resolving ambiguous queries, and grounding decisions in actual data. Most current systems follow a fixed pipeline where schema elements are retrieved once upfront and the database is only revisited for post-hoc repair, limiting recovery from early mistakes. We present FlexSQL, a text-to-SQL agent whose core design principle is flexible database interaction: the agent can explore schema structure, inspect data values, and run verification queries at any point during reasoning. FlexSQL generates diverse execution plans to cover multiple query interpretations, implements each plan in either SQL or Python depending on the task, and uses a two-tiered repair mechanism that can backtrack from code-level errors to plan-level revisions. On Spider2-Snow, using gpt-oss-120b, FlexSQL achieves a 65.4% score, outperforming strong open-source baselines that use stronger, larger models such as gpt-o3 and DeepSeek-R1. When integrated into a general-purpose coding agent (as skills in Claude Code), our approach yields over 10% relative improvement on Spider2-Snow. Further analysis shows that flexible exploration and flexible execution jointly contribute to the effectiveness of our approach, highlighting flexibility as a key design principle. Our code is available at: https://github.com/StringNLPLAB/FlexSQL

preprint 2023 arXiv

Deep Learning for Code Intelligence: Survey, Benchmark and Toolkit

Code intelligence leverages machine learning techniques to extract knowledge from extensive code corpora, with the aim of developing intelligent tools to improve the quality and productivity of computer programming. There is already a thriving research community focusing on code intelligence, with efforts spanning software engineering, machine learning, data mining, natural language processing, and programming languages. In this paper, we conduct a comprehensive literature review on deep learning for code intelligence, from the aspects of code representation learning, deep learning techniques, and application tasks. We also benchmark several state-of-the-art neural models for code intelligence, and provide an open-source toolkit tailored for the rapid prototyping of deep-learning-based code intelligence models. In particular, we inspect the existing code intelligence models on the basis of code representation learning, and provide a comprehensive overview of the present state of code intelligence. Furthermore, we publicly release the source code and data resources to provide the community with a ready-to-use benchmark, which can facilitate the evaluation and comparison of existing and future code intelligence models (https://xcodemind.github.io). Finally, we point out several challenging and promising directions for future research.

preprint 2023 arXiv

Effects of the formation time of parton shower on jet quenching in heavy-ion collisions

Jet quenching has successfully served as a hard probe to study the properties of Quark-Gluon Plasma (QGP). As a multi-particle system, jets take time to develop from a highly virtual parton to a group of partons close to mass shells. We present a systematic study of the effects of this formation time on jet quenching in relativistic nuclear collisions. Jets from initial hard scatterings were simulated with Pythia, and their interactions with QGP were described using a Linear Boltzmann Transport (LBT) model that incorporates both elastic and inelastic scatterings between jet partons and the thermal medium. Three different estimations of the jet formation time were implemented and compared, including instantaneous formation, formation from single splitting, and formation from sequential splittings, before which no jet-medium interaction was assumed. We found that deferring the jet-medium interaction with a longer formation time affects not only the overall magnitude of the nuclear modification factor of jets, but also its dependence on the jet transverse momentum.

preprint 2022 arXiv

Alleviating Cold-start Problem in CTR Prediction with A Variational Embedding Learning Framework

We propose a general Variational Embedding Learning Framework (VELF) for alleviating the severe cold-start problem in CTR prediction. VELF addresses the cold-start problem by alleviating the overfitting caused by data sparsity in two ways: learning probabilistic embeddings, and incorporating trainable and regularized priors which utilize the rich side information of cold-start users and advertisements (Ads). The two techniques are naturally integrated into a variational inference framework, forming an end-to-end training process. Extensive empirical tests on benchmark datasets demonstrate the advantages of our proposed VELF. In addition, extended experiments confirm that our parameterized and regularized priors provide more generalization capability than traditional fixed priors.

preprint 2022 arXiv

CosSGD: Communication-Efficient Federated Learning with a Simple Cosine-Based Quantization

Federated learning is a promising framework to mitigate data privacy and computation concerns. However, the communication cost between the server and clients has become the major bottleneck for successful deployment. Despite notable progress in gradient compression, existing quantization methods require further improvement when low-bit compression is applied; in particular, overall system performance often degrades substantially when quantization is applied in both directions to compress model weights and gradients. In this work, we propose a simple cosine-based nonlinear quantization and achieve impressive results in compressing round-trip communication costs. We are not only able to compress model weights and gradients at higher ratios than previous methods, but also achieve competitive model performance at the same time. Further, our approach is highly suitable for federated learning problems since it has low computational complexity and requires only a little additional data to recover the compressed information. Extensive experiments have been conducted on image classification and brain tumor semantic segmentation using the CIFAR-10 and BraTS datasets, where we show state-of-the-art effectiveness and impressive communication efficiency.
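
The abstract above does not spell out the quantization scheme, so the following is only a rough sketch of what a cosine-based nonlinear quantizer could look like, not the authors' implementation. It assumes each value is divided by the vector norm, converted to an angle with arccos, and the angle is rounded to one of 2^bits uniform levels; only the norm has to travel alongside the integer codes. The function names `cos_quantize` and `cos_dequantize` are hypothetical.

```python
import math


def cos_quantize(values, bits):
    """Encode each value as an integer angle code; return (codes, norm).

    Assumption: the angle is arccos(value / norm), quantized uniformly
    over [0, pi]. This is an illustrative choice, not taken from the paper.
    """
    norm = math.sqrt(sum(v * v for v in values))
    if norm == 0.0:
        return [0] * len(values), 0.0
    levels = (1 << bits) - 1
    codes = []
    for v in values:
        # Clamp so floating-point rounding cannot leave the domain of acos.
        ratio = max(-1.0, min(1.0, v / norm))
        angle = math.acos(ratio)  # lies in [0, pi]
        codes.append(round(angle / math.pi * levels))
    return codes, norm


def cos_dequantize(codes, norm, bits):
    """Recover approximate values from the angle codes and the norm."""
    levels = (1 << bits) - 1
    return [norm * math.cos(c / levels * math.pi) for c in codes]


if __name__ == "__main__":
    grads = [0.3, -0.4, 0.0, 1.2]
    codes, norm = cos_quantize(grads, bits=8)
    print(codes, norm)
    print(cos_dequantize(codes, norm, bits=8))
```

Under these assumptions the receiver needs only the codes, the bit width, and a single floating-point norm to reconstruct the vector, which is consistent with the abstract's remark that little additional data is required.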

preprint 2022 arXiv

Filter Pruning by Switching to Neighboring CNNs with Good Attributes

Filter pruning is effective at reducing the computational costs of neural networks. Existing methods show that updating previously pruned filters preserves a large model capacity and achieves better performance. However, during the iterative pruning process, even if the network weights are updated to new values, the pruning criterion remains the same. In addition, when evaluating filter importance, only the magnitude information of the filters is considered. However, in neural networks, filters do not work individually; they affect one another. As a result, the magnitude information of each filter, which merely reflects the information of an individual filter itself, is not enough to judge filter importance. To solve the above problems, we propose Meta-attribute-based Filter Pruning (MFP). First, to expand the existing magnitude-based pruning criteria, we introduce a new set of criteria that consider the geometric distance of filters. Additionally, to explicitly assess the current state of the network, we adaptively select the most suitable criteria for pruning via a meta-attribute, a property of the neural network at the current state. Experiments on two image classification benchmarks validate our method. For ResNet-50 on ILSVRC-2012, we reduce FLOPs by more than 50% with only 0.44% top-5 accuracy loss.

preprint 2022 arXiv

Gating-adapted Wavelet Multiresolution Analysis for Exposure Sequence Modeling in CTR prediction

The exposure sequence is being actively studied for user interest modeling in Click-Through Rate (CTR) prediction. However, the existing methods for exposure sequence modeling impose a heavy computational burden and neglect noise, resulting in excessive latency and limited performance in online recommenders. In this paper, we propose to address the high latency and noise problems via Gating-adapted wavelet multiresolution analysis (Gama), which can effectively denoise the extremely long exposure sequence and adaptively capture the implied multi-dimensional user interest with linear computational complexity. This is the first attempt to integrate a non-parametric multiresolution analysis technique into deep neural networks to model the user exposure sequence. Extensive experiments on a large-scale benchmark dataset and a real production dataset confirm the effectiveness of Gama for exposure sequence modeling, especially in cold-start scenarios. Thanks to its low latency and high effectiveness, Gama has been deployed in our real large-scale industrial recommender, successfully serving hundreds of millions of users.

preprint 2022 arXiv

Hamiltonian Particle-in-Cell methods for Vlasov-Poisson equations

In this paper, Particle-in-Cell algorithms for the Vlasov-Poisson system are presented based on its Poisson bracket structure. The Poisson equation is solved by finite element methods, in which the appropriate finite element spaces are taken to guarantee that the semi-discretized system possesses a well-defined discrete Poisson bracket structure. Then, splitting methods are applied to the semi-discretized system by decomposing the Hamiltonian function. The resulting discretizations are proved to be Poisson bracket preserving. Moreover, the conserved quantities of the system are also well preserved. In numerical experiments, we use the presented numerical methods to simulate various physical phenomena. Because the practical computations are expensive, we employ parallel computing. The numerical results verify the efficiency of the newly derived numerical discretizations.

preprint 2022 arXiv

Observation of biradical spin coupling through hydrogen bonds

Investigation of intermolecular electron spin interaction is of fundamental importance in both science and technology. Here, radical pairs of all-trans retinoic acid molecules on Au(111) are created using an ultra-low temperature scanning tunneling microscope. Antiferromagnetic coupling between two radicals is identified by magnetic-field dependent spectroscopy. The measured exchange energies range from 0.1 to 1.0 meV. The biradical spin coupling is mediated through O-H$\cdots$O hydrogen bonds, as elucidated from analysis combining density functional theory calculation and a modern version of valence bond theory.

preprint 2022 arXiv

PYTHIA 8 underlying event tune for RHIC energies

We report an underlying event tune for the PYTHIA 8 Monte Carlo event generator that is applicable for hadron collisions primarily at $\sqrt{s}$ ranges available at the Relativistic Heavy-Ion Collider (RHIC). We compare our new PYTHIA 8 tuned predictions to mid-rapidity inclusive $π^{\pm}$ spectra, jet sub-structure, Drell-Yan production, and underlying event measurements from RHIC and the Tevatron, as well as underlying event data from the Large Hadron Collider. With respect to the default PYTHIA 8 Monash Tune, the new 'Detroit' tune shows significant improvements in the description of the experimental data. Additionally, we explore the validity of PYTHIA 8 predictions for forward rapidity $π$ in $\sqrt{s}$ = 200 GeV collisions, where neither tune is able to sufficiently describe the data. We advocate for the new tune to be used for PYTHIA 8 studies at current and future RHIC experiments, and discuss future tuning exercises at lower center-of-mass energies, where forward/backward kinematics are essential at the upcoming Electron-Ion Collider.

preprint 2022 arXiv

Reliable and Broad-range Layer Identification of Au-assisted Exfoliated Large Area MoS$_2$ and WS$_2$ Using Reflection Spectroscopic Fingerprints

The emerging Au-assisted exfoliation technique provides a wealth of large-area and high-quality ultrathin two-dimensional (2D) materials compared with traditional tape-based exfoliation. Fast, damage-free, and reliable determination of the layer number of such 2D films is essential to study layer-dependent physics and promote device applications. Here, an optical method has been developed for simple, high throughput, and accurate determination of the layer number for Au-assisted exfoliated MoS$_2$ and WS$_2$ films in a broad thickness range. The method is based on quantitative analysis of layer-dependent white light reflection spectra, revealing that the reflection peak intensity can be used as a clear indicator for determining the layer number. The simple yet robust method will facilitate the fundamental study on layer-dependent optical, electrical, and thermal properties and device applications of 2D materials. The technique can also be readily combined with photoluminescence and Raman spectroscopies to study other layer-dependent physical properties of 2D materials.

preprint 2022 arXiv

Self-injection-locked second-harmonic integrated source

High coherence visible and near-visible laser sources are centrally important to the operation of advanced position/navigation/timing systems as well as classical/quantum sensing systems. However, the complexity and size of these bench-top lasers are an impediment to their transitioning beyond the laboratory. Here, a system-on-a-chip that emits high-coherence visible and near-visible lightwaves is demonstrated. The devices rely upon a new approach wherein wavelength conversion and coherence increase by self-injection-locking are combined within a single nonlinear resonator. This simplified approach is demonstrated in a hybridly-integrated device and provides a short-term linewidth around 10-30 kHz. On-chip, converted optical power over 2 mW is also obtained. Moreover, measurements show that heterogeneous integration can result in conversion efficiency higher than 25% with output power over 11 mW. Because the approach uses mature III-V pump lasers in combination with thin-film lithium niobate, it can be scaled for low-cost manufacturing of high-coherence visible emitters. Also, the coherence generation process can be transferred to other frequency conversion processes including optical parametric oscillation, sum/difference frequency generation, and third-harmonic generation.

preprint 2020 arXiv

Consistency between ARPES and STM measurements on SmB$_6$

Strongly correlated topological surface states are promising platforms for next-generation quantum applications, but they remain elusive in real materials. The correlated Kondo insulator SmB$_6$ is one of the most promising candidates, with theoretically predicted heavy Dirac surface states supported by transport and scanning tunneling microscopy (STM) experiments. However, a puzzling discrepancy appears between STM and angle-resolved photoemission (ARPES) experiments on SmB$_6$. Although ARPES detects spin-textured surface states, their velocity is an order of magnitude higher than expected, while the Dirac point -- the hallmark of any topological system -- can only be inferred deep within the bulk valence band. A significant challenge is that SmB$_6$ lacks a natural cleavage plane, resulting in ordered surface domains limited to 10s of nanometers. Here we use STM to show that surface band bending can shift energy features by 10s of meV between domains. Starting from our STM spectra, we simulate the full spectral function as an average over multiple domains with different surface potentials. Our simulation shows excellent agreement with ARPES data, and thus resolves the apparent discrepancy between large-area measurements that average over multiple band-shifted domains and atomically-resolved measurements within a single domain.

preprint 2020 arXiv

Lithium niobate photonic-crystal electro-optic modulator

Modern advanced photonic integrated circuits require dense integration of high-speed electro-optic functional elements on a compact chip that consumes only moderate power. Energy efficiency, operation speed, and device dimension are thus crucial metrics underlying almost all current developments of photonic signal processing units. Recently, thin-film lithium niobate (LN) has emerged as a promising platform for photonic integrated circuits. Here we make an important step towards miniaturizing functional components on this platform, reporting probably the smallest high-speed LN electro-optic modulators, based upon photonic crystal nanobeam resonators. The devices exhibit a significant tuning efficiency up to 1.98 GHz/V and a broad modulation bandwidth of 17.5 GHz, with a tiny electro-optic modal volume of only 0.58 $μ{\rm m}^3$. The modulators enable efficient electro-optic driving of high-Q photonic cavity modes in both adiabatic and non-adiabatic regimes, and allow us to achieve electro-optic switching at 11 Gb/s with a bit-switching energy as low as 22 fJ. The demonstration of energy efficient and high-speed electro-optic modulation at the wavelength scale lays a crucial foundation for realizing large-scale LN photonic integrated circuits that are of immense importance for broad applications in data communication, microwave photonics, and quantum photonics.

preprint 2020 arXiv

Probing Image Potential States on Topological Semimetal Antimony Surface

A point charge near the surface of a topological insulator (TI) with broken time-reversal symmetry is predicted to generate an image magnetic charge in addition to an image electric charge. We use scanning tunneling spectroscopy to study the image potential states (IPS) of the topological semimetal Sb(111) surface. We observe five IPS with discrete energy levels that are well described by a one-dimensional model. The spatial variation of the IPS energies and lifetimes near surface step edges shows the first local signature of resonant interband scattering between IPS, which suggests that image charges too may interact. Our work motivates the exploration of the TI surface geometry necessary to realize and manipulate a magnetic charge.

preprint 2020 arXiv

Progressive Local Filter Pruning for Image Retrieval Acceleration

This paper focuses on network pruning for image retrieval acceleration. Prevailing image retrieval works target discriminative feature learning, while little attention is paid to accelerating model inference, which should be taken into consideration in real-world practice. The challenge of pruning image retrieval models is that the middle-level feature should be preserved as much as possible. Because retrieval and classification models have such different requirements, traditional pruning methods are poorly suited to our task. To solve the problem, we propose a new Progressive Local Filter Pruning (PLFP) method for image retrieval acceleration. Specifically, layer by layer, we analyze the local geometric properties of each filter and select the one that can be replaced by its neighbors. Then we progressively prune the filter by gradually changing the filter weights. In this way, the representation ability of the model is preserved. To verify this, we evaluate our method on two widely-used image retrieval datasets, i.e., Oxford5k and Paris6K, and one person re-identification dataset, i.e., Market-1501. The proposed method outperforms conventional pruning methods, suggesting its effectiveness for image retrieval.

preprint 2020 arXiv

Segmentations-Leak: Membership Inference Attacks and Defenses in Semantic Image Segmentation

Today's success of state-of-the-art methods for semantic segmentation is driven by large datasets. Data is considered an important asset that needs to be protected, as the collection and annotation of such datasets comes with significant effort and cost. In addition, visual data might contain private or sensitive information that makes it equally unsuited for public release. Unfortunately, recent work on membership inference in the broader area of adversarial machine learning and inference attacks on machine learning models has shown that even black-box classifiers leak information on the dataset that they were trained on. We show that such membership inference attacks can be successfully carried out on complex, state-of-the-art models for semantic segmentation. In order to mitigate the associated risks, we also study a series of defenses against such membership inference attacks and find effective countermeasures against the existing risks with little effect on the utility of the segmentation method. Finally, we extensively evaluate our attacks and defenses on a range of relevant real-world datasets: Cityscapes, BDD100K, and Mapillary Vistas.

preprint 2020 arXiv

Stochastic Model Pruning via Weight Dropping Away and Back

Deep neural networks have achieved great success on a variety of challenging tasks. However, most successful DNNs have an extremely complex structure, leading to extensive research on model compression. As a significant area of progress in model compression, traditional gradual pruning approaches involve an iterative prune-retrain procedure and may suffer from two critical issues: local importance judgment, where the pruned weights are merely unimportant in the current model; and an irretrievable pruning process, where the pruned weights have no chance to come back. Addressing these two issues, this paper proposes the Drop Pruning approach, which leverages stochastic optimization in the pruning process by introducing a drop strategy at each pruning step, namely, drop away, which stochastically deletes some unimportant weights, and drop back, which stochastically recovers some pruned weights. The suitable choice of drop probabilities decreases the model size during the pruning process and helps it flow to the target sparsity. Compared to the Bayesian approaches that stochastically train a compact model for pruning, we directly aim at stochastic gradual pruning. We provide a detailed analysis showing that the drop away and drop back approaches have individual contributions. Moreover, Drop Pruning can achieve competitive compression performance and accuracy on many benchmark tasks compared with state-of-the-art weight pruning and Bayesian training approaches.
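
To make the drop away / drop back idea concrete, here is a minimal sketch of a single pruning step. It is an illustration under stated assumptions rather than the paper's algorithm: "unimportant" is taken to mean a magnitude below a fixed threshold, the two drop probabilities are plain constants, and the name `drop_pruning_step` and all of its parameters are hypothetical.

```python
import random


def drop_pruning_step(weights, mask, p_away, p_back, threshold, rng):
    """Return a new keep-mask after one stochastic pruning step.

    mask[i] is True when weight i is currently kept.
    Assumption: a kept weight counts as unimportant when |w| < threshold.
    """
    new_mask = list(mask)
    for i, (w, kept) in enumerate(zip(weights, mask)):
        if kept and abs(w) < threshold and rng.random() < p_away:
            new_mask[i] = False  # drop away: stochastically delete
        elif not kept and rng.random() < p_back:
            new_mask[i] = True   # drop back: stochastically recover
    return new_mask


if __name__ == "__main__":
    rng = random.Random(0)
    weights = [0.01, -0.5, 0.02, 0.9, -0.03]
    mask = [True] * len(weights)
    for step in range(5):
        mask = drop_pruning_step(weights, mask, p_away=0.6, p_back=0.1,
                                 threshold=0.1, rng=rng)
        print(step, mask)
```

With `p_away` larger than `p_back`, the expected number of kept weights shrinks over successive steps, which mirrors the abstract's point that a suitable choice of drop probabilities moves the model toward the target sparsity while still letting pruned weights return.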

preprint 2020 arXiv

Synthetic Convolutional Features for Improved Semantic Segmentation

Recently, learning-based image synthesis has enabled the generation of high-resolution images, using either popular adversarial training or a powerful perceptual loss. However, it remains challenging to successfully leverage additional synthetic images for improving semantic segmentation. Therefore, we suggest generating intermediate convolutional features and propose the first synthesis approach that is tailored to such intermediate convolutional features. This allows us to generate new features from label masks and include them successfully into the training procedure in order to improve the performance of semantic segmentation. Experimental results and analysis on two challenging datasets, Cityscapes and ADE20K, show that our generated features improve performance on segmentation tasks.

preprint 2018 arXiv

Imaging emergent heavy Dirac fermions of a topological Kondo insulator

Kondo insulators are primary candidates in the search for strongly correlated topological quantum phases, which may host topological order, fractionalization, and non-Abelian statistics. Within some Kondo insulators, the hybridization gap is predicted to protect a nontrivial topological invariant and to harbor emergent heavy Dirac fermion surface modes. We use high-energy-resolution spectroscopic imaging in real and momentum space on the Kondo insulator SmB$_6$. On cooling through $T^*_Δ\approx$ 35 K we observe the opening of an insulating gap that expands to $Δ\approx$ 10 meV at 2 K. Within the gap, we image the formation of linearly dispersing surface states with effective masses reaching $m^* = (410\pm20)m_e$. We thus demonstrate the existence of a strongly correlated topological Kondo insulator phase hosting the heaviest known Dirac fermions.