Source author record

Xue Feng

Xue Feng appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

physics.optics, Machine Learning, physics.ao-ph, quant-ph, Computer Vision, eess.IV, Information Retrieval, math.NA, Numerical Analysis, physics.geo-ph, physics.ins-det

Catalog footprint

What is connected

14 works

11 topics

4 close collaborators

preprint, 2026, arXiv

Convex Dataset Valuation for Post-Training

Improving LLM performance on downstream tasks sometimes requires leveraging auxiliary datasets during post-training. In practice, however, developers face constraints on compute, labeling, and licensing costs that preclude using all available data, necessitating principled dataset-level selection. These constraints are increasingly shaped by dataset marketplaces, where data acquisition is governed by budgets and negotiation. We study dataset valuation as a subset selection problem during LLM post-training. Our goal is to identify and weight auxiliary datasets so as to maximize target task performance given constrained budgets. We first show that commonly used gradient alignment scores provide a reasonable yet incomplete valuation signal, as they ignore redundancy among datasets. To address this, we propose a scalable convex dataset-level valuation method based on kernel mean matching (KMM) in gradient space, which jointly accounts for alignment with the target task and redundancy across auxiliary datasets. Through extensive experiments across diverse post-training settings and tasks, we show that our approach consistently outperforms existing valuation baselines, achieving stronger performance with low computational overhead. Our results position dataset valuation as a practical decision tool for post-training data selection in market-constrained large language model settings. The code is available at https://github.com/uiuctml/convex_data_valuation.
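To make the idea of matching in gradient space concrete, here is a minimal sketch. It is not the authors' implementation (their code is at the repository linked above). It assumes each dataset is summarised by a single mean gradient vector, uses a linear kernel, and solves the resulting convex problem with projected gradient descent. The function names `project_to_simplex` and `kmm_weights`, the kernel choice and the solver are all assumptions made for illustration.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]                      # sort descending
    css = np.cumsum(u) - 1.0
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]  # last index kept positive
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

def kmm_weights(aux_grads, target_grad, steps=2000):
    """Minimise ||sum_i w_i g_i - g_target||^2 over the simplex.

    Hypothetical helper: with a linear kernel the objective expands to
    w^T K w - 2 k^T w + const, a convex quadratic program.
    """
    G = np.asarray(aux_grads, dtype=float)    # shape (n_datasets, dim)
    t = np.asarray(target_grad, dtype=float)  # shape (dim,)
    K = G @ G.T   # Gram matrix: off-diagonal terms penalise redundancy
    k = G @ t     # alignment of each dataset with the target
    lr = 1.0 / (2.0 * np.linalg.norm(K, 2) + 1e-12)  # 1 / Lipschitz constant
    w = np.full(len(G), 1.0 / len(G))         # start from uniform weights
    for _ in range(steps):
        grad = 2.0 * (K @ w - k)
        w = project_to_simplex(w - lr * grad)
    return w

# Toy example: datasets 0 and 1 are exact duplicates.
aux = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
weights = kmm_weights(aux, [0.5, 0.5])
print(weights)
```

In this toy case an alignment-only score would rate all three datasets equally, whereas the joint objective makes the two duplicates share the weight that a single copy would receive.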

preprint, 2026, arXiv

Learn to Evolve: Self-supervised Neural JKO Operator for Wasserstein Gradient Flow

The Jordan-Kinderlehrer-Otto (JKO) scheme provides a stable variational framework for computing Wasserstein gradient flows, but its practical use is often limited by the high computational cost of repeatedly solving the JKO subproblems. We propose a self-supervised approach for learning a JKO solution operator without requiring numerical solutions of any JKO trajectories. The learned operator maps an input density directly to the minimizer of the corresponding JKO subproblem, and can be iteratively applied to efficiently generate the gradient-flow evolution. A key challenge is that only a number of initial densities are typically available for training. To address this, we introduce a Learn-to-Evolve algorithm that jointly learns the JKO operator and its induced trajectories by alternating between trajectory generation and operator updates. As training progresses, the generated data increasingly approximates true JKO trajectories. Meanwhile, this Learn-to-Evolve strategy serves as a natural form of data augmentation, significantly enhancing the generalization ability of the learned operator. Numerical experiments demonstrate the accuracy, stability, and robustness of the proposed method across various choices of energies and initial conditions.
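For reference, a single JKO step in its standard textbook form is written below. The symbols are notation chosen here rather than taken from the abstract: $\tau > 0$ is the step size, $E$ the energy functional, $W_2$ the 2-Wasserstein distance, and $\rho_k$ the density at step $k$.

```latex
\rho_{k+1} \;=\; \operatorname*{arg\,min}_{\rho}\;
  \frac{1}{2\tau}\, W_2^2(\rho, \rho_k) \;+\; E(\rho)
```

In this notation, the learned operator described in the abstract plays the role of the map $\rho_k \mapsto \rho_{k+1}$, and applying it repeatedly generates the discrete gradient-flow trajectory $\rho_0, \rho_1, \rho_2, \dots$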

preprint, 2024, arXiv

SUANPAN: Scalable Photonic Linear Vector Machine

Photonic linear operation is a promising approach to handling the extensive vector multiplications in artificial intelligence techniques, owing to the natural bosonic parallelism and high-speed information transmission of photonics. Although it is believed that maximizing the interaction of the light beams is necessary to fully utilize this parallelism, and tremendous efforts have been made in past decades, the achieved dimensionality of vector-matrix multiplication remains very limited because of the difficulty of scaling up a tightly interconnected or highly coupled optical system. Additionally, there is still no universal photonic computing architecture that can be readily merged with existing computing systems to meet the computing power demand of AI techniques. Here, we propose a programmable and reconfigurable photonic linear vector machine that performs only the inner product of two vectors. It is formed by a series of independent basic computing units, each of which is just one light-emitter and photodetector pair. Since there is no interaction among the light beams inside, extreme scalability could be achieved by simply duplicating the independent basic computing unit, with no requirement for large-scale analog-to-digital converter and digital-to-analog converter arrays. Our architecture is inspired by the traditional Chinese Suanpan, or abacus, and is thus denoted photonic SUANPAN. As a proof of principle, the SUANPAN architecture is implemented with an 8×8 vertical-cavity surface-emitting laser array and an 8×8 MoTe2 two-dimensional material photodetector array. We believe that the proposed photonic SUANPAN can serve as a fundamental linear vector machine that can be readily merged with existing electronic digital computing systems and has the potential to enhance computing power for various future AI applications.
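The structure described in the abstract, independent emitter and detector pairs whose outputs are simply summed, can be sketched numerically as follows. This is a toy model under stated assumptions, not the device physics: it assumes one vector is encoded as emitter power and the other as detector responsivity, with non-negative values, and the function names are hypothetical.

```python
import numpy as np

def unit_photocurrent(emitter_power, detector_responsivity):
    """One basic unit: a single emitter illuminating a single detector.

    Assumed encoding (not stated in the abstract): the photocurrent is the
    product of the emitted power and the detector responsivity.
    """
    return emitter_power * detector_responsivity

def photonic_inner_product(x, w):
    """Sum the photocurrents of independent units; no unit affects another."""
    return float(sum(unit_photocurrent(xi, wi) for xi, wi in zip(x, w)))

# 64 independent units, matching the size of an 8x8 array.
rng = np.random.default_rng(0)
x = rng.random(64)   # values encoded on the emitters
w = rng.random(64)   # values encoded on the detectors
result = photonic_inner_product(x, w)
print(result, float(np.dot(x, w)))
```

Because the units never interact, enlarging the vector only means adding more pairs to the sum, which is the scalability argument the abstract makes.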

preprint, 2023, arXiv

Deep-learning-based on-chip rapid spectral imaging with high spatial resolution

Spectral imaging extends the concept of traditional color cameras to capture images across multiple spectral channels and has broad application prospects. Conventional spectral cameras based on scanning methods suffer from low acquisition speed and large volume. On-chip computational spectral imaging based on metasurface filters provides a promising scheme for portable applications, but suffers from long computation times for point-by-point iterative spectral reconstruction and from a mosaic effect in the reconstructed spectral images. In this study, we demonstrated on-chip rapid spectral imaging that eliminates the mosaic effect in the spectral image through deep-learning-based spectral data cube reconstruction. We experimentally achieved a speed improvement of four orders of magnitude over iterative spectral reconstruction and a spectral reconstruction fidelity of over 99% for a standard color board. In particular, we demonstrated video-rate spectral imaging for moving objects and outdoor driving scenes with good performance in recognizing metamerism, where the concolorous sky and white cars can be distinguished via their spectra, showing great potential for autonomous driving and other practical applications in the field of intelligent perception.

preprint, 2022, arXiv

A photon counting reconstructive spectrometer combining metasurfaces and superconducting nanowire single-photon detectors

Faint light spectroscopy has many important applications, such as fluorescence spectroscopy, lidar and astronomical observations. However, long measurement times limit its application to real-time measurement. In this work, a photon counting reconstructive spectrometer combining metasurfaces and superconducting nanowire single-photon detectors (SNSPDs) is proposed. A prototype device was fabricated on a silicon-on-insulator (SOI) substrate, and its performance was characterized. Experimental results show that this device supports spectral reconstruction of monochromatic light with a resolution of 2 nm in the wavelength region of 1500 nm to 1600 nm. The detection efficiency of this device is 1.4% to 3.2% in this wavelength region. The measurement time required by this photon counting reconstructive spectrometer was also investigated experimentally, showing its potential for use in scenarios requiring real-time measurement.

preprint, 2022, arXiv

MetaBalance: Improving Multi-Task Recommendations via Adapting Gradient Magnitudes of Auxiliary Tasks

In many personalized recommendation scenarios, the generalization ability of a target task can be improved by learning additional auxiliary tasks alongside the target task on a multi-task network. However, this method often suffers from a serious optimization imbalance problem. On the one hand, one or more auxiliary tasks might have a larger influence than the target task and even dominate the network weights, resulting in worse recommendation accuracy for the target task. On the other hand, the influence of one or more auxiliary tasks might be too weak to assist the target task. More challenging still, this imbalance changes dynamically throughout the training process and varies across parts of the same network. We propose a new method, MetaBalance, to balance auxiliary losses by directly manipulating their gradients w.r.t. the shared parameters in the multi-task network. Specifically, in each training iteration and adaptively for each part of the network, the gradient of an auxiliary loss is carefully reduced or enlarged to have a magnitude closer to that of the gradient of the target loss, preventing auxiliary tasks from being so strong that they dominate the target task or too weak to help it. Moreover, the proximity between the gradient magnitudes can be flexibly adjusted to adapt MetaBalance to different scenarios. The experiments show that our proposed method achieves a significant improvement of 8.34% in terms of NDCG@10 over the strongest baseline on two real-world datasets. The code of our approach can be found here: https://github.com/facebookresearch/MetaBalance
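The gradient rescaling described above can be sketched for a single parameter group as follows. This is an illustrative reading of the abstract, not the released MetaBalance code: the blending rule, the relaxation factor `r` and the function name are assumptions, and the repository linked above is the authoritative implementation.

```python
import numpy as np

def balance_aux_gradient(g_aux, g_target, r=0.7, eps=1e-12):
    """Pull the magnitude of an auxiliary gradient toward the target's.

    Assumed rule for illustration:
      r = 1 -> g_aux is rescaled to the same norm as g_target
      r = 0 -> g_aux is returned unchanged
    Values in between interpolate, which is one way to make the
    'proximity between the gradient magnitudes' adjustable.
    """
    g_aux = np.asarray(g_aux, dtype=float)
    g_target = np.asarray(g_target, dtype=float)
    scale = np.linalg.norm(g_target) / (np.linalg.norm(g_aux) + eps)
    return r * scale * g_aux + (1.0 - r) * g_aux

# An auxiliary gradient 50x larger than the target gradient.
g_target = np.array([0.6, 0.8])    # norm 1
g_aux = np.array([30.0, 40.0])     # norm 50
balanced = balance_aux_gradient(g_aux, g_target, r=0.5)
print(np.linalg.norm(balanced))
```

In a training loop this function would be called once per shared parameter group and per step, and the balanced auxiliary gradient would be added to the target gradient before the optimizer update. The direction of the auxiliary gradient is preserved; only its length changes.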

preprint, 2022, arXiv

Programmable Unitary Operations for Orbital Angular Momentum Encoded States

We have proposed and demonstrated a scalable and efficient scheme for programmable unitary operations in the orbital angular momentum (OAM) domain. Based on matrix decomposition into diagonal and Fourier factors, arbitrary matrix operators can be implemented using only diagonal matrices acting alternately on the OAM domain and the azimuthal angle domain, which are linked by the Fourier transform. With numerical simulations, unitary matrices of dimensionality 3×3 are designed and discussed for the OAM domain. The parallelism of the proposed scheme is also demonstrated with two 3×3 matrices. Furthermore, as an alternative way to verify our proposal, proof-of-principle experiments have been performed in the path domain with the same matrix decomposition method, in which an average fidelity of 0.97 is obtained over 80 experimental results with dimensionality 3×3.
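The alternating structure described above, diagonal operators applied in two domains linked by a Fourier transform, can be sketched numerically. The example below only builds such a product from phase-only masks and checks that it is unitary. It does not show how to find the masks for a given target matrix, and the function names and number of layers are assumptions for illustration.

```python
import numpy as np

def dft_matrix(n):
    """Unitary discrete Fourier transform matrix of size n x n."""
    idx = np.arange(n)
    return np.exp(-2j * np.pi * np.outer(idx, idx) / n) / np.sqrt(n)

def alternating_operator(phase_layers):
    """Compose diagonal phase masks applied alternately in two domains.

    Even-indexed layers act directly in the first domain; odd-indexed
    layers act in the Fourier-conjugate domain (F^dagger D F).
    """
    n = len(phase_layers[0])
    F = dft_matrix(n)
    U = np.eye(n, dtype=complex)
    for k, phases in enumerate(phase_layers):
        D = np.diag(np.exp(1j * np.asarray(phases, dtype=float)))
        if k % 2 == 0:
            U = D @ U
        else:
            U = F.conj().T @ D @ F @ U
    return U

# Three random phase masks acting on a 3-dimensional space.
rng = np.random.default_rng(1)
layers = rng.uniform(0.0, 2.0 * np.pi, size=(3, 3))
U = alternating_operator(layers)
print(np.allclose(U.conj().T @ U, np.eye(3)))
```

Each factor is unitary, so the product is unitary for any choice of phases; the design problem addressed in the paper is choosing the diagonal factors so that the product equals a desired operator.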

preprint, 2020, arXiv

An entanglement-based quantum network based on symmetric dispersive optics quantum key distribution

Quantum key distribution (QKD) is a crucial technology for future information security. Developing simple and efficient ways to establish QKD among multiple users is important for extending the applications of QKD in communication networks. Herein, we propose a scheme of symmetric dispersive optics QKD (DO-QKD) and demonstrate an entanglement-based quantum network based on it. In the experiment, a broadband entangled photon pair source was shared by end users via wavelength and space division multiplexing. The wide spectrum of generated entangled photon pairs was divided into 16 combinations of frequency-conjugate channels. Photon pairs in each channel combination supported a fully connected subnet with 8 users by means of a passive beam splitter. The results show that an entanglement-based QKD network of over 100 users could be supported by one entangled photon pair source in this architecture. It has great potential for applications in local quantum networks with large numbers of users.

preprint, 2020, arXiv

Feature Interaction Interpretability: A Case for Explaining Ad-Recommendation Systems via Neural Interaction Detection

Recommendation is a prevalent application of machine learning that affects many users; therefore, it is important for recommender models to be accurate and interpretable. In this work, we propose a method to both interpret and augment the predictions of black-box recommender systems. In particular, we propose to interpret feature interactions from a source recommender model and explicitly encode these interactions in a target recommender model, where both source and target models are black-boxes. By not assuming the structure of the recommender system, our approach can be used in general settings. In our experiments, we focus on a prominent use of machine learning recommendation: ad-click prediction. We found that our interaction interpretations are both informative and predictive, e.g., significantly outperforming existing recommender models. What's more, the same approach to interpret interactions can provide new insights into domains even beyond recommendation, such as text and image classification.

preprint, 2020, arXiv

MixModule: Mixed CNN Kernel Module for Medical Image Segmentation

Convolutional neural networks (CNNs) have been successfully applied to medical image classification, segmentation, and related tasks. Among the many CNN architectures, U-Net and its improved variants are widely used and have achieved state-of-the-art performance in recent years. These improved architectures focus on structural improvements, and the size of the convolution kernel is generally fixed. In this paper, we propose a module that combines the benefits of multiple kernel sizes, and we apply the proposed module to U-Net and its variants. We test our module on three segmentation benchmark datasets, and experimental results show significant improvement.

preprint, 2018, arXiv

Universal linear optical operations on discrete phase-coherent spatial modes

Linear optical operations are fundamental and significant for both quantum mechanics and classical technologies. We demonstrate a non-cascaded approach to perform arbitrary unitary and non-unitary linear operations for N-dimensional phase-coherent spatial modes with meticulously designed phase gratings. As implemented on spatial light modulators (SLMs), the unitary transformation matrix has been realized with dimensionalities ranging from 7 to 24, and the corresponding fidelities range from 95.1% to 82.1%. For the non-unitary operators, a matrix is presented for the tomography of a 4-level quantum system with a fidelity of 94.9%. Thus, the linear operator has been successfully implemented with much higher dimensionality than in previous reports. It should be mentioned that our method is not limited to SLMs and can be easily applied to other devices. We therefore believe that our proposal provides another option for performing linear operations with a simple, fixed, error-tolerant and scalable scheme.

preprint, 2017, arXiv

Identifying the tilt angle and correcting the orbital angular momentum spectrum dispersion of misaligned light beam

Axis tilt of a light beam in an optical system introduces dispersion of the orbital angular momentum (OAM) spectrum. To deal with this, a two-step method is proposed and demonstrated. First, the tilt angle of the optical axis is identified from a deduced relation between the tilt angle and the variation of OAM topological charges with different reference axes, which is obtained with the help of a charge-coupled device (CCD) camera. In our experiments, the precision of the measured tilt angle is about 10^-4 rad for OAM orders of -3 to 3. Using the measured angle, the additional phase delay due to axis tilt can be calculated, so that the dispersion of the OAM spectrum can be corrected with a simple formula even while the optical axis is not aligned. The experimental results indicate that the original OAM spectrum has been successfully extracted not only for the pure OAM state but also for superposed OAM states.

preprint, 2014, arXiv

Analysis of rainfall seasonality from observations and climate models

Two new indicators of rainfall seasonality based on information entropy, the relative entropy (RE) and the dimensionless seasonality index (DSI), together with the mean annual rainfall, are evaluated on a global scale for recently updated precipitation gridded datasets and for historical simulations from coupled atmosphere-ocean general circulation models. The RE provides a measure of the number of wet months and, for precipitation regimes featuring one maximum in the monthly rain distribution, it is related to the duration of the wet season. The DSI combines the rainfall intensity with its degree of seasonality and is an indicator of the extent of the global monsoon region. We show that the RE and the DSI are fairly independent of the time resolution of the precipitation data, thereby allowing objective metrics for model intercomparison and ranking. Regions with different precipitation regimes are classified and characterized in terms of RE and DSI. Comparison of different land observational datasets reveals substantial differences in their local representation of seasonality. It is shown that two-dimensional maps of RE provide an easy way to compare rainfall seasonality from various datasets and to determine areas of interest. CMIP5 models consistently overestimate the RE over tropical Latin America and underestimate it in Western Africa and East Asia. It is demonstrated that positive RE biases in a GCM are associated with simulated monthly precipitation fractions which are too large during the wet months and too small in the months preceding the wet season; negative biases are instead due to an excess of rainfall during the dry months.
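The two indicators can be sketched as below. The abstract gives no formulas, so this assumes a common information-theoretic definition: RE is the Kullback-Leibler divergence (base 2) of the monthly rainfall fractions from a uniform distribution, and DSI is RE multiplied by annual rainfall normalised by a reference maximum. The function names and the `r_max` parameter are hypothetical.

```python
import numpy as np

def relative_entropy(monthly_rain):
    """RE of monthly rainfall fractions relative to a uniform year (bits).

    Assumed definition: sum_m p_m * log2(p_m * n_months), with 0*log(0) = 0.
    Gives 0 for perfectly even rainfall and log2(12) if all rain falls
    in a single month.
    """
    r = np.asarray(monthly_rain, dtype=float)
    total = r.sum()
    if total <= 0:
        raise ValueError("annual rainfall must be positive")
    p = r / total
    nz = p > 0                       # skip dry months
    return float(np.sum(p[nz] * np.log2(p[nz] * len(p))))

def dimensionless_seasonality_index(monthly_rain, r_max):
    """Assumed definition: RE scaled by annual rainfall over a reference r_max."""
    annual = float(np.sum(monthly_rain))
    return relative_entropy(monthly_rain) * annual / r_max

uniform = [100.0] * 12                     # no seasonality
three_wet = [400.0] * 3 + [0.0] * 9        # rain confined to three months
print(relative_entropy(uniform), relative_entropy(three_wet))
```

Under this definition, rainfall spread evenly over n months gives RE = log2(12/n), so 12 · 2^(-RE) recovers n. This is one way to read the statement that the RE measures the number of wet months.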

preprint, 2014, arXiv

Projected changes of rainfall seasonality and dry spells in a high concentration pathway 21st century scenario

In this diagnostic study we analyze changes in rainfall seasonality and dry spells by the end of the twenty-first century under the most extreme IPCC5 emission scenario (RCP8.5), as projected by twenty-four coupled climate models participating in the Coupled Model Intercomparison Project 5 (CMIP5). We use estimates of the centroid of the monthly rainfall distribution as an index of rainfall timing, and a threshold-independent, information theory-based quantity, the relative entropy (RE), to quantify the concentration of annual rainfall and the number of dry months and to build a monsoon dimensionless seasonality index (DSI). The RE is projected to increase, with high inter-model agreement, over Mediterranean-type regions (southern Europe, northern Africa and southern Australia) and areas of South and Central America, implying an increase in the number of dry days of up to one month by the end of the twenty-first century. Positive RE changes are also projected over the monsoon regions of southern Africa, North America and South America. These trends are consistent with a shortening of the wet season associated with a more prolonged pre-monsoonal dry period. The extent of the global monsoon region, characterized by large DSI, is projected to remain substantially unaltered. Centroid analysis shows that most CMIP5 projections suggest that the monsoonal annual rainfall distribution will shift from early to late in the course of the hydrological year by the end of the twenty-first century, particularly after 2050. This trend is particularly evident over Northern Africa, Southern Africa and western Mexico, where more than 90% of the models project a delay of the rainfall centroid of a few days up to two weeks. Over the remaining monsoonal regions, there is little inter-model agreement in terms of centroid changes.
