Source author record

Xin Ma

Xin Ma appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Computer Vision, cond-mat.mtrl-sci, astro-ph.SR, Machine Learning, math.DS, math.OA, Artificial Intelligence, Computation and Language, math.GR, math.GT, math.LO, math.OC, math.RT, Methodology, Networking and Internet Architecture, quant-ph, Robotics

Catalog footprint

What is connected

22 works

17 topics

4 close collaborators

preprint, 2026, arXiv

More Edits, More Stable: Understanding the Lifelong Normalization in Sequential Model Editing

Lifelong Model Editing aims to continuously update evolving facts in Large Language Models while preserving unrelated knowledge and general capabilities, yet it remains plagued by catastrophic forgetting and model collapse. Empirically, we find that recent editors resilient over long horizons share the same core strategy: Lifelong Normalization (LN), which normalizes value gradients using running statistics. Removing LN causes immediate performance collapse, and we observe a counter-intuitive positive cumulative effect where early edits can promote the success of future edits. Yet the mechanism of LN remains a "black box", leaving its precise role in lifelong stability poorly understood. In this work, we provide the first theoretical account of LN in the lifelong regime. Our analysis reveals a self-reinforcing stability loop and proves that, when combined with ridge-regularized regression, LN yields parameter updates with asymptotic orthogonality and bounded norms, directly mitigating forgetting and systemic collapse. Based on these insights, we derive StableEdit, which strengthens this stability loop via an explicit warm-up stage and full whitening, improving long-horizon stability at minimal overhead. Extensive experiments validate our theory and demonstrate competitive performance. Our code is available at https://github.com/MINE-USTC/StableEdit.
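The idea of normalizing with running statistics can be illustrated with a minimal sketch. It assumes simple per-coordinate standardization with Welford's online update; the class name `RunningNormalizer` and its methods are hypothetical, and this is not the StableEdit implementation (which the abstract says uses full whitening and a warm-up stage).

```python
import math


class RunningNormalizer:
    """Hypothetical helper: per-coordinate running mean/variance (Welford's method)."""

    def __init__(self, dim):
        self.count = 0
        self.mean = [0.0] * dim
        self.m2 = [0.0] * dim  # running sum of squared deviations from the mean

    def update(self, vec):
        """Fold one new vector into the running statistics."""
        self.count += 1
        for i, x in enumerate(vec):
            delta = x - self.mean[i]
            self.mean[i] += delta / self.count
            self.m2[i] += delta * (x - self.mean[i])

    def normalize(self, vec, eps=1e-8):
        """Standardize a vector with the statistics seen so far."""
        if self.count < 2:
            return list(vec)  # not enough history to estimate a variance
        out = []
        for i, x in enumerate(vec):
            var = self.m2[i] / (self.count - 1)  # sample variance
            out.append((x - self.mean[i]) / math.sqrt(var + eps))
        return out


norm = RunningNormalizer(dim=2)
for grad in ([1.0, 10.0], [3.0, 30.0], [5.0, 50.0]):
    norm.update(grad)
scaled = norm.normalize([5.0, 50.0])
```

Both coordinates of `scaled` land on the same scale even though the raw inputs differ by a factor of ten, which is the kind of effect that keeps successive update magnitudes comparable.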

preprint, 2024, arXiv

InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation

This paper introduces InternVid, a large-scale video-centric multimodal dataset that enables learning powerful and transferable video-text representations for multimodal understanding and generation. The InternVid dataset contains over 7 million videos lasting nearly 760K hours, yielding 234M video clips accompanied by detailed descriptions totaling 4.1B words. Our core contribution is to develop a scalable approach to autonomously build a high-quality video-text dataset with large language models (LLMs), thereby showcasing its efficacy in learning video-language representation at scale. Specifically, we utilize a multi-scale approach to generate video-related descriptions. Furthermore, we introduce ViCLIP, a video-text representation learning model based on ViT-L. Learned on InternVid via contrastive learning, this model demonstrates leading zero-shot action recognition and competitive video retrieval performance. Beyond basic video understanding tasks like recognition and retrieval, our dataset and model have broad applications. They are particularly beneficial for generating interleaved video-text data for learning a video-centric dialogue system, advancing video-to-text and text-to-video generation research. These proposed resources provide a tool for researchers and practitioners interested in multimodal video understanding and generation.
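The abstract does not specify the contrastive objective. As a rough illustration only, the sketch below assumes a CLIP-style symmetric InfoNCE loss over a square video-text similarity matrix; the function name `info_nce` and the temperature value are assumptions, not details taken from ViCLIP.

```python
import math


def info_nce(sim, temperature=0.07):
    """Symmetric contrastive loss for a square similarity matrix.

    sim[i][j] is the similarity of video i and text j; matched pairs
    are assumed to sit on the diagonal.
    """
    n = len(sim)

    def cross_entropy_rows(matrix):
        total = 0.0
        for i, row in enumerate(matrix):
            logits = [s / temperature for s in row]
            peak = max(logits)  # subtract the max for numerical stability
            log_z = peak + math.log(sum(math.exp(v - peak) for v in logits))
            total += log_z - logits[i]  # negative log-probability of the match
        return total / n

    transposed = [list(col) for col in zip(*sim)]
    # Average the video-to-text and text-to-video directions.
    return 0.5 * (cross_entropy_rows(sim) + cross_entropy_rows(transposed))


aligned = [[1.0, 0.0], [0.0, 1.0]]
shuffled = [[0.0, 1.0], [1.0, 0.0]]
loss_aligned = info_nce(aligned)
loss_shuffled = info_nce(shuffled)
```

The loss is small when matched pairs score highest in their row and column, and large when mismatched pairs score higher.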

preprint, 2022, arXiv

A categorical study on the generalized type semigroup

In this short note, we show that the generalized type semigroup $\CW(X, Γ)$ introduced by the author in \cite{M3} belongs to the category \textnormal{W}. In particular, we demonstrate that $\CW(X, Γ)$ satisfies axioms (W1)-(W4) and (W6). When $X$ is zero-dimensional, we also establish (W5) for the semigroup. This supports the analogy between the generalized type semigroup and pre-completed Cuntz semigroup $W(\cdot)$ for $C^*$-algebras.

preprint, 2022, arXiv

A DNS Tunnel Sliding Window Differential Detection Method Based on Normal Distribution Reasonable Range Filtering

A covert attack method often used by APT organizations is the DNS tunnel, which is used to pass information by constructing C2 networks. These organizations often change domain names and server IP addresses frequently to evade monitoring, which makes them extremely difficult to detect. However, they carry DNS tunnel traffic inside normal DNS communication, which inevitably introduces anomalies in some statistical characteristics of DNS traffic and gives security personnel the opportunity to find them. Based on these considerations, this paper studies a statistical methodology for discovering typical high-frequency query behavior of DNS tunnels. Firstly, we analyze the distribution of DNS domain name lengths and query counts and find that both follow a normal distribution. Secondly, based on this distribution law, we propose a method for detecting high-frequency DNS query behavior across multiple domain names based on the statistical rules of domain name length and frequency, and we give three theorems as theoretical support. Thirdly, we design a sliding window difference scheme based on the above method. Experimental results show that our method has a higher detection rate. At the same time, since our method does not need to construct a data set, it is more practical for detecting unknown DNS tunnels. This also shows that our detection method based on mathematical models can effectively avoid the dilemma of machine learning methods, which must have useful training data sets, and has strong practical significance.
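The two ingredients named in the abstract, a normal-distribution "reasonable range" and a sliding-window difference, can be sketched as follows. This is an illustration under stated assumptions, not the paper's algorithm: the function names, the choice of k = 3 standard deviations, and the window parameters are all hypothetical.

```python
from statistics import mean, stdev


def reasonable_range(samples, k=3.0):
    """Return (low, high) = mean -/+ k standard deviations of the samples."""
    mu, sigma = mean(samples), stdev(samples)
    return mu - k * sigma, mu + k * sigma


def window_counts(timestamps, window, step):
    """Count queries in each sliding window [start, start + window)."""
    if not timestamps:
        return []
    counts = []
    start, end = min(timestamps), max(timestamps)
    while start <= end:
        counts.append(sum(1 for t in timestamps if start <= t < start + window))
        start += step
    return counts


def flag_bursts(counts, baseline_diffs, k=3.0):
    """Indices of windows whose count jumps above the previous window
    by more than the upper bound of the baseline difference range."""
    _, high = reasonable_range(baseline_diffs, k)
    diffs = [b - a for a, b in zip(counts, counts[1:])]
    return [i + 1 for i, d in enumerate(diffs) if d > high]


# Domain-name lengths observed in assumed-benign traffic (made-up values).
low, high = reasonable_range([10, 12, 11, 13, 12, 11, 10, 12])
suspicious_length = 60 > high

counts = window_counts([0, 1, 2, 10, 10.1, 10.2, 10.3, 10.4], window=5, step=5)
bursts = flag_bursts([3, 3, 4, 3, 40, 3], baseline_diffs=[0, 1, -1, 0, 1, -1, 0])
```

Because the bounds come from summary statistics of observed traffic rather than a labeled training set, a check of this shape needs no tunnel examples, which matches the practicality argument in the abstract.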

preprint, 2022, arXiv

Boundary actions of CAT(0) spaces and their $C^*$-algebras

In this paper, we study boundary actions of CAT(0) spaces from the point of view of topological dynamics and $C^*$-algebras. First, we investigate the actions of right-angled Coxeter groups and right-angled Artin groups with finite defining graphs on the visual boundaries and the Nevo-Sageev boundaries of their naturally assigned CAT(0) cube complexes. In particular, we establish (strongly) pure infiniteness results for reduced crossed product $C^*$-algebras of these actions by investigating the corresponding CAT(0) cube complexes and establishing necessary dynamical properties such as minimality, topological freeness and pure infiniteness of the actions. In addition, we study actions of fundamental groups of graphs of groups on the visual boundaries of their Bass-Serre trees. We show that the existence of repeatable paths essentially implies that the action is $2$-filling, from which we also obtain a large class of unital Kirchberg algebras. Furthermore, our result also provides a new method for identifying $C^*$-simple generalized Baumslag-Solitar groups. The examples of groups obtained from our method have $n$-paradoxical towers in the sense of \cite{G-G-K-N}. This class contains, in particular, non-degenerate free products, Baumslag-Solitar groups and fundamental groups of $n$-circles or wedge sums of $n$-circles.

preprint, 2022, arXiv

Compressing Models with Few Samples: Mimicking then Replacing

Few-sample compression aims to compress a big redundant model into a small compact one with only a few samples. If we fine-tune models directly on these few samples, they are prone to overfitting and learn almost nothing. Hence, previous methods optimize the compressed model layer by layer and try to make every layer produce the same outputs as the corresponding layer in the teacher model, which is cumbersome. In this paper, we propose a new framework named Mimicking then Replacing (MiR) for few-sample compression, which first urges the pruned model to output the same features as the teacher's in the penultimate layer, and then replaces the teacher's layers before the penultimate layer with a well-tuned compact one. Unlike previous layer-wise reconstruction methods, our MiR optimizes the entire network holistically, which is not only simple and effective, but also unsupervised and general. MiR outperforms previous methods by large margins. Codes will be available soon.

preprint, 2022, arXiv

Disjoint Masking with Joint Distillation for Efficient Masked Image Modeling

Masked image modeling (MIM) has shown great promise for self-supervised learning (SSL) yet has been criticized for learning inefficiency. We believe the insufficient utilization of training signals is responsible. To alleviate this issue, we introduce a conceptually simple yet learning-efficient MIM training scheme, termed Disjoint Masking with Joint Distillation (DMJD). For disjoint masking (DM), we sequentially sample multiple masked views per image in a mini-batch with the disjoint regulation to raise the usage of tokens for reconstruction in each image while keeping the masking rate of each view. For joint distillation (JD), we adopt a dual-branch architecture to predict invisible (masked) and visible (unmasked) tokens, respectively, with superior learning targets. Rooted in orthogonal perspectives on improving training efficiency, DM and JD cooperatively accelerate training convergence without sacrificing the model's generalization ability. Concretely, DM can train ViT with half of the effective training epochs (3.7 times less time-consuming) to report competitive performance. With JD, our DMJD clearly improves the linear probing classification accuracy over ConvMAE by 5.8%. On fine-grained downstream tasks like semantic segmentation, object detection, etc., our DMJD also presents superior generalization compared with state-of-the-art SSL methods. The code and model will be made public at https://github.com/mx-mark/DMJD.

preprint, 2022, arXiv

FedSSO: A Federated Server-Side Second-Order Optimization Algorithm

In this work, we propose FedSSO, a server-side second-order optimization method for federated learning (FL). In contrast to previous works in this direction, we employ a server-side approximation for the Quasi-Newton method without requiring any training data from the clients. In this way, we not only shift the computation burden from clients to server, but also entirely eliminate the additional communication for second-order updates between clients and server. We provide a theoretical guarantee for the convergence of our novel method, and empirically demonstrate its fast convergence and communication savings in both convex and non-convex settings.

preprint, 2022, arXiv

Multi-task Learning with High-Dimensional Noisy Images

Recent medical imaging studies have given rise to distinct but inter-related datasets corresponding to multiple experimental tasks or longitudinal visits. Standard scalar-on-image regression models that fit each dataset separately are not equipped to leverage information across inter-related images, and existing multi-task learning approaches are compromised by the inability to account for the noise that is often observed in images. We propose a novel joint scalar-on-image regression framework involving wavelet-based image representations with grouped penalties that are designed to pool information across inter-related images for joint learning, and which explicitly accounts for noise in high-dimensional images via a projection-based approach. In the presence of non-convexity arising from noisy images, we derive non-asymptotic error bounds under non-convex as well as convex grouped penalties, even when the number of voxels increases exponentially with sample size. A projected gradient descent algorithm is used for computation, which is shown to approximate the optimal solution via well-defined non-asymptotic optimization error bounds under noisy images. Extensive simulations and application to a motivating longitudinal Alzheimer's disease study illustrate significantly improved predictive ability and greater power to detect true signals that are simply missed by existing methods without noise correction because of the attenuation-to-null phenomenon.

preprint, 2022, arXiv

Resolving subcategories and dimensions in recollements of extriangulated categories

Recently, Wang, Wei and Zhang introduced the notion of recollements of extriangulated categories. In this paper, let $(\mathcal{A},\mathcal{B},\mathcal{C})$ be a recollement of extriangulated categories. We provide some methods to construct resolving subcategories in $(\mathcal{A},\mathcal{B},\mathcal{C})$. As applications of the Auslander-Reiten correspondence, we get the gluing of cotilting modules in a recollement of module categories for artin algebras. We also give some bounds of resolution dimensions of the categories involved in $(\mathcal{A},\mathcal{B},\mathcal{C})$ with respect to resolving subcategories, which generalize some known results.

preprint, 2022, arXiv

TL-GAN: Improving Traffic Light Recognition via Data Synthesis for Autonomous Driving

Traffic light recognition, as a critical component of the perception module of self-driving vehicles, plays a vital role in intelligent transportation systems. The prevalent deep learning based traffic light recognition methods hinge heavily on the large quantity and rich diversity of training data. However, it is quite challenging to collect data in various rare scenarios such as flashing, blackout or extreme weather, which results in an imbalanced distribution of training data and consequently degraded performance in recognizing rare classes. In this paper, we seek to improve traffic light recognition by leveraging data synthesis. Inspired by generative adversarial networks (GANs), we propose a novel traffic light generation approach, TL-GAN, to synthesize data for rare classes and improve traffic light recognition for autonomous driving. TL-GAN disentangles traffic light sequence generation into image synthesis and sequence assembling. In the image synthesis stage, our approach enables conditional generation to allow full control of the color of the generated traffic light images. In the sequence assembling stage, we design the style mixing and adaptive template to synthesize realistic and diverse traffic light sequences. Extensive experiments show that the proposed TL-GAN yields a remarkable improvement over the baseline trained without the generated data, leading to state-of-the-art performance in comparison with competing algorithms used for general image synthesis and data imbalance tackling.

preprint, 2021, arXiv

Adaptive Deconvolution-based stereo matching Net for Local Stereo Matching

In deep learning-based local stereo matching methods, larger image patches usually bring better stereo matching accuracy. However, it is unrealistic to increase the image patch size without restriction. Arbitrarily extending the patch size turns the local stereo matching method into a global stereo matching method, and the matching accuracy saturates. We simplify the existing Siamese convolutional network by reducing the number of network parameters and propose an efficient CNN-based structure, namely the Adaptive Deconvolution-based disparity matching Net (ADSM net), which adds deconvolution layers that learn how to enlarge the input feature map for the following convolution layers. Experimental results on the KITTI 2012 and 2015 datasets demonstrate that the proposed method can achieve a good trade-off between accuracy and complexity.
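To make concrete how a deconvolution (transposed convolution) layer enlarges a feature map, the sketch below uses the standard output-size relation for a transposed convolution with dilation 1. The helper name and the example sizes are illustrative assumptions; the abstract does not give the layer hyperparameters used in ADSM net.

```python
def deconv_output_size(size, kernel, stride=1, padding=0, output_padding=0):
    """Spatial output size of a transposed convolution with dilation 1."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding


# Illustrative values only: a 9x9 feature map doubled by a stride-2 layer,
# and the same map grown by two pixels with a stride-1, 3x3 kernel.
doubled = deconv_output_size(9, kernel=4, stride=2, padding=1)
grown = deconv_output_size(9, kernel=3)
```

With these example settings a small input patch can be expanded before the following convolution layers, without feeding a larger patch into the network.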

preprint, 2021, arXiv

Interfacial Dzyaloshinskii-Moriya interaction and spin-orbit torque in Au1-xPtx/Co bilayers with varying interfacial spin-orbit coupling

The quantitative roles of the interfacial spin-orbit coupling (SOC) in Dzyaloshinskii-Moriya interaction (DMI) and dampinglike spin-orbit torque (τDL) have remained unsettled after a decade of intensive study. Here, we report conclusive experimental evidence that, because of the critical role of the interfacial orbital hybridization, the interfacial DMI is not necessarily a linear function of the interfacial SOC, e.g., at Au1-xPtx/Co interfaces where the interfacial SOC can be tuned significantly via the strongly composition (x)-dependent spin-orbit proximity effect without varying the bulk SOC and the electronegativity of the Au1-xPtx layer. We also find that τDL in the Au1-xPtx/Co bilayers varies distinctly from the interfacial SOC as a function of x, indicating no important τDL contribution from the interfacial Rashba-Edelstein effect.

preprint, 2021, arXiv

Point-line-based RGB-D SLAM and Bundle Adjustment Uncertainty Analysis

Most state-of-the-art indirect visual SLAM methods are based on sparse point features. However, it is hard to find enough reliable point features for state estimation in low-textured scenes. Line features are abundant in urban and indoor scenes. Recent studies have shown that combining point and line features can provide better accuracy despite the decrease in computational efficiency. In this paper, measurements of point and line features are extracted from RGB-D data to create map features, and points on a line are treated as keypoints. We propose an extended approach to make more use of line observation information. We also prove that, in the local bundle adjustment, the estimation uncertainty of keyframe poses can be reduced when more landmarks with independent measurements are considered in the optimization process. Experimental results on two public RGB-D datasets demonstrate that the proposed method has better robustness and accuracy in challenging environments.

preprint, 2016, arXiv

Current Control of Magnetic Anisotropy via Stress in a Ferromagnetic Metal Waveguide

We demonstrate that in-plane charge current can effectively control the spin precession resonance in an Al2O3/CoFeB/Ta heterostructure. Brillouin Light Scattering (BLS) was used to detect the ferromagnetic resonance field under microwave excitation of spin waves at fixed frequencies. The current control of spin precession resonance originates from modification of the in-plane uniaxial magnetic anisotropy field H_k, which changes symmetrically with respect to the current direction. Numerical simulation suggests that the anisotropic stress introduced by Joule heating plays an important role in controlling H_k. These results provide new insights into current manipulation of magnetic properties and have broad implications for spintronic devices.

preprint, 2016, arXiv

Interfacial Control of Dzyaloshinskii Moriya Interaction in Heavy Metal_Ferromagnetic Metal Thin Film Heterostructures

The interfacial Dzyaloshinskii-Moriya Interaction (DMI) in ultrathin magnetic thin film heterostructures provides a new approach for controlling spin textures on mesoscopic length scales. Here we investigate the dependence of the interfacial DMI constant D on a Pt wedge insertion layer in Ta/CoFeB/Pt(wedge)/MgO thin films by observing the asymmetric spin wave dispersion using Brillouin light scattering. Continuous tuning of D by more than a factor of three is realized by inserting less than one monolayer of Pt. The observations provide new insights for designing magnetic thin film heterostructures with tailored D for controlling skyrmions and magnetic domain wall chirality and dynamics.

preprint, 2016, arXiv

Magnons and Phonons Optically Driven Out of Local Equilibrium in a Magnetic Insulator

Magnons are the energy quanta of fundamental spin excitations, namely spin waves, and they can make a considerable contribution to energy transport in some magnetic materials in a similar manner as lattice vibration waves or phonons. The coupling and possible non-equilibrium between magnons and other energy carriers have been used to explain several recently discovered thermally driven spin transport and energy conversion phenomena. Here, we report experiments in which local non-equilibrium between magnons and phonons in a single crystalline bulk magnetic insulator, Y3Fe5O12 (yttrium iron garnet, or YIG), has been created optically within a focused laser spot and probed directly with the use of micro-Brillouin light scattering (BLS). By analyzing the experimental results with a thermally induced magnon diffusion model, we obtain the magnon diffusion length of thermal magnons. By explicitly establishing non-equilibrium between magnons and phonons, our studies represent an important step toward a quantitative understanding of various spin-heat coupling phenomena.

preprint, 2016, arXiv

Quantum Phase Operator and Phase States

A Hermitian quantum phase operator is formulated that mirrors the classical phase variable with proper time dependence and satisfies trigonometric identities. The eigenstates of the phase operator are solved in terms of Gegenbauer ultraspherical polynomials in the number state representation.

preprint, 2015, arXiv

A super-Eddington wind scenario for the progenitors of type Ia supernovae: binary population synthesis calculations

The super-Eddington wind scenario has been proposed as an alternative way of producing type Ia supernovae (SNe Ia). The super-Eddington wind can naturally prevent the carbon-oxygen white dwarfs (CO WDs) with high mass-accretion rates from becoming red-giant-like stars. Furthermore, it works in low-metallicity environments, which may explain SNe Ia observed at high redshifts. In this article, we systematically investigated the most prominent single-degenerate WD+MS channel based on the super-Eddington wind scenario. We combined the Eggleton stellar evolution code with a rapid binary population synthesis (BPS) approach to predict SN Ia birthrates for the WD+MS channel by adopting the super-Eddington wind scenario and detailed mass-accumulation efficiencies of H-shell flashes on the WDs. Our BPS calculations found that the estimated SN Ia birthrates for the WD+MS channel are $\sim 0.009$-$0.315 \times 10^{-3}\,\mathrm{yr}^{-1}$ if we adopt the Eddington accretion rate as the critical accretion rate, which is much lower than the observed rate (<10% of the observed SN Ia birthrates). This indicates that the WD+MS channel contributes only a small proportion of all SNe Ia. The birthrates in this simulation are lower than those in previous studies, mainly because new mass-accumulation efficiencies of H-shell flashes are adopted. We also found that the critical mass-accretion rate has a significant influence on the birthrates of SNe Ia. Meanwhile, the results of our BPS calculations are sensitive to the values of the common-envelope ejection efficiency.

preprint, 2015, arXiv

Super-Eddington wind scenario for the progenitors of type Ia supernovae: Accreting He-rich matter onto white dwarfs

Supernovae of type Ia (SNe Ia) are believed to be thermonuclear explosions of carbon-oxygen white dwarfs (CO WDs). However, the mass accretion process onto CO WDs is still not completely understood. In this paper, we study the accretion of He-rich matter onto CO WDs and explore a scenario in which a strong wind forms on the surface of the WD if the total luminosity exceeds the Eddington limit. Using a stellar evolution code called Modules for Experiments in Stellar Astrophysics (MESA), we simulated the He accretion process onto CO WDs for WDs with masses of $0.6$-$1.35\,M_\odot$ and various accretion rates of $10^{-8}$-$10^{-5}\,M_\odot\,\mathrm{yr}^{-1}$. If the contribution of the total luminosity is included when determining the Eddington accretion rate, then a super-Eddington wind could be triggered at relatively lower accretion rates than those of previous studies based on steady-state models. The super-Eddington wind can prevent the WDs with high accretion rates from evolving into red-giant-like He stars. We found that the contributions from thermal energy of the WD are non-negligible, judging by our simulations, even though the nuclear burning energy is the dominating source of luminosity. We also provide the limits of the steady He-burning regime in which the WDs do not lose any accreted matter and increase their mass steadily, and calculated the mass retention efficiency during He layer flashes for various WD masses and accretion rates. These obtained results can be used in future binary population synthesis computations.

preprint, 2014, arXiv

On Schauder Equivalence Relations

In this paper, we study Schauder equivalence relations, which are Borel equivalence relations generated by Banach spaces with basic sequences. We prove that the set of equivalence relations generated by basic sequences has boundaries. Then we show that equivalence relations generated by the basis in Tsirelson spaces have properties similar to those of Tsirelson spaces in Banach space theory. In particular, we prove that both l_p and c_0 are not reducible to the equivalence relation generated by the Tsirelson space T with the unit vector basis \{t_n\}. We also show that Borel equivalence relations generated by α-Tsirelson spaces are mutually incompatible. Based on this argument, we show that any basis of Schauder equivalence relations must be of cardinal 2^ω.

preprint, 2013, arXiv

A Super-Eddington Wind Scenario for the Progenitors of Type Ia Supernovae

The accretion of hydrogen-rich material onto carbon-oxygen white dwarfs (CO WDs) is crucial for understanding type Ia supernovae (SNe Ia) in the single-degenerate model, but this process has not been well understood because of the numerical difficulties in treating H and He flashes during the accretion. For CO WD masses from 0.5 to $1.378\,M_\odot$ and accretion rates in the range from $10^{-8}$ to $10^{-5}\,M_\odot\,\mathrm{yr}^{-1}$, we simulated the accretion of solar-composition material onto CO WDs using the state-of-the-art stellar evolution code MESA. For comparison with the steady-state models (e.g., \citet{nskh07}), we first ignored the contribution from nuclear burning to the luminosity when determining the Eddington accretion rate and found that the properties of H burning in our accreting CO WD models are similar to those from the steady-state models, except that the critical accretion rates at which the WDs turn into red giants or H-shell flashes occur on their surfaces are slightly higher than those from the steady-state models. However, when the contribution of nuclear burning to the total luminosity is included, the super-Eddington wind is triggered at much lower accretion rates than previously thought. This super-Eddington wind naturally prevents the CO WDs with high accretion rates from becoming red giants, thus presenting an alternative to the optically thick wind proposed by \cite{hkn96}. Furthermore, the super-Eddington wind works in low-metallicity environments, which may explain SNe Ia observed at high redshifts.
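For orientation, the Eddington limit that these abstracts refer to is the luminosity at which outward radiation pressure on the accreted gas balances gravity. A standard textbook form is given below, assuming spherical symmetry; the symbols $M_{\mathrm{WD}}$ (white dwarf mass), $\kappa$ (opacity) and $X$ (hydrogen mass fraction) are notation introduced here, not taken from the abstracts, and the electron-scattering opacity is shown only as a common approximation.

```latex
L_{\mathrm{Edd}} = \frac{4\pi G M_{\mathrm{WD}}\, c}{\kappa},
\qquad
\kappa_{\mathrm{es}} \simeq 0.2\,(1 + X)\ \mathrm{cm^{2}\,g^{-1}}
```

In the scenario described above, a wind is expected once the total luminosity exceeds $L_{\mathrm{Edd}}$. The abstracts do not spell out how the papers convert this limit into an Eddington accretion rate, so that step is left out here.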
