Source author record

Yin Zhang

Yin Zhang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Computation and Language, astro-ph.SR, Artificial Intelligence, Computer Vision, Machine Learning, Information Retrieval, Information Theory, math.IT, math.NA, cond-mat.mes-hall, cond-mat.soft, math.OC, Multiagent Systems, Multimedia, Numerical Analysis, physics.chem-ph, Robotics, Social and Information Networks, Software Engineering

Catalog footprint

What is connected

31 works

19 topics

4 close collaborators


preprint, 2026, arXiv

A microscopic origin for the breakdown of the Stokes Einstein relation in ion transport

Ion transport underlies the operation of biological ion channels and governs the performance of electrochemical energy-storage devices. A long-standing anomaly is that smaller alkali metal ions, such as Li$^+$, migrate more slowly in water than larger ions, in apparent violation of the Stokes-Einstein relation. This breakdown is conventionally attributed to dielectric friction, a collective drag force arising from electrostatic interactions between a drifting ion and its surrounding solvent. Here, combining nanopore transport measurements over electric fields spanning several orders of magnitude with molecular dynamics simulations, we show that the time-averaged electrostatic force on a migrating ion is not a drag force but a net driving force. By contrasting charged ions with neutral particles, we reveal that ionic charge introduces additional Lorentzian peaks in the frequency-dependent friction coefficient. These peaks originate predominantly from short-range Lennard-Jones (LJ) interactions within the first hydration layer and represent additional channels for energy dissipation, strongest for Li$^+$ and progressively weaker for Na$^+$ and K$^+$. Our results demonstrate that electrostatic interactions primarily act to tighten the local hydration structure, thereby amplifying short-range LJ interactions rather than directly opposing ion motion. This microscopic mechanism provides a unified physical explanation for the breakdown of the Stokes-Einstein relation in aqueous ion transport.
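For reference, the relation the abstract refers to can be written out as follows. This is the textbook form for a sphere with stick boundary conditions, not a formula taken from the paper; the symbols ($D$ for the diffusion coefficient, $\eta$ for the solvent viscosity, $r$ for the hydrodynamic radius, $q$ for the ionic charge, $\mu$ for the mobility) are notation assumed here.

$$
D = \frac{k_B T}{6 \pi \eta r}, \qquad \mu = \frac{q D}{k_B T} = \frac{q}{6 \pi \eta r}
$$

Read with the bare ionic radius, this predicts that a smaller ion such as Li$^+$ should diffuse and migrate faster than Na$^+$ or K$^+$, which is the opposite of the ordering described in the abstract.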

preprint, 2026, arXiv

Adaptive TD-Lambda for Cooperative Multi-agent Reinforcement Learning

TD($λ$) in value-based MARL algorithms, or temporal-difference critic learning in actor-critic-based (AC-based) algorithms, integrates elements of Monte Carlo simulation and Q-function bootstrapping via dynamic programming, which effectively addresses the inherent bias-variance trade-off in value estimation. Building on this, some recent works link an adaptive $λ$ value to the policy distribution in single-agent reinforcement learning. However, because of the large joint action space arising from the number of agents and the limited transition data in multi-agent reinforcement learning, the policy distribution is infeasible to calculate statistically. To solve this problem in MARL settings, we employ a parametric likelihood-free density ratio estimator with two replay buffers instead of calculating the distribution statistically. The two replay buffers, of different sizes, store the historical trajectories that represent the data distributions of the past and current policies, respectively. Based on the estimator, we assign Adaptive TD($λ$), ATD($λ$), values to state-action pairs according to their likelihood under the stationary distribution of the current policy. We apply the proposed method to two competitive baseline methods, QMIX for value-based algorithms and MAPPO for AC-based algorithms, on SMAC benchmarks and Gfootball academy scenarios, and demonstrate consistently competitive or superior performance compared to other baseline approaches with static $λ$ values.
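The bias-variance trade-off mentioned in this abstract can be made concrete with the standard $λ$-return recursion. The sketch below is the textbook fixed-$λ$ version for a single finite trajectory, not the adaptive ATD($λ$) scheme of the paper; the function name `lambda_returns` and its argument layout are assumptions made for illustration.

```python
def lambda_returns(rewards, next_values, gamma, lam):
    """Compute fixed-lambda returns for one finite trajectory.

    rewards[t]     : reward received at step t
    next_values[t] : bootstrap estimate V(s_{t+1}); use 0.0 after a terminal step
    gamma          : discount factor
    lam            : trace parameter (0 gives one-step TD targets,
                     1 gives Monte Carlo returns when the episode terminates)
    """
    returns = [0.0] * len(rewards)
    # Beyond the last stored step, the only available estimate is the bootstrap value.
    g = next_values[-1]
    for t in reversed(range(len(rewards))):
        # Blend the one-step bootstrap with the longer return accumulated so far.
        g = rewards[t] + gamma * ((1.0 - lam) * next_values[t] + lam * g)
        returns[t] = g
    return returns


if __name__ == "__main__":
    rewards = [1.0, 1.0, 1.0]
    next_values = [0.5, 0.5, 0.0]  # last entry is 0.0 because the episode ends
    print(lambda_returns(rewards, next_values, gamma=1.0, lam=0.0))
    print(lambda_returns(rewards, next_values, gamma=1.0, lam=1.0))
```

With `lam=0.0` every target relies on the value estimate (low variance, more bias), while `lam=1.0` uses only observed rewards (no bootstrap bias, more variance). An adaptive scheme in the spirit of the abstract would choose `lam` per state-action pair instead of using one global value.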

preprint, 2026, arXiv

Fusing Urban Structure and Semantics: A Conditional Diffusion Model for Cross-City OD Matrix Generation

Accurate modeling of commuting flows is important for urban governance, traffic planning, and resource allocation. However, the combined influence of individual intentions, geographic constraints, and social dynamics leads to considerable heterogeneity in commuting patterns, making it difficult to develop generation models that generalize across cities. To address this issue, we propose SEDAN, a Structure-Enhanced Diffusion model conditioned on Attributed Nodes for generalizable OD matrix generation. SEDAN models a city as an attributed graph. Each region is treated as a node with demographic and point-of-interest features, and commuting flows are modeled as weighted edges. Adjacency and distance matrices are incorporated to characterize spatial structure. Based on this representation, we design a fusion mechanism within SEDAN to jointly model semantic information and spatial information. Regional semantic attributes are used to model latent travel demand through graph-transformer-based node interactions, while spatial structure is injected into the generation process as explicit constraints. The adjacency matrix guides attention weights to strengthen interactions between neighboring regions. Meanwhile, the distance matrix serves as a diffusion condition to capture spatial proximity and travel impedance. The fusion of urban semantics and spatial constraints enables SEDAN to generate OD matrices that are both behaviorally plausible and geographically coherent. Experiments on real-world OD datasets from U.S. cities show that SEDAN achieves a 7.38% improvement in RMSE over the state-of-the-art baseline, WEDAN. It also remains robust across heterogeneous urban scenarios and varying structural patterns. Our work provides an effective and generalizable solution for commuting OD matrix generation. The code is available at https://anonymous.4open.science/r/SEDAN.

preprint, 2026, arXiv

MIRL: Mutual Information-Guided Reinforcement Learning for Vision-Language Models

Vision-Language Models (VLMs) frequently suffer from visual perception errors and hallucinations that compromise answer accuracy in complex reasoning tasks. Reinforcement Learning with Verifiable Rewards (RLVR) offers a promising solution by optimizing policies using answer correctness signals. Despite their effectiveness, prevailing RLVR methods face two critical limitations. First, much of the sampling budget is wasted on trajectories doomed to fail due to early visual description errors. Second, sparse rewards cannot distinguish whether failures stem from visual perception or reasoning stages. We introduce MIRL, a decoupled framework that addresses both limitations by leveraging mutual information (MI) between generated descriptions and visual inputs as a cheap pre-screening signal. This enables intelligent budget allocation toward high-potential trajectories via forking, while decoupled training provides independent MI-based rewards for visual perception optimization, resolving reward blindness. Experiments on six vision-language reasoning benchmarks demonstrate that MIRL achieves 70.22% average accuracy and successfully surpasses the performance of sampling 16 complete trajectories using only 10 pre-samples with top-6 selection (25% fewer complete trajectories). Our code is available at: https://anonymous.4open.science/r/mirl-main/.

preprint, 2026, arXiv

Observability-Enhanced Target Motion Estimation via Bearing-Box: Theory and MAV Applications

Monocular vision-based target motion estimation is a fundamental challenge in numerous applications. This work introduces a novel bearing-box approach that fully leverages modern 3D detection measurements that are widely available nowadays but have not been well explored for motion estimation so far. Unlike existing methods that rely on restrictive assumptions such as isotropic target shape and lateral motion, our bearing-box estimator can estimate both the target's motion and its physical size without these assumptions by exploiting the information buried in a 3D bounding box. When applied to multi-rotor micro aerial vehicles (MAVs), the estimator yields an interesting advantage: it further removes the need for higher-order motion assumptions by exploiting the unique coupling between MAV's acceleration and thrust. This is particularly significant, as higher-order motion assumptions are widely believed to be necessary in state-of-the-art bearing-based estimators. We support our claims with rigorous observability analyses and extensive experimental validation, demonstrating the estimator's superior performance in real-world scenarios.

preprint, 2026, arXiv

ShortCoder: Knowledge-Augmented Syntax Optimization for Token-Efficient Code Generation

Code generation tasks aim to automate the conversion of user requirements into executable code, significantly reducing manual development efforts and enhancing software productivity. The emergence of large language models (LLMs) has significantly advanced code generation, though their efficiency is still impacted by certain inherent architectural constraints. Each token generation necessitates a complete inference pass, requiring persistent retention of contextual information in memory and escalating resource consumption. While existing research prioritizes inference-phase optimizations such as prompt compression and model quantization, the generation phase remains underexplored. To tackle these challenges, we propose a knowledge-infused framework named ShortCoder, which optimizes code generation efficiency while preserving semantic equivalence and readability. In particular, we introduce: (1) ten syntax-level simplification rules for Python, derived from AST-preserving transformations, achieving 18.1% token reduction without functional compromise; (2) a hybrid data synthesis pipeline integrating rule-based rewriting with LLM-guided refinement, producing ShorterCodeBench, a corpus of validated tuples of original code and simplified code with semantic consistency; (3) a fine-tuning strategy that injects conciseness awareness into the base LLMs. Extensive experimental results demonstrate that ShortCoder consistently outperforms state-of-the-art methods on HumanEval, achieving an improvement of 18.1%-37.8% in generation efficiency over previous methods while ensuring the performance of code generation.

preprint, 2026, arXiv

SpatialForge: Bootstrapping 3D-Aware Spatial Reasoning from Open-World 2D Images

Recent advancements in Large Vision-Language Models (VLMs) have demonstrated exceptional semantic understanding, yet these models consistently struggle with spatial reasoning, often failing at fundamental geometric tasks such as depth ordering and precise coordinate grounding. Recent efforts introduce spatial supervision from scene-centric datasets (e.g., multi-view scans or indoor video), but are constrained by the limited number of underlying scenes. As a result, the scale and diversity of such data remain significantly smaller than those of web-scale 2D image collections. To address this limitation, we propose SpatialForge, a scalable data synthesis pipeline that transforms in-the-wild 2D images into spatial reasoning supervision. Our approach decomposes spatial reasoning into perception and relation, and constructs structured supervision signals covering depth, layout, and viewpoint-dependent reasoning, with automatic verification to ensure data quality. Based on this pipeline, we build SpatialForge-10M, a large-scale dataset containing 10 million spatial QA pairs. Extensive experiments across multiple spatial reasoning benchmarks demonstrate that training on SpatialForge-10M significantly improves the spatial reasoning ability of standard VLMs, highlighting the effectiveness of scaling 2D data for 3D-aware spatial reasoning.

preprint, 2026, arXiv

The Efficiency Gap in Byte Modeling

Modern language models have historically relied on two dominant design choices: subword tokenization and autoregressive (AR) ordering. These design decisions bake in priors that dictate a model's learning. Recently, two alternative paradigms have challenged this: byte-level modeling, which bypasses static statistically-derived token vocabularies, and masked diffusion modeling (MDM), which conducts parallel, non-sequential generation. Their intersection represents a fully end-to-end modality-agnostic generative prototype; however, removing these structural priors incurs a significant computational cost. In this work, we investigate this cost through a compute-matched scaling study. Our results reveal that the performance penalty of byte modeling is not uniform; across scale, the scaling overhead of byte modeling is worse for MDM than for AR. We hypothesize that this disparity stems from context fragility: while AR's stable causal history allows models to naturally rediscover subword patterns, the MDM objective destroys the local contiguity required to efficiently resolve semantics from raw bytes. Our findings from controlled permutation experiments suggest that future modality-agnostic designs must incorporate alternative structural biases to maintain viable scaling trajectories in the byte regime.

preprint, 2025, arXiv

Training Report of TeleChat3-MoE

TeleChat3-MoE is the latest series of TeleChat large language models, featuring a Mixture-of-Experts (MoE) architecture with parameter counts ranging from 105 billion to over one trillion, trained end-to-end on an Ascend NPU cluster. This technical report mainly presents the underlying training infrastructure that enables reliable and efficient scaling to frontier model sizes. We detail systematic methodologies for operator-level and end-to-end numerical accuracy verification, ensuring consistency across hardware platforms and distributed parallelism strategies. Furthermore, we introduce a suite of performance optimizations, including interleaved pipeline scheduling, attention-aware data scheduling for long-sequence training, hierarchical and overlapped communication for expert parallelism, and DVM-based operator fusion. A systematic parallelization framework, leveraging analytical estimation and integer linear programming, is also proposed to optimize multi-dimensional parallelism configurations. Additionally, we present methodological approaches to cluster-level optimizations, addressing host- and device-bound bottlenecks during large-scale training tasks. These infrastructure advancements yield significant throughput improvements and near-linear scaling on clusters comprising thousands of devices, providing a robust foundation for large-scale language model development on hardware ecosystems.

preprint, 2022, arXiv

A Communication-Efficient and Privacy-Aware Distributed Algorithm for Sparse PCA

Sparse principal component analysis (PCA) improves interpretability of the classic PCA by introducing sparsity into the dimension-reduction process. Optimization models for sparse PCA, however, are generally non-convex, non-smooth and more difficult to solve, especially on large-scale datasets requiring distributed computation over a wide network. In this paper, we develop a distributed and centralized algorithm called DSSAL1 for sparse PCA that aims to achieve low communication overheads by adapting a newly proposed subspace-splitting strategy to accelerate convergence. Theoretically, convergence to stationary points is established for DSSAL1. Extensive numerical results show that DSSAL1 requires far fewer rounds of communication than state-of-the-art peer methods. In addition, we make the case that since messages exchanged in DSSAL1 are well-masked, the possibility of private-data leakage in DSSAL1 is much lower than in some other distributed algorithms.

preprint, 2022, arXiv

DictBERT: Dictionary Description Knowledge Enhanced Language Model Pre-training via Contrastive Learning

Although pre-trained language models (PLMs) have achieved state-of-the-art performance on various natural language processing (NLP) tasks, they have been shown to lack knowledge when dealing with knowledge-driven tasks. Despite the many efforts made to inject knowledge into PLMs, this problem remains open. To address the challenge, we propose DictBERT, a novel approach that enhances PLMs with dictionary knowledge, which is easier to acquire than a knowledge graph (KG). During pre-training, we present two novel pre-training tasks to inject dictionary knowledge into PLMs via contrastive learning: dictionary entry prediction and entry description discrimination. In fine-tuning, we use the pre-trained DictBERT as a plugin knowledge base (KB) to retrieve implicit knowledge for identified entries in an input sequence, and infuse the retrieved knowledge into the input to enhance its representation via a novel extra-hop attention mechanism. We evaluate our approach on a variety of knowledge-driven and language understanding tasks, including NER, relation extraction, CommonsenseQA, OpenBookQA and GLUE. Experimental results demonstrate that our model can significantly improve typical PLMs: it gains a substantial improvement of 0.5%, 2.9%, 9.0%, 7.1% and 3.3% on BERT-large respectively, and is also effective on RoBERTa-large.

preprint, 2022, arXiv

Diverse Instance Discovery: Vision-Transformer for Instance-Aware Multi-Label Image Recognition

Previous works on multi-label image recognition (MLIR) usually use CNNs as a starting point for research. In this paper, we take pure Vision Transformer (ViT) as the research base and make full use of the advantages of Transformer with long-range dependency modeling to circumvent the disadvantages of CNNs limited to local receptive field. However, for multi-label images containing multiple objects from different categories, scales, and spatial relations, it is not optimal to use global information alone. Our goal is to leverage ViT's patch tokens and self-attention mechanism to mine rich instances in multi-label images, named diverse instance discovery (DiD). To this end, we propose a semantic category-aware module and a spatial relationship-aware module, respectively, and then combine the two by a re-constraint strategy to obtain instance-aware attention maps. Finally, we propose a weakly supervised object localization-based approach to extract multi-scale local features, to form a multi-view pipeline. Our method requires only weakly supervised information at the label level, no additional knowledge injection or other strongly supervised information is required. Experiments on three benchmark datasets show that our method significantly outperforms previous works and achieves state-of-the-art results under fair experimental comparisons.

preprint, 2022, arXiv

LFGCF: Light Folksonomy Graph Collaborative Filtering for Tag-Aware Recommendation

Tag-aware recommendation is the task of predicting a personalized list of items for a user based on their tagging behavior. It is crucial for many applications with tagging capabilities, such as last.fm or movielens. Recently, many efforts have been devoted to improving tag-aware recommendation systems (TRS) with Graph Convolutional Networks (GCN), which have become the new state of the art for general recommendation. However, some solutions are directly inherited from GCN without justification, which makes it difficult to alleviate the sparsity, ambiguity, and redundancy issues introduced by tags, thus adding to the difficulty of training and degrading recommendation performance. In this work, we aim to simplify the design of GCN to make it more concise for TRS. We propose a novel tag-aware recommendation model named Light Folksonomy Graph Collaborative Filtering (LFGCF), which only includes the essential GCN components. Specifically, LFGCF first constructs Folksonomy Graphs from records of users assigning tags and items being tagged. Then we leverage a simple aggregation design to learn high-order representations on the Folksonomy Graphs and use the weighted sum of the embeddings learned at several layers for information updating. We share tag embeddings to bridge the information gap between users and items. In addition, a regularization function named TransRT is proposed to better depict user preferences and item features. Extensive hyperparameter experiments and ablation studies on three real-world datasets show that LFGCF uses fewer parameters and significantly outperforms most baselines for tag-aware top-N recommendation.

preprint, 2022, arXiv

Rethinking the Value of Gazetteer in Chinese Named Entity Recognition

Gazetteers are widely used in Chinese named entity recognition (NER) to enhance span boundary detection and type classification. However, the NLP community still lacks a systematic analysis of gazetteer-enhanced NER models that would clarify the generalizability and effectiveness of gazetteers. In this paper, we first re-examine the effectiveness of several common practices in gazetteer-enhanced NER models and carry out a series of detailed analyses to evaluate the relationship between model performance and gazetteer characteristics, which can guide us in building a more suitable gazetteer. The findings of this paper are as follows: (1) the gazetteer helps in most of the situations where the datasets are difficult for the traditional NER model to learn; (2) model performance benefits greatly from high-quality pre-trained lexeme embeddings; (3) a good gazetteer should cover more entities that can be matched in both the training set and the test set.

preprint, 2021, arXiv

A Model of Two Tales: Dual Transfer Learning Framework for Improved Long-tail Item Recommendation

Highly skewed long-tail item distribution is very common in recommendation systems. It significantly hurts model performance on tail items. To improve tail-item recommendation, we conduct research to transfer knowledge from head items to tail items, leveraging the rich user feedback in head items and the semantic connections between head and tail items. Specifically, we propose a novel dual transfer learning framework that jointly learns the knowledge transfer from both model-level and item-level: 1. The model-level knowledge transfer builds a generic meta-mapping of model parameters from few-shot to many-shot model. It captures the implicit data augmentation on the model-level to improve the representation learning of tail items. 2. The item-level transfer connects head and tail items through item-level features, to ensure a smooth transfer of meta-mapping from head items to tail items. The two types of transfers are incorporated to ensure the learned knowledge from head items can be well applied for tail item representation learning in the long-tail distribution settings. Through extensive experiments on two benchmark datasets, results show that our proposed dual transfer learning framework significantly outperforms other state-of-the-art methods for tail item recommendation in hit ratio and NDCG. It is also very encouraging that our framework further improves head items and overall performance on top of the gains on tail items.

preprint, 2020, arXiv

Chemical-protein Interaction Extraction via Gaussian Probability Distribution and External Biomedical Knowledge

Motivation: The biomedical literature contains a wealth of chemical-protein interactions (CPIs). Automatically extracting CPIs described in biomedical literature is essential for drug discovery, precision medicine, as well as basic biomedical research. Most existing methods focus only on the sentence sequence to identify these CPIs. However, the local structure of sentences and external biomedical knowledge also contain valuable information. Effective use of such information may improve the performance of CPI extraction.

Results: In this paper, we propose a novel neural network-based approach to improve CPI extraction. Specifically, the approach first employs BERT to generate high-quality contextual representations of the title sequence, instance sequence, and knowledge sequence. Then, the Gaussian probability distribution is introduced to capture the local structure of the instance. Meanwhile, the attention mechanism is applied to fuse the title information and biomedical knowledge, respectively. Finally, the related representations are concatenated and fed into the softmax function to extract CPIs. We evaluate our proposed model on the CHEMPROT corpus. Our proposed model is superior in performance as compared with other state-of-the-art models. The experimental results show that the Gaussian probability distribution and external knowledge are complementary to each other. Integrating them can effectively improve the CPI extraction performance. Furthermore, the Gaussian probability distribution can effectively improve the extraction performance of sentences with overlapping relations in biomedical relation extraction tasks.

Availability: Data and code are available at https://github.com/CongSun-dlut/CPI_extraction.

Contact: yangzh@dlut.edu.cn, wangleibihami@gmail.com

Supplementary information: Supplementary data are available at Bioinformatics online.

preprint, 2020, arXiv

Lifelong Learning with Searchable Extension Units

Lifelong learning remains an open problem. One of its main difficulties is catastrophic forgetting. Many dynamic expansion approaches have been proposed to address this problem, but they all use homogeneous models of predefined structure for all tasks. The common original model and expansion structures ignore the requirement of different model structures on different tasks, which leads to a less compact model for multiple tasks and causes the model size to increase rapidly as the number of tasks increases. Moreover, they cannot perform best on all tasks. To solve these problems, we propose a new lifelong learning framework named Searchable Extension Units (SEU) by introducing Neural Architecture Search into lifelong learning, which removes the need for a predefined original model and searches for specific extension units for different tasks, without compromising the performance of the model on different tasks. Our approach can obtain a much more compact model without catastrophic forgetting. The experimental results on PMNIST, split CIFAR10, split CIFAR100, and the Mixture dataset show empirically that our method can achieve higher accuracy with a much smaller model, whose size is about 25-33 percent of that of the state-of-the-art methods.

preprint, 2020, arXiv

Magnetic Gradient: A Natural Driver of Solar Eruptions

It is well known that wherever there is a gradient, it will inevitably drive a flow. For example, a density gradient may drive a diffusion flow, an electric potential gradient may drive an electric current in plasmas, etc. Then, what will be driven when a magnetic gradient occurs in solar atmospheric plasmas? Considering the ubiquitous distribution of magnetic gradients in solar plasma loops, this work demonstrates that the magnetic-gradient pumping (MGP) mechanism is valid even in the partially ionized solar photosphere and chromosphere, as well as in the corona. It drives energetic particle flows that carry kinetic energy upward from the underlying atmosphere, accumulate around the looptop, increase the temperature and pressure there, and finally lead to eruptions around the looptop by triggering ballooning instabilities. This mechanism may explain the formation of the hot cusp structures observed above flaring loops in most preflare phases; therefore, the magnetic gradient should be a natural driver of solar eruptions. Furthermore, the mechanism may also be applied to understand many other astrophysical phenomena, such as the temperature distribution above sunspots, the formation of solar plasma jets, type-II spicules, and the fast solar wind above coronal holes, as well as the fast plasma jets related to white dwarfs, neutron stars and black holes.

preprint, 2020, arXiv

MTSS: Learn from Multiple Domain Teachers and Become a Multi-domain Dialogue Expert

Building a high-quality multi-domain dialogue system is challenging because its dialogue state space is complicated and entangled across domains, which seriously limits the quality of the dialogue policy and, in turn, affects the generated response. In this paper, we propose a novel method to acquire a satisfying policy and subtly circumvent the knotty dialogue state representation problem in the multi-domain setting. Inspired by real school teaching scenarios, our method is composed of multiple domain-specific teachers and a universal student. Each individual teacher only focuses on one specific domain and learns its corresponding domain knowledge and dialogue policy based on a precisely extracted single-domain dialogue state representation. Then, these domain-specific teachers impart their domain knowledge and policies to a universal student model and collectively make this student model a multi-domain dialogue expert. Experimental results show that our method achieves results competitive with SOTA methods in both multi-domain and single-domain settings.

preprint, 2020, arXiv

Teacher-Student Framework Enhanced Multi-domain Dialogue Generation

Dialogue systems that handle multi-domain tasks are in high demand. How to record the state remains a key problem in a task-oriented dialogue system. Normally, human-defined features are used as dialogue states and a state tracker is applied to extract these features. However, the performance of such a system is limited by the error propagation of the state tracker. In this paper, we propose a dialogue generation model that needs no external state trackers and still benefits from human-labeled semantic data. Using a teacher-student framework, several teacher models are first trained in their individual domains, learning dialogue policies from labeled states. The learned knowledge and experience are then merged and transferred to a universal student model, which takes raw utterances as its input. Experiments show that the dialogue system trained under our framework outperforms one that uses a belief tracker.

preprint, 2019, arXiv

Consistency-Aware Recommendation for User-Generated ItemList Continuation

User-generated item lists are popular on many platforms. Examples include video-based playlists on YouTube, image-based lists (or "boards") on Pinterest, book-based lists on Goodreads, and answer-based lists on question-answer forums like Zhihu. As users create these lists, a common challenge is in identifying what items to curate next. Some lists are organized around particular genres or topics, while others are seemingly incoherent, reflecting individual preferences for what items belong together. Furthermore, this heterogeneity in item consistency may vary from platform to platform, and from sub-community to sub-community. Hence, this paper proposes a generalizable approach for user-generated item list continuation. Complementary to methods that exploit specific content patterns (e.g., as in song-based playlists that rely on audio features), the proposed approach models the consistency of item lists based on human curation patterns, and so can be deployed across a wide range of varying item types (e.g., videos, images, books). A key contribution is in intelligently combining two preference models via a novel consistency-aware gating network - a general user preference model that captures a user's overall interests, and a current preference priority model that captures a user's current (as of the most recent item) interests. In this way, the proposed consistency-aware recommender can dynamically adapt as user preferences evolve. Evaluation over four datasets (of songs, books, and answers) confirms these observations and demonstrates the effectiveness of the proposed model versus state-of-the-art alternatives. Further, all code and data are available at https://github.com/heyunh2015/ListContinuation_WSDM2020.

preprint2016arXiv

Very Long-period Pulsations before the Onset of Solar Flares

Solar flares are the most powerful explosions in the solar system; they may lead to disastrous space weather events and affect various aspects of our Earth. Understanding the origin of solar flares and predicting their onset remains a major challenge in modern astrophysics. Based on an analysis of soft X-ray emission observed by the Geostationary Operational Environmental Satellite (GOES), this work reports the discovery of very long-period pulsations occurring in the preflare phase before the onset of solar flares (preflare-VLPs). These pulsations typically have periods of 8-30 min and last for about 1-2 hours. They are possibly generated by LRC oscillations of plasma loops in which electric current dominates the physical process during magnetic energy accumulation in the source region. The preflare-VLPs provide essential information for understanding the triggering mechanism and origin of solar flares, and may serve as a convenient precursory indicator that helps us respond to solar explosions and the corresponding disastrous space weather events.

preprint2015arXiv

Block algorithms with augmented Rayleigh-Ritz projections for large-scale eigenpair computation

Most iterative algorithms for eigenpair computation consist of two main steps: a subspace update (SU) step that generates bases for approximate eigenspaces, followed by a Rayleigh-Ritz (RR) projection step that extracts approximate eigenpairs. So far the predominant methodology for the SU step is based on Krylov subspaces, which build orthonormal bases piece by piece in a sequential manner. In this work, we investigate block methods in the SU step that allow a higher level of concurrency than is reachable by Krylov subspace methods. To achieve a competitive speed, we propose an augmented Rayleigh-Ritz (ARR) procedure and analyze its rate of convergence under realistic conditions. Combining this ARR procedure with a set of polynomial accelerators, as well as utilizing a few other techniques such as continuation and deflation, we construct a block algorithm designed to reduce the number of RR steps and elevate concurrency in the SU steps. Extensive computational experiments are conducted in Matlab on a representative set of test problems to evaluate the performance of two variants of our algorithm in comparison to two well-established, high-quality eigensolvers, ARPACK and FEAST. Numerical results, obtained on a many-core computer without explicit code parallelization, show that when computing a relatively large number of eigenpairs, the performance of our algorithms is competitive with, and frequently superior to, that of the two state-of-the-art eigensolvers.
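
The two-step structure described in this abstract (a block subspace update followed by a Rayleigh-Ritz projection) can be sketched with plain simultaneous iteration in NumPy. This is a textbook baseline, not the paper's ARR procedure: it uses no augmentation, polynomial acceleration, continuation or deflation, and the function names, the `extra` guard-vector count and the iteration budget are assumptions chosen for illustration.

```python
import numpy as np


def rayleigh_ritz(A, X):
    """RR step: extract approximate eigenpairs of symmetric A from span(X)."""
    Q, _ = np.linalg.qr(X)            # orthonormal basis of the subspace
    H = Q.T @ A @ Q                   # small projected matrix
    theta, S = np.linalg.eigh(H)      # Ritz values (ascending) and coefficients
    return theta, Q @ S               # Ritz values and Ritz vectors


def block_power_eigs(A, k, extra=4, iters=200, seed=0):
    """Approximate the k largest-magnitude eigenpairs of a symmetric matrix.

    SU step: multiply the whole block by A and re-orthonormalize, so all
    columns are updated at once rather than built one vector at a time.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    X = rng.standard_normal((n, k + extra))   # k wanted vectors plus guards
    for _ in range(iters):
        X, _ = np.linalg.qr(A @ X)            # block subspace update
    theta, V = rayleigh_ritz(A, X)            # single RR projection at the end
    order = np.argsort(-np.abs(theta))[:k]    # keep the k dominant Ritz pairs
    return theta[order], V[:, order]


# Usage on a small symmetric test matrix with known eigenvalues 1..20.
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((20, 20)))
A = Q @ np.diag(np.arange(1.0, 21.0)) @ Q.T
A = (A + A.T) / 2.0                           # remove rounding asymmetry
values, vectors = block_power_eigs(A, k=3)
```

The matrix-block product `A @ X` is the part that parallelizes well, which is the concurrency argument made for block methods; the sketch defers the RR projection to the end to show that RR steps need not follow every update.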

preprint2015arXiv

Dynamic magnetic susceptibility and electrical detection of ferromagnetic resonance

The dynamic magnetic susceptibility of magnetic materials near ferromagnetic resonance (FMR) is very important in interpreting the dc voltage in electrical detection of FMR. Based on the causality principle and the assumption that the usual microwave absorption lineshape around FMR is Lorentzian, general forms of the dynamic susceptibility of an arbitrary sample and the corresponding dc-voltage lineshape are obtained. Our main findings are: 1) The dynamic susceptibility is not a Polder tensor for a material with arbitrary anisotropy. The two off-diagonal elements are not, in general, opposite to each other. However, the linear response coefficient of the magnetization to the total rf field is a Polder tensor. This may explain why the two off-diagonal elements are always assumed to be opposite to each other in analyses. 2) The frequency dependence of the dynamic susceptibility near FMR is fully characterized by six numbers, while its field dependence is fully characterized by seven numbers. 3) A recipe for determining these numbers by standard microwave absorption measurements for an arbitrary sample is proposed. Our results allow one to unambiguously separate the contribution of the anisotropic magnetoresistance to the dc voltage from that of the anomalous Hall effect. With these results, one can reliably extract information on spin pumping and the inverse spin Hall effect, and determine the spin-Hall angle. 4) The field dependence of the susceptibility matrix at a fixed frequency may have several peaks when the effective field is not a monotonic function of the applied field. In contrast, the frequency dependence of the susceptibility matrix at a fixed field has only one peak. Furthermore, when the resonance frequency is not sensitive to the applied field, the field dependence of the susceptibility matrix, as well as of the dc voltage, may have another broad non-resonance peak. Thus, one should be careful in interpreting observed peaks.

preprint2014arXiv

A Very Small and Super Strong Zebra Pattern Burst at the Beginning of a Solar Flare

Microwave emission with spectral zebra pattern structures (ZPs) is observed frequently in solar flares and the Crab pulsar. Previous observations show that a ZP is only a structure superposed on the underlying broadband continuum with slight increments and decrements. This work reports an extremely unusual strong ZP burst occurring right at the beginning of a solar flare, observed simultaneously by two radio telescopes located in China and the Czech Republic and by the extreme ultraviolet (EUV) telescope on board NASA's Solar Dynamics Observatory satellite on 2013 April 11. It is a very short and super strong explosion whose intensity exceeds several times that of the underlying flaring broadband continuum emission, lasting for just 18 s. EUV images show that the flare starts from several small flare bursting points (FBPs). There is a sudden EUV flash with extra enhancement in one of these FBPs during the ZP burst. Analysis indicates that the ZP burst accompanying the EUV flash is an unusual explosion revealing a strong coherent process with rapid particle acceleration, violent energy release, and fast plasma heating occurring simultaneously in a small region and over a short duration right at the beginning of the flare.

preprint2014arXiv

Solar Radio Bursts with Spectral Fine Structures in Preflares

Good observations of preflare activity are important for understanding the origin and triggering mechanism of solar flares, and for predicting their occurrence. This work presents the characteristics of microwave spectral fine structures as preflare activities of four solar flares observed by the Ondřejov radio spectrograph in the frequency range of 0.8-2.0 GHz. We found that these microwave bursts, which occurred 1-4 minutes before the onset of the flares, have spectral fine structures with relatively weak intensities and very short timescales. They include microwave quasi-periodic pulsations (QPP) with very short periods of 0.1-0.3 s and dot bursts with millisecond timescales and narrow frequency bandwidths. Accompanying these microwave bursts, there are filament motions, plasma ejection or loop brightening in the EUV imaging observations and non-thermal hard X-ray emission enhancements observed by RHESSI. These facts may reveal certain independent non-thermal energy-releasing processes and particle acceleration before the onset of solar flares. They may be conducive to understanding the nature of solar flares and predicting their occurrence.

preprint2013arXiv

A new compressive video sensing framework for mobile broadcast

A new video coding method based on compressive sampling is proposed. In this method, a video is coded using compressive measurements on video cubes. Video reconstruction is performed by minimization of total variation (TV) of the pixelwise DCT coefficients along the temporal direction. A new reconstruction algorithm is developed from TVAL3, an efficient TV minimization algorithm based on the alternating minimization and augmented Lagrangian methods. Video coding with this method is inherently scalable, and has applications in mobile broadcast.

preprint2013arXiv

Statistics and Classification of the Microwave Zebra Patterns Associated with Solar Flares

The microwave zebra pattern (ZP) is the most interesting, intriguing, and complex spectral structure frequently observed in solar flares. A comprehensive statistical study will certainly help us to understand the formation mechanism, which is not yet clear. This work presents a comprehensive statistical analysis of a large sample of 202 ZP events collected from observations at the Chinese Solar Broadband Radio Spectrometer at Huairou and the Ondřejov Radiospectrograph in the Czech Republic at frequencies of 1.00-7.60 GHz during 2000-2013. After investigating the parameter properties of ZPs, such as the occurrence in flare phase, frequency range, polarization degree, duration, etc., we find that the variation of zebra stripe frequency separation with respect to frequency is the best indicator for a physical classification of ZPs. Microwave ZPs can be classified into 3 types: equidistant ZP, variable-distant ZP, and growing-distant ZP, possibly corresponding to the Bernstein wave model, the whistler wave model, and the double plasma resonance model, respectively. This statistical classification may help us to clarify the controversies among the various existing theoretical models, and to understand the physical processes in the source regions.

preprint2012arXiv

The Morphologic Properties of Magnetic networks over the Solar Cycle 23

The morphologic properties of the magnetic networks during Carrington Rotations (CR) 1955 to 2091 (from 1999 to 2010) have been analyzed by applying the watershed algorithm to magnetograms observed by the Michelson Doppler Imager (MDI) on board the Solar and Heliospheric Observatory (SOHO) spacecraft. We find that the average area of magnetic cells on the solar surface at lower latitudes (within $\pm 50$ degrees) is smaller than that at higher latitudes (beyond $\pm 50$ degrees). Statistical analysis of these data indicates that the magnetic networks are fractal in nature, with an average fractal dimension of $D_f = 1.253 \pm 0.011$. We also find that both the fractal dimension and the size of the magnetic networks are anti-correlated with the sunspot area. This is perhaps because a strong magnetic field can suppress spatially modulated oscillations and compress the boundaries of network cells, leading to smoother cell boundaries. The fractal dimension of the cells deviates from that predicted from an isobar of Kolmogorov homogeneous turbulence.

preprint2011arXiv

An Alternating Direction Algorithm for Matrix Completion with Nonnegative Factors

This paper introduces an algorithm for the nonnegative matrix factorization-and-completion problem, which aims to find nonnegative low-rank matrices X and Y so that the product XY approximates a nonnegative data matrix M whose elements are partially known (to a certain accuracy). This problem aggregates two existing problems: (i) nonnegative matrix factorization where all entries of M are given, and (ii) low-rank matrix completion where nonnegativity is not required. By taking advantage of both nonnegativity and low rank, one can generally obtain better results than by using only one of the two properties. We propose to solve the non-convex constrained least-squares problem using an algorithm based on the classic alternating direction augmented Lagrangian method. Preliminary convergence properties of the algorithm and numerical simulation results are presented. Compared to a recent algorithm for nonnegative matrix factorization, the proposed algorithm produces factorizations of similar quality using only about half of the matrix entries. On tasks of recovering incomplete grayscale and hyperspectral images, the proposed algorithm yields better overall quality than two recent matrix-completion algorithms that do not exploit nonnegativity.
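
The problem statement in this abstract (find nonnegative X and Y so that XY matches M on its known entries) can be made concrete with a short NumPy sketch. Note the swap: the code below uses masked multiplicative updates, a different and simpler technique than the alternating direction augmented Lagrangian method the paper proposes. The function names, the iteration count and the small `eps` safeguard are assumptions for illustration.

```python
import numpy as np


def masked_nmf(M, mask, rank, iters=500, seed=0, eps=1e-9):
    """Fit nonnegative X (m x rank) and Y (rank x n) to the known entries of M.

    mask is True where M is observed. Multiplicative updates keep the
    factors nonnegative because every term in the ratios is nonnegative.
    This is NOT the paper's alternating direction method.
    """
    rng = np.random.default_rng(seed)
    m, n = M.shape
    X = rng.random((m, rank)) + 0.1        # strictly positive start
    Y = rng.random((rank, n)) + 0.1
    W = mask.astype(float)                 # 1 for known entries, 0 otherwise
    WM = W * M                             # unknown entries contribute nothing
    for _ in range(iters):
        X *= (WM @ Y.T) / ((W * (X @ Y)) @ Y.T + eps)
        Y *= (X.T @ WM) / (X.T @ (W * (X @ Y)) + eps)
    return X, Y


def masked_error(M, mask, X, Y):
    """Frobenius norm of the residual restricted to the known entries."""
    return float(np.linalg.norm(mask * (M - X @ Y)))


# Usage: a rank-2 nonnegative matrix with roughly 60% of entries observed.
rng = np.random.default_rng(42)
M = rng.random((8, 2)) @ rng.random((2, 10))
mask = rng.random(M.shape) < 0.6
X, Y = masked_nmf(M, mask, rank=2)
completed = X @ Y                          # estimate for every entry of M
```

The product `X @ Y` fills in the unobserved entries, which is the "completion" half of the problem; the nonnegativity of both factors is the "factorization" half.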

preprint2010arXiv

Microwave Quasi-periodic Pulsations in Multi-timescales Associated with a Solar Flare/CME Event

Microwave quasi-periodic pulsations (QPPs) on multiple timescales are confirmed to be associated with an X3.4 flare/CME event observed by the Solar Broadband Radio Spectrometer in Huairou (SBRS/Huairou) on 13 December 2006. Most remarkably, the timescales of the QPPs are distributed over a broad range, from hecto-second (very long period pulsation, VLP, period P>100 s), deca-second (long period pulsation, LPP, 10<P<100 s), a few seconds (short period pulsation, SPP, 1<P<10 s), and deci-second (slow very short period pulsation, slow-VSP, 0.1<P<1.0 s) to centi-second (fast very short period pulsation, fast-VSP, P<0.1 s), forming a broad hierarchy of timescales. The statistical distribution in logarithmic period-duration space indicates that QPPs can be classified into two groups: group I includes VLPs, LPPs, SPPs and part of the slow-VSPs, distributed approximately along a line; group II includes fast-VSPs and most of the slow-VSPs, scattered away from that line. This feature implies that the generation mechanism of group I is different from that of group II. Group I is possibly related to MHD oscillations in magnetized plasma loops in the active region; for example, VLPs may be generated by a standing slow sausage mode coupling and resonating with the underlying photospheric 5-min oscillation, with the modulation amplified to form the main framework of the whole flare/CME process, while LPPs, SPPs, and part of the slow-VSPs are most likely caused by standing fast modes or LRC-circuit resonance in current-carrying plasma loops. Group II is possibly generated by modulations of resistive tearing-mode oscillations in electric current-carrying flaring loops.
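
The timescale hierarchy listed in this abstract can be written as a small lookup. The class labels and period ranges come from the abstract; the function name is hypothetical, and because the abstract gives only strict inequalities, assigning a period that falls exactly on a boundary (100 s, 10 s, 1 s, 0.1 s) to the shorter-period class is an assumption made here.

```python
def classify_qpp(period_s: float) -> str:
    """Map a pulsation period in seconds to the QPP class named in the abstract.

    Boundary values are assigned to the shorter-period class; the abstract
    does not say which class they belong to, so this choice is an assumption.
    """
    if period_s <= 0:
        raise ValueError("period must be positive")
    if period_s > 100:
        return "VLP"        # very long period pulsation, hecto-second
    if period_s > 10:
        return "LPP"        # long period pulsation, deca-second
    if period_s > 1:
        return "SPP"        # short period pulsation, a few seconds
    if period_s > 0.1:
        return "slow-VSP"   # slow very short period pulsation, deci-second
    return "fast-VSP"       # fast very short period pulsation, centi-second


# Usage: label a handful of periods given in seconds.
labels = [classify_qpp(p) for p in (300.0, 40.0, 5.0, 0.5, 0.05)]
```

The period alone fixes only the timescale class; the group I / group II split described in the abstract also depends on the pulsation duration and is not captured by this lookup.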

Institution

Affiliation not imported yet

This author record came from a source that does not expose affiliation metadata. Once the author claims the profile or we enrich the record from another provider, this section will link to the concrete institution.

Topic footprint

Fields this researcher appears in

Source provenance

Where this author record came from

arXiv, confidence 95%

external id: arxiv:2512.24157:author:47:yin-zhang

Imported May 21, 2026. Synced May 21, 2026.

arXiv, confidence 95%

external id: arxiv:2605.12928:author:5:yin-zhang

Imported May 20, 2026. Synced May 21, 2026.

arXiv, confidence 95%

external id: arxiv:2605.11880:author:3:yin-zhang

Imported May 20, 2026. Synced May 21, 2026.

arXiv, confidence 95%

external id: arxiv:2605.01520:author:1:yin-zhang

Imported May 20, 2026. Synced May 20, 2026.

arXiv, confidence 95%

external id: arxiv:2605.00938:author:6:yin-zhang

Imported May 20, 2026. Synced May 20, 2026.

arXiv, confidence 95%

external id: arxiv:2605.11462:author:5:yin-zhang

Imported May 20, 2026. Synced May 20, 2026.

7 works

Baolin Tan

Researcher

5 works

Chengming Tan

Researcher

3 works

Jing Huang

Researcher

2 works

Feng Ji

Researcher

31 published items

A microscopic origin for the breakdown of the Stokes Einstein relation in ion transport

Adaptive TD-Lambda for Cooperative Multi-agent Reinforcement Learning

Fusing Urban Structure and Semantics: A Conditional Diffusion Model for Cross-City OD Matrix Generation

MIRL: Mutual Information-Guided Reinforcement Learning for Vision-Language Models

Observability-Enhanced Target Motion Estimation via Bearing-Box: Theory and MAV Applications

ShortCoder: Knowledge-Augmented Syntax Optimization for Token-Efficient Code Generation

SpatialForge: Bootstrapping 3D-Aware Spatial Reasoning from Open-World 2D Images

The Efficiency Gap in Byte Modeling

Training Report of TeleChat3-MoE

A Communication-Efficient and Privacy-Aware Distributed Algorithm for Sparse PCA

DictBERT: Dictionary Description Knowledge Enhanced Language Model Pre-training via Contrastive Learning

Diverse Instance Discovery: Vision-Transformer for Instance-Aware Multi-Label Image Recognition

LFGCF: Light Folksonomy Graph Collaborative Filtering for Tag-Aware Recommendation

Rethinking the Value of Gazetteer in Chinese Named Entity Recognition

A Model of Two Tales: Dual Transfer Learning Framework for Improved Long-tail Item Recommendation

Chemical-protein Interaction Extraction via Gaussian Probability Distribution and External Biomedical Knowledge

Lifelong Learning with Searchable Extension Units

Magnetic Gradient: A Natural Driver of Solar Eruptions

MTSS: Learn from Multiple Domain Teachers and Become a Multi-domain Dialogue Expert

Teacher-Student Framework Enhanced Multi-domain Dialogue Generation

Consistency-Aware Recommendation for User-Generated ItemList Continuation

Very Long-period Pulsations before the Onset of Solar Flares

Block algorithms with augmented Rayleigh-Ritz projections for large-scale eigenpair computation

Dynamic magnetic susceptibility and electrical detection of ferromagnetic resonance

A Very Small and Super Strong Zebra Pattern Burst at the Beginning of a Solar Flare

Solar Radio Bursts with Spectral Fine Structures in Preflares

A new compressive video sensing framework for mobile broadcast

Statistics and Classification of the Microwave Zebra Patterns Associated with Solar Flares

The Morphologic Properties of Magnetic networks over the Solar Cycle 23

An Alternating Direction Algorithm for Matrix Completion with Nonnegative Factors

Microwave Quasi-periodic Pulsations in Multi-timescales Associated with a Solar Flare/CME Event