Source author record

Shan Yu

Shan Yu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher | Unclaimed source record

Neurons and Cognition, Computer Vision, Biological Physics, cond-mat.dis-nn, cond-mat.str-el, cond-mat.supr-con, Machine Learning, Artificial Intelligence, Computation and Language, cond-mat.mtrl-sci, physics.gen-ph, physics.pop-ph

Catalog footprint

What is connected

12 works

12 topics

4 close collaborators

preprint, 2026, arXiv

A neural network for modeling human concept formation, understanding and communication

A remarkable capability of the human brain is to form more abstract conceptual representations from sensorimotor experiences and flexibly apply them independent of direct sensory inputs. However, the computational mechanism underlying this ability remains poorly understood. Here, we present a dual-module neural network framework, the CATS Net, to bridge this gap. Our model consists of a concept-abstraction module that extracts low-dimensional conceptual representations, and a task-solving module that performs visual judgement tasks under the hierarchical gating control of the formed concepts. The system develops transferable semantic structure based on concept representations that enable cross-network knowledge transfer through conceptual communication. Model-brain fitting analyses reveal that these emergent concept spaces align with both a neurocognitive semantic model and brain response structures in the human ventral occipitotemporal cortex, while the gating mechanisms mirror those in the semantic control brain network. This work establishes a unified computational framework that can offer mechanistic insights for understanding human conceptual cognition and engineering artificial systems with human-like conceptual intelligence.

preprint, 2026, arXiv

AG-TAL: Anatomically-Guided Topology-Aware Loss for Multiclass Segmentation of the Circle of Willis Using Large-Scale Multi-Center Datasets

Accurate multiclass segmentation of the Circle of Willis (CoW) is essential for neurovascular disease management but remains challenging due to complex vascular topology and variable morphology. Existing deep learning methods often suffer from vascular discontinuities and inter-class misclassification, while current topological loss functions incur prohibitive computational costs in 3D multiclass settings. To address these limitations, we propose an Anatomically-Guided Topology-Aware Loss (AG-TAL) and introduce a large-scale, multi-center CoW dataset with unified annotations to facilitate robust model training. AG-TAL integrates a radius-aware Dice loss to address class imbalance in small vessels, a breakage-aware clDice loss that uses group convolutions to efficiently preserve local connectivity, and an adjacency-aware co-occurrence loss that leverages anatomical priors to enforce distinct boundaries between neighboring arteries. Evaluated using 5-fold cross-validation, AG-TAL achieved an average Dice score of 80.85% for all CoW arteries, with scores for small arteries 1.05-3.09% higher than those of state-of-the-art methods. Across six independent datasets, AG-TAL achieved Dice scores ranging from 74.46% to 81.17% for all CoW arteries, with improvements of 2.20% to 9.98% for small arteries compared to other methods. This study demonstrates the superiority of AG-TAL in identifying multiclass CoW arteries and its ability to generalize well to multiple independent datasets. Furthermore, reliability analyses and clinical applications in an Alzheimer's disease cohort validate AG-TAL's robustness and its potential for discovering imaging-based morphological biomarkers.
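The Dice scores quoted in this abstract are overlap ratios between predicted and reference label maps. As a minimal sketch of how a per-class Dice score can be computed, the example below uses plain Python on flattened label lists. It shows the standard Dice coefficient only, not the AG-TAL loss itself; the function names, the toy labels, and the convention that label 0 is background are assumptions made for illustration.

```python
from typing import Dict, Sequence


def dice_per_class(pred: Sequence[int], target: Sequence[int],
                   num_classes: int) -> Dict[int, float]:
    """Per-class Dice = 2 * |P ∩ T| / (|P| + |T|); label 0 is treated as background."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of voxels")
    scores: Dict[int, float] = {}
    for c in range(1, num_classes):
        n_pred = sum(1 for p in pred if p == c)
        n_true = sum(1 for t in target if t == c)
        if n_pred + n_true == 0:
            continue  # class absent from both masks: Dice is undefined, so skip it
        overlap = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        scores[c] = 2.0 * overlap / (n_pred + n_true)
    return scores


def mean_dice(scores: Dict[int, float]) -> float:
    """Unweighted average over the classes that were scored."""
    if not scores:
        raise ValueError("no foreground class was present")
    return sum(scores.values()) / len(scores)


# Toy flattened label maps (hypothetical): 0 = background, 1 and 2 = two artery classes.
pred = [0, 1, 1, 2, 2, 2]
target = [0, 1, 2, 2, 2, 0]
scores = dice_per_class(pred, target, num_classes=3)
average = mean_dice(scores)
```

Averaging per class rather than per voxel keeps small structures from being swamped by large ones, which is one reason per-class and small-vessel scores are often reported separately.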

preprint, 2023, arXiv

AI of Brain and Cognitive Sciences: From the Perspective of First Principles

We have witnessed the great success of AI in various applications, including image classification, game playing, protein structure analysis, language translation, and content generation. Despite these powerful applications, there are still many tasks in our daily life that are rather simple for humans but pose great challenges to AI. These include image and language understanding, few-shot learning, abstract concepts, and low-energy computing. Thus, learning from the brain is still a promising way to shed light on the development of next-generation AI. The brain is arguably the only known intelligent machine in the universe, a product of evolution that allows animals to survive in the natural environment. At the behavioral level, psychology and cognitive sciences have demonstrated that human and animal brains can execute very intelligent high-level cognitive functions. At the structural level, cognitive and computational neurosciences have unveiled that the brain has extremely complicated but elegant network forms to support its functions. Over the years, researchers have gathered knowledge about the structure and functions of the brain, and this process has recently accelerated with the initiation of giant brain projects worldwide. Here, we argue that the general principles of brain functions are the most valuable things to inspire the development of AI. These general principles are the standard rules by which the brain extracts, represents, manipulates, and retrieves information, and here we call them the first principles of the brain. This paper collects six such first principles: attractor networks, criticality, random networks, sparse coding, relational memory, and perceptual learning. For each topic, we review its biological background, fundamental properties, potential applications to AI, and future development.

preprint, 2022, arXiv

BigDL 2.0: Seamless Scaling of AI Pipelines from Laptops to Distributed Cluster

Most AI projects start with a Python notebook running on a single laptop; however, one usually needs to go through a mountain of pain to scale it to handle larger datasets (for both experimentation and production deployment). These usually entail many manual and error-prone steps for the data scientists to fully take advantage of the available hardware resources (e.g., SIMD instructions, multi-processing, quantization, memory allocation optimization, data partitioning, distributed computing, etc.). To address this challenge, we have open sourced BigDL 2.0 at https://github.com/intel-analytics/BigDL/ under Apache 2.0 license (combining the original BigDL and Analytics Zoo projects); using BigDL 2.0, users can simply build conventional Python notebooks on their laptops (with possible AutoML support), which can then be transparently accelerated on a single node (with up to 9.6x speedup in our experiments), and seamlessly scaled out to a large cluster (across several hundred servers in real-world use cases). BigDL 2.0 has already been adopted by many real-world users (such as Mastercard, Burger King, Inspur, etc.) in production.

preprint, 2022, arXiv

Recursive Least-Squares Estimator-Aided Online Learning for Visual Tracking

Tracking visual objects from a single initial exemplar in the testing phase has been broadly cast as a one-/few-shot problem, i.e., one-shot learning for initial adaptation and few-shot learning for online adaptation. The recent few-shot online adaptation methods incorporate the prior knowledge from large amounts of annotated training data via complex meta-learning optimization in the offline phase. This helps the online deep trackers to achieve fast adaptation and reduce overfitting risk in tracking. In this paper, we propose a simple yet effective recursive least-squares estimator-aided online learning approach for few-shot online adaptation without requiring offline training. It allows an in-built memory retention mechanism for the model to remember the knowledge about the object seen before, and thus the seen data can be safely removed from training. This also bears certain similarities to the emerging continual learning field in preventing catastrophic forgetting. This mechanism enables us to unveil the power of modern online deep trackers without incurring too much extra computational cost. We evaluate our approach based on two networks in the online learning families for tracking, i.e., multi-layer perceptrons in RT-MDNet and convolutional neural networks in DiMP. The consistent improvements on several challenging tracking benchmarks demonstrate its effectiveness and efficiency.
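To illustrate the memory-retention idea mentioned in this abstract, here is a minimal sketch of the textbook recursive least-squares (RLS) update for a linear model with one output. It is not the trackers' implementation from the paper; the class name, the forgetting factor `lam`, the initial scale `delta`, and the toy data are assumptions chosen for illustration. Each sample is folded into the weights and the matrix `P`, so it does not need to be stored afterwards.

```python
from typing import List, Sequence


class RecursiveLeastSquares:
    """Textbook RLS for y ≈ w · x, processing one sample at a time."""

    def __init__(self, dim: int, lam: float = 1.0, delta: float = 1e3) -> None:
        self.lam = lam                      # forgetting factor (1.0 = never forget)
        self.w: List[float] = [0.0] * dim   # current weight estimate
        # P starts as delta * I and tracks the inverse of the weighted input correlation.
        self.P = [[delta if i == j else 0.0 for j in range(dim)] for i in range(dim)]

    def predict(self, x: Sequence[float]) -> float:
        return sum(wi * xi for wi, xi in zip(self.w, x))

    def update(self, x: Sequence[float], y: float) -> float:
        n = len(self.w)
        Px = [sum(self.P[i][j] * x[j] for j in range(n)) for i in range(n)]
        denom = self.lam + sum(x[i] * Px[i] for i in range(n))
        gain = [v / denom for v in Px]                  # k = P x / (lam + x^T P x)
        err = y - self.predict(x)                       # a priori prediction error
        self.w = [self.w[i] + gain[i] * err for i in range(n)]
        # P <- (P - k x^T P) / lam; P is symmetric, so x^T P equals (P x)^T.
        self.P = [[(self.P[i][j] - gain[i] * Px[j]) / self.lam for j in range(n)]
                  for i in range(n)]
        return err


# Toy stream (hypothetical): noise-free samples of y = 2*x0 - 3*x1, seen once each.
rls = RecursiveLeastSquares(dim=2)
stream = [([1.0, 0.0], 2.0), ([0.0, 1.0], -3.0), ([1.0, 1.0], -1.0),
          ([2.0, -1.0], 7.0), ([-1.0, 2.0], -8.0)]
for x, y in stream:
    rls.update(x, y)
```

After the loop, `rls.w` is close to `[2.0, -3.0]` even though no past sample is kept. Setting `lam` slightly below 1 would discount older samples, which trades retention for faster adaptation.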

preprint, 2020, arXiv

Progressive Relation Learning for Group Activity Recognition

Group activities usually involve spatiotemporal dynamics among many interactive individuals, while only a few participants at several key frames essentially define the activity. Therefore, effectively modeling the group-relevant and suppressing the irrelevant actions (and interactions) are vital for group activity recognition. In this paper, we propose a novel method based on deep reinforcement learning to progressively refine the low-level features and high-level relations of group activities. Firstly, we construct a semantic relation graph (SRG) to explicitly model the relations among persons. Then, two agents adopting policy according to two Markov decision processes are applied to progressively refine the SRG. Specifically, one feature-distilling (FD) agent in the discrete action space refines the low-level spatio-temporal features by distilling the most informative frames. Another relation-gating (RG) agent in continuous action space adjusts the high-level semantic graph to pay more attention to group-relevant relations. The SRG, FD agent, and RG agent are optimized alternately to mutually boost the performance of each other. Extensive experiments on two widely used benchmarks demonstrate the effectiveness and superiority of the proposed approach.

preprint, 2015, arXiv

A ferroelectric-like structural transition in a metal

Metals cannot exhibit ferroelectricity because static internal electric fields are screened by conduction electrons, but in 1965, Anderson and Blount predicted the possibility of a ferroelectric metal, in which a ferroelectric-like structural transition occurs in the metallic state. Up to now, no clear example of such a material has been identified. Here we report on a centrosymmetric (R-3c) to non-centrosymmetric (R3c) transition in metallic LiOsO3 that is structurally equivalent to the ferroelectric transition of LiNbO3. The transition involves a continuous shift in the mean position of Li+ ions on cooling below 140K. Its discovery realizes the scenario described by Anderson and Blount, and establishes a new class of materials whose properties may differ from those of normal metals.

preprint, 2013, arXiv

Universal Organization of Resting Brain Activity at the Thermodynamic Critical Point

Thermodynamic criticality describes emergent phenomena in a wide variety of complex systems. In the mammalian brain, the complex dynamics that spontaneously emerge from neuronal interactions have been characterized as neuronal avalanches, a form of critical branching dynamics. Here, we show that neuronal avalanches also reflect that the brain dynamics are organized close to a thermodynamic critical point. We recorded spontaneous cortical activity in monkeys and humans at rest using high-density intracranial microelectrode arrays and magnetoencephalography, respectively. By numerically changing a control parameter equivalent to thermodynamic temperature, we observed typical critical behavior in cortical activities near the actual physiological condition, including the phase transition of an order parameter, as well as the divergence of susceptibility and specific heat. Finite-size scaling of these quantities allowed us to derive robust critical exponents highly consistent across monkeys and humans that uncover a distinct, yet universal organization of brain dynamics.

preprint, 2012, arXiv

Superconductivity suppression of Ba0.5K0.5Fe2-2xM2xAs2 single crystals by substitution of transition-metal (M = Mn, Ru, Co, Ni, Cu, and Zn)

We investigated the doping effects of magnetic and nonmagnetic impurities on the single-crystalline p-type Ba0.5K0.5Fe2-2xM2xAs2 (M = Mn, Ru, Co, Ni, Cu and Zn) superconductors. The superconductivity is robust against Ru impurities but only weakly robust against Mn, Co, Ni, Cu, and Zn impurities. However, the present Tc suppression rate of both magnetic and nonmagnetic impurities remains much lower than what was expected for the s±-wave model. The temperature dependence of the resistivity shows an obvious low-T upturn for the crystals doped with high impurity levels, which is due to the occurrence of localization. Thus, the relatively weak Tc suppression effect from Mn, Co, Ni, Cu, and Zn is considered a result of localization rather than of the pair-breaking effect in the s±-wave model.

preprint, 2011, arXiv

Linear decrease of critical temperature with increasing Zn substitution in the iron-based superconductor BaFe1.89-2xZn2xCo0.11As2

The nonmagnetic impurity effect is studied on the Fe-based BaFe1.89Co0.11As2 superconductor (Tc = 25 K) with Zn substitution for Fe up to 8 at. %, which is achieved by means of high-pressure and high-temperature heating. Tc decreases almost linearly with increasing Zn content and disappears at ~8 at. %, in contrast to the shared phenomenology of the early Zn doping studies, where Tc decreases little. The rate of Tc decrease (3.63 K/%), however, remains much lower than what is expected for the s±-wave model, implying the model is unlikely. Another symmetry model such as the non-sign reversal s-wave model may better account for the result.

preprint, 2010, arXiv

Information capacity and transmission are maximized in balanced cortical networks with neuronal avalanches

The repertoire of neural activity patterns that a cortical network can produce constrains the network's ability to transfer and process information. Here, we measured activity patterns obtained from multi-site local field potential (LFP) recordings in cortex cultures, urethane anesthetized rats, and awake macaque monkeys. First, we quantified the information capacity of the pattern repertoire of ongoing and stimulus-evoked activity using Shannon entropy. Next, we quantified the efficacy of information transmission between stimulus and response using mutual information. By systematically changing the ratio of excitation/inhibition (E/I) in vitro and in a network model, we discovered that both information capacity and information transmission are maximized at a particular intermediate E/I, at which ongoing activity emerges as neuronal avalanches. Next, we used our in vitro and model results to correctly predict in vivo information capacity and interactions between neuronal groups during ongoing activity. Close agreement between our experiments and model suggests that neuronal avalanches and peak information capacity arise due to criticality and are general properties of cortical networks with balanced E/I.
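The two quantities named in this abstract, Shannon entropy of a pattern repertoire and mutual information between stimulus and response, can be sketched with simple plug-in (frequency-count) estimators. The example below is an illustration under stated assumptions, not the authors' analysis pipeline: patterns are represented as hashable tuples of binary site activity, and the function names and toy data are hypothetical.

```python
from collections import Counter
from math import log2
from typing import Hashable, Sequence


def shannon_entropy(patterns: Sequence[Hashable]) -> float:
    """Plug-in estimate of H = -sum_i p_i * log2(p_i), in bits, from observed patterns."""
    total = len(patterns)
    if total == 0:
        raise ValueError("need at least one observed pattern")
    counts = Counter(patterns)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def mutual_information(stimuli: Sequence[Hashable],
                       responses: Sequence[Hashable]) -> float:
    """I(S; R) = H(S) + H(R) - H(S, R), using paired observations."""
    if len(stimuli) != len(responses):
        raise ValueError("stimuli and responses must be paired")
    joint = list(zip(stimuli, responses))
    return shannon_entropy(stimuli) + shannon_entropy(responses) - shannon_entropy(joint)


# Toy repertoire (hypothetical): four binary patterns over two sites, each seen once.
repertoire = [(0, 0), (0, 1), (1, 0), (1, 1)]
capacity_bits = shannon_entropy(repertoire)

# Responses that copy the stimulus carry all of its information;
# responses unrelated to the stimulus carry none.
stim = ["a", "a", "b", "b"]
mi_copy = mutual_information(stim, stim)
mi_unrelated = mutual_information(stim, [0, 1, 0, 1])
```

Plug-in estimates are biased when the number of samples is small relative to the number of possible patterns, so real recordings usually call for a bias correction or a comparison against shuffled data.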

preprint, 2010, arXiv

Quantum mechanics needs no consciousness (and the other way around)

It has been suggested that consciousness plays an important role in quantum mechanics as it is necessary for the collapse of the wave function during measurement. Furthermore, this idea has spawned a symmetrical proposal: a possibility that quantum mechanics explains the emergence of consciousness in the brain. Here we formulated several predictions that follow from this hypothetical relationship and that can be empirically tested. Some of the experimental results that are already available suggest falsification of the first hypothesis. Thus, the suggested link between human consciousness and collapse of the wave function does not seem viable. We discuss the constraints implied by the existing evidence on the role that the human observer may play for quantum mechanics and the role that quantum mechanics may play in the observer's consciousness.
