Source author record

Furao Shen

Furao Shen appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Machine Learning Computer Vision Artificial Intelligence

Catalog footprint

What is connected

13works

3topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

AffineLens: Capturing the Continuous Piecewise Affine Functions of Neural Networks

Piecewise affine neural networks (PANNs) provide a principled geometric perspective on neural network expressivity by characterizing the input--output map as a continuous piecewise affine (CPA) function whose complexity is governed by the number, arrangement, and shapes of its affine regions. However, existing interpretability and expressivity analyses often rely on indirect proxies (e.g., activation statistics or theoretical upper bounds) and rarely offer practical, accurate tools for enumerating and visualizing the induced region partition under realistic architectures and bounded input domains. In this work, we present AffineLens, a unified framework for computing the hyperplane arrangements and polyhedral structures underlying PANNs. Given a calibrated (bounded) input polytope, AffineLens identifies the subset of neuron-induced hyperplanes that intersect the domain, enumerates the resulting affine sub-regions in a layer-wise manner, and returns provably non-empty maximal CPA regions together with interior representatives. The framework further provides visualizations of region partitioning and decision boundaries, enabling qualitative inspection alongside quantitative region counts. By exploiting the affine restriction property of CPA networks under fixed activation patterns, AffineLens supports a broad class of modern components, including batch normalization, pooling, residual connections, multilayer perceptrons, and convolutional layers. Finally, we use AffineLens to perform a systematic empirical study of architectural expressivity, comparing networks through region complexity metrics and revealing how design choices influence the geometry of learned functions.

preprint2026arXiv

Beyond Point-wise Neural Collapse: A Topology-Aware Hierarchical Classifier for Class-Incremental Learning

The Nearest Class Mean (NCM) classifier is widely favored in Class-Incremental Learning (CIL) for its superior resistance to catastrophic forgetting compared to Fully Connected layers. While Neural Collapse (NC) theory supports NCM's optimality by assuming features collapse into single points, non-linear feature drift and insufficient training in CIL often prevent this ideal state. Consequently, classes manifest as complex manifolds rather than collapsed points, rendering the single-point NCM suboptimal. To address this, we propose Hierarchical-Cluster SOINN (HC-SOINN), a novel classifier that captures the topological structure of these manifolds via a ``local-to-global'' representation. Furthermore, we introduce Structure-Topology Alignment via Residuals (STAR) method, which employs a fine-grained pointwise trajectory tracking mechanism to actively deform the learned topology, allowing it to adapt precisely to complex non-linear feature drift. Theoretical analysis and Procrustes distance experiments validate our framework's resilience to manifold deformations. We integrated HC-SOINN into seven state-of-the-art methods by replacing their original classifiers, achieving consistent improvements that highlight the effectiveness and robustness of our approach. Code is available at https://github.com/yhyet/HC_SOINN.

preprint2026arXiv

Beyond What to Select: A Plug-and-play Oscillatory Data-Volume Scheduling for Efficient Model Training

Data selection accelerates training by identifying representative training data while preserving model performance. However, existing methods mainly focus on designing sample-importance criteria, i.e., deciding what to select, while typically fixing the selected data volume as the target ratio throughout training. Thus, they are often dynamic in sample identity but static in data volume. In this work, we revisit data selection from an optimization perspective and show that selected-data training induces an implicit regularization effect modulated by the instantaneous selection ratio. This reveals a key trade-off: lower ratios amplify selection-induced regularization, whereas higher ratios preserve data coverage and optimization fidelity. Motivated by this insight, we propose PODS, a Plug-and-play Oscillatory Data-volume Scheduling framework. Rather than introducing another sample-scoring metric, PODS serves as a lightweight module that dynamically schedules how much data to select over training. Under the target selection ratio, PODS alternates between low-ratio regularization phases and high-ratio recovery phases to exploit selection-induced regularization without sacrificing optimization stability. With its lightweight, ratio-level, and task-agnostic design, PODS is compatible with existing static and dynamic selection methods and broadly applicable across training paradigms. Experiments across various datasets, architectures, and tasks show that PODS consistently improves the efficiency-generalization trade-off, e.g., reducing ImageNet-1k training cost by 50% with improved accuracy and accelerating LLM instruction tuning by over 2x without performance degradation.

preprint2026arXiv

MiCA: A Mobility-Informed Causal Adapter for Lightweight Epidemic Forecasting

Accurate forecasting of infectious disease dynamics is critical for public health planning and intervention. Human mobility plays a central role in shaping the spatial spread of epidemics, but mobility data are noisy, indirect, and difficult to integrate reliably with disease records. Meanwhile, epidemic case time series are typically short and reported at coarse temporal resolution. These conditions limit the effectiveness of parameter-heavy mobility-aware forecasters that rely on clean and abundant data. In this work, we propose the Mobility-Informed Causal Adapter (MiCA), a lightweight and architecture-agnostic module for epidemic forecasting. MiCA infers mobility relations through causal discovery and integrates them into temporal forecasting models via gated residual mixing. This design allows lightweight forecasters to selectively exploit mobility-derived spatial structure while remaining robust under noisy and data-limited conditions, without introducing heavy relational components such as graph neural networks or full attention. Extensive experiments on four real-world epidemic datasets, including COVID-19 incidence, COVID-19 mortality, influenza, and dengue, show that MiCA consistently improves lightweight temporal backbones, achieving an average relative error reduction of 7.5\% across forecasting horizons. Moreover, MiCA attains performance competitive with SOTA spatio-temporal models while remaining lightweight.

preprint2026arXiv

Nested Spatio-Temporal Time Series Forecasting

Spatiotemporal forecasting is critical for real-world applications like traffic management, yet capturing reliable interactions remains challenging under noisy and non-stationary conditions. Existing methods primarily rely on historical spatial priors, often failing to account for evolving temporal correlations and suffering from systematic errors. In this work, we propose a nested forecasting framework that couples future macro-level regional trends with micro-level historical observations, enabling top-down guidance from abstract future representations for fine-grained forecasting. Specifically, we employ a spectral clustering-based approach to construct semantically coherent regions, providing both theoretical and empirical evidence that this representation effectively filters systematic noise while preserving essential trends. Building on this, we develop a progressive coarse-to-fine predictor to integrate these representative features into the inference process. This enables the model to leverage trend predictions to anticipate dynamic anomalies, such as periodic offsets, in advance. Furthermore, extensive experiments on multiple high-dimensional datasets demonstrate that our method consistently outperforms state-of-the-art baselines, validating the effectiveness of future macro-guided nested forecasting.

preprint2026arXiv

Region Seeding via Pre-Activation Regularization: A Geometric View of Piecewise Affine Neural Networks

Deep networks with continuous piecewise affine activations induce polyhedral partitions of the input space, making the number of realized affine regions a natural measure of expressive capacity and a key determinant of how well the model can approximate nonlinear target functions. In practice, standard training realizes far fewer region refinements in data-visited neighborhoods than the architecture could in principle support, while existing region-count theory is primarily architectural and offers little guidance on how optimization shapes the realized partition near the data. Our theory provides a sufficient condition under which bringing neuron switching surfaces sufficiently close to data points ensures their intersection with local neighborhoods, which in turn implies a strict increase in the local affine-region count, yielding a principled training-time handle for seeding data-relevant partitions early in optimization. Guided by these results, we propose a plug-and-play region-seeding regularizer that encourages early partitioning while allowing task-driven refinement to dominate later in training. Experiments show that the regularizer increases the number of realized affine regions via exact enumeration and improves overall performance on toy datasets, while also improving early-stage accuracy and achieving comparable (or slightly improved) final accuracy on ImageNet-1k for classical models.

preprint2026arXiv

Training-Time Batch Normalization Reshapes Local Partition Geometry in Piecewise-Affine Networks

Batch normalization (BN) is central to modern deep networks, but its effect on the realized function during training remains less understood than its optimization benefits. We study training-time BN in continuous piecewise-affine (CPA) networks through the geometry of switching hyperplanes and the induced affine-region partition. Conditioned on a mini-batch, we show that BN defines for each neuron a reference hyperplane through the batch centroid, and that breakpoint-switching hyperplanes are parallel translates whose offsets are expressed in batch-standardized coordinates and are independent of the raw bias. This yields an exact criterion for when a switching hyperplane intersects a local $\ell_\infty$ window and motivates a local region-density functional based on exact affine-region counts. Under explicit sufficient conditions, we show that BN increases expected local partition refinement in ReLU and more general piecewise-affine networks, and that this mechanism transfers locally through depth inside parent affine regions where the upstream representation map is an affine embedding. These results provide a function-level geometric account of training-time BN as a batch-conditional recentering mechanism near the data.

preprint2022arXiv

AugRmixAT: A Data Processing and Training Method for Improving Multiple Robustness and Generalization Performance

Deep neural networks are powerful, but they also have shortcomings such as their sensitivity to adversarial examples, noise, blur, occlusion, etc. Moreover, ensuring the reliability and robustness of deep neural network models is crucial for their application in safety-critical areas. Much previous work has been proposed to improve specific robustness. However, we find that the specific robustness is often improved at the sacrifice of the additional robustness or generalization ability of the neural network model. In particular, adversarial training methods significantly hurt the generalization performance on unperturbed data when improving adversarial robustness. In this paper, we propose a new data processing and training method, called AugRmixAT, which can simultaneously improve the generalization ability and multiple robustness of neural network models. Finally, we validate the effectiveness of AugRmixAT on the CIFAR-10/100 and Tiny-ImageNet datasets. The experiments demonstrate that AugRmixAT can improve the model's generalization performance while enhancing the white-box robustness, black-box robustness, common corruption robustness, and partial occlusion robustness.

preprint2022arXiv

AutoAdversary: A Pixel Pruning Method for Sparse Adversarial Attack

Deep neural networks (DNNs) have been proven to be vulnerable to adversarial examples. A special branch of adversarial examples, namely sparse adversarial examples, can fool the target DNNs by perturbing only a few pixels. However, many existing sparse adversarial attacks use heuristic methods to select the pixels to be perturbed, and regard the pixel selection and the adversarial attack as two separate steps. From the perspective of neural network pruning, we propose a novel end-to-end sparse adversarial attack method, namely AutoAdversary, which can find the most important pixels automatically by integrating the pixel selection into the adversarial attack. Specifically, our method utilizes a trainable neural network to generate a binary mask for the pixel selection. After jointly optimizing the adversarial perturbation and the neural network, only the pixels corresponding to the value 1 in the mask are perturbed. Experiments demonstrate the superiority of our proposed method over several state-of-the-art methods. Furthermore, since AutoAdversary does not require a heuristic pixel selection process, it does not slow down excessively as other methods when the image size increases.

preprint2022arXiv

Review Neural Networks about Image Transformation Based on IGC Learning Framework with Annotated Information

Image transformation, a class of vision and graphics problems whose goal is to learn the mapping between an input image and an output image, develops rapidly in the context of deep neural networks. In Computer Vision (CV), many problems can be regarded as the image transformation task, e.g., semantic segmentation and style transfer. These works have different topics and motivations, making the image transformation task flourishing. Some surveys only review the research on style transfer or image-to-image translation, all of which are just a branch of image transformation. However, none of the surveys summarize those works together in a unified framework to our best knowledge. This paper proposes a novel learning framework including Independent learning, Guided learning, and Cooperative learning, called the IGC learning framework. The image transformation we discuss mainly involves the general image-to-image translation and style transfer about deep neural networks. From the perspective of this framework, we review those subtasks and give a unified interpretation of various scenarios. We categorize related subtasks about the image transformation according to similar development trends. Furthermore, experiments have been performed to verify the effectiveness of IGC learning. Finally, new research directions and open problems are discussed for future research.

preprint2022arXiv

RSTAM: An Effective Black-Box Impersonation Attack on Face Recognition using a Mobile and Compact Printer

Face recognition has achieved considerable progress in recent years thanks to the development of deep neural networks, but it has recently been discovered that deep neural networks are vulnerable to adversarial examples. This means that face recognition models or systems based on deep neural networks are also susceptible to adversarial examples. However, the existing methods of attacking face recognition models or systems with adversarial examples can effectively complete white-box attacks but not black-box impersonation attacks, physical attacks, or convenient attacks, particularly on commercial face recognition systems. In this paper, we propose a new method to attack face recognition models or systems called RSTAM, which enables an effective black-box impersonation attack using an adversarial mask printed by a mobile and compact printer. First, RSTAM enhances the transferability of the adversarial masks through our proposed random similarity transformation strategy. Furthermore, we propose a random meta-optimization strategy for ensembling several pre-trained face models to generate more general adversarial masks. Finally, we conduct experiments on the CelebA-HQ, LFW, Makeup Transfer (MT), and CASIA-FaceV5 datasets. The performance of the attacks is also evaluated on state-of-the-art commercial face recognition systems: Face++, Baidu, Aliyun, Tencent, and Microsoft. Extensive experiments show that RSTAM can effectively perform black-box impersonation attacks on face recognition models or systems.

preprint2021arXiv

IC Networks: Remodeling the Basic Unit for Convolutional Neural Networks

Convolutional neural network (CNN) is a class of artificial neural networks widely used in computer vision tasks. Most CNNs achieve excellent performance by stacking certain types of basic units. In addition to increasing the depth and width of the network, designing more effective basic units has become an important research topic. Inspired by the elastic collision model in physics, we present a general structure which can be integrated into the existing CNNs to improve their performance. We term it the "Inter-layer Collision" (IC) structure. Compared to the traditional convolution structure, the IC structure introduces nonlinearity and feature recalibration in the linear convolution operation, which can capture more fine-grained features. In addition, a new training method, namely weak logit distillation (WLD), is proposed to speed up the training of IC networks by extracting knowledge from pre-trained basic models. In the ImageNet experiment, we integrate the IC structure into ResNet-50 and reduce the top-1 error from 22.38% to 21.75%, which also catches up the top-1 error of ResNet-100 (21.75%) with nearly half of FLOPs.

preprint2020arXiv

IC-Network: Efficient Structure for Convolutional Neural Networks

Neural networks have been widely used, and most networks achieve excellent performance by stacking certain types of basic units. Compared to increasing the depth and width of the network, designing more effective basic units has become an important research topic. Inspired by the elastic collision model in physics, we present a universal structure that could be integrated into the existing network structures to speed up the training process and increase their generalization abilities. We term this structure the "Inter-layer Collision" (IC) structure. We built two kinds of basic computational units (IC layer and IC block) that compose the convolutional neural networks (CNNs) by combining the IC structure with the convolution operation. Compared to traditional convolutions, both of the proposed computational units have a stronger non-linear representation ability and can filter features useful for a given task. Using these computational units to build networks, we bring significant improvements in performance for existing state-of-the-art CNNs. On the imagenet experiment, we integrate the IC block into ResNet-50 and reduce the top-1 error from 22.85% to 21.49%, which also exceeds the top-1 error of ResNet-100 (21.75%).

Institution

Affiliation not imported yet

This author record came from a source that does not expose affiliation metadata. Once the author claims the profile or we enrich the record from another provider, this section will link to the concrete institution.

Topic footprint

Fields this researcher appears in

Machine Learning Computer Vision Artificial Intelligence

Source provenance