Source author record

Wenyuan Li

Wenyuan Li appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Topics: Computer Vision, Machine Learning, Artificial Intelligence, math.NA, Numerical Analysis, Information Retrieval, math.DS, Neurons and Cognition, physics.ao-ph, Social and Information Networks

Catalog footprint

What is connected

9 works

10 topics

4 close collaborators

preprint · 2026 · arXiv

AgriFM: A Multi-source Temporal Remote Sensing Foundation Model for Agriculture Mapping

Accurate crop mapping fundamentally relies on modeling multi-scale spatiotemporal patterns, where spatial scales range from individual field textures to landscape-level context, and temporal scales capture both short-term phenological transitions and full growing-season dynamics. Transformer-based remote sensing foundation models (RSFMs) offer promising potential for crop mapping due to their innate ability for unified spatiotemporal processing. However, current RSFMs remain suboptimal for crop mapping: they either employ fixed spatiotemporal windows that ignore the multi-scale nature of crop systems or completely disregard temporal information by focusing solely on spatial patterns. To bridge these gaps, we present AgriFM, a multi-source remote sensing foundation model specifically designed for agricultural crop mapping. Our approach begins by establishing the necessity of simultaneous hierarchical spatiotemporal feature extraction, leading to the development of a modified Video Swin Transformer architecture where temporal down-sampling is synchronized with spatial scaling operations. This modified backbone enables efficient unified processing of long time-series satellite inputs. AgriFM leverages temporally rich data streams from three satellite sources, MODIS, Landsat-8/9 and Sentinel-2, and is pre-trained on a globally representative dataset comprising over 25 million image samples supervised by land cover products. The resulting framework incorporates a versatile decoder architecture that dynamically fuses these learned spatiotemporal representations, supporting diverse downstream tasks. Comprehensive evaluations demonstrate AgriFM's superior performance over conventional deep learning approaches and state-of-the-art general-purpose RSFMs across all downstream tasks. Code will be available at https://github.com/flyakon/AgriFM.

preprint · 2026 · arXiv

Predictive but Not Plannable: RC-aux for Latent World Models

A latent world model may achieve accurate short-horizon prediction while still inducing a latent space that is poorly aligned with planning. A key issue is spatiotemporal mismatch: these models are often trained with local predictive supervision, but deployed for long-horizon goal-directed search in latent spaces where Euclidean distance may not reflect what is reachable within a finite action budget. We present the Reachability-Correction auxiliary objective (RC-aux), a lightweight correction for this mismatch in reconstruction-free latent world models. RC-aux keeps the world-model backbone unchanged and adds planning-aligned supervision along two axes. Along the time axis, multi-horizon open-loop prediction trains the model beyond one-step consistency. Along the space axis, budget-conditioned reachability supervision, together with temporal hard negatives, encourages the latent space to distinguish states that are eventually reachable from those reachable within the current planning horizon. At test time, the learned reachability signal can also be used by a reachability-aware planner to favor trajectories that are both goal-directed and attainable under the available budget. We instantiate RC-aux on LeWorldModel and evaluate it under both continuation-training and matched-from-scratch settings. Across goal-conditioned pixel-control tasks and a LIBERO-Goal extension, RC-aux improves LeWM-style planning with modest additional cost. These results suggest that planning with latent world models depends not only on predictive accuracy, but also on whether the learned representation encodes the temporal and geometric structure required by downstream search. The code is available at https://github.com/Guang000/RC-aux.

preprint · 2024 · arXiv

DeepPhysiNet: Bridging Deep Learning and Atmospheric Physics for Accurate and Continuous Weather Modeling

Accurate weather forecasting is of significant importance to human activities. Currently, there are two paradigms for weather forecasting: Numerical Weather Prediction (NWP) and Deep Learning-based Prediction (DLP). NWP utilizes atmospheric physics for weather modeling but suffers from poor data utilization and high computational costs, while DLP can learn weather patterns directly from vast amounts of data but struggles to incorporate physical laws. Both paradigms have their respective strengths and weaknesses, and they are incompatible, because the physical laws adopted in NWP describe the relationship between coordinates and meteorological variables, while DLP directly learns the relationships between meteorological variables without consideration of coordinates. To address these problems, we introduce the DeepPhysiNet framework, incorporating physical laws into deep learning models for accurate and continuous weather system modeling. First, we construct physics networks based on multilayer perceptrons (MLPs) for individual meteorological variables, such as temperature, pressure, and wind speed. Physics networks establish relationships between variables and coordinates by taking coordinates as input and producing variable values as output. The physical laws, in the form of Partial Differential Equations (PDEs), can be incorporated as part of the loss function. Next, we construct hyper-networks based on deep learning methods to directly learn weather patterns from a large amount of meteorological data. The output of the hyper-networks constitutes a part of the weights for the physics networks. Experimental results demonstrate that, upon successful integration of physical laws, DeepPhysiNet can accomplish multiple tasks simultaneously, not only enhancing forecast accuracy but also obtaining continuous spatiotemporal resolution results, which is unattainable by either NWP or DLP.
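The coupling described in this abstract (a coordinate MLP whose weights partly come from a hyper-network, trained with a PDE term in the loss) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the function names, the single hidden layer, the 1D advection equation u_t + c u_x = 0 standing in for the atmospheric PDEs, and the finite-difference derivatives. It is not the DeepPhysiNet implementation.

```python
import numpy as np

def hyper_network(weather_features, w_hyper):
    """Hypothetical hyper-network: maps data features to the
    output-layer weights of the physics network."""
    return np.tanh(weather_features @ w_hyper)

def physics_network(coords, w_in, w_out):
    """Coordinate MLP: rows of (x, t) in, one variable value per row out."""
    return np.tanh(coords @ w_in) @ w_out

def advection_residual(coords, w_in, w_out, speed, eps=1e-4):
    """Residual of the toy PDE u_t + speed * u_x = 0.

    Derivatives are approximated with central differences; this is a
    simplification chosen to keep the sketch dependency-free.
    """
    step_x = np.array([eps, 0.0])
    step_t = np.array([0.0, eps])
    u_x = (physics_network(coords + step_x, w_in, w_out)
           - physics_network(coords - step_x, w_in, w_out)) / (2 * eps)
    u_t = (physics_network(coords + step_t, w_in, w_out)
           - physics_network(coords - step_t, w_in, w_out)) / (2 * eps)
    return u_t + speed * u_x

def total_loss(coords, targets, w_in, w_out, speed, pde_weight=1.0):
    """Data misfit plus a weighted PDE penalty (the weighting is an assumption)."""
    data_term = np.mean((physics_network(coords, w_in, w_out) - targets) ** 2)
    pde_term = np.mean(advection_residual(coords, w_in, w_out, speed) ** 2)
    return data_term + pde_weight * pde_term
```

Because the physics network takes coordinates as input, it can be queried at any (x, t), which is the sense in which the output is continuous in space and time.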

preprint · 2022 · arXiv

Nonlocal transport equations in multiscale media. Modeling, dememorization, and discretizations

In this paper, we consider a class of convection-diffusion equations with memory effects. These equations arise as a result of homogenization or upscaling of linear transport equations in heterogeneous media and play an important role in many applications. First, we present a dememorization technique for these equations. We show that the convection-diffusion equations with memory effects can be written as a system of standard convection-diffusion-reaction equations. This allows removing the memory term and simplifying the computations. We consider a relation between dememorized equations and micro-scale equations, which do not contain memory terms. We note that dememorized equations differ from micro-scale equations and constitute a macroscopic model. Next, we consider both implicit and partially explicit methods. The latter is introduced for problems in multiscale media with high-contrast properties. Because of high contrast, explicit methods are restrictive and require very small time steps (scaling as the inverse of the contrast). We show that, by appropriately decomposing the space, we can treat only a few degrees of freedom implicitly and the remaining degrees of freedom explicitly. We present a stability analysis. Numerical results are presented that confirm our theoretical findings for partially explicit schemes applied to dememorized systems of equations.
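The idea of removing a memory term can be illustrated with a standard textbook construction. The model equation below, the sum-of-exponentials kernel, and the time-independent coefficient D are all assumptions made for this sketch; the abstract does not state the paper's exact form.

```latex
% Illustrative model with a memory integral (assumed form):
\partial_t u + b\cdot\nabla u - \nabla\cdot(\kappa\nabla u)
  = \int_0^t K(t-s)\,\nabla\cdot\bigl(D\nabla u(s)\bigr)\,ds,
\qquad
K(t) = \sum_{i=1}^{m} a_i\, e^{-\lambda_i t}.

% Auxiliary variables that absorb the history of u:
w_i(t) = \int_0^t e^{-\lambda_i (t-s)}\, u(s)\,ds,
\qquad w_i(0) = 0 .

% Differentiating w_i in time gives a local (memory-free) equation:
\partial_t w_i = u - \lambda_i w_i , \qquad i = 1,\dots,m .

% With D independent of time, the memory integral becomes a sum over w_i:
\partial_t u + b\cdot\nabla u - \nabla\cdot(\kappa\nabla u)
  = \sum_{i=1}^{m} a_i\, \nabla\cdot\bigl(D\nabla w_i\bigr).
```

Under these assumptions the original equation is equivalent to a coupled system for (u, w_1, ..., w_m) with no time integral, at the cost of m extra unknowns.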

preprint · 2022 · arXiv

Semantic-aware Dense Representation Learning for Remote Sensing Image Change Detection

Supervised deep learning models depend on massive labeled data. Unfortunately, it is time-consuming and labor-intensive to collect and annotate bitemporal samples containing desired changes. Transfer learning from pre-trained models is effective in alleviating label insufficiency in remote sensing (RS) change detection (CD). We explore the use of semantic information during pre-training. Unlike traditional supervised pre-training, which learns the mapping from image to label, we incorporate semantic supervision into the self-supervised learning (SSL) framework. Typically, multiple objects of interest (e.g., buildings) are distributed in various locations in an uncurated RS image. Instead of manipulating image-level representations via global pooling, we introduce point-level supervision on per-pixel embeddings to learn spatially-sensitive features, thus benefiting downstream dense CD. To achieve this, we obtain multiple points via class-balanced sampling on the overlapped area between views using the semantic mask. We learn an embedding space where background and foreground points are pushed apart, and spatially aligned points across views are pulled together. Our intuition is that the resulting semantically discriminative representations, invariant to irrelevant changes (illumination and unconcerned land covers), may help change recognition. We collect large-scale image-mask pairs freely available in the RS community for pre-training. Extensive experiments on three CD datasets verify the effectiveness of our method. Our method significantly outperforms ImageNet pre-training, in-domain supervision, and several SSL methods. Empirical results indicate that our pre-training improves the generalization and data efficiency of the CD model. Notably, using 20% of the training data, we achieve results competitive with the baseline (random initialization) trained on 100% of the data. Our code is available.
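The point-level objective described here (class-balanced point sampling, pulling aligned points together, pushing foreground and background apart) might be sketched as follows. The function names, the cosine-similarity form, and the hinge with a margin are hypothetical choices for illustration; the abstract does not specify the actual loss. The sketch also assumes both views are already cropped to their overlapping area and that the mask contains both classes.

```python
import numpy as np

def sample_balanced_points(mask, n_per_class, rng):
    """Sample the same number of foreground (1) and background (0) pixels."""
    fg = np.argwhere(mask == 1)
    bg = np.argwhere(mask == 0)
    fg_pick = fg[rng.choice(len(fg), n_per_class, replace=len(fg) < n_per_class)]
    bg_pick = bg[rng.choice(len(bg), n_per_class, replace=len(bg) < n_per_class)]
    return fg_pick, bg_pick

def l2_normalize(x):
    """Scale each embedding vector to (almost exactly) unit length."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + 1e-8)

def point_loss(emb_a, emb_b, fg_pts, bg_pts, margin=0.0):
    """Toy point-level loss on per-pixel embeddings of shape (H, W, C).

    pull: the same pixel in the two views should have similar embeddings.
    push: foreground and background embeddings should have low similarity.
    """
    fa = l2_normalize(emb_a[fg_pts[:, 0], fg_pts[:, 1]])
    fb = l2_normalize(emb_b[fg_pts[:, 0], fg_pts[:, 1]])
    ba = l2_normalize(emb_a[bg_pts[:, 0], bg_pts[:, 1]])
    bb = l2_normalize(emb_b[bg_pts[:, 0], bg_pts[:, 1]])
    pull = np.mean(1 - np.sum(fa * fb, axis=1)) + np.mean(1 - np.sum(ba * bb, axis=1))
    push = (np.mean(np.maximum(0.0, fa @ bb.T - margin))
            + np.mean(np.maximum(0.0, ba @ fb.T - margin)))
    return pull + push
```

Working on sampled pixels rather than a globally pooled vector is what keeps the learned features spatially sensitive, which is the property the abstract argues dense change detection needs.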

preprint · 2021 · arXiv

Geographical Knowledge-driven Representation Learning for Remote Sensing Images

The proliferation of remote sensing satellites has resulted in a massive amount of remote sensing images. However, due to human and material resource constraints, the vast majority of remote sensing images remain unlabeled. As a result, they cannot be used by currently available deep learning methods. To fully utilize the remaining unlabeled images, we propose a Geographical Knowledge-driven Representation learning method for remote sensing images (GeoKR), improving network performance and reducing the demand for annotated data. The global land cover products and geographical location associated with each remote sensing image are regarded as geographical knowledge to provide supervision for representation learning and network pre-training. An efficient pre-training framework is proposed to eliminate the supervision noise caused by differences in imaging time and resolution between remote sensing images and geographical knowledge. A large-scale pre-training dataset, Levir-KR, is proposed to support network pre-training. It contains 1,431,950 remote sensing images from Gaofen series satellites with various resolutions. Experimental results demonstrate that our proposed method outperforms ImageNet pre-training and self-supervised representation learning methods and significantly reduces the burden of data annotation on downstream tasks such as scene classification, semantic segmentation, object detection, and cloud/snow detection. This demonstrates that our proposed method can be used as a novel paradigm for pre-training neural networks. Code will be available at https://github.com/flyakon/Geographical-Knowledge-driven-Representaion-Learning.

preprint · 2021 · arXiv

Partially Explicit Time Discretization for Nonlinear Time Fractional Diffusion Equations

Nonlinear time fractional partial differential equations are widely used in modeling and simulations. In many applications, there are high contrast changes in media properties. For solving these problems, one often uses a coarse spatial grid for spatial resolution. For temporal discretization, implicit methods are often used. For implicit methods, though the time step can be relatively large, the equations are difficult to solve due to the nonlinearity and the fact that one deals with large-scale systems. On the other hand, the discrete systems in explicit methods are easier to solve but require small time steps. In this work, we propose a partially explicit scheme following earlier works on developing partially explicit methods for nonlinear diffusion equations. In this scheme, the diffusion term is treated partially explicitly and the reaction term is treated fully explicitly. With the appropriate construction of spaces and stability analysis, we find that the required time step in our proposed scheme scales as the coarse mesh size, which yields substantial savings in computation. The main novelty of this work is the extension of our earlier works for diffusion equations to time fractional diffusion equations. For fractional diffusion equations, the constraints on time steps are more severe, and the proposed methods alleviate this since the time step in the partially explicit method scales as the coarse mesh size. We present stability results. Numerical results are presented where we compare our proposed partially explicit methods with a fully implicit approach. We show that our proposed approach provides similar results, while treating many degrees of freedom in nonlinear terms explicitly.
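The benefit of treating only a few degrees of freedom implicitly can be seen in a deliberately simplified setting. The sketch below swaps in a linear, integer-order system du/dt = -A u with two stiff eigenvalues, and splits the space using exact eigenvectors. The paper's setting (nonlinear, time fractional, multiscale space construction) is not reproduced, and all names and values are hypothetical.

```python
import numpy as np

def build_system(seed=0):
    """Symmetric matrix with two stiff (high-contrast) modes and three slow ones."""
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.standard_normal((5, 5)))
    eigenvalues = np.array([1000.0, 800.0, 1.0, 0.5, 0.2])
    a = q @ np.diag(eigenvalues) @ q.T
    stiff = q[:, :2]
    p_implicit = stiff @ stiff.T          # projector onto the stiff subspace
    p_explicit = np.eye(5) - p_implicit   # projector onto the remaining modes
    return a, p_implicit, p_explicit

def partially_explicit_step(u, a, p_implicit, p_explicit, dt):
    """Solve (I + dt*A*P1) u_new = u - dt*A*P2*u: stiff part implicit, rest explicit."""
    lhs = np.eye(len(u)) + dt * (a @ p_implicit)
    rhs = u - dt * (a @ (p_explicit @ u))
    return np.linalg.solve(lhs, rhs)

def explicit_step(u, a, dt):
    """Forward Euler on the full system, for comparison."""
    return u - dt * (a @ u)

def run(step, u0, n_steps):
    """Apply a one-step update n_steps times."""
    u = u0.copy()
    for _ in range(n_steps):
        u = step(u)
    return u
```

With dt = 0.5, forward Euler on the full system is unstable because dt times the largest eigenvalue far exceeds 2, whereas the split scheme only needs dt times the largest slow eigenvalue to stay below 2. This mirrors, in a toy form, the abstract's claim that the time-step restriction no longer depends on the contrast.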

preprint · 2020 · arXiv

A Social Search Model for Large Scale Social Networks

With the rise of social networks, information on the internet is no longer solely organized by web pages. Rather, content is generated and shared among users and organized around their social relations on social networks. This presents new challenges to information retrieval systems. On a social network search system, the generation of result sets not only needs to consider keyword matches, like a traditional web search engine does, but it also needs to take into account the searcher's social connections and the content's visibility settings. In addition, search ranking should be able to handle both textual relevance and the rich social interaction signals from the social network. In this paper, we present our solution to these two challenges by first introducing a social retrieval mechanism, and then investigating novel deep neural networks for the ranking problem. The retrieval system treats social connections as indexing terms, and generates meaningful result sets by biasing towards close social connections in a constrained optimization fashion. The result set is then ranked by a deep neural network that handles textual and social relevance in a two-tower approach, in which personalization and textual relevance are addressed jointly. The retrieval mechanism is deployed on Facebook and is helping billions of users find postings from their connections efficiently. Based on the postings being retrieved, we evaluate our two-tower neural network, and examine the importance of personalization and textual signals in the ranking problem.
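Treating social connections as indexing terms can be pictured with a small inverted index in which author identity is a term alongside words. This is a hypothetical sketch: the class, the `author:` and `word:` term prefixes, and the greedy closest-first loop are assumptions for illustration. The loop is a simple stand-in for the constrained optimization the abstract mentions, and "authored by a connection" is used here as a crude proxy for visibility settings.

```python
from collections import defaultdict

class SocialIndex:
    """Toy inverted index with both word terms and author (connection) terms."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add_post(self, post_id, author, text):
        """Index a post under its author term and each of its word terms."""
        self.postings[f"author:{author}"].add(post_id)
        for word in text.lower().split():
            self.postings[f"word:{word}"].add(post_id)

    def search(self, keyword, connections, limit=10):
        """Return posts matching the keyword, written by the searcher's connections.

        `connections` is assumed to be ordered from closest to most distant,
        so closer connections fill the result set first.
        """
        matches = self.postings.get(f"word:{keyword.lower()}", set())
        results = []
        for friend in connections:
            hits = matches & self.postings.get(f"author:{friend}", set())
            results.extend(sorted(hits))
            if len(results) >= limit:
                break
        return results[:limit]
```

In a full system, the retrieved set would then be passed to a ranking model; that stage is not sketched here.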

preprint · 2020 · arXiv

Criticality or Supersymmetry Breaking?

In many stochastic dynamical systems, ordinary chaotic behavior is preceded by a full-dimensional phase that exhibits 1/f-type power-spectra and/or scale-free statistics of (anti)instantons such as neuroavalanches, earthquakes, etc. In contrast with the phenomenological concept of self-organized criticality, the recently developed approximation-free supersymmetric theory of stochastic differential equations, or stochastics (STS), identifies this phase as the noise-induced chaos (N-phase), i.e., the phase where the topological supersymmetry pertaining to all stochastic dynamical systems is broken spontaneously by the condensation of the noise-induced (anti)instantons. Here, we support this picture in the context of neurodynamics. We study a 1D chain of neuron-like elements and find that the dynamics in the N-phase is indeed characterized by positive stochastic Lyapunov exponents and dominated by (anti)instantonic processes of (creation)annihilation of kinks and antikinks, which can be viewed as predecessors of boundaries of neuroavalanches. We also construct the phase diagram of emulated stochastic neurodynamics on Spikey neuromorphic hardware and demonstrate that the width of the N-phase vanishes in the deterministic limit, in accordance with STS. As a first result of the application of STS to neurodynamics comes the conclusion that a conscious brain can reside only in the N-phase.
