Source author record

Miao Hu

Miao Hu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher | Unclaimed source record

Networking and Internet Architecture; Artificial Intelligence; Machine Learning; math-ph; math.MP; math.NA; Multimedia; Numerical Analysis

Catalog footprint

What is connected

4 works

8 topics

4 close collaborators

preprint, 2026, arXiv

A model order reduction based adaptive parareal method for time-dependent partial differential equations

In this paper, we propose a model order reduction based adaptive parareal method for time-dependent partial differential equations. By using the data obtained by the fine propagator in each iteration of the plain parareal method together with some model order reduction technique, we construct the coarse propagator adaptively in each parareal iteration, and then obtain our adaptive parareal method. We apply this new method to solve some 3D time-dependent advection-diffusion equations with the Kolmogorov flow and the ABC flow. Numerical results show the good performance of our method in simulating long-term evolution problems.
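
For orientation, the plain parareal iteration that the abstract builds on can be sketched as below. This is a minimal illustration on the scalar test equation y' = lam * y, not the paper's adaptive method: the coarse propagator here is a fixed single forward-Euler step rather than one rebuilt by model order reduction, and all function names and parameters (`coarse`, `fine`, `serial_fine`, `parareal`, `lam`, `substeps`) are assumptions made for the example.

```python
def coarse(y, t0, t1, lam):
    """Cheap propagator (assumed): one forward-Euler step over the whole slice."""
    return y * (1.0 + lam * (t1 - t0))


def fine(y, t0, t1, lam, substeps=100):
    """Accurate propagator (assumed): many small forward-Euler steps."""
    h = (t1 - t0) / substeps
    for _ in range(substeps):
        y = y * (1.0 + lam * h)
    return y


def serial_fine(y0, t_end, n_slices, lam):
    """Reference solution: the fine propagator applied slice after slice."""
    ts = [t_end * n / n_slices for n in range(n_slices + 1)]
    u = [y0]
    for n in range(n_slices):
        u.append(fine(u[n], ts[n], ts[n + 1], lam))
    return u


def parareal(y0, t_end, n_slices, n_iters, lam):
    """Plain parareal: U[n+1] = G(U_new[n]) + F(U_old[n]) - G(U_old[n])."""
    ts = [t_end * n / n_slices for n in range(n_slices + 1)]
    # Initial guess from a serial sweep of the coarse propagator.
    u = [y0]
    for n in range(n_slices):
        u.append(coarse(u[n], ts[n], ts[n + 1], lam))
    for _ in range(n_iters):
        # The fine solves are independent per slice, so they could run in parallel.
        f_old = [fine(u[n], ts[n], ts[n + 1], lam) for n in range(n_slices)]
        g_old = [coarse(u[n], ts[n], ts[n + 1], lam) for n in range(n_slices)]
        new = [y0]
        for n in range(n_slices):
            g_new = coarse(new[n], ts[n], ts[n + 1], lam)
            new.append(g_new + f_old[n] - g_old[n])
        u = new
    return u
```

With as many iterations as time slices, this iteration reproduces the serial fine solution up to rounding; the practical interest lies in stopping much earlier, which is where a better coarse propagator would help.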

preprint, 2026, arXiv

From Sparsity to Simplicity: Enabling Simpler Sequential Replacements via Sparse Attention Distillation

Self-attention serves as the core foundation of large-scale transformer pretraining, but its quadratic token interaction cost makes inference expensive. Replacing attention with simpler sequential modules is appealing, yet naive substitution is often lossy, especially at larger scales. This paper revisits attention replacement through the lens of sparsity. Based on the observation of diverse sparsity patterns across transformer layers, we posit that pretrained transformers decompose complex token dependencies into sequence-to-sequence mappings of varying complexity, where some layer functionalities can be approximated and replaced with much simpler sequential modules without loss. We evaluate this premise using a plug-and-play layer-wise distillation framework to approximate and replace attention functionalities in pretrained vision transformer models. Controlled group-wise replacements under a fixed training budget reveal a clear pattern: substituting layers with sparser attention incurs substantially smaller accuracy drops than replacing denser ones. We further impose explicit attention sparsity on the pretrained ViT via AViT-style token retention and perform sparsity-guided distillation for the sequential replacement models, where increasing teacher sparsity consistently reduces the student-teacher gap. The proposed method achieves efficient attention replacement, with reduced parameter count and latency, through the guidance of attention sparsity.

preprint, 2021, arXiv

Optimizing Video Caching at the Edge: A Hybrid Multi-Point Process Approach

Delivering a huge volume of videos over the Internet is a persistent challenge. To meet the high bandwidth and stringent playback demand, one feasible solution is to cache video contents on edge servers based on predicted video popularity. Traditional caching algorithms (e.g., LRU, LFU) are too simple to capture the dynamics of video popularity, especially for long-tailed videos. Recent learning-driven caching algorithms (e.g., DeepCache) show promising performance; however, such black-box approaches lack explainability and interpretability. Moreover, their parameter tuning requires a large number of historical records, which are difficult to obtain for videos with low popularity. In this paper, we optimize video caching at the edge using a white-box approach, which is highly efficient and also completely explainable. To accurately capture the evolution of video popularity, we develop a mathematical model called the HRS model, which combines multiple point processes, including Hawkes' self-exciting, reactive and self-correcting processes. The key advantages of the HRS model are its explainability and its much smaller number of model parameters. In addition, all its model parameters can be learned automatically by maximizing the log-likelihood function constructed from past video request events. Next, we design an online HRS-based video caching algorithm. To verify its effectiveness, we conduct a series of experiments using real video traces collected from Tencent Video, one of the largest online video providers in China. Experimental results demonstrate that our proposed algorithm outperforms the state-of-the-art algorithms, with a 12.3% improvement on average in terms of cache hit rate under realistic settings.
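
To make the point-process idea concrete, the sketch below covers only the self-exciting (Hawkes) ingredient named in the abstract, with the log-likelihood that maximum-likelihood fitting would maximize. It is not the HRS model itself: the reactive and self-correcting components are omitted, the exponential decay kernel is an assumption, and the names `hawkes_intensity`, `hawkes_log_likelihood`, `mu`, `alpha`, `beta` and `horizon` are hypothetical.

```python
import math


def hawkes_intensity(t, events, mu, alpha, beta):
    """Request rate at time t: base rate mu plus a decaying boost from each past request.

    Assumes an exponential kernel alpha * exp(-beta * (t - t_i)); only events
    strictly before t contribute.
    """
    return mu + alpha * sum(math.exp(-beta * (t - ti)) for ti in events if ti < t)


def hawkes_log_likelihood(events, horizon, mu, alpha, beta):
    """Log-likelihood of request times observed on [0, horizon].

    Equals the sum of log-intensities at the event times minus the integral of
    the intensity over the window (closed form for the exponential kernel).
    """
    log_term = sum(
        math.log(hawkes_intensity(ti, events, mu, alpha, beta)) for ti in events
    )
    integral = mu * horizon + (alpha / beta) * sum(
        1.0 - math.exp(-beta * (horizon - ti)) for ti in events
    )
    return log_term - integral
```

A caching policy in this spirit might rank videos by their predicted intensity and keep the highest-ranked ones on the edge server, although the ranking rule used by the paper's online algorithm is not stated in the abstract.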

preprint, 2015, arXiv

Recursion-based Analysis for Information Propagation in Vehicular Ad Hoc Networks

Effective inter-vehicle communication is fundamental to a decentralized traffic information system based on Vehicular Ad Hoc Networks (VANETs). To reflect the uncertainty of information propagation while reducing the analysis complexity, most existing work assumes that the inter-vehicle distance follows a specific probability model, e.g., the lognormal or exponential distribution. Aiming to provide more generic results, this paper proposes a recursive modeling framework for VANETs in which the vehicle spacing can follow a general i.i.d. distribution. With this framework, analytical expressions are derived for a series of commonly discussed metrics, including the mean, variance and probability distribution of the propagation distance, and the expected number of vehicles included in a propagation process, when transmission failures are mainly caused by MAC contentions. Moreover, the efficiency of the recursive analysis method is also discussed for the case where the impact of channel fading is considered. All the analytical results are verified by extensive simulations. We believe that this work can offer a more insightful understanding of information propagation in VANETs by allowing the effect of any vehicle headway distribution to be evaluated.