Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
32 works
0 followers
22 topics
4 close collaborators


Published work

32 published item(s)

preprint, 2026, arXiv

Asynchronous Fractional Multi-Agent Deep Reinforcement Learning for Age-Minimal Mobile Edge Computing

In the realm of emerging real-time networked applications such as cyber-physical systems (CPS), the Age of Information (AoI) has emerged as a pivotal metric for evaluating timeliness. To meet the high computational demands, such as those in smart manufacturing within CPS, mobile edge computing (MEC) presents a promising solution for optimizing computing and reducing AoI. In this work, we study the timeliness of compute-intensive updates and explore jointly optimizing the task updating (when to generate a task) and offloading (where to process a task) policies to minimize AoI. Specifically, we consider edge load dynamics and formulate a task scheduling problem to minimize the expected time-average AoI. Solving this problem is challenging due to the fractional objective introduced by AoI and the asynchronous decision-making of the semi-Markov game (SMG). To this end, we propose a fractional reinforcement learning (RL) framework. We begin by introducing a fractional single-agent RL framework and establish its linear convergence rate. Building on this, we develop a fractional multi-agent RL framework, extend Dinkelbach's method, and demonstrate its equivalence to the inexact Newton's method. Furthermore, we provide the conditions under which the framework achieves linear convergence to the Nash equilibrium (NE). To tackle the challenge of asynchronous decision-making in the SMG, we further design an asynchronous model-free fractional multi-agent RL algorithm, where each mobile device can determine the task updating and offloading decisions without knowing the real-time system dynamics and decisions of other devices. Experimental results show that when compared with the best existing baseline algorithm, our proposed algorithm reduces the average AoI by up to 50.6%.
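The abstract above builds on Dinkelbach's method for fractional objectives. As a rough illustration only (this is the classic single-problem iteration on a toy finite candidate set, not the paper's multi-agent RL framework), a ratio num(x)/den(x) can be minimised by repeatedly solving the parametric subproblem num(x) - lam * den(x). The function name `dinkelbach_min` and the candidate pairs are hypothetical.

```python
def dinkelbach_min(candidates, num, den, tol=1e-9, max_iter=100):
    """Minimise num(x) / den(x) over a finite set, assuming den(x) > 0."""
    x = candidates[0]
    lam = num(x) / den(x)  # current estimate of the optimal ratio
    for _ in range(max_iter):
        # Parametric subproblem: minimise num(x) - lam * den(x).
        x = min(candidates, key=lambda c: num(c) - lam * den(c))
        gap = num(x) - lam * den(x)
        lam = num(x) / den(x)
        if abs(gap) < tol:  # the subproblem optimum is zero at the optimal ratio
            break
    return x, lam


# Toy data (made up): each pair is (accumulated cost, elapsed time).
pairs = [(4.0, 2.0), (9.0, 3.0), (5.0, 4.0), (12.0, 5.0)]
best, ratio = dinkelbach_min(pairs, num=lambda p: p[0], den=lambda p: p[1])
```

Each pass replaces the fractional objective with a difference that is easier to optimise, which is the property the fractional RL framework in the abstract relies on.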

preprint, 2026, arXiv

Decomposing the Generalization Gap in PROTAC Activity Prediction: Variance Attribution and the Inter-Laboratory Ceiling

Machine-learning predictors of biochemical activity often exhibit large random-split-to-leave-one-target-out generalisation gaps that have been documented but not decomposed. We frame this as an evaluation-science question and use targeted protein degradation as the empirical test bed. PROTACs (proteolysis-targeting chimeras) are heterobifunctional small molecules that induce targeted protein degradation, with more than forty candidates currently in clinical trials; published predictors report AUROC of 0.85 to 0.91 under random-split cross-validation, while the leave-one-target-out (LOTO) protocol of Ribes et al. reduces performance to approximately 0.67. Random splits reward within-target interpolation, whereas LOTO measures the novel-target prediction that de-novo design depends on. We decompose this gap and identify inter-laboratory measurement variance as the dominant component, anchored by a within-target cross-laboratory cascade bounding the inter-laboratory contribution at 0.124 AUROC, well above the 0.05 contribution from binarisation-threshold choice. Across eight published architectures and ESM-2 protein language models up to 3B parameters, LOTO AUROC plateaus near 0.67, with a comparable plateau under SMILES-level deduplication; a 21-dimensional 2000-trial hyperparameter optimisation cannot break this ceiling, and the rank-1 single-seed configuration regresses by 0.161 AUROC under multi-seed validation, matching a closed-form selection-bias prediction (Bailey and Lopez de Prado, 2014). Few-shot k=5 stratified per-target retraining combined with ADMET features lifts 65-target LOTO AUROC from 0.668 to 0.7050, and post-hoc Platt scaling recovers raw output to within the 0.05 well-calibrated threshold. We release PROTAC-Bench (10,748 measurements, 173 targets, 65 LOTO folds), the variance-decomposition framework, the per-target calibration protocol, and the evaluation code.

preprint, 2026, arXiv

First Thin-Film Lithium Tantalate Polarization Controller Enabling Reset-Free Mrad/s Tracking for Optical Interconnects

The rapid escalation of computing power driven by large-scale artificial intelligence is placing unprecedented demands on the bandwidth, latency, and energy efficiency of data-center interconnects (DCIs). Self-homodyne coherent (SHC) transmission is a promising architecture because it preserves the spectral efficiency of coherent detection while greatly simplifying digital signal processing, but its practical deployment is critically limited by random and often ultrafast state-of-polarization (SOP) fluctuations that induce carrier fading and destabilize coherent reception. Here we report the first integrated polarization controller based on thin-film lithium tantalate (TFLT), enabling reset-free polarization tracking at Mrad/s speeds. The four-stage electro-optic device exhibits polarization-dependent loss (PDL) below 0.3 dB, a half-wave voltage below 2.5 V, high modulation bandwidth, and negligible DC drift. To accommodate the finite tuning range of integrated phase shifters, we develop a finite-boundary gradient-descent (FBGD) control algorithm that ensures reset-free SOP evolution with no phase jump. The implemented adaptive polarization controller (APC) is validated through both standalone polarization-tracking measurements and a dual-polarization 16-QAM SHC 400-Gbps transmission system. Transient polarization disturbances can be tracked at speeds up to 2 Mrad/s, while stable reset-free operation under continuous polarization disturbances is maintained up to 1 Mrad/s. This reset-free performance represents more than doubling the state of the art, while the pre-FEC bit-error rates remain below the HD-FEC threshold under realistic DCI conditions and lightning-scale polarization disturbances. These results establish TFLT as a new platform for ultrafast, low-power, reset-free, and drift-free polarization control in coherent optical interconnects and beyond.

preprint, 2026, arXiv

FlowSpec: Continuous Pipelined Speculative Decoding for Efficient Distributed LLM Inference

Distributed inference serves as a promising approach to enabling the inference of large language models (LLMs) at the network edge. It distributes the inference process to multiple devices to ensure that the LLMs can fit into the device memory. Recent pipeline-based approaches have the potential to parallelize communication and computation, which helps reduce inference latency. However, the benefit diminishes when inference requests at the network edge are sparse, in which case the pipeline is typically at low utilization. To enable efficient distributed LLM inference at the edge, we propose FlowSpec, a pipeline-parallel tree-based speculative decoding framework. FlowSpec incorporates three key mechanisms to improve decoding efficiency: 1) score-based step-wise verification, which prioritizes more important draft tokens so that tokens are accepted earlier; 2) efficient draft management, which prunes invalid tokens while maintaining the correct causal relationship during verification; 3) dynamic draft expansion strategies, which supply high-quality speculative inputs. These techniques work in concert to enhance both pipeline utilization and speculative efficiency. We evaluate FlowSpec on a real-world testbed against other baselines. Experimental results demonstrate that our proposed framework significantly improves inference speed across diverse models and configurations, achieving speedup ratios of 1.37$\times$-1.73$\times$ compared to baselines. Our code is publicly available at https://github.com/Leosang-lx/FlowSpec#.

preprint, 2026, arXiv

GeM-VG: Towards Generalized Multi-image Visual Grounding with Multimodal Large Language Models

Multimodal Large Language Models (MLLMs) have demonstrated impressive progress in single-image grounding and general multi-image understanding. Recently, some methods have begun to address multi-image grounding. However, they are constrained by single-target localization and limited types of practical tasks, due to the lack of unified modeling for generalized grounding tasks. Therefore, we propose GeM-VG, an MLLM capable of Generalized Multi-image Visual Grounding. To support this, we systematically categorize and organize existing multi-image grounding tasks according to their reliance on cross-image cues and reasoning, and introduce the MG-Data-240K dataset, addressing the limitations of existing datasets regarding target quantity and image relation. To tackle the challenges of robustly handling diverse multi-image grounding tasks, we further propose a hybrid reinforcement finetuning strategy that integrates chain-of-thought (CoT) reasoning and direct answering, considering their complementary strengths. This strategy adopts an R1-like algorithm guided by a carefully designed rule-based reward, effectively enhancing the model's overall perception and reasoning capabilities. Extensive experiments demonstrate the superior generalized grounding capabilities of our model. For multi-image grounding, it outperforms the previous leading MLLMs by 2.0% and 9.7% on MIG-Bench and MC-Bench, respectively. In single-image grounding, it achieves a 9.1% improvement over the base model on ODINW. Furthermore, our model retains strong capabilities in general multi-image understanding.

preprint, 2026, arXiv

ReST-KV: Robust KV Cache Eviction with Layer-wise Output Reconstruction and Spatial-Temporal Smoothing

Large language models (LLMs) face growing challenges in efficient generative inference due to the increasing memory demands of Key-Value (KV) caches, especially for long sequences. Existing eviction methods typically retain KV pairs with high attention weights but overlook the impact of attention redistribution caused by token removal, as well as the spatial-temporal dynamics in KV selection. In this paper, we propose ReST-KV, a robust KV eviction method that combines layer-wise output Reconstruction and Spatial-Temporal smoothing to provide a more comprehensive perspective for the KV cache eviction task. Specifically, ReST-KV formulates KV cache eviction as an optimization problem that minimizes output discrepancies through efficient layer-wise reconstruction. By directly modeling how each token's removal affects the model output, our method naturally captures attention redistribution effects, going beyond simplistic reliance on raw attention weights. To further enhance robustness, we design exponential moving average smoothing to handle temporal variations and an adaptive window-based mechanism to capture spatial patterns. Our method, ReST-KV, significantly advances performance on long-context benchmarks. It surpasses state-of-the-art baselines by 2.58% on LongBench and 15.2% on RULER. Additionally, ReST-KV consistently outperforms existing methods on Needle-in-a-Haystack and InfiniteBench, all while achieving a remarkable 10.61$\times$ reduction in decoding latency at 128k context length. The code is publicly available at https://github.com/an-yongqi/rest-kv to facilitate reproducibility and further research.
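The temporal smoothing step mentioned in the abstract above can be pictured with a small sketch: per-token importance scores are smoothed across decoding steps with an exponential moving average, and only the top-k tokens are kept. The raw scores below are made up, the decay value is arbitrary, and the helper names are hypothetical. The paper derives its scores from layer-wise output reconstruction, which is not reproduced here.

```python
def ema_update(smoothed, raw, decay=0.9):
    """Blend new per-token scores into the running average."""
    if smoothed is None:  # first step: nothing to smooth against yet
        return list(raw)
    return [decay * s + (1.0 - decay) * r for s, r in zip(smoothed, raw)]


def keep_top_k(scores, k):
    """Return the indices of the k highest-scoring tokens, in position order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


# Hypothetical importance scores for 4 cached tokens over 3 decoding steps.
steps = [
    [0.9, 0.1, 0.5, 0.2],
    [0.8, 0.9, 0.4, 0.1],  # token 1 spikes for a single step
    [0.9, 0.1, 0.6, 0.2],
]
smoothed = None
for raw in steps:
    smoothed = ema_update(smoothed, raw, decay=0.5)
kept = keep_top_k(smoothed, k=2)
```

In this toy run the one-step spike on token 1 does not displace token 2, which scores steadily; damping such transient variation is the purpose of the smoothing.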

preprint, 2023, arXiv

A Posterior Error Estimator for Mixed Interior Penalty Discontinuous Galerkin Finite Element Method for the H(curl)-Elliptic Problems

In this paper, we design the first residual-type a posteriori error estimator for the mixed interior penalty discontinuous Galerkin method for H(curl)-elliptic problems. We then prove that our residual-based a posteriori error indicator is both reliable and efficient. Finally, we present some numerical experiments to validate the performance of the indicator within an adaptive mesh refinement procedure.

preprint, 2023, arXiv

Iterative two-level algorithm for nonsymmetric or indefinite elliptic problems

In this paper, a new iterative two-level algorithm is presented for solving the finite element discretization of nonsymmetric or indefinite elliptic problems. The iterative two-level algorithm uses the same coarse space as the traditional two-grid algorithm, but its "fine space" uses the higher-order finite element space on the coarse grid. Therefore, the iterative two-level algorithm only needs one grid, and its computational cost is much lower than that of the traditional iterative two-grid algorithm. Finally, numerical experiments show that, compared with the traditional two-grid algorithm, the computational cost required to achieve the same convergence order is lower.

preprint, 2022, arXiv

A modified weak Galerkin method for $\boldsymbol{H}(\mathrm{curl})$-elliptic problem

In this paper, we design and analyze a modified weak Galerkin (MWG) finite element method for the $\boldsymbol{H}(\mathrm{curl})$-elliptic problem. We first introduce a new discrete weak curl operator and the MWG finite element space. Unlike traditional DG methods, the modified weak Galerkin method does not require a penalty parameter. We prove optimal error estimates in the energy norm. Finally, we provide numerical results to confirm these theoretical results.

preprint, 2022, arXiv

Beyond the Limitation of Pulse Width in Optical Time-domain Reflectometry

Optical time-domain reflectometry (OTDR) is the basis for distributed time-domain optical fiber sensing techniques. By injecting pulse light into an optical fiber, the distance information of an event can be obtained based on the time of light flight. The minimum distinguishable event separation along the fiber length is called the spatial resolution, which is determined by the optical pulse width. By reducing the pulse width, the spatial resolution can be improved. However, at the same time, the signal-to-noise ratio of the system is degraded, and higher speed equipment is required. To solve this problem, data processing methods such as iterative subdivision, deconvolution, and neural networks have been proposed. However, they all have some shortcomings and thus have not been widely applied. Here, we propose and experimentally demonstrate an OTDR deconvolution neural network based on deep convolutional neural networks. A simplified OTDR model is built to generate a large amount of training data. By optimizing the network structure and training data, an effective OTDR deconvolution is achieved. The simulation and experimental results show that the proposed neural network can achieve more accurate deconvolution than the conventional deconvolution algorithm with a higher signal-to-noise ratio.

preprint, 2022, arXiv

Improving Information Freshness via Backbone-Assisted Cooperative Access Points

Information freshness, characterized by age of information (AoI), is important for sensor applications involving timely status updates. In many cases, the wireless signals from one sensor can be received by multiple access points (APs). This paper investigates the average AoI for cooperative APs, in which they can share information through a wired backbone network. We first study a basic backbone-assisted COoperative AP (Co-AP) system where APs share only decoded packets. Experimental results on software-defined radios (SDR) indicate that Co-AP significantly improves the average AoI performance over a single-AP system. Next, we investigate an improved Co-AP system, called Soft-Co-AP. In addition to sharing decoded packets, Soft-Co-AP shares and collects soft information of packets that the APs fail to decode for further joint decoding. A critical issue in Soft-Co-AP is determining the number of quantization bits that represent the soft information (each soft bit) shared over the backbone. While more quantization bits per soft bit improves the joint decoding performance, it leads to higher backbone delay. We experimentally study the average AoI of Soft-Co-AP by evaluating the tradeoff between the backbone delay and the number of quantization bits. SDR experiments show that when the number of sensors is large, Soft-Co-AP further reduces the average AoI by 12% compared with Co-AP. Interestingly, good average AoI performance is usually achieved when the number of quantization bits per soft bit is neither too large nor too small.
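For readers unfamiliar with the metric, the time-average AoI used above is the area under the sawtooth age curve divided by the observation window. The sketch below computes it from a list of (generation time, delivery time) pairs. It assumes a fresh update is available at time 0; the function name and the sample values are hypothetical and are not taken from the paper's experiments.

```python
def average_aoi(updates, horizon):
    """Time-average age of information over [0, horizon].

    updates: list of (generated_at, delivered_at), sorted by delivered_at.
    Assumes a fresh update exists at time 0, so the age starts at 0.
    """
    area = 0.0
    last_time = 0.0  # time of the most recent delivery
    last_gen = 0.0   # generation time of the freshest delivered update
    for generated_at, delivered_at in updates:
        if delivered_at > horizon:
            break
        # Age grows linearly from (last_time - last_gen) until this delivery.
        start_age = last_time - last_gen
        dt = delivered_at - last_time
        area += start_age * dt + 0.5 * dt * dt
        last_time = delivered_at
        last_gen = max(last_gen, generated_at)  # a stale packet never lowers the age
    # Tail segment from the last delivery to the end of the window.
    start_age = last_time - last_gen
    dt = horizon - last_time
    area += start_age * dt + 0.5 * dt * dt
    return area / horizon


# Two updates, each delivered one time unit after generation.
avg = average_aoi([(1.0, 2.0), (3.0, 4.0)], horizon=6.0)
```

Every successful delivery, whether decoded by one AP or jointly, drops the age to the delivery delay of that packet, which is why extra decoding opportunities and extra backbone delay pull the average in opposite directions.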

preprint, 2022, arXiv

PASS: Part-Aware Self-Supervised Pre-Training for Person Re-Identification

In person re-identification (ReID), recent studies have shown that pre-training models on unlabelled person images is much better than pre-training on ImageNet. However, these studies directly apply existing self-supervised learning (SSL) methods designed for image classification to ReID without any adaptation of the framework. These SSL methods match the outputs of local views (e.g., red T-shirt, blue shorts) to those of the global views at the same time, losing many details. In this paper, we propose a ReID-specific pre-training method, Part-Aware Self-Supervised pre-training (PASS), which can generate part-level features to offer fine-grained information and is more suitable for ReID. PASS divides the images into several local areas, and the local views randomly cropped from each area are assigned a specific learnable [PART] token. In addition, the [PART]s of all local areas are also appended to the global views. PASS learns to match the output of the local views and global views on the same [PART]. That is, the learned [PART] of the local views from a local area is only matched with the corresponding [PART] learned from the global views. As a result, each [PART] can focus on a specific local area of the image and extract fine-grained information from this area. Experiments show that PASS sets new state-of-the-art performance on Market1501 and MSMT17 on various ReID tasks; e.g., a vanilla ViT-S/16 pre-trained by PASS achieves 92.2%/90.2%/88.5% mAP accuracy on Market1501 for supervised/UDA/USL ReID. Our code is available at https://github.com/CASIA-IVA-Lab/PASS-reID.

preprint, 2022, arXiv

Thermal Modelling and Controller Design of an Alkaline Electrolysis System under Dynamic Operating Conditions

Thermal management is vital for the efficient and safe operation of alkaline electrolysis systems. Traditional alkaline electrolysis systems use simple proportional-integral-derivative (PID) controllers to maintain the stack temperature near the rated value. However, in renewable-to-hydrogen scenarios, the stack temperature is disturbed by load fluctuations, and temperature overshoot can occur, exceeding the upper limit and harming the stack. This paper focuses on the thermal modelling and controller design of an alkaline electrolysis system under dynamic operating conditions. A control-oriented thermal model is established in the form of a third-order time-delay process, which is used for simulation and controller design. Based on this model, we propose two novel controllers to reduce temperature overshoot: one is a current feed-forward PID controller (PID-I), the other is a model predictive controller (MPC). Their performance is tested on a lab-scale system and the experimental results are satisfactory: the temperature overshoot is reduced by 2.2 degrees with the PID-I controller, and no obvious overshoot is observed with the MPC controller. Furthermore, the thermal dynamic performance of an MW-scale alkaline electrolysis system is analyzed by simulation, which shows that the temperature overshoot phenomenon is more general in large systems. The proposed method allows for higher temperature set points, which can improve system efficiency by 1%.

preprint, 2022, arXiv

Transfering Low-Frequency Features for Domain Adaptation

Previous unsupervised domain adaptation methods did not handle the cross-domain problem from the perspective of frequency for computer vision. The images or feature maps of different domains can be decomposed into the low-frequency component and high-frequency component. This paper proposes the assumption that low-frequency information is more domain-invariant while the high-frequency information contains domain-related information. Hence, we introduce an approach, named low-frequency module (LFM), to extract domain-invariant feature representations. The LFM is constructed with the digital Gaussian low-pass filter. Our method is easy to implement and introduces no extra hyperparameter. We design two effective ways to utilize the LFM for domain adaptation, and our method is complementary to other existing methods and formulated as a plug-and-play unit that can be combined with these methods. Experimental results demonstrate that our LFM outperforms state-of-the-art methods for various computer vision tasks, including image classification and object detection.
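The low/high-frequency split that the abstract relies on can be illustrated on a 1-D signal. This is a simplified sketch under stated assumptions: the paper works on images and feature maps, whereas the version below applies a spatial Gaussian kernel to a plain list, and the names `low_pass` and `split_frequencies` plus the default `sigma` and `radius` are hypothetical.

```python
import math


def gaussian_kernel(sigma, radius):
    """Normalised 1-D Gaussian weights for offsets -radius..radius."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def low_pass(signal, sigma=1.0, radius=3):
    """Gaussian smoothing with replicated edge values."""
    kernel = gaussian_kernel(sigma, radius)
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)  # clamp index at the borders
            acc += w * signal[j]
        out.append(acc)
    return out


def split_frequencies(signal, sigma=1.0, radius=3):
    """Return (low, high) components with low + high equal to the signal."""
    low = low_pass(signal, sigma, radius)
    high = [s - l for s, l in zip(signal, low)]
    return low, high


flat_low, flat_high = split_frequencies([2.0] * 8)
wiggle_low, wiggle_high = split_frequencies([1.0, -1.0] * 4)
```

A constant signal passes through the filter unchanged, while a rapidly alternating one is strongly attenuated and ends up mostly in the high-frequency residual.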

preprint, 2022, arXiv

Unconventional Excitonic States with Phonon Sidebands in Layered Silicon Diphosphide

Many-body interactions between quasiparticles (electrons, excitons, and phonons) have led to the emergence of new complex correlated states and are at the core of condensed matter physics and material science. In low-dimensional materials, unique electronic properties for these correlated states could significantly affect their optical properties. Herein, combining photoluminescence, optical reflection measurements and theoretical calculations, we demonstrate an unconventional excitonic state and its bound phonon sideband in layered silicon diphosphide (SiP$_2$), in which the bound electron-hole pair is composed of electrons confined within one-dimensional phosphorus-phosphorus chains and holes extended in two-dimensional SiP$_2$ layers. The excitonic state and the emergent phonon sideband show linear dichroism and large energy redshifts with increasing temperature. Within the $GW$ plus Bethe-Salpeter equation calculations and solving the generalized Holstein model non-perturbatively, we confirm that the observed sideband feature results from the correlated interaction between excitons and optical phonons. Such a layered material provides a new platform to study excitonic physics and many-particle effects.

preprint, 2022, arXiv

UniVIP: A Unified Framework for Self-Supervised Visual Pre-training

Self-supervised learning (SSL) holds promise in leveraging large amounts of unlabeled data. However, the success of popular SSL methods has been limited to single-centric-object images like those in ImageNet, and these methods ignore the correlation between the scene and its instances, as well as the semantic differences among instances in the scene. To address these problems, we propose Unified Self-supervised Visual Pre-training (UniVIP), a novel self-supervised framework to learn versatile visual representations on either single-centric-object or non-iconic datasets. The framework takes into account the representation learning at three levels: 1) the similarity of scene-scene, 2) the correlation of scene-instance, 3) the discrimination of instance-instance. During learning, we adopt the optimal transport algorithm to automatically measure the discrimination of instances. Extensive experiments show that UniVIP pre-trained on non-iconic COCO achieves state-of-the-art transfer performance on a variety of downstream tasks, such as image classification, semi-supervised learning, object detection and segmentation. Furthermore, our method can also exploit single-centric-object datasets such as ImageNet: it outperforms BYOL by 2.5% with the same pre-training epochs in linear probing and surpasses current self-supervised object detection methods on the COCO dataset, demonstrating its universality and potential.

preprint, 2021, arXiv

Asymmetrically interacting dynamics with mutual confirmation from multi-source on multiplex networks

In the early stage of epidemics, individuals' decisions to adopt protective measures, which can reduce their risk of infection and suppress disease spreading, are likely to depend on multiple information sources and their mutual confirmation, because exact information is inadequate. Here we introduce the inter-layer mutual confirmation mechanism into the information-disease interacting dynamics on multiplex networks. In our model, an individual increases the information transmission rate and willingness to adopt protective measures once he confirms the authenticity of news and severity of disease from neighbors' status in multiple layers. By using the microscopic Markov chain approach, we analytically calculate the epidemic threshold and the awareness and infected density in the stationary state, which agree well with simulation results. We find that the increment of the epidemic threshold when confirming the aware neighbors on the communication layer is larger than that of the contact layer. On the contrary, the confirmation of neighbors' awareness and infection from the contact layer leads to a lower final infection density and a higher awareness density than that of the communication layer. The results imply that individuals' explicit exposure of their infection and awareness status to neighbors, especially those with real contacts, is helpful in suppressing epidemic spreading.

preprint, 2021, arXiv

Identify Influential Spreaders in Asymmetrically Interacting Multiplex Networks

Identifying the most influential spreaders is important to understand and control the spreading process in a network. As many real-world complex systems can be modeled as multilayer networks, the question of identifying important nodes in multilayer networks has attracted much attention. Existing studies focus on the multilayer network structure, while neglecting how the structural and dynamical coupling of multiple layers influences the dynamical importance of nodes in the network. Here we investigate this question in information-disease coupled spreading dynamics on multiplex networks. First, we explicitly reveal that three interlayer coupling factors, namely the two-layer relative spreading speed, the interlayer coupling strength and the two-layer degree correlation, significantly impact the spreading influence of a node on the contact layer. The suppression effect from the information layer makes the structural centrality on the contact layer fail to predict the spreading influence of nodes in the multiplex network. Then, by mapping the coevolving spreading dynamics into a percolation process and using the message-passing approach, we propose a method to calculate the size of the disease outbreaks from a single seed node, which can be used to estimate the nodes' spreading influence in the coevolving dynamics. Our work provides insights on the importance of nodes in the multiplex network and gives a feasible framework to investigate influential spreaders in the asymmetrically coevolving dynamics.

preprint, 2020, arXiv

A Quantitative Analytical Model for Predicting and Optimizing the Rate Performance of Battery Cells

An important objective of designing lithium-ion rechargeable battery cells is to maximize their rate performance without compromising the energy density, which is mainly achieved through computationally expensive numerical simulations at present. Here we present a simple analytical model for predicting the rate performance of battery cells limited by electrolyte transport without any fitting parameters. It exhibits very good agreement with simulations over a wide range of discharge rate and electrode thickness and offers a speedup of >10$^5$ times. The optimal electrode properties predicted by the model are of less than 10% difference from simulation results, suggesting it as an attractive computational tool for the cell-level battery architecture design. The model also offers important insights on practical ways to improve the rate performance of thick electrodes, including avoiding electrode materials such as LiFePO$_4$ and Li$_4$Ti$_5$O$_{12}$ whose open-circuit potentials are insensitive to the state of charge and utilizing lithium metal anode to synergistically accelerate electrolyte transport within thick cathodes.

preprint, 2020, arXiv

Deep Reinforcement Learning for Task Offloading in Mobile Edge Computing Systems

In mobile edge computing systems, an edge node may have a high load when a large number of mobile devices offload their tasks to it. Those offloaded tasks may experience large processing delay or even be dropped when their deadlines expire. Due to the uncertain load dynamics at the edge nodes, it is challenging for each device to determine its offloading decision (i.e., whether to offload or not, and which edge node it should offload its task to) in a decentralized manner. In this work, we consider non-divisible and delay-sensitive tasks as well as edge load dynamics, and formulate a task offloading problem to minimize the expected long-term cost. We propose a model-free deep reinforcement learning-based distributed algorithm, where each device can determine its offloading decision without knowing the task models and offloading decision of other devices. To improve the estimation of the long-term cost in the algorithm, we incorporate the long short-term memory (LSTM), dueling deep Q-network (DQN), and double-DQN techniques. Simulation results with 50 mobile devices and five edge nodes show that the proposed algorithm can reduce the ratio of dropped tasks and average task delay by 86.4%-95.4% and 18.0%-30.1%, respectively, when compared with several existing algorithms.
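Two of the techniques named in the abstract above, dueling DQN and double DQN, have compact textbook forms. The sketch below shows them on plain lists in the usual reward-maximising convention (the paper minimises a cost, so its sign convention may differ). The LSTM component and the network training loop are omitted, and all function names and numbers are hypothetical.

```python
def dueling_q(value, advantages):
    """Combine a state value and per-action advantages into Q-values."""
    mean_adv = sum(advantages) / len(advantages)
    # Subtracting the mean advantage keeps value and advantages identifiable.
    return [value + a - mean_adv for a in advantages]


def double_dqn_target(reward, gamma, online_next_q, target_next_q, done):
    """Double-DQN target: the online net picks the action, the target net scores it."""
    if done:
        return reward
    best_action = max(range(len(online_next_q)), key=lambda a: online_next_q[a])
    return reward + gamma * target_next_q[best_action]


q_values = dueling_q(1.0, [1.0, -1.0, 0.0])
target = double_dqn_target(
    reward=1.0,
    gamma=0.5,
    online_next_q=[0.2, 0.9],  # online network prefers action 1
    target_next_q=[5.0, 2.0],  # target network would have preferred action 0
    done=False,
)
```

Decoupling action selection from action evaluation is what reduces the overestimation that a plain max over the target network's values would introduce. In this toy case the target is 2.0, where plain DQN would give 3.5.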

preprint, 2020, arXiv

Distributed Brillouin frequency shift extraction via a convolutional neural network

Distributed optical fiber Brillouin sensors detect the temperature and strain along a fiber according to the local Brillouin frequency shift, which is usually calculated from the measured Brillouin spectrum using Lorentzian curve fitting. In addition, cross-correlation, principal component analysis, and machine learning methods have been proposed for the more efficient extraction of Brillouin frequency shifts. However, existing methods only process each Brillouin spectrum individually, ignoring the correlation in the time domain, indicating that there is still room for improvement. Here, we propose and experimentally demonstrate a full convolution neural network to extract the distributed Brillouin frequency shift directly from the measured two-dimensional data. Simulated ideal Brillouin spectra with various parameters are used to train the network. Both the simulation and experimental results show that the extraction accuracy of the network is better than that of the traditional curve fitting algorithm, with a much shorter processing time. This network has good universality and robustness and can effectively improve the performance of existing Brillouin sensors.

preprint, 2020, arXiv

Identity-Guided Human Semantic Parsing for Person Re-Identification

Existing alignment-based methods have to employ pretrained human parsing models to achieve pixel-level alignment, and cannot identify the personal belongings (e.g., backpacks and reticule) which are crucial to person re-ID. In this paper, we propose the identity-guided human semantic parsing approach (ISP) to locate both the human body parts and personal belongings at pixel level for aligned person re-ID using only person identity labels. We design cascaded clustering on feature maps to generate the pseudo-labels of human parts. Specifically, for the pixels of all images of a person, we first group them into foreground or background and then group the foreground pixels into human parts. The cluster assignments are subsequently used as pseudo-labels of human parts to supervise the part estimation, and ISP iteratively learns the feature maps and groups them. Finally, local features of both human body parts and personal belongings are obtained according to the self-learned part estimation, and only features of visible parts are utilized for the retrieval. Extensive experiments on three widely used datasets validate the superiority of ISP over many state-of-the-art methods. Our code is available at https://github.com/CASIA-IVA-Lab/ISP-reID.

preprint2020arXiv

Learning Feature Embeddings for Discriminant Model based Tracking

After observing that the features used in most online discriminatively trained trackers are not optimal, in this paper, we propose a novel and effective architecture to learn optimal feature embeddings for online discriminative tracking. Our method, called DCFST, integrates the solver of a discriminant model that is differentiable and has a closed-form solution into convolutional neural networks. The resulting network can then be trained in an end-to-end way, obtaining optimal feature embeddings for the discriminant model-based tracker. As an instance, we apply the popular ridge regression model in this work to demonstrate the power of DCFST. Extensive experiments on six public benchmarks, OTB2015, NFS, GOT10k, TrackingNet, VOT2018, and VOT2019, show that our approach is efficient and generalizes well to class-agnostic target objects in online tracking, thus achieving state-of-the-art accuracy while running faster than real time. Code will be made available.
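
As background for the closed-form solver mentioned above, the standard ridge regression solution is reproduced here. The symbols are notation assumed for this note rather than taken from the paper: $X$ is a matrix of feature vectors, $y$ the regression targets, and $\lambda > 0$ the regularization weight.

```latex
\min_{w}\ \lVert Xw - y \rVert_2^2 + \lambda \lVert w \rVert_2^2
\quad\Longrightarrow\quad
w^{*} = \left(X^{\top}X + \lambda I\right)^{-1} X^{\top} y
```

Since $X^{\top}X + \lambda I$ is positive definite for $\lambda > 0$, the inverse always exists, and $w^{*}$ is a differentiable function of $X$. That property is what allows a loss on the solver's output to be backpropagated into the features that produce $X$.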

preprint2020arXiv

Non-Markovian recovery makes complex networks more resilient against large-scale failures

Non-Markovian spontaneous recovery processes with a time delay (memory) are ubiquitous in the real world. How does the non-Markovian characteristic affect failure propagation in complex networks? We consider failures due to internal causes at the nodal level and external failures due to an adverse environment, and develop a pair approximation analysis taking into account the two-node correlation. In general, a high failure stationary state can arise, corresponding to large-scale failures that can significantly compromise the functioning of the network. We uncover a striking phenomenon: memory associated with nodal recovery can counter-intuitively make the network more resilient against large-scale failures. In natural systems, the intrinsic non-Markovian characteristic of nodal recovery may thus be one reason for their resilience. In engineering design, incorporating certain non-Markovian features into the network may be beneficial to equipping it with a strong resilient capability to resist catastrophic failures.

preprint2020arXiv

Online Bitrate Selection for Viewport Adaptive 360-Degree Video Streaming

360-degree video streaming provides users with an immersive experience by letting them determine their fields of view (FoVs) in real time. To enhance the users' quality of experience (QoE) given their limited bandwidth, recent works have proposed a viewport adaptive 360-degree video streaming model that exploits bitrate adaptation in the spatial and temporal domains. Under this video streaming model, in this paper, we consider a scenario with a newly generated 360-degree video without viewing history from other users. To maximize the user's QoE, we propose an online bitrate selection algorithm, called OBS360. The proposed online algorithm can adapt to the unknown and heterogeneous users' FoVs and downloading capacities. We prove that the proposed algorithm achieves sublinear dynamic regret under a convex decision set. This suggests that as the number of video segments increases, the performance of the online algorithm approaches the performance of the offline algorithm, where the users' FoVs and downloading capacities are known. We perform simulations with a real-world dataset to evaluate the performance of the proposed algorithm. Results show that compared with several existing methods, our proposed algorithm can enhance the users' QoE significantly by improving the viewing bitrate and reducing the inter-segment and intra-segment degradation losses of the users.

preprint2020arXiv

Quantitative assessment of the role of undocumented infection in the 2019 novel coronavirus (COVID-19) pandemic

An urgent problem in controlling the spread of COVID-19 is to understand the role of undocumented infection. We develop a five-state model for COVID-19, taking into account the unique features of the novel coronavirus, with key parameters determined by the government reports and mathematical optimization. Tests using data from China, South Korea, Italy, and Iran indicate that the model is capable of generating accurate predictions of the daily accumulated number of confirmed cases and is entirely suitable for real-time prediction. The drastically disparate testing and diagnostic standards/policies among different countries lead to large variations in the estimated parameter values such as the duration of the outbreak, but such uncertainties have little effect on the occurrence time of the inflection point as predicted by the model, indicating its reliability and robustness. Model prediction for Italy suggests that insufficient government action leading to a large fraction of undocumented infection plays an important role in the abnormally high mortality in that country. With the data currently available from the United Kingdom, our model predicts catastrophic epidemic scenarios in the country if the government did not impose strict travel and social distancing restrictions. A key finding is that, if the percentage of undocumented infection exceeds a threshold, a non-negligible hidden population can exist even after the epidemic has been deemed over, implying the likelihood of future outbreaks should the currently imposed strict government actions be relaxed. This could make COVID-19 evolving into a long-term epidemic or a community disease a real possibility, suggesting the necessity to conduct universal testing and monitoring to identify the hidden individuals.

preprint2020arXiv

Self-Supervised Learning and Prediction of Microstructure Evolution with Recurrent Neural Networks

Microstructural evolution is a key aspect of understanding and exploiting the structure-property-performance relation of materials. Modeling microstructure evolution usually relies on coarse-grained simulations with evolution principles described by partial differential equations (PDEs). Here we demonstrate that convolutional recurrent neural networks can learn the underlying physical rules and replace PDE-based simulations in the prediction of microstructure phenomena. Neural nets are trained by self-supervised learning with image sequences from simulations of several common processes, including plane wave propagation, grain growth, spinodal decomposition and dendritic crystal growth. The trained networks can accurately predict both short-term local dynamics and long-term statistical properties of microstructures and are capable of extrapolating beyond the training datasets in spatiotemporal domains and configurational and parametric spaces. Such a data-driven approach offers significant advantages over PDE-based simulations in time stepping efficiency and offers a useful alternative especially when the material parameters or governing PDEs are not well determined.

preprint2020arXiv

Stress-Induced Intercalation Instability

We present a linear stability analysis to demonstrate that a flat coherent phase boundary formed by the (de)intercalation of solutes into a compound is unstable against perturbations with wavelengths larger than a critical wavelength. This critical wavelength is controlled by the competition between the interface energy and the elastic strain energy caused by the misfit between the solute-rich and solute-poor phases. It increases with the distance between the phase boundary and the free surface of the compound, and so the instability is most pronounced when the boundary is close to the surface at the early stage of the (de)intercalation process. Numerical calculations show that such instability leads to non-uniform intercalation behavior. We find that uniform intercalation cannot be achieved unless the phase boundary moves at a speed greater than a critical velocity. An estimate of the magnitude of this velocity suggests that the stress-induced intercalation instability is generally operative in intercalation compounds used for battery applications.

preprint2020arXiv

Thermodynamic Origin of Reaction Non-Uniformity in Battery Porous Electrodes and its Mitigation

The development of non-uniform reaction current distribution within porous electrodes is a ubiquitous phenomenon during battery charging/discharging and frequently controls the rate performance of battery cells. Reaction inhomogeneity in porous electrodes is usually attributed to the kinetic limitation of mass transport within the electrolyte and/or solid electrode phase. In this work, however, we reveal that it is also strongly influenced by the intrinsic thermodynamic behavior of electrode materials, specifically the dependence of the equilibrium potential on the state of charge: the electrode reaction becomes increasingly non-uniform when the slope of the equilibrium potential curve is reduced. We employ numerical simulation and an equivalent circuit model to elucidate this correlation and show that the degree of reaction inhomogeneity and the resultant discharge capacity can be predicted by a dimensionless reaction uniformity number. For electrode materials that have equilibrium potentials insensitive to the state of charge and exhibit significant reaction non-uniformity, we demonstrate several approaches to spatially homogenizing the reaction current inside porous electrodes, including matching the electronic and ionic resistances, introducing graded electronic conductivity and reducing the surface reaction kinetics.

preprint2019arXiv

Learning epidemic threshold in complex networks by Convolutional Neural Network

Deep learning has recently been applied to learning and identifying phase transitions in physical systems such as many-body quantum systems, whose underlying lattice structures are generally regular because they are embedded in Euclidean space. Real networks have complex structural features that play a significant role in the dynamics on them, and thus the structural and dynamical information of complex networks cannot be directly learned by existing neural network models. Here we propose a novel and effective framework to learn the epidemic threshold in complex networks by combining the structural and dynamical information into the learning procedure. Given its strong performance on learning in Euclidean space, a Convolutional Neural Network (CNN) is used and, with the help of the confusion scheme, we can precisely identify the outbreak threshold of epidemic dynamics. To represent the high-dimensional network data set in Euclidean space for the CNN, we reduce the dimensionality of a network by using graph representation learning algorithms and discretize the embedded space to convert it into an image-like structure. We then merge the nodal dynamical states with the structural embedding through multi-channel images. In this manner, the proposed model can draw conclusions from both structural and dynamical information. A large number of simulations show strong performance on both synthetic and empirical network data sets. Our end-to-end machine learning framework is robust and universally applicable to complex networks with arbitrary size and topology.
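
The discretization step described above can be pictured with a small sketch. It assumes each node already has a 2-D coordinate from some graph embedding method and a binary dynamical state (1 for infected, 0 otherwise). The function name, the grid size, and the two-channel layout (node count and infected count per cell) are hypothetical choices for illustration; the paper's actual channel design is not given in the abstract.

```python
def embedding_to_image(coords, states, grid_size):
    """Bin 2-D node embeddings into a grid_size x grid_size x 2 nested list.

    Channel 0 counts the nodes falling in each cell (structure);
    channel 1 counts the infected nodes in each cell (dynamics).
    """
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)

    def to_cell(value, low, high):
        # Map a coordinate to a cell index, clamping the upper edge.
        if high == low:
            return 0
        index = int((value - low) / (high - low) * grid_size)
        return min(index, grid_size - 1)

    image = [[[0, 0] for _ in range(grid_size)] for _ in range(grid_size)]
    for (x, y), state in zip(coords, states):
        row = to_cell(y, y_min, y_max)
        col = to_cell(x, x_min, x_max)
        image[row][col][0] += 1
        image[row][col][1] += state
    return image


# Four nodes with made-up embedding coordinates; two of them are infected.
coords = [(0.0, 0.0), (1.0, 1.0), (0.1, 0.1), (0.9, 0.0)]
states = [1, 0, 1, 0]
image = embedding_to_image(coords, states, grid_size=2)
```

The resulting array has the same shape as a small two-channel image, which is the form a CNN expects as input.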

preprint2019arXiv

Machine learning dynamical phase transitions in complex networks

In recent years, machine learning has been applied to complex networks, but most existing works concern structural properties. Using machine learning to detect phase transitions and accurately identify the critical transition point associated with dynamical processes on complex networks thus stands out as an open and significant problem. Here we develop a framework combining supervised and unsupervised learning, incorporating proper sampling of the training data set. In particular, using epidemic spreading dynamics on complex networks as a paradigmatic setting, we start from supervised learning alone and identify situations that degrade the performance. Overcoming these difficulties leads to the idea of exploiting the confusion scheme, effectively a combination of supervised and unsupervised learning. We demonstrate that the scheme performs well for identifying phase transitions associated with spreading dynamics on homogeneous networks, but the performance deteriorates for heterogeneous networks. Meeting this challenge leads to the realization that sampling the training data set is necessary for heterogeneous networks, and we test two sampling methods: one based on the hub nodes together with their neighbors and another based on the k-core of the network. The end result is a general machine learning framework for detecting phase transitions and accurately identifying the critical transition point, which is robust, computationally efficient, and universally applicable to complex networks of arbitrary size and topology. Extensive tests using synthetic and empirical networks verify the virtues of the articulated framework, opening the door to exploiting machine learning for understanding, detection, prediction, and control of complex dynamical systems in general.
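
For reference, the k-core on which the second sampling method is based is the largest subgraph in which every node has at least k neighbors inside the subgraph. A minimal sketch of the standard pruning procedure follows; it assumes an undirected graph given as a symmetric adjacency dictionary. The function name and the toy graph are hypothetical, and how the paper turns the k-core into training samples is not specified in the abstract.

```python
def k_core(adjacency, k):
    """Return the set of nodes in the k-core of an undirected graph.

    adjacency maps each node to an iterable of its neighbors and is
    assumed to be symmetric. Nodes with fewer than k remaining
    neighbors are pruned repeatedly until none are left.
    """
    remaining = {node: set(neighbors) for node, neighbors in adjacency.items()}
    removed = set()
    queue = [node for node, neighbors in remaining.items() if len(neighbors) < k]
    while queue:
        node = queue.pop()
        if node in removed:
            continue
        removed.add(node)
        for neighbor in remaining[node]:
            if neighbor in removed:
                continue
            remaining[neighbor].discard(node)
            if len(remaining[neighbor]) < k:
                queue.append(neighbor)
    return set(remaining) - removed


# Toy graph: a triangle (0, 1, 2) with a two-node tail (3, 4) hanging off node 2.
graph = {
    0: [1, 2],
    1: [0, 2],
    2: [0, 1, 3],
    3: [2, 4],
    4: [3],
}
core_nodes = k_core(graph, 2)
```

Here pruning removes node 4 first, which drops node 3 below two neighbors, leaving the triangle as the 2-core.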

preprint2019arXiv

Period-doubling bifurcation of dissipative-soliton-resonance pulses in a passively mode-locked fiber laser

We report on the experimental observation of period-doubling bifurcation of dissipative-soliton-resonance (DSR) pulses in a fiber laser passively mode-locked by a nonlinear optical loop mirror. By increasing the pump power of the fiber laser, we show that a temporally stable, uniform DSR pulse train can be transformed into a period-doubling state, exhibiting two sets of pulse parameters between adjacent cavity roundtrips. It is found that DSR pulses in the period-doubling state maintain the typical features of DSR pulses: fixed pulse peak power and linear variation in pulse width with respect to the pump power change. The mechanism for achieving period-doubling of DSR pulses is discussed.