Researcher profile

Jian Shi

Jian Shi contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
8works
0followers
8topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

8 published item(s)

preprint2026arXiv

Exploring Reliable Spatiotemporal Dependencies for Efficient Visual Tracking

Recent advances in transformer-based lightweight object tracking have established new standards across benchmarks, leveraging the global receptive field and powerful feature extraction capabilities of attention mechanisms. Despite these achievements, existing methods universally employ sparse sampling during training--utilizing only one template and one search image per sequence--which fails to comprehensively explore spatiotemporal information in videos. This limitation constrains performance and cause the gap between lightweight and high-performance trackers. To bridge this divide while maintaining real-time efficiency, we propose STDTrack, a framework that pioneers the integration of reliable spatiotemporal dependencies into lightweight trackers. Our approach implements dense video sampling to maximize spatiotemporal information utilization. We introduce a temporally propagating spatiotemporal token to guide per-frame feature extraction. To ensure comprehensive target state representation, we disign the Multi-frame Information Fusion Module (MFIFM), which augments current dependencies using historical context. The MFIFM operates on features stored in our constructed Spatiotemporal Token Maintainer (STM), where a quality-based update mechanism ensures information reliability. Considering the scale variation among tracking targets, we develop a multi-scale prediction head to dynamically adapt to objects of different sizes. Extensive experiments demonstrate state-of-the-art results across six benchmarks. Notably, on GOT-10k, STDTrack rivals certain high-performance non-real-time trackers (e.g., MixFormer) while operating at 192 FPS(GPU) and 41 FPS(CPU).

preprint2026arXiv

Will the Carbon Border Adjustment Mechanism Impact European Electricity Prices? A GNN-Based Network Analysis

The European Union's Carbon Border Adjustment Mechanism (CBAM) creates a complex challenge for the interconnected European electricity market. Traditional static analyses often miss the cross-border spillover effects that are vital for understanding this policy. This paper addresses this gap by developing a spatio-temporal Graph Neural Network (GNN) framework. It quantifies how CBAM affects electricity prices and carbon intensity (CI) at the same time. We modeled a subgraph of eight European countries. Our results suggest that CBAM is not just a uniform tax. Instead, it acts as a tool that transforms the market and creates structural differences. In our simulated scenarios, we observe that low-carbon countries like France and Switzerland can gain a competitive advantage. This suggests a potential decrease in their domestic electricity prices. Meanwhile, high-carbon countries like Poland face a double burden of rising costs. We identify the primary driver as a fundamental shift in the market's merit order.

preprint2022arXiv

DeepCatra: Learning Flow- and Graph-based Behaviors for Android Malware Detection

As Android malware is growing and evolving, deep learning has been introduced into malware detection, resulting in great effectiveness. Recent work is considering hybrid models and multi-view learning. However, they use only simple features, limiting the accuracy of these approaches in practice. In this paper, we propose DeepCatra, a multi-view learning approach for Android malware detection, whose model consists of a bidirectional LSTM (BiLSTM) and a graph neural network (GNN) as subnets. The two subnets rely on features extracted from statically computed call traces leading to critical APIs derived from public vulnerabilities. For each Android app, DeepCatra first constructs its call graph and computes call traces reaching critical APIs. Then, temporal opcode features used by the BiLSTM subnet are extracted from the call traces, while flow graph features used by the GNN subnet are constructed from all the call traces and inter-component communications. We evaluate the effectiveness of DeepCatra by comparing it with several state-of-the-art detection approaches. Experimental results on over 18,000 real-world apps and prevalent malware show that DeepCatra achieves considerable improvement, e.g., 2.7% to 14.6% on F1-measure, which demonstrates the feasibility of DeepCatra in practice.

preprint2022arXiv

Filtering electrons by mode coupling in finite semiconductor superlattices

Electron transmission through semiconductor superlattices is studied with transfer matrix method and resonance theory. The formation of electron band-pass transmission is ascribed to the coupling of different modes in those semiconductor superlattices with the symmetric unit cell. Upon Fabry-Pérot resonance condition, Bloch modes and two other resonant modes are identified to be related to the nature of the superlattice and its unit cell, respectively. The bands related to the unit cell and the superlattice overlap spontaneously in the tunneling region due to the shared wells, and the coupling of perfectly resonances results in the band-pass tunneling. Our findings provide a promising way to study electronic systems with more complicated superlattices or even optical systems with photonic crystals.

preprint2022arXiv

Semantic decomposition Network with Contrastive and Structural Constraints for Dental Plaque Segmentation

Segmenting dental plaque from images of medical reagent staining provides valuable information for diagnosis and the determination of follow-up treatment plan. However, accurate dental plaque segmentation is a challenging task that requires identifying teeth and dental plaque subjected to semantic-blur regions (i.e., confused boundaries in border regions between teeth and dental plaque) and complex variations of instance shapes, which are not fully addressed by existing methods. Therefore, we propose a semantic decomposition network (SDNet) that introduces two single-task branches to separately address the segmentation of teeth and dental plaque and designs additional constraints to learn category-specific features for each branch, thus facilitating the semantic decomposition and improving the performance of dental plaque segmentation. Specifically, SDNet learns two separate segmentation branches for teeth and dental plaque in a divide-and-conquer manner to decouple the entangled relation between them. Each branch that specifies a category tends to yield accurate segmentation. To help these two branches better focus on category-specific features, two constraint modules are further proposed: 1) contrastive constraint module (CCM) to learn discriminative feature representations by maximizing the distance between different category representations, so as to reduce the negative impact of semantic-blur regions on feature extraction; 2) structural constraint module (SCM) to provide complete structural information for dental plaque of various shapes by the supervision of an boundary-aware geometric constraint. Besides, we construct a large-scale open-source Stained Dental Plaque Segmentation dataset (SDPSeg), which provides high-quality annotations for teeth and dental plaque. Experimental results on SDPSeg datasets show SDNet achieves state-of-the-art performance.

preprint2022arXiv

TorMentor: Deterministic dynamic-path, data augmentations with fractals

We propose the use of fractals as a means of efficient data augmentation. Specifically, we employ plasma fractals for adapting global image augmentation transformations into continuous local transforms. We formulate the diamond square algorithm as a cascade of simple convolution operations allowing efficient computation of plasma fractals on the GPU. We present the TorMentor image augmentation framework that is totally modular and deterministic across images and point-clouds. All image augmentation operations can be combined through pipelining and random branching to form flow networks of arbitrary width and depth. We demonstrate the efficiency of the proposed approach with experiments on document image segmentation (binarization) with the DIBCO datasets. The proposed approach demonstrates superior performance to traditional image augmentation techniques. Finally, we use extended synthetic binary text images in a self-supervision regiment and outperform the same model when trained with limited data and simple extensions.

preprint2022arXiv

Upsampling Autoencoder for Self-Supervised Point Cloud Learning

In computer-aided design (CAD) community, the point cloud data is pervasively applied in reverse engineering, where the point cloud analysis plays an important role. While a large number of supervised learning methods have been proposed to handle the unordered point clouds and demonstrated their remarkable success, their performance and applicability are limited to the costly data annotation. In this work, we propose a novel self-supervised pretraining model for point cloud learning without human annotations, which relies solely on upsampling operation to perform feature learning of point cloud in an effective manner. The key premise of our approach is that upsampling operation encourages the network to capture both high-level semantic information and low-level geometric information of the point cloud, thus the downstream tasks such as classification and segmentation will benefit from the pre-trained model. Specifically, our method first conducts the random subsampling from the input point cloud at a low proportion e.g., 12.5%. Then, we feed them into an encoder-decoder architecture, where an encoder is devised to operate only on the subsampled points, along with a upsampling decoder is adopted to reconstruct the original point cloud based on the learned features. Finally, we design a novel joint loss function which enforces the upsampled points to be similar with the original point cloud and uniformly distributed on the underlying shape surface. By adopting the pre-trained encoder weights as initialisation of models for downstream tasks, we find that our UAE outperforms previous state-of-the-art methods in shape classification, part segmentation and point cloud upsampling tasks. Code will be made publicly available upon acceptance.

preprint2020arXiv

Multi-year Long-term Load Forecast for Area Distribution Feeders based on Selective Sequence Learning

Long-term load forecast (LTLF) for area distribution feeders is one of the most critical tasks frequently performed in electric distribution utility companies. For a specific planning area, cost-effective system upgrades can only be planned out based on accurate feeder LTLF results. In our previous research, we established a unique sequence prediction method which has the tremendous advantage of combining area top-down, feeder bottom-up and multi-year historical data all together for forecast and achieved a superior performance over various traditional methods by real-world tests. However, the previous method only focused on the forecast of the next one-year. In our current work, we significantly improved this method: the forecast can now be extended to a multi-year forecast window in the future; unsupervised learning techniques are used to group feeders by their load composition features to improve accuracy; we also propose a novel selective sequence learning mechanism which uses Gated Recurrent Unit network to not only learn how to predict sequence values but also learn to select the best-performing sequential configuration for each individual feeder. The proposed method was tested on an actual urban distribution system in West Canada. It was compared with traditional methods and our previous sequence prediction method. It demonstrates the best forecasting performance as well as the possibility of using sequence prediction models for multi-year component-level load forecast.