Source author record

Jie Mei

Jie Mei appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Computer Vision, Machine Learning, Computation and Language, math.OC

Catalog footprint

What is connected

8 works

4 topics

4 close collaborators

preprint · 2026 · arXiv

A Cascaded Information Interaction Network for Precise Image Segmentation

Visual perception plays a pivotal role in enabling autonomous behavior, offering a cost-effective and efficient alternative to complex multi-sensor systems. However, robust segmentation remains a challenge in complex scenarios. To address this, this paper proposes a cascaded convolutional neural network integrated with a novel Global Information Guidance Module. This module is designed to effectively fuse low-level texture details with high-level semantic features across multiple layers, thereby overcoming the inherent limitations of single-scale feature extraction. This architectural innovation significantly enhances segmentation accuracy, particularly in visually cluttered or blurred environments where traditional methods often fail. Experimental evaluations on benchmark image segmentation datasets demonstrate that the proposed framework achieves superior precision, outperforming existing state-of-the-art methods. The results highlight the effectiveness of the approach and its promising potential for deployment in practical robotic applications.

preprint · 2022 · arXiv

HCIL: Hierarchical Class Incremental Learning for Longline Fishing Visual Monitoring

The goal of electronic monitoring of longline fishing is to visually monitor the fish catching activities on fishing vessels based on cameras, either for regulatory compliance or catch counting. The previous hierarchical classification method demonstrates efficient fish species identification of catches from longline fishing, where fishes are under severe deformation and self-occlusion during the catching process. Although the hierarchical classification mitigates the laborious efforts of human reviews by providing confidence scores in different hierarchical levels, its performance drops dramatically under the class incremental learning (CIL) scenario. A CIL system should be able to learn about more and more classes over time from a stream of data, i.e., only the training data for a small number of classes have to be present at the beginning and new classes can be added progressively. In this work, we introduce a Hierarchical Class Incremental Learning (HCIL) model, which significantly improves the state-of-the-art hierarchical classification methods under the CIL scenario.

preprint · 2022 · arXiv

NTIRE 2021 Multi-modal Aerial View Object Classification Challenge

In this paper, we introduce the first Challenge on Multi-modal Aerial View Object Classification (MAVOC) in conjunction with the NTIRE 2021 workshop at CVPR. This challenge is composed of two different tracks using EO and SAR imagery. Both EO and SAR sensors possess different advantages and drawbacks. The purpose of this competition is to analyze how to use both sets of sensory information in complementary ways. We discuss the top methods submitted for this competition and evaluate their results on our blind test set. Our challenge results show significant improvement of more than 15% accuracy from our current baselines for each track of the competition.

preprint · 2022 · arXiv

Unsupervised Severely Deformed Mesh Reconstruction (DMR) from a Single-View Image

Much progress has been made in the supervised learning of 3D reconstruction of rigid objects from multi-view images or a video. However, it is more challenging to reconstruct severely deformed objects from a single-view RGB image in an unsupervised manner. Although training-based methods, such as specific category-level training, have been shown to successfully reconstruct rigid objects and slightly deformed objects like birds from a single-view image, they cannot effectively handle severely deformed objects, nor can they be applied to some downstream tasks in the real world due to the inconsistent semantic meaning of vertices, which are crucial in defining the adopted 3D templates of objects to be reconstructed. In this work, we introduce a template-based method to infer 3D shapes from a single-view image and apply the reconstructed mesh to a downstream task, i.e., absolute length measurement. Without using 3D ground truth, our method faithfully reconstructs 3D meshes and achieves state-of-the-art accuracy in a length measurement task on a severely deformed fish dataset.

preprint · 2021 · arXiv

Absolute 3D Pose Estimation and Length Measurement of Severely Deformed Fish from Monocular Videos in Longline Fishing

Monocular absolute 3D fish pose estimation allows for efficient fish length measurement in the longline fisheries, where fishes are under severe deformation during the catching process. This task is challenging since it requires locating absolute 3D fish keypoints based on a short monocular video clip. Unlike related works, which either require expensive 3D ground-truth data and/or multiple-view images to provide depth information, or are limited to rigid objects, we propose a novel frame-based method to estimate the absolute 3D fish pose and fish length from a single-view 2D segmentation mask. We first introduce a relative 3D fish template. By minimizing an objective function, our method systematically estimates the relative 3D pose of the target fish and fish 2D keypoints in the image. Finally, with a closed-form solution, the relative 3D fish pose can help locate absolute 3D keypoints, resulting in the frame-based absolute fish length measurement, which is further refined based on the statistical temporal inference for the optimal fish length measurement from the video clip. Our experiments show that this method can accurately estimate the absolute 3D fish pose and further measure the absolute length, even outperforming the state-of-the-art multi-view method.

preprint · 2021 · arXiv

Video-based Hierarchical Species Classification for Longline Fishing Monitoring

The goal of electronic monitoring (EM) of longline fishing is to monitor the fish catching activities on fishing vessels, either for regulatory compliance or catch counting. Hierarchical classification based on videos allows for inexpensive and efficient fish species identification of catches from longline fishing, where fishes are under severe deformation and self-occlusion during the catching process. More importantly, the flexibility of hierarchical classification mitigates the laborious efforts of human reviews by providing confidence scores in different hierarchical levels. Some related works either use cascaded models for hierarchical classification or make predictions per image or predict one overlapping hierarchical data structure of the dataset in advance. However, with a known non-overlapping hierarchical data structure provided by fisheries scientists, our method enforces the hierarchical data structure and introduces an efficient training and inference strategy for video-based fisheries data. Our experiments show that the proposed method outperforms the classic flat classification system significantly and our ablation study justifies our contributions in CNN model design, training strategy, and the video-based inference schemes for the hierarchical fish species classification task.

preprint · 2016 · arXiv

Statistical Learning for OCR Text Correction

The accuracy of Optical Character Recognition (OCR) is crucial to the success of subsequent applications used in a text analysis pipeline. Recent models of OCR post-processing significantly improve the quality of OCR-generated text, but are still prone to suggest correction candidates from limited observations while insufficiently accounting for the characteristics of OCR errors. In this paper, we show how to enlarge the candidate suggestion space by using an external corpus and integrating OCR-specific features in a regression approach to correct OCR-generated errors. The evaluation results show that our model can correct 61.5% of the OCR-errors (considering the top 1 suggestion) and 71.5% of the OCR-errors (considering the top 3 suggestions), for cases where the theoretical correction upper-bound is 78%.
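To put the reported figures in perspective, the short sketch below expresses each correction rate as a share of the stated 78% upper bound. It assumes all three percentages are measured against the same set of OCR errors, which the abstract suggests but does not spell out; the function name `share_of_upper_bound` is hypothetical.

```python
def share_of_upper_bound(corrected_pct, upper_bound_pct):
    """Fraction of the theoretically correctable errors that were corrected.

    Assumption: both percentages share the same denominator (all OCR errors).
    """
    return corrected_pct / upper_bound_pct


# Figures quoted in the abstract above.
top1 = share_of_upper_bound(61.5, 78.0)
top3 = share_of_upper_bound(71.5, 78.0)

print(f"top-1: {top1:.1%} of the upper bound")  # top-1: 78.8% of the upper bound
print(f"top-3: {top3:.1%} of the upper bound")  # top-3: 91.7% of the upper bound
```

Read this way, the top-3 suggestions would recover roughly nine in ten of the errors that are correctable at all under that assumption.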

preprint · 2015 · arXiv

Fully Distributed Flocking with a Moving Leader for Lagrange Networks with Parametric Uncertainties

This paper addresses the leader-follower flocking problem with a moving leader for networked Lagrange systems with parametric uncertainties under a proximity graph. Here a group of followers move cohesively with the moving leader to maintain connectivity and avoid collisions for all time and also eventually achieve velocity matching. In the proximity graph, the neighbor relationship is defined according to the relative distance between each pair of agents. Each follower is able to obtain information from only the neighbors in its proximity, involving only local interaction. We consider two cases: i) the leader moves with a constant velocity, and ii) the leader moves with a varying velocity. In the first case, a distributed continuous adaptive control algorithm accounting for unknown parameters is proposed in combination with a distributed continuous estimator for each follower. In the second case, a distributed discontinuous adaptive control algorithm and estimator are proposed. Then the algorithm is extended to be fully distributed with the introduction of gain adaptation laws. In all proposed algorithms, only one-hop neighbors' information (e.g., the relative position and velocity measurements between the neighbors and the absolute position and velocity measurements) is required, and flocking is achieved as long as the connectivity and collision avoidance are ensured at the initial time and the control gains are designed properly. Numerical simulations are presented to illustrate the theoretical results.
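The proximity graph mentioned in the abstract can be illustrated with a minimal sketch: two agents are neighbors when their relative distance falls below a sensing radius. The radius value, the strict inequality, and the function name `proximity_neighbors` are assumptions made for illustration; the abstract does not give the paper's exact neighbor rule, and this sketch does not implement the control algorithms.

```python
import math


def proximity_neighbors(positions, radius):
    """Map each agent index to the set of agents closer than `radius`.

    Assumption: neighbors are defined by Euclidean distance < radius.
    """
    neighbors = {i: set() for i in range(len(positions))}
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            if math.dist(positions[i], positions[j]) < radius:
                # The relation is symmetric, so record it in both directions.
                neighbors[i].add(j)
                neighbors[j].add(i)
    return neighbors


# Three agents on a line; only the first two are within the assumed radius.
positions = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
graph = proximity_neighbors(positions, radius=1.5)
print(graph)  # {0: {1}, 1: {0}, 2: set()}
```

Because each agent only needs the distances to nearby agents, a graph of this kind matches the local, one-hop interaction described above; agent 2 in the example is disconnected, which is the situation connectivity maintenance is meant to prevent.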