Source author record

Fengqing Zhu

Fengqing Zhu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Computer Vision eess.IV Machine Learning Artificial Intelligence

Catalog footprint

What is connected

13works

4topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Instance camera focus prediction for crystal agglomeration classification

Agglomeration refers to the process of crystal clustering due to interparticle forces. Crystal agglomeration analysis from microscopic images is challenging due to the inherent limitations of two-dimensional imaging. Overlapping crystals may appear connected even when located at different depth layers. Because optical microscopes have a shallow depth of field, crystals that are in-focus and out-of-focus in the same image typically reside on different depth layers and do not constitute true agglomeration. To address this, we first quantified camera focus with an instance camera focus prediction network to predict 2 class focus level that aligns better with visual observations than traditional image processing focus measures. Then an instance segmentation model is combined with the predicted focus level for agglomeration classification. Our proposed method has a higher agglomeration classification and segmentation accuracy than the baseline models on ammonium perchlorate crystal and sugar crystal dataset.

preprint2022arXiv

3DG-STFM: 3D Geometric Guided Student-Teacher Feature Matching

We tackle the essential task of finding dense visual correspondences between a pair of images. This is a challenging problem due to various factors such as poor texture, repetitive patterns, illumination variation, and motion blur in practical scenarios. In contrast to methods that use dense correspondence ground-truths as direct supervision for local feature matching training, we train 3DG-STFM: a multi-modal matching model (Teacher) to enforce the depth consistency under 3D dense correspondence supervision and transfer the knowledge to 2D unimodal matching model (Student). Both teacher and student models consist of two transformer-based matching modules that obtain dense correspondences in a coarse-to-fine manner. The teacher model guides the student model to learn RGB-induced depth information for the matching purpose on both coarse and fine branches. We also evaluate 3DG-STFM on a model compression task. To the best of our knowledge, 3DG-STFM is the first student-teacher learning method for the local feature matching task. The experiments show that our method outperforms state-of-the-art methods on indoor and outdoor camera pose estimations, and homography estimation problems. Code is available at: https://github.com/Ryan-prime/3DG-STFM.

preprint2022arXiv

Exemplar-free Online Continual Learning

Targeted for real world scenarios, online continual learning aims to learn new tasks from sequentially available data under the condition that each data is observed only once by the learner. Though recent works have made remarkable achievements by storing part of learned task data as exemplars for knowledge replay, the performance is greatly relied on the size of stored exemplars while the storage consumption is a significant constraint in continual learning. In addition, storing exemplars may not always be feasible for certain applications due to privacy concerns. In this work, we propose a novel exemplar-free method by leveraging nearest-class-mean (NCM) classifier where the class mean is estimated during training phase on all data seen so far through online mean update criteria. We focus on image classification task and conduct extensive experiments on benchmark datasets including CIFAR-100 and Food-1k. The results demonstrate that our method without using any exemplar outperforms state-of-the-art exemplar-based approaches with large margins under standard protocol (20 exemplars per class) and is able to achieve competitive performance even with larger exemplar size (100 exemplars per class).

preprint2022arXiv

Image Based Food Energy Estimation With Depth Domain Adaptation

Assessment of dietary intake has primarily relied on self-report instruments, which are prone to measurement errors. Dietary assessment methods have increasingly incorporated technological advances particularly mobile, image based approaches to address some of these limitations and further automation. Mobile, image-based methods can reduce user burden and bias by automatically estimating dietary intake from eating occasion images that are captured by mobile devices. In this paper, we propose an "Energy Density Map" which is a pixel-to-pixel mapping from the RGB image to the energy density of the food. We then incorporate the "Energy Density Map" with an associated depth map that is captured by a depth sensor to estimate the food energy. The proposed method is evaluated on the Nutrition5k dataset. Experimental results show improved results compared to baseline methods with an average error of 13.29 kCal and an average percentage error of 13.57% between the ground-truth and the estimated energy of the food.

preprint2022arXiv

Out-Of-Distribution Detection In Unsupervised Continual Learning

Unsupervised continual learning aims to learn new tasks incrementally without requiring human annotations. However, most existing methods, especially those targeted on image classification, only work in a simplified scenario by assuming all new data belong to new tasks, which is not realistic if the class labels are not provided. Therefore, to perform unsupervised continual learning in real life applications, an out-of-distribution detector is required at beginning to identify whether each new data corresponds to a new task or already learned tasks, which still remains under-explored yet. In this work, we formulate the problem for Out-of-distribution Detection in Unsupervised Continual Learning (OOD-UCL) with the corresponding evaluation protocol. In addition, we propose a novel OOD detection method by correcting the output bias at first and then enhancing the output confidence for in-distribution data based on task discriminativeness, which can be applied directly without modifying the learning procedures and objectives of continual learning. Our method is evaluated on CIFAR-100 dataset by following the proposed evaluation protocol and we show improved performance compared with existing OOD detection methods under the unsupervised continual learning scenario.

preprint2022arXiv

Simulating Personal Food Consumption Patterns using a Modified Markov Chain

Food image classification serves as the foundation of image-based dietary assessment to predict food categories. Since there are many different food classes in real life, conventional models cannot achieve sufficiently high accuracy. Personalized classifiers aim to largely improve the accuracy of food image classification for each individual. However, a lack of public personal food consumption data proves to be a challenge for training such models. To address this issue, we propose a novel framework to simulate personal food consumption data patterns, leveraging the use of a modified Markov chain model and self-supervised learning. Our method is capable of creating an accurate future data pattern from a limited amount of initial data, and our simulated data patterns can be closely correlated with the initial data pattern. Furthermore, we use Dynamic Time Warping distance and Kullback-Leibler divergence as metrics to evaluate the effectiveness of our method on the public Food-101 dataset. Our experimental results demonstrate promising performance compared with random simulation and the original Markov chain method.

preprint2022arXiv

Towards the Creation of a Nutrition and Food Group Based Image Database

Food classification is critical to the analysis of nutrients comprising foods reported in dietary assessment. Advances in mobile and wearable sensors, combined with new image based methods, particularly deep learning based approaches, have shown great promise to improve the accuracy of food classification to assess dietary intake. However, these approaches are data-hungry and their performances are heavily reliant on the quantity and quality of the available datasets for training the food classification model. Existing food image datasets are not suitable for fine-grained food classification and the following nutrition analysis as they lack fine-grained and transparently derived food group based identification which are often provided by trained dietitians with expert domain knowledge. In this paper, we propose a framework to create a nutrition and food group based image database that contains both visual and hierarchical food categorization information to enhance links to the nutrient profile of each food. We design a protocol for linking food group based food codes in the U.S. Department of Agriculture's (USDA) Food and Nutrient Database for Dietary Studies (FNDDS) to a food image dataset, and implement a web-based annotation tool for efficient deployment of this protocol.Our proposed method is used to build a nutrition and food group based image database including 16,114 food images representing the 74 most frequently consumed What We Eat in America (WWEIA) food sub-categories in the United States with 1,865 USDA food code matched to a nutrient database, the USDA FNDDS nutrient database.

preprint2021arXiv

Advances In Video Compression System Using Deep Neural Network: A Review And Case Studies

Significant advances in video compression system have been made in the past several decades to satisfy the nearly exponential growth of Internet-scale video traffic. From the application perspective, we have identified three major functional blocks including pre-processing, coding, and post-processing, that have been continuously investigated to maximize the end-user quality of experience (QoE) under a limited bit rate budget. Recently, artificial intelligence (AI) powered techniques have shown great potential to further increase the efficiency of the aforementioned functional blocks, both individually and jointly. In this article, we review extensively recent technical advances in video compression system, with an emphasis on deep neural network (DNN)-based approaches; and then present three comprehensive case studies. On pre-processing, we show a switchable texture-based video coding example that leverages DNN-based scene understanding to extract semantic areas for the improvement of subsequent video coder. On coding, we present an end-to-end neural video coding framework that takes advantage of the stacked DNNs to efficiently and compactly code input raw videos via fully data-driven learning. On post-processing, we demonstrate two neural adaptive filters to respectively facilitate the in-loop and post filtering for the enhancement of compressed frames. Finally, a companion website hosting the contents developed in this work can be accessed publicly at https://purdueviper.github.io/dnn-coding/.

preprint2021arXiv

An End-to-End Food Image Analysis System

Modern deep learning techniques have enabled advances in image-based dietary assessment such as food recognition and food portion size estimation. Valuable information on the types of foods and the amount consumed are crucial for prevention of many chronic diseases. However, existing methods for automated image-based food analysis are neither end-to-end nor are capable of processing multiple tasks (e.g., recognition and portion estimation) together, making it difficult to apply to real life applications. In this paper, we propose an image-based food analysis framework that integrates food localization, classification and portion size estimation. Our proposed framework is end-to-end, i.e., the input can be an arbitrary food image containing multiple food items and our system can localize each single food item with its corresponding predicted food type and portion size. We also improve the single food portion estimation by consolidating localization results with a food energy distribution map obtained by conditional GAN to generate a four-channel RGB-Distribution image. Our end-to-end framework is evaluated on a real life food image dataset collected from a nutrition feeding study.

preprint2021arXiv

Saliency-Aware Class-Agnostic Food Image Segmentation

Advances in image-based dietary assessment methods have allowed nutrition professionals and researchers to improve the accuracy of dietary assessment, where images of food consumed are captured using smartphones or wearable devices. These images are then analyzed using computer vision methods to estimate energy and nutrition content of the foods. Food image segmentation, which determines the regions in an image where foods are located, plays an important role in this process. Current methods are data dependent, thus cannot generalize well for different food types. To address this problem, we propose a class-agnostic food image segmentation method. Our method uses a pair of eating scene images, one before start eating and one after eating is completed. Using information from both the before and after eating images, we can segment food images by finding the salient missing objects without any prior information about the food class. We model a paradigm of top down saliency which guides the attention of the human visual system (HVS) based on a task to find the salient missing objects in a pair of images. Our method is validated on food images collected from a dietary study which showed promising results.

preprint2021arXiv

Turkey Behavior Identification System with a GUI Using Deep Learning and Video Analytics

In this paper, we propose a video analytics system to identify the behavior of turkeys. Turkey behavior provides evidence to assess turkey welfare, which can be negatively impacted by uncomfortable ambient temperature and various diseases. In particular, healthy and sick turkeys behave differently in terms of the duration and frequency of activities such as eating, drinking, preening, and aggressive interactions. Our system incorporates recent advances in object detection and tracking to automate the process of identifying and analyzing turkey behavior captured by commercial grade cameras. We combine deep-learning and traditional image processing methods to address challenges in this practical agricultural problem. Our system also includes a web-based user interface to create visualization of automated analysis results. Together, we provide an improved tool for turkey researchers to assess turkey welfare without the time-consuming and labor-intensive manual inspection.

preprint2020arXiv

Deepfakes Detection with Automatic Face Weighting

Altered and manipulated multimedia is increasingly present and widely distributed via social media platforms. Advanced video manipulation tools enable the generation of highly realistic-looking altered multimedia. While many methods have been presented to detect manipulations, most of them fail when evaluated with data outside of the datasets used in research environments. In order to address this problem, the Deepfake Detection Challenge (DFDC) provides a large dataset of videos containing realistic manipulations and an evaluation system that ensures that methods work quickly and accurately, even when faced with challenging data. In this paper, we introduce a method based on convolutional neural networks (CNNs) and recurrent neural networks (RNNs) that extracts visual and temporal features from faces present in videos to accurately detect manipulations. The method is evaluated with the DFDC dataset, providing competitive results compared to other techniques.

preprint2020arXiv

Multi-Task Image-Based Dietary Assessment for Food Recognition and Portion Size Estimation

Deep learning based methods have achieved impressive results in many applications for image-based diet assessment such as food classification and food portion size estimation. However, existing methods only focus on one task at a time, making it difficult to apply in real life when multiple tasks need to be processed together. In this work, we propose an end-to-end multi-task framework that can achieve both food classification and food portion size estimation. We introduce a food image dataset collected from a nutrition study where the groundtruth food portion is provided by registered dietitians. The multi-task learning uses L2-norm based soft parameter sharing to train the classification and regression tasks simultaneously. We also propose the use of cross-domain feature adaptation together with normalization to further improve the performance of food portion size estimation. Our results outperforms the baseline methods for both classification accuracy and mean absolute error for portion estimation, which shows great potential for advancing the field of image-based dietary assessment.

Fengqing Zhu

What is connected

Connect this record

See the researcher in context

Building this map preview

13 published item(s)

Instance camera focus prediction for crystal agglomeration classification

3DG-STFM: 3D Geometric Guided Student-Teacher Feature Matching

Exemplar-free Online Continual Learning

Image Based Food Energy Estimation With Depth Domain Adaptation

Out-Of-Distribution Detection In Unsupervised Continual Learning

Simulating Personal Food Consumption Patterns using a Modified Markov Chain

Towards the Creation of a Nutrition and Food Group Based Image Database

Advances In Video Compression System Using Deep Neural Network: A Review And Case Studies

An End-to-End Food Image Analysis System

Saliency-Aware Class-Agnostic Food Image Segmentation

Turkey Behavior Identification System with a GUI Using Deep Learning and Video Analytics

Deepfakes Detection with Automatic Face Weighting

Multi-Task Image-Based Dietary Assessment for Food Recognition and Portion Size Estimation