Source author record

Zhen He

Zhen He appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Computer Vision; math.CO; Databases; Artificial Intelligence; Distributed, Parallel, and Cluster Computing; eess.SY; Machine Learning; Systems and Control

Catalog footprint

What is connected

13 works

8 topics

4 close collaborators

preprint, 2024, arXiv

Group Activity Recognition using Unreliable Tracked Pose

Group activity recognition in video is a complex task due to the need for a model to recognise the actions of all individuals in the video and their complex interactions. Recent studies propose that optimal performance is achieved by individually tracking each person and subsequently inputting the sequence of poses or cropped images/optical flow into a model. This helps the model to recognise what actions each person is performing before they are merged to arrive at the group action class. However, all previous models are highly reliant on high quality tracking and have only been evaluated using ground truth tracking information. In practice it is almost impossible to achieve highly reliable tracking information for all individuals in a group activity video. We introduce an innovative deep learning-based group activity recognition approach called Rendered Pose based Group Activity Recognition System (RePGARS) which is designed to be tolerant of unreliable tracking and pose information. Experimental results confirm that RePGARS outperforms all existing group activity recognition algorithms tested which do not use ground truth detection and tracking information.

preprint, 2022, arXiv

Adaptive Smooth Disturbance Observer-Based Fast Finite-Time Attitude Tracking Control of a Small Unmanned Helicopter

In this paper, a novel adaptive smooth disturbance observer-based fast finite-time adaptive backstepping control scheme is presented for the attitude tracking of the 3-DOF helicopter system subject to compound disturbances. First, an adaptive smooth disturbance observer (ASDO) is proposed to estimate the composite disturbance; it features smooth output, fast finite-time convergence, and adaptability to disturbances whose derivative bound is unknown. Then, a finite-time backstepping control protocol is constructed to drive the elevation and pitch angles to track reference trajectories. To tackle the "explosion of complexity" and "singularity" problems in the conventional backstepping design framework, a fast finite-time command filter (FFTCF) is utilized to estimate the virtual control signal and its derivative. Moreover, a fractional power-based auxiliary dynamic system is introduced to compensate for the error caused by the FFTCF estimation. Furthermore, an improved fractional power-based adaptive law with the $\sigma$-modification term is designed to attenuate the observer approximation error, such that the tracking performance is further enhanced. According to fast finite-time stability theory, the signals of the closed-loop system are all fast finite-time bounded, while the attitude tracking errors converge quickly to a sufficiently small region around the origin in finite time. Finally, a comparative numerical simulation is carried out to validate the effectiveness and superiority of the designed control scheme.

preprint, 2022, arXiv

Automatic Generation of Product-Image Sequence in E-commerce

Product images are essential for providing a desirable user experience on an e-commerce platform. For a platform with billions of products, it is extremely time-consuming and labor-intensive to manually pick and organize qualified images. Furthermore, a product image must comply with numerous and complicated image rules in order to be generated or selected. To address these challenges, in this paper, we present a new learning framework to achieve Automatic Generation of Product-Image Sequence (AGPIS) in e-commerce. To this end, we propose a Multi-modality Unified Image-sequence Classifier (MUIsC), which is able to simultaneously detect all categories of rule violations through learning. MUIsC leverages textual review feedback as an additional training target and utilizes the product's textual description to provide extra semantic information. Based on offline evaluations, we show that the proposed MUIsC significantly outperforms various baselines. Besides MUIsC, we also integrate some other important modules in the proposed framework, such as primary image selection, noncompliant content detection, and image deduplication. With all these modules, our framework works effectively and efficiently in the JD.com recommendation platform. By Dec 2021, our AGPIS framework had generated high-standard images for about 1.5 million products, with a reject rate of 13.6%.

preprint, 2022, arXiv

Deep Graph Learning for Spatially-Varying Indoor Lighting Prediction

Lighting prediction from a single image is becoming increasingly important in many vision and augmented reality (AR) applications in which shading and shadow consistency between virtual and real objects should be guaranteed. However, this is a notoriously ill-posed problem, especially for indoor scenarios, because of the complexity of indoor luminaires and the limited information contained in 2D images. In this paper, we propose a graph learning-based framework for indoor lighting estimation. At its core is a new lighting model (dubbed DSGLight) based on depth-augmented Spherical Gaussians (SG) and a Graph Convolutional Network (GCN) that infers the new lighting representation from a single LDR image of limited field-of-view. Our lighting model builds 128 evenly distributed SGs over the indoor panorama, where each SG encodes the lighting and the depth around that node. The proposed GCN then learns the mapping from the input image to DSGLight. Compared with existing lighting models, our DSGLight encodes both direct lighting and indirect environmental lighting more faithfully and compactly. It also makes network training and inference more stable. The estimated depth distribution enables temporally stable shading and shadows under spatially-varying lighting. Through thorough experiments, we show that our method clearly outperforms existing methods both qualitatively and quantitatively.
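As a rough illustration of the kind of representation the abstract describes, the sketch below places 128 nodes on a sphere and evaluates a sum of spherical Gaussians in a given direction, using the standard SG form $a \exp(\lambda(v \cdot \xi - 1))$. The abstract does not say how the nodes are distributed or how depth enters the model, so the Fibonacci lattice, the per-node `depth` field, the sharpness value and all function names here are assumptions made for the example, not the DSGLight implementation.

```python
import math

def fibonacci_sphere(n):
    """Return n roughly evenly spaced unit vectors (assumed placement scheme)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n          # z strictly inside (-1, 1)
        r = math.sqrt(1.0 - z * z)
        phi = golden_angle * i
        points.append((r * math.cos(phi), r * math.sin(phi), z))
    return points

def sg_value(direction, axis, sharpness, amplitude):
    """Standard spherical Gaussian: amplitude * exp(sharpness * (v . axis - 1))."""
    dot = sum(d * a for d, a in zip(direction, axis))
    return amplitude * math.exp(sharpness * (dot - 1.0))

def radiance(direction, nodes):
    """Sum the contribution of every SG node in the given unit direction."""
    return sum(
        sg_value(direction, node["axis"], node["sharpness"], node["amplitude"])
        for node in nodes
    )

# Hypothetical lighting: 128 nodes, each with an amplitude and a depth value.
nodes = [
    {"axis": axis, "sharpness": 30.0, "amplitude": 1.0, "depth": 2.5}
    for axis in fibonacci_sphere(128)
]
print(radiance((0.0, 0.0, 1.0), nodes))
```

In this sketch the depth is only stored alongside each node; how a renderer would use it for spatially-varying shading is left open.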

preprint, 2022, arXiv

Exact results for generalized extremal problems forbidding an even cycle

We determine the maximum number of copies of $K_{s,s}$ in a $C_{2s+2}$-free $n$-vertex graph for all integers $s \ge 2$ and sufficiently large $n$. Moreover, for $s\in\{2,3\}$ and any integer $n$ we obtain the maximum number of cycles of length $2s$ in an $n$-vertex $C_{2s+2}$-free bipartite graph.

preprint, 2022, arXiv

Scenario-based Multi-product Advertising Copywriting Generation for E-Commerce

In this paper, we propose an automatic Scenario-based Multi-product Advertising Copywriting Generation system (SMPACG) for E-Commerce, which has been deployed on a leading Chinese e-commerce platform. The proposed SMPACG consists of two main components: 1) an automatic multi-product combination selection module, which itself consists of a topic prediction model, a pattern and attribute-based selection model and an arbitrator model; and 2) an automatic multi-product advertising copywriting generation module, which combines our proposed domain-specific pretrained language model and knowledge-based data enhancement model. The SMPACG is the first system that realizes automatic scenario-based multi-product advertising content generation, and it achieves significant improvements over other state-of-the-art methods. The SMPACG has been developed not only to serve our e-commerce recommendation system directly, but also to act as a real-time writing assistant tool for merchants.

preprint, 2022, arXiv

Spatial Transformer Network with Transfer Learning for Small-scale Fine-grained Skeleton-based Tai Chi Action Recognition

Human action recognition is a heavily investigated area in which the most prominent action recognition networks usually use large-scale coarse-grained datasets of daily human actions as inputs to demonstrate the superiority of their networks. We aim to recognize actions in our small-scale fine-grained Tai Chi action dataset using neural networks and propose a transfer-learning method that uses the NTU RGB+D dataset to pre-train our network. More specifically, the proposed method first uses the large-scale NTU RGB+D dataset to pre-train the Transformer-based network for action recognition to extract common features among human motion. Then we freeze the network weights except for the fully connected (FC) layer and take our Tai Chi actions as inputs to train only the initialized FC weights. Experimental results show that our general model pipeline can reach high accuracy in small-scale fine-grained Tai Chi action recognition even with few inputs, and demonstrate that our method achieves state-of-the-art performance compared with previous Tai Chi action recognition methods.

preprint, 2022, arXiv

Stability version of Dirac's theorem and its applications for generalized Turán problems

In 1952, Dirac proved that every $2$-connected $n$-vertex graph with minimum degree $k+1$ contains a cycle of length at least $\min\{n, 2(k+1)\}$. Here we obtain a stability version of this result by characterizing those graphs with minimum degree $k$ and circumference at most $2k+1$. We present applications of this result by obtaining generalized Turán numbers. In particular, for all $\ell \geq 5$ we determine how many copies of a five-cycle, as well as of a four-cycle, are necessary to guarantee that the graph has circumference larger than $\ell$. In addition, we give a new proof of Luo's Theorem for cliques using our stability result.
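To make the form of Dirac's theorem quoted in this abstract concrete, here is a small brute-force check on a single graph. It is only an illustration under assumptions: the helper name `circumference` and the choice of $K_{3,3}$ as the example are ours, vertices are assumed to be integers, and exhaustive search of this kind is feasible only for tiny graphs.

```python
def circumference(adj):
    """Length of the longest cycle in a simple graph {vertex: set(neighbours)}.

    Returns 0 if the graph has no cycle. Each cycle is explored from its
    smallest vertex, so vertices must be mutually comparable (ints here).
    """
    best = 0

    def extend(start, current, visited, length):
        nonlocal best
        for nxt in adj[current]:
            if nxt == start and length >= 3:
                # Closing the path gives a cycle with `length` edges.
                best = max(best, length)
            elif nxt not in visited and nxt > start:
                visited.add(nxt)
                extend(start, nxt, visited, length + 1)
                visited.remove(nxt)

    for s in adj:
        extend(s, s, {s}, 1)
    return best

# K_{3,3}: 2-connected, n = 6, minimum degree 3 = k + 1 with k = 2.
left, right = [0, 1, 2], [3, 4, 5]
k33 = {v: set() for v in left + right}
for u in left:
    for w in right:
        k33[u].add(w)
        k33[w].add(u)

n = len(k33)
k = min(len(nbrs) for nbrs in k33.values()) - 1
bound = min(n, 2 * (k + 1))
print(circumference(k33), ">=", bound)
```

For this graph the bound is $\min\{6, 6\} = 6$, and the search finds a cycle through all six vertices, so the inequality holds with equality.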

preprint, 2022, arXiv

The maximum number of copies of an even cycle in a planar graph

We resolve a conjecture of Cox and Martin by determining asymptotically for every $k\ge 2$ the maximum number of copies of $C_{2k}$ in an $n$-vertex planar graph.
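For readers unfamiliar with the quantity being maximized, the sketch below counts copies of $C_4$ (the case $k = 2$) in a small planar graph. It illustrates only what "number of copies" means; it is not the paper's method or its extremal construction, and the function names and the example graph $K_{2,5}$ are assumptions chosen for the illustration.

```python
from itertools import combinations
from math import comb

def count_four_cycles(adj):
    """Count 4-cycles in a simple graph given as {vertex: set(neighbours)}.

    Every 4-cycle a-b-c-d has two diagonal pairs, {a, c} and {b, d}, and
    each pair sees the other two vertices as common neighbours. Summing
    C(common, 2) over all vertex pairs therefore counts each cycle twice.
    """
    total = 0
    for u, v in combinations(sorted(adj), 2):
        common = len(adj[u] & adj[v])
        total += comb(common, 2)
    return total // 2

def complete_bipartite(a, b):
    """Adjacency sets of K_{a,b} on vertices 0..a-1 and a..a+b-1."""
    left, right = range(a), range(a, a + b)
    adj = {v: set() for v in range(a + b)}
    for u in left:
        for w in right:
            adj[u].add(w)
            adj[w].add(u)
    return adj

# K_{2,5} is planar; any two of its five degree-2 vertices close a 4-cycle
# with the two hubs, giving C(5, 2) = 10 copies of C_4.
print(count_four_cycles(complete_bipartite(2, 5)))
```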

preprint, 2021, arXiv

Pose is all you need: The pose only group activity recognition system (POGARS)

We introduce a novel deep learning based group activity recognition approach called the Pose Only Group Activity Recognition System (POGARS), designed to use only tracked poses of people to predict the performed group activity. In contrast to existing approaches for group activity recognition, POGARS uses 1D CNNs to learn spatiotemporal dynamics of individuals involved in a group activity and forgoes learning features from pixel data. The proposed model uses a spatial and temporal attention mechanism to infer person-wise importance and multi-task learning for simultaneously performing group and individual action classification. Experimental results confirm that POGARS achieves highly competitive results compared to state-of-the-art methods on a widely used public volleyball dataset despite only using tracked pose as input. Further, our experiments show that by using pose only as input, POGARS has better generalization capabilities compared to methods that use RGB as input.

preprint, 2020, arXiv

Distributed Training of Deep Learning Models: A Taxonomic Perspective

Distributed deep learning systems (DDLS) train deep neural network models by utilizing the distributed resources of a cluster. Developers of DDLS are required to make many decisions to process their particular workloads in their chosen environment efficiently. The advent of GPU-based deep learning, the ever-increasing size of datasets and deep neural network models, in combination with the bandwidth constraints that exist in cluster environments require developers of DDLS to be innovative in order to train high quality models quickly. Comparing DDLS side-by-side is difficult due to their extensive feature lists and architectural deviations. We aim to shine some light on the fundamental principles that are at work when training deep neural networks in a cluster of independent machines by analyzing the general properties associated with training deep learning models and how such workloads can be distributed in a cluster to achieve collaborative model training. Thereby we provide an overview of the different techniques that are used by contemporary DDLS and discuss their influence and implications on the training process. To conceptualize and compare DDLS, we group different techniques into categories, thus establishing a taxonomy of distributed deep learning systems.

preprint, 2017, arXiv

Evaluating the Dynamic Behavior of Database Applications

This paper explores the effect that changing access patterns has on the performance of database management systems. Changes in access patterns play an important role in determining the efficiency of key performance optimization techniques, such as dynamic clustering, prefetching, and buffer replacement. However, all existing benchmarks or evaluation frameworks produce static access patterns in which objects are always accessed in the same order repeatedly. Hence, we have proposed the Dynamic Evaluation Framework (DEF) that simulates access pattern changes using configurable styles of change. DEF has been designed to be open and fully extensible (e.g., new access pattern change models can be added easily). In this paper, we instantiate DEF into the Dynamic object Evaluation Framework (DoEF), which is designed for object databases, i.e., object-oriented or object-relational databases such as multi-media databases or most XML databases. The capabilities of DoEF have been evaluated by simulating the execution of four different dynamic clustering algorithms. The results confirm our analysis that flexible conservative re-clustering is the key in determining a clustering algorithm's ability to adapt to changes in access pattern. These results show the effectiveness of DoEF at determining the adaptability of each dynamic clustering algorithm to changes in access pattern in a simulation environment. In a second set of experiments, we have used DoEF to compare the performance of two real-life object stores: Platypus and SHORE. DoEF has helped to reveal the poor swapping performance of Platypus.

preprint, 2012, arXiv

Boosting Moving Object Indexing through Velocity Partitioning

There has been intense research interest in moving object indexing in the past decade. However, existing work did not exploit the important property of skewed velocity distributions. In many real world scenarios, objects travel predominantly along only a few directions. Examples include vehicles on road networks, flights, and people walking on the streets. The search space for a query is heavily dependent on the velocity distribution of the objects grouped in the nodes of an index tree. Motivated by this observation, we propose the velocity partitioning (VP) technique, which exploits the skew in velocity distribution to speed up query processing using moving object indexes. The VP technique first identifies the "dominant velocity axes (DVAs)" using a combination of principal components analysis (PCA) and k-means clustering. Then, a moving object index (e.g., a TPR-tree) is created based on each DVA, using the DVA as an axis of the underlying coordinate system. An object is maintained in the index whose DVA is closest to the object's current moving direction. Thus, all the objects in an index are moving in a near 1-dimensional space instead of a 2-dimensional space. As a result, the expansion of the search space with time is greatly reduced, from a quadratic function of the maximum speed (of the objects in the search range) to a near linear function of the maximum speed. The VP technique can be applied to a wide range of moving object index structures. We have implemented the VP technique on two representative ones, the TPR*-tree and the Bx-tree. Extensive experiments validate that the VP technique consistently improves the performance of those index structures.
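The partitioning step described in this abstract can be sketched roughly as follows. This is a simplified 2D illustration, not the authors' implementation: it alternates a k-means-style assignment of velocities to axes with a closed-form principal-axis update for each group, and the function names, the initial axes and the sample velocities are all assumptions made for the example.

```python
import math

def principal_axis(vectors):
    """Unit vector along the dominant direction of 2D vectors.

    Uses second moments about the origin; the angle of the leading
    eigenvector of [[sxx, sxy], [sxy, syy]] has a closed form in 2D.
    """
    sxx = sum(vx * vx for vx, vy in vectors)
    syy = sum(vy * vy for vx, vy in vectors)
    sxy = sum(vx * vy for vx, vy in vectors)
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (math.cos(theta), math.sin(theta))

def perp_distance(v, axis):
    """Distance from velocity v to the line through the origin along a unit axis."""
    return abs(v[0] * axis[1] - v[1] * axis[0])

def nearest_axis(v, axes):
    """Index of the axis closest to the direction of v."""
    return min(range(len(axes)), key=lambda i: perp_distance(v, axes[i]))

def find_dvas(velocities, init_axes, iterations=10):
    """Alternate assignment and axis update, in the spirit of k-means."""
    axes = list(init_axes)
    for _ in range(iterations):
        groups = [[] for _ in axes]
        for v in velocities:
            groups[nearest_axis(v, axes)].append(v)
        # Keep the old axis if a group ends up empty.
        axes = [principal_axis(g) if g else a for g, a in zip(groups, axes)]
    return axes

# Hypothetical traffic moving mostly east-west or north-south.
velocities = [(10, 0.5), (-9, -0.3), (12, -0.4), (8, 0.2),
              (0.3, 11), (-0.2, -10), (0.4, 9), (-0.1, -12)]
axes = find_dvas(velocities, init_axes=[(0.8, 0.6), (0.6, 0.8)])
partition = [nearest_axis(v, axes) for v in velocities]
print(axes, partition)
```

Each resulting group would then get its own index (for example a TPR-tree) built in a coordinate system aligned with its axis, so that objects in one index move in a nearly 1-dimensional space.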
