Source author record

Xiaodong Chen

Xiaodong Chen appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Computer Vision; physics.flu-dyn; Artificial Intelligence; cond-mat.mtrl-sci; Distributed, Parallel, and Cluster Computing; Machine Learning; math.CO

Catalog footprint

What is connected

9 works

7 topics

4 close collaborators


preprint, 2022, arXiv

A Baseline Framework for Part-level Action Parsing and Action Recognition

This technical report introduces our 2nd place solution to the Kinetics-TPS Track on Part-level Action Parsing at the ICCV DeeperAction Workshop 2021. Our entry is mainly based on YOLOF for instance and part detection, HRNet for human pose estimation, and CSN for video-level action recognition and frame-level part state parsing. We describe technical details for the Kinetics-TPS dataset, together with some experimental results. In the competition, we achieved 61.37% mAP on the test set of Kinetics-TPS.

preprint, 2022, arXiv

MAPLE: Masked Pseudo-Labeling autoEncoder for Semi-supervised Point Cloud Action Recognition

Recognizing human actions from point cloud videos has attracted tremendous attention from both academia and industry due to its wide applications like automatic driving, robotics, and so on. However, current methods for point cloud action recognition usually require a huge amount of data with manual annotations and a complex backbone network with high computation costs, which makes it impractical for real-world applications. Therefore, this paper considers the task of semi-supervised point cloud action recognition. We propose a Masked Pseudo-Labeling autoEncoder (MAPLE) framework to learn effective representations with much fewer annotations for point cloud action recognition. In particular, we design a novel and efficient Decoupled spatial-temporal TransFormer (DestFormer) as the backbone of MAPLE. In DestFormer, the spatial and temporal dimensions of the 4D point cloud videos are decoupled to achieve efficient self-attention for learning both long-term and short-term features. Moreover, to learn discriminative features from fewer annotations, we design a masked pseudo-labeling autoencoder structure to guide the DestFormer to reconstruct features of masked frames from the available frames. More importantly, for unlabeled data, we exploit the pseudo-labels from the classification head as the supervision signal for the reconstruction of features from the masked frames. Finally, comprehensive experiments demonstrate that MAPLE achieves superior results on three public benchmarks and outperforms the state-of-the-art method by 8.08% accuracy on the MSR-Action3D dataset.
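To make the efficiency argument for decoupling concrete, the sketch below counts pairwise attention scores for a generic factorized scheme. It is an illustration only: the abstract does not specify DestFormer's layer layout, so the function names, the assumption of one spatial pass per frame plus one temporal pass per point, and the example sizes (24 frames, 64 points) are all hypothetical.

```python
def joint_attention_pairs(frames: int, points: int) -> int:
    """Score count when every token attends to every token in the whole clip."""
    tokens = frames * points
    return tokens * tokens


def decoupled_attention_pairs(frames: int, points: int) -> int:
    """Score count for an assumed factorized scheme (not confirmed by the source):
    spatial attention inside each frame, then temporal attention along each point track."""
    spatial = frames * points * points   # one N x N score matrix per frame
    temporal = points * frames * frames  # one T x T score matrix per point track
    return spatial + temporal


if __name__ == "__main__":
    T, N = 24, 64  # hypothetical clip size, chosen only for illustration
    joint = joint_attention_pairs(T, N)
    split = decoupled_attention_pairs(T, N)
    print(f"joint: {joint}, decoupled: {split}, ratio: {joint / split:.1f}x")
```

Under these assumptions the joint cost grows as (T·N)², while the decoupled cost grows as T·N² + N·T², which is why the gap widens for longer clips.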

preprint, 2022, arXiv

MLPerf Mobile Inference Benchmark

This paper presents the first industry-standard open-source machine learning (ML) benchmark to allow performance and accuracy evaluation of mobile devices with different AI chips and software stacks. The benchmark draws from the expertise of leading mobile-SoC vendors, ML-framework providers, and model producers. It comprises a suite of models that operate with standard data sets, quality metrics and run rules. We describe the design and implementation of this domain-specific ML benchmark. The current benchmark version comes as a mobile app for different computer vision and natural language processing tasks. The benchmark also supports non-smartphone devices, such as laptops and mobile PCs. Benchmark results from the first two rounds reveal the overwhelming complexity of the underlying mobile ML system stack, emphasizing the need for transparency in mobile ML performance analysis. The results also show that the strides being made all through the ML stack improve performance. Within six months, offline throughput improved by 3x, while latency reduced by as much as 12x. ML is an evolving field with changing use cases, models, data sets and quality targets. MLPerf Mobile will evolve and serve as an open-source community framework to guide research and innovation for mobile AI.

preprint, 2022, arXiv

Part-level Action Parsing via a Pose-guided Coarse-to-Fine Framework

Action recognition from videos, i.e., classifying a video into one of the pre-defined action types, has been a popular topic in the communities of artificial intelligence, multimedia, and signal processing. However, existing methods usually consider an input video as a whole and learn models, e.g., Convolutional Neural Networks (CNNs), with coarse video-level class labels. These methods can only output an action class for the video, but cannot provide fine-grained and explainable cues to answer why the video shows a specific action. Therefore, researchers have started to focus on a new task, Part-level Action Parsing (PAP), which aims not only to predict the video-level action but also to recognize the frame-level fine-grained actions or interactions of body parts for each person in the video. To this end, we propose a coarse-to-fine framework for this challenging task. In particular, our framework first predicts the video-level class of the input video, then localizes the body parts and predicts the part-level action. Moreover, to balance accuracy and computation in part-level action parsing, we propose to recognize the part-level actions from segment-level features. Furthermore, to overcome the ambiguity of body parts, we propose a pose-guided positional embedding method to accurately localize body parts. Through comprehensive experiments on a large-scale dataset, i.e., Kinetics-TPS, our framework achieves state-of-the-art performance and outperforms existing methods by more than 31.10% in ROC score.

preprint, 2022, arXiv

The crossing number of the complete 4-partite graph $K_{1,1,m,n}$

Let $\textrm{cr}(G)$ denote the crossing number of a graph $G$. The well-known Zarankiewicz's conjecture (ZC) asserted the value of $\textrm{cr}(K_{m,n})$ in 1954. In 1971, Harborth gave a conjecture (HC) on $\textrm{cr}(K_{x_1,...,x_n})$. In 2021, Ho et al. verified HC on $K_{1,m,n}$ under the assumption that ZC is true. In this paper, we show the following results. If both $m$ and $n$ are even, then \[\textrm{cr}(K_{1,1,m,n})\geq \frac{1}{2}(\textrm{cr}(K_{m+1,n+3})+\textrm{cr}(K_{m+3,n+1})-mn-\frac{1}{4}(m^2+n^2));\] If both $m$ and $n$ are odd, then \[\textrm{cr}(K_{1,1,m,n})\geq \frac{1}{2}(\textrm{cr}(K_{1,m+1,n+1})+\textrm{cr}(K_{2,m,n})-\frac{1}{4}(m+1)(n+1)+1);\] If $m$ is even and $n$ is odd, then \[\textrm{cr}(K_{1,1,m,n})\geq \frac{1}{4}(\textrm{cr}(K_{m+1,n+2})+\textrm{cr}(K_{m+3,n+2})+2\textrm{cr}(K_{2,m,n})-m(n+1)-\frac{1}{4}(n+1)^2).\] The lower bounds in our result imply that if both $m$ and $n$ are even and ZC is true, then HC on $K_{1,1,m,n}$ holds; if at least one of $m$ and $n$ is odd and both ZC and HC on $K_{2,m,n}$ are true, then HC on $K_{1,1,m,n}$ holds.
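As a worked illustration of the even case, the sketch below evaluates the first lower bound under the assumption that ZC holds, so that $\textrm{cr}(K_{m,n})$ equals Zarankiewicz's conjectured value $\lfloor m/2\rfloor\lfloor (m-1)/2\rfloor\lfloor n/2\rfloor\lfloor (n-1)/2\rfloor$. The function names are hypothetical and the code is not taken from the paper; it only substitutes that closed form into the stated inequality.

```python
from fractions import Fraction


def zarankiewicz(m: int, n: int) -> int:
    """Conjectured crossing number of K_{m,n} (Zarankiewicz's formula)."""
    return (m // 2) * ((m - 1) // 2) * (n // 2) * ((n - 1) // 2)


def lower_bound_even(m: int, n: int) -> Fraction:
    """Lower bound on cr(K_{1,1,m,n}) for even m and n, ASSUMING ZC is true.

    Evaluates (1/2) * (cr(K_{m+1,n+3}) + cr(K_{m+3,n+1}) - m*n - (m^2 + n^2)/4).
    """
    if m % 2 or n % 2:
        raise ValueError("this bound is stated only for even m and n")
    inner = (
        zarankiewicz(m + 1, n + 3)
        + zarankiewicz(m + 3, n + 1)
        - m * n
        - Fraction(m * m + n * n, 4)  # exact rational arithmetic, no rounding
    )
    return inner / 2


if __name__ == "__main__":
    for m, n in [(2, 2), (2, 4), (4, 4)]:
        print(f"m={m}, n={n}: cr(K_(1,1,{m},{n})) >= {lower_bound_even(m, n)}")
```

For $m=n=2$ the bound evaluates to 1, consistent with $K_{1,1,2,2}$ being the planar octahedron graph $K_{2,2,2}$ plus a single extra edge.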

preprint, 2013, arXiv

Single-Layer MoS2 Phototransistors

A new phototransistor based on a mechanically exfoliated single-layer MoS2 nanosheet is fabricated and its light-induced electric properties are investigated in detail. Photocurrent generated from the phototransistor is solely determined by the illuminated optical power at a constant drain or gate voltage. The switching behavior of photocurrent generation and annihilation can be completed within ca. 50 ms and it shows good stability. In particular, the single-layer MoS2 phototransistor exhibits better photoresponsivity than the graphene-based device. The unique characteristics of incident-light control, prompt photoswitching and good photoresponsivity of the MoS2 phototransistor pave the way for developing single-layer semiconducting materials for multi-functional optoelectronic device applications in the future.

preprint, 2012, arXiv

Bouncing, Helical and Buckling Instabilities During Droplet Collision: Newtonian and Non-Newtonian Liquids

In this video, a ray-tracing data visualization technique was used to obtain realistic and detailed flow motions during droplet collision. The collision outcomes of Newtonian and non-Newtonian liquids were compared. Various types of droplet collision were presented, including bouncing, coalescence, and stretching separation. Because shear stress reduces the equivalent viscosity, the gas film between shear-thinning droplets is thinner than that between Newtonian droplets. Since a thinner gas film promotes coalescence, the shear-thinning liquid has a smaller bouncing regime in the diagram of Weber number versus impact parameter. During the ligament/thread breakup process of stretching separation, two kinds of instabilities are identified: helical and buckling. The helical instability is analogous to that of a viscous rotating liquid jet, while the buckling instability is analogous to that of electrically charged liquid jets of polymer solutions.
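The regime diagram mentioned above is indexed by two dimensionless numbers. The abstract does not define them, so the sketch below assumes the conventional definitions for equal-sized droplets: Weber number We = ρU²D/σ and impact parameter B = χ/D, where χ is the perpendicular offset between the droplet centers' lines of motion. The function names and the sample fluid properties are illustrative assumptions, not values from the video.

```python
def weber_number(density: float, relative_velocity: float,
                 diameter: float, surface_tension: float) -> float:
    """We = rho * U^2 * D / sigma (ratio of inertial to surface-tension forces), SI units."""
    if surface_tension <= 0 or diameter <= 0:
        raise ValueError("surface tension and diameter must be positive")
    return density * relative_velocity ** 2 * diameter / surface_tension


def impact_parameter(offset: float, diameter: float) -> float:
    """B = chi / D for equal-sized droplets: 0 is head-on, 1 is a grazing collision."""
    if diameter <= 0:
        raise ValueError("diameter must be positive")
    return offset / diameter


if __name__ == "__main__":
    # Hypothetical case: 1 mm water-like droplets colliding at 1 m/s relative velocity.
    we = weber_number(density=1000.0, relative_velocity=1.0,
                      diameter=1e-3, surface_tension=0.072)
    b = impact_parameter(offset=0.5e-3, diameter=1e-3)
    print(f"We = {we:.2f}, B = {b:.2f}")
```

Each collision then maps to one point (We, B) in the diagram, and outcomes such as bouncing, coalescence, and stretching separation occupy different regions of it.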

preprint, 2012, arXiv

Impinging Jet Dynamics

In this fluid dynamics video, a ray-tracing data visualization technique was used to obtain realistic and detailed flow motions during the impingement of two liquid jets. Different patterns of sheet and rim configurations were presented to shed light on the underlying physics, including liquid chain, closed rim, open rim, unstable rim and flapping sheet. In addition, stationary asymmetrical waves were observed and compared with existing theories. The generation of stationary capillary waves with respect to the liquid rim was explained by the classic shallow water wave theory. The atomization process caused by the development of the impact waves was observed in detail, including fragmentation of the liquid sheet, formation of liquid ligaments, and breakup of ligaments into droplets. The locking-on feature of the wavelength of the impact wave was also found to be similar to that of perturbed free shear layers.

preprint, 2011, arXiv

Impinging Jets and Droplet Dynamics

In this fluid dynamics video, results from high-fidelity numerical simulations are presented, which have been carried out to study the flow and droplet dynamics of liquid sheets formed by two impinging jets. A three-dimensional Volume-of-Fluid (VOF) method with adaptive mesh refinement (AMR) based on octree meshes [1] is used to simulate the various flow patterns associated with impinging jets, secondary breakup and binary collision of droplets. In addition to AMR, a thickness-based refinement algorithm is also developed and implemented to efficiently resolve the various scales of surface-tension-driven interfacial flows.
