Source author record

Zongming Guo

Zongming Guo appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Computer Vision, Graphics, math.AP, Multimedia, Networking and Internet Architecture, eess.IV, eess.SP, Machine Learning

Catalog footprint

11 works

8 topics

4 close collaborators


preprint, 2020, arXiv

3D Dynamic Point Cloud Inpainting via Temporal Consistency on Graphs

With the development of 3D laser scanning techniques and depth sensors, 3D dynamic point clouds have attracted increasing attention as a representation of 3D objects in motion, enabling various applications such as 3D immersive tele-presence, gaming and navigation. However, dynamic point clouds usually exhibit holes of missing data, mainly due to fast motion, acquisition limitations and complicated structure. Leveraging graph signal processing tools, we represent irregular point clouds on graphs and propose a novel inpainting method exploiting both intra-frame self-similarity and inter-frame consistency in 3D dynamic point clouds. Specifically, for each missing region in every frame of the point cloud sequence, we search for its self-similar regions in the current frame and corresponding ones in adjacent frames as references. Then we formulate dynamic point cloud inpainting as an optimization problem based on the two types of references, which is regularized by a graph-signal smoothness prior. Experimental results show that the proposed approach significantly outperforms three competing methods in both objective and subjective quality.

preprint, 2020, arXiv

Consistent User-Traffic Allocation and Load Balancing in Mobile Edge Caching

Cache-equipped Base-Stations (CBSs) are an attractive alternative for offloading the rapidly growing backhaul traffic in a mobile network. New 5G technology and dense femtocells enable one user to connect to multiple base-stations simultaneously. Practical implementation requires the caches in BSs to be regarded as cache servers, but few existing works have considered how to offload traffic, or how to schedule HTTP requests to CBSs. In this work, we propose a DNS-based HTTP traffic allocation framework. It schedules user traffic among multiple CBSs by DNS resolution, taking into account load balancing, traffic allocation consistency and the scheduling granularity of DNS. To address these issues, we formulate the user-traffic allocation problem in DNS-based mobile edge caching, aiming at maximizing QoS gain and allocation consistency while maintaining load balance. Then we present a simple greedy algorithm which gives a more consistent solution when user traffic changes dynamically. Theoretical analysis proves that it is within 3/4 of the optimal solution. Extensive evaluations in numerical and trace-driven situations show that the greedy algorithm can avoid about 50% of unnecessary shifts in user-traffic allocation, yield a more stable cache hit ratio and balance the load between CBSs without losing much of the QoS gain.

preprint, 2020, arXiv

Deep Plastic Surgery: Robust and Controllable Image Editing with Human-Drawn Sketches

Sketch-based image editing aims to synthesize and modify photos based on the structural information provided by human-drawn sketches. Since sketches are difficult to collect, previous methods mainly use edge maps instead of sketches to train models (referred to as edge-based models). However, sketches display great structural discrepancy with edge maps, causing edge-based models to fail. Moreover, sketches often vary widely among different users, demanding even higher generalizability and robustness for the editing model to work. In this paper, we propose Deep Plastic Surgery, a novel, robust and controllable image editing framework that allows users to interactively edit images using hand-drawn sketch inputs. We present a sketch refinement strategy, inspired by the coarse-to-fine drawing process of artists, which we show can help our model adapt well to casual and varied sketches without the need for real sketch training data. Our model further provides a refinement level control parameter that enables users to flexibly define how "reliable" the input sketch should be considered for the final output, balancing between sketch faithfulness and output verisimilitude (as the two goals might contradict if the input sketch is drawn poorly). To achieve the multi-level refinement, we introduce a style-based module for level conditioning, which allows adaptive feature representations for different levels in a single network. Extensive experimental results demonstrate the superiority of our approach in improving the visual quality and user controllability of image editing over the state-of-the-art methods.

preprint, 2020, arXiv

DeepRS: Deep-learning Based Network-Adaptive FEC for Real-Time Video Communications

This work proposes an approach to handling packet loss in real-time video streaming scenarios: predicting the packet loss pattern in the time domain with a deep learning model.

preprint, 2020, arXiv

Feature Graph Learning for 3D Point Cloud Denoising

Identifying an appropriate underlying graph kernel that reflects pairwise similarities is critical in many recent graph spectral signal restoration schemes, including image denoising, dequantization, and contrast enhancement. Existing graph learning algorithms compute the most likely entries of a properly defined graph Laplacian matrix $\mathbf{L}$, but require a large number of signal observations $\mathbf{z}$'s for a stable estimate. In this work, we assume instead the availability of a relevant feature vector $\mathbf{f}_i$ per node $i$, from which we compute an optimal feature graph via optimization of a feature metric. Specifically, we alternately optimize the diagonal and off-diagonal entries of a Mahalanobis distance matrix $\mathbf{M}$ by minimizing the graph Laplacian regularizer (GLR) $\mathbf{z}^{\top} \mathbf{L} \mathbf{z}$, where edge weight is $w_{i,j} = \exp\{-(\mathbf{f}_i - \mathbf{f}_j)^{\top} \mathbf{M} (\mathbf{f}_i - \mathbf{f}_j) \}$, given a single observation $\mathbf{z}$. We optimize diagonal entries via proximal gradient (PG), where we constrain $\mathbf{M}$ to be positive definite (PD) via linear inequalities derived from the Gershgorin circle theorem. To optimize off-diagonal entries, we design a block descent algorithm that iteratively optimizes one row and column of $\mathbf{M}$. To keep $\mathbf{M}$ PD, we constrain the Schur complement of sub-matrix $\mathbf{M}_{2,2}$ of $\mathbf{M}$ to be PD when optimizing via PG. Our algorithm mitigates full eigen-decomposition of $\mathbf{M}$, thus ensuring fast computation speed even when feature vector $\mathbf{f}_i$ has high dimension. To validate its usefulness, we apply our feature graph learning algorithm to the problem of 3D point cloud denoising, resulting in state-of-the-art performance compared to competing schemes in extensive experiments.
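The edge-weight and regularizer formulas in this abstract can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it only evaluates $w_{i,j} = \exp\{-(\mathbf{f}_i - \mathbf{f}_j)^{\top} \mathbf{M} (\mathbf{f}_i - \mathbf{f}_j)\}$ and the GLR $\mathbf{z}^{\top} \mathbf{L} \mathbf{z}$ for a fixed metric matrix, and does not perform the optimization of $\mathbf{M}$. The function names, the toy features, the identity metric and the use of the combinatorial Laplacian $\mathbf{L} = \mathbf{D} - \mathbf{W}$ are assumptions made for illustration.

```python
import numpy as np

def feature_graph_laplacian(F, M):
    """Build L = D - W, where w_ij = exp(-(f_i - f_j)^T M (f_i - f_j)).

    F: (n, k) array with one feature vector per node (hypothetical toy data).
    M: (k, k) positive definite metric matrix (assumed given, not learned here).
    """
    diff = F[:, None, :] - F[None, :, :]                # pairwise differences, shape (n, n, k)
    dist = np.einsum("ijk,kl,ijl->ij", diff, M, diff)   # Mahalanobis distances
    W = np.exp(-dist)
    np.fill_diagonal(W, 0.0)                            # no self-loops
    return np.diag(W.sum(axis=1)) - W

def glr(z, L):
    """Graph Laplacian regularizer z^T L z."""
    return float(z @ L @ z)

# Toy example: nodes 0 and 1 have similar features, node 2 is far away.
F = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0]])
M = np.eye(2)
L = feature_graph_laplacian(F, M)

z_smooth = np.array([1.0, 1.0, 1.0])    # constant signal
z_rough = np.array([1.0, -1.0, 1.0])    # differs across the strong edge (0, 1)

print(glr(z_smooth, L), glr(z_rough, L))
```

A constant signal gives a GLR of zero, while a signal that changes sign across a heavily weighted edge is penalized, which is why minimizing the GLR favors signals that are smooth with respect to the feature graph.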

preprint, 2020, arXiv

Modality Compensation Network: Cross-Modal Adaptation for Action Recognition

With the prevalence of RGB-D cameras, multi-modal video data have become more available for human action recognition. One main challenge for this task lies in how to effectively leverage their complementary information. In this work, we propose a Modality Compensation Network (MCN) to explore the relationships of different modalities, and boost the representations for human action recognition. We regard RGB/optical flow videos as source modalities and skeletons as the auxiliary modality. Our goal is to extract more discriminative features from the source modalities with the help of the auxiliary modality. Built on deep Convolutional Neural Networks (CNN) and Long Short Term Memory (LSTM) networks, our model bridges data from source and auxiliary modalities by a modality adaptation block to achieve adaptive representation learning, so that the network learns to compensate for the loss of skeletons at test time and even at training time. We explore multiple adaptation schemes to narrow the distance between source and auxiliary modal distributions at different levels, according to the alignment of source and auxiliary data in training. In addition, skeletons are only required in the training phase. Our model is able to improve the recognition performance with source data when testing. Experimental results reveal that MCN outperforms state-of-the-art approaches on four widely-used action recognition benchmarks.

preprint, 2020, arXiv

Predictive Generalized Graph Fourier Transform for Attribute Compression of Dynamic Point Clouds

As 3D scanning devices and depth sensors advance, dynamic point clouds have attracted increasing attention as a format for 3D objects in motion, with applications in various fields such as immersive telepresence, navigation for autonomous driving and gaming. Nevertheless, the tremendous amount of data in dynamic point clouds significantly burdens transmission and storage. To this end, we propose a complete compression framework for attributes of 3D dynamic point clouds, focusing on optimal inter-coding. Firstly, we derive the optimal inter-prediction and predictive transform coding assuming the Gaussian Markov Random Field model with respect to a spatio-temporal graph underlying the attributes of dynamic point clouds. The optimal predictive transform proves to be the Generalized Graph Fourier Transform in terms of spatio-temporal decorrelation. Secondly, we propose refined motion estimation via efficient registration prior to inter-prediction, which searches the temporal correspondence between adjacent frames of irregular point clouds. Finally, we present a complete framework based on the optimal inter-coding and our previously proposed intra-coding, where we determine the optimal coding mode from rate-distortion optimization with the proposed offline-trained $\lambda$-Q model. Experimental results show that we achieve around 17% bit rate reduction on average over competitive dynamic point cloud compression methods.
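To give a feel for why a graph Fourier transform helps compression, here is a minimal sketch of the plain graph Fourier transform on a toy path graph. It is not the predictive Generalized Graph Fourier Transform of the paper, and the graph, the signal and the function names are assumptions chosen for illustration. The transform basis is the set of eigenvectors of the graph Laplacian, and a signal that is smooth on the graph concentrates its energy in the first few coefficients.

```python
import numpy as np

def path_graph_laplacian(n):
    """Combinatorial Laplacian of an unweighted path graph with n nodes."""
    W = np.zeros((n, n))
    idx = np.arange(n - 1)
    W[idx, idx + 1] = 1.0
    W[idx + 1, idx] = 1.0
    return np.diag(W.sum(axis=1)) - W

def gft_basis(L):
    """Eigen-decomposition of a symmetric Laplacian; columns are the GFT basis."""
    eigvals, eigvecs = np.linalg.eigh(L)   # eigenvalues returned in ascending order
    return eigvals, eigvecs

n = 8
L = path_graph_laplacian(n)
eigvals, U = gft_basis(L)

x = np.linspace(0.0, 1.0, n)   # toy attribute signal, smooth along the path
x_hat = U.T @ x                # forward transform
x_rec = U @ x_hat              # inverse transform

# Fraction of the signal energy captured by the two lowest graph frequencies.
energy_fraction = float(np.sum(x_hat[:2] ** 2) / np.sum(x_hat ** 2))
print(energy_fraction)
```

Because the basis is orthonormal, the inverse transform recovers the signal exactly, and most of the energy of this smooth signal sits in the lowest-frequency coefficients, so the remaining coefficients can be coded coarsely.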

preprint, 2020, arXiv

The Stefan problem for the Fisher-KPP equation with unbounded initial range

We consider the nonlinear Stefan problem $$ \left\{ \begin{array}{ll} u_t - d \Delta u = a u - b u^2 \;\; & \mbox{for } x \in \Omega(t), \; t>0, \\ u=0 \mbox{ and } u_t=\mu|\nabla_x u|^2 \;\; & \mbox{for } x \in \partial\Omega(t), \; t>0, \\ u(0,x)=u_0(x) \;\; & \mbox{for } x \in \Omega_0, \end{array}\right. $$ where $\Omega(0)=\Omega_0$ is an unbounded smooth domain in $\mathbb{R}^N$, $u_0>0$ in $\Omega_0$ and $u_0$ vanishes on $\partial\Omega_0$. When $\Omega_0$ is bounded, the long-time behavior of this problem is rather well understood from \cite{DG1,DG2,DLZ, DMW}. Here we reveal some interesting and different behavior for certain unbounded $\Omega_0$. We also give a unified approach to a weak solution theory for this kind of free boundary problem with bounded or unbounded $\Omega_0$.

preprint, 2016, arXiv

Awesome Typography: Statistics-Based Text Effects Transfer

In this work, we explore the problem of generating fantastic special effects for typography. It is quite challenging due to the model diversities to illustrate varied text effects for different characters. To address this issue, our key idea is to exploit the analytics on the high regularity of the spatial distribution for text effects to guide the synthesis process. Specifically, we characterize the stylized patches by their normalized positions and the optimal scales to depict their style elements. Our method first estimates these two features and derives their correlation statistically. They are then converted into soft constraints for texture transfer to accomplish adaptive multi-scale texture synthesis and to make style element distribution uniform. It allows our algorithm to produce artistic typography that fits both the local texture patterns and the global spatial distribution in the example. Experimental results demonstrate the superiority of our method for various text effects over conventional style transfer methods. In addition, we validate the effectiveness of our algorithm with extensive artistic typography library generation.

preprint, 2016, arXiv

MARLow: A Joint Multiplanar Autoregressive and Low-Rank Approach for Image Completion

In this paper, we propose a novel multiplanar autoregressive (AR) model to exploit the correlation in cross-dimensional planes of a similar patch group collected in an image, which has long been neglected by previous AR models. On that basis, we then present a joint multiplanar AR and low-rank based approach (MARLow) for image completion from random sampling, which exploits the nonlocal self-similarity within natural images more effectively. Specifically, the multiplanar AR model constrains the local stationarity in different cross-sections of the patch group, while the low-rank minimization captures the intrinsic coherence of nonlocal patches. The proposed approach can be readily extended to multichannel images (e.g. color images) by simultaneously considering the correlation in different channels. Experimental results demonstrate that the proposed approach significantly outperforms state-of-the-art methods, even if the pixel missing rate is as high as 90%.

preprint, 2013, arXiv

Finite Morse index solutions and asymptotics of weighted nonlinear elliptic equations

By introducing a suitable setting, we study the behavior of finite Morse index solutions of the equation \[ -\operatorname{div} (|x|^\theta \nabla v)=|x|^l |v|^{p-1}v \;\;\; \mbox{in } \Omega \subset \mathbb{R}^N \; (N \geq 2), \leqno(1) \] where $p>1$, $\theta, l\in\mathbb{R}^1$ with $N+\theta>2$, $l-\theta>-2$, and $\Omega$ is a bounded or unbounded domain. Through a suitable transformation of the form $v(x)=|x|^\sigma u(x)$, equation (1) can be rewritten as a nonlinear Schrödinger equation with Hardy potential $$-\Delta u=|x|^\alpha |u|^{p-1}u+\frac{\ell}{|x|^2} u \;\; \mbox{in } \Omega \subset \mathbb{R}^N \;\; (N \geq 2), \leqno(2)$$ where $p>1$, $\alpha\in (-\infty, \infty)$ and $\ell \in (-\infty,(N-2)^2/4)$. We show that under our chosen setting for the finite Morse index theory of (1), the stability of a solution to (1) is unchanged under various natural transformations. This enables us to reveal two critical values of the exponent $p$ in (1) that divide the behavior of finite Morse index solutions of (1), which in turn yields two critical powers for (2) through the transformation. The latter appear difficult to obtain by working directly with (2).
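The passage from (1) to (2) can be checked by a direct computation away from the origin. The sketch below is a worked calculation under one assumption: the exponent $\sigma = -\theta/2$ is chosen here because it removes the first-order term, and the abstract does not confirm that this is the exact choice made in the paper. The resulting expressions for $\alpha$ and $\ell$ are consequences of that assumed choice.

```latex
% Substitute v = |x|^\sigma u into -div(|x|^\theta \nabla v), for x \neq 0:
\begin{align*}
\operatorname{div}\bigl(|x|^{\theta}\nabla(|x|^{\sigma}u)\bigr)
  &= |x|^{\theta+\sigma}\Delta u
   + (\theta+2\sigma)\,|x|^{\theta+\sigma-2}\,x\cdot\nabla u
   + \sigma(N+\theta+\sigma-2)\,|x|^{\theta+\sigma-2}\,u .
\end{align*}
% Assumed choice: \sigma = -\theta/2 removes the gradient term, leaving
\begin{align*}
-\operatorname{div}(|x|^{\theta}\nabla v)
  &= -|x|^{\theta/2}\Delta u
   + \tfrac{\theta}{2}\Bigl(N-2+\tfrac{\theta}{2}\Bigr)|x|^{\theta/2-2}\,u .
\end{align*}
% The right-hand side of (1) becomes |x|^{l-\theta p/2}|u|^{p-1}u.
% Dividing by |x|^{\theta/2} gives (2) with
\begin{align*}
\alpha = l-\frac{\theta(p+1)}{2},
\qquad
\ell = -\frac{\theta\,(2N-4+\theta)}{4},
\qquad
\frac{(N-2)^2}{4}-\ell = \frac{(N-2+\theta)^2}{4} > 0 .
\end{align*}
```

The last identity shows that, under this choice of $\sigma$, the hypothesis $N+\theta>2$ in (1) is consistent with the range $\ell < (N-2)^2/4$ stated for (2).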
