Source author record

Ping Hu

Ping Hu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Computer Vision, math.CO, Multimedia, Artificial Intelligence, eess.IV, Information Theory, Machine Learning, math.IT, Robotics

Catalog footprint

What is connected

15 works

9 topics

4 close collaborators

preprint · 2026 · arXiv

Graph Smoothing for Enhanced Local Geometry Learning in Point Cloud Analysis

Graph-based methods have proven to be effective in capturing relationships among points for 3D point cloud analysis. However, these methods often suffer from suboptimal graph structures, particularly due to sparse connections at boundary points and noisy connections in junction areas. To address these challenges, we propose a novel method that integrates a graph smoothing module with an enhanced local geometry learning module. Specifically, we identify the limitations of conventional graph structures, particularly in handling boundary points and junction areas. In response, we introduce a graph smoothing module designed to optimize the graph structure and minimize the negative impact of unreliable sparse and noisy connections. Based on the optimized graph structure, we improve the feature extraction function with local geometry information. This information includes shape features derived from adaptive geometric descriptors based on eigenvectors and distribution features obtained through cylindrical coordinate transformation. Experimental results on real-world datasets validate the effectiveness of our method in various point cloud learning tasks, i.e., classification, part segmentation, and semantic segmentation.

preprint · 2022 · arXiv

DualCoOp: Fast Adaptation to Multi-Label Recognition with Limited Annotations

Solving multi-label recognition (MLR) for images in the low-label regime is a challenging task with many real-world applications. Recent work learns an alignment between textual and visual spaces to compensate for insufficient image labels, but loses accuracy because of the limited amount of available MLR annotations. In this work, we utilize the strong alignment of textual and visual features pretrained with millions of auxiliary image-text pairs and propose Dual Context Optimization (DualCoOp) as a unified framework for partial-label MLR and zero-shot MLR. DualCoOp encodes positive and negative contexts with class names as part of the linguistic input (i.e. prompts). Since DualCoOp only introduces a very light learnable overhead upon the pretrained vision-language framework, it can quickly adapt to multi-label recognition tasks that have limited annotations and even unseen classes. Experiments on standard multi-label recognition benchmarks across two challenging low-label settings demonstrate the advantages of our approach over state-of-the-art methods.

preprint · 2022 · arXiv

Learning to Detect Every Thing in an Open World

Many open-world applications require the detection of novel objects, yet state-of-the-art object detection and instance segmentation networks do not excel at this task. The key issue lies in their assumption that regions without any annotations should be suppressed as negatives, which teaches the model to treat the unannotated objects as background. To address this issue, we propose a simple yet surprisingly powerful data augmentation and training scheme we call Learning to Detect Every Thing (LDET). To avoid suppressing hidden objects (background objects that are visible but unlabeled), we paste annotated objects on a background image sampled from a small region of the original image. Since training solely on such synthetically augmented images suffers from domain shift, we decouple the training into two parts: 1) training the region classification and regression head on augmented images, and 2) training the mask heads on original images. In this way, a model does not learn to classify hidden objects as background while generalizing well to real images. LDET leads to significant improvements on many datasets in the open-world instance segmentation task, outperforming baselines on cross-category generalization on COCO, as well as cross-dataset evaluation on UVO and Cityscapes.

preprint · 2022 · arXiv

Many-to-many Splatting for Efficient Video Frame Interpolation

Motion-based video frame interpolation commonly relies on optical flow to warp pixels from the inputs to the desired interpolation instant. Yet due to the inherent challenges of motion estimation (e.g. occlusions and discontinuities), most state-of-the-art interpolation approaches require subsequent refinement of the warped result to generate satisfying outputs, which drastically decreases the efficiency for multi-frame interpolation. In this work, we propose a fully differentiable Many-to-Many (M2M) splatting framework to interpolate frames efficiently. Specifically, given a frame pair, we estimate multiple bidirectional flows to directly forward warp the pixels to the desired time step, and then fuse any overlapping pixels. In doing so, each source pixel renders multiple target pixels and each target pixel can be synthesized from a larger area of visual context. This establishes a many-to-many splatting scheme with robustness to artifacts like holes. Moreover, for each input frame pair, M2M only performs motion estimation once and has a minuscule computational overhead when interpolating an arbitrary number of in-between frames, hence achieving fast multi-frame interpolation. We conducted extensive experiments to analyze M2M, and found that it significantly improves efficiency while maintaining high effectiveness.

preprint · 2022 · arXiv

ZeroWaste Dataset: Towards Deformable Object Segmentation in Cluttered Scenes

Less than 35% of recyclable waste is actually being recycled in the US, which leads to increased soil and sea pollution and is one of the major concerns of environmental researchers as well as the general public. At the heart of the problem are the inefficiencies of the waste sorting process (separating paper, plastic, metal, glass, etc.) due to the extremely complex and cluttered nature of the waste stream. Recyclable waste detection poses a unique computer vision challenge as it requires detection of highly deformable and often translucent objects in cluttered scenes without the kind of context information usually present in human-centric datasets. This challenging computer vision task currently lacks suitable datasets or methods in the available literature. In this paper, we take a step towards computer-aided waste detection and present the first in-the-wild industrial-grade waste detection and segmentation dataset, ZeroWaste. We believe that ZeroWaste will catalyze research in object detection and semantic segmentation in extreme clutter as well as applications in the recycling domain. Our project page can be found at http://ai.bu.edu/zerowaste/.

preprint · 2020 · arXiv

Real-time Semantic Segmentation with Fast Attention

In deep CNN-based models for semantic segmentation, high accuracy relies on rich spatial context (large receptive fields) and fine spatial details (high resolution), both of which incur high computational costs. In this paper, we propose a novel architecture that addresses both challenges and achieves state-of-the-art performance for semantic segmentation of high-resolution images and videos in real-time. The proposed architecture relies on our fast spatial attention, which is a simple yet efficient modification of the popular self-attention mechanism and captures the same rich spatial context at a small fraction of the computational cost, by changing the order of operations. Moreover, to efficiently process high-resolution input, we apply an additional spatial reduction to intermediate feature stages of the network with minimal loss in accuracy thanks to the use of the fast attention module to fuse features. We validate our method with a series of experiments, and show that results on multiple datasets demonstrate superior performance with better accuracy and speed compared to existing approaches for real-time semantic segmentation. On Cityscapes, our network achieves 74.4% mIoU at 72 FPS and 75.5% mIoU at 58 FPS on a single Titan X GPU, which is approximately 50% faster than the state-of-the-art while retaining the same accuracy.
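The phrase "changing the order of operations" can be illustrated with a minimal sketch. It assumes, beyond what the abstract states, that the softmax of standard self-attention is dropped or replaced by a normalization that keeps the product linear, so that matrix associativity applies. The helper names `matmul`, `transpose`, `attention_quadratic`, and `attention_reordered` are hypothetical and are not taken from the paper.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*m)]


def attention_quadratic(q, k, v):
    # (Q K^T) V: the intermediate Q K^T is n x n, so the cost grows with n^2 * c
    # for n positions and c channels.
    return matmul(matmul(q, transpose(k)), v)


def attention_reordered(q, k, v):
    # Q (K^T V): the intermediate K^T V is only c x c, so the cost grows with n * c^2.
    # Assumption: no softmax sits between Q K^T and V, otherwise the
    # reordering would change the result.
    return matmul(q, matmul(transpose(k), v))


# Toy input: n = 3 positions, c = 2 channels, integer entries for exact comparison.
q = [[1, 2], [3, 4], [5, 6]]
k = [[1, 0], [0, 1], [1, 1]]
v = [[1, 2], [3, 4], [5, 6]]
same = attention_quadratic(q, k, v) == attention_reordered(q, k, v)
```

Both orderings give the same matrix here; the saving comes only from avoiding the n x n intermediate when n is much larger than c, as is typical for high-resolution feature maps.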

preprint · 2020 · arXiv

Temporally Distributed Networks for Fast Video Semantic Segmentation

We present TDNet, a temporally distributed network designed for fast and accurate video semantic segmentation. We observe that features extracted from a certain high-level layer of a deep CNN can be approximated by composing features extracted from several shallower sub-networks. Leveraging the inherent temporal continuity in videos, we distribute these sub-networks over sequential frames. Therefore, at each time step, we only need to perform a lightweight computation to extract a group of sub-features from a single sub-network. The full features used for segmentation are then recomposed by applying a novel attention propagation module that compensates for geometry deformation between frames. A grouped knowledge distillation loss is also introduced to further improve the representation power at both full and sub-feature levels. Experiments on Cityscapes, CamVid, and NYUD-v2 demonstrate that our method achieves state-of-the-art accuracy with significantly faster speed and lower latency.

preprint · 2020 · arXiv

The inducibility of oriented stars

We consider the problem of maximizing the number of induced copies of an oriented star $S_{k,\ell}$ in digraphs of given size, where the center of the star has out-degree $k$ and in-degree $\ell$. The case $k\ell=0$ was solved by Huang. Here, we asymptotically solve it for all other oriented stars with at least seven vertices.

preprint · 2020 · arXiv

Tilings in graphons

We introduce a counterpart to the notion of vertex-disjoint tilings by copies of a fixed graph $F$ to the setting of graphons. The case $F=K_2$ gives the notion of matchings in graphons. We give a transference statement that allows us to switch between the finite and limit notions, and derive several favorable properties, including the LP-duality counterpart to the classical relation between fractional vertex covers and fractional matchings/tilings, and discuss connections with property testing. As an application of our theory, we determine the asymptotically almost sure $F$-tiling number of inhomogeneous random graphs $\mathbb{G}(n,W)$. As another application, in an accompanying paper [Hladký, Hu, Piguet: Komlós's tiling theorem via graphon covers, preprint] we give a proof of a strengthening of a theorem of Komlós [Komlós: Tiling Turán Theorems, Combinatorica, 2000].

preprint · 2016 · arXiv

Broadcast Repair for Wireless Distributed Storage Systems

In wireless distributed storage systems, storage nodes are connected by wireless channels, which are broadcast in nature. This paper exploits this unique feature to design an efficient repair mechanism, called broadcast repair, for wireless distributed storage systems with multiple-node failures. Since wireless channels are typically bandwidth limited, we advocate a new measure of repair performance called repair-transmission bandwidth, which measures the average number of packets transmitted by helper nodes per failed node. The fundamental tradeoff between storage amount and repair-transmission bandwidth is obtained. It is shown that broadcast repair outperforms cooperative repair, which is the basic repair method for wired distributed storage systems with multiple-node failures, in terms of storage efficiency and repair-transmission bandwidth, thus yielding a better tradeoff curve.

preprint · 2015 · arXiv

Mantel's Theorem for Random Hypergraphs

A classical result in extremal graph theory is Mantel's Theorem, which states that every maximum triangle-free subgraph of $K_n$ is bipartite. A sparse version of Mantel's Theorem is that, for sufficiently large $p$, every maximum triangle-free subgraph of $G(n,p)$ is w.h.p. bipartite. Recently, DeMarco and Kahn proved this for $p > K \sqrt{\log n/n}$ for some constant $K$, and apart from the value of the constant this bound is best possible. We study an extremal problem of this type in random hypergraphs. Denote by $F_5$, sometimes called the generalized triangle, the 3-uniform hypergraph with vertex set $\{a,b,c,d,e\}$ and edge set $\{abc, ade, bde\}$. One of the first results in extremal hypergraph theory is by Frankl and Füredi, who proved that the maximum 3-uniform hypergraph on $n$ vertices containing no copy of $F_5$ is tripartite for $n>3000$. A natural question is for what $p$ every maximum $F_5$-free subhypergraph of $G^3(n,p)$ is w.h.p. tripartite. We show this holds for $p>K\log n/n$ for some constant $K$ and does not hold for $p=0.1\sqrt{\log n}/n$.
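The definition of $F_5$ can be made concrete with a small brute-force sketch. It is only an illustration of the definition, not a method from the paper; the function names `contains_f5` and `complete_tripartite` are hypothetical, and the search is only practical for very small hypergraphs.

```python
from itertools import permutations


def contains_f5(edges):
    """Return True if the 3-uniform hypergraph contains a copy of F_5,
    i.e. distinct vertices a, b, c, d, e with edges abc, ade, bde."""
    edge_set = {frozenset(e) for e in edges}
    vertices = list(set().union(*edge_set)) if edge_set else []
    for a, b, c, d, e in permutations(vertices, 5):
        wanted = {frozenset((a, b, c)), frozenset((a, d, e)), frozenset((b, d, e))}
        if wanted <= edge_set:
            return True
    return False


def complete_tripartite(parts):
    """All triples with exactly one vertex from each of the three parts."""
    x, y, z = parts
    return [(u, v, w) for u in x for v in y for w in z]


f5_itself = [(1, 2, 3), (1, 4, 5), (2, 4, 5)]
tripartite = complete_tripartite(([0, 1], [2, 3], [4, 5]))
```

On this toy input, `contains_f5(f5_itself)` is true while `contains_f5(tripartite)` is false. The second result reflects a general fact: in a tripartite hypergraph, $a$ and $b$ lie in different parts because of the edge $abc$, so $d$ and $e$ cannot complete both $ade$ and $bde$.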

preprint · 2015 · arXiv

Phase transitions in the Ramsey-Turán theory

Let $f(n)$ be a function and $L$ be a graph. Denote by $RT(n,L,f(n))$ the maximum number of edges of an $L$-free graph on $n$ vertices with independence number less than $f(n)$. Erdős and Sós asked if $RT\left(n, K_5, c\sqrt{n}\right) = o(n^2)$ for some constant $c$. We answer this question by proving the stronger $RT\left(n, K_5, o\left(\sqrt{n\log n}\right)\right) = o(n^2)$. It is known that $RT \left(n, K_5, c \sqrt{n\log n} \right) = n^2/4+o(n^2)$ for $c>1$, so one can say that $K_5$ has a Ramsey-Turán phase transition at $c\sqrt{n\log n}$. We extend this result to several other $K_s$'s and functions $f(n)$, determining many more phase transitions. We shall formulate several open problems, in particular, whether variants of the Bollobás-Erdős graph exist to give good lower bounds on $RT\left(n, K_s, f(n)\right)$ for various pairs of $s$ and $f(n)$. Among others, we use Szemerédi's Regularity Lemma and the Hypergraph Dependent Random Choice Lemma. We also present a short proof of the fact that $K_s$-free graphs with small independence number are sparse.

preprint · 2014 · arXiv

Maximum density of an induced 5-cycle is achieved by an iterated blow-up of a 5-cycle

Let $C(n)$ denote the maximum number of induced copies of 5-cycles in graphs on $n$ vertices. For $n$ large enough, we show that $C(n)=a\cdot b\cdot c \cdot d \cdot e + C(a)+C(b)+C(c)+C(d)+C(e)$, where $a+b+c+d+e = n$ and $a,b,c,d,e$ are as equal as possible. Moreover, if $n$ is a power of 5, we show that the unique graph on $n$ vertices maximizing the number of induced 5-cycles is an iterated blow-up of a 5-cycle.
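The recurrence in this abstract can be evaluated directly. The sketch below is an arithmetic illustration under two assumptions not stated in the abstract: the base case is taken as 0 for fewer than five vertices, and the recurrence is applied at every size, although the abstract only claims the identity for $n$ large enough. The names `balanced_parts` and `c5_recurrence` are hypothetical.

```python
from functools import lru_cache


def balanced_parts(n, k=5):
    """Split n into k parts whose sizes are as equal as possible."""
    q, r = divmod(n, k)
    return [q + 1] * r + [q] * (k - r)


@lru_cache(maxsize=None)
def c5_recurrence(n):
    """Value of a*b*c*d*e + C(a) + ... + C(e) with balanced parts.

    Assumption: the value is 0 below five vertices, and the recurrence is
    applied for all n, which the abstract guarantees only for large n.
    """
    if n < 5:
        return 0
    a, b, c, d, e = balanced_parts(n)
    return a * b * c * d * e + sum(c5_recurrence(p) for p in (a, b, c, d, e))
```

For example, `c5_recurrence(25)` evaluates to `5**5 + 5 * c5_recurrence(5)`, one product term for 5-cycles that use all five parts plus one 5-cycle inside each part of size five.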

preprint · 2014 · arXiv

Minimum number of monotone subsequences of length 4 in permutations

We show that for every sufficiently large $n$, the number of monotone subsequences of length four in a permutation on $n$ points is at least $\binom{\lfloor n/3 \rfloor}{4} + \binom{\lfloor(n+1)/3\rfloor}{4} + \binom{\lfloor (n+2)/3\rfloor}{4}$. Furthermore, we characterize all permutations on $[n]$ that attain this lower bound. The proof uses the flag algebra framework together with some additional stability arguments. This problem is equivalent to a specific type of edge coloring of complete graphs with two colors, where the number of monochromatic $K_4$'s is minimized. We show that all the extremal colorings must contain monochromatic $K_4$'s only in one of the two colors. This translates back to permutations, where the monotone subsequences of length four are either all increasing or all decreasing.
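The lower bound is a closed-form expression and can be evaluated with a short sketch. The function name `min_monotone_4` is hypothetical, and the abstract guarantees the bound only for sufficiently large $n$, so values at small $n$ are shown purely as arithmetic.

```python
from math import comb


def min_monotone_4(n):
    """Evaluate C(floor(n/3), 4) + C(floor((n+1)/3), 4) + C(floor((n+2)/3), 4)."""
    return sum(comb((n + i) // 3, 4) for i in range(3))
```

For $n = 12$ the three floors are all 4, so the expression is $3\binom{4}{4} = 3$; for $n = 13$ the floors are 4, 4, and 5, giving $1 + 1 + 5 = 7$.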

preprint · 2012 · arXiv

Upper bounds on the size of 4- and 6-cycle-free subgraphs of the hypercube

In this paper we slightly modify Razborov's flag algebra machinery to make it suitable for the hypercube. We use this modified method to show that the maximum number of edges of a 4-cycle-free subgraph of the n-dimensional hypercube is at most 0.6068 times the number of edges of the hypercube. We also improve the upper bound on the number of edges of 6-cycle-free subgraphs of the n-dimensional hypercube from $\sqrt{2}-1$ to 0.3755 times the number of edges of the hypercube. Additionally, we show that if the n-dimensional hypercube is considered as a poset, then the maximum vertex density of three middle layers in an induced subgraph without 4-cycles is at most 2.15121 times $\binom{n}{n/2}$.
