Source author record

Yifu Zhang

Yifu Zhang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher | Unclaimed source record

Computer Vision, Artificial Intelligence, Computation and Language, Computational Complexity, Information Theory, Machine Learning, math.AP, math.IT

Catalog footprint

What is connected

5 works

8 topics

4 close collaborators

Preprint, 2026, arXiv

A Measure-Theoretic Analysis of Reasoning: Structural Generalization and Approximation Limits

While empirical scaling laws for LLM reasoning are well-documented, the theoretical mechanisms governing out-of-distribution (OOD) generalization remain elusive. We formalize reasoning via optimal transport, projecting discrete trajectories into a continuous metric space to quantify domain shifts using the Wasserstein-1 distance. Invoking Kantorovich duality, we bound OOD generalization via architectural Lipschitz continuity and functional approximation limits. This exposes two primary constraints. First, position-dependent attention (e.g., Absolute Positional Encoding) fails to preserve shift invariance, yielding an $\Omega(1)$ Lipschitz constant and expected risk, whereas shift-invariant mechanisms (e.g., Rotary Embeddings) preserve equivariance and bound the error. Second, by mapping sequential backtracking to a Dyck-$k$ language, we establish a strict circuit depth lower bound for $\text{TC}^0$ Transformers. Scaling physical layer depth is necessary to avert representation collapse, a constraint that scaling representation width cannot bypass due to irreducible approximation bounds in Barron spaces. Evaluations across 54 Transformer configurations on combinatorial search corroborate these bounds, demonstrating that generalization risk degrades monotonically with the Wasserstein domain shift.
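As a rough illustration of the domain-shift quantity named in this abstract, the sketch below computes the Wasserstein-1 distance between two equal-size empirical samples on the real line, where it reduces to the mean absolute difference of the sorted values. The function name `wasserstein1_1d` and the idea of summarizing each trajectory by a single number (here, a length) are assumptions made for this example; the paper's own projection into a metric space is not reproduced.

```python
def wasserstein1_1d(xs, ys):
    """Wasserstein-1 distance between two equal-size empirical samples on the line.

    For equal-size samples in one dimension, the optimal transport plan pairs
    the i-th smallest value of xs with the i-th smallest value of ys.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    paired = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in paired) / len(xs)


# Hypothetical scalar summaries (e.g. trajectory lengths) for two domains.
train_lengths = [4, 5, 6, 7]
test_lengths = [8, 9, 10, 11]
shift = wasserstein1_1d(train_lengths, test_lengths)
print(shift)  # 4.0: every unit of mass moves a distance of 4
```

A larger value indicates a test distribution that sits farther from the training distribution under this one-dimensional summary.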

Preprint, 2022, arXiv

ByteTrack: Multi-Object Tracking by Associating Every Detection Box

Multi-object tracking (MOT) aims at estimating bounding boxes and identities of objects in videos. Most methods obtain identities by associating detection boxes whose scores are higher than a threshold. Objects with low detection scores, e.g. occluded objects, are simply thrown away, which causes a non-negligible number of missed true objects and fragmented trajectories. To solve this problem, we present a simple, effective and generic association method, tracking by associating almost every detection box instead of only the high-score ones. For the low-score detection boxes, we utilize their similarities with tracklets to recover true objects and filter out the background detections. When applied to 9 different state-of-the-art trackers, our method achieves consistent improvements in IDF1 score ranging from 1 to 10 points. To advance the state of the art in MOT, we design a simple and strong tracker, named ByteTrack. For the first time, we achieve 80.3 MOTA, 77.3 IDF1 and 63.1 HOTA on the test set of MOT17 with 30 FPS running speed on a single V100 GPU. ByteTrack also achieves state-of-the-art performance on MOT20, HiEve and BDD100K tracking benchmarks. The source code, pre-trained models with deploy versions and tutorials of applying to other trackers are released at https://github.com/ifzhang/ByteTrack.
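The idea of keeping low-score boxes and letting tracklet similarity decide their fate can be sketched as a two-pass association. This is a minimal illustration, not the released ByteTrack code: the two-pass ordering, the greedy IoU matcher (used in place of an optimal assignment solver), the function names, and the thresholds `high`, `low` and `min_iou` are all assumptions chosen for the example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_match(tracks, boxes, min_iou):
    """Greedily pair track ids with box indices in order of descending IoU."""
    pairs = sorted(
        ((iou(tbox, box), tid, j)
         for tid, tbox in tracks.items()
         for j, box in enumerate(boxes)),
        key=lambda p: p[0],
        reverse=True,
    )
    matches, used_boxes = {}, set()
    for score, tid, j in pairs:
        if score < min_iou:
            break  # all remaining pairs overlap too little
        if tid in matches or j in used_boxes:
            continue
        matches[tid] = j
        used_boxes.add(j)
    return matches


def associate(tracks, detections, high=0.6, low=0.1, min_iou=0.3):
    """tracks: {track_id: box}; detections: list of (box, score) pairs.

    Pass 1 matches tracks to high-score boxes. Pass 2 matches the tracks
    left over to low-score boxes; low-score boxes that match no track are
    treated as background and dropped.
    """
    high_boxes = [b for b, s in detections if s >= high]
    low_boxes = [b for b, s in detections if low <= s < high]
    first = greedy_match(tracks, high_boxes, min_iou)
    updated = {tid: high_boxes[j] for tid, j in first.items()}
    leftover = {tid: b for tid, b in tracks.items() if tid not in first}
    second = greedy_match(leftover, low_boxes, min_iou)
    updated.update({tid: low_boxes[j] for tid, j in second.items()})
    return updated


tracks = {1: (0, 0, 10, 10), 2: (20, 20, 30, 30)}
detections = [
    ((1, 1, 11, 11), 0.9),     # confident detection of track 1
    ((21, 21, 31, 31), 0.2),   # low score, e.g. an occluded object
    ((50, 50, 60, 60), 0.15),  # low score, overlaps no track
]
result = associate(tracks, detections)
print(result)  # {1: (1, 1, 11, 11), 2: (21, 21, 31, 31)}
```

In this toy frame, track 2 survives only because its low-score box is considered in the second pass, while the isolated low-score box is discarded as background.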

Preprint, 2022, arXiv

Robust Multi-Object Tracking by Marginal Inference

Multi-object tracking in videos requires solving the fundamental problem of one-to-one assignment between objects in adjacent frames. Most methods address the problem by first discarding impossible pairs whose feature distances are larger than a threshold, and then linking objects using the Hungarian algorithm to minimize the overall distance. However, we find that the distribution of the distances computed from Re-ID features may vary significantly across videos. Consequently, no single threshold allows us to safely discard impossible pairs. To address the problem, we present an efficient approach to compute a marginal probability for each pair of objects in real time. The marginal probability can be regarded as a normalized distance which is significantly more stable than the original feature distance. As a result, we can use a single threshold for all videos. The approach is general and can be applied to existing trackers to obtain about one point of improvement in the IDF1 metric. It achieves competitive results on the MOT17 and MOT20 benchmarks. In addition, the computed probability is more interpretable, which facilitates subsequent post-processing operations.
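One way to see why a pairwise marginal can behave like a normalized distance is to enumerate all one-to-one matchings of a tiny problem and weight each by the exponential of its negative total distance. The sketch below does exactly that by brute force; it is not the paper's efficient real-time procedure, and the weighting scheme and the name `matching_marginals` are assumptions made for illustration.

```python
from itertools import permutations
from math import exp


def matching_marginals(dist):
    """Marginal probability that row object i is matched to column object j.

    Each one-to-one matching of a square distance matrix gets weight
    exp(-total distance); the marginal of a pair is the normalized weight
    of all matchings that contain it. The cost is factorial in the matrix
    size, so this is only usable for very small examples.
    """
    n = len(dist)
    weights = {
        perm: exp(-sum(dist[i][perm[i]] for i in range(n)))
        for perm in permutations(range(n))
    }
    total = sum(weights.values())
    marginals = [[0.0] * n for _ in range(n)]
    for perm, w in weights.items():
        for i in range(n):
            marginals[i][perm[i]] += w / total
    return marginals


dist = [[0.1, 2.0],
        [2.0, 0.1]]
# The same scene with every feature distance inflated by a constant offset.
shifted = [[d + 5.0 for d in row] for row in dist]

base = matching_marginals(dist)
moved = matching_marginals(shifted)
```

Because every matching of an n-by-n matrix uses exactly n entries, a constant offset on all distances cancels in the normalization, so `base` and `moved` agree up to rounding. A fixed threshold on the raw distances would treat the two matrices very differently.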

Preprint, 2020, arXiv

Type II Finite time blow-up for the three dimensional energy critical heat equation

We consider the following Cauchy problem for the three-dimensional energy-critical heat equation \begin{equation*} \begin{cases} u_t = \Delta u + u^{5}, & \text{in } \mathbb{R}^3 \times (0,T),\\ u(x,0) = u_0(x), & \text{in } \mathbb{R}^3. \end{cases} \end{equation*} We construct type II finite-time blow-up solutions $u(x,t)$ with blow-up rates $\| u \|_{L^\infty} \sim (T-t)^{-k}$, where $k = 1, 2, \dots$. This gives a rigorous proof of the formal computations by Filippas, Herrero and Velazquez \cite{fhv}. This is the first instance of type II finite-time blow-up for the three-dimensional energy-critical heat equation.
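For readers unfamiliar with the terminology, the following standard background (not taken from the abstract itself) shows why the exponent 5 is called energy critical in three dimensions and what separates type I from type II blow-up. The general exponent $p$ and dimension $n$ are introduced here only for the derivation.

```latex
% Scaling symmetry of u_t = \Delta u + |u|^{p-1} u on \mathbb{R}^n:
\[
u_\lambda(x,t) = \lambda^{\frac{2}{p-1}}\, u(\lambda x, \lambda^2 t), \qquad \lambda > 0 .
\]
% The energy
\[
E(u) = \int_{\mathbb{R}^n} \Big( \tfrac{1}{2} |\nabla u|^2 - \tfrac{1}{p+1} |u|^{p+1} \Big)\, dx
\]
% is invariant under this scaling exactly when
\[
\frac{4}{p-1} + 2 - n = 0
\quad\Longleftrightarrow\quad
p = \frac{n+2}{n-2},
\qquad\text{so } n = 3 \text{ gives } p = 5 .
\]
% Blow-up at time T is of type I if
\[
\limsup_{t \to T} \, (T-t)^{\frac{1}{p-1}} \, \| u(\cdot,t) \|_{L^\infty} < \infty ,
\]
% and of type II otherwise. For p = 5 the type I rate is (T-t)^{-1/4}.
```

Since the constructed rates $(T-t)^{-k}$ with $k \ge 1$ grow faster than $(T-t)^{-1/4}$ as $t \to T$, they fall in the type II regime.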

Preprint, 2010, arXiv

A new sufficient condition for sum-rate tightness in quadratic Gaussian multiterminal source coding

This work considers the quadratic Gaussian multiterminal (MT) source coding problem and provides a new sufficient condition for the Berger-Tung (BT) sum-rate bound to be tight. The converse proof utilizes a set of virtual remote sources given which the MT sources are block independent with a maximum block size of two. The given MT source coding problem is then related to a set of two-terminal problems with matrix-distortion constraints, for which a new lower bound on the sum-rate is given. Finally, a convex optimization problem is formulated and a sufficient condition is derived for the optimal BT scheme to satisfy the subgradient-based Karush-Kuhn-Tucker condition. The set of sum-rate tightness problems defined by our new sufficient condition subsumes all previously known tight cases, and opens a new direction toward a more general partial solution.