Researcher profile

Zhanghao Sun

Zhanghao Sun contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 15 - UnverifiedVerification L1Unclaimed author
3works
0followers
4topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

3 published item(s)

preprint2026arXiv

What Matters in Practical Learned Image Compression

One of the major differentiators unlocked by learned codecs relative to their hard-coded traditional counterparts is their ability to be optimized directly to appeal to the human visual system. Despite this potential, a perceptual yet practical image codec is yet to be proposed. In this work, we aim to close this gap. We conduct a comprehensive study of the key modeling choices that govern the design of a practical learned image codec, jointly optimized for perceptual quality and runtime -- including within the ablations several novel techniques. We then perform performance-aware neural architecture search over millions of backbone configurations to identify models that achieve the target on-device runtime while maximizing compression performance as captured by perceptual metrics. We combine the various optimizations to construct a new codec that achieves a significantly improved tradeoff between speed and perceptual quality. Based on rigorous subjective user studies, it provides 2.3-3x bitrate savings against AV1, AV2, VVC, ECM and JPEG-AI, and 20-40% bitrate savings against the best learned codec alternatives. At the same time, on an iPhone 17 Pro Max, it encodes 12MP images as fast as 230ms, and decodes them in 150ms -- faster than most top ML-based codecs run on a V100 GPU.

preprint2022arXiv

Seeing Far in the Dark with Patterned Flash

Flash illumination is widely used in imaging under low-light environments. However, illumination intensity falls off with propagation distance quadratically, which poses significant challenges for flash imaging at a long distance. We propose a new flash technique, named ``patterned flash'', for flash imaging at a long distance. Patterned flash concentrates optical power into a dot array. Compared with the conventional uniform flash where the signal is overwhelmed by the noise everywhere, patterned flash provides stronger signals at sparsely distributed points across the field of view to ensure the signals at those points stand out from the sensor noise. This enables post-processing to resolve important objects and details. Additionally, the patterned flash projects texture onto the scene, which can be treated as a structured light system for depth perception. Given the novel system, we develop a joint image reconstruction and depth estimation algorithm with a convolutional neural network. We build a hardware prototype and test the proposed flash technique on various scenes. The experimental results demonstrate that our patterned flash has significantly better performance at long distances in low-light environments.

preprint2022arXiv

Structured Light with Redundancy Codes

Structured light (SL) systems acquire high-fidelity 3D geometry with active illumination projection. Conventional systems exhibit challenges when working in environments with strong ambient illumination, global illumination and cross-device interference. This paper proposes a general-purposed technique to improve the robustness of SL by projecting redundant optical signals in addition to the native SL patterns. In this way, projected signals become more distinguishable from errors. Thus the geometry information can be more easily recovered using simple signal processing and the ``coding gain" in performance is obtained. We propose three applications using our redundancy codes: (1) Self error-correction for SL imaging under strong ambient light, (2) Error detection for adaptive reconstruction under global illumination, and (3) Interference filtering with device-specific projection sequence encoding, especially for event camera-based SL and light curtain devices. We systematically analyze the design rules and signal processing algorithms in these applications. Corresponding hardware prototypes are built for evaluations on real-world complex scenes. Experimental results on the synthetic and real data demonstrate the significant performance improvements in SL systems with our redundancy codes.