Source author record

Bohan Zhang

Bohan Zhang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Computer Vision; Machine Learning; physics.app-ph; physics.ins-det; physics.optics; Artificial Intelligence; Distributed, Parallel, and Cluster Computing; econ.EM; Graphics; Human-Computer Interaction; Methodology; physics.geo-ph

Catalog footprint

What is connected

9 works

12 topics

4 close collaborators


preprint, 2026, arXiv

LongLive-2.0: An NVFP4 Parallel Infrastructure for Long Video Generation

We present LongLive-2.0, an NVFP4-based parallel infrastructure throughout the full training and inference workflow of long video generation, addressing speed and memory bottlenecks. For training, we introduce sequence-parallel (SP) autoregressive (AR) training, instantiated as Balanced SP, which co-designs the efficient teacher-forcing layout with SP execution by pairing clean-history and noisy-target temporal chunks on each rank, enabling a natural teacher-forcing mask with SP-aware chunked VAE encoding. Combined with NVFP4 precision, it reduces GPU memory cost and accelerates GEMM computation during training, the proportion of which increases as video length grows. Moreover, we show that a high-quality infrastructure and dataset enable a remarkably clean training pipeline. Unlike existing Self-Forcing series methods that rely on ODE initialization and subsequent distribution matching distillation (DMD), LongLive-2.0 directly tunes a diffusion model into a long, multi-shot, interactive AR diffusion model. It can be further converted to real-time generation (4 to 2 denoising steps) with standalone LoRA weights. For inference on Blackwell GPUs, we enable W4A4 NVFP4 inference, quantize KV cache into NVFP4 for memory savings, and boost end-to-end throughput with asynchronous streaming VAE decoding. On non-Blackwell GPU architectures, we deploy SP inference to match the speed on Blackwell GPUs, while the quantized KV cache can lower inter-GPU communication of SP. Experiments show speedups of up to 2.15x in training and 1.84x in inference. LongLive-2.0-5B achieves 45.7 FPS inference while attaining strong performance on benchmarks. To our knowledge, LongLive-2.0 is the first NVFP4 training and inference system for long video generation.
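As a rough illustration of what quantizing a tensor to a 4-bit block-scaled format involves, the sketch below simulates quantize-then-dequantize in plain Python. It is not the LongLive-2.0 implementation. It assumes the commonly described NVFP4 layout of E2M1 element values with one shared scale per 16 elements, and it keeps that scale as an ordinary Python float rather than an 8-bit value; the function names are hypothetical.

```python
# Representable magnitudes of a 4-bit E2M1 float (assumed grid for this sketch).
E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
BLOCK = 16  # assumed block size: one shared scale per 16 values


def quantize_block(values):
    """Return (scale, codes), where each code is a signed value from the grid."""
    amax = max(abs(v) for v in values)
    # Map the largest magnitude in the block onto the largest grid value.
    scale = amax / E2M1_GRID[-1] if amax > 0 else 1.0
    codes = []
    for v in values:
        target = abs(v) / scale
        nearest = min(E2M1_GRID, key=lambda g: abs(g - target))
        codes.append(nearest if v >= 0 else -nearest)
    return scale, codes


def fake_quantize(values):
    """Quantize then dequantize a flat list, block by block."""
    out = []
    for start in range(0, len(values), BLOCK):
        scale, codes = quantize_block(values[start:start + BLOCK])
        out.extend(c * scale for c in codes)
    return out
```

Calling `fake_quantize` on a list of weights or cached key/value entries returns the values a 4-bit representation of this kind would reproduce, which makes the rounding error easy to inspect.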

preprint, 2026, arXiv

Who Owns the Text? Design Patterns for Preserving Authorship in AI-Assisted Writing

AI writing assistants can reduce effort and improve fluency, but they may also weaken writers' sense of authorship. We study this tension with an ownership-aware co-writing editor that offers on-demand, sentence-level suggestions and tests two common design choices: persona-based coaching and style personalization. In an online study (N=176), participants completed three professional writing tasks: an email without AI help, a proposal with generic AI suggestions, and a cover letter with persona-based coaching, while half received suggestions tailored to a brief sample of their prior writing. Across the two AI-assisted tasks, psychological ownership dropped relative to unassisted writing (about 0.85-1.0 points on a 7-point scale), even as cognitive load decreased (about 0.9 points) and quality ratings stayed broadly similar overall. Persona coaching did not prevent the ownership decline. Style personalization partially restored ownership (about +0.43) and increased AI incorporation in text (+5 percentage points). We distill five design patterns: on-demand initiation, micro-suggestions, voice anchoring, audience scaffolds, and point-of-decision provenance, to guide authorship-preserving writing tools.

preprint, 2022, arXiv

DyRep: Bootstrapping Training with Dynamic Re-parameterization

Structural re-parameterization (Rep) methods achieve noticeable improvements on simple VGG-style networks. Despite their prevalence, current Rep methods simply re-parameterize all operations into an augmented network, including those that rarely contribute to the model's performance. The price is expensive computational overhead spent on these unnecessary operations. To address these drawbacks, we aim to bootstrap training at minimal cost by devising a dynamic re-parameterization (DyRep) method, which encodes the Rep technique into a training process that dynamically evolves the network structure. Concretely, our proposal adaptively finds the operations that contribute most to the loss in the network and applies Rep to enhance their representational capacity. In addition, to suppress the noisy and redundant operations introduced by Rep, we devise a de-parameterization technique for a more compact re-parameterization. In this regard, DyRep is more efficient than Rep since it smoothly evolves the given network instead of constructing an over-parameterized network. Experimental results demonstrate its effectiveness; for example, DyRep improves the accuracy of ResNet-18 by $2.04\%$ on ImageNet and reduces runtime by $22\%$ relative to the baseline. Code is available at: https://github.com/hunto/DyRep.
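The basic identity behind structural re-parameterization is that parallel linear branches can be folded into a single kernel after training. The sketch below shows this for a one-dimensional toy case with a size-3 branch and a size-1 branch. It illustrates only the generic merge step, not DyRep's dynamic selection or de-parameterization, and the helper names and numbers are made up for illustration.

```python
def conv1d_same(signal, kernel):
    """Cross-correlate with zero padding so the output length equals the input length."""
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [
        sum(kernel[j] * padded[i + j] for j in range(len(kernel)))
        for i in range(len(signal))
    ]


def merge_branches(k3, k1):
    """Fold a size-1 kernel into the centre tap of a size-3 kernel."""
    merged = list(k3)
    merged[1] += k1[0]
    return merged


x = [1.0, 2.0, 3.0, 4.0]
k3 = [0.5, -1.0, 0.25]
k1 = [2.0]

# Training-time view: two parallel branches whose outputs are summed.
two_branch = [a + b for a, b in zip(conv1d_same(x, k3), conv1d_same(x, k1))]
# Inference-time view: a single branch with the merged kernel.
one_branch = conv1d_same(x, merge_branches(k3, k1))
```

Both lists hold the same values, so the extra branch adds capacity during training without adding cost at inference.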

preprint, 2022, arXiv

Epicenter localization using forward-transmission laser interferometry

Widely distributed optical fibers, together with phase-sensitive laser interferometry, can expand seismic detection methods and have great potential for epicenter localization. In this paper, we propose an integral response method based on a forward transmission scheme. It uses spectrum analysis and parameter fitting to localize the epicenter. With the given shape of the fiber ring, the integral phase changes of light propagating in the forward and reverse directions can be used to determine the direction, depth, and distance of the epicenter, as well as the seismic wave speed. For the noisy case with SNR=20 dB, the simulation results show ultrahigh precision when the epicenter distance is 200 km: the error of the orientation angle is ~0.003°, the error of the P-wave speed is ~0.9 m/s, the error of the epicenter depth is ~9.5 m, and the error of the epicenter distance is ~200 m.

preprint, 2022, arXiv

Optimal reconciliation with immutable forecasts

The practical importance of coherent forecasts in hierarchical forecasting has inspired many studies on forecast reconciliation. Under this approach, so-called base forecasts are produced for every series in the hierarchy and are subsequently adjusted to be coherent in a second reconciliation step. Reconciliation methods have been shown to improve forecast accuracy, but will, in general, adjust the base forecast of every series. However, in an operational context, it is sometimes necessary or beneficial to keep forecasts of some variables unchanged after forecast reconciliation. In this paper, we formulate a reconciliation methodology that keeps forecasts of a pre-specified subset of variables unchanged or "immutable". In contrast to existing approaches, these immutable forecasts need not all come from the same level of a hierarchy, and our method can also be applied to grouped hierarchies. We prove that our approach preserves unbiasedness in base forecasts. Our method can also account for correlations between base forecasting errors and ensure non-negativity of forecasts. We also perform empirical experiments, including an application to sales of a large-scale online retailer, to assess the impacts of our proposed methodology.
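To make the idea of immutable forecasts concrete, the sketch below reconciles a three-series hierarchy (total = a + b) by unweighted least squares while holding chosen series fixed. It is a simplified illustration under stated assumptions, not the paper's estimator: it ignores error correlations and non-negativity, and the function name, series names and numbers are hypothetical.

```python
def reconcile(base, immutable=()):
    """Least-squares reconciliation for the hierarchy total = a + b.

    base: dict with keys "total", "a", "b" holding the base forecasts.
    immutable: names of series whose forecasts must stay unchanged.
    """
    # Coefficients of the coherence constraint total - a - b = 0.
    coeffs = {"total": 1.0, "a": -1.0, "b": -1.0}
    # How far the base forecasts are from being coherent.
    gap = sum(coeffs[k] * base[k] for k in coeffs)
    free = [k for k in coeffs if k not in immutable]
    if not free:
        raise ValueError("at least one series must be adjustable")
    norm = sum(coeffs[k] ** 2 for k in free)
    # Spread the incoherence over the free series only.
    return {
        k: base[k] - (coeffs[k] * gap / norm if k in free else 0.0)
        for k in coeffs
    }


base = {"total": 100.0, "a": 60.0, "b": 52.0}
all_free = reconcile(base)
total_fixed = reconcile(base, immutable=("total",))
```

With every series free, the 12-unit incoherence is shared across all three forecasts. With the total held immutable, only the two bottom-level forecasts move, and the result is still coherent.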

preprint, 2022, arXiv

Time shifting deviation method enhanced laser interferometry: ultrahigh precision localizing of traffic vibration using urban fiber link

Using fiber networks as a large-scale sensing system will enrich the methods available for monitoring public infrastructure and geological disasters. Laser interferometry with the traditional cross-correlation method has been used to detect and localize vibration events. However, the random error induced by the cross-correlation method limits localization accuracy and makes it unsuitable for ultrahigh-precision localization. We propose a novel time shifting deviation (TSDEV) method, which has advantages over the cross-correlation method in practicability and localization accuracy. Three experiments are carried out to demonstrate the TSDEV method. In a lab test, a vibration localization accuracy of ~2.5 m is realized. In field tests, TSDEV-enhanced interferometry is applied to monitor an urban fiber link. Traffic vibration events on a campus road and the Beijing ring road have been precisely localized and analyzed. The proposed technique will extend the function of existing urban fiber networks and better serve future smart cities.

preprint, 2021, arXiv

Single-Shot Motion Completion with Transformer

Motion completion is a challenging and long-discussed problem, which is of great significance in film and game applications. For different motion completion scenarios (in-betweening, in-filling, and blending), most previous methods deal with the completion problems with case-by-case designs. In this work, we propose a simple but effective method that solves multiple motion completion problems under a unified framework and achieves a new state-of-the-art accuracy under multiple evaluation settings. Inspired by the recent great success of attention-based models, we consider the completion as a sequence-to-sequence prediction problem. Our method consists of two modules: a standard transformer encoder with self-attention that learns long-range dependencies of input motions, and a trainable mixture embedding module that models temporal information and discriminates key-frames. Our method can run in a non-autoregressive manner and predict multiple missing frames within a single forward propagation in real time. We finally show the effectiveness of our method in music-dance applications.

preprint, 2020, arXiv

Serpentine optical phased arrays for scalable integrated photonic LIDAR beam steering

Optical phased arrays (OPAs) implemented in integrated photonic circuits could enable a variety of 3D sensing, imaging, illumination, and ranging applications, and their convergence in new LIDAR technology. However, current integrated OPA approaches do not scale (in control complexity, power consumption, and optical efficiency) to the large aperture sizes needed to support medium to long range LIDAR. We present the serpentine optical phased array (SOPA), a new OPA concept that addresses these fundamental challenges and enables architectures that scale up to large apertures. The SOPA is based on a serially interconnected array of low-loss grating waveguides and supports fully passive, two-dimensional (2D) wavelength-controlled beam steering. A fundamentally space-efficient design that folds the feed network into the aperture also enables scalable tiling of SOPAs into large apertures with a high fill-factor. We experimentally demonstrate the first SOPA, using a 1450-1650 nm wavelength sweep to produce 16,500 addressable spots in a 27x610 array. We also demonstrate, for the first time, far-field interference of beams from two separate OPAs on a single silicon photonic chip, as an initial step towards long-range computational imaging LIDAR based on novel active aperture synthesis schemes.

preprint, 2020, arXiv

Verniered Optical Phased Arrays for Grating Lobe Suppression and Extended FOV

Optical phased arrays (OPAs) that beam-steer in 2D have so far been unable to pack emitting elements at $\lambda/2$ spacing, leading to grating lobes which limit the field-of-view, introduce signal ambiguity, and reduce optical efficiency. Vernier schemes, which use paired transmitter and receiver phased arrays with different periodicity, deliberately misalign the transmission and receive patterns so that only a single pairing of transmit/receive lobes permits a signal to be detected. A pair of OPAs designed to exploit this effect thereby effectively suppresses the effects of grating lobes, recovers the system's field-of-view, avoids potential ambiguities, and reduces excess noise. Here we analytically evaluate Vernier schemes with arbitrary phase control to find optimal configurations, as well as elucidate the manner in which a Vernier scheme can recover the full field-of-view. We present the first experimental implementation of a Vernier scheme and demonstrate grating lobe suppression using a pair of 2D wavelength-steered OPAs. These results present a route forward for addressing the pervasive issue of grating lobes, significantly alleviating the need for dense emitter pitches.
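The Vernier effect described above can be sketched with the standard grating equation for a uniform array, $\sin\theta_m = \sin\theta_0 + m\lambda/d$. The example below lists the lobe directions of a transmit and a receive array with different pitches and finds the directions they share. It treats lobes as ideal directions with no beam width, and the pitches of 4.5 and 5 wavelengths are illustrative assumptions rather than values from the paper.

```python
import math
from fractions import Fraction


def lobe_sines(pitch_in_wavelengths, steer_sine=0):
    """Return sin(theta) for every lobe of a uniform array in visible space."""
    step = 1 / Fraction(pitch_in_wavelengths)  # lambda / d
    s0 = Fraction(steer_sine)
    # Keep only the orders m for which |sin(theta_m)| <= 1.
    lo = math.ceil((-1 - s0) / step)
    hi = math.floor((1 - s0) / step)
    return [s0 + m * step for m in range(lo, hi + 1)]


def shared_lobes(tx_pitch, rx_pitch, steer_sine=0):
    """Directions where a transmit lobe and a receive lobe coincide."""
    tx = set(lobe_sines(tx_pitch, steer_sine))
    rx = set(lobe_sines(rx_pitch, steer_sine))
    return sorted(tx & rx)


tx_pitch = Fraction(9, 2)  # hypothetical transmit pitch, in wavelengths
rx_pitch = Fraction(5)     # hypothetical receive pitch, in wavelengths

vernier = shared_lobes(tx_pitch, rx_pitch)   # mismatched pitches
matched = shared_lobes(rx_pitch, rx_pitch)   # identical pitches, for contrast
```

With mismatched pitches only the steered main lobe is common to both arrays, whereas identical pitches leave every grating lobe aligned. Exact fractions are used so that coinciding directions compare equal without floating-point tolerance.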
