Researcher profile

Hao Ouyang

Hao Ouyang contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging · Verification L1 · Unclaimed author
9 works
0 followers
5 topics
4 close collaborators


Published work

9 published items

Preprint · 2026 · arXiv

CausalCine: Real-Time Autoregressive Generation for Multi-Shot Video Narratives

Autoregressive video generation aims at real-time, open-ended synthesis. Yet, cinematic storytelling is not merely the endless extension of a single scene; it requires progressing through evolving events, viewpoint shifts, and discrete shot boundaries. Existing autoregressive models often struggle in this setting. Trained primarily for short-horizon continuation, they treat long sequences as extended single shots, inevitably suffering from motion stagnation and semantic drift during long rollouts. To bridge this gap, we introduce CausalCine, an interactive autoregressive framework that transforms multi-shot video generation into an online directing process. CausalCine generates causally across shot changes, accepts dynamic prompts on the fly, and reuses context without regenerating previous shots. To achieve this, we first train a causal base model on native multi-shot sequences to learn complex shot transitions prior to acceleration. We then propose Content-Aware Memory Routing (CAMR), which dynamically retrieves historical KV entries according to attention-based relevance scores rather than temporal proximity, preserving cross-shot coherence under bounded active memory. Finally, we distill the causal base model into a few-step generator for real-time interactive generation. Extensive experiments demonstrate that CausalCine significantly outperforms autoregressive baselines and approaches the capability of bidirectional models while unlocking the streaming interactivity of causal generation. Demo available at https://yihao-meng.github.io/CausalCine/
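The memory-routing idea in the abstract above, selecting cached KV entries by relevance rather than recency under a fixed budget, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how relevance scores are computed, so the sketch assumes a scaled dot product between a single query vector and each cached key, and the names `route_memory`, `attend`, and `budget` are hypothetical.

```python
import math


def route_memory(query, keys, budget):
    """Return indices of the `budget` cached entries most relevant to `query`.

    Assumption: relevance is the scaled dot product between the query and
    each cached key. Selected indices are returned in temporal order.
    """
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    # Rank by relevance score; ties are broken toward more recent entries.
    ranked = sorted(range(len(keys)), key=lambda i: (scores[i], i), reverse=True)
    return sorted(ranked[:budget])


def attend(query, keys, values, indices):
    """Softmax attention restricted to the routed subset of the cache."""
    scale = 1.0 / math.sqrt(len(query))
    logits = [scale * sum(q * k for q, k in zip(query, keys[i])) for i in indices]
    peak = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(logit - peak) for logit in logits]
    total = sum(weights)
    dim = len(values[0])
    return [
        sum(w * values[i][d] for w, i in zip(weights, indices)) / total
        for d in range(dim)
    ]


# Toy cache of five historical entries (oldest first).
keys = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.0, 0.8], [-1.0, 0.0]]
values = [[1.0], [2.0], [3.0], [4.0], [5.0]]
query = [1.0, 0.0]

selected = route_memory(query, keys, budget=2)
output = attend(query, keys, values, selected)
```

In this toy cache a recency window of size 2 would keep entries 3 and 4, whereas relevance-based routing keeps entries 0 and 2, the two whose keys align best with the query. The active memory stays bounded by `budget` in both cases.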

Preprint · 2022 · arXiv

Deep Video Prior for Video Consistency and Propagation

Applying an image processing algorithm independently to each video frame often leads to temporal inconsistency in the resulting video. To address this issue, we present a novel and general approach for blind video temporal consistency. Our method is trained directly on a single pair of original and processed videos instead of a large dataset. Unlike most previous methods that enforce temporal consistency with optical flow, we show that temporal consistency can be achieved by training a convolutional neural network on a video with Deep Video Prior (DVP). Moreover, a carefully designed iteratively reweighted training strategy is proposed to address the challenging multimodal inconsistency problem. We demonstrate the effectiveness of our approach on 7 computer vision tasks on videos. Extensive quantitative and perceptual experiments show that our approach outperforms state-of-the-art methods on blind video temporal consistency. We further extend DVP to video propagation and demonstrate its effectiveness in propagating three different types of information (color, artistic style, and object segmentation). A progressive propagation strategy with pseudo labels is also proposed to enhance DVP's performance on video propagation. Our source code is publicly available at https://github.com/ChenyangLEI/deep-video-prior.

Preprint · 2022 · arXiv

Note on $T\bar{T}$ deformed matrix models and JT supergravity duals

In this work we calculate the partition functions of $\mathcal{N}=1$ type 0A and 0B JT supergravity (SJT) on 2D surfaces of arbitrary genus with multiple finite cut-off boundaries, based on the $T\bar{T}$ deformed super-Schwarzian theories. In terms of SJT/matrix model duality, we compute the corresponding correlation functions on the $T\bar{T}$ deformed matrix model side by using topological recursion relations as well as the transformation properties of topological recursion relations under $T\bar{T}$ deformation. We check that the partition functions of finite cut-off 0A and 0B SJT on generic 2D surfaces match the associated correlation functions in the $T\bar{T}$ deformed matrix models, respectively.

Preprint · 2022 · arXiv

Pretraining is All You Need for Image-to-Image Translation

We propose to use pretraining to boost general image-to-image translation. Prior image-to-image translation methods usually need dedicated architectural design and train individual translation models from scratch, struggling to generate complex scenes at high quality, especially when paired training data are not abundant. In this paper, we regard each image-to-image translation problem as a downstream task and introduce a simple and generic framework that adapts a pretrained diffusion model to accommodate various kinds of image-to-image translation. We also propose adversarial training to enhance the texture synthesis in the diffusion model training, in conjunction with normalized guidance sampling to improve the generation quality. We present an extensive empirical comparison across various tasks on challenging benchmarks such as ADE20K, COCO-Stuff, and DIODE, showing that the proposed pretraining-based image-to-image translation (PITI) is capable of synthesizing images of unprecedented realism and faithfulness.

Preprint · 2022 · arXiv

Real-Time Neural Character Rendering with Pose-Guided Multiplane Images

We propose pose-guided multiplane image (MPI) synthesis, which can render an animatable character in real scenes with photorealistic quality. We use a portable camera rig to capture the multi-view images along with the driving signal for the moving subject. Our method generalizes the image-to-image translation paradigm, which translates the human pose to a 3D scene representation -- MPIs that can be rendered in free viewpoints, using the multi-view captures as supervision. To fully exploit the potential of MPI, we propose depth-adaptive MPI, which can be learned using variable exposure images while being robust to inaccurate camera registration. Our method demonstrates advantageous novel-view synthesis quality over the state-of-the-art approaches for characters with challenging motions. Moreover, the proposed method is generalizable to novel combinations of training poses and can be explicitly controlled. Our method achieves such expressive and animatable character rendering all in real time, serving as a promising solution for practical applications.

Preprint · 2022 · arXiv

TBA-like equations for non-planar scattering amplitude/Wilson lines duality at strong coupling

We compute the minimal area of a string worldsheet ending on two infinite periodic light-like Wilson lines in the AdS$_3$ boundary, which is dual to the first non-planar correction to the gluon scattering amplitude in $\mathcal{N}=4$ SYM at strong coupling. Using the connection between the Hitchin system and the thermodynamic Bethe ansatz (TBA) equations, we present an analytic method to compute the minimal area surface and express the non-trivial part of the minimal area in terms of the free energy of the TBA-like equations. Given the cross ratios as inputs, the area computed from the TBA-like equations matches that calculated using numerical integration.

Preprint · 2021 · arXiv

Wilson loops in circular quiver SCFTs at strong coupling

We study circular BPS Wilson loops in the $\mathcal{N}=2$ superconformal $n$-node quiver theories at large $N$ and strong 't Hooft coupling by using localization. We compute the expectation values of Wilson loops in the limit when the 't Hooft couplings are hierarchically different and when they are nearly equal. Based on these results, we make a conjecture for arbitrary strong couplings.

Preprint · 2019 · arXiv

Open Spin Chains from Determinant Like Operators in ABJM Theory

We study the mixing problem of the determinant-like operators in ABJM theory to two-loop order in the scalar sector. The gravity duals of these operators are open strings attached to the maximal giant graviton, which is a D4-brane wrapping a $\mathbb{CP}^2$ inside $\mathbb{CP}^3$ in our case. The anomalous dimension matrix of these operators can be regarded as an open spin chain Hamiltonian. We provide strong evidence of its integrability based on the coordinate Bethe ansatz method and the boundary Yang-Baxter equation.

Preprint · 2018 · arXiv

BPS Wilson loops in $\mathcal N \geq 2$ superconformal Chern-Simons-matter theories

In $\mathcal N \geq 2$ superconformal Chern-Simons-matter theories we construct the infinite family of Bogomol'nyi-Prasad-Sommerfield (BPS) Wilson loops featured by constant parametric couplings to scalar and fermion matter, including both line Wilson loops in Minkowski spacetime and circle Wilson loops in Euclidean space. We find that the connection of the most general BPS Wilson loop cannot be decomposed in terms of double-node connections. Moreover, if the quiver contains triangles, it cannot be interpreted as a supermatrix inside a superalgebra. However, for particular choices of the parameters it reduces to the well-known connections of 1/6 BPS Wilson loops in Aharony-Bergman-Jafferis-Maldacena (ABJM) theory and 1/4 BPS Wilson loops in $\mathcal N = 4$ orbifold ABJM theory. In the particular case of $\mathcal N = 2$ orbifold ABJM theory we identify the gravity duals of a subset of operators. We investigate the cohomological equivalence of fermionic and bosonic BPS Wilson loops at the quantum level by studying their expectation values, and find strong evidence that the cohomological equivalence holds quantum mechanically, at framing one. Finally, we discuss a stronger formulation of the cohomological equivalence, which implies non-trivial identities for correlation functions of composite operators in the defect CFT defined on the Wilson contour and allows one to make novel predictions on the corresponding unknown integrals that call for confirmation.