Source author record

Jiacheng Zhang

Jiacheng Zhang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher | Unclaimed source record

Computation and Language, Computer Vision, Machine Learning, math.AP, math.OC, math.PR, Networking and Internet Architecture, physics.data-an, physics.flu-dyn, physics.ins-det

Catalog footprint

What is connected

6 works

10 topics

4 close collaborators

preprint | 2025 | arXiv

OnlineVPO: Align Video Diffusion Model with Online Video-Centric Preference Optimization

Video diffusion models (VDMs) have demonstrated remarkable capabilities in text-to-video (T2V) generation. Despite their success, VDMs still suffer from degraded image quality and flickering artifacts. To address these issues, some approaches have introduced preference learning to exploit human feedback to enhance the video generation. However, these methods primarily adopt the routine in the image domain without an in-depth investigation into video-specific preference optimization. In this paper, we reexamine the design of the video preference learning from two key aspects: feedback source and feedback tuning methodology, and present OnlineVPO, a more efficient preference learning framework tailored specifically for VDMs. On the feedback source, we found that the image-level reward model commonly used in existing methods fails to provide a human-aligned video preference signal due to the modality gap. In contrast, video quality assessment (VQA) models show superior alignment with human perception of video quality. Building on this insight, we propose leveraging VQA models as a proxy of humans to provide more modality-aligned feedback for VDMs. Regarding the preference tuning methodology, we introduce an online DPO algorithm tailored for VDMs. It not only enjoys the benefits of superior scalability in optimizing videos with higher resolution and longer duration compared with the existing method, but also mitigates the insufficient optimization issue caused by off-policy learning via online preference generation and curriculum preference update designs. Extensive experiments on the open-source video-diffusion model demonstrate OnlineVPO as a simple yet effective and, more importantly, scalable preference learning algorithm for video diffusion models.
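
The online preference generation mentioned in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the OnlineVPO implementation: `build_online_pair` and `dpo_loss` are hypothetical names, the scores stand in for the output of a VQA model, and the loss is the generic DPO objective on log-probabilities, which may differ from the variant the paper uses for diffusion models.

```python
import math


def build_online_pair(samples, scores):
    """Pick the highest- and lowest-scored samples as (preferred, rejected).

    `scores` are assumed to come from a video quality assessment model;
    here they are plain floats supplied by the caller.
    """
    if len(samples) != len(scores) or len(samples) < 2:
        raise ValueError("need at least two samples, one score per sample")
    best = max(range(len(scores)), key=scores.__getitem__)
    worst = min(range(len(scores)), key=scores.__getitem__)
    return samples[best], samples[worst]


def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Generic DPO loss: -log(sigmoid(beta * (policy margin - reference margin)))."""
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    # Numerically stable evaluation of log(1 + exp(-margin)).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Hypothetical usage: three freshly sampled videos scored by a stand-in VQA model.
preferred, rejected = build_online_pair(["clip_a", "clip_b", "clip_c"], [0.2, 0.9, 0.1])
```

Because the pair is rebuilt from samples of the current model at every step, the preference data stays on-policy, which is the property the abstract contrasts with off-policy learning.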

preprint | 2022 | arXiv

Cross-Technology Communication for the Internet of Things: A Survey

The ever-developing Internet of Things (IoT) brings the prosperity of wireless sensing and control applications. In many scenarios, different wireless technologies coexist in the shared frequency medium as well as the physical space. Such wireless coexistence may lead to serious cross-technology interference (CTI) problems, e.g. channel competition, signal collision, throughput degradation. Compared with traditional methods like interference avoidance, tolerance, and concurrency mechanism, direct and timely information exchange among heterogeneous devices is therefore a fundamental requirement to ensure the usability, inter-operability, and reliability of the IoT. Under this circumstance, Cross-Technology Communication (CTC) technique thus becomes a hot topic in both academic and industrial fields, which aims at directly exchanging data among heterogeneous devices that follow different standards. This paper comprehensively summarizes the CTC techniques and reveals that the key challenge for CTC lies in the heterogeneity of IoT devices, including the incompatibility of technical standards and the asymmetry of connection capability. Based on the above finding, we present a taxonomy of the existing CTC works (packet-level CTCs and physical-level CTCs) and compare the existing CTC techniques in terms of throughput, reliability, hardware modification, and concurrency.

preprint | 2022 | arXiv

Text-to-Table: A New Way of Information Extraction

We study a new problem setting of information extraction (IE), referred to as text-to-table. In text-to-table, given a text, one creates a table or several tables expressing the main content of the text, while the model is learned from text-table pair data. The problem setting differs from those of the existing methods for IE. First, the extraction can be carried out from long texts to large tables with complex structures. Second, the extraction is entirely data-driven, and there is no need to explicitly define the schemas. As far as we know, there has been no previous work that studies the problem. In this work, we formalize text-to-table as a sequence-to-sequence (seq2seq) problem. We first employ a seq2seq model fine-tuned from a pre-trained language model to perform the task. We also develop a new method within the seq2seq approach, exploiting two additional techniques in table generation: table constraint and table relation embeddings. We consider text-to-table as an inverse problem of the well-studied table-to-text, and make use of four existing table-to-text datasets in our experiments on text-to-table. Experimental results show that the vanilla seq2seq model can outperform the baseline methods of using relation extraction and named entity extraction. The results also show that our method can further boost the performances of the vanilla seq2seq model. We further discuss the main challenges of the proposed task. The code and data are available at https://github.com/shirley-wu/text_to_table.
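
Treating text-to-table as a seq2seq problem requires writing a table as a flat token sequence that can be parsed back. The sketch below illustrates one possible linearization. The separators, the function names, and the sample table are assumptions made for illustration, and `is_rectangular` is only a simple check in the spirit of a table constraint, not the paper's method.

```python
CELL_SEP = " | "   # assumed cell separator
ROW_SEP = "\n"     # assumed row separator


def table_to_sequence(rows):
    """Linearize a table (list of rows, each a list of cell strings) into one string."""
    for row in rows:
        for cell in row:
            if "|" in cell or "\n" in cell:
                raise ValueError("cell text must not contain separator characters")
    return ROW_SEP.join(CELL_SEP.join(row) for row in rows)


def sequence_to_table(seq):
    """Parse a linearized sequence back into rows of stripped cell strings."""
    return [[cell.strip() for cell in line.split("|")] for line in seq.split("\n")]


def is_rectangular(rows):
    """Return True when every row has as many cells as the first (header) row."""
    return all(len(row) == len(rows[0]) for row in rows)


# Made-up example table used only to show the round trip.
rows = [["Team", "Points"], ["Lakers", "102"], ["Celtics", "99"]]
seq = table_to_sequence(rows)
```

A seq2seq model would be trained to emit a string like `seq` from the source text. A decoded string whose rows have different lengths fails `is_rectangular`, which is the kind of malformed output a table constraint is meant to prevent.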

preprint | 2020 | arXiv

Modeling Voting for System Combination in Machine Translation

System combination is an important technique for combining the hypotheses of different machine translation systems to improve translation performance. Although early statistical approaches to system combination have been proven effective in analyzing the consensus between hypotheses, they suffer from the error propagation problem due to the use of pipelines. While this problem has been alleviated by end-to-end training of multi-source sequence-to-sequence models recently, these neural models do not explicitly analyze the relations between hypotheses and fail to capture their agreement because the attention to a word in a hypothesis is calculated independently, ignoring the fact that the word might occur in multiple hypotheses. In this work, we propose an approach to modeling voting for system combination in machine translation. The basic idea is to enable words in hypotheses from different systems to vote on words that are representative and should get involved in the generation process. This can be done by quantifying the influence of each voter and its preference for each candidate. Our approach combines the advantages of statistical and neural methods since it can not only analyze the relations between hypotheses but also allow for end-to-end training. Experiments show that our approach is capable of better taking advantage of the consensus between hypotheses and achieves significant improvements over state-of-the-art baselines on Chinese-English and English-German machine translation tasks.

preprint | 2020 | arXiv

Superposition and mimicking theorems for conditional McKean-Vlasov equations

We consider conditional McKean-Vlasov stochastic differential equations (SDEs), such as the ones arising in the large-system limit of mean field games and particle systems with mean field interactions when common noise is present. The conditional time-marginals of the solutions to these SDEs satisfy non-linear stochastic partial differential equations (SPDEs) of the second order, whereas the laws of the conditional time-marginals follow Fokker-Planck equations on the space of probability measures. We prove two superposition principles: The first establishes that any solution of the SPDE can be lifted to a solution of the conditional McKean-Vlasov SDE, and the second guarantees that any solution of the Fokker-Planck equation on the space of probability measures can be lifted to a solution of the SPDE. We use these results to obtain a mimicking theorem which shows that the conditional time-marginals of an Ito process can be emulated by those of a solution to a conditional McKean-Vlasov SDE with Markovian coefficients. This yields, in particular, a tool for converting open-loop controls into Markovian ones in the context of controlled McKean-Vlasov dynamics.

preprint | 2019 | arXiv

Uncertainty Quantification in density estimation from Background Oriented Schlieren (BOS) measurements

We present an uncertainty quantification methodology for density estimation from Background Oriented Schlieren (BOS) measurements, in order to provide local, instantaneous, a-posteriori uncertainty bounds on each density measurement in the field of view. Displacement uncertainty quantification algorithms from cross-correlation based Particle Image Velocimetry (PIV) are used to estimate the uncertainty in the dot pattern displacements obtained from cross-correlation for BOS and assess their feasibility. In order to propagate the displacement uncertainty through the density integration procedure, we also develop a novel methodology via the Poisson solver using sparse linear operators. Testing the method using synthetic images of a Gaussian density field showed agreement between the propagated density uncertainties and the true uncertainty. Subsequently the methodology is experimentally demonstrated for supersonic flow over a wedge, showing that regions with sharp changes in density lead to an increase in density uncertainty throughout the field of view, even in regions without these sharp changes. The uncertainty propagation is influenced by the density integration scheme, and for the Poisson solver the density uncertainty increases monotonically on moving away from the regions where the Dirichlet boundary conditions are specified.
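
The idea of propagating uncertainty through a linear integration operator can be illustrated with a one-dimensional stand-in. This sketch is not the paper's 2D Poisson solver: it assumes the density is recovered by cumulatively summing gradient samples away from a single Dirichlet point whose value is known exactly, and that the gradient errors are independent. The function names and numbers are hypothetical.

```python
import math


def integration_operator(n, dx):
    """Lower-triangular operator M with rho[i] = rho0 + sum_j M[i][j] * grad[j]."""
    return [[dx if j <= i else 0.0 for j in range(n)] for i in range(n)]


def propagate_std(operator, grad_std):
    """Linear propagation for independent errors:
    sigma_rho[i] = sqrt(sum_j (M[i][j] * sigma_grad[j])**2).
    """
    return [
        math.sqrt(sum((m * s) ** 2 for m, s in zip(row, grad_std)))
        for row in operator
    ]


# Five grid points, spacing 0.5, identical gradient uncertainty at every point.
M = integration_operator(5, dx=0.5)
rho_std = propagate_std(M, [0.2] * 5)
```

In this simplified setting each additional integration step adds variance, so `rho_std` grows with distance from the Dirichlet point, in line with the monotonic trend the abstract reports for the Poisson solver.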