Source author record

Chutian Wang

Chutian Wang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

cond-mat.mtrl-sci, eess.IV, Machine Learning, Multimedia, physics.optics

Catalog footprint

What is connected

3 works

5 topics

4 close collaborators


preprint · 2026 · arXiv

Cap2Sum: Learning to Summarize Videos by Generating Captions

With the rapid growth of video data on the internet, video summarization is becoming a very important AI technology. However, due to the high labelling cost of video summarization, existing studies have to be conducted on small-scale datasets, leading to limited performance and generalization capacity. In this work, we introduce the use of dense video captions as a supervision signal to train video summarization models. Motivated by this, we propose Cap2Sum, a model that learns to summarize videos by generating captions, to exploit dense video caption annotations. This weakly-supervised approach allows us to train the models on large-scale dense video caption datasets to achieve better performance and generalization capacity. To further improve the generalization capacity, we introduce a CLIP (a strong vision-language model) Prior mechanism to enhance the learning of important objects that captions may ignore in the videos. In practice, Cap2Sum can perform zero-shot video summarization or be fine-tuned by the ground-truth summary or video caption of the target dataset. To examine the performance of Cap2Sum after weakly-supervised fine-tuning by the video captions, we propose two new datasets, TVSum-Caption and SumMe-Caption, which are derived from two common video summarization datasets and will be publicly released. We conduct extensive experiments and the results demonstrate that our method achieves significant improvements in performance and generalization capacity compared with previous methods.

preprint · 2023 · arXiv

On the use of deep learning for phase recovery

Phase recovery (PR) refers to calculating the phase of the light field from its intensity measurements. In applications ranging from quantitative phase imaging and coherent diffraction imaging to adaptive optics, PR is essential for reconstructing the refractive index distribution or topography of an object and correcting the aberration of an imaging system. In recent years, deep learning (DL), often implemented through deep neural networks, has provided unprecedented support for computational imaging, leading to more efficient solutions for various PR problems. In this review, we first briefly introduce conventional methods for PR. Then, we review how DL provides support for PR at the following three stages: pre-processing, in-processing, and post-processing. We also review how DL is used in phase image processing. Finally, we summarize the work in DL for PR and give an outlook on how to better use DL to improve the reliability and efficiency of PR. Furthermore, we present a live-updating resource (https://github.com/kqwang/phase-recovery) for readers to learn more about PR.
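
As a rough illustration of what "calculating the phase from intensity measurements" can mean, the sketch below uses four-step phase-shifting interferometry. This is one textbook conventional technique chosen here for brevity; the abstract does not say which conventional methods the review covers. The function names, the background and modulation values, and the test phase are all hypothetical.

```python
import math


def simulate_intensities(phase, background=1.0, modulation=0.5):
    """Simulate four interferogram intensities at one pixel.

    Assumed model: I_n = background + modulation * cos(phase + n*pi/2),
    for reference phase shifts n = 0, 1, 2, 3.
    """
    return [
        background + modulation * math.cos(phase + n * math.pi / 2)
        for n in range(4)
    ]


def four_step_phase(i0, i1, i2, i3):
    """Recover the phase from four phase-shifted intensities.

    i3 - i1 = 2 * modulation * sin(phase)
    i0 - i2 = 2 * modulation * cos(phase)
    so atan2 of the two differences returns the phase in (-pi, pi].
    """
    return math.atan2(i3 - i1, i0 - i2)


# Hypothetical phase value used only to check the round trip.
true_phase = 0.7
recovered = four_step_phase(*simulate_intensities(true_phase))
print(f"true = {true_phase:.4f} rad, recovered = {recovered:.4f} rad")
```

The background and modulation terms cancel in the two differences, which is why a single detector that records only intensity can still yield the phase in this setting.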

preprint · 2020 · arXiv

Electronic bandstructure of in-plane ferroelectric van der Waals $β'-In_{2}Se_{3}$

Layered indium selenides ($In_{2}Se_{3}$) have recently been discovered to host robust out-of-plane and in-plane ferroelectricity in the $α$ and $β'$ phases, respectively. In this work, we utilise angle-resolved photoelectron spectroscopy to directly measure the electronic bandstructure of $β'-In_{2}Se_{3}$, and compare it to hybrid density functional theory (DFT) calculations. In agreement with DFT, we find the band structure is highly two-dimensional, with negligible dispersion along the c-axis. Due to n-type doping we are able to observe the conduction band minima, and directly measure the minimum indirect (0.97 eV) and direct (1.46 eV) bandgaps. We find the Fermi surface in the conduction band is characterized by anisotropic electron pockets with sharp in-plane dispersion about the $\overline{M}$ points, yielding effective masses of 0.21 $m_{0}$ along $\overline{KM}$ and 0.33 $m_{0}$ along $\overline{ΓM}$. The measured band structure is well supported by hybrid density functional theory calculations. The highly two-dimensional (2D) bandstructure with moderate bandgap and small effective mass suggests that $β'-In_{2}Se_{3}$ is a potentially useful new van der Waals semiconductor. This, together with its ferroelectricity, makes it a viable material for high-mobility ferroelectric-photovoltaic devices, with applications in non-volatile memory switching and renewable energy technologies.
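
To give a feel for the quoted effective masses, the sketch below evaluates a simple parabolic dispersion, E(k) = ħ²k² / (2 m*), measured from the conduction band minimum. The parabolic form and the wavevector offset of 0.1 Å⁻¹ are assumptions made for illustration, not values from the abstract; only the mass ratios 0.21 and 0.33 come from the text.

```python
# Physical constants in SI units.
HBAR = 1.054571817e-34      # reduced Planck constant, J s
M0 = 9.1093837015e-31       # free electron mass, kg
EV = 1.602176634e-19        # joules per electronvolt
ANGSTROM = 1e-10            # metres per angstrom


def parabolic_energy_ev(k_inv_angstrom, mass_ratio):
    """Energy above the band minimum for a parabolic band, in eV.

    k_inv_angstrom: wavevector offset from the minimum, in 1/angstrom.
    mass_ratio: effective mass in units of the free electron mass m0.
    """
    k = k_inv_angstrom / ANGSTROM  # convert to 1/m
    return (HBAR * k) ** 2 / (2 * mass_ratio * M0) / EV


# Hypothetical wavevector offset, chosen only for illustration.
k_offset = 0.1

energy_km = parabolic_energy_ev(k_offset, 0.21)   # mass quoted along KM
energy_gm = parabolic_energy_ev(k_offset, 0.33)   # mass quoted along Gamma-M
print(f"along KM:      {energy_km:.3f} eV")
print(f"along Gamma-M: {energy_gm:.3f} eV")
```

Under this approximation the lighter mass gives the steeper band, so at the same offset the energy along KM exceeds the energy along ΓM by the ratio 0.33 / 0.21.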