Researcher profile

Chutian Wang

Chutian Wang contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 15 - Unverified
Verification L1
Unclaimed author
3 works
0 followers
5 topics
4 close collaborators

Published work

3 published items

Preprint, 2026, arXiv

Cap2Sum: Learning to Summarize Videos by Generating Captions

With the rapid growth of video data on the internet, video summarization is becoming a very important AI technology. However, because labelling video summaries is costly, existing studies have had to be conducted on small-scale datasets, which limits performance and generalization capacity. In this work, we introduce the use of dense video captions as a supervision signal to train video summarization models. To exploit dense video caption annotations, we propose Cap2Sum, a model that learns to summarize videos by generating captions. This weakly-supervised approach allows us to train the models on large-scale dense video caption datasets to achieve better performance and generalization capacity. To further improve the generalization capacity, we introduce a CLIP Prior mechanism (CLIP being a strong vision-language model) to enhance the learning of important objects that captions may ignore in the videos. In practice, Cap2Sum can perform zero-shot video summarization or be fine-tuned on the ground-truth summaries or video captions of the target dataset. To examine the performance of Cap2Sum after weakly-supervised fine-tuning on video captions, we propose two new datasets, TVSum-Caption and SumMe-Caption, which are derived from two common video summarization datasets and will be publicly released. We conduct extensive experiments, and the results demonstrate that our method achieves significant improvements in performance and generalization capacity compared with previous methods.

Preprint, 2023, arXiv

On the use of deep learning for phase recovery

Phase recovery (PR) refers to calculating the phase of the light field from its intensity measurements. In applications ranging from quantitative phase imaging and coherent diffraction imaging to adaptive optics, PR is essential for reconstructing the refractive index distribution or topography of an object and for correcting the aberration of an imaging system. In recent years, deep learning (DL), often implemented through deep neural networks, has provided unprecedented support for computational imaging, leading to more efficient solutions for various PR problems. In this review, we first briefly introduce conventional methods for PR. Then, we review how DL provides support for PR at three stages: pre-processing, in-processing, and post-processing. We also review how DL is used in phase image processing. Finally, we summarize the work in DL for PR and offer an outlook on how to better use DL to improve the reliability and efficiency of PR. Furthermore, we present a live-updating resource (https://github.com/kqwang/phase-recovery) for readers to learn more about PR.

Preprint, 2020, arXiv

Electronic bandstructure of in-plane ferroelectric van der Waals $\beta'$-In$_{2}$Se$_{3}$

Layered indium selenides (In$_{2}$Se$_{3}$) have recently been discovered to host robust out-of-plane and in-plane ferroelectricity in the $\alpha$ and $\beta'$ phases, respectively. In this work, we utilise angle-resolved photoelectron spectroscopy to directly measure the electronic bandstructure of $\beta'$-In$_{2}$Se$_{3}$, and compare it to hybrid density functional theory (DFT) calculations. In agreement with DFT, we find the band structure is highly two-dimensional, with negligible dispersion along the c-axis. Due to n-type doping we are able to observe the conduction band minima, and directly measure the minimum indirect (0.97 eV) and direct (1.46 eV) bandgaps. We find the Fermi surface in the conduction band is characterized by anisotropic electron pockets with sharp in-plane dispersion about the $\overline{M}$ points, yielding effective masses of 0.21 $m_{0}$ along $\overline{KM}$ and 0.33 $m_{0}$ along $\overline{\Gamma M}$. The measured band structure is well supported by hybrid density functional theory calculations. The highly two-dimensional (2D) bandstructure with moderate bandgap and small effective mass suggests that $\beta'$-In$_{2}$Se$_{3}$ is a potentially useful new van der Waals semiconductor. This, together with its ferroelectricity, makes it a viable material for high-mobility ferroelectric-photovoltaic devices, with applications in non-volatile memory switching and renewable energy technologies.