Source author record

Daqing Liu

Daqing Liu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Computer Vision, cond-mat.mes-hall, cond-mat.str-el, Computation and Language, cond-mat.other, cond-mat.quant-gas, physics.class-ph, physics.gen-ph

Catalog footprint

What is connected

12 works

8 topics

4 close collaborators


Preprint, 2022, arXiv

Modeling Image Composition for Complex Scene Generation

We present a method that achieves state-of-the-art results on challenging (few-shot) layout-to-image generation tasks by accurately modeling the textures, structures and relationships contained in a complex scene. After compressing RGB images into patch tokens, we propose the Transformer with Focal Attention (TwFA) for exploring object-to-object, object-to-patch and patch-to-patch dependencies. Compared to existing CNN-based and Transformer-based generation models, which entangle modeling at the pixel and patch levels and at the object and patch levels respectively, the proposed focal attention predicts the current patch token by focusing only on the highly related tokens that are specified by the spatial layout, thereby achieving disambiguation during training. Furthermore, the proposed TwFA greatly increases data efficiency during training; we therefore propose the first few-shot complex scene generation strategy based on the well-trained TwFA. Comprehensive experiments show the superiority of our method, which significantly improves both quantitative metrics and qualitative visual realism with respect to state-of-the-art CNN-based and Transformer-based methods. Code is available at https://github.com/JohnDreamer/TwFA.

Preprint, 2020, arXiv

Joint Visual Grounding with Language Scene Graphs

Visual grounding is the task of grounding referring expressions in images, e.g., localizing "the white truck in front of the yellow one". To resolve this task fundamentally, the model should first find the contextual objects (e.g., the "yellow" truck) and then exploit them to disambiguate the referent from other similar objects by using the attributes and relationships (e.g., "white", "yellow", "in front of"). However, due to the lack of annotations on contextual objects and their relationships, existing methods reduce this joint grounding process to a holistic association between the expression and regions, and thus suffer from unsatisfactory performance and limited interpretability. In this paper, we alleviate the missing-annotation problem and enable joint reasoning by leveraging the language scene graph, which covers both the labeled referent and the unlabeled contexts (other objects, attributes, and relationships). Specifically, the language scene graph is a graphical representation in which the nodes are objects with attributes and the edges are relationships. We construct a factor graph based on it and then perform marginalization over the graph, so that we can ground both the referent and the contexts on corresponding image regions to achieve joint visual grounding (JVG). Experimental results demonstrate that the proposed approach is effective and interpretable: on three benchmarks, it outperforms the state-of-the-art methods while offering a complete grounding of all the objects mentioned in the referring expression.
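As a rough illustration of what marginalizing over such a factor graph can look like, the sketch below uses a toy two-node graph (one referent node and one context node) over three candidate regions. The function name `marginals`, the potentials, and all numbers are hypothetical and chosen only for illustration; they are not taken from the paper, whose actual graph construction and inference procedure may differ.

```python
import numpy as np

def marginals(unary_ref, unary_ctx, pairwise):
    """Brute-force marginals for a toy two-node factor graph.

    unary_ref[i]   : score that region i matches the referent node
    unary_ctx[j]   : score that region j matches the context node
    pairwise[i, j] : score that regions (i, j) satisfy the relationship edge
    All values are made-up, non-negative potentials (an assumption).
    """
    # Joint distribution over (referent region, context region).
    joint = unary_ref[:, None] * unary_ctx[None, :] * pairwise
    joint = joint / joint.sum()
    # Sum out the other variable to get each node's marginal.
    return joint.sum(axis=1), joint.sum(axis=0)

# Regions 0 and 1 look equally like the referent; region 2 looks like the context.
unary_ref = np.array([0.5, 0.5, 0.1])
unary_ctx = np.array([0.1, 0.2, 0.9])
# Only the pair (region 0, region 2) strongly satisfies the relationship.
pairwise = np.array([[0.1, 0.1, 0.9],
                     [0.1, 0.1, 0.2],
                     [0.3, 0.3, 0.1]])

p_ref, p_ctx = marginals(unary_ref, unary_ctx, pairwise)
ref_region = int(np.argmax(p_ref))
ctx_region = int(np.argmax(p_ctx))
print(ref_region, ctx_region)
```

In this toy setting the two candidate referents are tied on their own scores, and it is the relationship to the context region that separates them, which is the kind of disambiguation the abstract describes.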

Preprint, 2020, arXiv

Learning to Discretely Compose Reasoning Module Networks for Video Captioning

Generating natural language descriptions for videos, i.e., video captioning, essentially requires step-by-step reasoning along the generation process. For example, to generate the sentence "a man is shooting a basketball", we need to first locate and describe the subject "man", next reason that the man is "shooting", and then describe the object "basketball" of the shooting. However, existing visual reasoning methods designed for visual question answering are not appropriate for video captioning, because it requires more complex visual reasoning on videos over both space and time, as well as dynamic module composition along the generation process. In this paper, we propose a novel visual reasoning approach for video captioning, named Reasoning Module Networks (RMN), to equip the existing encoder-decoder framework with this reasoning capacity. Specifically, our RMN employs 1) three sophisticated spatio-temporal reasoning modules, and 2) a dynamic and discrete module selector trained by a linguistic loss with a Gumbel approximation. Extensive experiments on the MSVD and MSR-VTT datasets demonstrate that the proposed RMN outperforms the state-of-the-art methods while providing an explicit and explainable generation process. Our code is available at https://github.com/tgc1997/RMN.
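The abstract does not spell out the Gumbel approximation, so the following is only a minimal sketch of one common variant, Gumbel-softmax sampling, applied to a choice among three modules. The function name `gumbel_softmax`, the logits, and the temperature `tau` are assumptions for illustration and are not taken from the RMN code.

```python
import numpy as np

def gumbel_softmax(logits, tau, rng):
    """Return (soft, hard): a relaxed sample and its one-hot argmax.

    logits : unnormalized scores, one per candidate module
    tau    : temperature; smaller values push `soft` closer to one-hot
    rng    : a numpy random Generator
    """
    # Gumbel(0, 1) noise via the inverse-CDF trick; the lower bound avoids log(0).
    u = rng.uniform(low=1e-10, high=1.0, size=logits.shape)
    g = -np.log(-np.log(u))
    y = (logits + g) / tau
    y = y - y.max()  # subtract the max for numerical stability
    soft = np.exp(y) / np.exp(y).sum()
    # Discrete choice used in the forward pass of a straight-through setup.
    hard = np.zeros_like(soft)
    hard[np.argmax(soft)] = 1.0
    return soft, hard

rng = np.random.default_rng(0)
logits = np.array([1.0, 0.2, -0.5])  # hypothetical scores for three modules
soft, hard = gumbel_softmax(logits, tau=0.5, rng=rng)
print(soft, hard)
```

In a differentiable framework, the one-hot `hard` vector would typically select the module in the forward pass while gradients flow through `soft`; this NumPy version only shows the sampling step.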

Preprint, 2020, arXiv

More Grounded Image Captioning by Distilling Image-Text Matching Model

Visual attention not only improves the performance of image captioners, but also serves as a visual interpretation to qualitatively measure the caption rationality and model transparency. Specifically, we expect that a captioner can fix its attentive gaze on the correct objects while generating the corresponding words. This ability is also known as grounded image captioning. However, the grounding accuracy of existing captioners is far from satisfactory, and collecting word-region alignments as strong supervision to improve the grounding accuracy while retaining the captioning quality is expensive. To this end, we propose a Part-of-Speech (POS) enhanced image-text matching model (SCAN; Lee et al., 2018), POS-SCAN, as an effective form of knowledge distillation for more grounded image captioning. The benefits are two-fold: 1) given a sentence and an image, POS-SCAN can ground the objects more accurately than SCAN; 2) POS-SCAN serves as a word-region alignment regularization for the captioner's visual attention module. Through benchmark experiments, we demonstrate that conventional image captioners equipped with POS-SCAN can significantly improve the grounding accuracy without strong supervision. Last but not least, we explore the indispensable Self-Critical Sequence Training (SCST; Rennie et al., 2017) in the context of grounded image captioning and show that the image-text matching score can serve as a reward for more grounded captioning (see https://github.com/YuanEZhou/Grounded-Image-Captioning).

Preprint, 2016, arXiv

Collective fermion excitation in a warm massless Dirac system

Based on a self-consistent method, we predict theoretically that there occurs not only a normal (quasi) fermion mode but also a collective fermion mode, the plasmino, in a warm 2D massless Dirac system, especially in a warm intrinsic graphene system. Results for the Landau damping show that both the fermion and the plasmino are well-defined modes. We find that there are sharp differences between the discussed system and the QCD/QED system. Firstly, the thermal mass is proportional to $α_g^{3/4}T$ rather than $α_g T$. Secondly, for $0<q<q_c$, the fermion channel and the plasmino channel are nearly degenerate, and the energy difference between the fermion and the plasmino grows with increasing $q$ in the region $q>q_c$. Thirdly, the fermion behaves as a "relativistic particle" with nonzero mass, and the plasmino exhibits an anomalous dispersion at moderate momentum.

Preprint, 2016, arXiv

Low-energy quantum scattering induced by graphene ripples

We report a quantum study of the carrier scattering induced by graphene ripples. Crucial differences between ripple-induced scattering and ordinary scattering are found. In contrast to ordinary scattering, for which the Born approximation is valid for high-energy processes, for ripple-induced scattering it is valid for low-energy processes over a quite broad energy range. Furthermore, for ripples with polar symmetry, the scattering amplitude exhibits a pseudo-spin structure, an additional factor $\cos(θ/2)$, which leads to an absence of backward scattering. We also show that the scattering cross sections are proportional to the cube of the incident carrier's energy.
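The link between the $\cos(θ/2)$ factor and the absence of backward scattering can be made explicit with a short worked step. Here $f_0(\theta)$ is a placeholder for the remaining angular dependence of the amplitude; it is an assumption introduced for illustration, is taken to be finite at $\theta=\pi$, and is not a quantity defined in the abstract.

$$
f(\theta) = \cos\!\left(\frac{\theta}{2}\right) f_0(\theta)
\quad\Longrightarrow\quad
\frac{d\sigma}{d\theta} = |f(\theta)|^2 = \cos^2\!\left(\frac{\theta}{2}\right) |f_0(\theta)|^2 ,
\qquad
\left.\frac{d\sigma}{d\theta}\right|_{\theta=\pi} = \cos^2\!\left(\frac{\pi}{2}\right) |f_0(\pi)|^2 = 0 .
$$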

Preprint, 2014, arXiv

Electron spin-orbit interaction in helically coiled carbon nanotube

Recent theoretical and experimental works on carbon nanotubes (CNTs) have revealed that the spin-orbit interaction (SOI) is more robust than previously thought. Motivated by this, we investigate the SOI in helically coiled CNTs. Calculations are performed within the tight-binding model with the inclusion of a four-orbital basis set; thereby the full symmetry of the helical lattice and the hybridization of the $π$ and $σ$ bands are considered. By virtue of a unitary transformation and a perturbation approach, we obtain the analytic solution for the torsion-dependent SOI in helically coiled CNTs. Due to the enhancement from curvature and torsion, the calculated SOI values reach the order of meV, which has been confirmed by ab initio electronic structure calculations.

Preprint, 2012, arXiv

Faraday rotation effect in periodic graphene structure

We report the magneto-optical rotation effect in a periodic graphene-sheet structure. Due to the masslessness of carriers in graphene, the magnetic response is very sensitive and the magneto-optical rotation effect is therefore significant. We predict that the Verdet constant of the periodic graphene-sheet structure is roughly 10-100 times that of rare-earth-doped magneto-optical glass in the infrared region.
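For context, the Verdet constant is conventionally defined through the Faraday rotation angle. The symbols below ($\theta_F$ for the rotation angle, $B$ for the magnetic flux density along the propagation direction, $d$ for the path length in the medium, and the subscripts on $V$) are the usual textbook ones and are assumptions here, since the abstract does not define them.

$$
\theta_F = V\,B\,d ,
\qquad
\frac{V_{\text{graphene structure}}}{V_{\text{doped glass}}} \sim 10\text{ to }100 \quad \text{(infrared region, as predicted above)} .
$$

At equal field and path length, the predicted rotation angle would therefore be larger by the same factor.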

Preprint, 2011, arXiv

A covering theory of special relativity

Under the assumption that the closed-path velocity of light is invariant, we derive both the general expression for the velocity of light in an ordinary inertial reference frame and the generalized Lorentz transformation between the ordinary inertial reference frame and the absolute (privileged) reference frame. Although this assumption cannot determine the theory unambiguously, some significant results can still be obtained from it. Furthermore, the study shows that the relativity of simultaneity is not a universal concept.

Preprint, 2010, arXiv

Anomalous Valley Magnetic Moment of Graphene

Carrier interactions in graphene are studied. The study shows that besides the well-known Coulomb repulsion between carriers, there also exist four-fermion interactions associated with the U-process, one of which attracts carriers in different valleys. We then explicitly calculate the contributions to the valley magnetic moment from the vertex correction and from the four-fermion corrections. The relative contributions are -18% and 3%, respectively. Finally, we point out that a heavy quarkonium system can be mimicked by carrier interactions in graphene.

Preprint, 2009, arXiv

On wave functional in QED

In a discrete form of second quantization, the gauge independence of all the physical states in QED, including the vacuum, is re-examined through a new approach. We also discuss an interesting phenomenon attributed to a vacuum effect and propose a procedure to produce general physical states.

Preprint, 2008, arXiv

Kramers-Kronig relation of graphene conductivity

Utilizing a complete Lorentz-covariant and local-gauge-invariant formulation, we discuss the response of graphene to an arbitrary external electric field. The relation between the imaginary part and the real part of the ac conductivity, which is called the Kramers-Kronig relation in the paper, is given. We point out that there exists an ambiguity in computing the conductivity, attributed to the wick behavior at ultraviolet vicinity. We argue that to study the electrical response of graphene completely, non-perturbative contributions should be considered.
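For orientation, the textbook Kramers-Kronig relations for a causal linear response function are reproduced below. This is the standard form, not necessarily the relation derived in the paper. It assumes the $e^{-i\omega t}$ time convention, that $\sigma(\omega) = \sigma_1(\omega) + i\,\sigma_2(\omega)$ is analytic in the upper half of the complex $\omega$ plane, and that it vanishes as $|\omega| \to \infty$; $\mathcal{P}$ denotes the Cauchy principal value.

$$
\sigma_1(\omega) = \frac{1}{\pi}\,\mathcal{P}\!\int_{-\infty}^{\infty} \frac{\sigma_2(\omega')}{\omega' - \omega}\,d\omega' ,
\qquad
\sigma_2(\omega) = -\frac{1}{\pi}\,\mathcal{P}\!\int_{-\infty}^{\infty} \frac{\sigma_1(\omega')}{\omega' - \omega}\,d\omega' .
$$

If the large-frequency decay assumption fails, these integrals need subtractions, which is one way an ultraviolet ambiguity of the kind mentioned in the abstract could arise.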
