Source author record

Guorong Li

Guorong Li appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Computer Vision, cond-mat.supr-con, cond-mat.mtrl-sci, cond-mat.str-el, Computer Science and Game Theory, cond-mat.other, Cryptography and Security, Machine Learning

Catalog footprint

What is connected

11 works

8 topics

4 close collaborators

preprint, 2022, arXiv

A Tale of HodgeRank and Spectral Method: Target Attack Against Rank Aggregation Is the Fixed Point of Adversarial Game

Rank aggregation with pairwise comparisons has shown promising results in elections, sports competitions, recommendations, and information retrieval. However, little attention has been paid to the security of such algorithms, in contrast to the numerous studies of their computational and statistical characteristics. Driven by huge profits, a potential adversary has strong incentives to manipulate the ranking list. Meanwhile, the intrinsic vulnerability of rank aggregation methods is not well studied in the literature. To fully understand the possible risks, this paper focuses on a purposeful adversary who wants to dictate the aggregated result by modifying the pairwise data. From the perspective of dynamical systems, the attack behavior with a target ranking list is a fixed point of the composition of the adversary and the victim. To perform the targeted attack, we formulate the interaction between the adversary and the victim as a game-theoretic framework consisting of two continuous operators, for which a Nash equilibrium is established. Two procedures against HodgeRank and RankCentrality are then constructed to produce the modification of the original data. Furthermore, we prove that the victims will produce the target ranking list once the adversary has complete information. Notably, the proposed methods also allow an adversary who holds only incomplete information or imperfect feedback to perform the purposeful attack. The effectiveness of the suggested target attack strategies is demonstrated by a series of toy simulations and several real-world data experiments. These experimental results show that the proposed methods can achieve the attacker's goal, in the sense that the leading candidate of the perturbed ranking list is the one designated by the adversary.
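
To make the setting concrete, here is a minimal sketch of the victim side only: a least-squares (HodgeRank-style) aggregation of pairwise comparisons, run once on clean data and once on hand-edited data. The function name `hodgerank`, the tuple layout `(i, j, y, w)` and the toy comparisons are assumptions made for illustration. The edited data is chosen by hand and is not the attack procedure proposed in the paper.

```python
import numpy as np

def hodgerank(n_items, comparisons):
    """Least-squares scores from pairwise data.

    comparisons: iterable of (i, j, y, w), read as "item i beats item j
    by margin y, with weight w". Minimizes sum of w * (s_i - s_j - y)^2.
    """
    laplacian = np.zeros((n_items, n_items))
    rhs = np.zeros(n_items)
    for i, j, y, w in comparisons:
        laplacian[i, i] += w
        laplacian[j, j] += w
        laplacian[i, j] -= w
        laplacian[j, i] -= w
        rhs[i] += w * y
        rhs[j] -= w * y
    # Scores are only defined up to an additive constant, so the Laplacian
    # is singular; the pseudo-inverse returns the zero-mean solution.
    return np.linalg.pinv(laplacian) @ rhs

# Hypothetical toy data on three candidates.
clean = [(0, 1, 1.0, 1.0), (1, 2, 1.0, 1.0), (0, 2, 2.0, 1.0)]
# Hand-edited copy in which the comparisons involving candidate 2 are changed.
tampered = [(0, 1, 1.0, 1.0), (1, 2, -1.0, 1.0), (0, 2, -4.0, 1.0)]

clean_scores = hodgerank(3, clean)
tampered_scores = hodgerank(3, tampered)
print(int(np.argmax(clean_scores)), int(np.argmax(tampered_scores)))  # prints "0 2"
```

In this toy case, changing two of the three comparisons moves the leading candidate from 0 to 2, which is the kind of outcome the abstract describes as the attacker's goal.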

preprint, 2022, arXiv

Hierarchical Modular Network for Video Captioning

Video captioning aims to generate natural language descriptions according to the content, where representation learning plays a crucial role. Existing methods are mainly developed within the supervised learning framework via word-by-word comparison of the generated caption against the ground-truth text without fully exploiting linguistic semantics. In this work, we propose a hierarchical modular network to bridge video representations and linguistic semantics from three levels before generating captions. In particular, the hierarchy is composed of: (I) Entity level, which highlights objects that are most likely to be mentioned in captions. (II) Predicate level, which learns the actions conditioned on highlighted objects and is supervised by the predicate in captions. (III) Sentence level, which learns the global semantic representation and is supervised by the whole caption. Each level is implemented by one module. Extensive experimental results show that the proposed method performs favorably against the state-of-the-art models on the two widely-used benchmarks: MSVD 104.0% and MSR-VTT 51.5% in CIDEr score.

preprint, 2022, arXiv

Multi-Attention Network for Compressed Video Referring Object Segmentation

Referring video object segmentation aims to segment the object referred to by a given language expression. Existing works typically require the compressed video bitstream to be decoded to RGB frames before segmentation, which increases computation and storage requirements and ultimately slows inference down. This may hamper its application in real-world scenarios with limited computing resources, such as autonomous cars and drones. To alleviate this problem, in this paper, we explore the referring object segmentation task on compressed videos, namely on the original video data flow. Besides the inherent difficulty of the video referring object segmentation task itself, obtaining a discriminative representation from compressed video is also rather challenging. To address this problem, we propose a multi-attention network which consists of a dual-path dual-attention module and a query-based cross-modal Transformer module. Specifically, the dual-path dual-attention module is designed to extract an effective representation from compressed data in three modalities, i.e., I-frame, Motion Vector and Residual. The query-based cross-modal Transformer first models the correlation between linguistic and visual modalities, and then the fused multi-modality features are used to guide object queries to generate a content-aware dynamic kernel and to predict final segmentation masks. Different from previous works, we propose to learn just one kernel, which removes the complicated post mask-matching procedure of existing methods. Extensive experimental results on three challenging datasets show the effectiveness of our method compared against several state-of-the-art methods which are proposed for processing RGB data. Source code is available at: https://github.com/DexiangHong/MANet.

preprint, 2022, arXiv

Object Localization under Single Coarse Point Supervision

Point-based object localization (POL), which pursues high-performance object sensing under low-cost data annotation, has attracted increased attention. However, the point annotation mode inevitably introduces semantic variance because of the inconsistency of annotated points. Existing POL methods rely heavily on accurate key-point annotations, which are difficult to define. In this study, we propose a POL method using coarse point annotations, relaxing the supervision signals from accurate key points to freely spotted points. To this end, we propose a coarse point refinement (CPR) approach, which to the best of our knowledge is the first attempt to alleviate semantic variance from an algorithmic perspective. CPR constructs point bags, selects semantic-correlated points, and produces semantic center points through multiple instance learning (MIL). In this way, CPR defines a weakly supervised evolution procedure, which ensures that a high-performance object localizer can be trained under coarse point supervision. Experimental results on COCO, DOTA and our proposed SeaPerson dataset validate the effectiveness of the CPR approach. The dataset and code will be available at https://github.com/ucas-vg/PointTinyBenchmark/.

preprint, 2021, arXiv

Anti-UAV: A Large Multi-Modal Benchmark for UAV Tracking

Unmanned aerial vehicles (UAVs) offer many applications in both commerce and recreation. Monitoring the operation status of UAVs is therefore crucially important. In this work, we consider the task of tracking UAVs, providing rich information such as location and trajectory. To facilitate research on this topic, we propose a dataset, Anti-UAV, with more than 300 video pairs containing over 580k manually annotated bounding boxes. The release of such a large-scale dataset could be a useful initial step in research on tracking UAVs. Furthermore, progress on the research challenges in Anti-UAV can help the design of anti-UAV systems, leading to better surveillance of UAVs. In addition, a novel approach named dual-flow semantic consistency (DFSC) is proposed for UAV tracking. Modulated by the semantic flow across video sequences, the tracker learns more robust class-level semantic information and obtains more discriminative instance-level features. Experimental results demonstrate that Anti-UAV is very challenging, and the proposed method can effectively improve the tracker's performance. The Anti-UAV benchmark and the code of the proposed approach will be publicly available at https://github.com/ucas-vg/Anti-UAV.

preprint, 2020, arXiv

Siamese Box Adaptive Network for Visual Tracking

Most existing trackers rely on either a multi-scale searching scheme or pre-defined anchor boxes to accurately estimate the scale and aspect ratio of a target. Unfortunately, they typically call for tedious and heuristic configurations. To address this issue, we propose a simple yet effective visual tracking framework (named Siamese Box Adaptive Network, SiamBAN) by exploiting the expressive power of the fully convolutional network (FCN). SiamBAN views the visual tracking problem as a parallel classification and regression problem, and thus directly classifies objects and regresses their bounding boxes in a unified FCN. The no-prior box design avoids hyper-parameters associated with the candidate boxes, making SiamBAN more flexible and general. Extensive experiments on visual tracking benchmarks including VOT2018, VOT2019, OTB100, NFS, UAV123, and LaSOT demonstrate that SiamBAN achieves state-of-the-art performance and runs at 40 FPS, confirming its effectiveness and efficiency. The code will be available at https://github.com/hqucv/siamban.
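
The idea of classifying each location and regressing a box directly, without anchor boxes, can be sketched as below. This is a minimal illustration under stated assumptions, not the SiamBAN implementation: it assumes the regression branch outputs four distances (left, top, right, bottom) from each location to the box sides, a parametrization the abstract does not spell out. The function name `decode_best_box`, the `stride` value and the toy arrays are hypothetical.

```python
import numpy as np

def decode_best_box(cls_score, reg, stride=8):
    """Pick the highest-scoring location and decode its box.

    cls_score: (H, W) foreground scores from the classification branch.
    reg: (4, H, W) assumed distances (left, top, right, bottom), in pixels,
         from each location to the four sides of the target box.
    stride: assumed spacing in pixels between neighboring map locations.
    Returns (x0, y0, x1, y1) in search-image pixel coordinates.
    """
    iy, ix = np.unravel_index(np.argmax(cls_score), cls_score.shape)
    px, py = ix * stride, iy * stride  # map location to image coordinates
    left, top, right, bottom = reg[:, iy, ix]
    return (px - left, py - top, px + right, py + bottom)

# Hypothetical 3x3 output maps with one confident location at row 1, column 2.
cls_score = np.zeros((3, 3))
cls_score[1, 2] = 0.9
reg = np.zeros((4, 3, 3))
reg[:, 1, 2] = [4.0, 6.0, 10.0, 2.0]

box = decode_best_box(cls_score, reg)
```

Because the box comes straight from the four regressed distances, no anchor scales or aspect ratios need to be configured, which is the flexibility the abstract refers to.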

preprint, 2016, arXiv

Hidden Phases Revealed at the Surface of Double-Layered Sr3(Ru1-xMnx)2O7

Double-layered Sr3Ru2O7 has received considerable attention because it exhibits a plethora of exotic phases when perturbed. New phases emerge with the application of pressure, magnetic field, or doping. Here we show, by investigating the surface properties of Sr3(Ru1-xMnx)2O7, that creating a surface is an alternative and effective way to reveal hidden phases that are different from those seen in the bulk. Driven by the tilt distortion of RuO6 octahedra, the surface of Sr3Ru2O7 is less metallic than the bulk. In contrast, because of the vanishing of tilt and enhanced rotation with Mn-doping, the surface of Sr3(Ru0.84Mn0.16)2O7 is metallic while the bulk is insulating. Our result demonstrates that the electronic and structural properties at the surface are intimately coupled and consistent with quasi two-dimensional character.

preprint, 2016, arXiv

Interrogating the superconductor Ca10(Pt4As8)(Fe2-xPtxAs2)5 Layer-by-layer

Ever since the discovery of high-Tc superconductivity in layered cuprates, the roles that individual layers play have been debated, due to the difficulty of layer-by-layer characterization. While there is a similar challenge in many Fe-based layered superconductors, the newly discovered Ca10(Pt4As8)(Fe2As2)5 provides opportunities to explore superconductivity layer by layer, because it contains both superconducting building blocks (Fe2As2 layers) and intermediate Pt4As8 layers. Cleaving a single crystal under ultra-high vacuum results in multiple terminations: an ordered Pt4As8 layer, two reconstructed Ca layers on top of a Pt4As8 layer, and a disordered Ca layer on top of an Fe2As2 layer. The electronic properties of individual layers are studied using scanning tunneling microscopy/spectroscopy (STM/S), which reveals different spectra for each surface. Remarkably, superconducting coherence peaks are seen only on the ordered Ca/Pt4As8 layer. Our results indicate that an ordered structure with proper charge balance is required in order to preserve superconductivity.

preprint, 2013, arXiv

Large and temperature-independent piezoelectric response in Pb(Mg1/3Nb2/3)O3-BaTiO3-PbTiO3

The temperature dependence of the elastic, dielectric, and piezoelectric properties of (65-x)Pb(Mg1/3Nb2/3)O3-xBaTiO3-35PbTiO3 ceramics with x=0, 1, 2, 3, and 4 was investigated. The compound with x=2 was found to exhibit a large piezoelectric response (d31=-170 pC/N, d33=530 pC/N at 300 K). In particular, its d31 value was nearly constant over a temperature range from 185 to 360 K. A broad ferroelectric phase transition tuned by BaTiO3 doping was deduced from the dielectric constant, elastic compliance constant and Raman spectra. The temperature-stable piezoelectric response was attributed to the counter-balance of contributions from the dielectric and elastic responses.

preprint, 2010, arXiv

BaFe2As2 Surface Domains and Domain Walls: Mirroring the Bulk Spin Structure

High-resolution scanning tunneling microscopy (STM) measurements on BaFe2As2, one of the parent compounds of the iron-based superconductors, reveal a (1x1) As-terminated unit cell on the (001) surface. However, there are significant differences between the surface unit cell and the bulk: only one of the two As atoms in the unit cell is imaged, and domain walls between different (1x1) regions display a C2 symmetry at the surface. It should have been C2v if the STM image reflected the geometric structure of the surface or the orthorhombic bulk. The inequivalent As atoms and the bias dependence of the domain walls indicate that the origin of the STM image is primarily electronic, not geometric. We argue that the surface electronic topography mirrors the bulk spin structure of BaFe2As2, via strong orbital-spin coupling.

preprint, 2009, arXiv

Surface Geometric and Electronic Structure of BaFe2As2(001)

BaFe2As2 exhibits properties characteristic of the parent compounds of the newly discovered iron (Fe)-based high-TC superconductors. By combining the real-space imaging of scanning tunneling microscopy/spectroscopy (STM/S) with momentum-space quantitative Low Energy Electron Diffraction (LEED), we have identified the surface plane of cleaved BaFe2As2 crystals as the As-terminated Fe-As layer - the plane where superconductivity occurs. LEED and STM/S data on the BaFe2As2(001) surface indicate an ordered arsenic (As)-terminated metallic surface without reconstruction or lattice distortion. It is surprising that the STM images the different Fe-As orbitals associated with the orthorhombic structure, not the As atoms in the surface plane.
