Source author record

Yiru Wang

Yiru Wang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Topics: Computer Vision; Signal Processing (eess.SP); Computation and Language; Artificial Intelligence; Cryptography and Security; Machine Learning

Catalog footprint

9 works, 6 topics, 4 close collaborators


preprint 2026 arXiv

Unified Map Prior Encoder for Mapping and Planning

Online mapping and end-to-end (E2E) planning in autonomous driving remain largely sensor-centric, leaving rich map priors, including HD/SD vector maps, rasterized SD maps, and satellite imagery, underused because of heterogeneity, pose drift, and inconsistent availability at test time. We present UMPE, a Unified Map Prior Encoder that can ingest any subset of four priors and fuse them with BEV features for both mapping and planning. UMPE has two branches. The vector encoder pre-aligns HD/SD polylines with a frame-wise SE(2) correction, encodes points via multi-frequency sinusoidal features, and produces polyline tokens with confidence scores. BEV queries then apply cross-attention with confidence bias, followed by normalized channel-wise gating to avoid length imbalance and softly down-weight uncertain sources. The raster encoder shares a ResNet-18 backbone conditioned by FiLM with scaling and shift at every stage, performs SE(2) micro-alignment, and injects priors through zero-initialized residual fusion, so the network starts from a do-no-harm baseline and learns to add only useful prior evidence. A vector-then-raster fusion order reflects the inductive bias of geometry first, appearance second. On nuScenes mapping, UMPE lifts MapTRv2 from 61.5 to 67.4 mAP (+5.9) and MapQR from 66.4 to 71.7 mAP (+5.3). On Argoverse2, UMPE adds +4.1 mAP over strong baselines. UMPE is compositional: when trained with all priors, it outperforms single-prior models even when only one prior is available at test time, demonstrating powerset robustness. For E2E planning with the VAD backbone on nuScenes, UMPE reduces trajectory error from 0.72 to 0.42 m L2 on average (-0.30 m) and collision rate from 0.22% to 0.12% (-0.10%), surpassing recent prior-injection methods. These results show that a unified, alignment-aware treatment of heterogeneous map priors yields better mapping and better planning.
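Three of the building blocks named in this abstract can be illustrated with a minimal sketch: multi-frequency sinusoidal point features, FiLM scale-and-shift, and zero-initialized residual fusion. This is not the UMPE implementation. The function names, the power-of-two frequency schedule, and the use of a single scalar fusion weight (standing in for a learned, zero-initialized projection) are assumptions made for illustration.

```python
import math

def sinusoidal_features(x, y, num_freqs=4):
    """Encode a 2D point with sin/cos pairs at frequencies 2^0 .. 2^(num_freqs-1).

    The power-of-two schedule is an assumption; the abstract only says
    "multi-frequency sinusoidal features".
    """
    feats = []
    for k in range(num_freqs):
        freq = 2.0 ** k
        for coord in (x, y):
            feats.append(math.sin(freq * coord))
            feats.append(math.cos(freq * coord))
    return feats

def film(features, gamma, beta):
    """FiLM conditioning: scale each channel by gamma and shift it by beta."""
    return [g * f + b for f, g, b in zip(features, gamma, beta)]

def residual_fusion(bev, prior, weight=0.0):
    """Add weight * prior to the BEV features.

    With weight == 0 (the initial value) the output equals the input, which
    is the "do-no-harm" starting point; training would move weight away
    from zero only if the prior helps.
    """
    return [b + weight * p for b, p in zip(bev, prior)]

if __name__ == "__main__":
    print(len(sinusoidal_features(0.3, -1.2)))           # 4 values per frequency
    print(film([1.0, 2.0], [2.0, 0.5], [0.1, 0.0]))
    print(residual_fusion([1.0, 2.0], [5.0, 5.0]))       # unchanged at init
```

At initialization the fusion step returns the BEV features unchanged, so adding the prior branch cannot degrade the sensor-only baseline before training begins.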

preprint 2022 arXiv

Energy Efficiency Maximization of Simultaneous Transmission and Reflection RIS Assisted Full-Duplex Communications

This work studies the effectiveness of a novel simultaneous transmission and reflection reconfigurable intelligent surface (STAR-RIS) aided Full-Duplex (FD) communication system. We aim to maximize the energy efficiency by jointly optimizing the transmit power and the passive beamforming at the STAR-RIS. We propose an efficient algorithm that optimizes them iteratively under the alternating optimization framework. Successive convex approximation (SCA) and Dinkelbach's method are used to solve the power optimization subproblem. A penalty-based method is used to design the passive beamforming at the STAR-RIS. Numerical results verify the convergence and effectiveness of the proposed algorithm, and further reveal the benefits of combining the STAR-RIS with FD communication compared to benchmarks.
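Dinkelbach's method, mentioned above, turns a fractional objective such as energy efficiency (rate divided by power) into a sequence of subtractive problems. The sketch below applies it to a toy single-link problem, not the paper's system model: the rate function `log2(1 + gain * p)`, the circuit power term, and all parameter names are assumptions. Because this toy problem is concave, the inner step has a closed form and no SCA step is needed.

```python
import math

def rate(p, gain):
    """Toy achievable rate in bit/s/Hz for transmit power p (assumed model)."""
    return math.log2(1.0 + gain * p)

def best_power(lam, gain, p_max):
    """Maximize rate(p) - lam * (p + p_circuit) over 0 <= p <= p_max.

    Setting the derivative to zero gives p = 1 / (lam * ln 2) - 1 / gain,
    which is then clipped to the feasible interval.
    """
    if lam <= 0.0:
        return p_max
    p = 1.0 / (lam * math.log(2.0)) - 1.0 / gain
    return min(max(p, 0.0), p_max)

def dinkelbach(gain, p_circuit, p_max, tol=1e-9, max_iter=100):
    """Return (power, energy efficiency) for the toy fractional problem."""
    lam = 0.0
    p = p_max
    for _ in range(max_iter):
        p = best_power(lam, gain, p_max)
        gap = rate(p, gain) - lam * (p + p_circuit)
        if abs(gap) < tol:          # F(lam) = 0 characterizes the optimum
            break
        lam = rate(p, gain) / (p + p_circuit)
    return p, rate(p, gain) / (p + p_circuit)

if __name__ == "__main__":
    power, ee = dinkelbach(gain=10.0, p_circuit=1.0, p_max=5.0)
    print(f"power={power:.4f}, energy efficiency={ee:.4f}")
```

Each iteration updates the ratio estimate `lam` with the efficiency achieved by the current power, and the loop stops once the subtractive objective reaches zero.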

preprint 2022 arXiv

MALICE: Manipulation Attacks on Learned Image ComprEssion

Deep learning techniques have shown promising results in image compression, with competitive bitrate and image reconstruction quality from the compressed latent. However, while image compression has progressed towards a higher peak signal-to-noise ratio (PSNR) and fewer bits per pixel (bpp), its robustness to adversarial images has never been examined. In this work, we investigate, for the first time, the robustness of image compression systems in which an imperceptible perturbation of the input image can cause a significant increase in the bitrate of the compressed latent. To characterize the robustness of state-of-the-art learned image compression, we mount white-box and black-box attacks. Our white-box attack applies the fast gradient sign method to the entropy estimate of the bitstream, which serves as the bitrate approximation. For the black-box attack, we propose DCT-Net, which simulates JPEG compression with a simple architecture and lightweight training, as the substitute model, enabling fast adversarial transferability. Our results on six image compression models, each with six different bitrate qualities (thirty-six models in total), show that they are surprisingly fragile: the white-box attack achieves up to a 56.326x bpp change and the black-box attack up to 1.947x. To improve robustness, we propose a novel compression architecture, factorAtn, which incorporates attention modules and a basic factorized entropy model, resulting in a promising trade-off between rate-distortion performance and robustness to adversarial attacks that surpasses existing learned image compressors.
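The white-box attack described above perturbs the input by a fixed step in the direction of the sign of the gradient of a bitrate estimate. The sketch below shows that single FGSM step on a stand-in objective, not the MALICE attack itself. The linear "analysis transform" `weights`, the squared-latent-energy `rate_proxy`, and all function names are hypothetical, and clipping to the valid pixel range is omitted for brevity.

```python
def latent(weights, x):
    """Hypothetical linear analysis transform: y = W x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def rate_proxy(weights, x):
    """Stand-in for a bitrate estimate: energy of the latent, sum(y_i^2)."""
    return sum(v * v for v in latent(weights, x))

def rate_proxy_grad(weights, x):
    """Gradient of the proxy with respect to the input: 2 * W^T (W x)."""
    y = latent(weights, x)
    return [2.0 * sum(weights[i][j] * y[i] for i in range(len(weights)))
            for j in range(len(x))]

def sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)

def fgsm(weights, x, eps):
    """One FGSM step that increases the proxy: x + eps * sign(gradient)."""
    grad = rate_proxy_grad(weights, x)
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

if __name__ == "__main__":
    W = [[1.0, -2.0], [0.5, 1.0]]
    x = [0.2, 0.4]
    x_adv = fgsm(W, x, eps=0.01)
    print(rate_proxy(W, x), rate_proxy(W, x_adv))
```

Every input element moves by at most `eps`, which is what keeps the perturbation small while the proxy objective increases.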

preprint 2022 arXiv

Reconfigurable Intelligent Surfaces for Energy Efficiency in Full-duplex Communication System

In this letter, we study a reconfigurable intelligent surface (RIS) aided full-duplex (FD) communication system. By jointly designing the active beamforming of two multi-antenna sources and the passive beamforming of the RIS, we aim to maximize the energy efficiency of the system, where the extra power consumed by self-interference cancellation in the FD system is also considered. We divide the optimization problem into active and passive beamforming design subproblems, and adopt the alternating optimization framework to solve them iteratively. Dinkelbach's method is used to tackle the fractional objective function in the active beamforming problem. The penalty method and successive convex approximation are used for the passive beamforming design. Simulation results show that the energy efficiency of our scheme exceeds that of the benchmarks.

preprint 2022 arXiv

Simultaneous Transmission and Reflection Reconfigurable Intelligent Surface Assisted Full-Duplex Communications

This work demonstrates the effectiveness of a novel simultaneous transmission and reflection reconfigurable intelligent surface (STAR-RIS) in a Full-Duplex (FD) communication system. The objective is to minimize the total transmit power by jointly designing the transmit power and the transmitting and reflecting (T&R) coefficients of the STAR-RIS. To solve the nonconvex problem, an efficient algorithm is proposed that uses the alternating optimization framework to optimize the variables iteratively. Specifically, in each iteration, we derive the closed-form expression for the optimal power design. The successive convex approximation (SCA) method and semidefinite programming (SDP) are used to solve the passive beamforming optimization problem. Numerical results verify the convergence and effectiveness of the proposed algorithm, and further reveal the scenarios in which STAR-RIS assisted FD communication outperforms Half-Duplex communication and conventional RIS.

preprint 2022 arXiv

Target-Relevant Knowledge Preservation for Multi-Source Domain Adaptive Object Detection

Domain adaptive object detection (DAOD) is a promising way to alleviate the performance drop of detectors in new scenes. Although great effort has been made in single-source domain adaptation, the more general task with multiple source domains remains underexplored because knowledge degrades when the sources are combined. To address this issue, we propose a novel approach, target-relevant knowledge preservation (TRKP), for unsupervised multi-source DAOD. Specifically, TRKP adopts the teacher-student framework, in which a multi-head teacher network is built to extract knowledge from the labeled source domains and guide the student network to learn detectors in the unlabeled target domain. The teacher network is further equipped with an adversarial multi-source disentanglement (AMSD) module to preserve source domain-specific knowledge and simultaneously perform cross-domain alignment. In addition, a holistic target-relevant mining (HTRM) scheme is developed to re-weight the source images according to their source-target relevance. In this way, the teacher network is forced to capture target-relevant knowledge, which helps reduce domain shift when it guides object detection in the target domain. Extensive experiments on various widely used benchmarks report new state-of-the-art scores, highlighting the effectiveness of the approach.

preprint 2020 arXiv

Cognitive Representation Learning of Self-Media Online Article Quality

Automatic quality assessment of self-media online articles is a new and pressing problem of great value to online recommendation and search. Unlike traditional, well-formed articles, self-media online articles are mainly created by users. Their surface characteristics include varied text levels and multi-modal hybrid editing, and their latent characteristics include diverse content, varied styles, large semantic spans and demanding requirements for interactive experience. To address these challenges, we establish a joint model, CoQAN, that combines layout organization, writing characteristics and text semantics, with separate representation learning subnetworks designed in particular around the feature learning process and interactive reading habits on mobile terminals. This design is more consistent with the cognitive process by which an expert evaluates an article. We have also constructed a large-scale real-world assessment dataset. Extensive experimental results show that the proposed framework significantly outperforms state-of-the-art methods, and effectively learns and integrates the different factors of online article quality assessment.

preprint 2020 arXiv

Hierarchical Feature Embedding for Attribute Recognition

Attribute recognition is a crucial but challenging task due to viewpoint changes, illumination variations and appearance diversity. Most previous work considers only attribute-level feature embedding, which may perform poorly in complicated heterogeneous conditions. To address this problem, we propose a hierarchical feature embedding (HFE) framework, which learns a fine-grained feature embedding by combining attribute and ID information. In HFE, we maintain the inter-class and intra-class feature embeddings simultaneously. Samples with the same attribute, and also samples with the same ID, are gathered more closely, which constrains the feature embedding of visually hard samples with regard to attributes and improves robustness to varying conditions. We establish this hierarchical structure with an HFE loss consisting of attribute-level and ID-level constraints. We also introduce an absolute boundary regularization and a dynamic loss weight as supplementary components to help build the feature embedding. Experiments show that our method achieves state-of-the-art results on two pedestrian attribute datasets and a facial attribute dataset.

preprint 2020 arXiv

HSCJN: A Holistic Semantic Constraint Joint Network for Diverse Response Generation

The sequence-to-sequence (Seq2Seq) model generates target words iteratively given the previously observed words during the decoding process, which loses the holistic semantics of the target response and the complete semantic relationship between responses and dialogue histories. In this paper, we propose a generic diversity-promoting joint network, called Holistic Semantic Constraint Joint Network (HSCJN), which enhances the global sentence information and then regularizes the objective function by penalizing low-entropy output. Our network introduces more target information to improve diversity, and at the same time captures direct semantic information to better constrain relevance. Moreover, the proposed method can be easily applied to any Seq2Seq structure. Extensive experiments on several dialogue corpora show that our method effectively improves both the semantic consistency and the diversity of generated responses, and achieves better performance than other competitive methods.
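Penalizing low-entropy output, as mentioned above, is commonly done by subtracting a weighted entropy term from the per-token negative log-likelihood. The sketch below shows that generic form only. The abstract does not give the exact HSCJN objective, so the loss shape, the weight `beta`, and the function names are assumptions.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def regularized_loss(probs, target, beta=0.1):
    """Negative log-likelihood of the target minus beta * entropy.

    Peaked (low-entropy) predictions receive less of the entropy bonus,
    so they are penalized relative to flatter ones. beta is a
    hypothetical hyperparameter.
    """
    nll = -math.log(probs[target])
    return nll - beta * entropy(probs)

if __name__ == "__main__":
    peaked = [0.97, 0.01, 0.01, 0.01]
    flat = [0.70, 0.10, 0.10, 0.10]
    print(entropy(peaked), entropy(flat))
    print(regularized_loss(peaked, 0), regularized_loss(flat, 0))
```

With `beta = 0` the loss reduces to the ordinary negative log-likelihood, so the strength of the diversity pressure is controlled by a single weight.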
