Researcher profile

Shuyuan Yang

Shuyuan Yang contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 19 - UnverifiedVerification L1Unclaimed author
5works
0followers
6topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

5 published item(s)

preprint2026arXiv

A Model-based Visual Contact Localization and Force Sensing System for Compliant Robotic Grippers

Grasp force estimation can help prevent robots from damaging delicate objects during manipulation and improve learning-based robotic control. Integrating force sensing into deformable grippers negotiates trade-offs in cost, complexity, mechanical robustness, and performance. With the growing integration of RGB-D wrist cameras into robotic systems for control purposes, camera-based techniques are a promising solution for indirect visual force estimation. Current approaches mostly utilize end-to-end deep learning, which can be brittle when generalizing to new scenarios, while existing model-based approaches are unsuited to grasping and modern grasper geometries. To address these challenges, we developed a model-based visual force sensing approach integrating an iterative contact localization with generalization to unseen objects. The system extracts structural key points from wrist camera RGB-D images of deforming fin-ray-shaped soft grippers, and uses these key points to define parameters of an inverse finite element analysis simulation in Simulation Open Framework Architecture. The iterative contact localization sub-system utilizes a deep learning-based online 3D reconstruction and pose estimation pipeline to dynamically update contact location, and is robust to visual occlusion and unseen objects. Our system demonstrated an average root mean square error of 0.23 N and normalized root mean square deviation of 2.11% during the load phase, and 0.48 N and 4.34% over the entire grasping process when interacting with different objects under various conditions, showcasing its potential for real-time model-based indirect force sensing of soft grippers.

preprint2022arXiv

PC-GANs: Progressive Compensation Generative Adversarial Networks for Pan-sharpening

The fusion of multispectral and panchromatic images is always dubbed pansharpening. Most of the available deep learning-based pan-sharpening methods sharpen the multispectral images through a one-step scheme, which strongly depends on the reconstruction ability of the network. However, remote sensing images always have large variations, as a result, these one-step methods are vulnerable to the error accumulation and thus incapable of preserving spatial details as well as the spectral information. In this paper, we propose a novel two-step model for pan-sharpening that sharpens the MS image through the progressive compensation of the spatial and spectral information. Firstly, a deep multiscale guided generative adversarial network is used to preliminarily enhance the spatial resolution of the MS image. Starting from the pre-sharpened MS image in the coarse domain, our approach then progressively refines the spatial and spectral residuals over a couple of generative adversarial networks (GANs) that have reverse architectures. The whole model is composed of triple GANs, and based on the specific architecture, a joint compensation loss function is designed to enable the triple GANs to be trained simultaneously. Moreover, the spatial-spectral residual compensation structure proposed in this paper can be extended to other pan-sharpening methods to further enhance their fusion results. Extensive experiments are performed on different datasets and the results demonstrate the effectiveness and efficiency of our proposed method.

preprint2022arXiv

Transformers Meet Visual Learning Understanding: A Comprehensive Review

Dynamic attention mechanism and global modeling ability make Transformer show strong feature learning ability. In recent years, Transformer has become comparable to CNNs methods in computer vision. This review mainly investigates the current research progress of Transformer in image and video applications, which makes a comprehensive overview of Transformer in visual learning understanding. First, the attention mechanism is reviewed, which plays an essential part in Transformer. And then, the visual Transformer model and the principle of each module are introduced. Thirdly, the existing Transformer-based models are investigated, and their performance is compared in visual learning understanding applications. Three image tasks and two video tasks of computer vision are investigated. The former mainly includes image classification, object detection, and image segmentation. The latter contains object tracking and video classification. It is significant for comparing different models' performance in various tasks on several public benchmark data sets. Finally, ten general problems are summarized, and the developing prospects of the visual Transformer are given in this review.

preprint2022arXiv

Violating the bilocal inequality with separable mixed states in the entanglement-swapping network

It has been showed that two entangle pure states can violate the bilocal inequality in the entanglement-swapping network, vice versa. What happens for mixed states? Whether or not are there separable mixed states violating the bilocal inequality? In the work, we devote to finding the mixed Werner states which violate the bilocal inequality by Particle Swarm Optimization (PSO) algorithms. Finally, we shows that there are pairs of states, where one is separable and the other is entangled, can violate the bilocal inequality.

preprint2020arXiv

An Efficient Hardware Accelerator for Structured Sparse Convolutional Neural Networks on FPGAs

Deep Convolutional Neural Networks (CNNs) have achieved state-of-the-art performance in a wide range of applications. However, deeper CNN models, which are usually computation consuming, are widely required for complex Artificial Intelligence (AI) tasks. Though recent research progress on network compression such as pruning has emerged as a promising direction to mitigate computational burden, existing accelerators are still prevented from completely utilizing the benefits of leveraging sparsity owing to the irregularity caused by pruning. On the other hand, Field-Programmable Gate Arrays (FPGAs) have been regarded as a promising hardware platform for CNN inference acceleration. However, most existing FPGA accelerators focus on dense CNN and cannot address the irregularity problem. In this paper, we propose a sparse wise dataflow to skip the cycles of processing Multiply-and-Accumulates (MACs) with zero weights and exploit data statistics to minimize energy through zeros gating to avoid unnecessary computations. The proposed sparse wise dataflow leads to a low bandwidth requirement and a high data sharing. Then we design an FPGA accelerator containing a Vector Generator Module (VGM) which can match the index between sparse weights and input activations according to the proposed dataflow. Experimental results demonstrate that our implementation can achieve 987 imag/s and 48 imag/s performance for AlexNet and VGG-16 on Xilinx ZCU102, respectively, which provides 1.5x to 6.7x speedup and 2.0x to 6.2x energy-efficiency over previous CNN FPGA accelerators.