Source author record

Bin Fang

Bin Fang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Computer Vision, quant-ph, Robotics, cond-mat.mes-hall, eess.SP, Information Theory, Logic in Computer Science, math.IT, Networking and Internet Architecture, physics.optics, Programming Languages

Catalog footprint

What is connected

8 works

11 topics

4 close collaborators


preprint, 2026, arXiv

Tactile-based Multimodal Fusion in Embodied Intelligence: A Survey of Vision, Language, and Contact-Driven Paradigms

Tactile sensing is a fundamental modality for embodied intelligence, offering unique and direct feedback on contact geometry, material properties, and interaction dynamics that remote sensors cannot replace. However, unimodal tactile perception is inherently limited by its sparse spatial coverage and lack of global semantic context. With the recent explosion in deep learning and large language models, integrating tactile sensing with vision and language has become essential to bridge physical interaction with semantic reasoning, leading to the emergence of Multimodal Tactile Fusion. Despite rapid progress, existing research remains fragmented across disparate datasets, sensing modalities, and tasks, and lacks a unified theoretical framework. To address this gap, this paper provides a comprehensive survey of multimodal tactile fusion research up to the first quarter of 2026. We propose a hierarchical taxonomy that organizes the field into two primary dimensions: multimodal datasets and multimodal methods. On the data side, we categorize resources into Tactile-Vision, Tactile-Language, Tactile-Vision-Language, and Tactile-Vision-Other datasets. On the method side, we structure prior work into three core pillars: (1) Multimodal Perception and Recognition, which focuses on object understanding and grasp prediction; (2) Cross-Modal Generation, which focuses on bidirectional translation between tactile, vision, and text; and (3) Multimodal Interaction, which emphasizes feedback control and language-guided manipulation. Furthermore, we summarize representative tactile sensing hardware, review commonly used evaluation metrics and benchmark settings, and discuss current challenges and promising future directions.

preprint, 2023, arXiv

Fabric Defect Detection Using Vision-Based Tactile Sensor

This paper introduces a new tactile inspection system for fabric defect detection. Unlike existing visual inspection systems, the proposed system uses a vision-based tactile sensor. The tactile sensor, which mainly consists of a camera, four LEDs, and an elastic sensing layer, captures detailed information about fabric surface structure and ignores the color and pattern. Thus, the ambiguity between a defect and image background related to fabric color and pattern is avoided. To utilize the tactile sensor for fabric inspection, we employ intensity adjustment for image preprocessing, a Residual Network with ensemble learning for detecting defects, and uniformity measurement for selecting an ideal dataset for model training. An experiment is conducted to verify the performance of the proposed tactile system. The experimental results demonstrate the feasibility of the proposed system, which performs well in detecting structural defects for various types of fabrics. In addition, the system does not require external light sources, which eliminates the need to set up and tune a lighting environment.

preprint, 2020, arXiv

Reusing Discriminators for Encoding: Towards Unsupervised Image-to-Image Translation

Unsupervised image-to-image translation is a central task in computer vision. Current translation frameworks abandon the discriminator once the training process is completed. This paper proposes a novel role for the discriminator by reusing it for encoding the images of the target domain. The proposed architecture, termed NICE-GAN, has two advantages over previous approaches: first, it is more compact since no independent encoding component is required; second, this plug-in encoder is directly trained by the adversary loss, making it more informative and more effectively trained if a multi-scale discriminator is applied. The main issue in NICE-GAN is the coupling of translation with discrimination along the encoder, which could incur training inconsistency when we play the min-max game via GAN. To tackle this issue, we develop a decoupled training strategy by which the encoder is only trained when maximizing the adversary loss and is kept frozen otherwise. Extensive experiments on four popular benchmarks demonstrate the superior performance of NICE-GAN over state-of-the-art methods in terms of FID, KID, and also human preference. Comprehensive ablation studies are also carried out to isolate the validity of each proposed component. Our code is available at https://github.com/alpc91/NICE-GAN-pytorch.

preprint, 2019, arXiv

Photon--Matter Quantum Correlations in Spontaneous Raman Scattering

We develop a Hamiltonian formalism to study energy and position/momentum correlations between a single Stokes photon and a single material excitation that are created as a pair in the spontaneous Raman scattering process. Our approach allows for intuitive separation of the effects of spectral linewidth, chromatic dispersion, and collection angle on these correlations, and we compare the predictions of the model to experiment. These results have important implications for the use of Raman scattering in quantum protocols that rely on spectrally unentangled photons and collective excitations.

preprint, 2016, arXiv

Hierarchical Shape Abstraction for Analysis of Free-List Memory Allocators

We propose a hierarchical abstract domain for the analysis of free-list memory allocators that tracks shape and numerical properties about both the heap and the free lists. Our domain is based on Separation Logic extended with predicates that capture the pointer arithmetic constraints for the heap-list and the shape of the free-list. These predicates are combined using a hierarchical composition operator to specify the overlapping of the heap-list by the free-list. In addition to expressiveness, this operator leads to a compositional and compact representation of abstract values and simplifies the implementation of the abstract domain. The shape constraints are combined with numerical constraints over integer arrays to track properties about the allocation policies (best-fit, first-fit, etc.). Such properties are beyond the scope of existing analyzers. We implemented this domain and we show its effectiveness on several implementations of free-list allocators.
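For readers unfamiliar with the allocation policies named in this abstract, the following is a minimal Python sketch of first-fit and best-fit selection over a free list. It only illustrates the two policies and is not the paper's abstract domain or analysis; the function names and the example chunk sizes are hypothetical.

```python
from typing import Optional, Sequence


def first_fit(free_sizes: Sequence[int], request: int) -> Optional[int]:
    """Index of the first free chunk large enough for `request`, or None."""
    for index, size in enumerate(free_sizes):
        if size >= request:
            return index
    return None


def best_fit(free_sizes: Sequence[int], request: int) -> Optional[int]:
    """Index of the smallest free chunk large enough for `request`, or None."""
    best: Optional[int] = None
    for index, size in enumerate(free_sizes):
        if size >= request and (best is None or size < free_sizes[best]):
            best = index
    return best


# Hypothetical free-list chunk sizes in bytes, in list order.
free_list = [8, 32, 16, 64]
print(first_fit(free_list, 12), best_fit(free_list, 12))
```

For a 12-byte request, first-fit picks the 32-byte chunk (index 1) because it comes first in the list, while best-fit picks the 16-byte chunk (index 2) because it leaves the least slack.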

preprint, 2015, arXiv

An Effective Handover Analysis for the Randomly Distributed Heterogeneous Cellular Networks

Handover rate is one of the most important metrics for guiding mobility management and resource management in wireless cellular networks. In the literature, the mathematical expression of handover rate has been derived for homogeneous cellular networks using both the regular hexagon coverage model and the stochastic geometry model, but there has not been any reliable result for heterogeneous cellular networks (HCNs). Recently, stochastic geometry modeling has been shown to model well the real deployment of HCNs and has been extensively used to analyze HCNs. In this paper, we give an effective handover analysis for HCNs by stochastic geometry modeling, derive the mathematical expression of handover rate by employing an infinitesimal method for a generalized multi-tier scenario, discuss the result by deriving some meaningful corollaries, and validate the analysis by computer simulation with multiple walking models. From our analysis, we find that in HCNs the handover rate is related to many factors, such as the base stations' densities and transmitting powers, the user's velocity distribution, the bias factor, and the path loss factor. Although our analysis focuses on the scenario of multi-tier HCNs, the analytical framework can be easily extended to more complex scenarios, and may shed some light on future study.

preprint, 2014, arXiv

Giant spin-torque diode sensitivity at low input power in the absence of bias magnetic field

Microwave detectors based on the spin-transfer torque diode effect are among the key emerging spintronic devices. By utilizing the spin of electrons in addition to charge, they have the potential to overcome the theoretical performance limits of their semiconductor (Schottky) counterparts, which cannot operate at low input power. Here, we demonstrate nanoscale microwave detectors exhibiting record-high detection sensitivity of 75400 mV mW$^{-1}$ at room temperature, without any external bias fields, for input microwave power down to 10 nW. This sensitivity is 20x and 6x larger than state-of-the-art Schottky diode detectors (3800 mV mW$^{-1}$) and existing spintronic diodes with >1000 Oe magnetic bias (12000 mV mW$^{-1}$), respectively. Micromagnetic simulations supported by microwave emission measurements reveal the essential role of injection locking in achieving this sensitivity. The results enable dramatic improvements in the design of low input power microwave detectors, with wide-ranging applications in telecommunications, radars, and smart networks.
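As a quick check of the improvement factors quoted in this abstract, the short Python sketch below divides the reported sensitivities. The three values come from the abstract; the variable and function names are illustrative only.

```python
# Sensitivities quoted in the abstract, in mV/mW.
SPIN_TORQUE_DIODE = 75400.0   # reported detector, no external bias field
SCHOTTKY_DIODE = 3800.0       # state-of-the-art Schottky diode detector
BIASED_SPINTRONIC = 12000.0   # spintronic diode with >1000 Oe magnetic bias


def improvement(new: float, reference: float) -> float:
    """Return how many times larger `new` is than `reference`."""
    return new / reference


vs_schottky = improvement(SPIN_TORQUE_DIODE, SCHOTTKY_DIODE)
vs_biased = improvement(SPIN_TORQUE_DIODE, BIASED_SPINTRONIC)

print(f"vs Schottky: {vs_schottky:.1f}x")
print(f"vs biased spintronic: {vs_biased:.1f}x")
```

The ratios come out at about 19.8 and 6.3, consistent with the rounded 20x and 6x figures in the abstract.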

preprint, 2014, arXiv

Polarization-entangled photon-pair generation in commercial-grade polarization-maintaining fiber

We demonstrate a fiber-based source of polarization-entangled photon pairs at visible wavelengths suitable for integration with local quantum processing schemes. The photons are created through birefringent phase-matching in spontaneous four-wave mixing inside a Sagnac interferometer. We address entanglement degradation due to temporal distinguishability of the photons to enable the generation of a spectrally unfiltered polarization-entangled photon-pair state with $95.86\pm0.10\%$ fidelity to a maximally entangled Bell state, evaluated with a tomographic state reconstruction without applying any corrections or background subtractions. Owing to the large birefringence of the fiber, photons are created far detuned from the pump, where Raman contamination is negligible. This source's spatial mode and ability to produce spectrally uncorrelated photons make it suitable for implementing quantum information protocols over free-space and fiber-based networks.
