Researcher profile

Binbin Chen

Binbin Chen contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
8works
0followers
12topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

8 published item(s)

preprint2023arXiv

1st Place Solution for ECCV 2022 OOD-CV Challenge Object Detection Track

OOD-CV challenge is an out-of-distribution generalization task. To solve this problem in object detection track, we propose a simple yet effective Generalize-then-Adapt (G&A) framework, which is composed of a two-stage domain generalization part and a one-stage domain adaptation part. The domain generalization part is implemented by a Supervised Model Pretraining stage using source data for model warm-up and a Weakly Semi-Supervised Model Pretraining stage using both source data with box-level label and auxiliary data (ImageNet-1K) with image-level label for performance boosting. The domain adaptation part is implemented as a Source-Free Domain Adaptation paradigm, which only uses the pre-trained model and the unlabeled target data to further optimize in a self-supervised training manner. The proposed G&A framework help us achieve the first place on the object detection leaderboard of the OOD-CV challenge. Code will be released in https://github.com/hikvision-research/OOD-CV.

preprint2022arXiv

An Efficient Optimal Energy Flow Model for Integrated Energy Systems Based on Energy Circuit Modeling in the Frequency Domain

With more energy networks being interconnected to form integrated energy systems (IESs), the optimal energy flow (OEF) problem has drawn increasing attention. Extant studies on OEF models mostly utilize the finite difference method (FDM) to address partial-differential-equation (PDE) constraints related to the dynamics in natural gas networks (NGNs) and district heating networks (DHNs). However, this time-domain approach suffers from a heavy computational burden with regard to achieving high finite-difference accuracy. In this paper, a novel OEF model that formulates NGN and DHN constraints in the frequency domain and corresponding model compaction techniques for efficient solving are contributed. First, an energy circuit method (ECM) that algebraizes the PDEs of NGNs and DHNs in the frequency domain is introduced. Then, an ECM-based OEF model is formulated, which contains fewer variables and constraints than an FDM-based OEF model and thereby yields better solving efficiency. Finally, variable space projection is employed to remove implicit variables, by which another constraint generation algorithm is enabled to remove redundant constraints. These two techniques further compact the OEF model and bring about a second improvement in solving efficiency. Numerical tests on actual systems indicate the final OEF model reduces variables and constraints by more than 95% and improves the solving efficiency by more than 10 times. In conclusion, the proposed OEF model and solving techniques well meet the optimization needs of large-scale IESs.

preprint2022arXiv

Distort to Detect, not Affect: Detecting Stealthy Sensor Attacks with Micro-distortion

In this paper, we propose an effective and easily deployable approach to detect the presence of stealthy sensor attacks in industrial control systems, where (legacy) control devices critically rely on accurate (and usually non-encrypted) sensor readings. Specifically, we focus on stealthy attacks that crash a sensor and then immediately impersonate that sensor by sending out fake readings. We consider attackers who aim to stay hidden in the system for a prolonged period. To detect such attacks, our approach relies on continuous injection of "micro distortion" to the original sensor's readings. In particular, the injected distortion should be kept strictly within a small magnitude (e.g., $0.5\%$ of the possible operating value range), to ensure it does not affect the normal functioning of the ICS. Our approach uses a pre-shared secret sequence between a sensor and the defender to generate the micro-distortions. One key challenge is that the micro-distortions injected are often much lower than the sensor's actual readings, hence can be easily overwhelmed by the latter. To overcome this, we leverage the observation that sensor readings in many ICS (and power grid in particular) often change gradually in a significant fraction of time (i.e., with small difference between consecutive time slots). We devise a simple yet effective algorithm that can detect stealthy attackers in a highly accurate and fast (i.e., using less than 100 samples) manner. We demonstrate the effectiveness of our defense using real-world sensor reading traces from two different smart grid systems.

preprint2022arXiv

GLD-Net: Improving Monaural Speech Enhancement by Learning Global and Local Dependency Features with GLD Block

For monaural speech enhancement, contextual information is important for accurate speech estimation. However, commonly used convolution neural networks (CNNs) are weak in capturing temporal contexts since they only build blocks that process one local neighborhood at a time. To address this problem, we learn from human auditory perception to introduce a two-stage trainable reasoning mechanism, referred as global-local dependency (GLD) block. GLD blocks capture long-term dependency of time-frequency bins both in global level and local level from the noisy spectrogram to help detecting correlations among speech part, noise part, and whole noisy input. What is more, we conduct a monaural speech enhancement network called GLD-Net, which adopts encoder-decoder architecture and consists of speech object branch, interference branch, and global noisy branch. The extracted speech feature at global-level and local-level are efficiently reasoned and aggregated in each of the branches. We compare the proposed GLD-Net with existing state-of-art methods on WSJ0 and DEMAND dataset. The results show that GLD-Net outperforms the state-of-the-art methods in terms of PESQ and STOI.

preprint2022arXiv

Improving Visual Speech Enhancement Network by Learning Audio-visual Affinity with Multi-head Attention

Audio-visual speech enhancement system is regarded as one of promising solutions for isolating and enhancing speech of desired speaker. Typical methods focus on predicting clean speech spectrum via a naive convolution neural network based encoder-decoder architecture, and these methods a) are not adequate to use data fully, b) are unable to effectively balance audio-visual features. The proposed model alleviates these drawbacks by a) applying a model that fuses audio and visual features layer by layer in encoding phase, and that feeds fused audio-visual features to each corresponding decoder layer, and more importantly, b) introducing a 2-stage multi-head cross attention (MHCA) mechanism to infer audio-visual speech enhancement for balancing the fused audio-visual features and eliminating irrelevant features. This paper proposes attentional audio-visual multi-layer feature fusion model, in which MHCA units are applied to feature mapping at every layer of decoder. The proposed model demonstrates the superior performance of the network against the state-of-the-art models.

preprint2022arXiv

VSEGAN: Visual Speech Enhancement Generative Adversarial Network

Speech enhancement is an essential task of improving speech quality in noise scenario. Several state-of-the-art approaches have introduced visual information for speech enhancement,since the visual aspect of speech is essentially unaffected by acoustic environment. This paper proposes a novel frameworkthat involves visual information for speech enhancement, by in-corporating a Generative Adversarial Network (GAN). In par-ticular, the proposed visual speech enhancement GAN consistof two networks trained in adversarial manner, i) a generator that adopts multi-layer feature fusion convolution network to enhance input noisy speech, and ii) a discriminator that attemptsto minimize the discrepancy between the distributions of the clean speech signal and enhanced speech signal. Experiment re-sults demonstrated superior performance of the proposed modelagainst several state-of-the-art

preprint2021arXiv

Box Re-Ranking: Unsupervised False Positive Suppression for Domain Adaptive Pedestrian Detection

False positive is one of the most serious problems brought by agnostic domain shift in domain adaptive pedestrian detection. However, it is impossible to label each box in countless target domains. Therefore, it yields our attention to suppress false positive in each target domain in an unsupervised way. In this paper, we model an object detection task into a ranking task among positive and negative boxes innovatively, and thus transform a false positive suppression problem into a box re-ranking problem elegantly, which makes it feasible to solve without manual annotation. An attached problem during box re-ranking appears that no labeled validation data is available for cherrypicking. Considering we aim to keep the detection of true positive unchanged, we propose box number alignment, a self-supervised evaluation metric, to prevent the optimized model from capacity degeneration. Extensive experiments conducted on cross-domain pedestrian detection datasets have demonstrated the effectiveness of our proposed framework. Furthermore, the extension to two general unsupervised domain adaptive object detection benchmarks also supports our superiority to other state-of-the-arts.

preprint2020arXiv

Sublinear-Time Non-Adaptive Group Testing with $O(k \log n)$ Tests via Bit-Mixing Coding

The group testing problem consists of determining a small set of defective items from a larger set of items based on tests on groups of items, and is relevant in applications such as medical testing, communication protocols, pattern matching, and many more. While rigorous group testing algorithms have long been known with runtime at least linear in the number of items, a recent line of works has sought to reduce the runtime to ${\rm poly}(k \log n)$, where $n$ is the number of items and $k$ is the number of defectives. In this paper, we present such an algorithm for non-adaptive probabilistic group testing termed {\em bit mixing coding} (BMC), which builds on techniques that encode item indices in the test matrix, while incorporating novel ideas based on erasure-correction coding. We show that BMC achieves asymptotically vanishing error probability with $O(k \log n)$ tests and $O(k^2 \cdot \log k \cdot \log n)$ runtime, in the limit as $n \to \infty$ (with $k$ having an arbitrary dependence on $n$). This closes a recently-proposed open problem of simultaneously achieving ${\rm poly}(k \log n)$ decoding time using $O(k \log n)$ tests without any assumptions on $k$. In addition, we show that the same scaling laws can be attained in a commonly-considered noisy setting, in which each test outcome is flipped with constant probability.