Source author record

Binbin Chen

Binbin Chen appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher | Unclaimed source record

eess.AS, Sound, Computer Vision, Cryptography and Security, Artificial Intelligence, Computation and Language, eess.IV, eess.SP, eess.SY, Information Theory, math.IT, math.PR, math.ST, Multimedia, Statistics Theory, Systems and Control

Catalog footprint

What is connected

11 works

16 topics

4 close collaborators

preprint | 2023 | arXiv

1st Place Solution for ECCV 2022 OOD-CV Challenge Object Detection Track

The OOD-CV challenge is an out-of-distribution generalization task. To solve this problem in the object detection track, we propose a simple yet effective Generalize-then-Adapt (G&A) framework, which is composed of a two-stage domain generalization part and a one-stage domain adaptation part. The domain generalization part is implemented by a Supervised Model Pretraining stage, which uses source data for model warm-up, and a Weakly Semi-Supervised Model Pretraining stage, which uses both source data with box-level labels and auxiliary data (ImageNet-1K) with image-level labels to boost performance. The domain adaptation part is implemented as a Source-Free Domain Adaptation paradigm, which uses only the pre-trained model and the unlabeled target data to further optimize the model in a self-supervised manner. The proposed G&A framework helps us achieve first place on the object detection leaderboard of the OOD-CV challenge. Code will be released at https://github.com/hikvision-research/OOD-CV.

preprint | 2022 | arXiv

An Efficient Optimal Energy Flow Model for Integrated Energy Systems Based on Energy Circuit Modeling in the Frequency Domain

With more energy networks being interconnected to form integrated energy systems (IESs), the optimal energy flow (OEF) problem has drawn increasing attention. Extant studies on OEF models mostly utilize the finite difference method (FDM) to address partial-differential-equation (PDE) constraints related to the dynamics in natural gas networks (NGNs) and district heating networks (DHNs). However, this time-domain approach suffers from a heavy computational burden when high finite-difference accuracy is required. In this paper, we propose a novel OEF model that formulates NGN and DHN constraints in the frequency domain, together with corresponding model compaction techniques for efficient solving. First, an energy circuit method (ECM) that algebraizes the PDEs of NGNs and DHNs in the frequency domain is introduced. Then, an ECM-based OEF model is formulated, which contains fewer variables and constraints than an FDM-based OEF model and thereby yields better solving efficiency. Finally, variable space projection is employed to remove implicit variables, which in turn enables a constraint generation algorithm to remove redundant constraints. These two techniques further compact the OEF model and bring about a second improvement in solving efficiency. Numerical tests on actual systems indicate that the final OEF model reduces variables and constraints by more than 95% and improves the solving efficiency by more than 10 times. In conclusion, the proposed OEF model and solving techniques meet the optimization needs of large-scale IESs well.

preprint | 2022 | arXiv

Distort to Detect, not Affect: Detecting Stealthy Sensor Attacks with Micro-distortion

In this paper, we propose an effective and easily deployable approach to detect the presence of stealthy sensor attacks in industrial control systems (ICS), where (legacy) control devices critically rely on accurate (and usually non-encrypted) sensor readings. Specifically, we focus on stealthy attacks that crash a sensor and then immediately impersonate that sensor by sending out fake readings. We consider attackers who aim to stay hidden in the system for a prolonged period. To detect such attacks, our approach relies on continuous injection of "micro-distortion" into the original sensor's readings. In particular, the injected distortion should be kept strictly within a small magnitude (e.g., $0.5\%$ of the possible operating value range), to ensure it does not affect the normal functioning of the ICS. Our approach uses a pre-shared secret sequence between a sensor and the defender to generate the micro-distortions. One key challenge is that the injected micro-distortions are often much smaller than the sensor's actual readings and can therefore be easily overwhelmed by them. To overcome this, we leverage the observation that sensor readings in many ICS (and the power grid in particular) often change gradually for a significant fraction of the time (i.e., with a small difference between consecutive time slots). We devise a simple yet effective algorithm that can detect stealthy attackers accurately and quickly (i.e., using fewer than 100 samples). We demonstrate the effectiveness of our defense using real-world sensor reading traces from two different smart grid systems.
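The idea in this abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the detector below (correlating consecutive differences of the readings with those of the secret sequence), the synthetic process signal, the key, and the 0.5 threshold are all assumptions chosen for illustration. It only shows why slowly changing readings make a small keyed distortion detectable within about 100 samples.

```python
import math
import random

def secret_distortion(key, n, delta):
    """Pseudo-random +/-delta sequence derived from a pre-shared key (assumed scheme)."""
    rng = random.Random(key)
    return [delta * rng.choice((-1.0, 1.0)) for _ in range(n)]

def detection_score(readings, distortion):
    """Correlate consecutive differences of readings with those of the distortion.

    Differencing cancels most of a slowly changing process value, so the score
    is close to 1 when the distortion is present and close to 0 when it is absent.
    """
    num = 0.0
    den = 0.0
    for t in range(1, len(readings)):
        dy = readings[t] - readings[t - 1]
        dd = distortion[t] - distortion[t - 1]
        num += dy * dd
        den += dd * dd
    return num / den if den > 0 else 0.0

# Synthetic process value: slow drift plus small measurement noise (assumed values).
N = 100
DELTA = 0.5  # 0.5% of an assumed operating range of 100 units
noise = random.Random(7)
process = [50.0 + 10.0 * math.sin(2 * math.pi * t / 1000) + noise.gauss(0.0, 0.02)
           for t in range(N)]

distortion = secret_distortion(key=1234, n=N, delta=DELTA)
genuine = [x + d for x, d in zip(process, distortion)]  # real sensor adds the distortion
spoofed = list(process)                                 # impersonator lacks the secret

genuine_score = detection_score(genuine, distortion)
spoofed_score = detection_score(spoofed, distortion)
print(genuine_score > 0.5, spoofed_score > 0.5)
```

The sketch depends on the gradual-change observation from the abstract: if consecutive readings differed by much more than the distortion, the cross term would swamp the score and more samples would be needed.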

preprint | 2022 | arXiv

GLD-Net: Improving Monaural Speech Enhancement by Learning Global and Local Dependency Features with GLD Block

For monaural speech enhancement, contextual information is important for accurate speech estimation. However, commonly used convolutional neural networks (CNNs) are weak at capturing temporal context, since their building blocks process only one local neighborhood at a time. To address this problem, we draw on human auditory perception to introduce a two-stage trainable reasoning mechanism, referred to as the global-local dependency (GLD) block. GLD blocks capture long-term dependencies among time-frequency bins at both the global and the local level from the noisy spectrogram, helping to detect correlations among the speech part, the noise part, and the whole noisy input. Furthermore, we construct a monaural speech enhancement network called GLD-Net, which adopts an encoder-decoder architecture and consists of a speech object branch, an interference branch, and a global noisy branch. The speech features extracted at the global and local levels are efficiently reasoned and aggregated in each of the branches. We compare the proposed GLD-Net with existing state-of-the-art methods on the WSJ0 and DEMAND datasets. The results show that GLD-Net outperforms the state-of-the-art methods in terms of PESQ and STOI.

preprint | 2022 | arXiv

Improving Visual Speech Enhancement Network by Learning Audio-visual Affinity with Multi-head Attention

Audio-visual speech enhancement is regarded as one of the promising solutions for isolating and enhancing the speech of a desired speaker. Typical methods focus on predicting the clean speech spectrum via a naive convolutional neural network based encoder-decoder architecture; these methods a) do not make full use of the data, and b) are unable to effectively balance audio-visual features. The proposed model alleviates these drawbacks by a) fusing audio and visual features layer by layer in the encoding phase and feeding the fused audio-visual features to each corresponding decoder layer, and, more importantly, b) introducing a 2-stage multi-head cross attention (MHCA) mechanism into audio-visual speech enhancement to balance the fused audio-visual features and eliminate irrelevant features. This paper proposes an attentional audio-visual multi-layer feature fusion model, in which MHCA units are applied to feature mapping at every layer of the decoder. The proposed model demonstrates superior performance against state-of-the-art models.

preprint | 2022 | arXiv

VSEGAN: Visual Speech Enhancement Generative Adversarial Network

Speech enhancement is the essential task of improving speech quality in noisy scenarios. Several state-of-the-art approaches have introduced visual information for speech enhancement, since the visual aspect of speech is essentially unaffected by the acoustic environment. This paper proposes a novel framework that involves visual information for speech enhancement by incorporating a Generative Adversarial Network (GAN). In particular, the proposed visual speech enhancement GAN consists of two networks trained in an adversarial manner: i) a generator that adopts a multi-layer feature fusion convolution network to enhance the input noisy speech, and ii) a discriminator that attempts to minimize the discrepancy between the distributions of the clean speech signal and the enhanced speech signal. Experimental results demonstrated superior performance of the proposed model against several state-of-the-art

preprint | 2021 | arXiv

Box Re-Ranking: Unsupervised False Positive Suppression for Domain Adaptive Pedestrian Detection

False positives are one of the most serious problems caused by agnostic domain shift in domain adaptive pedestrian detection. However, it is impossible to label every box in countless target domains. We therefore turn our attention to suppressing false positives in each target domain in an unsupervised way. In this paper, we recast the object detection task as a ranking task among positive and negative boxes, and thereby transform the false positive suppression problem into a box re-ranking problem, which makes it feasible to solve without manual annotation. A related problem in box re-ranking is that no labeled validation data is available for cherry-picking. Since we aim to keep the detection of true positives unchanged, we propose box number alignment, a self-supervised evaluation metric, to prevent the optimized model from capacity degeneration. Extensive experiments on cross-domain pedestrian detection datasets demonstrate the effectiveness of the proposed framework. Furthermore, an extension to two general unsupervised domain adaptive object detection benchmarks also shows its superiority over other state-of-the-art methods.

preprint | 2020 | arXiv

Sublinear-Time Non-Adaptive Group Testing with $O(k \log n)$ Tests via Bit-Mixing Coding

The group testing problem consists of determining a small set of defective items from a larger set of items based on tests on groups of items, and is relevant in applications such as medical testing, communication protocols, pattern matching, and many more. While rigorous group testing algorithms have long been known with runtime at least linear in the number of items, a recent line of work has sought to reduce the runtime to $\mathrm{poly}(k \log n)$, where $n$ is the number of items and $k$ is the number of defectives. In this paper, we present such an algorithm for non-adaptive probabilistic group testing, termed bit mixing coding (BMC), which builds on techniques that encode item indices in the test matrix, while incorporating novel ideas based on erasure-correction coding. We show that BMC achieves asymptotically vanishing error probability with $O(k \log n)$ tests and $O(k^2 \cdot \log k \cdot \log n)$ runtime, in the limit as $n \to \infty$ (with $k$ having an arbitrary dependence on $n$). This closes a recently proposed open problem of simultaneously achieving $\mathrm{poly}(k \log n)$ decoding time using $O(k \log n)$ tests without any assumptions on $k$. In addition, we show that the same scaling laws can be attained in a commonly considered noisy setting, in which each test outcome is flipped with constant probability.
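The phrase "encode item indices in the test matrix" can be illustrated for the simplest case of a single defective item. The sketch below is not BMC, which per the abstract handles $k$ defectives with erasure-correction coding ideas. It only shows, under the assumption of noiseless tests and at most one defective, how pools defined by the bits of each item index let the decoder read off the defective in time logarithmic in $n$. All function names are hypothetical.

```python
def build_tests(n):
    """Return 2 * bits pools; pool (b, v) holds the items whose bit b equals v."""
    bits = max(1, (n - 1).bit_length())
    return [(b, v, [i for i in range(n) if (i >> b) & 1 == v])
            for b in range(bits) for v in (0, 1)]

def run_tests(tests, defectives):
    """A pool tests positive iff it contains at least one defective item."""
    return [any(i in defectives for i in pool) for _, _, pool in tests]

def decode_single(tests, outcomes):
    """Read the defective index from the outcomes without scanning the items.

    Returns None unless exactly one pool of each (bit, 0/1) pair is positive,
    which is the pattern a single defective produces.
    """
    result = {(b, v): pos for (b, v, _), pos in zip(tests, outcomes)}
    bits = len(tests) // 2
    index = 0
    for b in range(bits):
        zero, one = result[(b, 0)], result[(b, 1)]
        if zero == one:  # both or neither positive: not a single defective
            return None
        if one:
            index |= 1 << b
    return index

tests = build_tests(1000)  # 20 non-adaptive pools for 1000 items
print(decode_single(tests, run_tests(tests, {613})))
```

The pools are fixed before any outcome is seen, which is what makes the design non-adaptive. With two or more defectives the bit patterns collide and this decoder returns `None`.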

preprint | 2016 | arXiv

Empath: Understanding Topic Signals in Large-Scale Text

Human language is colored by a broad range of topics, but existing text analysis tools only focus on a small number of them. We present Empath, a tool that can generate and validate new lexical categories on demand from a small set of seed terms (like "bleed" and "punch" to generate the category violence). Empath draws connotations between words and phrases by deep learning a neural embedding across more than 1.8 billion words of modern fiction. Given a small set of seed words that characterize a category, Empath uses its neural embedding to discover new related terms, then validates the category with a crowd-powered filter. Empath also analyzes text across 200 built-in, pre-validated categories we have generated from common topics in our web dataset, like neglect, government, and social media. We show that Empath's data-driven, human validated categories are highly correlated (r=0.906) with similar categories in LIWC.

preprint | 2015 | arXiv

CLT for linear spectral statistics of normalized sample covariance matrices with the dimension much larger than the sample size

Let $\mathbf{A}=\frac{1}{\sqrt{np}}(\mathbf{X}^T\mathbf{X}-p\mathbf{I}_n)$ where $\mathbf{X}$ is a $p\times n$ matrix, consisting of independent and identically distributed (i.i.d.) real random variables $X_{ij}$ with mean zero and variance one. When $p/n\to\infty$, under fourth moment conditions a central limit theorem (CLT) for linear spectral statistics (LSS) of $\mathbf{A}$ defined by the eigenvalues is established. We also explore its applications in testing whether a population covariance matrix is an identity matrix.
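As a small numerical illustration of the definition above, the following sketch builds $\mathbf{A}$ from standard Gaussian entries and evaluates two simple statistics of its eigenvalues through traces, using the fact that $\sum_i \lambda_i^k = \mathrm{tr}(\mathbf{A}^k)$. Taking an LSS to be a sum $\sum_i f(\lambda_i)$, the Gaussian entries, the sizes $p = 2000$ and $n = 20$, and the function names are all assumptions made for this example; it does not reproduce the paper's CLT.

```python
import random

def normalized_sample_covariance(X):
    """Return A = (X^T X - p I_n) / sqrt(n p) for a p x n matrix X given as a list of rows."""
    p, n = len(X), len(X[0])
    scale = (n * p) ** 0.5
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            s = sum(row[i] * row[j] for row in X)  # (X^T X)[i][j]
            if i == j:
                s -= p
            A[i][j] = A[j][i] = s / scale
    return A

def trace_power(A, k):
    """tr(A^k), the sum of lambda^k over the eigenvalues of A, for k = 1 or 2."""
    n = len(A)
    if k == 1:
        return sum(A[i][i] for i in range(n))
    if k == 2:
        return sum(A[i][j] * A[j][i] for i in range(n) for j in range(n))
    raise ValueError("only k = 1 and k = 2 are implemented in this sketch")

rng = random.Random(0)
p, n = 2000, 20  # dimension p much larger than sample size n
X = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(p)]
A = normalized_sample_covariance(X)
lss_x = trace_power(A, 1)   # statistic with f(x) = x
lss_x2 = trace_power(A, 2)  # statistic with f(x) = x^2
print(lss_x, lss_x2 / n)
```

Each off-diagonal entry of $\mathbf{A}$ has variance $1/n$ under these assumptions, so the second value printed should be roughly 1 for this setup.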

preprint | 2014 | arXiv

Automatic Generation of Security Argument Graphs

Graph-based assessment formalisms have proven to be useful in the safety, dependability, and security communities to help stakeholders manage risk and maintain appropriate documentation throughout the system lifecycle. In this paper, we propose a set of methods to automatically construct security argument graphs, a graphical formalism that integrates various security-related information to argue about the security level of a system. Our approach is to generate the graph in a progressive manner by exploiting logical relationships among pieces of diverse input information. Using those emergent argument patterns as a starting point, we define a set of extension templates that can be applied iteratively to grow a security argument graph. Using a scenario from the electric power sector, we demonstrate the graph generation process and highlight its application for system security evaluation in our prototype software tool, CyberSAGE.
