Source author record

Lei Yu

Lei Yu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Catalog footprint

What is connected

58 works

28 topics

4 close collaborators


preprint · 2026 · arXiv

OxyGent: Making Multi-Agent Systems Modular, Observable, and Evolvable via Oxy Abstraction

Deploying production-ready multi-agent systems (MAS) in complex industrial environments remains challenging due to limitations in scalability, observability, and autonomous evolution. We present OxyGent, an open-source framework driven by two core novelties: a unified Oxy abstraction and the OxyBank evolution engine. The unified abstraction encapsulates agents, tools, LLMs, and reasoning flows as pluggable atomic components, enabling Lego-like scalable system composition and non-intrusive monitoring. To enhance observability, OxyGent introduces permission-driven dynamic planning that replaces rigid workflows with execution graphs generated at runtime, providing adaptive visualizations. Furthermore, to support continuous evolution, OxyBank serves as an AI asset management platform that drives automated data backflow, annotation, and joint evolution. Empirical evaluations and real-world case studies show that OxyGent provides a robust and scalable foundation for MAS. OxyGent is fully open-sourced under the Apache License 2.0 at https://github.com/jd-opensource/OxyGent.

preprint · 2024 · arXiv

Re-evaluating the Memory-balanced Pipeline Parallelism: BPipe

Pipeline parallelism is an essential technique in the training of large-scale Transformer models. However, it suffers from imbalanced memory consumption, leading to insufficient memory utilization. The BPipe technique was proposed to address this issue and has proven effective in the GPT-3 model. Nevertheless, our experiments have not yielded similar benefits for LLaMA training. Additionally, BPipe only yields negligible benefits for GPT-3 training when applying flash attention. We analyze the underlying causes of the divergent performance of BPipe on GPT-3 and LLaMA. Furthermore, we introduce a novel method to estimate the performance of BPipe.

preprint · 2022 · arXiv

A Normalized Gaussian Wasserstein Distance for Tiny Object Detection

Detecting tiny objects is a very challenging problem since a tiny object only contains a few pixels in size. We demonstrate that state-of-the-art detectors do not produce satisfactory results on tiny objects due to the lack of appearance information. Our key observation is that Intersection over Union (IoU) based metrics such as IoU itself and its extensions are very sensitive to the location deviation of the tiny objects, and drastically deteriorate the detection performance when used in anchor-based detectors. To alleviate this, we propose a new evaluation metric using Wasserstein distance for tiny object detection. Specifically, we first model the bounding boxes as 2D Gaussian distributions and then propose a new metric dubbed Normalized Wasserstein Distance (NWD) to compute the similarity between them by their corresponding Gaussian distributions. The proposed NWD metric can be easily embedded into the assignment, non-maximum suppression, and loss function of any anchor-based detector to replace the commonly used IoU metric. We evaluate our metric on a new dataset for tiny object detection (AI-TOD) in which the average object size is much smaller than existing object detection datasets. Extensive experiments show that, when equipped with NWD metric, our approach yields performance that is 6.7 AP points higher than a standard fine-tuning baseline, and 6.0 AP points higher than state-of-the-art competitors. Codes are available at: https://github.com/jwwangchn/NWD.
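The Gaussian modeling step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a box (cx, cy, w, h) is modeled as a Gaussian with mean (cx, cy) and covariance diag(w²/4, h²/4), so that the squared 2-Wasserstein distance between two such Gaussians reduces to a squared Euclidean distance between the vectors (cx, cy, w/2, h/2), and it assumes an exponential normalization with a constant `c`. The function names and the default value of `c` are illustrative assumptions.

```python
import math


def wasserstein2_sq(box_a, box_b):
    """Squared 2-Wasserstein distance between two boxes given as (cx, cy, w, h).

    Each box is modeled as a 2D Gaussian with mean (cx, cy) and covariance
    diag(w^2 / 4, h^2 / 4) (an assumed modeling choice, see the lead-in).
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    return ((cxa - cxb) ** 2 + (cya - cyb) ** 2
            + ((wa - wb) / 2) ** 2 + ((ha - hb) / 2) ** 2)


def nwd(box_a, box_b, c=12.8):
    """Normalized similarity in (0, 1]; c is an illustrative constant."""
    return math.exp(-math.sqrt(wasserstein2_sq(box_a, box_b)) / c)


def iou(box_a, box_b):
    """Plain IoU for boxes given as (cx, cy, w, h), for comparison."""
    ax1, ay1 = box_a[0] - box_a[2] / 2, box_a[1] - box_a[3] / 2
    ax2, ay2 = box_a[0] + box_a[2] / 2, box_a[1] + box_a[3] / 2
    bx1, by1 = box_b[0] - box_b[2] / 2, box_b[1] - box_b[3] / 2
    bx2, by2 = box_b[0] + box_b[2] / 2, box_b[1] + box_b[3] / 2
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union


tiny_a = (10.0, 10.0, 4.0, 4.0)
tiny_b = (14.0, 10.0, 4.0, 4.0)  # same 4x4 box, shifted by 4 pixels
print(iou(tiny_a, tiny_b))  # the boxes no longer overlap, so IoU drops to zero
print(nwd(tiny_a, tiny_b))  # the Gaussian-based similarity stays nonzero
```

The comparison illustrates the sensitivity argument: a shift of a few pixels removes all overlap for a tiny box, while the distance-based similarity decays smoothly.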

preprint · 2022 · arXiv

Asymptotics for Strassen's Optimal Transport Problem

In this paper, we consider Strassen's version of optimal transport (OT) problem, which concerns minimizing the excess-cost probability (i.e., the probability that the cost is larger than a given value) over all couplings of two given distributions. We derive large deviation, moderate deviation, and central limit theorems for this problem. Our proof is based on Strassen's dual formulation of the OT problem, Sanov's theorem on the large deviation principle (LDP) of empirical measures, as well as the moderate deviation principle (MDP) and central limit theorems (CLT) of empirical measures. In order to apply the LDP, MDP, and CLT to Strassen's OT problem, nested formulas for Strassen's OT problem are derived. Based on these nested formulas and using a splitting technique, we construct asymptotically optimal solutions to Strassen's OT problem and its dual formulation.
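For reference, the excess-cost probability objective described above can be written as below. The notation is chosen here for illustration and may differ from the paper's: P_X and P_Y denote the two given distributions, Π(P_X, P_Y) the set of their couplings, c the cost function, and t the given threshold.

```latex
\[
  \inf_{\pi \in \Pi(P_X, P_Y)} \; \pi\bigl\{ (x, y) : c(x, y) > t \bigr\}
\]
```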

preprint · 2022 · arXiv

Autofocus for Event Cameras

Focus control (FC) is crucial for cameras to capture sharp images in challenging real-world scenarios. The autofocus (AF) facilitates the FC by automatically adjusting the focus settings. However, due to the lack of effective AF methods for the recently introduced event cameras, their FC still relies on naive AF like manual focus adjustments, leading to poor adaptation in challenging real-world conditions. In particular, the inherent differences between event and frame data in terms of sensing modality, noise, temporal resolutions, etc., bring many challenges in designing an effective AF method for event cameras. To address these challenges, we develop a novel event-based autofocus framework consisting of an event-specific focus measure called event rate (ER) and a robust search strategy called event-based golden search (EGS). To verify the performance of our method, we have collected an event-based autofocus dataset (EAD) containing well-synchronized frames, events, and focal positions in a wide variety of challenging scenes with severe lighting and motion conditions. The experiments on this dataset and additional real-world scenarios demonstrated the superiority of our method over state-of-the-art approaches in terms of efficiency and accuracy.
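The general idea of pairing a focus measure with a golden-section search can be sketched as follows. This is a textbook golden-section search over a synthetic, unimodal focus measure, not the authors' EGS: the function `synthetic_event_rate`, its peak location, and the normalized focal range [0, 1] are hypothetical stand-ins for a real measurement of event rate at each focal position.

```python
import math

INV_PHI = (math.sqrt(5) - 1) / 2  # reciprocal of the golden ratio, about 0.618


def golden_section_max(f, lo, hi, tol=1e-3):
    """Return a point in [lo, hi] that approximately maximizes a unimodal f."""
    a, b = lo, hi
    c = b - INV_PHI * (b - a)
    d = a + INV_PHI * (b - a)
    fc, fd = f(c), f(d)
    while (b - a) > tol:
        if fc > fd:
            # The maximum lies in [a, d]; the old c becomes the new d.
            b, d, fd = d, c, fc
            c = b - INV_PHI * (b - a)
            fc = f(c)
        else:
            # The maximum lies in [c, b]; the old d becomes the new c.
            a, c, fc = c, d, fd
            d = a + INV_PHI * (b - a)
            fd = f(d)
    return (a + b) / 2


def synthetic_event_rate(focal_position):
    """Hypothetical focus measure that peaks at focal position 0.37."""
    return 1000.0 * math.exp(-((focal_position - 0.37) ** 2) / 0.02)


best_focus = golden_section_max(synthetic_event_rate, 0.0, 1.0)
print(round(best_focus, 2))
```

Each iteration reuses one of the two previous evaluations, so only one new focus measurement is needed per step, which matters when every evaluation requires physically moving the lens.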

preprint · 2022 · arXiv

Backpropagation through Time and Space: Learning Numerical Methods with Multi-Agent Reinforcement Learning

We introduce Backpropagation Through Time and Space (BPTTS), a method for training a recurrent spatio-temporal neural network that is used in a homogeneous multi-agent reinforcement learning (MARL) setting to learn numerical methods for hyperbolic conservation laws. We treat the numerical schemes underlying partial differential equations (PDEs) as a Partially Observable Markov Game (POMG) in Reinforcement Learning (RL). Similar to numerical solvers, our agent acts at each discrete location of a computational space for efficient and generalizable learning. To learn higher-order spatial methods by acting on local states, the agent must discern how its actions at a given spatio-temporal location affect the future evolution of the state. The manifestation of this non-stationarity is addressed by BPTTS, which allows for the flow of gradients across both space and time. The learned numerical policies are comparable to the SOTA numerics in two settings, the Burgers' Equation and the Euler Equations, and generalize well to other simulation set-ups.

preprint · 2022 · arXiv

BSRT: Improving Burst Super-Resolution with Swin Transformer and Flow-Guided Deformable Alignment

This work addresses the Burst Super-Resolution (BurstSR) task, which requires restoring a high-quality image from a sequence of noisy, misaligned, and low-resolution RAW bursts, using a new architecture. To overcome the challenges in BurstSR, we propose a Burst Super-Resolution Transformer (BSRT), which can significantly improve the capability of extracting inter-frame information and reconstruction. To achieve this goal, we propose a Pyramid Flow-Guided Deformable Convolution Network (Pyramid FG-DCN) and incorporate Swin Transformer Blocks and Groups as our main backbone. More specifically, we combine optical flows and deformable convolutions, so that our BSRT can handle misalignment and aggregate the potential texture information in multiple frames more efficiently. In addition, our Transformer-based structure can capture long-range dependency to further improve the performance. The evaluation on both synthetic and real-world tracks demonstrates that our approach achieves a new state-of-the-art in the BurstSR task. Further, our BSRT wins the championship in the NTIRE2022 Burst Super-Resolution Challenge.

preprint · 2022 · arXiv

Deep Constrained Least Squares for Blind Image Super-Resolution

In this paper, we tackle the problem of blind image super-resolution (SR) with a reformulated degradation model and two novel modules. Following the common practices of blind SR, our method proposes to improve both the kernel estimation and the kernel-based high-resolution image restoration. To be more specific, we first reformulate the degradation model such that the deblurring kernel estimation can be transferred into the low-resolution space. On top of this, we introduce a dynamic deep linear filter module. Instead of learning a fixed kernel for all images, it can adaptively generate deblurring kernel weights conditional on the input and yield a more robust kernel estimation. Subsequently, a deep constrained least squares filtering module is applied to generate clean features based on the reformulation and estimated kernel. The deblurred feature and the low input image feature are then fed into a dual-path structured SR network to restore the final high-resolution result. To evaluate our method, we further conduct evaluations on several benchmarks, including Gaussian8 and DIV2KRK. Our experiments demonstrate that the proposed method achieves better accuracy and visual improvements against state-of-the-art methods.

preprint · 2022 · arXiv

Detecting tiny objects in aerial images: A normalized Wasserstein distance and a new benchmark

Tiny object detection (TOD) in aerial images is challenging since a tiny object only contains a few pixels. State-of-the-art object detectors do not provide satisfactory results on tiny objects due to the lack of supervision from discriminative features. Our key observation is that the Intersection over Union (IoU) metric and its extensions are very sensitive to the location deviation of the tiny objects, which drastically deteriorates the quality of label assignment when used in anchor-based detectors. To tackle this problem, we propose a new evaluation metric dubbed Normalized Wasserstein Distance (NWD) and a new RanKing-based Assigning (RKA) strategy for tiny object detection. The proposed NWD-RKA strategy can be easily embedded into all kinds of anchor-based detectors to replace the standard IoU threshold-based one, significantly improving label assignment and providing sufficient supervision information for network training. Tested on four datasets, NWD-RKA can consistently improve tiny object detection performance by a large margin. Besides, observing prominent noisy labels in the Tiny Object Detection in Aerial Images (AI-TOD) dataset, we are motivated to meticulously relabel it and release AI-TOD-v2 and its corresponding benchmark. In AI-TOD-v2, the missing annotation and location error problems are considerably mitigated, facilitating more reliable training and validation processes. Embedding NWD-RKA into DetectoRS, the detection performance achieves 4.3 AP points improvement over state-of-the-art competitors on AI-TOD-v2. Datasets, codes, and more visualizations are available at: https://chasel-tsui.github.io/AI-TOD-v2/

preprint · 2022 · arXiv

Documentation based Semantic-Aware Log Parsing

With the recent advances of deep learning techniques, there is rapidly growing interest in applying machine learning to log data. As a fundamental part of log analytics, accurate log parsing that transforms raw logs to structured events is critical for subsequent machine learning and data mining tasks. Previous approaches either analyze the source code for parsing or are data-driven, such as text clustering. They largely neglect to exploit another widely available and valuable resource, software documentation that provides detailed explanations for the messages, to improve accuracy. In this paper, we propose an approach and system framework to use documentation knowledge for log parsing. With parameter value identification, it can improve the parsing accuracy not only for documented messages but also for undocumented messages. In addition, it can discover the linkages between event templates that are established by sharing parameters and indicate the correlation of the event context.

preprint · 2022 · arXiv

Enabling arbitrary translation objectives with Adaptive Tree Search

We introduce an adaptive tree search algorithm that can find high-scoring outputs under translation models that make no assumptions about the form or structure of the search objective. This algorithm, a deterministic variant of Monte Carlo tree search, enables the exploration of new kinds of models that are unencumbered by constraints imposed to make decoding tractable, such as autoregressivity or conditional independence assumptions. When applied to autoregressive models, our algorithm has different biases than beam search has, which enables a new analysis of the role of decoding bias in autoregressive models. Empirically, we show that our adaptive tree search algorithm finds outputs with substantially better model scores compared to beam search in autoregressive models, and compared to reranking techniques in models whose scores do not decompose additively with respect to the words in the output. We also characterise the correlation of several translation model objectives with respect to BLEU. We find that while some standard models are poorly calibrated and benefit from the beam search bias, other often more robust models (autoregressive models tuned to maximize expected automatic metric scores, the noisy channel model and a newly proposed objective) benefit from increasing amounts of search using our proposed decoder, whereas the beam search bias limits the improvements obtained from such objectives. Thus, we argue that as models improve, the improvements may be masked by over-reliance on beam search or reranking based methods.

preprint · 2022 · arXiv

Fast Nearest Convolution for Real-Time Efficient Image Super-Resolution

Deep learning-based single image super-resolution (SISR) approaches have drawn much attention and achieved remarkable success on modern advanced GPUs. However, most state-of-the-art methods require a huge number of parameters, memory, and computational resources, and usually show inferior inference times when applied to current mobile device CPUs/NPUs. In this paper, we propose a simple plain convolution network with a fast nearest convolution module (NCNet), which is NPU-friendly and can perform a reliable super-resolution in real-time. The proposed nearest convolution has the same performance as the nearest upsampling but is much faster and more suitable for Android NNAPI. Our model can be easily deployed on mobile devices with 8-bit quantization and is fully compatible with all major mobile AI accelerators. Moreover, we conduct comprehensive experiments on different tensor operations on a mobile device to illustrate the efficiency of our network architecture. Our NCNet is trained and validated on the DIV2K 3x dataset, and the comparison with other efficient SR methods demonstrated that the NCNet can achieve high fidelity SR results with shorter inference times. Our codes and pretrained models are publicly available at https://github.com/Algolzw/NCNet.
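One way a convolution-style operation can reproduce nearest upsampling is sketched below. It is an illustration under stated assumptions, not NCNet's code: it assumes the operation is a fixed-weight 1x1 convolution that copies every input channel r*r times, followed by a depth-to-space rearrangement. The helper names are hypothetical, and tensors are plain nested lists in (channel, height, width) order.

```python
def nearest_upsample(x, r):
    """Reference nearest-neighbor upsampling of a (C, H, W) nested list by factor r."""
    return [[[row[j // r] for j in range(len(row) * r)]
             for row in channel for _ in range(r)]
            for channel in x]


def repeat_channels(x, r):
    """Fixed-weight 1x1 'convolution': copy each input channel r*r times."""
    return [channel for channel in x for _ in range(r * r)]


def depth_to_space(x, r):
    """Rearrange (C*r*r, H, W) into (C, H*r, W*r).

    out[c][h*r + i][w*r + j] = x[c*r*r + i*r + j][h][w]
    """
    c_out = len(x) // (r * r)
    height, width = len(x[0]), len(x[0][0])
    return [[[x[c * r * r + (y % r) * r + (z % r)][y // r][z // r]
              for z in range(width * r)]
             for y in range(height * r)]
            for c in range(c_out)]


image = [[[1, 2], [3, 4]],
         [[5, 6], [7, 8]]]  # 2 channels, 2x2 pixels
scale = 3
via_conv = depth_to_space(repeat_channels(image, scale), scale)
print(via_conv == nearest_upsample(image, scale))
```

Because every one of the r*r sub-pixel positions reads from an identical copy of the source channel, the rearranged output matches nearest upsampling while using only operations that are commonly available on mobile accelerators.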

preprint · 2022 · arXiv

First implementation of full-workflow automation in radiotherapy: the All-in-One solution on rectal cancer

The aim of this work is to describe the technical characteristics of an AI-powered radiotherapy workflow that enables full-process automation (All-in-One), evaluate its performance implemented for on-couch initial treatment of rectal cancer, and provide insight into the behavior of full-workflow automation in the specialty of radiotherapy. The All-in-One workflow was developed based on a CT-integrated linear accelerator. It incorporates routine radiotherapy procedures from simulation, autosegmentation, autoplanning, image guidance, beam delivery, and in vivo quality assurance (QA) into one scheme, with critical decision points involved, while the patient is on the treatment couch during the whole process. For the enrolled ten patients with rectal cancer, minor modifications of the autosegmented target volumes were required, and the Dice similarity coefficient and 95% Hausdorff distance before and after modifications were 0.892±0.061 and 18.2±13.0 mm, respectively. The autosegmented normal tissues and automatic plans were clinically acceptable without any modifications or reoptimization. The pretreatment IGRT corrections were within 2 mm in all directions, and the EPID-based in vivo QA showed a γ passing rate better than 97% (3%/3 mm/10% threshold). The duration of the whole process was 23.2±3.5 minutes, depending mostly on the time required for manual modification and plan evaluation. The All-in-One workflow enables full automation of the entire radiotherapy process by seamlessly integrating multiple routine procedures. The one-stop solution shortens the time scale it takes to ready the first treatment from days to minutes, significantly improving the patient experience and the efficiency of the workflow, and shows potential to facilitate the clinical application of online adaptive replanning.

preprint · 2022 · arXiv

GLF-CR: SAR-Enhanced Cloud Removal with Global-Local Fusion

The challenge of the cloud removal task can be alleviated with the aid of Synthetic Aperture Radar (SAR) images that can penetrate cloud cover. However, the large domain gap between optical and SAR images as well as the severe speckle noise of SAR images may cause significant interference in SAR-based cloud removal, resulting in performance degeneration. In this paper, we propose a novel global-local fusion based cloud removal (GLF-CR) algorithm to leverage the complementary information embedded in SAR images. Exploiting the power of SAR information to promote cloud removal entails two aspects. The first, global fusion, guides the relationship among all local optical windows to keep the structure of the recovered region consistent with the remaining cloud-free regions. The second, local fusion, transfers complementary information embedded in the SAR image that corresponds to cloudy areas to generate reliable texture details of the missing regions, and uses dynamic filtering to alleviate the performance degradation caused by speckle noise. Extensive evaluation demonstrates that the proposed algorithm can yield high-quality cloud-free images and outperform state-of-the-art cloud removal algorithms with a gain of about 1.7 dB in terms of PSNR on the SEN12MS-CR dataset.

preprint · 2022 · arXiv

Learning to Extract Building Footprints from Off-Nadir Aerial Images

Extracting building footprints from aerial images is essential for precise urban mapping with photogrammetric computer vision technologies. Existing approaches mainly assume that the roof and footprint of a building are well overlapped, which may not hold in off-nadir aerial images as there is often a big offset between them. In this paper, we propose an offset vector learning scheme, which turns the building footprint extraction problem in off-nadir images into an instance-level joint prediction problem of the building roof and its corresponding "roof to footprint" offset vector. Thus the footprint can be estimated by translating the predicted roof mask according to the predicted offset vector. We further propose a simple but effective feature-level offset augmentation module, which can significantly refine the offset vector prediction by introducing little extra cost. Moreover, a new dataset, Buildings in Off-Nadir Aerial Images (BONAI), is created and released in this paper. It contains 268,958 building instances across 3,300 aerial images with fully annotated instance-level roof, footprint, and corresponding offset vector for each building. Experiments on the BONAI dataset demonstrate that our method achieves the state-of-the-art, outperforming other competitors by 3.37 to 7.39 points in F1-score. The codes, datasets, and trained models are available at https://github.com/jwwangchn/BONAI.git.
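The final step described above, translating a predicted roof mask by a predicted offset vector to estimate the footprint, can be sketched as follows. This is a minimal illustration and not the released BONAI code: the function name, the integer pixel offset given as (dx, dy), and the choice to drop pixels shifted outside the image are assumptions made here.

```python
def translate_mask(mask, offset):
    """Shift a binary mask (list of rows) by an integer offset (dx, dy).

    Pixels that would land outside the image are dropped.
    """
    height, width = len(mask), len(mask[0])
    dx, dy = offset
    shifted = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if mask[y][x]:
                nx, ny = x + dx, y + dy
                if 0 <= nx < width and 0 <= ny < height:
                    shifted[ny][nx] = 1
    return shifted


# Hypothetical predictions for one building instance.
roof_mask = [
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
roof_to_footprint_offset = (1, 2)  # one pixel right, two pixels down

footprint_mask = translate_mask(roof_mask, roof_to_footprint_offset)
for row in footprint_mask:
    print(row)
```

A real pipeline would round or interpolate sub-pixel offsets; the integer case is used here only to keep the sketch short.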

preprint · 2022 · arXiv

Linear change and minutes variability of solar wind velocity revealed by FAST

Observation of Interplanetary Scintillation (IPS) provides an important and effective way to study the solar wind and space weather. A series of IPS observations were conducted by the Five-hundred-meter Aperture Spherical radio Telescope (FAST). The extraordinary sensitivity and the wide frequency coverage make FAST an ideal platform for IPS studies. In this paper we present some first scientific results from FAST observations of IPS with the L-band receiver. Based on the solar wind velocity fitting values of FAST observations on September 26-28, 2020, we found that the velocity decreases linearly with increasing frequency, which has not yet been reported in the literature. We have also detected a variation of solar wind velocity on a timescale of 3-5 minutes, which implies a slow change of the background solar wind, a co-existence of high- and low-speed streams, or a reflection of the quasi-periodic electron-density fluctuations.

preprint · 2022 · arXiv

MAD for Robust Reinforcement Learning in Machine Translation

We introduce a new distributed policy gradient algorithm and show that it outperforms existing reward-aware training procedures such as REINFORCE, minimum risk training (MRT) and proximal policy optimization (PPO) in terms of training stability and generalization performance when optimizing machine translation models. Our algorithm, which we call MAD (on account of using the mean absolute deviation in the importance weighting calculation), has distributed data generators sampling multiple candidates per source sentence on worker nodes, while a central learner updates the policy. MAD depends crucially on two variance reduction strategies: (1) a conditional reward normalization method that ensures each source sentence has both positive and negative reward translation examples and (2) a new robust importance weighting scheme that acts as a conditional entropy regularizer. Experiments on a variety of translation tasks show that policies learned using the MAD algorithm perform very well when using both greedy decoding and beam search, and that the learned policies are sensitive to the specific reward used during training.

preprint · 2022 · arXiv

Noun2Verb: Probabilistic frame semantics for word class conversion

Humans can flexibly extend word usages across different grammatical classes, a phenomenon known as word class conversion. Noun-to-verb conversion, or denominal verb (e.g., to Google a cheap flight), is one of the most prevalent forms of word class conversion. However, existing natural language processing systems are impoverished in interpreting and generating novel denominal verb usages. Previous work has suggested that novel denominal verb usages are comprehensible if the listener can compute the intended meaning based on shared knowledge with the speaker. Here we explore a computational formalism for this proposal couched in frame semantics. We present a formal framework, Noun2Verb, that simulates the production and comprehension of novel denominal verb usages by modeling shared knowledge of speaker and listener in semantic frames. We evaluate an incremental set of probabilistic models that learn to interpret and generate novel denominal verb usages via paraphrasing. We show that a model where the speaker and listener cooperatively learn the joint distribution over semantic frame elements better explains the empirical denominal verb usages than state-of-the-art language models, evaluated against data from 1) contemporary English in both adult and child speech, 2) contemporary Mandarin Chinese, and 3) the historical development of English. Our work grounds word class conversion in probabilistic frame semantics and bridges the gap between natural language processing systems and humans in lexical creativity.

preprint · 2022 · arXiv

NTIRE 2022 Challenge on Efficient Super-Resolution: Methods and Results

This paper reviews the NTIRE 2022 challenge on efficient single image super-resolution with focus on the proposed solutions and results. The task of the challenge was to super-resolve an input image with a magnification factor of ×4 based on pairs of low and corresponding high resolution images. The aim was to design a network for single image super-resolution that achieved improvement of efficiency measured according to several metrics including runtime, parameters, FLOPs, activations, and memory consumption while at least maintaining the PSNR of 29.00 dB on the DIV2K validation set. IMDN is set as the baseline for efficiency measurement. The challenge had 3 tracks including the main track (runtime), sub-track one (model complexity), and sub-track two (overall performance). In the main track, the practical runtime performance of the submissions was evaluated. The ranking of the teams was determined directly by the absolute value of the average runtime on the validation set and test set. In sub-track one, the number of parameters and FLOPs were considered, and the individual rankings of the two metrics were summed up to determine a final ranking in this track. In sub-track two, all of the five metrics mentioned in the description of the challenge including runtime, parameter count, FLOPs, activations, and memory consumption were considered. Similar to sub-track one, the rankings of five metrics were summed up to determine a final ranking. The challenge had 303 registered participants, and 43 teams made valid submissions. They gauge the state-of-the-art in efficient single image super-resolution.
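The summed-ranking rule used in the sub-tracks can be sketched as follows. The team names and metric values are hypothetical, every metric is assumed to be lower-is-better, and tie handling (not specified in the abstract) simply follows input order here.

```python
def rank_sum(scores):
    """Rank teams by the sum of their per-metric ranks (lower is better).

    scores maps team name -> {metric name: value}. Returns the final team
    order and the summed rank of each team.
    """
    teams = list(scores)
    metrics = list(next(iter(scores.values())))
    totals = {team: 0 for team in teams}
    for metric in metrics:
        # Rank 1 goes to the smallest value of this metric.
        ordered = sorted(teams, key=lambda team: scores[team][metric])
        for rank, team in enumerate(ordered, start=1):
            totals[team] += rank
    final_order = sorted(teams, key=lambda team: totals[team])
    return final_order, totals


# Hypothetical sub-track-one style input: parameters (M) and FLOPs (G).
submissions = {
    "team_a": {"params": 0.30, "flops": 17.0},
    "team_b": {"params": 0.25, "flops": 25.0},
    "team_c": {"params": 0.20, "flops": 18.0},
}
order, totals = rank_sum(submissions)
print(order)   # ['team_c', 'team_a', 'team_b']
print(totals)  # {'team_a': 4, 'team_b': 5, 'team_c': 3}
```

Summing ranks rather than raw values keeps metrics with different units, such as parameter counts and FLOPs, from dominating one another.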

preprint · 2022 · arXiv

On the Rate-Distortion-Perception Function

Rate-distortion-perception theory generalizes Shannon's rate-distortion theory by introducing a constraint on the perceptual quality of the output. The perception constraint complements the conventional distortion constraint and aims to enforce distribution-level consistencies. In this new theory, the information-theoretic limit is characterized by the rate-distortion-perception function. Although a coding theorem for the rate-distortion-perception function has recently been established, the fundamental nature of the optimal coding schemes remains unclear, especially regarding the role of randomness in encoding and decoding. It is shown in the present work that except for certain extreme cases, the rate-distortion-perception function is achievable by deterministic codes. This paper also clarifies the subtle differences between two notions of perfect perceptual quality and explores some alternative formulations of the perception constraint.
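For orientation, a commonly used form of the rate-distortion-perception function is given below. The notation is an assumption made here and may differ from the paper's: Δ is a distortion measure, d is a divergence between the source distribution and the reconstruction distribution, and D and P are the distortion and perception levels.

```latex
\[
  R(D, P) \;=\; \min_{p_{\hat{X} \mid X}} \; I(X; \hat{X})
  \quad \text{s.t.} \quad
  \mathbb{E}\bigl[\Delta(X, \hat{X})\bigr] \le D,
  \qquad
  d\bigl(p_X, p_{\hat{X}}\bigr) \le P
\]
```

Dropping the perception constraint (letting P grow without bound) recovers Shannon's rate-distortion function, which is the sense in which the theory generalizes it.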

preprint · 2022 · arXiv

QPanda: high-performance quantum computing framework for multiple application scenarios

With the birth of Noisy Intermediate Scale Quantum (NISQ) devices and the verification of "quantum supremacy" in random number sampling and boson sampling, more and more fields hope to use quantum computers to solve specific problems, such as aerodynamic design, route allocation, financial option prediction, quantum chemical simulation to find new materials, and the challenge of quantum cryptography to automotive industry security. However, these fields still need to constantly explore quantum algorithms that adapt to the current NISQ machine, so a quantum programming framework that can face multi-scenarios and application needs is required. Therefore, this paper proposes QPanda, an application scenario-oriented quantum programming framework with high-performance simulation. Examples include designing quantum chemical simulation algorithms based on it to explore new materials and building a quantum machine learning framework to serve finance. This framework implements high-performance simulation of quantum circuits, a configuration of the fusion processing backend of quantum computers and supercomputers, and compilation and optimization methods of quantum programs for NISQ machines. Finally, the experiment shows that quantum jobs can be executed with high fidelity on the quantum processor using the quantum circuit compilation and optimization interface, and that the framework has better simulation performance.

preprint · 2022 · arXiv

Sequential Channel Synthesis

The channel synthesis problem has been widely investigated over the last decade. In this paper, we consider the sequential version in which the encoder and the decoder work in a sequential way. Under a mild assumption on the target joint distribution we provide a complete (single-letter) characterization of the solution for the point-to-point case, which shows that the canonical symbol-by-symbol mapping is not optimal in general, but is indeed optimal if we make some additional assumptions on the encoder and decoder. We also extend this result to the broadcast scenario and the interactive communication scenario. We provide bounds in the broadcast setting and a complete characterization of the solution under a mild condition on the target joint distribution in the interactive communication case. Our proofs are based on a Rényi entropy method.

preprint · 2022 · arXiv

Unifying Motion Deblurring and Frame Interpolation with Events

Slow shutter speed and long exposure time of frame-based cameras often cause visual blur and loss of inter-frame information, degenerating the overall quality of captured videos. To this end, we present a unified framework of event-based motion deblurring and frame interpolation for blurry video enhancement, where the extremely low latency of events is leveraged to alleviate motion blur and facilitate intermediate frame prediction. Specifically, the mapping relation between blurry frames and sharp latent images is first predicted by a learnable double integral network, and a fusion network is then proposed to refine the coarse results via utilizing the information from consecutive blurry inputs and the concurrent events. By exploring the mutual constraints among blurry frames, latent images, and event streams, we further propose a self-supervised learning framework to enable network training with real-world blurry videos and events. Extensive experiments demonstrate that our method compares favorably against the state-of-the-art approaches and achieves remarkable performance on both synthetic and real-world datasets.

preprint · 2022 · arXiv

Weak solutions to an initial-boundary value problem for a continuum equation of motion of grain boundaries

We investigate an initial-(periodic-)boundary value problem for a continuum equation, which is a model for motion of grain boundaries based on the underlying microscopic mechanisms of line defects (disconnections) and integrating the effects of a diverse range of thermodynamic driving forces. We first prove the global-in-time existence and uniqueness of a weak solution to this initial-boundary value problem in the case with positive equilibrium disconnection density parameter B, and then investigate the asymptotic behavior of the solutions as B goes to zero. The main difficulties in the proof of the main theorems are due to the degeneracy of B=0, a non-local term with singularity, and a non-smooth coefficient of the highest derivative associated with the gradient of the unknown. The key ingredients in the proof are the energy method, an estimate for a singular integral of the Hilbert type, and a compactness lemma.

preprint2021arXiv

On Non-Interactive Simulation of Binary Random Variables

We leverage proof techniques from Fourier analysis and an existing result in coding theory to derive new bounds for the problem of non-interactive simulation of binary random variables. Previous bounds in the literature were derived by applying data processing inequalities concerning maximal correlation or hypercontractivity. We show that our bounds are sharp in some regimes. For a specific instance of problem parameters, our main result answers an open problem posed by E. Mossel in 2017. As by-products of our analyses, various new properties of the average distance and distance enumerator of binary block codes are established.

preprint2020arXiv

Better Document-Level Machine Translation with Bayes' Rule

We show that Bayes' rule provides an effective mechanism for creating document translation models that can be learned from only parallel sentences and monolingual documents, a compelling benefit as parallel documents are not always available. In our formulation, the posterior probability of a candidate translation is the product of the unconditional (prior) probability of the candidate output document and the "reverse translation probability" of translating the candidate output back into the source language. Our proposed model uses a powerful autoregressive language model as the prior on target language documents, but it assumes that each sentence is translated independently from the target to the source language. Crucially, at test time, when a source document is observed, the document language model prior induces dependencies between the translations of the source sentences in the posterior. The model's independence assumption not only enables efficient use of available data, but it additionally admits a practical left-to-right beam-search algorithm for carrying out inference. Experiments show that our model benefits from using cross-sentence context in the language model, and it outperforms existing document translation approaches.
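
The factorization described in this abstract can be written out as a short formula. This is only a sketch: the symbols $x = (x_1, \dots, x_I)$ for the source sentences, $y = (y_1, \dots, y_I)$ for the candidate target sentences, and the one-to-one sentence alignment are notation assumed here for illustration and do not appear in the record.

```latex
p(y \mid x) \;\propto\; p(y)\, p(x \mid y)
  \;=\; \underbrace{\prod_{i=1}^{I} p(y_i \mid y_{<i})}_{\text{document language model prior}}
  \;\times\;
  \underbrace{\prod_{i=1}^{I} p(x_i \mid y_i)}_{\text{sentence-level reverse translation}}
```

The second product reflects the stated independence assumption, while the first couples the sentences, which is how the prior induces dependencies between translations in the posterior.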

preprint2020arXiv

Capturing document context inside sentence-level neural machine translation models with self-training

Neural machine translation (NMT) has arguably achieved human-level parity when trained and evaluated at the sentence level. Document-level neural machine translation has received less attention and lags behind its sentence-level counterpart. The majority of the proposed document-level approaches investigate ways of conditioning the model on several source or target sentences to capture document context. These approaches require training a specialized NMT model from scratch on parallel document-level corpora. We propose an approach that doesn't require training a specialized model on parallel document-level corpora and is applied to a trained sentence-level NMT model at decoding time. We process the document from left to right multiple times and self-train the sentence-level model on pairs of source sentences and generated translations. Our approach reinforces the choices made by the model, thus making it more likely that the same choices will be made in other sentences in the document. We evaluate our approach on three document-level datasets: NIST Chinese-English, WMT'19 Chinese-English and OpenSubtitles English-Russian. We demonstrate that our approach achieves a higher BLEU score and higher human preference than the baseline. Qualitative analysis of our approach shows that choices made by the model are consistent across the document.

preprint2020arXiv

Corrections to "Wyner's Common Information under Rényi Divergence Measures"

In this correspondence, we correct an erroneous result on the achievability part of the Rényi common information with order $1+s\in(1,2]$ in [1]. The new achievability result (upper bound) of the Rényi common information no longer coincides with Wyner's common information. We also provide a new converse result (lower bound) in this correspondence for the Rényi common information with order $1+s\in(1,\infty]$. Numerical results show that for doubly symmetric binary sources, the new upper and lower bounds coincide for the order $1+s\in(1,2]$ and they are both strictly larger than Wyner's common information for this case.

preprint2020arXiv

Event Enhanced High-Quality Image Recovery

With extremely high temporal resolution, event cameras have great potential for robotics and computer vision. However, their asynchronous imaging mechanism often increases the sensitivity of measurements to noise and makes it physically difficult to increase the spatial resolution of the image. To recover high-quality intensity images, one should address both denoising and super-resolution problems for event cameras. Since events depict brightness changes, a degradation model enhanced by the events allows clear and sharp high-resolution latent images to be recovered from noisy, blurry and low-resolution intensity observations. Within the framework of sparse learning, the events and the low-resolution intensity observations can be considered jointly. Based on this, we propose an explainable network, an event-enhanced sparse learning network (eSL-Net), to recover high-quality images from event cameras. After training with a synthetic dataset, the proposed eSL-Net can improve on the state of the art by 7-12 dB. Furthermore, without additional training, the proposed eSL-Net can be easily extended to generate continuous frames with a frame rate as high as that of the events.

preprint2020arXiv

Exact minimum codegree thresholds for $K_4^-$-covering and $K_5^-$-covering

Given two $3$-graphs $F$ and $H$, an $F$-covering of $H$ is a collection of copies of $F$ in $H$ such that each vertex of $H$ is contained in at least one copy of them. Let $c_2(n,F)$ be the maximum integer $t$ such that every 3-graph with minimum codegree greater than $t$ has an $F$-covering. In this note, we answer an open problem of Falgas-Ravry and Zhao (SIAM J. Discrete Math., 2016) by determining the exact value of $c_2(n, K_4^-)$ and $c_2(n, K_5^-)$, where $K_t^-$ is the complete $3$-graph on $t$ vertices with one edge removed.

preprint2020arXiv

Implicit Euler ODE Networks for Single-Image Dehazing

Deep convolutional neural networks (CNNs) have been applied to image dehazing tasks, where the residual network (ResNet) is often adopted as the basic component to avoid the vanishing gradient problem. Recently, many works have indicated that the ResNet can be considered as the explicit Euler forward approximation of an ordinary differential equation (ODE). In this paper, we extend the explicit forward approximation to the implicit backward counterpart, which can be realized via a recursive neural network, named IM-block. Given that, we propose an efficient end-to-end multi-level implicit network (MI-Net) for the single image dehazing problem. Moreover, a multi-level fusing (MLF) mechanism and a residual channel attention block (RCA-block) are adopted to boost the performance of our network. Experiments on several dehazing benchmark datasets demonstrate that our method outperforms existing methods and achieves state-of-the-art performance.
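
The difference between the explicit and implicit Euler updates mentioned in this abstract can be illustrated numerically. The sketch below is not the IM-block or MI-Net: `toy_residual` is a hypothetical scalar stand-in for a learned residual branch, and the implicit update is solved by plain fixed-point iteration, which is assumed to converge because the toy function is a contraction.

```python
from typing import Callable


def explicit_step(f: Callable[[float], float], x: float) -> float:
    """Explicit (forward) Euler update, the form a residual block takes: x + f(x)."""
    return x + f(x)


def implicit_step(f: Callable[[float], float], x: float, iters: int = 50) -> float:
    """Implicit (backward) Euler update: solve z = x + f(z) by fixed-point iteration.

    Convergence is assumed (f must be a contraction, i.e. |f'| < 1).
    """
    z = x  # initial guess: the current state
    for _ in range(iters):
        z = x + f(z)
    return z


def toy_residual(x: float) -> float:
    """Hypothetical stand-in for a learned residual branch (an assumption)."""
    return -0.5 * x


if __name__ == "__main__":
    print(explicit_step(toy_residual, 1.0))  # forward update from x = 1.0
    print(implicit_step(toy_residual, 1.0))  # backward update from x = 1.0
```

For this toy function the implicit update has the closed form x / 1.5, so the repeated application of the same map inside `implicit_step` shows why a recursive structure is a natural way to realize it.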

preprint2020arXiv

Improving Spiking Sparse Recovery via Non-Convex Penalties

Compared with digital methods, sparse recovery based on spiking neural networks has great advantages such as high computational efficiency and low power consumption. However, current spiking algorithms cannot guarantee more accurate estimates since they are usually designed to solve the classical optimization with convex penalties, especially the $\ell_{1}$-norm. In fact, convex penalties are observed to underestimate the true solution in practice, while non-convex ones can avoid the underestimation. Inspired by this, we propose an adaptive version of the spiking sparse recovery algorithm to solve the non-convex regularized optimization, and provide an analysis of its global asymptotic convergence. Experiments show that the accuracy is greatly improved under different adaptive schemes.

preprint2020arXiv

Intra-Ensemble in Neural Networks

Improving model performance is always the key problem in machine learning, including deep learning. However, stand-alone neural networks always suffer from marginal effect when stacking more layers. At the same time, ensembling is a useful technique to further enhance model performance. Nevertheless, training several independent deep neural networks for an ensemble multiplies the resource cost. Is it then possible to use an ensemble within a single neural network? In this work, we propose Intra-Ensemble, an end-to-end ensemble strategy with stochastic channel recombination operations to train several sub-networks simultaneously within one neural network. The additional parameter size is marginal since the majority of parameters are mutually shared. Meanwhile, stochastic channel recombination significantly increases the diversity of sub-networks, which finally enhances ensemble performance. Extensive experiments and ablation studies demonstrate the applicability of intra-ensemble on various kinds of datasets and network architectures.

preprint2020arXiv

Matching Neuromorphic Events and Color Images via Adversarial Learning

The event camera has appealing properties: high dynamic range, low latency, low power consumption and low memory usage, and thus complements conventional frame-based cameras. It captures only the dynamics of a scene and is able to capture almost "continuous" motion. However, unlike a frame-based camera, which records the whole appearance of a scene, the event camera discards the detailed characteristics of objects, such as texture and color. To take advantage of both modalities, the event camera and the frame-based camera are combined for various machine vision tasks. Cross-modal matching between neuromorphic events and color images then plays a vital role. In this paper, we propose the Event-Based Image Retrieval (EBIR) problem to exploit the cross-modal matching task. Given an event stream depicting a particular object as query, the aim is to retrieve color images containing the same object. This problem is challenging because there exists a large modality gap between neuromorphic events and color images. We address the EBIR problem by proposing neuromorphic Events-Color image Feature Learning (ECFL). In particular, adversarial learning is employed to jointly model neuromorphic events and color images in a common embedding space. We also contribute the N-UKbench and EC180 datasets to the community to promote the development of the EBIR problem. Extensive experiments on our datasets show that the proposed method is superior in learning effective modality-invariant representations to link the two modalities.

preprint2020arXiv

Mixed Noise Removal with Pareto Prior

Denoising images contaminated by a mixture of additive white Gaussian noise (AWGN) and impulse noise (IN) is an essential but challenging problem. The presence of impulsive disturbances inevitably affects the noise distribution and thus largely degrades the performance of traditional AWGN denoisers. Existing methods aim to compensate for the effects of IN by introducing a weighting matrix, which, however, lacks a proper prior and is thus hard to estimate accurately. To address this problem, we exploit the Pareto distribution as the prior of the weighting matrix, based on which an accurate and robust weight estimator is proposed for mixed noise removal. In particular, a relatively small portion of pixels are assumed to be contaminated with IN; these should have small weights and then be penalized out. This phenomenon can be properly described by the Pareto distribution of type 1. Therefore, armed with the Pareto distribution, we formulate the problem of mixed noise removal in the Bayesian framework, where a nonlocal self-similarity prior is further exploited by adopting nonlocal low-rank approximation. Compared to existing methods, the proposed method can estimate the weighting matrix adaptively, accurately, and robustly for different noise levels, and can thus boost the denoising performance. Experimental results on widely used image datasets demonstrate the superiority of our proposed method over state-of-the-art methods.

preprint2020arXiv

On Exact and $\infty$-Rényi Common Informations

Recently, two extensions of Wyner's common information, the exact and Rényi common informations, were introduced respectively by Kumar, Li, and El Gamal (KLE), and the present authors. The class of common information problems involves determining the minimum rate of the common input to two independent processors needed to exactly or approximately generate a target joint distribution. For the exact common information problem, exact generation of the target distribution is required, while for Wyner's and $α$-Rényi common informations, the relative entropy and Rényi divergence with order $α$ were respectively used to quantify the discrepancy between the synthesized and target distributions. The exact common information is larger than or equal to Wyner's common information. However, it was hitherto unknown whether the former is strictly larger than the latter for some joint distributions. In this paper, we first establish the equivalence between the exact and $\infty$-Rényi common informations, and then provide single-letter upper and lower bounds for these two quantities. For doubly symmetric binary sources, we show that the upper and lower bounds coincide, which implies that for such sources, the exact and $\infty$-Rényi common informations are completely characterized. Interestingly, we observe that for such sources, these two common informations are strictly larger than Wyner's. This answers an open problem posed by KLE. Furthermore, we extend Wyner's, $\infty$-Rényi, and exact common informations to sources with countably infinite or continuous alphabets, including Gaussian sources.

preprint2020arXiv

Robust Time-Frequency Reconstruction by Learning Structured Sparsity

Time-frequency distributions (TFDs) play a vital role in providing descriptive analysis of non-stationary signals involved in realistic scenarios. It is well known that low time-frequency (TF) resolution and the emergence of cross-terms (CTs) are two main issues, which make it difficult to analyze and interpret practical signals using TFDs. In order to address these issues, we propose the U-Net aided iterative shrinkage-thresholding algorithm (U-ISTA) for reconstructing a near-ideal TFD by exploiting structured sparsity in the signal TF domain. Specifically, the signal ambiguity function is first compressed, followed by unfolding the ISTA as a recurrent neural network. To consider continuously distributed characteristics of signals, a structured sparsity constraint is incorporated into the unfolded ISTA by regarding the U-Net as an adaptive threshold block, in which structure-aware thresholds are learned from enormous training data to exploit the underlying dependencies among neighboring TF coefficients. The proposed U-ISTA model is trained on both non-overlapped and overlapped synthetic signals, including closely and far located non-stationary components. Experimental results demonstrate that the robust U-ISTA achieves superior performance compared with state-of-the-art algorithms, and gains a high TF resolution with CTs greatly eliminated even in low signal-to-noise ratio (SNR) environments.

preprint2020arXiv

Single Image Deraining with Continuous Rain Density Estimation

Single image deraining (SIDR) often suffers from over/under deraining due to the nonuniformity of rain densities and the variety of raindrop scales. In this paper, we propose a continuous density guided network (CODE-Net) for SIDR. In particular, it is composed of a rain streak extractor and a denoiser, where convolutional sparse coding (CSC) is exploited to filter out noise from the extracted rain streaks. Inspired by the reweighted iterative soft-threshold for CSC, we address the problem of continuous rain density estimation by learning the weights with channel attention blocks from sparse codes. We further develop a multiscale strategy to depict rain streaks appearing at different scales. Experiments on synthetic and real-world data demonstrate the superiority of our methods over recent state-of-the-art methods, in terms of both quantitative and qualitative results. Additionally, instead of quantizing rain density into several levels, our CODE-Net can provide continuous-valued estimates of rain densities, which is more desirable in real applications.

preprint2020arXiv

Structure-Aware Network for Lane Marker Extraction with Dynamic Vision Sensor

Lane marker extraction is a basic yet necessary task for autonomous driving. Although past years have witnessed major advances in lane marker extraction with deep learning models, they all target ordinary RGB images generated by frame-based cameras, which limits their performance in extreme cases, such as large illumination changes. To tackle this problem, we introduce the Dynamic Vision Sensor (DVS), a type of event-based sensor, to the lane marker extraction task and build a high-resolution DVS dataset for lane marker extraction. We collect the raw event data and generate 5,424 DVS images with a resolution of 1280$\times$800 pixels, the highest among all DVS datasets available now. All images are annotated in a multi-class semantic segmentation format. We then propose a structure-aware network for lane marker extraction in DVS images. It can capture directional information comprehensively with multidirectional slice convolution. We evaluate our proposed network against other state-of-the-art lane marker extraction models on this dataset. Experimental results demonstrate that our method outperforms other competitors. The dataset is made publicly available, including the raw event data, accumulated images and labels.

preprint2016arXiv

Comments on "Approximate Characterizations for the Gaussian Source Broadcast Distortion Region"

Recently, Tian et al. [1] considered joint source-channel coding for transmitting a Gaussian source over a $K$-user Gaussian broadcast channel, and derived an outer bound on the admissible distortion region. In [1], they stated "due to its nonlinear form, it appears difficult to determine whether it is always looser than the trivial outer bound in all distortion regimes with bandwidth compression". However, in this correspondence we solve this problem and prove that for the bandwidth expansion case ($K\geq2$), this outer bound is strictly tighter than the trivial outer bound with each user being optimal in the point-to-point setting; while for the bandwidth compression or bandwidth match case, this outer bound actually degenerates to the trivial outer bound. Therefore, our results imply that on one hand, the outer bound given in [1] is nontrivial only for Gaussian broadcast communication ($K\geq2$) with bandwidth expansion; on the other hand, unfortunately, no nontrivial outer bound exists so far for Gaussian broadcast communication ($K\geq2$) with bandwidth compression.

preprint2016arXiv

Distortion Bounds for Transmitting Correlated Sources with Common Part over MAC

This paper investigates the joint source-channel coding problem of sending two correlated memoryless sources with a common part over a memoryless multiple access channel (MAC). An inner bound and two outer bounds on the achievable distortion region are derived. In particular, they respectively recover the existing bounds for several special cases, such as communication without a common part, lossless communication, and noiseless communication. When specialized to the quadratic Gaussian communication case, that is, transmitting Gaussian sources with a Gaussian common part over a Gaussian MAC, the inner bound and outer bound are used to generate two new bounds. Numerical results show that the common part improves the distortion of such a distributed source-channel coding problem.

preprint2016arXiv

Frequency Estimation of Multiple Sinusoids with Sub-Nyquist Sampling Sequences

In some applications of frequency estimation, the frequencies of multiple sinusoids are required to be estimated from sub-Nyquist sampling sequences. In this paper, we propose a novel method based on subspace techniques to estimate the frequencies by using under-sampled samples. We analyze the impact of under-sampling and demonstrate that three sub-Nyquist sequences are general enough to estimate the frequencies under some condition. The frequencies estimated from one sequence are unfolded in frequency domain, and then the other two sequences are used to pick the correct frequencies from all possible frequencies. Simulations illustrate the validity of the theory. Numerical results show that this method is feasible and accurate at quite low sampling rates.

preprint2016arXiv

GPU-FV: Realtime Fisher Vector and Its Applications in Video Monitoring

Fisher vector has been widely used in many multimedia retrieval and visual recognition applications with good performance. However, its computational complexity prevents its usage in real-time video monitoring. In this work, we propose and implement GPU-FV, a fast Fisher vector extraction method with the help of modern GPUs. The challenge of implementing Fisher vector on GPUs lies in the data dependency in feature extraction and expensive memory access in Fisher vector computing. To handle these challenges, we carefully designed GPU-FV in a way that utilizes the computing power of the GPU as much as possible, and applied optimizations such as loop tiling to boost the performance. GPU-FV is about 12 times faster than the CPU version, and 50% faster than a non-optimized GPU implementation. For standard video input (320*240), GPU-FV can process each frame within 34ms on a model GPU. Our experiments show that GPU-FV obtains recognition accuracy similar to that of traditional FV on the VOC 2007 and Caltech 256 image sets. We also applied GPU-FV to realtime video monitoring tasks and found that GPU-FV outperforms a number of previous works. In particular, when the number of training examples is small, GPU-FV outperforms the recently popular deep CNN features borrowed from ImageNet. The code can be downloaded from the following link https://bitbucket.org/mawenjing/gpu-fv.

preprint2016arXiv

Line Spectral Estimation Based on Compressed Sensing with Deterministic Sub-Nyquist Sampling

As an alternative to traditional sampling theory, compressed sensing allows acquiring a much smaller amount of data while still estimating the spectra of frequency-sparse signals accurately. However, compressed sensing usually requires random sampling in data acquisition, which is difficult to implement in hardware. In this paper, we propose a simple deterministic sampling scheme, that is, sampling at three sub-Nyquist rates which have coprime undersampling ratios. This sampling method turns out to be valid in numerical experiments. A complex-valued multitask algorithm based on variational Bayesian inference is proposed to estimate the spectra of frequency-sparse signals after sampling. Simulations show that this method is feasible and robust at quite low sampling rates.

preprint2016arXiv

Local exact boundary controllability of entropy solutions to a class of hyperbolic systems of conservation laws

In this paper, we study the local exact boundary controllability of entropy solutions to a class of linearly degenerate hyperbolic systems of conservation laws with constant multiplicity. We prove the two-sided boundary controllability, one-sided boundary controllability and two-sided controllability with fewer controls, by applying the strategy used originally for classical solutions with essential modifications. Our constructive method is based on the well-posedness of semi-global solutions constructed as the limit of $\varepsilon$-approximate front tracking solutions to the mixed initial-boundary value problem with general nonlinear boundary conditions, and on some further properties of both $\varepsilon$-approximate front tracking solutions and entropy solutions.

preprint2016arXiv

Neural Variational Inference for Text Processing

Recent advances in neural variational inference have spawned a renaissance in deep latent variable models. In this paper we introduce a generic variational inference framework for generative and conditional models of text. While traditional variational methods derive an analytic approximation for the intractable distributions over latent variables, here we construct an inference network conditioned on the discrete text input to provide the variational distribution. We validate this framework on two very different text modelling applications, generative document modelling and supervised question answering. Our neural variational document model combines a continuous stochastic document representation with a bag-of-words generative model and achieves the lowest reported perplexities on two standard test corpora. The neural answer selection model employs a stochastic representation layer within an attention mechanism to extract the semantics between a question and answer pair. On two question answering benchmarks this model exceeds all previous published benchmarks.

preprint2016arXiv

Online Segment to Segment Neural Transduction

We introduce an online neural sequence to sequence model that learns to alternate between encoding and decoding segments of the input as it is read. By independently tracking the encoding and decoding representations our algorithm permits exact polynomial marginalization of the latent segmentation during training, and during decoding beam search is employed to find the best alignment path together with the predicted output sequence. Our model tackles the bottleneck of vanilla encoder-decoders that have to read and memorize the entire input sequence in their fixed-length hidden states before producing any output. It is different from previous attentive models in that, instead of treating the attention weights as output of a deterministic function, our model assigns attention weights to a sequential latent variable which can be marginalized out and permits online generation. Experiments on abstractive sentence summarization and morphological inflection show significant performance gains over the baseline encoder-decoders.

preprint2015arXiv

A Class of Deterministic Sensing Matrices and Their Application in Harmonic Detection

In this paper, a class of deterministic sensing matrices is constructed by selecting rows from Fourier matrices. These matrices have better performance in sparse recovery than random partial Fourier matrices. The coherence and restricted isometry property of these matrices are given to evaluate their capacity as compressive sensing matrices. In general, compressed sensing requires random sampling in data acquisition, which is difficult to implement in hardware. By using these sensing matrices in harmonic detection, a deterministic sampling method is provided. The frequencies and amplitudes of the harmonic components are estimated from under-sampled data. The simulations show that this under-sampling method is feasible and valid in noisy environments.

preprint2015arXiv

Assessing Google Correlate Queries for Influenza H1N1 Surveillance in Asian Developing Countries

So far, Google Trends data have been used for influenza surveillance in many European and American countries; however, there have been few attempts to apply this low-cost surveillance method in Asian developing countries. To investigate the correlation between the search trends and the influenza activity in Asia, we examined the Google query data of four Asian developing countries.

preprint2015arXiv

Joint Frequency Estimation with Two Sub-Nyquist Sampling Sequences

In many applications of frequency estimation, the frequencies of the signals are so high that data sampled at the Nyquist rate are hard to acquire due to hardware limitations. In this paper, we propose a novel method based on subspace techniques to estimate the frequencies by using two sub-Nyquist sample sequences, provided that the two under-sampling ratios are relatively prime integers. We analyze the impact of under-sampling and expand the estimated frequencies which suffer from aliasing. By combining the results estimated from these two sequences, the frequencies that approximate the frequency components actually contained in the signals are selected. The method requires only a small amount of hardware and computation. Numerical results show that this method is valid and accurate at quite low sampling rates.
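
The expand-and-match step described in this abstract can be sketched for a single complex sinusoid. This is an illustration under stated assumptions, not the paper's method: the function names and the example numbers are hypothetical, the aliased frequencies are simulated exactly rather than estimated with subspace techniques, frequencies are integer Hz, and the Nyquist rate is assumed to be divisible by both under-sampling ratios.

```python
from math import gcd
from typing import Set


def unfold(alias_hz: int, rate_hz: int, nyquist_rate_hz: int) -> Set[int]:
    """All frequencies in [0, nyquist_rate_hz) that alias to alias_hz at rate_hz."""
    return set(range(alias_hz, nyquist_rate_hz, rate_hz))


def resolve(freq_hz: int, nyquist_rate_hz: int, ratio_a: int, ratio_b: int) -> Set[int]:
    """Candidate frequencies consistent with both under-sampled sequences.

    freq_hz is the (normally unknown) true frequency; it is used here only
    to simulate the two aliased observations.
    """
    if gcd(ratio_a, ratio_b) != 1:
        raise ValueError("under-sampling ratios must be relatively prime")
    rate_a = nyquist_rate_hz // ratio_a
    rate_b = nyquist_rate_hz // ratio_b
    alias_a = freq_hz % rate_a  # what the first sequence would observe
    alias_b = freq_hz % rate_b  # what the second sequence would observe
    # Expand each aliased value to all candidates, then keep the common ones.
    return unfold(alias_a, rate_a, nyquist_rate_hz) & unfold(alias_b, rate_b, nyquist_rate_hz)


if __name__ == "__main__":
    # Hypothetical numbers: 1200 Hz Nyquist rate, ratios 3 and 4, true tone at 875 Hz.
    print(resolve(875, 1200, 3, 4))
```

Because the ratios are relatively prime, two distinct candidates can agree in both sequences only if they differ by a multiple of the full Nyquist rate, so exactly one candidate survives in the unfolded range.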

preprint2015arXiv

Secrecy Communication with Security Rate Measure

We introduce a new measure of secrecy, which is established based on rate-distortion theory. It is named \emph{security rate}, which is the minimum (infimum) of the additional rate needed for the wiretapper to reconstruct the source within a target distortion level with any positive probability. It denotes the minimum distance in information metric (bits) from what the wiretapper has received to any decrypted reconstruction (where decryption is defined as reconstruction within the target distortion level with some positive probability). By the source coding theorem, it is equivalent to a distortion-based equivocation $\mathop{\min}\limits _{p\left(v^{n}|s^{n},m\right):Ed\left(S^{n},V^{n}\right)\le D_{E}}\frac{1}{n}I\left(S^{n};V^{n}|M\right)$ which can be seen as a direct extension of the equivocation $\frac{1}{n}H\left(S^{n}|M\right)$ to the lossy decryption case, given distortion level $D_{E}$ and the received (encrypted) message $M$ of the wiretapper. In this paper, we study it in the Shannon cipher system with lossless communication, where a source is transmitted from the sender to the legitimate receiver secretly and losslessly, and is also eavesdropped by a wiretapper. We characterize the admissible region of secret key rate, coding rate of the source, wiretapper distortion, and security rate (distortion-based equivocation). Since the security rate equals the distortion-based equivocation, and the equivocation is a special case of the distortion-based equivocation (with Hamming distortion measure and $D_{E}=0$), this gives an answer for the meaning of the maximum equivocation.

preprint2015arXiv

Some new progress on the light absorption properties of linear alkyl benzene solvent

Linear alkyl benzene (LAB) will be used as the solvent of a liquid scintillator mixture for the JUNO antineutrino experiment in the near future. Its light absorption property should therefore be understood prior to its effective use in the experiment. Attenuation length measurements at a light wavelength of 430 nm have been performed on samples of LAB prepared for the purpose of the JUNO experiment. Inorganic impurities in LAB have also been studied for their potential light absorption at our wavelength of interest. In view of a tentative plan by the JUNO collaboration to utilize neutron capture on hydrogen in the detector, we also present in this work a study of the carbon-hydrogen ratio and its relationship with the attenuation length of the samples.

preprint2015arXiv

The SYSU System for the Interspeech 2015 Automatic Speaker Verification Spoofing and Countermeasures Challenge

Many existing speaker verification systems are reported to be vulnerable to different spoofing attacks, for example speaker-adapted speech synthesis, voice conversion, playback, etc. In order to detect these spoofed speech signals as a countermeasure, we propose a score level fusion approach with several different i-vector subsystems. We show that the acoustic level Mel-frequency cepstral coefficients (MFCC) features, the phase level modified group delay cepstral coefficients (MGDCC) and the phonetic level phoneme posterior probability (PPP) tandem features are effective for the countermeasure. Furthermore, feature level fusion of these features before i-vector modeling also enhances the performance. A polynomial kernel support vector machine is adopted as the supervised classifier. In order to enhance the generalizability of the countermeasure, we also adopted cosine similarity and PLDA scoring as one-class classification methods. Combining the proposed i-vector subsystems with the OpenSMILE baseline, which covers the acoustic and prosodic information, further improves the final performance. The proposed fusion system achieves 0.29% and 3.26% EER on the development and test sets of the database provided by the INTERSPEECH 2015 automatic speaker verification spoofing and countermeasures challenge.

preprint2014arXiv

Deep Learning for Answer Sentence Selection

Answer sentence selection is the task of identifying sentences that contain the answer to a given question. This is an important problem in its own right as well as in the larger context of open-domain question answering. We propose a novel approach to solving this task by means of distributed representations, and learn to match questions with answers by considering their semantic encoding. This contrasts with prior work on this task, which typically relies on classifiers with large numbers of hand-crafted syntactic and semantic features and various external resources. Our approach requires no feature engineering and no specialist linguistic data, making the model easily applicable to a wide range of domains and languages. Experimental results on a standard benchmark dataset from TREC demonstrate that, despite its simplicity, our model matches state-of-the-art performance on the answer sentence selection task.

preprint2014arXiv

Modelling Electrical Car Diffusion Based on Agents

Replacing traditional fossil fuel vehicles with innovative zero-emission vehicles for transport in cities is one of the major tactics to achieve the UK government's 2020 target of cutting emissions. We are developing an agent-based simulation model to study the possible impact of different governmental interventions on the diffusion of such vehicles. Options that could be studied with our what-if analysis include car parking charges, the price of electrical cars, energy awareness and word of mouth. In this paper we present a first case study related to the introduction of a new car park charging scheme at the University of Nottingham. We have developed an agent-based model to simulate the impact of different car parking rates and other incentives on the uptake of electrical cars. The goal of this case study is to demonstrate the usefulness of agent-based modelling and simulation for such investigations.

preprint2014arXiv

Structure of entropy solutions to general scalar conservation laws in one space dimension

In this paper, we show that the entropy solution $u$ of a scalar conservation law is continuous outside a $1$-rectifiable set $Ξ$ and that, up to an $\mathcal H^1$-negligible set, for each point $(\bar t,\bar x) \in Ξ$ there exist two regions where $u$ is left/right continuous at $(\bar t,\bar x)$. We provide examples showing that these estimates are nearly optimal. To achieve these regularity results, we extend the wave representation of the wavefront approximate solutions to entropy solutions. This representation can be interpreted as a sort of Lagrangian representation of the solution to the nonlinear scalar PDE, and implies a fine structure on the level sets of the entropy solution.

preprint2012arXiv

Global structure of admissible BV solutions to piecewise genuinely nonlinear, strictly hyperbolic conservation laws in one space dimension

The paper gives an accurate description of the qualitative structure of an admissible BV solution to a strictly hyperbolic, piecewise genuinely nonlinear system of conservation laws. We prove that there exist a countable set $Θ$ which contains all interaction points and a family of countably many Lipschitz curves $\T$ such that $u$ is continuous outside $\T\cup Θ$, and along the curves in $\T$, $u$ has left and right limits except at points in $Θ$. This extends the corresponding structural result in \cite{BL,Liu1} for admissible solutions. The proof is based on approximate wave-front tracking solutions and a proper selection of discontinuity curves in the approximate solutions, which converge to curves covering the discontinuities in the exact solution $u$.

preprint2012arXiv

SBV-like regularity for general hyperbolic systems of conservation laws

We prove the SBV regularity of the characteristic speed of the scalar hyperbolic conservation law and SBV-like regularity of the eigenvalue functions of the Jacobian matrix of the flux function for general systems of conservation laws. More precisely, for the equation \[u_t + f(u)_x = 0, \quad u : \R^+ \times \R \to Ω \subset \R^N,\] we only assume that the flux $f$ is a $C^2$ function in the scalar case $(N=1)$ and that the Jacobian matrix $Df$ has distinct real eigenvalues in the system case $(N\geq 2)$. Using a modification of the main decay estimate in Lau and the localization method applied in \cite{R}, we show that for the scalar equation $f'(u)$ belongs to SBV, and for systems of conservation laws the scalar measure \[\big(D_u λ_i(u) \cdot r_i(u) \big) \big(l_i(u) \cdot u_x \big)\] has no Cantor part, where $λ_i$, $r_i$, $l_i$ are the $i$-th eigenvalue, $i$-th right eigenvector and $i$-th left eigenvector of the matrix $Df$.

58 published items

OxyGent: Making Multi-Agent Systems Modular, Observable, and Evolvable via Oxy Abstraction

Re-evaluating the Memory-balanced Pipeline Parallelism: BPipe

A Normalized Gaussian Wasserstein Distance for Tiny Object Detection

Asymptotics for Strassen's Optimal Transport Problem

Autofocus for Event Cameras

Backpropagation through Time and Space: Learning Numerical Methods with Multi-Agent Reinforcement Learning

BSRT: Improving Burst Super-Resolution with Swin Transformer and Flow-Guided Deformable Alignment

Deep Constrained Least Squares for Blind Image Super-Resolution

Detecting tiny objects in aerial images: A normalized Wasserstein distance and a new benchmark

Documentation based Semantic-Aware Log Parsing

Enabling arbitrary translation objectives with Adaptive Tree Search

Fast Nearest Convolution for Real-Time Efficient Image Super-Resolution

First implementation of full-workflow automation in radiotherapy: the All-in-One solution on rectal cancer

GLF-CR: SAR-Enhanced Cloud Removal with Global-Local Fusion

Learning to Extract Building Footprints from Off-Nadir Aerial Images

Linear change and minutes variability of solar wind velocity revealed by FAST

MAD for Robust Reinforcement Learning in Machine Translation

Noun2Verb: Probabilistic frame semantics for word class conversion

NTIRE 2022 Challenge on Efficient Super-Resolution: Methods and Results

On the Rate-Distortion-Perception Function

QPanda: high-performance quantum computing framework for multiple application scenarios

Sequential Channel Synthesis

Unifying Motion Deblurring and Frame Interpolation with Events

Weak solutions to an initial-boundary value problem for a continuum equation of motion of grain boundaries

On Non-Interactive Simulation of Binary Random Variables

Better Document-Level Machine Translation with Bayes' Rule

Capturing document context inside sentence-level neural machine translation models with self-training

Corrections to "Wyner's Common Information under Rényi Divergence Measures"

Event Enhanced High-Quality Image Recovery

Exact minimum codegree thresholds for $K_4^-$-covering and $K_5^-$-covering

Implicit Euler ODE Networks for Single-Image Dehazing

Improving Spiking Sparse Recovery via Non-Convex Penalties

Intra-Ensemble in Neural Networks

Matching Neuromorphic Events and Color Images via Adversarial Learning

Mixed Noise Removal with Pareto Prior

On Exact and $\infty$-Rényi Common Informations

Robust Time-Frequency Reconstruction by Learning Structured Sparsity

Single Image Deraining with Continuous Rain Density Estimation

Structure-Aware Network for Lane Marker Extraction with Dynamic Vision Sensor

Comments on "Approximate Characterizations for the Gaussian Source Broadcast Distortion Region"

Distortion Bounds for Transmitting Correlated Sources with Common Part over MAC

Frequency Estimation of Multiple Sinusoids with Sub-Nyquist Sampling Sequences

GPU-FV: Realtime Fisher Vector and Its Applications in Video Monitoring

Line Spectral Estimation Based on Compressed Sensing with Deterministic Sub-Nyquist Sampling

Local exact boundary controllability of entropy solutions to a class of hyperbolic systems of conservation laws

Neural Variational Inference for Text Processing

Online Segment to Segment Neural Transduction

A Class of Deterministic Sensing Matrices and Their Application in Harmonic Detection

Assessing Google Correlate Queries for Influenza H1N1 Surveillance in Asian Developing Countries

Joint Frequency Estimation with Two Sub-Nyquist Sampling Sequences

Secrecy Communication with Security Rate Measure

Some new progress on the light absorption properties of linear alkyl benzene solvent

The SYSU System for the Interspeech 2015 Automatic Speaker Verification Spoofing and Countermeasures Challenge

Deep Learning for Answer Sentence Selection

Modelling Electrical Car Diffusion Based on Agents

Structure of entropy solutions to general scalar conservation laws in one space dimension

Global structure of admissible BV solutions to piecewise genuinely nonlinear, strictly hyperbolic conservation laws in one space dimension

SBV-like regularity for general hyperbolic systems of conservation laws