Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
26 works
0 followers
25 topics
4 close collaborators

Published work

26 published items

preprint 2026 arXiv

Geometry-Calibrated Conformal Abstention for Language Models

When language models lack relevant knowledge for a given query, they frequently generate plausible responses that can be hallucinations, rather than admitting being agnostic about the answer. Retraining models to reward admitting ignorance can lead to overly conservative behaviors and poor generalization due to scarce evaluation benchmarks. We propose a post hoc framework, Conformal Abstention (CA), adapted from conformal prediction (CP) to determine whether to abstain from answering a query. CA provides finite-sample guarantees on both the probability of participation (i.e., not abstaining) and the probability that the generated response is correct. Importantly, the abstention decision relies on prediction confidence rather than the non-conformity scores used in CP, which are intractable for open-ended generation. To better align prediction confidence with the model's ignorance, we introduce a calibration strategy using representation geometry within the model to measure knowledge involvement in shaping the response. Experiments demonstrate that we improve selective answering significantly with 75 percent conditional correctness.

preprint 2026 arXiv

Robust Conditional Conformal Prediction via Branched Normalizing Flow

Conformal prediction (CP) constructs prediction sets with marginal coverage guarantees under the assumption that the calibration and test distributions are identical. However, under distribution shift, existing approaches primarily align marginal conformal score distributions, which is sufficient to preserve marginal coverage but does not control the conditional coverage error at individual test inputs. As a consequence, CP can remain unreliable in regions where the conditional score distributions are mismatched. In this work, we bound the conditional invalidity of CP under distribution shift in terms of the Wasserstein distance between the calibration and test distributions. This result highlights the role of invertible transport in mitigating conditional coverage degradation. Motivated by this insight, we introduce Branched Normalizing Flow (BNF), a two-branch architecture that normalizes a test input to the calibration distribution and transforms the prediction set of the normalized input back to the test distribution while preserving conditional guarantees. Empirically, BNF consistently improves conditional coverage robustness on nine datasets across a wide range of confidence levels.
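
For reference, the marginal guarantee mentioned in the abstract above is usually obtained with split conformal prediction. The following is a minimal sketch of that standard baseline, not of the Branched Normalizing Flow method; the absolute-residual score and the helper names `conformal_quantile` and `prediction_interval` are assumptions made for illustration.

```python
import math


def conformal_quantile(cal_scores, alpha):
    """Return the split-conformal threshold for miscoverage level alpha.

    cal_scores are non-conformity scores on held-out calibration data,
    here assumed to be absolute residuals |y - prediction|.
    """
    n = len(cal_scores)
    # Finite-sample-corrected rank: ceil((n + 1) * (1 - alpha)).
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: only the trivial
        # (unbounded) set carries the guarantee.
        return math.inf
    return sorted(cal_scores)[k - 1]


def prediction_interval(point_prediction, q):
    """Symmetric interval around a point prediction with half-width q."""
    return (point_prediction - q, point_prediction + q)


# Illustrative calibration residuals (made-up numbers).
cal_scores = [0.1, 0.4, 0.2, 0.9, 0.3, 0.5, 0.7]
q = conformal_quantile(cal_scores, alpha=0.25)
low, high = prediction_interval(2.0, q)
```

The interval covers the true label with probability at least 1 - alpha only on average over test inputs and only when calibration and test data are exchangeable, which is the limitation under distribution shift that the abstract addresses.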

preprint 2023 arXiv

Emergent Electronic Kagome Lattice in Correlated Charge-Density-Wave State of 1T-TaS$_2$

Quantum materials with tunable correlated and/or topological electronic states, such as the electronic Kagome lattice, provide an ideal platform to study exotic quantum properties. However, real-space investigations of the correlated electronic Kagome lattice have rarely been reported. Herein, we report on the electronic Kagome lattice emerging in the correlated charge-density-wave (CDW) state of 1T-TaS$_2$ at ~200 K via variable-temperature scanning tunneling microscopy (VT-STM). This emergent Kagome lattice can be considered a fractional electron-filling superstructure with reduced translational and rotational symmetries, confirmed by STM measurements and density functional theory simulations. The characteristic band structure and density of states of this electronic Kagome lattice are further explored based on theoretical calculations. Our results demonstrate a self-organized electronic Kagome lattice from the correlated CDW state via the effective tuning parameter of temperature and provide a platform to directly explore the interplay of correlated electrons and topological physics.

preprint 2023 arXiv

More comprehensive facial inversion for more effective expression recognition

Facial expression recognition (FER) plays a significant role in the ubiquitous application of computer vision. We revisit this problem with a new perspective on whether it can acquire useful representations that improve FER performance in the image generation process, and propose a novel generative method based on the image inversion mechanism for the FER task, termed Inversion FER (IFER). In particular, we devise a novel Adversarial Style Inversion Transformer (ASIT) towards IFER to comprehensively extract features of generated facial images. In addition, ASIT is equipped with an image inversion discriminator that measures the cosine similarity of semantic features between source and generated images, constrained by a distribution alignment loss. Finally, we introduce a feature modulation module to fuse the structural code and latent codes from ASIT for the subsequent FER work. We extensively evaluate ASIT on facial datasets such as FFHQ and CelebA-HQ, showing that our approach achieves state-of-the-art facial inversion performance. IFER also achieves competitive results on facial expression recognition datasets such as RAF-DB, SFEW and AffectNet. The code and models are available at https://github.com/Talented-Q/IFER-master.

preprint 2022 arXiv

An Efficient Methodology to Identify Missing Tags in Large-Scale RFID Systems

Radio frequency identification (RFID) has broad applications. One such application is to use RFID to track inventory in warehouses and retail stores. In this application, identifying missing items in a timely manner is an ongoing engineering problem. A feasible solution to this problem is to map each tag to a time slot and verify the presence of a tag by comparing the status of the predicted time slot and the actual time slot. However, existing works are time inefficient because they only verify tags one by one in singleton slots but ignore the collision slots mapped by multiple tags. To accelerate the identification process, we use bit tracking to verify tags in collision slots and design two protocols accordingly. We first propose the Sequential String based Missing Tag Identification (SSMTI) protocol, which converts all time slots to collision slots and enables tags in each slot to reply to a designed string simultaneously. By using bit tracking to decode the combined string, the reader can verify multiple tags together. To improve the performance of SSMTI when most tags are missing, we further propose the Interactive String based Missing Tag Identification (ISMTI) protocol. ISMTI improves the strategies of designing strings for each collided tag so that the reader can verify more tags using shorter strings than SSMTI. Besides, ISMTI can dynamically adjust the verification mechanism according to the proportion of missing tags to maintain time efficiency. We also provide theoretical analysis for the proposed protocols to minimize execution time and evaluate their performance through extensive simulations. Compared with state-of-the-art solutions, the proposed SSMTI and ISMTI can reduce the time cost by as much as 39.74% and 68.87%, respectively.
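
The slot-mapping idea that the abstract attributes to existing works can be sketched as below. This illustrates only the singleton-slot baseline, not SSMTI or ISMTI and not bit tracking. The SHA-256 hash, the per-round integer seed, and the function names `slot_of` and `identify_missing` are assumptions introduced for the example. The set `present_tags` stands in for the radio replies a real reader would observe.

```python
import hashlib
from collections import defaultdict


def slot_of(tag_id, seed, frame_size):
    """Map a tag ID to a time slot; reader and tags share the same hash."""
    digest = hashlib.sha256(f"{seed}:{tag_id}".encode()).digest()
    return int.from_bytes(digest[:4], "big") % frame_size


def identify_missing(known_tags, present_tags, frame_size, max_rounds=64):
    """Verify tags only in slots the reader predicts to be singletons."""
    pending = set(known_tags)      # tags not yet verified
    present = set(present_tags)    # simulated ground truth
    missing = set()
    for seed in range(max_rounds):
        if not pending:
            break
        # Predicted slot occupancy, computed by the reader from known IDs.
        expected = defaultdict(list)
        for tag in pending:
            expected[slot_of(tag, seed, frame_size)].append(tag)
        # Slots in which at least one pending tag actually replies.
        busy = {slot_of(tag, seed, frame_size) for tag in pending & present}
        for slot, tags in expected.items():
            if len(tags) != 1:
                continue  # collision slot: skipped by this baseline
            tag = tags[0]
            if slot not in busy:
                missing.add(tag)  # predicted singleton stayed silent
            pending.discard(tag)
    return missing, pending


known = [f"tag-{i}" for i in range(20)]
present = [t for i, t in enumerate(known) if i % 4 != 0]
missing, unresolved = identify_missing(known, present, frame_size=32)
```

Every tag that lands in a predicted collision slot has to wait for a later round, which is the inefficiency the abstract sets out to remove.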

preprint 2022 arXiv

Coverage Axis: Inner Point Selection for 3D Shape Skeletonization

In this paper, we present a simple yet effective formulation called Coverage Axis for 3D shape skeletonization. Inspired by the set cover problem, our key idea is to cover all the surface points using as few inside medial balls as possible. This formulation inherently induces a compact and expressive approximation of the Medial Axis Transform (MAT) of a given shape. Different from previous methods that rely on local approximation error, our method allows a global consideration of the overall shape structure, leading to an efficient high-level abstraction and superior robustness to noise. Another appealing aspect of our method is its capability to handle more generalized input such as point clouds and poor-quality meshes. Extensive comparisons and evaluations demonstrate the remarkable effectiveness of our method for generating compact and expressive skeletal representation to approximate the MAT.
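
The set-cover view described above can be illustrated with the classic greedy heuristic. This is a swapped-in approximation chosen for brevity; the abstract does not say which solver the authors use. The coverage test (a point is covered when it lies within a ball's radius) and the name `greedy_ball_cover` are assumptions.

```python
import math


def greedy_ball_cover(points, balls):
    """Pick balls greedily until every point is covered or no ball helps.

    points: list of coordinate tuples sampled on the surface.
    balls:  list of (center, radius) candidate inner balls.
    Returns (indices of chosen balls, indices of points left uncovered).
    """
    covers = [
        {i for i, p in enumerate(points) if math.dist(center, p) <= radius}
        for center, radius in balls
    ]
    uncovered = set(range(len(points)))
    chosen = []
    while uncovered:
        # Ball that covers the most still-uncovered points.
        best = max(range(len(balls)), key=lambda j: len(covers[j] & uncovered))
        gain = covers[best] & uncovered
        if not gain:
            break  # remaining points cannot be covered by any candidate
        chosen.append(best)
        uncovered -= gain
    return chosen, uncovered


points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
balls = [((0.5, 0.0), 0.6), ((2.5, 0.0), 0.6), ((1.5, 0.0), 0.6)]
chosen, uncovered = greedy_ball_cover(points, balls)
```

Greedy selection does not always find the minimum cover, but it makes the objective concrete: as few balls as possible, subject to every surface point being covered.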

preprint 2022 arXiv

Domain Disentangled Generative Adversarial Network for Zero-Shot Sketch-Based 3D Shape Retrieval

Sketch-based 3D shape retrieval is a challenging task due to the large domain discrepancy between sketches and 3D shapes. Since existing methods are trained and evaluated on the same categories, they cannot effectively recognize the categories that have not been used during training. In this paper, we propose a novel domain disentangled generative adversarial network (DD-GAN) for zero-shot sketch-based 3D retrieval, which can retrieve the unseen categories that are not accessed during training. Specifically, we first generate domain-invariant features and domain-specific features by disentangling the learned features of sketches and 3D shapes, where the domain-invariant features are used to align with the corresponding word embeddings. Then, we develop a generative adversarial network that combines the domain-specific features of the seen categories with the aligned domain-invariant features to synthesize samples, where the synthesized samples of the unseen categories are generated by using the corresponding word embeddings. Finally, we use the synthesized samples of the unseen categories combined with the real samples of the seen categories to train the network for retrieval, so that the unseen categories can be recognized. To reduce the domain shift problem, we utilize unlabeled unseen samples to enhance the discrimination ability of the discriminator. With the discriminator distinguishing the generated samples from the unlabeled unseen samples, the generator can generate more realistic unseen samples. Extensive experiments on the SHREC'13 and SHREC'14 datasets show that our method significantly improves the retrieval performance of the unseen categories.

preprint 2022 arXiv

Global Well-posedness of a Prandtl Model from MHD in Gevrey Function Spaces

We consider a Prandtl model derived from MHD in the Prandtl-Hartmann regime that has a damping term due to the effect of the Hartmann boundary layer. Global-in-time well-posedness is obtained in the Gevrey function space with the optimal index $2$. The proof is based on a cancellation mechanism through some auxiliary functions from the study of the Prandtl equation and an observation about the structure of the loss of one order of tangential derivatives after applying the Prandtl operator twice.

preprint 2022 arXiv

LSSANet: A Long Short Slice-Aware Network for Pulmonary Nodule Detection

Convolutional neural networks (CNNs) have been demonstrated to be highly effective in the field of pulmonary nodule detection. However, existing CNN based pulmonary nodule detection methods lack the ability to capture long-range dependencies, which is vital for global information extraction. In computer vision tasks, non-local operations have been widely utilized, but the computational cost could be very high for 3D computed tomography (CT) images. To address this issue, we propose a long short slice-aware network (LSSANet) for the detection of pulmonary nodules. In particular, we develop a new non-local mechanism termed long short slice grouping (LSSG), which splits the compact non-local embeddings into a short-distance slice grouped one and a long-distance slice grouped counterpart. This not only reduces the computational burden, but also keeps long-range dependencies among any elements across slices and in the whole feature map. The proposed LSSG is easy-to-use and can be plugged into many pulmonary nodule detection networks. To verify the performance of LSSANet, we compare with several recently proposed and competitive detection approaches based on 2D/3D CNN. Promising evaluation results on the large-scale PN9 dataset demonstrate the effectiveness of our method. Code is at https://github.com/Ruixxxx/LSSANet.

preprint 2022 arXiv

Properties and device performance of BN thin films grown on GaN by pulsed laser deposition

Wide and ultrawide-bandgap semiconductors lie at the heart of next-generation high-power, high-frequency electronics. Here, we report the growth of ultrawide-bandgap boron nitride (BN) thin films on wide-bandgap gallium nitride (GaN) by pulsed laser deposition. Comprehensive spectroscopic (core level and valence band XPS, FTIR, Raman) and microscopic (AFM and STEM) characterizations confirm the growth of BN thin films on GaN. Optically, we observed that BN/GaN heterostructure is second-harmonic generation active. Moreover, we fabricated the BN/GaN heterostructure-based Schottky diode that demonstrates rectifying characteristics, lower turn-on voltage, and an improved breakdown capability (234 V) as compared to GaN (168 V), owing to the higher breakdown electrical field of BN. Our approach is an early step towards bridging the gap between wide and ultrawide-bandgap materials for potential optoelectronics as well as next-generation high-power electronics.

preprint 2022 arXiv

Static spherical vacuum solutions in the bumblebee gravity model

The bumblebee gravity model is a vector-tensor theory of gravitation where the vector field nonminimally couples to the Ricci tensor. By investigating the vacuum field equations with spherical symmetry, we find two families of black-hole (BH) solutions in this model: one has a vanishing radial component of the vector field and the other has a vanishing radial component of the Ricci tensor. When the coupling between the vector field and the Ricci tensor is set to zero, the first family becomes the Reissner-Nordström solution while the second family degenerates to the Schwarzschild solution with the vector field being zero. General numerical solutions in both families are obtained for nonzero coupling between the vector field and the Ricci tensor. Besides BH solutions, we also reveal the existence of solutions that have a nonvanishing $tt$-component of the metric on the supposed event horizon where the $rr$-component of the metric diverges while the curvature scalars are finite. These solutions are not supported by existing observations but present certain properties that are of academic interest. We conclude the study by testing the BH solutions against the Solar-system observations and the images of supermassive BHs.

preprint 2022 arXiv

Testing gravitational redshift based on microwave frequency links onboard China Space Station

In 2022, the China Space Station (CSS) will be equipped with atomic clocks and optical clocks with stabilities of $2 \times 10^{-16}$ and $8 \times 10^{-18}$, respectively, which provides an excellent opportunity to test gravitational redshift (GR) with higher accuracy than previous results. Based on high-precision frequency links between the CSS and a ground station, we formulated a model and provided simulation experiments to test GR. Simulation results suggest that this method could test the GR at the accuracy level of $(0.27 \pm 2.15) \times 10^{-7}$, more than two orders of magnitude higher than the result of the experiment of a hydrogen clock on board a flying rocket more than 40 years ago.
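
As a reading aid, redshift tests of this kind are commonly expressed through a violation parameter $\alpha$ that vanishes in general relativity. The relation below is that common first-order parametrization, written with Doppler and higher-order terms removed. The abstract does not state its own parametrization, so the symbols $f_{\mathrm{s}}$ (frequency emitted at the station), $f_{\mathrm{g}}$ (frequency received on the ground), $\phi$ (Newtonian potential) and the reading of the quoted accuracy as a bound on $\alpha$ are assumptions.

```latex
\[
  \frac{f_{\mathrm{g}} - f_{\mathrm{s}}}{f_{\mathrm{s}}}
  = (1 + \alpha)\,\frac{\phi_{\mathrm{s}} - \phi_{\mathrm{g}}}{c^{2}},
  \qquad
  \phi(r) \approx -\frac{GM}{r}
\]
```

Because $\phi_{\mathrm{s}} > \phi_{\mathrm{g}}$ for a station above the ground, the signal received on the ground is shifted to a higher frequency.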

preprint 2022 arXiv

Twisted Angle-Dependent Work Functions in CVD-Grown Twisted Bilayer Graphene by Kelvin Probe Force Microscopy

Tailoring the interlayer twist angle of bilayer graphene (BLG) has a significant influence on its electronic properties, including superconductivity, topological transitions, ferromagnetic states and correlated insulating states. These exotic electronic properties are sensitively dependent on the work functions of bilayer graphene samples. Here, the twisted angle-dependent work functions of CVD-grown twisted bilayer graphene (tBLG) are investigated in detail by Kelvin Probe Force Microscopy (KPFM) in combination with Raman spectra. The thickness-dependent surface potentials of Bernal-stacked multilayer graphene were measured. The AB-BLG and tBLG are directly determined by KPFM due to their twist angle-specific surface potentials. The detailed relationship between twist angles and surface potentials is further obtained by combined in-situ KPFM and Raman spectra measurements. The thermal stability of tBLG was further explored through a controlled annealing process. Our work provides the twisted angle-dependent surface potentials of tBLG and lays the foundation for further exploring their twist-angle-dependent novel electronic properties.

preprint 2021 arXiv

BaPipe: Exploration of Balanced Pipeline Parallelism for DNN Training

The size of deep neural networks (DNNs) grows rapidly as the complexity of the machine learning algorithm increases. To satisfy the requirement of computation and memory of DNN training, distributed deep learning based on model parallelism has been widely recognized. We propose a new pipeline parallelism training framework, BaPipe, which can automatically explore pipeline parallelism training methods and balanced partition strategies for DNN distributed training. In BaPipe, each accelerator calculates the forward propagation and backward propagation of different parts of networks to implement the intra-batch pipeline parallelism strategy. BaPipe uses a new load balancing automatic exploration strategy that considers the parameters of DNN models and the computation, memory, and communication resources of accelerator clusters. We have trained different DNNs such as VGG-16, ResNet-50, and GNMT on GPU clusters and simulated the performance of different FPGA clusters. Compared with state-of-the-art data parallelism and pipeline parallelism frameworks, BaPipe provides up to 3.2x speedup and 4x memory reduction in various platforms.
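
To make "balanced partition" concrete, the sketch below splits a chain of layers into contiguous pipeline stages so that the slowest stage is as fast as possible. It is a generic dynamic program over per-layer compute costs only. The abstract says BaPipe also weighs memory and communication, and its actual search procedure is not described there, so this is not BaPipe's algorithm. The name `balanced_partition` and the cost values are hypothetical.

```python
from itertools import accumulate


def balanced_partition(costs, num_stages):
    """Split costs into num_stages contiguous groups, minimizing the largest sum.

    Returns (bottleneck cost, list of (start, end) half-open layer ranges).
    Assumes 1 <= num_stages <= len(costs).
    """
    n = len(costs)
    prefix = [0] + list(accumulate(costs))
    inf = float("inf")
    # best[k][i]: smallest bottleneck when the first i layers use k stages.
    best = [[inf] * (n + 1) for _ in range(num_stages + 1)]
    cut = [[0] * (n + 1) for _ in range(num_stages + 1)]
    best[0][0] = 0
    for k in range(1, num_stages + 1):
        for i in range(k, n + 1):
            for j in range(k - 1, i):
                # The last stage holds layers j..i-1.
                cand = max(best[k - 1][j], prefix[i] - prefix[j])
                if cand < best[k][i]:
                    best[k][i] = cand
                    cut[k][i] = j
    # Walk the recorded cuts backwards to recover stage boundaries.
    bounds = []
    i = n
    for k in range(num_stages, 0, -1):
        j = cut[k][i]
        bounds.append((j, i))
        i = j
    bounds.reverse()
    return best[num_stages][n], bounds


layer_costs = [4, 2, 3, 7, 1, 5]  # made-up per-layer compute times
bottleneck, stages = balanced_partition(layer_costs, num_stages=3)
```

In a pipeline, throughput is limited by the slowest stage, which is why the objective here is the maximum stage cost and not the average.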

preprint 2021 arXiv

Neutron stars in massive scalar-Gauss-Bonnet gravity: Spherical structure and time-independent perturbations

The class of scalar-tensor theories with the scalar field coupling to the Gauss-Bonnet invariant has drawn great interest since solutions of spontaneous scalarization were found for black holes in these theories. We contribute to the existing literature a detailed study of the spontaneously scalarized neutron stars (NSs) in a typical theory where the coupling function of the scalar field takes the quadratic form and the scalar field is massive. The investigation here includes the spherical solutions of the NSs as well as their perturbative properties, namely the tidal deformability and the moment of inertia, treated in a unified and extendable way under the framework of spherical decomposition. We find that while the mass, the radius, and the moment of inertia of the spontaneously scalarized NSs show very moderate deviations from those of the NSs in general relativity (GR), the tidal deformability exhibits significant differences between the solutions in GR and the solutions of spontaneous scalarization for certain values of the parameters in the scalar-Gauss-Bonnet theory. As a result, the celebrated universal relation between the moment of inertia and the tidal deformability of neutron stars breaks down. With the mass and the tidal deformability of NSs attainable in the gravitational waves from binary NS mergers, the radius measurable using X-ray satellites, and the moment of inertia accessible via high-precision pulsar timing techniques, future multi-messenger observations can be contrasted with the theoretical results and provide us with the information necessary for building theories beyond GR.

preprint 2021 arXiv

Signature of Lorentz Violation in Continuous Gravitational-Wave Spectra of Ellipsoidal Neutron Stars

We study effects of Lorentz-invariance violation on the rotation of neutron stars (NSs) in the minimal gravitational Standard-Model Extension framework, and calculate the quadrupole radiation generated by them. Aiming at testing Lorentz invariance with observations of continuous gravitational waves (GWs) from rotating NSs in the future, we compare the GW spectra of a rotating ellipsoidal NS under Lorentz-violating gravity with those of a Lorentz-invariant one. The former are found to possess frequency components higher than the second harmonic, which does not happen for the latter, indicating those higher frequency components to be potential signatures of Lorentz violation in continuous GW spectra of rotating NSs.

preprint 2020 arXiv

Boosting Connectivity in Retinal Vessel Segmentation via a Recursive Semantics-Guided Network

Many deep learning based methods have been proposed for retinal vessel segmentation; however, few of them focus on the connectivity of segmented vessels, which is quite important for a practical computer-aided diagnosis system on retinal images. In this paper, we propose an efficient network to address this problem. A U-shape network is enhanced by introducing a semantics-guided module, which integrates the enriched semantics information into shallow layers to guide the network to explore more powerful features. Besides, a recursive refinement iteratively applies the same network over the previous segmentation results to progressively boost the performance without adding network parameters. The carefully designed recursive semantics-guided network has been extensively evaluated on several public datasets. Experimental results have shown the efficiency of the proposed method.

preprint 2020 arXiv

Electron acceleration in non-relativistic quasi-perpendicular collisionless shocks

We study diffusive shock acceleration (DSA) of electrons in non-relativistic quasi-perpendicular shocks using self-consistent one-dimensional particle-in-cell (PIC) simulations. By exploring the parameter space of sonic and Alfvénic Mach numbers we find that high Mach number quasi-perpendicular shocks can efficiently accelerate electrons to power-law downstream spectra with slopes consistent with DSA prediction. Electrons are reflected by magnetic mirroring at the shock and drive non-resonant waves in the upstream. Reflected electrons are trapped between the shock front and upstream waves and undergo multiple cycles of shock drift acceleration before the injection into DSA. Strong current-driven waves also temporarily change the shock obliquity and cause mild proton pre-acceleration even in quasi-perpendicular shocks, which otherwise do not accelerate protons. These results can be used to understand nonthermal emission in supernova remnants and intracluster medium in galaxy clusters.

preprint 2020 arXiv

Implications of the virus-encoded miRNA and host miRNA in the pathogenicity of SARS-CoV-2

The outbreak of COVID-19 caused by SARS-CoV-2 has rapidly spread worldwide and has caused over 1,400,000 infections and 80,000 deaths. There are currently no drugs or vaccines with proven efficacy for its prevention, and little is known about the pathogenicity mechanism of SARS-CoV-2 infection. Previous studies showed that both virus- and host-derived microRNAs (miRNAs) play crucial roles in the pathology of virus infection. In this study, we use computational approaches to scan the SARS-CoV-2 genome for putative miRNAs and predict the virus miRNA targets on the virus and human genomes as well as the host miRNA targets on the virus genome. Furthermore, we explore the miRNA-involved dysregulation caused by the virus infection. Our results indicate that the immune response and cytoskeleton organization are two of the most notable biological processes regulated by the infection-modulated miRNAs. Notably, we found that hsa-miR-4661-3p was predicted to target the S gene of SARS-CoV-2, and that a virus-encoded miRNA, MR147-3p, could enhance the expression of TMPRSS2 with the function of strengthening SARS-CoV-2 infection in the gut. The study may provide important clues to the mechanisms of pathogenesis of SARS-CoV-2.

preprint 2020 arXiv

Local probe of the interlayer coupling strength of few-layers SnSe by contact-resonance atomic force microscopy

The interlayer bonding in two-dimensional materials is particularly important because it is not only related to their physical and chemical stability but also affects their mechanical, thermal, electronic, optical, and other properties. To address this issue, we report the direct characterization of the interlayer bonding in 2D SnSe using contact-resonance atomic force microscopy (CR-AFM). Site-specific CR spectroscopy and CR force spectroscopy measurements are performed comparatively on both SnSe and its supporting SiO2 substrate. Based on the cantilever and contact mechanics models, the contact stiffness and vertical Young's modulus are evaluated in comparison with SiO2 as a reference material. The interlayer bonding of SnSe is further analyzed in combination with the semi-analytical model and density functional theory calculations. The direct characterization of interlayer interactions using this nondestructive CR-AFM methodology would facilitate a better understanding of the physical and chemical properties of 2D layered materials, specifically for interlayer intercalation and vertical heterostructures.

preprint 2020 arXiv

Neutron Star Structure in the Minimal Gravitational Standard-Model Extension and the Implication to Continuous Gravitational Waves

Tiny violations of Lorentz invariance have been the subject of theoretical study and experimental tests for a long time. We use the Standard-Model Extension (SME) framework to investigate the effect of the minimal Lorentz violation on the structure of a neutron star. A set of hydrostatic equations with modifications from Lorentz violation are derived, and then the modifications are isolated and added to the Tolman-Oppenheimer-Volkoff (TOV) equation as the leading-order Lorentz-violation corrections in relativistic systems. A perturbation solution to the leading-order modified TOV equations is found. The quadrupole moments due to the anisotropy in the structure of neutron stars are calculated and used to estimate the quadrupole radiation of a spinning neutron star with the same deformation. The calculation puts forward a new test for Lorentz invariance in the strong-field regime when continuous gravitational waves are observed in the future.
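
For orientation, the unmodified TOV equations of general relativity that the abstract takes as its starting point are reproduced below in their standard form, with $P$ the pressure, $\rho$ the mass density and $m(r)$ the mass enclosed within radius $r$. The Lorentz-violating correction terms derived in the paper are not given in the abstract and are not shown here.

```latex
\[
\begin{aligned}
  \frac{dP}{dr} &= -\,\frac{G\left(\rho + \dfrac{P}{c^{2}}\right)
                   \left(m + \dfrac{4\pi r^{3} P}{c^{2}}\right)}
                  {r^{2}\left(1 - \dfrac{2Gm}{r c^{2}}\right)}, \\[4pt]
  \frac{dm}{dr} &= 4\pi r^{2} \rho .
\end{aligned}
\]
```

In the Newtonian limit ($P \ll \rho c^{2}$ and $Gm \ll r c^{2}$) the first equation reduces to ordinary hydrostatic equilibrium, $dP/dr = -G m \rho / r^{2}$.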

preprint 2020 arXiv

Progressive Point Cloud Deconvolution Generation Network

In this paper, we propose an effective point cloud generation method, which can generate multi-resolution point clouds of the same shape from a latent vector. Specifically, we develop a novel progressive deconvolution network with the learning-based bilateral interpolation. The learning-based bilateral interpolation is performed in the spatial and feature spaces of point clouds so that local geometric structure information of point clouds can be exploited. Starting from the low-resolution point clouds, with the bilateral interpolation and max-pooling operations, the deconvolution network can progressively output high-resolution local and global feature maps. By concatenating different resolutions of local and global feature maps, we employ the multi-layer perceptron as the generation network to generate multi-resolution point clouds. In order to keep the shapes of different resolutions of point clouds consistent, we propose a shape-preserving adversarial loss to train the point cloud deconvolution generation network. Experimental results demonstrate the effectiveness of our proposed method.

preprint 2020 arXiv

Top-Down Shape Abstraction Based on Greedy Pole Selection

Motivated by the fact that the medial axis transform is able to encode nearly the complete shape, we propose to use as few medial balls as possible to approximate the original enclosed volume by the boundary surface. We progressively select new medial balls, in a top-down style, to enlarge the region spanned by the existing medial balls. The key spirit of the selection strategy is to encourage large medial balls while imposing given geometric constraints. We further propose a speedup technique based on a provable observation that the intersection of medial balls implies the adjacency of power cells (in the sense of the power crust). We further elaborate the selection rules in combination with two closely related applications. One application is to develop an easy-to-use ball-stick modeling system that helps non-professional users to quickly build a shape with only balls and wires, but any penetration between two medial balls must be suppressed. The other application is to generate porous structures with convex, compact (with a high isoperimetric quotient) and shape-aware pores where two adjacent spherical pores may have penetration as long as the mechanical rigidity can be well preserved.

preprint 2020 arXiv

Triaxially-deformed Freely-precessing Neutron Stars: Continuous electromagnetic and gravitational radiation

The shape of a neutron star (NS) is closely linked to its internal structure and the equation of state of supranuclear matter. A rapidly rotating, asymmetric NS in the Milky Way undergoes free precession, making it a potential source for multimessenger observation. The free precession could manifest in (i) the spectra of continuous gravitational waves (GWs) in the kilohertz band for ground-based GW detectors, and (ii) the timing behavior and pulse-profile characteristics if the NS is monitored as a pulsar with radio and/or X-ray telescopes. We extend previous work and investigate in great detail the free precession of a triaxially deformed NS with analytical and numerical approaches. In particular, its associated continuous GWs and pulse signals are derived. Explicit examples are illustrated for the continuous GWs, as well as timing residuals in both time and frequency domains. These results are ready to be used for future multimessenger observation of triaxially-deformed freely-precessing NSs, in order to extract as much scientific information as possible.

preprint 2020 arXiv

When NAS Meets Robustness: In Search of Robust Architectures against Adversarial Attacks

Recent advances in adversarial attacks uncover the intrinsic vulnerability of modern deep neural networks. Since then, extensive efforts have been devoted to enhancing the robustness of deep networks via specialized learning algorithms and loss functions. In this work, we take an architectural perspective and investigate the patterns of network architectures that are resilient to adversarial attacks. To obtain the large number of networks needed for this study, we adopt one-shot neural architecture search, training a large network once and then finetuning the sub-networks sampled from it. The sampled architectures together with the accuracies they achieve provide a rich basis for our study. Our "robust architecture Odyssey" reveals several valuable observations: 1) densely connected patterns result in improved robustness; 2) under a computational budget, adding convolution operations to direct connection edges is effective; 3) the flow of solution procedure (FSP) matrix is a good indicator of network robustness. Based on these observations, we discover a family of robust architectures (RobNets). On various datasets, including CIFAR, SVHN, Tiny-ImageNet, and ImageNet, RobNets exhibit superior robustness performance to other widely used architectures. Notably, RobNets substantially improve the robust accuracy (~5% absolute gains) under both white-box and black-box attacks, even with fewer parameters. Code is available at https://github.com/gmh14/RobNets.

preprint 2017 arXiv

Intrinsic Capacity

Every channel can be expressed as a convex combination of deterministic channels with each deterministic channel corresponding to one particular intrinsic state. Such convex combinations are in general not unique, each giving rise to a specific intrinsic-state distribution. In this paper we study the maximum and the minimum capacities of a channel when the realization of its intrinsic state is causally available at the encoder and/or the decoder. Several conclusive results are obtained for binary-input channels and binary-output channels. Byproducts of our investigation include a generalization of the Birkhoff-von Neumann theorem and a condition on the uselessness of causal state information at the encoder.
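
A small worked example may help with the first two sentences of the abstract. It is not taken from the paper; the choice of the binary symmetric channel and the matrix names are assumptions made for illustration. Write channels as row-stochastic matrices (rows indexed by input, columns by output). The four deterministic binary channels are the identity $I$, the flip $F$, and the two constant channels $C_0$ and $C_1$. A binary symmetric channel $W_p$ with crossover probability $p \le 1/2$ then admits at least two different decompositions:

```latex
\[
\begin{aligned}
  W_p &= \begin{pmatrix} 1-p & p \\ p & 1-p \end{pmatrix}
       = (1-p)\underbrace{\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}}_{I}
       + p\,\underbrace{\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}}_{F}, \\[6pt]
  W_p &= (1-2p)\,I
       + p\,\underbrace{\begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix}}_{C_0}
       + p\,\underbrace{\begin{pmatrix} 0 & 1 \\ 0 & 1 \end{pmatrix}}_{C_1}.
\end{aligned}
\]
```

The first decomposition puts weights $(1-p,\,p)$ on the states $\{I, F\}$ and the second puts weights $(1-2p,\,p,\,p)$ on $\{I, C_0, C_1\}$, so the same channel carries two different intrinsic-state distributions.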