Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
29 works
0 followers
24 topics
4 close collaborators

Published work

29 published item(s)

preprint, 2026, arXiv

Domain-Adaptive Communication-Rate Optimization for Sim-to-Real Humanoid-Robot Wireless XR Teleoperation

Wireless extended reality (XR) teleoperation provides embodied interaction capability for collecting humanoid robot demonstrations, but large-scale adoption is restricted by the overhead of high-frequency motion transmission. This paper develops a system framework that integrates sampling, transmission, interpolation, and reconstruction and formulates a communication-rate optimization that aims to minimize the communication energy while maintaining the reconstruction accuracy of robot motion trajectories through dimension-wise sampling-rate control. Since acquiring real-time feedback from physical robots is limited by hardware costs, it is necessary to solve the problem through simulator interaction with offline real-domain data correction. To guide sim-to-real adaptation, we provide a PAC-Bayes generalization characterization that reveals the effects of latent density-ratio estimation, finite-sample deviation, and encoder bias. Building on this analysis, we propose a proximal policy optimization (PPO) method with density-ratio weighting and trust-region regularization. Experiments on a public humanoid teleoperation dataset show that the proposed method improves the tradeoff between reconstruction error and communication energy consumption under sim-to-real distribution shift. We further analyze the effectiveness of the proposed algorithm across various wireless channels and dynamic motion trajectories.

preprint, 2023, arXiv

A Surrogate-Assisted Extended Generative Adversarial Network for Parameter Optimization in Free-Form Metasurface Design

Metasurfaces have widespread applications in fifth-generation (5G) microwave communication. Among the metasurface family, free-form metasurfaces excel in achieving intricate spectral responses compared to regular-shape counterparts. However, conventional numerical methods for free-form metasurfaces are time-consuming and demand specialized expertise. Alternatively, recent studies demonstrate that deep learning has great potential to accelerate and refine metasurface designs. Here, we present XGAN, an extended generative adversarial network (GAN) with a surrogate for high-quality free-form metasurface designs. The proposed surrogate provides a physical constraint to XGAN so that XGAN can accurately generate metasurfaces monolithically from input spectral responses. In comparative experiments involving 20000 free-form metasurface designs, XGAN achieves 0.9734 average accuracy and is 500 times faster than the conventional methodology. This method facilitates the metasurface library building for specific spectral responses and can be extended to various inverse design problems, including optical metamaterials, nanophotonic devices, and drug discovery.

preprint, 2022, arXiv

Analysis and Optimization of A Double-IRS Cooperatively Assisted System with A Quasi-Static Phase Shift Design

The analysis and optimization of single intelligent reflecting surface (IRS)-assisted systems have been extensively studied, whereas little is known regarding multiple-IRS-assisted systems. This paper investigates the analysis and optimization of a double-IRS cooperatively assisted downlink system, where a multi-antenna base station (BS) serves a single-antenna user with the help of two multi-element IRSs, connected by an inter-IRS channel. The channel between any two nodes is modeled with Rician fading. The BS adopts the instantaneous CSI-adaptive maximum-ratio transmission (MRT) beamformer, and the two IRSs adopt a cooperative quasi-static phase shift design. The goal is to maximize the average achievable rate, which can be reflected by the average channel power of the equivalent channel between the BS and user, at a low phase adjustment cost and computational complexity. First, we obtain tractable expressions of the average channel power of the equivalent channel in the general Rician factor, pure line of sight (LoS), and pure non-line of sight (NLoS) regimes, respectively. Then, we jointly optimize the phase shifts of the two IRSs to maximize the average channel power of the equivalent channel in these regimes. The optimization problems are challenging non-convex problems. We obtain globally optimal closed-form solutions for some cases and propose computationally efficient iterative algorithms to obtain stationary points for the other cases. Next, we compare the computational complexity for optimizing the phase shifts and the optimal average channel power of the double-IRS cooperatively assisted system with those of a counterpart single-IRS-assisted system at a large number of reflecting elements in the three regimes. Finally, we numerically demonstrate notable gains of the proposed solutions over the existing solutions at different system parameters.
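As a rough illustration of the maximum-ratio transmission (MRT) beamformer named in the abstract above, the sketch below normalizes the conjugate of a channel vector and evaluates the resulting channel power. The function names and the toy channel are illustrative assumptions; the double-IRS equivalent channel and the Rician fading model from the paper are not reproduced here.

```python
import math


def mrt_beamformer(h):
    """Return the unit-norm MRT beamformer for a channel vector h (list of complex)."""
    norm = math.sqrt(sum(abs(x) ** 2 for x in h))
    if norm == 0:
        raise ValueError("channel vector must be nonzero")
    # MRT aligns the beam with the channel: conjugate, then normalize.
    return [x.conjugate() / norm for x in h]


def channel_power(h, w):
    """Squared magnitude of the effective scalar channel sum(h_i * w_i)."""
    return abs(sum(hi * wi for hi, wi in zip(h, w))) ** 2


# Hypothetical two-antenna channel, chosen only for illustration.
h = [3 + 4j, 1 - 2j]
w = mrt_beamformer(h)
power = channel_power(h, w)  # equals the squared norm of h under MRT
```

Under this assumption the channel power reduces to the squared norm of the channel vector, which is why the abstract can treat the average channel power of the equivalent channel as a proxy for the average achievable rate.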

preprint, 2022, arXiv

Conservation of the particle-hole symmetry in the pseudogap state in optimally-doped Bi2Sr2CuO6+δ superconductor

The pseudogap state is one of the most enigmatic characteristics in the anomalous normal state properties of the high temperature cuprate superconductors. A central issue is to reveal whether there is a symmetry breaking and which symmetries are broken across the pseudogap transition. By performing high resolution laser-based angle-resolved photoemission measurements on the optimally-doped Bi2Sr1.6La0.4CuO6+δ superconductor, we report the observations of the particle-hole symmetry conservation in both the superconducting state and the pseudogap state along the entire Fermi surface. These results provide key insights in understanding the nature of the pseudogap and its relation with high temperature superconductivity.

preprint, 2022, arXiv

Contrastive Embedding Distribution Refinement and Entropy-Aware Attention for 3D Point Cloud Classification

Learning a powerful representation from point clouds is a fundamental and challenging problem in the field of computer vision. Different from images where RGB pixels are stored in the regular grid, for point clouds, the underlying semantic and structural information of point clouds is the spatial layout of the points. Moreover, the properties of challenging in-context and background noise pose more challenges to point cloud analysis. One assumption is that the poor performance of the classification model can be attributed to the indistinguishable embedding feature that impedes the search for the optimal classifier. This work offers a new strategy for learning powerful representations via a contrastive learning approach that can be embedded into any point cloud classification network. First, we propose a supervised contrastive classification method to implement embedding feature distribution refinement by improving the intra-class compactness and inter-class separability. Second, to solve the confusion problem caused by small inter-class variations between some similar-looking categories, we propose a confusion-prone class mining strategy to alleviate the confusion effect. Finally, considering that outliers of the sample clusters in the embedding space may cause performance degradation, we design an entropy-aware attention module with information entropy theory to identify the outlier cases and the unstable samples by measuring the uncertainty of predicted probability. The results of extensive experiments demonstrate that our method outperforms the state-of-the-art approaches by achieving 82.9% accuracy on the real-world ScanObjectNN dataset and substantial performance gains up to 2.9% in DCGNN, 3.1% in PointNet++, and 2.4% in GBNet.

preprint, 2022, arXiv

Deep 3D-to-2D Watermarking: Embedding Messages in 3D Meshes and Extracting Them from 2D Renderings

Digital watermarking is widely used for copyright protection. Traditional 3D watermarking approaches or commercial software are typically designed to embed messages into 3D meshes, and later retrieve the messages directly from distorted/undistorted watermarked 3D meshes. However, in many cases, users only have access to rendered 2D images instead of 3D meshes. Unfortunately, retrieving messages from 2D renderings of 3D meshes is still challenging and underexplored. We introduce a novel end-to-end learning framework to solve this problem through: 1) an encoder to covertly embed messages in both mesh geometry and textures; 2) a differentiable renderer to render watermarked 3D objects from different camera angles and under varied lighting conditions; 3) a decoder to recover the messages from 2D rendered images. From our experiments, we show that our model can learn to embed information visually imperceptible to humans, and to retrieve the embedded information from 2D renderings that undergo 3D distortions. In addition, we demonstrate that our method can also work with other renderers, such as ray tracers and real-time renderers with and without fine-tuning.

preprint, 2022, arXiv

Doubly Coupled Designs for Computer Experiments with both Qualitative and Quantitative Factors

Computer experiments with both qualitative and quantitative input variables occur frequently in many scientific and engineering applications. How to choose input settings for such experiments is an important issue for accurate statistical analysis, uncertainty quantification and decision making. Sliced Latin hypercube designs are the first systematic approach to address this issue. However, this approach becomes increasingly costly as the number of level combinations of the qualitative factors grows. For the reason of run size economy, marginally coupled designs were proposed in which the design for the quantitative factors is a sliced Latin hypercube design with respect to each qualitative factor. The drawback of such designs is that the corresponding data may not be able to capture the effects between any two (and more) qualitative factors and quantitative factors. To balance the run size and design efficiency, we propose a new type of design, doubly coupled designs, where the design points for the quantitative factors form a sliced Latin hypercube design with respect to the levels of any qualitative factor and with respect to the level combinations of any two qualitative factors, respectively. The proposed designs have better stratification properties between the qualitative and quantitative factors compared with marginally coupled designs. The existence of the proposed designs is established. Several construction methods are introduced, and the properties of the resulting designs are also studied.
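The sliced Latin hypercube property that the abstract above builds on can be sketched as a simple check. This is a minimal sketch assuming the commonly used definition: every column of an n-run design is a permutation of the levels 1..n, and within each slice the levels collapse to a smaller Latin hypercube after mapping each level to ceil(level / number_of_slices). The function names, the row-block slicing convention, and the toy designs are assumptions for illustration; the paper's doubly coupled construction is not implemented.

```python
def is_latin_hypercube(column, n_levels):
    """True if the column uses each level 1..n_levels exactly once."""
    return sorted(column) == list(range(1, n_levels + 1))


def is_sliced_lhd(design, n_slices):
    """Check the sliced Latin hypercube property.

    design: list of runs, each a list of integer levels in 1..n.
    Slices are assumed to be consecutive blocks of rows, for example
    one block per level of a qualitative factor.
    """
    n = len(design)
    if n == 0 or n % n_slices:
        return False
    m = n // n_slices  # runs per slice
    for j in range(len(design[0])):
        column = [run[j] for run in design]
        if not is_latin_hypercube(column, n):
            return False
        for s in range(n_slices):
            block = column[s * m:(s + 1) * m]
            # Integer ceiling of level / n_slices.
            collapsed = [(level + n_slices - 1) // n_slices for level in block]
            if not is_latin_hypercube(collapsed, m):
                return False
    return True


# Toy 4-run, 2-factor designs split into 2 slices of 2 runs each.
good = [[1, 4], [3, 1], [2, 2], [4, 3]]
bad = [[1, 4], [2, 1], [3, 2], [4, 3]]
good_ok = is_sliced_lhd(good, 2)
bad_ok = is_sliced_lhd(bad, 2)
```

In the second toy design the first slice holds levels 1 and 2 of the first factor, which both collapse to level 1, so that slice does not cover the range of the factor.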

preprint, 2022, arXiv

Giant and Reversible Electronic Structure Evolution in a Magnetic Topological Material EuCd2As2

The electronic structure and the physical properties of quantum materials can be significantly altered by charge carrier doping and magnetic state transition. Here we report a discovery of a giant and reversible electronic structure evolution with doping in a magnetic topological material. By performing high-resolution angle-resolved photoemission measurements on EuCd2As2, we found that a huge amount of hole doping can be introduced into the sample surface due to surface absorption. The electronic structure exhibits a dramatic change with the hole doping which cannot be described by a rigid band shift. Prominent band splitting is observed at high doping which corresponds to a doping-induced magnetic transition at low temperature (below ~15 K) from an antiferromagnetic state to a ferromagnetic state. These results have established a detailed electronic phase diagram of EuCd2As2 where the electronic structure and the magnetic structure change systematically and dramatically with the doping level. They further suggest that the transport, magnetic and topological properties of EuCd2As2 can be greatly modified by doping. This work will stimulate further investigations to explore new phenomena and properties in doping this magnetic topological material.

preprint, 2022, arXiv

Joint Optimization of Preamble Selection and Access Barring for Random Access in MTC with General Device Activities

Most existing random access schemes for machine-type communications (MTC) simply adopt a uniform preamble selection distribution, irrespective of the underlying device activity distributions. Hence, they may yield unsatisfactory access efficiency. In this paper, we model device activities for MTC as multiple Bernoulli random variables following an arbitrary multivariate Bernoulli distribution which can reflect both dependent and independent device activities. Then, we optimize preamble selection and access barring for random access in MTC according to the underlying joint device activity distribution. Specifically, we investigate three cases of the joint device activity distribution, i.e., the cases of perfect, imperfect, and unknown joint device activity distributions, and formulate the average, worst-case average, and sample average throughput maximization problems, respectively. The problems in the three cases are challenging nonconvex problems. In the case of perfect joint device activity distribution, we develop an iterative algorithm and a low-complexity iterative algorithm to obtain stationary points of the original problem and an approximate problem, respectively. In the case of imperfect joint device activity distribution, we develop an iterative algorithm and a low-complexity iterative algorithm to obtain a Karush-Kuhn-Tucker (KKT) point of an equivalent problem and a stationary point of an approximate problem, respectively. Finally, in the case of unknown joint device activity distribution, we develop an iterative algorithm to obtain a stationary point. The proposed solutions are widely applicable and outperform existing solutions for dependent and independent device activities.
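To make the baseline in the abstract above concrete, the sketch below computes the expected number of collision-free preamble transmissions under uniform preamble selection with a single access barring factor. It is a simplified sketch under stated assumptions: a fixed, known number of active devices, independent choices, and a success defined as exactly one device transmitting on a preamble. The function name and parameters are hypothetical, and the paper's optimization over general joint activity distributions is not reproduced.

```python
def expected_successes(n_active, n_preambles, access_prob):
    """Expected number of devices that transmit without a preamble collision.

    Each active device passes access barring with probability access_prob
    and then picks one of n_preambles uniformly at random. A device
    succeeds when no other active device transmits on the same preamble.
    """
    if n_active < 0 or n_preambles <= 0 or not 0.0 <= access_prob <= 1.0:
        raise ValueError("invalid parameters")
    if n_active == 0:
        return 0.0
    # Probability that one specific other device lands on the same preamble.
    clash = access_prob / n_preambles
    return n_active * access_prob * (1.0 - clash) ** (n_active - 1)


# With many devices and few preambles, barring some devices raises throughput.
without_barring = expected_successes(10, 2, 1.0)
with_barring = expected_successes(10, 2, 0.2)
```

The comparison at the end shows why access barring matters when the number of active devices is large relative to the number of preambles.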

preprint, 2022, arXiv

LECA: A Learned Approach for Efficient Cover-agnostic Watermarking

In this work, we present an efficient multi-bit deep image watermarking method that is cover-agnostic yet also robust to geometric distortions such as translation and scaling as well as other distortions such as JPEG compression and noise. Our design consists of a light-weight watermark encoder jointly trained with a deep neural network based decoder. Such a design allows us to retain the efficiency of the encoder while fully utilizing the power of a deep neural network. Moreover, the watermark encoder is independent of the image content, so the generated watermarks are universally applicable to different cover images and users can pre-generate them for further efficiency. To offer robustness towards geometric transformations, we introduce a learned model for predicting the scale and offset of the watermarked images. Experiments show that our method outperforms comparably efficient watermarking methods by a large margin.

preprint, 2022, arXiv

MAXIM: Multi-Axis MLP for Image Processing

Recent progress on Transformers and multi-layer perceptron (MLP) models provides new network architectural designs for computer vision tasks. Although these models proved to be effective in many vision tasks such as image recognition, there remain challenges in adapting them for low-level vision. The inflexibility to support high-resolution images and limitations of local attention are perhaps the main bottlenecks. In this work, we present a multi-axis MLP based architecture called MAXIM, that can serve as an efficient and flexible general-purpose vision backbone for image processing tasks. MAXIM uses a UNet-shaped hierarchical structure and supports long-range interactions enabled by spatially-gated MLPs. Specifically, MAXIM contains two MLP-based building blocks: a multi-axis gated MLP that allows for efficient and scalable spatial mixing of local and global visual cues, and a cross-gating block, an alternative to cross-attention, which accounts for cross-feature conditioning. Both these modules are exclusively based on MLPs, but also benefit from being both global and 'fully-convolutional', two properties that are desirable for image processing. Our extensive experimental results show that the proposed MAXIM model achieves state-of-the-art performance on more than ten benchmarks across a range of image processing tasks, including denoising, deblurring, deraining, dehazing, and enhancement while requiring fewer or comparable numbers of parameters and FLOPs than competitive models. The source code and trained models will be available at https://github.com/google-research/maxim.

preprint, 2022, arXiv

MaxViT: Multi-Axis Vision Transformer

Transformers have recently gained significant attention in the computer vision community. However, the lack of scalability of self-attention mechanisms with respect to image size has limited their wide adoption in state-of-the-art vision backbones. In this paper we introduce an efficient and scalable attention model we call multi-axis attention, which consists of two aspects: blocked local and dilated global attention. These design choices allow global-local spatial interactions on arbitrary input resolutions with only linear complexity. We also present a new architectural element by effectively blending our proposed attention model with convolutions, and accordingly propose a simple hierarchical vision backbone, dubbed MaxViT, by simply repeating the basic building block over multiple stages. Notably, MaxViT is able to "see" globally throughout the entire network, even in earlier, high-resolution stages. We demonstrate the effectiveness of our model on a broad spectrum of vision tasks. On image classification, MaxViT achieves state-of-the-art performance under various settings: without extra data, MaxViT attains 86.5% ImageNet-1K top-1 accuracy; with ImageNet-21K pre-training, our model achieves 88.7% top-1 accuracy. For downstream tasks, MaxViT as a backbone delivers favorable performance on object detection as well as visual aesthetic assessment. We also show that our proposed model expresses strong generative modeling capability on ImageNet, demonstrating the superior potential of MaxViT blocks as a universal vision module. The source code and trained models will be available at https://github.com/google-research/maxvit.

preprint, 2022, arXiv

Molecular states from $\bar{B}^{(*)}N$ interactions

In 2019, two new structures $Λ_b(6146)$ and $Λ_b(6152)$ were observed by the LHCb Collaboration in the invariant mass spectrum of $Λ_b^0π^{+}π^{-}$, which aroused a hot discussion about their inner structures. The $Λ_b(6146)$ and $Λ_b(6152)$ might still be molecular states because their masses are close to the threshold of a $\bar{B}$ meson and a nucleon. In this work, we perform a systematic investigation of possible heavy baryonic molecular states from the $\bar{B}N$ interaction. Since the $\bar{B}N$ channel strongly couples to the $\bar{B}^{*}N$ channel, the possible $\bar{B}N-\bar{B}^{*}N$ bound states are also studied. The interaction of the system considered is described by the $t$-channel $σ$, $π$, $η$, $ω$, and $ρ$ meson exchanges. By solving the non-relativistic Schrödinger equation with the obtained one-boson-exchange potentials, the $\bar{B}^{(*)}N$ bound states with different quantum numbers are searched. The calculation suggests that the recently observed $Λ_b(6146)$ can be assigned as a $P$-wave $\bar{B}N$ molecular state with spin parity $J^P=3/2^{+}$ or a $\bar{B}N-\bar{B}^{*}N$ bound state. However, assignment of $Λ_b(6152)$ as an $F$-wave $\bar{B}N$ molecular state is disfavored. The $Λ_b(6152)$ can be explained as a meson-baryon molecular state with a small $\bar{B}N$ component. The calculation also predicts the existence of two $S$-wave $\bar{B}N-\bar{B}^{*}N$ bound states that can be related to the experimentally observed $Λ_b(5912)$ and $Λ_b(5920)$.

preprint, 2022, arXiv

Video object tracking based on YOLOv7 and DeepSORT

Multiple object tracking (MOT) is an important technology in the field of computer vision, which is widely used in automatic driving, intelligent monitoring, behavior recognition and other directions. Among the current popular MOT methods based on deep learning, Detection Based Tracking (DBT) is the most widely used in industry, and the performance of these methods depends on their object detection network. At present, the most widely used DBT algorithm with good performance is YOLOv5-DeepSORT. Inspired by YOLOv5-DeepSORT, and following the proposal of the YOLOv7 network, which performs better in object detection, we apply YOLOv7 as the object detection part of DeepSORT and propose YOLOv7-DeepSORT. In experimental evaluation, YOLOv7-DeepSORT performs better in tracking accuracy than the previous YOLOv5-DeepSORT.

preprint, 2021, arXiv

Multi-path Neural Networks for On-device Multi-domain Visual Classification

Learning multiple domains/tasks with a single model is important for improving data efficiency and lowering inference cost for numerous vision tasks, especially on resource-constrained mobile devices. However, hand-crafting a multi-domain/task model can be both tedious and challenging. This paper proposes a novel approach to automatically learn a multi-path network for multi-domain visual classification on mobile devices. The proposed multi-path network is learned from neural architecture search by applying one reinforcement learning controller for each domain to select the best path in the super-network created from a MobileNetV3-like search space. An adaptive balanced domain prioritization algorithm is proposed to balance optimizing the joint model on multiple domains simultaneously. The determined multi-path model selectively shares parameters across domains in shared nodes while keeping domain-specific parameters within non-shared nodes in individual domain paths. This approach effectively reduces the total number of parameters and FLOPS, encouraging positive knowledge transfer while mitigating negative interference across domains. Extensive evaluations on the Visual Decathlon dataset demonstrate that the proposed multi-path model achieves state-of-the-art performance in terms of accuracy, model size, and FLOPS against other approaches using MobileNetV3-like architectures. Furthermore, the proposed method improves average accuracy over learning single-domain models individually, and reduces the total number of parameters and FLOPS by 78% and 32% respectively, compared to the approach that simply bundles single-domain models for multi-domain learning.

preprint, 2021, arXiv

Radiative decay of the $Ξ(1620)$ in a hadronic molecule picture

Last year, the $Ξ(1620)$ state that is cataloged in the Particle Data Group (PDG) with only one star was reported again in the $Ξ^{-}π^{+}$ final state by the Belle Collaboration. Its properties, not only the spectroscopy but also the decay width, cannot be simply explained in the context of conventional constituent quark models. This has prompted an active discussion on the structure of this resonance. In this work, we study the radiative decays of the newly observed $Ξ(1620)$ assuming, as in our previous work, that it is a meson-baryon molecular state of $Λ\bar{K}$ and $Σ\bar{K}$ with spin-parity $J^P=1/2^{-}$. The partial decay widths of the $Λ\bar{K}-Σ\bar{K}$ molecular state into $Ξγ$ and $Ξπγ$ final states through hadronic loops are evaluated with the help of the effective Lagrangians. The partial widths for the $Ξ(1620)^0\toγΞ$ and $Ξ(1620)^0\toγΞπ$ are evaluated to be about $118.76-174.21$ keV and $58.19-68.75$ eV, respectively, which may be accessible for the LHCb. If the $Ξ(1620)$ is a $Λ\bar{K}-Σ\bar{K}$ molecule, the radiative transition strength $Ξ(1620)^0\toγ\bar{K}Λ$ is quite small and the decay width is of the order of 0.01 eV. Future experimental measurements of these processes can be useful to test the molecule interpretations of the $Ξ(1620)$.

preprint, 2021, arXiv

Uniformity criterion for designs with both qualitative and quantitative factors

Experiments with both qualitative and quantitative factors occur frequently in practical applications. Many construction methods for this kind of design, such as marginally coupled designs, were proposed to pursue some good space-filling structures. However, few criteria can be adapted to quantify the space-filling property of designs involving both qualitative and quantitative factors. As uniformity is an important space-filling property of a design, in this paper, a new uniformity criterion, qualitative-quantitative discrepancy (QQD), is proposed for assessing the uniformity of designs with both types of factors. The closed form and lower bounds of the QQD are presented to calculate the exact QQD values of designs and recognize the uniform designs directly. In addition, a connection between the QQD and the balance pattern is derived, which not only helps to obtain a new lower bound but also provides a statistical justification of the QQD. Several examples show that the proposed criterion is reasonable and useful since it can distinguish distinct designs very well.

preprint, 2020, arXiv

A comparative study on magnetic order and field induced magnetic transition on two series of double perovskite iridates RE2BIrO6 (RE=Pr,Nd,Sm,Eu,Gd B=Zn,Mg)

We perform a comparative magnetic study on two series of rare-earth (RE) based double perovskite iridates RE2BIrO6 (RE=Pr,Nd,Sm-Gd;B=Zn,Mg), which show a Mott insulating state with a tunable charge energy gap from ~330 meV to ~560 meV by changing RE cations. For nonmagnetic RE=Eu cations, Eu2MgIrO6 shows antiferromagnetic (AFM) order and field-induced spin-flop transitions below the Néel temperature (TN) in comparison with the ferromagnetic (FM)-like behaviors of Eu2ZnIrO6 at low temperatures. For magnetic-moment-containing RE ions, Gd2BIrO6 show contrasting magnetic behaviors with FM-like transition (B=Zn) and AFM order (B=Mg), respectively. For RE=Pr, Nd and Sm ions, by contrast, all members show an AFM ground state and field-induced spin-flop transitions below TN irrespective of B=Zn or Mg cations. Moreover, two successive field-induced metamagnetic transitions are observed for RE2ZnIrO6 (RE=Pr,Nd) in high field up to 56 T, and the resultant field-temperature (H-T) phase diagrams are constructed. The diverse magnetic behaviors in RE2BIrO6 reveal that the 4f-Ir exchange interactions between the RE and Ir sublattices can mediate their magnetism.

preprint, 2020, arXiv

An Integrated Quadratic Reconstruction for Finite Volume Schemes to Scalar Conservation Laws in Multiple Dimensions

We propose a piecewise quadratic reconstruction method in multiple dimensions, which is in an integrated style, for finite volume schemes for scalar conservation laws. This integrated quadratic reconstruction is parameter-free and applicable on flexible grids. We show that the finite volume schemes with the new reconstruction satisfy a local maximum principle when the time step length is properly chosen. Numerical examples are presented to show that the proposed scheme attains third-order accuracy for smooth solutions in both 2D and 3D cases. The numerical results indicate that the local maximum principle helps to prevent overshoots in numerical solutions.

preprint, 2020, arXiv

Distortion Agnostic Deep Watermarking

Watermarking is the process of embedding information into an image that can survive under distortions, while requiring the encoded image to have little or no perceptual difference from the original image. Recently, deep learning-based methods achieved impressive results in both visual quality and message payload under a wide variety of image distortions. However, these methods all require differentiable models for the image distortions at training time, and may generalize poorly to unknown distortions. This is undesirable since the types of distortions applied to watermarked images are usually unknown and non-differentiable. In this paper, we propose a new framework for distortion-agnostic watermarking, where the image distortion is not explicitly modeled during training. Instead, the robustness of our system comes from two sources: adversarial training and channel coding. Compared to training on a fixed set of distortions and noise levels, our method achieves comparable or better results on distortions available during training, and better performance on unknown distortions.

preprint, 2020, arXiv

Electronic Raman Scattering in Suspended Semiconducting Carbon Nanotubes

The electronic Raman scattering (ERS) features of single-walled carbon nanotubes (SWNTs) can reveal a wealth of information about their electronic structures, but have previously been thought to appear exclusively in metallic (M-) but not in semiconducting (S-) SWNTs. We report the experimental observation of the ERS features with an accuracy of 1 meV in suspended S-SWNTs, the processes of which are accomplished via the available high-energy electron-hole pairs. The ERS features can facilitate further systematic studies on the properties of SWNTs, both metallic and semiconducting, with defined chirality.

preprint, 2020, arXiv

GIFnets: Differentiable GIF Encoding Framework

Graphics Interchange Format (GIF) is a widely used image file format. Due to the limited number of palette colors, GIF encoding often introduces color banding artifacts. Traditionally, dithering is applied to reduce color banding, but it introduces dotted-pattern artifacts. To reduce artifacts and provide a better and more efficient GIF encoding, we introduce a differentiable GIF encoding pipeline, which includes three novel neural networks: PaletteNet, DitherNet, and BandingNet. Each of these three networks provides an important functionality within the GIF encoding pipeline. PaletteNet predicts a near-optimal color palette given an input image. DitherNet manipulates the input image to reduce color banding artifacts and provides an alternative to traditional dithering. Finally, BandingNet is designed to detect color banding, and provides a new perceptual loss specifically for GIF images. As far as we know, this is the first fully differentiable GIF encoding pipeline based on deep neural networks and compatible with existing GIF decoders. A user study shows that our algorithm is better than Floyd-Steinberg based GIF encoding.
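For context on the traditional baseline that the abstract above compares against, here is a minimal sketch of Floyd-Steinberg error-diffusion dithering on a grayscale image. It uses the standard 7/16, 3/16, 5/16, 1/16 error weights; the function name, the list-of-rows image format, and the two-level palette are assumptions chosen for brevity. It is not the paper's PaletteNet, DitherNet, or BandingNet.

```python
def floyd_steinberg(image, levels=(0, 255)):
    """Dither a grayscale image (list of rows of values in 0..255) to a palette."""
    height = len(image)
    width = len(image[0]) if height else 0
    work = [[float(v) for v in row] for row in image]  # working copy with errors
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            old = work[y][x]
            # Quantize to the nearest palette level.
            new = min(levels, key=lambda level: abs(level - old))
            out[y][x] = new
            err = old - new
            # Diffuse the quantization error to not-yet-visited neighbors.
            if x + 1 < width:
                work[y][x + 1] += err * 7 / 16
            if y + 1 < height:
                if x > 0:
                    work[y + 1][x - 1] += err * 3 / 16
                work[y + 1][x] += err * 5 / 16
                if x + 1 < width:
                    work[y + 1][x + 1] += err * 1 / 16
    return out


# A flat mid-gray patch becomes a mix of black and white pixels.
gray = [[128] * 8 for _ in range(8)]
dithered = floyd_steinberg(gray)
```

The scattered black and white pixels produced on a flat gray patch are the dotted-pattern artifacts that the abstract refers to.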

preprint, 2020, arXiv

McKernel: A Library for Approximate Kernel Expansions in Log-linear Time

McKernel introduces a framework for using kernel approximations in the mini-batch setting with Stochastic Gradient Descent (SGD) as an alternative to Deep Learning. Based on Random Kitchen Sinks [Rahimi and Recht 2007], we provide a C++ library for Large-scale Machine Learning. It contains a CPU-optimized implementation of the algorithm in [Le et al. 2013], which allows the computation of approximated kernel expansions in log-linear time. The algorithm requires computing products with Walsh Hadamard matrices. A cache-friendly Fast Walsh Hadamard transform that achieves compelling speed and outperforms current state-of-the-art methods has been developed. McKernel establishes the foundation of a new learning architecture that enables large-scale non-linear classification by combining lightning kernel expansions and a linear classifier. It operates in the mini-batch setting, working analogously to Neural Networks. We show the validity of our method through extensive experiments on MNIST and FASHION MNIST [Xiao et al. 2017].
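The log-linear cost mentioned in the abstract above comes from the fast Walsh-Hadamard transform, which applies a Hadamard matrix to a vector without forming the matrix. The sketch below is a plain, unnormalized in-place butterfly version written for readability; the function name is an assumption, and it is not McKernel's cache-friendly C++ implementation.

```python
def fwht(values):
    """Return the unnormalized Walsh-Hadamard transform of a sequence.

    The length must be a power of two. The cost is O(n log n) additions,
    compared with O(n^2) for an explicit matrix-vector product.
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a positive power of two")
    out = list(values)
    h = 1
    while h < n:
        # Combine pairs of entries that are h apart (one butterfly stage).
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out


x = [1, 0, 1, 0, 0, 1, 1, 0]
transformed = fwht(x)
# Applying the transform twice scales the input by its length.
round_trip = fwht(transformed)
```

Because the Hadamard matrix is symmetric and its square is n times the identity, applying the transform twice returns the input scaled by n, which gives a convenient self-check.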

preprint2020arXiv

Spectroscopic Evidence of Bilayer Splitting and Interlayer Pairing in an Iron Based Superconductor

In high temperature cuprate superconductors, the interlayer coupling between the CuO$_2$ planes plays an important role in dictating superconductivity, as indicated by the sensitive dependence of the critical temperature (T$_C$) on the number of CuO$_2$ planes in one structural unit. In the Bi$_2$Sr$_2$CaCu$_2$O$_{8+δ}$ superconductor with two CuO$_2$ planes in one structural unit, the interaction between the two CuO$_2$ planes gives rise to band splitting into two Fermi surface sheets (bilayer splitting) that have distinct superconducting gaps. The iron based superconductors are composed of stacked FeAs/FeSe layers; whether the interlayer coupling can cause similar band splitting, and its effect on superconductivity, remain unclear. Here we report high resolution laser-based angle-resolved photoemission spectroscopy (ARPES) measurements on a newly discovered iron based superconductor, KCa$_2$Fe$_4$As$_4$F$_2$ (T$_C$=33.5\,K), which consists of stacked FeAs blocks with two FeAs layers separated by insulating Ca$_2$F$_2$ blocks. A bilayer splitting effect is observed for the first time, giving rise to a total of five hole-like Fermi surface sheets around the Brillouin zone center. Band structure calculations reproduce the observed bilayer splitting by identifying interlayer interorbital interaction between the two FeAs layers within one FeAs block. All the hole-like pockets around the zone center exhibit a Fermi surface-dependent and nodeless superconducting gap. Gap functions with short-range antiferromagnetic fluctuations are proposed, and the gap symmetry can be well understood when the interlayer pairing is considered. Particularly strong interlayer pairing is observed for one of the bands. Our observations provide key information on the interlayer coupling and interlayer pairing in understanding superconductivity in iron based superconductors.

preprint2020arXiv

Super-Resolving Commercial Satellite Imagery Using Realistic Training Data

In machine learning based single image super-resolution, the degradation model is embedded in training data generation. However, most existing satellite image super-resolution methods use a simple down-sampling model with a fixed kernel to create training images. These methods work fine on synthetic data, but do not perform well on real satellite images. We propose a realistic training data generation model for commercial satellite imagery products, which includes not only the imaging process on satellites but also the post-process on the ground. We also propose a convolutional neural network optimized for satellite images. Experiments show that the proposed training data generation model is able to improve super-resolution performance on real satellite images.

preprint2020arXiv

The Rate-Distortion-Accuracy Tradeoff: JPEG Case Study

Handling digital images is almost always accompanied by lossy compression in order to facilitate efficient transmission and storage. This introduces an unavoidable tension between the allocated bit-budget (rate) and the faithfulness of the resulting image to the original one (distortion). An additional complicating consideration is the effect of the compression on recognition performance by given classifiers (accuracy). This work aims to explore this rate-distortion-accuracy tradeoff. As a case study, we focus on the design of the quantization tables in the JPEG compression standard. We offer a novel optimal tuning of these tables via continuous optimization, leveraging a differential implementation of both the JPEG encoder-decoder and an entropy estimator. This enables us to offer a unified framework that considers the interplay between rate, distortion and classification accuracy. On all these fronts, we report a substantial boost in performance by a simple and easily implemented modification of these tables.

preprint2020arXiv

Two-Phase Multi-Party Computation Enabled Privacy-Preserving Federated Learning

Countries across the globe have been pushing strict regulations on the protection of collected personal or private data. The traditional centralized machine learning method, where data is collected from end-users or IoT devices so that insights behind real-world data can be discovered, may not be feasible for many data-driven industry applications in light of such regulations. A new machine learning method, coined by Google as Federated Learning (FL), enables multiple participants to train a machine learning model collectively without directly exchanging data. However, recent studies have shown that it is still possible to exploit the shared models to extract personal or confidential data. In this paper, we propose to adopt Multi-Party Computation (MPC) to achieve privacy-preserving model aggregation for FL. MPC-enabled model aggregation in a peer-to-peer manner incurs high communication overhead with low scalability. To address this problem, we propose a two-phase mechanism: 1) electing a small committee and 2) providing MPC-enabled model aggregation service to a larger number of participants through the committee. The MPC-enabled FL framework has been integrated in an IoT platform for smart manufacturing. It enables a set of companies to train high quality models collectively by leveraging their complementary data-sets on their own premises, without compromising privacy, model accuracy relative to traditional machine learning methods, or execution efficiency in terms of communication cost and execution time.
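The two-phase idea described above can be sketched with additive secret sharing. This is a toy illustration, not the paper's protocol: the modulus, the uniformly random committee election, the integer-valued "updates", and the names `make_shares` and `aggregate_via_committee` are all assumptions.

```python
import random

MODULUS = 2**61 - 1  # hypothetical modulus for share arithmetic


def make_shares(secret, n_shares, rng):
    """Split an integer into n_shares additive shares modulo MODULUS."""
    shares = [rng.randrange(MODULUS) for _ in range(n_shares - 1)]
    shares.append((secret - sum(shares)) % MODULUS)
    return shares


def aggregate_via_committee(updates, committee_size, rng):
    """Toy two-phase aggregation of integer model updates."""
    # Phase 1: elect a small committee (random choice is an assumption here).
    committee = rng.sample(range(len(updates)), committee_size)
    # Phase 2: every participant sends one share to each committee member,
    # so no single member ever sees an individual update in the clear.
    partial_sums = {member: 0 for member in committee}
    for update in updates:
        shares = make_shares(update, committee_size, rng)
        for member, share in zip(committee, shares):
            partial_sums[member] = (partial_sums[member] + share) % MODULUS
    # Committee members publish only their partial sums, which are combined.
    total = sum(partial_sums.values()) % MODULUS
    return committee, total


rng = random.Random(0)
updates = [3, 10, 25, 7, 5]
committee, total = aggregate_via_committee(updates, committee_size=3, rng=rng)
```

Each participant sends `committee_size` messages instead of one per peer, which is the communication saving the two-phase design aims for. Real model updates would need a fixed-point encoding of floating-point weights, which this sketch omits.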

preprint2019arXiv

Evidence for an Additional Symmetry Breaking from Direct Observation of Band Splitting in the Nematic State of FeSe Superconductor

The iron-based superconductor FeSe has attracted much recent attention because of its simple crystal structure, distinct electronic structure and the rich physics exhibited by itself and its derivatives. Determination of its intrinsic electronic structure is crucial to understanding its physical properties and superconductivity mechanism. Both theoretical and experimental studies so far have provided a picture in which FeSe has one hole-like Fermi surface around the Brillouin zone center in its nematic state. Here we report direct observation of two hole-like Fermi surface sheets around the Brillouin zone center, and the splitting of the associated bands, in the nematic state of FeSe by taking high resolution laser-based angle-resolved photoemission measurements. These results indicate that, in addition to nematic order and spin-orbit coupling, there is an additional order in FeSe that breaks either inversion or time-reversal symmetry. The new Fermi surface topology calls for a reexamination of the existing theoretical and experimental understanding of FeSe and stimulates further efforts to identify the origin of the hidden order in its nematic state.

preprint2019arXiv

Selective Hybridization between Main Band and Superstructure Band in Bi$_2$Sr$_2$CaCu$_2$O$_{8+δ}$ Superconductor

High-resolution laser-based angle-resolved photoemission measurements have been carried out on Bi$_2$Sr$_2$CaCu$_2$O$_{8+δ}$ (Bi2212) and Bi$_2$Sr$_{2-x}$La$_x$CuO$_{6+δ}$ (Bi2201) superconductors. Unexpected hybridization between the main band and the superstructure band in Bi2212 is clearly revealed. In the momentum space where one main Fermi surface intersects with one superstructure Fermi surface, four bands are observed instead of two. The hybridization exists in both the superconducting state and the normal state, and in Bi2212 samples with different doping levels. Such a hybridization is not observed in Bi2201. This phenomenon can be understood by considering the bilayer splitting in Bi2212, the selective hybridization of two bands with peculiar combinations, and the altered matrix element effects of the hybridized bands. These observations provide strong evidence on the origin of the superstructure band, which is intrinsic to the CuO$_2$ planes. Therefore, understanding the physical properties and superconductivity mechanism in Bi2212 requires considering the complete Fermi surface topology, which involves the main bands, the superstructure bands and their interactions.