Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
33 works
0 followers
26 topics
4 close collaborators

Published work

33 published items

preprint · 2026 · arXiv

On the Implicit Reward Overfitting and the Low-rank Dynamics in RLVR

Recent extensive research has demonstrated that the enhanced reasoning capabilities acquired by models through Reinforcement Learning with Verifiable Rewards (RLVR) are primarily concentrated within the rank-1 components. Predicated on this observation, we employed Periodic Rank-1 Substitution and identified a counterintuitive phenomenon: RLVR may exhibit implicit reward overfitting to the training dataset. Specifically, the model can achieve satisfactory performance on the test set even when its rewards remain relatively low during the training process. Furthermore, we characterize three distinct properties of RL training: (1) The effective rank-1 components in RLVR do not retain model knowledge other than mathematical reasoning capability. (2) RLVR fundamentally functions by optimizing a specific singular spectrum. The distribution of singular values of almost all linear layers in an RLVR-trained model behaves like a heavy-tailed distribution. (3) The left singular vectors associated with rank-1 components demonstrate a stronger alignment tendency during training, which echoes the discovery that RLVR is, in essence, optimizing sampling efficiency. Taken together, our findings and analysis further reveal how RLVR shapes model parameters and offer potential insights for improving existing RL paradigms or other training paradigms to implement continual learning.
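To make the term "rank-1 component" concrete, here is a minimal sketch that extracts the leading singular triplet of a weight-update matrix by power iteration and rebuilds the rank-1 part from it. The abstract does not describe how Periodic Rank-1 Substitution works, so this is not the authors' procedure; the function names and the toy matrix `delta_w` are assumptions made for illustration.

```python
import math


def top_singular_triplet(matrix, iterations=200):
    """Estimate (sigma, u, v) for the largest singular value by power iteration."""
    rows, cols = len(matrix), len(matrix[0])
    # Start vector; assumed not orthogonal to the leading right singular vector.
    v = [1.0 / math.sqrt(cols)] * cols
    sigma = 0.0
    for _ in range(iterations):
        # u <- A v, normalized to unit length.
        u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        norm_u = math.sqrt(sum(x * x for x in u))
        u = [x / norm_u for x in u]
        # v <- A^T u; its norm converges to the leading singular value.
        v = [sum(matrix[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        sigma = math.sqrt(sum(x * x for x in v))
        v = [x / sigma for x in v]
    return sigma, u, v


def rank1_part(matrix):
    """Return sigma * u * v^T, the best rank-1 approximation of the matrix."""
    sigma, u, v = top_singular_triplet(matrix)
    return [[sigma * ui * vj for vj in v] for ui in u]


# Hypothetical toy update: the outer product of [2, 1, 3] and [1, 2], so exactly rank 1.
delta_w = [[2.0, 4.0], [1.0, 2.0], [3.0, 6.0]]
approx = rank1_part(delta_w)
```

Because the toy matrix is itself rank 1, `approx` reproduces it up to floating-point error; for a real update matrix the difference `delta_w - approx` would hold the remaining components.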

preprint · 2023 · arXiv

Thermo-optic phase shifter based on hydrogen-doped indium oxide microheater

Thermo-optic (TO) phase shifters are very fundamental units in large-scale active silicon photonic integrated circuits (PICs). However, due to the limitation of microheater materials with a trade-off between heating efficiency and absorption loss, designs reported so far typically suffer from slow response time, high power consumption, low yields, and so on. Here, we demonstrate an energy-efficient, fast-response, and low-loss TO phase shifter by introducing hydrogen-doped indium oxide (IHO) films as microheater, and the optimized electron concentration with enhanced mobility endows the IHO high conductivity as well as high near-infrared (NIR) transparency, which allow it to directly contact the silicon waveguide without any insulating layer for efficient tuning and fast response. The TO phase shifter achieves a sub-microsecond response time (970 ns/980 ns) with a π phase shift power consumption of 9.6 mW. And the insertion loss introduced by the IHO microheater is ~ 0.5 dB. The proposed IHO-based microheaters with compatible processing technology illustrate the great potential of such material in the application of large-scale silicon PICs.

preprint · 2022 · arXiv

Audio Self-supervised Learning: A Survey

Inspired by humans' cognitive ability to generalise knowledge and skills, Self-Supervised Learning (SSL) aims to discover general representations from large-scale data without requiring human annotations, which are expensive and time-consuming to produce. Its success in the fields of computer vision and natural language processing has prompted its recent adoption in the field of audio and speech processing. Comprehensive reviews summarising the knowledge in audio SSL are currently missing. To fill this gap, in the present work, we provide an overview of the SSL methods used for audio and speech processing applications. Herein, we also summarise the empirical works that exploit the audio modality in multi-modal SSL frameworks, and the existing suitable benchmarks to evaluate the power of SSL in the computer audition domain. Finally, we discuss some open problems and point out future directions for the development of audio SSL.

preprint · 2022 · arXiv

Convex Programs and Lyapunov Functions for Reinforcement Learning: A Unified Perspective on the Analysis of Value-Based Methods

Value-based methods play a fundamental role in Markov decision processes (MDPs) and reinforcement learning (RL). In this paper, we present a unified control-theoretic framework for analyzing value-based methods such as value computation (VC), value iteration (VI), and temporal difference (TD) learning (with linear function approximation). Built upon an intrinsic connection between value-based methods and dynamic systems, we can directly use existing convex testing conditions in control theory to derive various convergence results for the aforementioned value-based methods. These testing conditions are convex programs in the form of either linear programming (LP) or semidefinite programming (SDP), and can be solved to construct Lyapunov functions in a straightforward manner. Our analysis reveals some intriguing connections between feedback control systems and RL algorithms. It is our hope that such connections can inspire more work at the intersection of system/control theory and RL.
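The connection between value-based methods and dynamic systems can be illustrated with value computation for a fixed policy, where the iterate follows the linear recursion V_{k+1} = r + γ P V_k and the error therefore obeys e_{k+1} = γ P e_k. The sketch below is a toy example, not the paper's LP/SDP construction: the two-state chain `P`, rewards `r`, and discount `gamma` are hypothetical, and the sup-norm error stands in for a simple decreasing certificate.

```python
def value_computation(P, r, gamma, steps):
    """Iterate V <- r + gamma * P V from V = 0 and return every iterate."""
    n = len(r)
    v = [0.0] * n
    history = [v]
    for _ in range(steps):
        v = [r[i] + gamma * sum(P[i][j] * v[j] for j in range(n)) for i in range(n)]
        history.append(v)
    return history


# Hypothetical two-state Markov chain induced by a fixed policy.
P = [[0.9, 0.1], [0.2, 0.8]]   # row-stochastic transition matrix
r = [1.0, 0.0]                 # per-state reward
gamma = 0.5                    # discount factor

history = value_computation(P, r, gamma, steps=60)
v_star = history[-1]           # after 60 steps this is the fixed point to machine precision

# Sup-norm distance to the fixed point for the first 20 iterates.
errors = [max(abs(a - b) for a, b in zip(v, v_star)) for v in history[:20]]
```

Since `P` is row-stochastic, each step shrinks the sup-norm error by at least the factor `gamma`, which is the kind of decrease a Lyapunov argument certifies.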

preprint · 2022 · arXiv

Data Augmentation for Depression Detection Using Skeleton-Based Gait Information

In recent years, the incidence of depression is rising rapidly worldwide, but large-scale depression screening is still challenging. Gait analysis provides a non-contact, low-cost, and efficient early screening method for depression. However, the early screening of depression based on gait analysis lacks sufficient effective sample data. In this paper, we propose a skeleton data augmentation method for assessing the risk of depression. First, we propose five techniques to augment skeleton data and apply them to depression and emotion datasets. Then, we divide augmentation methods into two types (non-noise augmentation and noise augmentation) based on the mutual information and the classification accuracy. Finally, we explore which augmentation strategies can capture the characteristics of human skeleton data more effectively. Experimental results show that the augmented training data set that retains more of the raw skeleton data properties determines the performance of the detection model. Specifically, rotation augmentation and channel mask augmentation make the depression detection accuracy reach 92.15% and 91.34%, respectively.
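The two augmentations named at the end of the abstract can be sketched as follows. The abstract does not define them precisely, so this assumes that rotation augmentation means rotating every joint about the vertical axis and that channel mask augmentation means zeroing one coordinate channel; the function names and the three-joint frame are hypothetical.

```python
import math


def rotate_about_y(frame, angle_rad):
    """Rotate every (x, y, z) joint about the vertical y axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [(c * x + s * z, y, -s * x + c * z) for x, y, z in frame]


def mask_channel(frame, channel):
    """Zero one coordinate channel (0 = x, 1 = y, 2 = z) for every joint."""
    return [
        tuple(0.0 if i == channel else value for i, value in enumerate(joint))
        for joint in frame
    ]


# Hypothetical single frame with three joints, in metres.
frame = [(0.0, 1.7, 0.0), (0.2, 1.4, 0.1), (-0.2, 1.4, -0.1)]
rotated = rotate_about_y(frame, math.radians(15))
masked = mask_channel(frame, channel=2)
```

A rotation keeps all inter-joint distances unchanged, so it preserves the raw skeleton geometry, whereas masking discards one channel entirely.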

preprint · 2022 · arXiv

EEG based Emotion Recognition: A Tutorial and Review

Emotion recognition technology through analyzing the EEG signal is currently an essential concept in Artificial Intelligence and holds great potential in emotional health care, human-computer interaction, multimedia content recommendation, etc. Though there have been several works devoted to reviewing EEG-based emotion recognition, the content of these reviews needs to be updated. In addition, those works are either fragmented in content or only focus on specific techniques adopted in this area but neglect the holistic perspective of the entire technical routes. Hence, in this paper, we review from the perspective of researchers who try to take the first step on this topic. We review the recent representative works in the EEG-based emotion recognition research and provide a tutorial to guide the researchers to start from the beginning. The scientific basis of EEG-based emotion recognition in the psychological and physiological levels is introduced. Further, we categorize these reviewed works into different technical routes and illustrate the theoretical basis and the research motivation, which will help the readers better understand why those techniques are studied and employed. At last, existing challenges and future investigations are also discussed in this paper, which guides the researchers to decide potential future research directions.

preprint · 2022 · arXiv

Energy Minimization for Federated Asynchronous Learning on Battery-Powered Mobile Devices via Application Co-running

Energy is an essential, but often forgotten aspect in large-scale federated systems. As most of the research focuses on tackling computational and statistical heterogeneity from the machine learning algorithms, the impact on the mobile system still remains unclear. In this paper, we design and implement an online optimization framework by connecting asynchronous execution of federated training with application co-running to minimize energy consumption on battery-powered mobile devices. From a series of experiments, we find that co-running the training process in the background with foreground applications gives the system a deep energy discount with negligible performance slowdown. Based on these results, we first study an offline problem assuming all the future occurrences of applications are available, and propose a dynamic programming-based algorithm. Then we propose an online algorithm using the Lyapunov framework to explore the solution space via the energy-staleness trade-off. The extensive experiments demonstrate that the online optimization framework can save over 60% energy with 3 times faster convergence speed compared to the previous schemes.

preprint · 2022 · arXiv

Exact Formulas for Finite-Time Estimation Errors of Decentralized Temporal Difference Learning with Linear Function Approximation

In this paper, we consider the policy evaluation problem in multi-agent reinforcement learning (MARL) and derive exact closed-form formulas for the finite-time mean-squared estimation errors of decentralized temporal difference (TD) learning with linear function approximation. Our analysis hinges upon the fact that the decentralized TD learning method can be viewed as a Markov jump linear system (MJLS). Then standard MJLS theory can be applied to quantify the mean and covariance matrix of the estimation error of the decentralized TD method at every time step. Various implications of our exact formulas on the algorithm performance are also discussed. An interesting finding is that under a necessary and sufficient stability condition, the mean-squared TD estimation error will converge to an exact limit at a specific exponential rate.

preprint · 2022 · arXiv

Forecast of observing time delay of the strongly lensed quasars with Muztagh-Ata 1.93m telescope

As a completely independent method, the measurement of the time delay of strongly lensed quasars (TDSL) is crucial to resolving the Hubble tension. Extensive monitoring is required but has so far been limited to a small sample of strongly lensed quasars. Together with several partner institutes, Beijing Normal University is constructing a 1.93m reflector telescope at the Muztagh-Ata site in west China, which has world-class observing conditions. The telescope will be equipped with both a three-channel imager/photometer that covers the $3500-11000$ Angstrom wavelength band and a low-medium resolution ($λ/δλ=500/2000/7500$) spectrograph. In this paper, we investigate the capability of the Muztagh-Ata 1.93m telescope in measuring time delays of strongly lensed quasars. We generate mock strongly lensed quasar systems and light curves with microlensing effects based on five known strongly lensed quasars, i.e., RX J1131-1231, HE 0435-1223, PG 1115+080, WFI 2033-4723 and SDSS 1206+4332. In particular, RX J1131-1231 is generated with lens modeling in this work. Owing to insufficient information, we simulate the other 4 systems with the public data without lens modeling. According to the simulations, for RX J1131-like systems (wide variation in time delay between images) the TDSL measurement can be achieved with a precision of about $Δt=0.5$ day with a campaign length of 4 seasons and a 1-day cadence. This accuracy is comparable to that of the upcoming TDCOSMO project, and it would improve with a longer campaign and a higher cadence. As a result, the capability of the Muztagh-Ata 1.93m telescope allows it to join the network of TDSL observatories. It will enrich the database of strongly lensed quasar observations and make more precise measurements of time delays, especially considering the unique coordinates of the site.

preprint · 2022 · arXiv

Forecasts on CMB lensing observations with AliCPT-1

AliCPT-1 is the first Chinese CMB experiment aiming for high precision measurement of Cosmic Microwave Background B-mode polarization. The telescope, currently under deployment in Tibet, will observe in two frequency bands centered at 90 and 150 GHz. We forecast the CMB lensing reconstruction, lensing-galaxy as well as lensing-CIB (Cosmic Infrared Background) cross correlation signal-to-noise ratio (SNR) for AliCPT-1. We consider two stages with different integrated observation time, namely "4 module*yr" (first stage) and "48 module*yr" (final stage). For lensing reconstruction, we use three different quadratic estimators, namely temperature-only, polarization-only and minimum-variance estimators, using curved sky geometry. We take into account the impact of inhomogeneous hit counts as well as of the mean-field bias due to incomplete sky coverage. In the first stage, our results show that the 150 GHz channel is able to measure the lensing signal at $15σ$ significance with the minimum-variance estimator. In the final stage, the measurement significance will increase to $31σ$. We also combine the data from the two frequency bands in the harmonic domain to optimize the SNR. Our results show that the coadding procedure can significantly reduce the reconstruction bias in the multipole range $l>800$. Thanks to the high quality of the polarization data in the final stage of AliCPT-1, the EB estimator will dominate the lensing reconstruction in this stage. We also estimate the SNR of cross-correlations between AliCPT-1 CMB lensing and other tracers of the large scale structure of the universe. For its cross-correlation with DESI galaxies/quasars, we report the cross-correlation SNR = 10-20 for the 4 redshift bins at $0.05<z<2.1$. In the first stage, the total SNR is about $32$. In the final stage, the lensing-galaxy cross-correlation can reach SNR = 52.

preprint · 2022 · arXiv

Model-Free $μ$ Synthesis via Adversarial Reinforcement Learning

Motivated by the recent empirical success of policy-based reinforcement learning (RL), there has been a research trend studying the performance of policy-based RL methods on standard control benchmark problems. In this paper, we examine the effectiveness of policy-based RL methods on an important robust control problem, namely $μ$ synthesis. We build a connection between robust adversarial RL and $μ$ synthesis, and develop a model-free version of the well-known $DK$-iteration for solving state-feedback $μ$ synthesis with static $D$-scaling. In the proposed algorithm, the $K$ step mimics the classical central path algorithm via incorporating a recently-developed double-loop adversarial RL method as a subroutine, and the $D$ step is based on model-free finite difference approximation. Extensive numerical study is also presented to demonstrate the utility of our proposed model-free algorithm. Our study sheds new light on the connections between adversarial RL and robust control.

preprint · 2022 · arXiv

Revisiting PGD Attacks for Stability Analysis of Large-Scale Nonlinear Systems and Perception-Based Control

Many existing region-of-attraction (ROA) analysis tools find difficulty in addressing feedback systems with large-scale neural network (NN) policies and/or high-dimensional sensing modalities such as cameras. In this paper, we tailor the projected gradient descent (PGD) attack method developed in the adversarial learning community as a general-purpose ROA analysis tool for large-scale nonlinear systems and end-to-end perception-based control. We show that the ROA analysis can be approximated as a constrained maximization problem whose goal is to find the worst-case initial condition which shifts the terminal state the most. Then we present two PGD-based iterative methods which can be used to solve the resultant constrained maximization problem. Our analysis is not based on Lyapunov theory, and hence requires minimum information of the problem structures. In the model-based setting, we show that the PGD updates can be efficiently performed using back-propagation. In the model-free setting (which is more relevant to ROA analysis of perception-based control), we propose a finite-difference PGD estimate which is general and only requires a black-box simulator for generating the trajectories of the closed-loop system given any initial state. We demonstrate the scalability and generality of our analysis tool on several numerical examples with large-scale NN policies and high-dimensional image observations. We believe that our proposed analysis serves as a meaningful initial step toward further understanding of closed-loop stability of large-scale nonlinear systems and perception-based control.

preprint · 2021 · arXiv

A comprehensive survey on smart contract construction and execution: paradigms, tools, and systems

Smart contracts are regarded as one of the most promising and appealing notions in blockchain technology. Their self-enforcing and event-driven features make some online activities possible without a trusted third party. Nevertheless, problems such as miscellaneous attacks, privacy leakage, and low processing rates prevent them from being widely applied. Various schemes and tools have been proposed to facilitate the construction and execution of secure smart contracts. However, a comprehensive survey for these proposals is absent, hindering new researchers and developers from a quick start. This paper surveys the literature and online resources on smart contract construction and execution over the period 2008-2020. We divide the studies into three categories: (1) design paradigms that give examples and patterns on contract construction, (2) design tools that facilitate the development of secure smart contracts, and (3) extensions and alternatives that improve the privacy or efficiency of the system. We start by grouping the relevant construction schemes into the first two categories. We then review the execution mechanisms in the last category and further divide the state-of-the-art solutions into three classes: private contracts with extra tools, off-chain channels, and extensions on core functionalities. Finally, we summarize several challenges and identify future research directions toward developing secure, privacy-preserving, and efficient smart contracts.

preprint · 2021 · arXiv

Building Blocks of Sharding Blockchain Systems: Concepts, Approaches, and Open Problems

Sharding is the prevalent approach to breaking the trilemma of simultaneously achieving decentralization, security, and scalability in traditional blockchain systems, which are implemented as replicated state machines relying on atomic broadcast for consensus on an immutable chain of valid transactions. Sharding is to be understood broadly as techniques for dynamically partitioning nodes in a blockchain system into subsets (shards) that perform storage, communication, and computation tasks without fine-grained synchronization with each other. Despite much recent research on sharding blockchains, much remains to be explored in the design space of these systems. Towards that aim, we conduct a systematic analysis of existing sharding blockchain systems and derive a conceptual decomposition of their architecture into functional components and the underlying assumptions about system models and attackers they are built on. The functional components identified are node selection, epoch randomness, node assignment, intra-shard consensus, cross-shard transaction processing, shard reconfiguration, and motivation mechanism. We describe interfaces, functionality, and properties of each component and show how they compose into a sharding blockchain system. For each component, we systematically review existing approaches, identify potential and open problems, and propose future research directions. We focus on potential security attacks and performance problems, including system throughput and latency concerns such as confirmation delays. We believe our modular architectural decomposition and in-depth analysis of each component, based on a comprehensive literature study, provides a systematic basis for conceptualizing state-of-the-art sharding blockchain systems, proving or improving security and performance properties of components, and developing new sharding blockchain system designs.

preprint · 2021 · arXiv

Derivative-Free Policy Optimization for Linear Risk-Sensitive and Robust Control Design: Implicit Regularization and Sample Complexity

Direct policy search serves as one of the workhorses in modern reinforcement learning (RL), and its applications in continuous control tasks have recently attracted increasing attention. In this work, we investigate the convergence theory of policy gradient (PG) methods for learning the linear risk-sensitive and robust controller. In particular, we develop PG methods that can be implemented in a derivative-free fashion by sampling system trajectories, and establish both global convergence and sample complexity results in the solutions of two fundamental settings in risk-sensitive and robust control: the finite-horizon linear exponential quadratic Gaussian, and the finite-horizon linear-quadratic disturbance attenuation problems. As a by-product, our results also provide the first sample complexity for the global convergence of PG methods on solving zero-sum linear-quadratic dynamic games, a nonconvex-nonconcave minimax optimization problem that serves as a baseline setting in multi-agent reinforcement learning (MARL) with continuous spaces. One feature of our algorithms is that during the learning phase, a certain level of robustness/risk-sensitivity of the controller is preserved, which we termed as the implicit regularization property, and is an essential requirement in safety-critical control systems.

preprint · 2021 · arXiv

Phase asymmetry ultrasound despeckling with fractional anisotropic diffusion and total variation

We propose an ultrasound speckle filtering method for not only preserving various edge features but also filtering tissue-dependent complex speckle noise in ultrasound images. The key idea is to detect these various edges using a phase congruence-based edge significance measure called phase asymmetry (PAS), which is invariant to the intensity amplitude of edges and takes 0 in non-edge smooth regions and 1 at the ideal step edge, while also taking intermediate values at slowly varying ramp edges. By leveraging the PAS metric in designing weighting coefficients to maintain a balance between fractional-order anisotropic diffusion and total variation (TV) filters in the TV cost function, we propose a new fractional TV framework to not only achieve the best despeckling performance with ramp edge preservation but also reduce the staircase effect produced by integral-order filters. Then, we exploit the PAS metric in designing a new fractional-order diffusion coefficient to properly preserve low-contrast edges in diffusion filtering. Finally, different from fixed fractional-order diffusion filters, an adaptive fractional order is introduced based on the PAS metric to enhance various weak edges in the spatially transitional areas between objects. The proposed fractional TV model is minimized using the gradient descent method to obtain the final denoised image. The experimental results and real application of ultrasound breast image segmentation show that the proposed method outperforms other state-of-the-art ultrasound despeckling filters for both speckle reduction and feature preservation in terms of visual evaluation and quantitative indices.

preprint · 2021 · arXiv

Policy Optimization for $\mathcal{H}_2$ Linear Control with $\mathcal{H}_\infty$ Robustness Guarantee: Implicit Regularization and Global Convergence

Policy optimization (PO) is a key ingredient for reinforcement learning (RL). For control design, certain constraints are usually enforced on the policies to optimize, accounting for either the stability, robustness, or safety concerns on the system. Hence, PO is by nature a constrained (nonconvex) optimization in most cases, whose global convergence is challenging to analyze in general. More importantly, some constraints that are safety-critical, e.g., the $\mathcal{H}_\infty$-norm constraint that guarantees the system robustness, are difficult to enforce as the PO methods proceed. Recently, policy gradient methods have been shown to converge to the global optimum of linear quadratic regulator (LQR), a classical optimal control problem, without regularizing/projecting the control iterates onto the stabilizing set, its (implicit) feasible set. This striking result is built upon the coercive property of the cost, ensuring that the iterates remain feasible as the cost decreases. In this paper, we study the convergence theory of PO for $\mathcal{H}_2$ linear control with $\mathcal{H}_\infty$-norm robustness guarantee. One significant new feature of this problem is the lack of coercivity, i.e., the cost may have finite value around the feasible set boundary, breaking the existing analysis for LQR. Interestingly, we show that two PO methods enjoy the implicit regularization property, i.e., the iterates preserve the $\mathcal{H}_\infty$ robustness constraint as if they are regularized by the algorithms. Furthermore, despite the nonconvexity of the problem, we show that these algorithms converge to the globally optimal policies with globally sublinear rates, avoiding all suboptimal stationary points/local minima, and with locally (super-)linear rates under certain conditions.

preprint · 2021 · arXiv

The design of the Ali CMB Polarization Telescope receiver

Ali CMB Polarization Telescope (AliCPT-1) is the first CMB degree-scale polarimeter to be deployed on the Tibetan plateau at 5,250m above sea level. AliCPT-1 is a 90/150 GHz 72 cm aperture, two-lens refracting telescope cooled down to 4 K. Alumina lenses, 800mm in diameter, image the CMB in a 33.4° field of view on a 636mm wide focal plane. The modularized focal plane consists of dichroic polarization-sensitive Transition-Edge Sensors (TESes). Each module includes 1,704 optically active TESes fabricated on a 150mm diameter silicon wafer. Each TES array is read out with a microwave multiplexing readout system capable of a multiplexing factor up to 2,048. Such a large multiplexing factor has allowed the practical deployment of tens of thousands of detectors, enabling the design of a receiver that can operate up to 19 TES arrays for a total of 32,376 TESes. AliCPT-1 leverages the technological advancements in the detector design from multiple generations of previously successful feedhorn-coupled polarimeters, and in the instrument design from BICEP-3, but applied on a larger scale. The cryostat receiver is currently under integration and testing. During the first deployment year, the focal plane will be populated with up to 4 TES arrays. Further TES arrays will be deployed in the following years, fully populating the focal plane with 19 arrays on the fourth deployment year. Here we present the AliCPT-1 receiver design, and how the design has been optimized to meet the experimental requirements.

preprint · 2020 · arXiv

A Novel Decision Tree for Depression Recognition in Speech

Depression is a common mental disorder worldwide which causes a range of serious outcomes. The diagnosis of depression relies on patient-reported scales and psychiatrist interviews, which may lead to subjective bias. In recent years, more and more researchers have devoted themselves to depression recognition in speech, which may be an effective and objective indicator. This study proposes a new speech segment fusion method based on a decision tree to improve depression recognition accuracy and conducts a validation on a sample of 52 subjects (23 depressed patients and 29 healthy controls). The recognition accuracies are 75.8% and 68.5% for males and females, respectively, on gender-dependent models. It can be concluded from the data that the proposed decision tree model can improve depression classification performance.

preprint · 2020 · arXiv

A Public Website for the Automated Assessment and Validation of SARS-CoV-2 Diagnostic PCR Assays

Summary: Polymerase chain reaction-based assays are the current gold standard for detecting and diagnosing SARS-CoV-2. However, as SARS-CoV-2 mutates, we need to constantly assess whether existing PCR-based assays will continue to detect all known viral strains. To enable the continuous monitoring of SARS-CoV-2 assays, we have developed a web-based assay validation algorithm that checks existing PCR-based assays against the ever-expanding genome databases for SARS-CoV-2 using both thermodynamic and edit-distance metrics. The assay screening results are displayed as a heatmap, showing the number of mismatches between each detection and each SARS-CoV-2 genome sequence. Using a mismatch threshold to define detection failure, assay performance is summarized with the true positive rate (recall) to simplify assay comparisons. Availability: https://covid19.edgebioinformatics.org/#/assayValidation. Contact: Jason Gans (jgans@lanl.gov) and Patrick Chain (pchain@lanl.gov)
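The screening logic in this summary (count mismatches per assay and genome, apply a mismatch threshold, report recall) can be sketched as below. This is a simplified illustration rather than the website's algorithm: it counts substitutions only against same-length genome windows, ignores insertions, deletions, and thermodynamics, and uses made-up sequences and function names.

```python
def min_mismatches(primer, genome):
    """Fewest substitutions between the primer and any same-length genome window.

    Assumes the genome is at least as long as the primer.
    """
    k = len(primer)
    return min(
        sum(a != b for a, b in zip(primer, genome[i:i + k]))
        for i in range(len(genome) - k + 1)
    )


def recall(primer, genomes, max_mismatches):
    """Fraction of genomes the primer still matches within the mismatch threshold."""
    detected = sum(min_mismatches(primer, g) <= max_mismatches for g in genomes)
    return detected / len(genomes)


# Hypothetical primer and genome fragments, not real SARS-CoV-2 sequences.
primer = "ACGTAC"
genomes = ["TTACGTACGG", "TTACGAACGG", "TTAGGAACTG"]

# One heatmap row: mismatch count of this primer against each genome.
heatmap_row = [min_mismatches(primer, g) for g in genomes]
true_positive_rate = recall(primer, genomes, max_mismatches=1)
```

With a threshold of one mismatch, the third fragment counts as a detection failure, so the recall for this toy primer is two out of three.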

preprint · 2020 · arXiv

A study of resting-state EEG biomarkers for depression recognition

Background: Depression has become a major health burden worldwide, and effective detection of depression is a great public-health challenge. This electroencephalography (EEG)-based research explores effective biomarkers for depression recognition. Methods: Resting-state EEG data were collected from 24 patients with major depressive disorder (MDD) and 29 normal controls using a 128-channel HydroCel Geodesic Sensor Net (HCGSN). To better identify depression, we extracted different types of EEG features, including linear features, nonlinear features, and the functional connectivity feature phase lagging index (PLI), to comprehensively analyze the EEG signals of patients with MDD. We then used different feature selection methods and classifiers to evaluate the optimal feature sets. Results: The functional connectivity feature PLI is superior to the linear and nonlinear features. When all types of features are combined to classify MDD patients, the highest classification accuracy, 82.31%, is obtained using the ReliefF feature selection method and a logistic regression (LR) classifier. Analysis of the distribution of the optimal feature set showed that intrahemispheric connection edges of PLI far outnumbered interhemispheric connection edges, and that the intrahemispheric connection edges differed significantly between the two groups. Conclusion: The functional connectivity feature PLI plays an important role in depression recognition. In particular, intrahemispheric connection edges of PLI might be an effective biomarker for identifying depression. Statistical results suggest that MDD patients might have dysfunction in the left hemisphere.

preprint · 2020 · arXiv

An adaptive finite element DtN method for the three-dimensional acoustic scattering problem

This paper is concerned with a numerical solution of the acoustic scattering by a bounded impenetrable obstacle in three dimensions. The obstacle scattering problem is formulated as a boundary value problem in a bounded domain by using a Dirichlet-to-Neumann (DtN) operator. An a posteriori error estimate is derived for the finite element method with the truncated DtN operator. The a posteriori error estimate consists of the finite element approximation error and the truncation error of the DtN operator, where the latter is shown to decay exponentially with respect to the truncation parameter. Based on the a posteriori error estimate, an adaptive finite element method is developed for the obstacle scattering problem. The truncation parameter is determined by the truncation error of the DtN operator and the mesh elements for local refinement are marked through the finite element approximation error. Numerical experiments are presented to demonstrate the effectiveness of the proposed method.

preprint 2020 arXiv

Analysis of Biased Stochastic Gradient Descent Using Sequential Semidefinite Programs

We present a convergence rate analysis for biased stochastic gradient descent (SGD), where individual gradient updates are corrupted by computation errors. We develop stochastic quadratic constraints to formulate a small linear matrix inequality (LMI) whose feasible points lead to convergence bounds of biased SGD. Based on this LMI condition, we develop a sequential minimization approach to analyze the intricate trade-offs that couple stepsize selection, convergence rate, optimization accuracy, and robustness to gradient inaccuracy. We also provide feasible points for this LMI and obtain theoretical formulas that quantify the convergence properties of biased SGD under various assumptions on the loss functions.
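
To make the setting concrete, here is a minimal toy sketch of gradient descent with corrupted gradients on the one-dimensional quadratic f(x) = 0.5 x^2. The error model (a constant bias plus bounded uniform noise), the function name and all parameter values are assumptions chosen for illustration. The sketch does not reproduce the LMI analysis described in the abstract.

```python
import random


def biased_sgd(x0, stepsize, bias, noise_width, steps, seed=0):
    """Run x <- x - stepsize * (grad f(x) + error) for f(x) = 0.5 * x**2.

    The error is a constant bias plus uniform noise in
    [-noise_width, noise_width] (an assumed, simplified error model).
    """
    rng = random.Random(seed)
    x = x0
    for _ in range(steps):
        true_grad = x  # gradient of 0.5 * x**2
        error = bias + rng.uniform(-noise_width, noise_width)
        x -= stepsize * (true_grad + error)
    return x


# Without noise the iterates settle at -bias instead of the true minimizer 0.
x_bias_only = biased_sgd(x0=5.0, stepsize=0.1, bias=0.3, noise_width=0.0, steps=500)

# With bounded noise the iterates stay within noise_width of that biased point.
x_noisy = biased_sgd(x0=5.0, stepsize=0.1, bias=0.3, noise_width=0.1, steps=500)
```

In this toy problem the bias shifts the limit point, and the noise keeps the iterates in a neighborhood of it. These are the kinds of accuracy and robustness effects that a convergence bound for biased SGD has to account for.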

preprint 2020 arXiv

Convergence Guarantees of Policy Optimization Methods for Markovian Jump Linear Systems

Recently, policy optimization for control purposes has received renewed attention due to the increasing interest in reinforcement learning. In this paper, we investigate the convergence of policy optimization for quadratic control of Markovian jump linear systems (MJLS). First, we study the optimization landscape of direct policy optimization for MJLS, and, in particular, show that despite the non-convexity of the resultant problem the unique stationary point is the global optimal solution. Next, we prove that the Gauss-Newton method and the natural policy gradient method converge to the optimal state feedback controller for MJLS at a linear rate if initialized at a controller which stabilizes the closed-loop dynamics in the mean square sense. We propose a novel Lyapunov argument to fix a key stability issue in the convergence proof. Finally, we present a numerical example to support our theory. Our work brings new insights for understanding the performance of policy learning methods on controlling unknown MJLS.

preprint 2020 arXiv

Depression Detection using Resting State Three-channel EEG Signal

In a universal environment, a patient-friendly, inexpensive method is needed to realize the early diagnosis of depression, which is believed to be an effective way to reduce the mortality of depression. This study collects EEG signals from only three electrodes (Fp1, Fpz and Fp2) and then uses linear and nonlinear EEG features to classify depression patients and healthy controls. The EEG recordings were carried out on a group of 18 medication-free depressive patients and 25 gender- and age-matched controls. The selected features include three linear features (maximum, mean and center values of the power) and three nonlinear features (correlation dimension, Renyi entropy and C0 complexity). The accuracy and effectiveness of the classification model for depressive and control subjects were calculated using leave-one-out cross-validation. The experimental results indicate that the selected three-channel EEG and features can distinguish depressed subjects from healthy controls, with a classification accuracy of 72.25%. It is hoped that these results can provide more choices for the early diagnosis of depression in a universal environment.
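
The evaluation protocol named in this abstract, leave-one-out cross-validation, can be sketched as follows. The nearest-centroid classifier, the function names and the six toy feature vectors are assumptions made for illustration. The abstract does not say which classifier was used, and the data below are not from the study.

```python
def nearest_centroid_predict(train_X, train_y, x):
    """Predict the label whose class centroid is closest to x (toy classifier)."""
    sums, counts = {}, {}
    for features, label in zip(train_X, train_y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    best_label, best_dist = None, float("inf")
    for label, acc in sums.items():
        centroid = [value / counts[label] for value in acc]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out_accuracy(X, y, predict=nearest_centroid_predict):
    """Hold out each sample once, train on the rest, and average the hits."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if predict(train_X, train_y, X[i]) == y[i]:
            correct += 1
    return correct / len(X)


# Toy two-feature data (illustrative only).
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [1.0, 0.9], [0.9, 1.1], [1.1, 1.0]]
y = ["control", "control", "control", "patient", "patient", "patient"]
accuracy = leave_one_out_accuracy(X, y)
```

With a cohort as small as the one described (43 subjects), leave-one-out uses almost all of the data for training in every fold, which is a common reason for choosing it over a single train/test split.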

preprint 2020 arXiv

Deterministic Intra-Vehicle Communications: Timing and Synchronization

As we power through to the future, the reliance of in-vehicle communications on speed is becoming a challenging predicament. This is mainly due to the ever-increasing number of electronic control units (ECUs), which will continue to drain network capacity, hence further increasing bandwidth demand. For a wired network, a tradeoff between bandwidth requirement, reliability, and cost-effectiveness has been our main motivation in developing a high-speed network architecture based on the integration of two time-triggered protocols, namely Time Triggered Ethernet (TT-E) and Time Triggered Controller Area Network (TT-CAN). Therefore, as a visible example of an Internet of Vehicles technology, we present a time-triggered communication-based network architecture. The new architecture can provide scalable integration of advanced functionalities while maintaining safety and high reliability. To comply with the bandwidth requirement, we consider high-speed TT-Ethernet as the main bus (i.e., backbone network), where sub-networks can use the more cost-effective and lower-bandwidth TT-CAN to communicate with other entities in the network via a gateway. The main challenge in the proposed network architecture has been to resolve interoperability between two entirely different time-triggered protocols, especially in terms of timing and synchronization. In this paper, we first explore the key drivers of the proposed architecture, which are bandwidth, reliability, and timeliness. We then demonstrate the effectiveness of our gateway design in providing full interoperability between the two time-triggered protocols.

preprint 2020 arXiv

New probe of gravity: strongly lensed gravitational wave multi-messenger approach

Strong gravitational lensing by galaxies provides us with a unique opportunity to understand the nature of gravity on galactic and extra-galactic scales. In this paper, we propose a new multimessenger approach that uses data from both gravitational waves (GW) and the corresponding electromagnetic (EM) counterpart to infer constraints on modified gravity (MG) theory, denoted by the scale-dependent phenomenological parameter. To demonstrate the robustness of this approach, we calculate the time-delay predictions for various values of the phenomenological parameters and then compare them with those from general relativity (GR). For a third-generation ground-based GW observatory, with one typical strongly lensed GW+EM event, and assuming that the dominant error from the stellar velocity dispersions is 5%, the GW time-delay data can distinguish an 18% MG effect on a scale of tens of kiloparsecs at a 68% confidence level. Assuming GR and a Singular Isothermal Sphere mass model, there exists a simplified consistency relationship between time-delay and imaging data. This relationship does not require a velocity dispersion measurement and hence can avoid major uncertainties. By using this relationship, the multimessenger approach is able to distinguish an 8% MG effect. Our results show that the GW multimessenger approach can play an important role in revealing the nature of gravity on galactic and extra-galactic scales.

preprint 2020 arXiv

Non-linear matter power spectrum without screening dynamics modelling in $f(R)$ gravity

The halo model is a physically intuitive method for modelling the non-linear power spectrum, especially for alternatives to the standard $Λ$CDM model. In this paper, we examine the Sheth-Tormen barrier formula adopted in the previous CHAM method (2018MNRAS.476L..65H). As an example, we model the ellipsoidal collapse of top-hat dark matter haloes in $f(R)$ gravity. Good agreement between the Sheth-Tormen formula and our result is achieved: the relative difference in the ellipsoidal collapse barrier is less than or equal to 1.6%. Furthermore, we verify that, for the F4 and F5 cases of Hu-Sawicki $f(R)$ gravity, the screening mechanism does not play a crucial role in the non-linear power spectrum modelling up to $k\sim1\,h/{\rm Mpc}$. We compare two versions of modified gravity modelling, namely with and without screening. We find that treating the effective Newton constant as a constant number ($G_{\rm eff}=4/3\,G_N$) is acceptable. The scale dependence of the gravitational coupling is sub-relevant. The resulting spectra in F4 and F5 are in 0.1% agreement with the previous CHAM results. The published code is accelerated significantly. Finally, we compare our halo model prediction with N-body simulation. We find that the general spectrum profiles agree qualitatively. However, with the halo model approach there is a systematic under-estimation of the matter power spectrum in the co-moving wavenumber range between $0.3\,h/{\rm Mpc}$ and $3\,h/{\rm Mpc}$. These scales overlap with the transition from the regime dominated by the two-halo term to the regime dominated by the one-halo term.

preprint 2020 arXiv

Policy Learning of MDPs with Mixed Continuous/Discrete Variables: A Case Study on Model-Free Control of Markovian Jump Systems

Markovian jump linear systems (MJLS) are an important class of dynamical systems that arise in many control applications. In this paper, we introduce the problem of controlling unknown (discrete-time) MJLS as a new benchmark for policy-based reinforcement learning of Markov decision processes (MDPs) with mixed continuous/discrete state variables. Compared with the traditional linear quadratic regulator (LQR), our proposed problem leads to a special hybrid MDP (with mixed continuous and discrete variables) and poses significant new challenges due to the appearance of an underlying Markov jump parameter governing the mode of the system dynamics. Specifically, the state of an MJLS does not form a Markov chain, and hence one cannot study the MJLS control problem as an MDP with solely continuous state variables. However, one can augment the state with the jump parameter to obtain an MDP with a mixed continuous/discrete state space. We discuss how control theory sheds light on the policy parameterization of such hybrid MDPs. Then we modify the widely used natural policy gradient method to directly learn the optimal state feedback control policy for MJLS without identifying either the system dynamics or the transition probability of the switching parameter. We implement the (data-driven) natural policy gradient method on different MJLS examples. Our simulation results suggest that the natural gradient method can efficiently learn the optimal controller for MJLS with unknown dynamics.

preprint 2020 arXiv

Self-Supervised Gait Encoding with Locality-Aware Attention for Person Re-Identification

Gait-based person re-identification (Re-ID) is valuable for safety-critical applications, and using only 3D skeleton data to extract discriminative gait features for person Re-ID is an emerging open topic. Existing methods either adopt hand-crafted features or learn gait features by traditional supervised learning paradigms. Unlike previous methods, we propose, for the first time, a generic gait encoding approach that can utilize unlabeled skeleton data to learn gait representations in a self-supervised manner. Specifically, we first propose to introduce self-supervision by learning to reconstruct input skeleton sequences in reverse order, which facilitates learning richer high-level semantics and better gait representations. Second, inspired by the fact that motion's continuity endows temporally adjacent skeletons with higher correlations ("locality"), we propose a locality-aware attention mechanism that encourages learning larger attention weights for temporally adjacent skeletons when reconstructing the current skeleton, so as to learn locality when encoding gait. Finally, we propose Attention-based Gait Encodings (AGEs), which are built using context vectors learned by locality-aware attention, as final gait representations. AGEs are directly utilized to realize effective person Re-ID. Our approach typically improves existing skeleton-based methods by 10-20% Rank-1 accuracy, and it achieves comparable or even superior performance to multi-modal methods with extra RGB or depth information. Our code is available at https://github.com/Kali-Hac/SGE-LA.

preprint 2020 arXiv

Service Ecosystem: A Lens of Smart Society

Intelligence services are playing an increasingly important role in the operation of our society. Exploring the evolution mechanism, boundaries and challenges of the service ecosystem is essential to our ability to realize a smart society, reap its benefits and prevent potential risks. We argue that this necessitates a broad scientific research agenda to study the service ecosystem, one that incorporates and expands upon the disciplines of computer science and includes insights from across the sciences. We first outline a set of research issues that are fundamental to this emerging field, and then explore the technical, social, legal and institutional challenges in the study of the service ecosystem.

preprint 2020 arXiv

The first simultaneous measurement of Hubble constant and post-Newtonian parameter from Time-Delay Strong Lensing

Strong gravitational lensing has been a powerful probe of cosmological models and gravity. To date, constraints in either domain have been obtained separately. We propose a new methodology through which the cosmological model, specifically the Hubble constant, and the post-Newtonian parameter can be simultaneously constrained. Using the time-delay cosmography from strong lensing combined with the stellar kinematics of the deflector lens, we demonstrate that the Hubble constant and post-Newtonian parameter are incorporated in two distance ratios, which reflect the lensing mass and dynamical mass, respectively. Through a reanalysis of the four publicly released lens distance posteriors from the H0LiCOW collaboration, simultaneous constraints on the Hubble constant and post-Newtonian parameter are obtained. Our results suggest no deviation from General Relativity, $γ_{\rm PPN}=0.87^{+0.19}_{-0.17}$, with a Hubble constant that favors the local Universe value, $H_0=73.65^{+1.95}_{-2.26}$ km s$^{-1}$ Mpc$^{-1}$. Finally, we forecast the robustness of gravity tests using time-delay strong lensing for the constraints we expect in the next few years. We find that the joint constraints from 40 lenses are able to reach the order of 7.7% for the post-Newtonian parameter and 1.4% for the Hubble constant.