Source author record

Han Yu

Han Yu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Catalog footprint

What is connected

46 works

25 topics

4 close collaborators

preprint, 2026, arXiv

Counting rationals and diophantine approximation in missing-digit Cantor sets

We establish a new upper bound for the number of rationals up to a given height in a missing-digit set, making progress towards a conjecture of Broderick, Fishman, and Reich. This enables us to make novel progress towards another conjecture of those authors about the corresponding intrinsic diophantine approximation problem. Moreover, we make further progress towards conjectures of Bugeaud--Durand and Levesley--Salp--Velani on the distribution of diophantine exponents in missing-digit sets. A key tool in our study is Fourier $\ell^1$ dimension introduced by the last named author in [H. Yu, Rational points near self-similar sets, arXiv:2101.05910]. An important technical contribution of the paper is a method to compute this quantity.
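As a small illustration of the counting problem, the sketch below takes the middle-third Cantor set (base 3 with the digit 1 missing) as the missing-digit set and counts the reduced fractions in [0, 1] whose denominator is at most a given bound. Treating the reduced denominator as the height, and the names `in_cantor_set` and `count_cantor_rationals`, are assumptions made for this example. It is a brute-force check, not the method of the paper.

```python
from fractions import Fraction
from math import gcd


def in_cantor_set(x):
    """Return True if the rational x lies in the middle-third Cantor set.

    Uses the self-similarity C = C/3 ∪ (C/3 + 2/3): a point of [0, 1/3] is
    mapped to 3x, a point of [2/3, 1] to 3x - 2, and a point strictly inside
    (1/3, 2/3) is rejected. The orbit of a rational is finite, so the loop
    stops once a value repeats.
    """
    if x < 0 or x > 1:
        return False
    seen = set()
    while x not in seen:
        seen.add(x)
        if 3 * x <= 1:
            x = 3 * x
        elif 3 * x >= 2:
            x = 3 * x - 2
        else:
            return False  # landed in the removed middle third
    return True


def count_cantor_rationals(max_height):
    """Count reduced fractions p/q in [0, 1] with q <= max_height in the set."""
    count = 0
    for q in range(1, max_height + 1):
        for p in range(0, q + 1):
            if gcd(p, q) == 1 and in_cantor_set(Fraction(p, q)):
                count += 1
    return count


print(count_cantor_rationals(4))  # 6: 0, 1, 1/3, 2/3, 1/4, 3/4
```

For instance, 1/4 = 0.020202... in base 3 belongs to the set, while 1/2 = 0.111... does not.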

preprint, 2026, arXiv

Federated Nested Learning: Collaborative Training of Self-Referential Memories for Test-Time Adaptation

We rethink Federated Learning (FL) from a nested learning perspective, framing the core challenge as how to collaboratively learn optimization rules, not just static models, to tackle Non-IID client data. To address this, we propose Federated Nested Learning (FedNL), a novel framework that reformulates FL as a three-level nested optimization system. FedNL embeds Titans-based linear attention into FL, enabling clients to perform lightweight, zero-shot test-time adaptation by treating a delta rule as an online gradient step. Experiments on Non-IID MMLU and long-context benchmarks show that FedNL achieves competitive performance in short-context reasoning, enhances the performance of long-context retrieval and streaming Cross-Entropy, and maintains constant inference memory.

preprint, 2023, arXiv

Evolutionary Carrier Selection for Shared Truck Delivery Services

With multiple carriers in a logistics market, customers can choose the best carrier to deliver their products and packages. In this paper, we present a novel approach of using the stochastic evolutionary game to analyze the decision-making of the customers using the less-than-truckload (LTL) delivery service. We propose inter-related optimization and game models that allow us to analyze the vehicle routing optimization for the LTL carriers and carrier selection for the customers, respectively. The stochastic evolutionary game model incorporates a small perturbation of customers' decision-making which exists due to irrationality. The solution of the stochastic evolutionary game in terms of stochastically stable states is characterized by using the Markov chain model. The numerical results show the impact of carriers' and customers' parameters on the stable states.

preprint, 2023, arXiv

Stochastic Qubit Resource Allocation for Quantum Cloud Computing

Quantum cloud computing is a promising paradigm for efficiently provisioning quantum resources (i.e., qubits) to users. In quantum cloud computing, quantum cloud providers provision quantum resources in reservation and on-demand plans for users. The cost of quantum resources in the reservation plan is expected to be lower than in the on-demand plan. However, quantum resources in the reservation plan have to be reserved in advance, without knowing the requirements of the quantum circuits beforehand, and consequently the reserved resources can be insufficient, i.e., under-reserved. Hence, quantum resources in the on-demand plan can be used to cover the unmet demand. To this end, we propose a quantum resource allocation scheme for the quantum cloud computing system in which quantum resources and the minimum waiting time of quantum circuits are jointly optimized. In particular, the objective is to minimize the total cost of quantum circuits under uncertainty in the qubit requirement and minimum waiting time of quantum circuits. In experiments, practical quantum Fourier transform circuits are used to evaluate the proposed qubit resource allocation. The results illustrate that the proposed qubit resource allocation can achieve the optimal total cost.
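The trade-off between the reservation and on-demand plans can be illustrated with a toy two-stage calculation: qubits are reserved before demand is known, and any shortfall is bought on demand afterwards. The prices, the demand scenarios and the function names below are hypothetical, and the sketch ignores the waiting-time component of the abstract, so it should be read as a simplified illustration rather than the authors' model.

```python
def expected_cost(reserved, scenarios, reserve_price, on_demand_price):
    """Expected total cost of reserving `reserved` qubits in advance.

    `scenarios` is a list of (demand, probability) pairs. The first stage pays
    for the reservation; the second stage pays the on-demand price for any
    qubits that the reservation does not cover.
    """
    first_stage = reserve_price * reserved
    second_stage = sum(
        prob * on_demand_price * max(0, demand - reserved)
        for demand, prob in scenarios
    )
    return first_stage + second_stage


def best_reservation(scenarios, reserve_price, on_demand_price):
    """Brute-force search over all reservation levels up to the largest demand."""
    max_demand = max(demand for demand, _ in scenarios)
    return min(
        range(max_demand + 1),
        key=lambda r: expected_cost(r, scenarios, reserve_price, on_demand_price),
    )


# Hypothetical numbers: reserved qubits cost 1 unit, on-demand qubits cost 3.
scenarios = [(2, 0.5), (6, 0.3), (10, 0.2)]
best = best_reservation(scenarios, reserve_price=1.0, on_demand_price=3.0)
print(best)  # 6
```

With these numbers, reserving for the worst case (10 qubits) costs 10 units, while reserving 6 and buying the rare shortfall on demand costs about 8.4 in expectation.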

preprint, 2022, arXiv

A note on dyadic approximation in Cantor's set

We consider the convergence theory for dyadic approximation in the middle-third Cantor set, $K$, for approximation functions of the form $ψ_τ(n) = n^{-τ}$ ($τ\ge 0$). In particular, we show that for values of $τ$ beyond a certain threshold we have that almost no point in $K$ is dyadically $ψ_τ$-well approximable with respect to the natural probability measure on $K$. This refines a previous result in this direction obtained by the first, third, and fourth named authors (arXiv, 2020).

preprint, 2022, arXiv

Adaptive Memory Networks with Self-supervised Learning for Unsupervised Anomaly Detection

Unsupervised anomaly detection aims to build models to effectively detect unseen anomalies by only training on the normal data. Although previous reconstruction-based methods have made fruitful progress, their generalization ability is limited due to two critical challenges. First, the training dataset only contains normal patterns, which limits the model generalization ability. Second, the feature representations learned by existing models often lack representativeness, which hampers the ability to preserve the diversity of normal patterns. In this paper, we propose a novel approach called Adaptive Memory Network with Self-supervised Learning (AMSL) to address these challenges and enhance the generalization ability in unsupervised anomaly detection. Based on the convolutional autoencoder structure, AMSL incorporates a self-supervised learning module to learn general normal patterns and an adaptive memory fusion module to learn rich feature representations. Experiments on four public multivariate time series datasets demonstrate that AMSL significantly improves the performance compared to other state-of-the-art methods. Specifically, on the largest CAP sleep stage detection dataset with 900 million samples, AMSL outperforms the second-best baseline by more than 4% in both accuracy and F1 score. Apart from the enhanced generalization ability, AMSL is also more robust against input noise.

preprint, 2022, arXiv

Adaptive Resource Allocation in Quantum Key Distribution (QKD) for Federated Learning

Increasing privacy and security concerns in intelligence-native 6G networks require quantum key distribution-secured federated learning (QKD-FL), in which data owners connected via quantum channels can train an FL global model collaboratively without exposing their local datasets. To facilitate QKD-FL, the architectural design and routing management framework are essential. However, effective implementation is still lacking. To this end, we propose a hierarchical architecture for QKD-FL systems in which QKD resources (i.e., wavelengths) and routing are jointly optimized for FL applications. In particular, we focus on adaptive QKD resource allocation and routing for FL workers to minimize the deployment cost of QKD nodes under various uncertainties, including security requirements. The experimental results show that the proposed architecture and the resource allocation and routing model can reduce the deployment cost by 7.72% compared to the CO-QBN algorithm.

preprint, 2022, arXiv

Bernoulli convolutions with Garsia parameters in $(1,\sqrt{2}]$ have continuous density functions

Let $λ\in (1,\sqrt{2}]$ be an algebraic integer with Mahler measure $2.$ A classical result of Garsia shows that the Bernoulli convolution $μ_λ$ is absolutely continuous with respect to the Lebesgue measure with a density function in $L^\infty$. In this paper, we show that the density function is continuous.

preprint, 2022, arXiv

Bias Reducing Multitask Learning on Mental Health Prediction

There has been an increase in research in developing machine learning models for mental health detection or prediction in recent years due to increased mental health issues in society. Effective use of mental health prediction or detection models can help mental health practitioners re-define mental illnesses more objectively than currently done, and identify illnesses at an earlier stage when interventions may be more effective. However, there is still a lack of standard in evaluating bias in such machine learning models in the field, which leads to challenges in providing reliable predictions and in addressing disparities. This lack of standards persists due to factors such as technical difficulties, complexities of high dimensional clinical health data, etc., which are especially true for physiological signals. This along with prior evidence of relations between some physiological signals with certain demographic identities restates the importance of exploring bias in mental health prediction models that utilize physiological signals. In this work, we aim to perform a fairness analysis and implement a multi-task learning based bias mitigation method on anxiety prediction models using ECG data. Our method is based on the idea of epistemic uncertainty and its relationship with model weights and feature space representation. Our analysis showed that our anxiety prediction base model introduced some bias with regards to age, income, ethnicity, and whether a participant is born in the U.S. or not, and our bias mitigation method performed better at reducing the bias in the model, when compared to the reweighting mitigation technique. Our analysis on feature importance also helped identify relationships between heart rate variability and multiple demographic groupings.

preprint, 2022, arXiv

Federated Learning for Personalized Humor Recognition

Computational understanding of humor is an important topic under creative language understanding and modeling. It can play a key role in complex human-AI interactions. The challenge here is that human perception of humorous content is highly subjective. The same joke may receive different funniness ratings from different readers. This makes it highly challenging for humor recognition models to achieve personalization in practical scenarios. Existing approaches are generally designed based on the assumption that users have a consensus on whether a given text is humorous or not. Thus, they cannot handle diverse humor preferences well. In this paper, we propose the FedHumor approach for the recognition of humorous content in a personalized manner through Federated Learning (FL). Extending a pre-trained language model, FedHumor guides the fine-tuning process by considering diverse distributions of humor preferences from individuals. It incorporates a diversity adaptation strategy into the FL paradigm to train a personalized humor recognition model. To the best of our knowledge, FedHumor is the first text-based personalized humor recognition model through federated learning. Extensive experiments demonstrate the advantage of FedHumor in recognizing humorous texts compared to nine state-of-the-art humor recognition approaches with superior capability for handling the diversity in humor labels produced by users with diverse preferences.

preprint, 2022, arXiv

Fractal projections with an application in number theory

In this paper, we discuss a connection between geometric measure theory and number theory. This method brings a new point of view for some number-theoretic problems concerning digit expansions. Among other results, we showed that for each integer $k,$ there is a number $M>0$ such that if $b_1,\dots,b_k$ are multiplicatively independent integers greater than $M$, there are infinitely many integers whose base $b_1,b_2,\dots,b_k$ expansions all do not have any zero digits.
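To make the digit condition concrete, the following sketch tests whether an integer has no zero digit in each of several bases at once. The helper names are hypothetical, and the bases 2 and 3 in the example are chosen only to keep the output short: the theorem concerns multiplicatively independent bases larger than some threshold $M$, which this example does not attempt to satisfy.

```python
def digits(n, base):
    """Digits of the non-negative integer n in the given base, most significant first."""
    if n < 0 or base < 2:
        raise ValueError("need n >= 0 and base >= 2")
    if n == 0:
        return [0]
    out = []
    while n > 0:
        n, remainder = divmod(n, base)
        out.append(remainder)
    return out[::-1]


def zero_free(n, bases):
    """True if the expansion of n has no zero digit in every one of the bases."""
    return all(0 not in digits(n, base) for base in bases)


def zero_free_up_to(limit, bases):
    """All integers in 1..limit that are zero-free in every base."""
    return [n for n in range(1, limit + 1) if zero_free(n, bases)]


print(zero_free_up_to(30, [2, 3]))  # [1, 7]
```

Here 7 is 111 in base 2 and 21 in base 3, so it qualifies, while 15 is 1111 in base 2 but 120 in base 3.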

preprint, 2022, arXiv

Heterogeneous Federated Learning via Grouped Sequential-to-Parallel Training

Federated learning (FL) is a rapidly growing privacy-preserving collaborative machine learning paradigm. In practical FL applications, local data from each data silo reflect local usage patterns. Therefore, there exists heterogeneity of data distributions among data owners (a.k.a. FL clients). If not handled properly, this can lead to model performance degradation. This challenge has inspired the research field of heterogeneous federated learning, which currently remains open. In this paper, we propose a data heterogeneity-robust FL approach, FedGSP, to address this challenge by leveraging on a novel concept of dynamic Sequential-to-Parallel (STP) collaborative training. FedGSP assigns FL clients to homogeneous groups to minimize the overall distribution divergence among groups, and increases the degree of parallelism by reassigning more groups in each round. It is also incorporated with a novel Inter-Cluster Grouping (ICG) algorithm to assist in group assignment, which uses the centroid equivalence theorem to simplify the NP-hard grouping problem to make it solvable. Extensive experiments have been conducted on the non-i.i.d. FEMNIST dataset. The results show that FedGSP improves the accuracy by 3.7% on average compared with seven state-of-the-art approaches, and reduces the training time and communication overhead by more than 90%.

preprint, 2022, arXiv

MarS-FL: Enabling Competitors to Collaborate in Federated Learning

Federated learning (FL) is rapidly gaining popularity and enables multiple data owners (a.k.a. FL participants) to collaboratively train machine learning models in a privacy-preserving way. A key unaddressed scenario is that these FL participants are in a competitive market, where market shares represent their competitiveness. Although they are interested in enhancing the performance of their respective models through FL, market leaders (who are often data owners who can contribute significantly to building high performance FL models) want to avoid losing their market shares by enhancing their competitors' models. Currently, there is no modeling tool to analyze such scenarios and support informed decision-making. In this paper, we bridge this gap by proposing the market share-based decision support framework for participation in FL (MarS-FL). We introduce two notions, $δ$-stable market and friendliness, to measure the viability of FL and the market acceptability of FL. The FL participants' behaviours (i.e., their optimal strategies concerning participation in FL) can then be predicted using game theoretic tools. If the market $δ$-stability is achievable, the final model performance improvement of each FL participant shall be bounded, which relates to the market conditions of FL applications. We provide tight bounds and quantify the friendliness, $κ$, of given market conditions to FL. Experimental results show the viability of FL in a wide range of market conditions. Our results are useful for identifying the market conditions under which collaborative FL model training is viable among competitors, and the requirements that have to be imposed while applying FL under these conditions.

preprint, 2022, arXiv

More to Less (M2L): Enhanced Health Recognition in the Wild with Reduced Modality of Wearable Sensors

Accurately recognizing health-related conditions from wearable data is crucial for improved healthcare outcomes. To improve the recognition accuracy, various approaches have focused on how to effectively fuse information from multiple sensors. Fusing multiple sensors is a common scenario in many applications, but may not always be feasible in real-world scenarios. For example, although combining bio-signals from multiple sensors (i.e., a chest pad sensor and a wrist wearable sensor) has been proved effective for improved performance, wearing multiple devices might be impractical in the free-living context. To solve the challenges, we propose an effective more to less (M2L) learning framework to improve testing performance with reduced sensors through leveraging the complementary information of multiple modalities during training. More specifically, different sensors may carry different but complementary information, and our model is designed to enforce collaborations among different modalities, where positive knowledge transfer is encouraged and negative knowledge transfer is suppressed, so that better representation is learned for individual modalities. Our experimental results show that our framework achieves comparable performance when compared with the full modalities. Our code and results will be available at https://github.com/compwell-org/More2Less.git.

preprint, 2022, arXiv

NICO++: Towards Better Benchmarking for Domain Generalization

Despite the remarkable performance that modern deep neural networks have achieved on independent and identically distributed (I.I.D.) data, they can crash under distribution shifts. Most current evaluation methods for domain generalization (DG) adopt the leave-one-out strategy as a compromise on the limited number of domains. We propose a large-scale benchmark with extensive labeled domains named NICO++ along with more rational evaluation methods for comprehensively evaluating DG algorithms. To evaluate DG datasets, we propose two metrics to quantify covariate shift and concept shift, respectively. Two novel generalization bounds from the perspective of data construction are proposed to prove that limited concept shift and significant covariate shift favor the evaluation capability for generalization. Through extensive experiments, NICO++ shows its superior evaluation capability compared with current DG datasets and its contribution in alleviating unfairness caused by the leak of oracle knowledge in model selection.

preprint, 2022, arXiv

On the metric theory of multiplicative Diophantine approximation

In 1962, Gallagher proved a higher-dimensional version of Khintchine's theorem on Diophantine approximation. Gallagher's theorem states that for any non-increasing approximation function $ψ:\mathbb{N}\to (0,1/2)$ with $\sum_{q=1}^{\infty} ψ(q)\log q=\infty$ and $γ=γ'=0$ the following set \[ \{(x,y)\in [0,1]^2: \|qx-γ\|\|qy-γ'\|<ψ(q) \text{ infinitely often}\} \] has full Lebesgue measure. Recently, Chow and Technau proved a fully inhomogeneous version (without restrictions on $γ,γ'$) of the above result. In this paper, we prove an Erdős-Vaaler type result for fibred multiplicative Diophantine approximation. Along the way, via a different method, we prove a slightly weaker version of Chow-Technau's theorem with the condition that at least one of $γ,γ'$ is not Liouville. We also extend Chow-Technau's result for fibred inhomogeneous Gallagher's theorem for Liouville fibres.

preprint, 2022, arXiv

Privacy and Robustness in Federated Learning: Attacks and Defenses

As data are increasingly being stored in different silos and societies are becoming more aware of data privacy issues, the traditional centralized training of artificial intelligence (AI) models is facing efficiency and privacy challenges. Recently, federated learning (FL) has emerged as an alternative solution and continues to thrive in this new reality. Existing FL protocol design has been shown to be vulnerable to adversaries within or outside of the system, compromising data privacy and system robustness. Besides training powerful global models, it is of paramount importance to design FL systems that have privacy guarantees and are resistant to different types of adversaries. In this paper, we conduct the first comprehensive survey on this topic. Through a concise introduction to the concept of FL, and a unique taxonomy covering: 1) threat models; 2) poisoning attacks and defenses against robustness; 3) inference attacks and defenses against privacy, we provide an accessible review of this important topic. We highlight the intuitions, key techniques as well as fundamental assumptions adopted by various attacks and defenses. Finally, we discuss promising future research directions towards robust and privacy-preserving federated learning.

preprint, 2022, arXiv

Resource Allocation in Quantum Key Distribution (QKD) for Space-Air-Ground Integrated Networks

Space-air-ground integrated networks (SAGIN) are one of the most promising advanced paradigms in the sixth generation (6G) communication. SAGIN can support high data rates, low latency, and seamless network coverage for interconnected applications and services. However, communications in SAGIN are facing tremendous security threats from the ever-increasing capacity of quantum computers. Fortunately, quantum key distribution (QKD) for establishing secure communications in SAGIN, i.e., QKD over SAGIN, can provide information-theoretic security. To minimize the QKD deployment cost in SAGIN with heterogeneous nodes, in this paper, we propose a resource allocation scheme for QKD over SAGIN using stochastic programming. The proposed scheme is formulated via two-stage stochastic programming (SP), while considering uncertainties such as security requirements and weather conditions. Under extensive experiments, the results clearly show that the proposed scheme can achieve the optimal deployment cost under various security requirements and unpredictable weather conditions.

preprint, 2022, arXiv

Robust Optimization with Decision-Dependent Information Discovery

Robust optimization is a popular paradigm for modeling and solving two- and multi-stage decision-making problems affected by uncertainty. In many real-world applications, the time of information discovery is decision-dependent and the uncertain parameters only become observable after an often costly investment. Yet, most of the literature assumes that uncertain parameters can be observed for free and that the sequence in which they are revealed is independent of the decision-maker's actions. To fill this gap, we consider two- and multi-stage robust optimization problems in which part of the decision variables control the time of information discovery. Thus, information can be discovered (at least in part) by making strategic exploratory investments in previous stages. We propose a novel dynamic formulation of the problem and prove its correctness. We leverage our model to provide a solution method inspired by the K-adaptability approximation. We reformulate the problem as a finite mixed-integer (resp. bilinear) program if none (resp. some) of the decision variables are real-valued. This finite program is solvable with off-the-shelf solvers. We generalize our approach to the minimization of piecewise linear convex functions. We demonstrate the effectiveness of our method in terms of interpretability, optimality, and speed on synthetic instances of the Pandora box problem, the preference elicitation problem, the best box problem, and the R&D project portfolio optimization problem. Finally, we evaluate it on an instance of the active preference elicitation problem used to recommend kidney allocation policies to policy-makers at the United Network for Organ Sharing.

preprint, 2022, arXiv

Semi-Supervised Learning and Data Augmentation in Wearable-based Momentary Stress Detection in the Wild

Physiological and behavioral data collected from wearable or mobile sensors have been used to estimate self-reported stress levels. Since the stress annotation usually relies on self-reports during the study, a limited amount of labeled data can be an obstacle in developing accurate and generalized stress predicting models. On the other hand, the sensors can continuously capture signals without annotations. This work investigates leveraging unlabeled wearable sensor data for stress detection in the wild. We first applied data augmentation techniques on the physiological and behavioral data to improve the robustness of supervised stress detection models. Using an auto-encoder with actively selected unlabeled sequences, we pre-trained the supervised model structure to leverage the information learned from unlabeled samples. Then, we developed a semi-supervised learning framework to leverage the unlabeled data sequences. We combined data augmentation techniques with consistency regularization, which enforces the consistency of prediction output based on augmented and original unlabeled data. We validated these methods using three wearable/mobile sensor datasets collected in the wild. Our results showed that combining the proposed methods improved stress classification performance by 7.7% to 13.8% on the evaluated datasets, compared to the baseline supervised learning models.

preprint, 2022, arXiv

Times two, three, five orbits on $\mathbb{T}^2$

In this paper, we study orbit closures under diagonal torus actions. We show that if $(x,y)\in\mathbb{T}^2$ is not contained in any rational lines, then its orbit under the $\times 2, \times 3, \times 5$ actions is dense in $\mathbb{T}^2.$

preprint, 2022, arXiv

Towards Personalized Federated Learning

In parallel with the rapid adoption of Artificial Intelligence (AI) empowered by advances in AI research, there have been growing awareness and concerns of data privacy. Recent significant developments in the data regulation landscape have prompted a seismic shift in interest towards privacy-preserving AI. This has contributed to the popularity of Federated Learning (FL), the leading paradigm for the training of machine learning models on data silos in a privacy-preserving manner. In this survey, we explore the domain of Personalized FL (PFL) to address the fundamental challenges of FL on heterogeneous data, a universal characteristic inherent in all real-world datasets. We analyze the key motivations for PFL and present a unique taxonomy of PFL techniques categorized according to the key challenges and personalization strategies in PFL. We highlight their key ideas, challenges and opportunities and envision promising future trajectories of research towards new PFL architectural design, realistic PFL benchmarking, and trustworthy PFL approaches.

preprint, 2022, arXiv

Towards Verifiable Federated Learning

Federated learning (FL) is an emerging paradigm of collaborative machine learning that preserves user privacy while building powerful models. Nevertheless, due to the nature of open participation by self-interested entities, it needs to guard against potential misbehaviours by legitimate FL participants. FL verification techniques are promising solutions for this problem. They have been shown to effectively enhance the reliability of FL networks and help build trust among participants. Verifiable federated learning has become an emerging topic of research that has attracted significant interest from the academia and the industry alike. Currently, there is no comprehensive survey on the field of verifiable federated learning, which is interdisciplinary in nature and can be challenging for researchers to enter into. In this paper, we bridge this gap by reviewing works focusing on verifiable FL. We propose a novel taxonomy for verifiable FL covering both centralised and decentralised FL settings, summarise the commonly adopted performance evaluation approaches, and discuss promising directions towards a versatile verifiable FL framework.

preprint, 2021, arXiv

Advances and Open Problems in Federated Learning

Federated learning (FL) is a machine learning setting where many clients (e.g. mobile devices or whole organizations) collaboratively train a model under the orchestration of a central server (e.g. service provider), while keeping the training data decentralized. FL embodies the principles of focused data collection and minimization, and can mitigate many of the systemic privacy risks and costs resulting from traditional, centralized machine learning and data science approaches. Motivated by the explosive growth in FL research, this paper discusses recent advances and presents an extensive collection of open problems and challenges.
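The setting described here, in which clients train locally and a central server only combines their results, can be sketched with one common aggregation rule: a sample-weighted average of client parameters (often called federated averaging). The abstract does not prescribe this rule, so the rule itself, the one-parameter linear model, the toy data and all function names below are assumptions made purely for illustration.

```python
def local_update(weights, samples, lr=0.05):
    """One gradient step on a client's own data for the model y ≈ w * x.

    The raw samples never leave this function; only the updated weight and
    the sample count are returned to the server.
    """
    w = weights[0]
    grad = sum((w * x - y) * x for x, y in samples) / len(samples)
    return [w - lr * grad], len(samples)


def aggregate(updates):
    """Server step: average client weights, weighted by their sample counts."""
    total = sum(count for _, count in updates)
    dim = len(updates[0][0])
    return [sum(w[i] * count for w, count in updates) / total for i in range(dim)]


def run_round(global_weights, client_datasets, lr=0.05):
    """One communication round: broadcast, local training, aggregation."""
    updates = [local_update(global_weights, data, lr) for data in client_datasets]
    return aggregate(updates)


# Two hypothetical clients whose private data follow y = 2x.
clients = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0)]]
weights = [0.0]
for _ in range(100):
    weights = run_round(weights, clients)
print(weights)  # close to [2.0]
```

The server in this sketch sees only weights and sample counts, which is the sense in which the training data stay decentralized.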

preprint, 2021, arXiv

Fourier decay of self-similar measures and self-similar sets of uniqueness

In this paper, we investigate the Fourier transform of self-similar measures on $\mathbb{R}$. We provide quantitative decay rates of the Fourier transform of some self-similar measures. Our method is based on random walks on lattices and Diophantine approximation in number fields. We also completely identify all self-similar sets which are sets of uniqueness. This generalizes a classical result of Salem and Zygmund.

preprint, 2021, arXiv

On the Hausdorff dimension of microsets

We investigate how the Hausdorff dimensions of microsets are related to the dimensions of the original set. It is known that the maximal dimension of a microset is the Assouad dimension of the set. We prove that the lower dimension can analogously be obtained as the minimal dimension of a microset. In particular, the maximum and minimum exist. We also show that for an arbitrary $\mathcal{F}_σ$ set $Δ\subseteq [0,d]$ containing its infimum and supremum there is a compact set in $[0,1]^d$ for which the set of Hausdorff dimensions attained by its microsets is exactly equal to the set $Δ$. Our work is motivated by the general programme of determining what geometric information about a set can be determined at the level of tangents.

preprint, 2021, arXiv

Rational points near self-similar sets

In this paper, we consider a problem of counting rational points near self-similar sets. Let $n\geq 1$ be an integer. We shall show that for some self-similar measures on $\mathbb{R}^n$, the set of rational points $\mathbb{Q}^n$ is 'equidistributed' in a sense that will be introduced in this paper. This implies that an inhomogeneous Khinchine convergence type result can be proved for those measures. In particular, for $n=1$ and large enough integers $p,$ the above holds for the middle-$p$th Cantor measure, i.e. the natural Hausdorff measure on the set of numbers whose base $p$ expansions do not have digit $[(p-1)/2].$ Furthermore, we partially proved a conjecture of Bugeaud and Durand for the middle-$p$th Cantor set and this also answers a question posed by Levesley, Salp and Velani. Our method includes a fine analysis of the Fourier coefficients of self-similar measures together with an Erdős-Kahane type argument. We will also provide a numerical argument to show that $p>10^7$ is sufficient for the above conclusions. In fact, $p\geq 15$ is already enough for most of the above conclusions.

preprint, 2020, arXiv

A Robust Spearman Correlation Coefficient Permutation Test

In this work, we show that Spearman's correlation coefficient test about $H_0:ρ_s=0$ found in most statistical software packages is theoretically incorrect and performs poorly when bivariate normality assumptions are not met or the sample size is small. The historical works about these tests make an unverifiable assumption that the approximate bivariate normality of original data justifies using classic approximations. In general, there is a common misconception that the tests about $ρ_s=0$ are robust to deviations from bivariate normality. In fact, we found that under certain scenarios a violation of the bivariate normality assumption has severe effects on type I error control for the most commonly utilized tests. To address this issue, we developed a robust permutation test for testing the general hypothesis $H_0: ρ_s=0$. The proposed test is based on an appropriately studentized statistic. We will show that the test is theoretically asymptotically valid in the general setting when two paired variables are uncorrelated but dependent. This desired property was demonstrated across a range of distributional assumptions and sample sizes in simulation studies, where the proposed test exhibits robust type I error control across a variety of settings, even when the sample size is small. We demonstrated the application of this test in real world examples of transcriptomic data of the TCGA breast cancer patients and a data set of PSA levels and age.
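For orientation, the sketch below shows the basic mechanics of a permutation test built on Spearman's coefficient: rank both samples, compute the correlation of the ranks, and compare it with the values obtained after shuffling one sample. This is the plain, non-studentized version and is explicitly not the robust test proposed in the abstract, whose studentized statistic is not reproduced here. All function names and defaults are assumptions for this example.

```python
import random


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's coefficient as the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided p-value from random permutations of the ranks of y."""
    rng = random.Random(seed)
    rank_x, rank_y = ranks(x), ranks(y)
    observed = abs(pearson(rank_x, rank_y))
    hits = 0
    for _ in range(n_perm):
        shuffled = rank_y[:]
        rng.shuffle(shuffled)
        if abs(pearson(rank_x, shuffled)) >= observed:
            hits += 1
    # The +1 terms count the observed arrangement itself.
    return (hits + 1) / (n_perm + 1)
```

A call such as `permutation_p_value(x, y)` returns a value in (0, 1]; the abstract's point is that this unstudentized form need not control type I error when the variables are uncorrelated but dependent.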

preprint2020arXiv

AliExpress Learning-To-Rank: Maximizing Online Model Performance without Going Online

Learning-to-rank (LTR) has become a key technology in E-commerce applications. Most existing LTR approaches follow a supervised learning paradigm using offline labeled data collected from the online system. However, it has been noticed that previous LTR models can perform well on offline validation data but poorly online, and vice versa, which implies a possibly large inconsistency between offline and online evaluation. We investigate and confirm in this paper that such inconsistency exists and can have a significant impact on AliExpress Search. Reasons for the inconsistency include the neglect of item context during learning and the insufficiency of the offline data set for learning that context. Therefore, this paper proposes an evaluator-generator framework for LTR with item context. The framework consists of an evaluator that generalizes to evaluate recommendations involving the context, a generator that maximizes the evaluator score by reinforcement learning, and a discriminator that ensures the generalization of the evaluator. Extensive experiments in simulation environments and the AliExpress Search online system show that, firstly, the classic data-based metrics on the offline dataset can show significant inconsistency with online performance, and can even be misleading. Secondly, the proposed evaluator score is significantly more consistent with the online performance than common ranking metrics. Finally, as a consequence, our method achieves a significant improvement ($>2\%$) in terms of Conversion Rate (CR) over the industrial-level fine-tuned model in online A/B tests.

preprint2020arXiv

Dyadic Approximation in the Middle-Third Cantor Set

In this paper, we study the metric theory of dyadic approximation in the middle-third Cantor set. This theory complements earlier work of Levesley, Salp, and Velani (2007), who investigated the problem of approximation in the Cantor set by triadic rationals. We find that dyadic approximation in the Cantor set behaves substantially differently from triadic approximation in the Cantor set. In some sense, this difference in behaviour is a manifestation of Furstenberg's times 2 times 3 phenomenon from dynamical systems, which asserts that the base 2 and base 3 expansions of a number are not both structured.

preprint2020arXiv

FedCoin: A Peer-to-Peer Payment System for Federated Learning

Federated learning (FL) is an emerging collaborative machine learning method for training models on distributed datasets with privacy concerns. To properly incentivize data owners to contribute their efforts, the Shapley Value (SV) is often adopted to fairly assess their contributions. However, the calculation of SVs is time-consuming and computationally costly. In this paper, we propose FedCoin, a blockchain-based peer-to-peer payment system for FL that enables a feasible SV-based profit distribution. In FedCoin, blockchain consensus entities calculate SVs and a new block is created based on the proof of Shapley (PoSap) protocol. This is in contrast to the popular Bitcoin network, where consensus entities "mine" new blocks by solving meaningless puzzles. Based on the computed SVs, a scheme for dividing the incentive payoffs among FL clients with nonrepudiation and tamper-resistance properties is proposed. Experimental results based on real-world data show that FedCoin can promote high-quality data from FL clients by accurately computing SVs with an upper bound on the computational resources required for reaching consensus. It opens opportunities for non-data owners to play a role in FL.
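To make the cost of Shapley Value computation concrete, the following sketch computes exact SVs by averaging each player's marginal contribution over every ordering of the players. It is a textbook definition, not FedCoin's PoSap protocol; the function name `shapley_values` and the toy utility functions in the usage note are hypothetical stand-ins for a real FL contribution measure.

```python
from itertools import permutations


def shapley_values(players, utility):
    """Exact Shapley values for a coalition game.

    `utility` maps a frozenset of players to a number. The loop visits all
    n! orderings, which is why exact computation is only feasible for a
    handful of players.
    """
    totals = {player: 0.0 for player in players}
    n_orders = 0
    for order in permutations(players):
        coalition = frozenset()
        previous = utility(coalition)
        for player in order:
            coalition = coalition | {player}
            current = utility(coalition)
            totals[player] += current - previous  # marginal contribution
            previous = current
        n_orders += 1
    return {player: total / n_orders for player, total in totals.items()}
```

As a usage example, in a three-player game where any coalition of at least two players earns 1, calling `shapley_values(["A", "B", "C"], lambda s: 1.0 if len(s) >= 2 else 0.0)` assigns each player one third. With an additive utility, each player simply receives its own weight. In both cases the values sum to the utility of the full coalition.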

preprint2020arXiv

FedVision: An Online Visual Object Detection Platform Powered by Federated Learning

Visual object detection is a computer vision-based artificial intelligence (AI) technique which has many practical applications (e.g., fire hazard monitoring). However, due to privacy concerns and the high cost of transmitting video data, it is highly challenging to build object detection models on centrally stored large training datasets following the current approach. Federated learning (FL) is a promising approach to resolve this challenge. Nevertheless, there is currently no easy-to-use tool that enables computer vision application developers who are not experts in federated learning to conveniently leverage this technology and apply it in their systems. In this paper, we report FedVision - a machine learning engineering platform to support the development of federated learning powered computer vision applications. The platform has been deployed through a collaboration between WeBank and Extreme Vision to help customers develop computer vision-based safety monitoring solutions in smart city applications. Over four months of usage, it has achieved significant efficiency improvement and cost reduction while removing the need to transmit sensitive data for three major corporate customers. To the best of our knowledge, this is the first real application of FL in computer vision-based tasks.

preprint2020arXiv

FOCUS: Dealing with Label Quality Disparity in Federated Learning

Ubiquitous systems with End-Edge-Cloud architecture are increasingly being used in healthcare applications. Federated Learning (FL) is highly useful for such applications because of the data silo effect and the need to preserve privacy. Existing FL approaches generally do not account for disparities in the quality of local data labels. However, the clients in ubiquitous systems tend to suffer from label noise due to the varying skill levels, biases or malicious tampering of the annotators. In this paper, we propose Federated Opportunistic Computing for Ubiquitous Systems (FOCUS) to address this challenge. It maintains a small set of benchmark samples on the FL server and quantifies the credibility of the client local data without directly observing them, by computing the mutual cross-entropy between the performance of the FL model on the local datasets and that of the client local FL model on the benchmark dataset. Then, a credit-weighted orchestration is performed to adjust the weight assigned to clients in the FL model based on their credibility values. FOCUS has been experimentally evaluated on both synthetic data and real-world data. The results show that it effectively identifies clients with noisy labels and reduces their impact on the model performance, thereby significantly outperforming existing FL approaches.

preprint2020arXiv

Multi-Participant Multi-Class Vertical Federated Learning

Federated learning (FL) is a privacy-preserving paradigm for training collective machine learning models with locally stored data from multiple participants. Vertical federated learning (VFL) deals with the case where participants share the same sample ID space but have different feature spaces, while label information is owned by one participant. Current studies of VFL only support two participants, and mostly focus on binary-class logistic regression problems. In this paper, we propose the Multi-participant Multi-class Vertical Federated Learning (MMVFL) framework for multi-class VFL problems involving multiple parties. Extending the idea of multi-view learning (MVL), MMVFL enables label sharing from its owner to other VFL participants in a privacy-preserving manner. To demonstrate the effectiveness of MMVFL, a feature selection scheme is incorporated into MMVFL to compare its performance against supervised feature selection and MVL-based approaches. Experimental results on real-world datasets show that MMVFL can effectively share label information among multiple VFL participants and match the multi-class classification performance of existing approaches.

preprint2020arXiv

Optimal Procurement Auction for Cooperative Production of Virtual Products: Vickrey-Clarke-Groves Meet Cremer-McLean

We set up a supply-side game-theoretic model for the cooperative production of virtual products. In our model, a group of producers collaboratively produce a virtual product by contributing costly input resources to a production coalition. Producers are capacitated, i.e., they cannot contribute more resources than their capacity limits. Our model is an abstraction of emerging internet-based business models such as federated learning and crowd computing. To maintain an efficient and stable production coalition, the coordinator should share with producers the income brought by the virtual product. Besides the demand-side information asymmetry, two further sources of supply-side information asymmetry are intertwined in this problem: 1) the capacity limit of each producer and 2) the cost incurred by each producer. In this paper, we rigorously prove that a supply-side mechanism from the VCG family, PVCG, can overcome these multiple information asymmetries and guarantee truthfulness. Furthermore, under some reasonable assumptions, PVCG simultaneously attains truthfulness, ex-post allocative efficiency, ex-post individual rationality, and ex-post weak budget balancedness on the supply side, easing the well-known tension between these four objectives in the mechanism design literature.

preprint2020arXiv

SCNet: A Neural Network for Automated Side-Channel Attack

The side-channel attack is an attack method based on information gained from the implementation of a computer system, rather than on weaknesses in its algorithms. Information about system characteristics such as power consumption, electromagnetic leaks and sound can be exploited by a side-channel attack to compromise the system. Much research effort has been directed towards this field. However, such an attack still requires strong skills and thus can only be performed effectively by experts. Here, we propose SCNet, which automatically performs side-channel attacks. We design the network by combining side-channel domain knowledge with different deep learning models to improve performance and to better explain the results. The results show that our model achieves good performance with fewer parameters. The proposed model is a useful tool for automatically testing the robustness of computer systems.

preprint2020arXiv

Threats to Federated Learning: A Survey

With the emergence of data silos and popular privacy awareness, the traditional centralized approach of training artificial intelligence (AI) models is facing strong challenges. Federated learning (FL) has recently emerged as a promising solution under this new reality. Existing FL protocol designs have been shown to exhibit vulnerabilities which can be exploited by adversaries both within and outside the system to compromise data privacy. It is thus of paramount importance to make FL system designers aware of the implications of future FL algorithm design on privacy preservation. Currently, there is no survey on this topic. In this paper, we bridge this important gap in the FL literature. By providing a concise introduction to the concept of FL, and a unique taxonomy covering threat models and two major attacks on FL: 1) poisoning attacks and 2) inference attacks, this paper provides an accessible review of this important topic. We highlight the intuitions, key techniques and fundamental assumptions adopted by various attacks, and discuss promising future research directions towards more robust privacy preservation in FL.

preprint2020arXiv

Towards Fair and Privacy-Preserving Federated Deep Models

The current standalone deep learning framework tends to result in overfitting and low utility. This problem can be addressed by either a centralized framework that deploys a central server to train a global model on the joint data from all parties, or a distributed framework that leverages a parameter server to aggregate local model updates. Server-based solutions are prone to the problem of a single point of failure. In this respect, collaborative learning frameworks, such as federated learning (FL), are more robust. Existing federated learning frameworks overlook an important aspect of participation: fairness. All parties are given the same final model without regard to their contributions. To address these issues, we propose a decentralized Fair and Privacy-Preserving Deep Learning (FPPDL) framework to incorporate fairness into federated deep learning models. In particular, we design a local credibility mutual evaluation mechanism to guarantee fairness, and a three-layer onion-style encryption scheme to guarantee both accuracy and privacy. Unlike the existing FL paradigm, under FPPDL each participant receives a different version of the FL model with performance commensurate with their contributions. Experiments on benchmark datasets demonstrate that FPPDL balances fairness, privacy and accuracy. It enables federated learning ecosystems to detect and isolate low-contribution parties, thereby promoting responsible participation.

preprint2016arXiv

A Survey on Artificial Intelligence and Data Mining for MOOCs

Massive Open Online Courses (MOOCs) have gained tremendous popularity in the last few years. Thanks to MOOCs, millions of learners from all over the world have taken thousands of high-quality courses for free. Putting together an excellent MOOC ecosystem is a multidisciplinary endeavour that requires contributions from many different fields. Artificial intelligence (AI) and data mining (DM) are two such fields that have played a significant role in making MOOCs what they are today. By exploiting the vast amount of data generated by learners engaging in MOOCs, DM improves our understanding of the MOOC ecosystem and enables MOOC practitioners to deliver better courses. Similarly, AI, supported by DM, can greatly improve student experience and learning outcomes. In this survey paper, we first review the state-of-the-art artificial intelligence and data mining research applied to MOOCs, emphasising the use of AI and DM tools and techniques to improve student engagement, learning outcomes, and our understanding of the MOOC ecosystem. We then offer an overview of key trends and important research to carry out in the fields of AI and DM so that MOOCs can reach their full potential.

preprint2016arXiv

Building Robust Crowdsourcing Systems with Reputation-aware Decision Support Techniques

Crowdsourcing refers to the arrangement in which contributions are solicited from a large group of unrelated people. Due to this nature, crowdsourcers (or task requesters) often face uncertainty about the workers' capabilities which, in turn, affects the quality and timeliness of the results obtained. Trust is a mechanism used by people to facilitate interactions in human societies where risk and uncertainty are common. The crucial challenge in building a robust crowdsourcing system is how to make trust-aware task delegation decisions that efficiently utilize the capacities of workers (or trustee agents) to achieve high social welfare. This book presents the research addressing this challenge. It goes beyond the existing trust management research framework by removing a widespread assumption implicitly adopted by existing research: that a trustee agent can process an unlimited number of interaction requests per discrete time unit without compromising its performance as perceived by the task requesters (or truster agents). Decision support in crowdsourcing is re-formalized as a multi-agent trust game based on the principles of the Congestion Game, which is solved by two trust-aware interaction decision-making approaches: 1) the Social Welfare Optimizing approach for Reputation-aware Decision-making (SWORD), and 2) the Distributed Request Acceptance approach for Fair utilization of Trustee agents (DRAFT). SWORD is designed for centralized systems, while DRAFT is designed for fully distributed systems. Theoretical analyses have shown that the social welfare produced by these two approaches can be made closer to optimal by adjusting only one key parameter. With these two approaches, the framework of research for crowdsourcing systems can be enriched to handle more realistic scenarios where workers have varied and limited capabilities.

preprint2016arXiv

Surveillance Video Parsing with Single Frame Supervision

Surveillance video parsing, which segments the video frames into several labels, e.g., face, pants, left-leg, has wide applications. However, annotating all frames pixel by pixel is tedious and inefficient. In this paper, we develop a Single frame Video Parsing (SVP) method which requires only one labeled frame per video in the training stage. To parse one particular frame, the video segment preceding the frame is jointly considered. SVP (1) roughly parses the frames within the video segment, (2) estimates the optical flow between frames and (3) fuses the rough parsing results warped by optical flow to produce the refined parsing result. The three components of SVP, namely frame parsing, optical flow estimation and temporal fusion, are integrated in an end-to-end manner. Experimental results on two surveillance video datasets show the superiority of SVP over state-of-the-art methods.

preprint2014arXiv

An Empirical Analysis of Task Allocation in Scrum-based Agile Programming

Agile Software Development (ASD) methodology has become widely used in the industry. Understanding the challenges facing software engineering students is important for designing effective training methods that equip students with the skills required for effectively using ASD techniques. Existing empirical research has mostly focused on eXtreme Programming (XP) based ASD methodologies. There is a lack of empirical studies about Scrum-based ASD programming, which has become the most popular agile methodology among industry practitioners. In this paper, we present empirical findings regarding the aspects of task allocation decision-making, collaboration, and team morale related to the Scrum ASD process, which have not yet been well studied by existing research. We draw our findings from a 12-week coursework project in 2014 involving 125 undergraduate software engineering students from a renowned university working in 21 Scrum teams. Instead of the traditional survey or interview based methods, which suffer from limitations in scale and level of detail, we obtain fine-grained data through logging students' activities in our online agile project management (APM) platform - HASE. During this study, the platform logged over 10,000 ASD activities. Deviating from existing preconceptions, our results suggest negative correlations between collaboration and team performance as well as team morale.

preprint2014arXiv

An Evolutionary Approach for Optimizing Hierarchical Multi-Agent System Organization

It has been widely recognized that the performance of a multi-agent system is highly affected by its organization. A large-scale system may have billions of possible ways of organization, which makes it impractical to find an optimal choice of organization using exhaustive search methods. In this paper, we propose a genetic algorithm aided optimization scheme for designing hierarchical structures of multi-agent systems. We introduce a novel algorithm, called the hierarchical genetic algorithm, in which hierarchical crossover with a repair strategy and mutation of small perturbation are used. The phenotypic hierarchical structure space is translated to the genome-like array representation space, which makes the algorithm genetic-operator-literate. A case study with 10 scenarios of a hierarchical information retrieval model is provided. Our experiments have shown that competitive baseline structures which lead to the optimal organization in terms of utility can be found by the proposed algorithm during the evolutionary search. Compared with the traditional genetic operators, the newly introduced operators produced better organizations of higher utility more consistently in a variety of test cases. The proposed algorithm extends the search processes of the state-of-the-art multi-agent organization design methodologies, and is more computationally efficient in a large search space.

preprint2014arXiv

Designing Socially Intelligent Virtual Companions

Virtual companions that interact with users in a socially complex environment require a wide range of social skills. Displaying curiosity is simultaneously a factor to improve a companion's believability and to unobtrusively affect the user's activities over time. Curiosity represents a drive to know new things. It is a major driving force for engaging learners in active learning. Existing research work pays little attention to curiosity. In this paper, we enrich the social skills of a virtual companion by infusing curiosity into its mental model. A curious companion residing in a Virtual Learning Environment (VLE) to stimulate users' curiosity is proposed. The curious companion model is developed based on multidisciplinary considerations. The effectiveness of the curious companion is demonstrated by a preliminary field study.

preprint2014arXiv

Identifying Talented Software Engineering Students through Data-driven Skill Assessment

For software development companies, one of the most important objectives is to identify and acquire talented software engineers in order to maintain a skilled team that can produce competitive products. Traditional approaches for finding talented young software engineers mainly rely on programming contests of various forms, which mostly test participants' programming skills. However, successful software engineering in practice requires a wider range of skills from team members, including analysis, design, programming, testing, communication, collaboration, and self-management. In this paper, we explore potential ways to identify talented software engineering students in a data-driven manner through an Agile Project Management (APM) platform. Through our proposed HASE online APM tool, we conducted a study involving 21 Scrum teams consisting of over 100 undergraduate software engineering students in multi-week coursework projects in 2014. During this study, students performed over 10,000 ASD activities logged by HASE. We demonstrate the feasibility and potential of this new research direction, and discuss its implications for software engineering education and industry recruitment.

preprint2013arXiv

A Fast local Reconstruction algorithm by selective backprojection for Low-Dose in Dental Computed Tomography

High radiation dose in computed tomography (CT) scans increases the lifetime risk of cancer, which has become a major clinical concern. The backprojection-filtration (BPF) algorithm could reduce radiation dose by reconstructing images from truncated data in a short scan. In dental CT, it could reduce radiation dose for the teeth by using the projection acquired in a short scan, and could avoid irradiation of other parts by using truncated projection. However, the limit of integration for backprojection varies per PI-line, resulting in low calculation efficiency and poor parallel performance. Recently, a tent BPF (T-BPF) has been proposed to improve calculation efficiency by rearranging the projection. However, it includes a memory-consuming data rebinning process. Accordingly, the chose-BPF (C-BPF) algorithm is proposed in this paper. In this algorithm, the derivative of the projection is backprojected to the points whose x coordinate is less than that of the source focal spot to obtain the differentiated backprojection (DBP). The finite Hilbert inverse is then applied to each PI-line segment. C-BPF avoids the influence of the variable limit of integration by selective backprojection without additional time cost or memory cost. The simulation experiment and the real experiment demonstrated the higher reconstruction efficiency of C-BPF.

