Research connected to "machine learning"

Search papers, authors, topics, institutions and opportunities, then move straight into the graph around the result.

FiltersOptional

Search results

Showing works 1,313-1,344 from 49,008 works in Machine Learning. Use pages to browse more, or open the graph for the map.

49,008matching works
Full topic scaleMachine Learning

49,008 works and 109,744 authors are indexed for this topic. This page shows 32 works at a time so search stays fast.

Match modeExact match focus
Semantic hits0
Active filters0
Graph viewOpen

Papers

preprint2020arXiv

Learning Goal-Oriented Visual Dialog via Tempered Policy Gradient

Learning goal-oriented dialogues by means of deep reinforcement learning has recently become a popular research topic. However, commonly used policy-based dialogue agents often end up focusing on simple utterances and suboptimal policies. To mitigate this problem, we propose a class of novel temperature-based extensions for policy gradient methods, which are referred to as Tempered Policy Gradients (TPGs). On a recent AI-testbed, i.e., the GuessWhat?! game, we achieve significant improvements with two innovations. The first one is an extension of the state-of-the-art solutions with Seq2Seq and Memory Network structures that leads to an improvement of 7%. The second one is the application of our newly developed TPG methods, which improves the performance additionally by around 5% and, even more importantly, helps produce more convincing utterances.

preprint2020arXiv

Learning from Imperfect Annotations

Many machine learning systems today are trained on large amounts of human-annotated data. Data annotation tasks that require a high level of competency make data acquisition expensive, while the resulting labels are often subjective, inconsistent, and may contain a variety of human biases. To improve the data quality, practitioners often need to collect multiple annotations per example and aggregate them before training models. Such a multi-stage approach results in redundant annotations and may often produce imperfect "ground truth" that may limit the potential of training accurate machine learning models. We propose a new end-to-end framework that enables us to: (i) merge the aggregation step with model training, thus allowing deep learning systems to learn to predict ground truth estimates directly from the available data, and (ii) model difficulties of examples and learn representations of the annotators that allow us to estimate and take into account their competencies. Our approach is general and has many applications, including training more accurate models on crowdsourced data, ensemble learning, as well as classifier accuracy estimation from unlabeled data. We cond

preprint2022arXiv

Examining Uniqueness and Permanence of the WAY EEG GAL dataset toward User Authentication

This study evaluates the discriminating capacity (uniqueness) of the EEG data from the WAY EEG GAL public dataset to authenticate individuals against one another as well as its permanence. In addition to the EEG data, Luciw et al. provide EMG (Electromyography), and kinematics data for engineers and researchers to utilize WAY EEG GAL for further studies. However, evaluating the EMG and kinematics data is outside the scope of this study. The goal of the state-of-the-art is to determine whether EEG data can be utilized to control prosthetic devices. On the other hand, this study aims to evaluate the separability of individuals through EEG data to perform user authentication. A feature importance algorithm is utilized to select the best features for each user to authenticate them against all others. The authentication platform implemented for this study is based on Machine Learning models/classifiers. As an initial test, two pilot studies are performed using Linear Discriminant Analysis (LDA) and Support Vector Machine (SVM) to observe the learning trends of the models by multi-labeling the EEG dataset. Utilizing kNN first as the classifier for user authentication, accuracy around 75% is observed. Thereafter to improve the performance both linear and non-linear SVMs are used to perform classification. The overall average accuracies of 85.18% and 86.92% are achieved using linear and non-linear SVMs respectively. In addition to accuracy, F1 scores are also calculated. The overall average F1 score of 87.51% and 88.94% are achieved for linear and non-linear SVMs respectively. Beyond the overall performance, high performing individuals with 95.3% accuracy (95.3% F1 score) using linear SVM and 97.4% accuracy (97.3% F1 score) using non-linear SVM are also observed.

preprint2022arXiv

Measuring Fairness of Text Classifiers via Prediction Sensitivity

With the rapid growth in language processing applications, fairness has emerged as an important consideration in data-driven solutions. Although various fairness definitions have been explored in the recent literature, there is lack of consensus on which metrics most accurately reflect the fairness of a system. In this work, we propose a new formulation : ACCUMULATED PREDICTION SENSITIVITY, which measures fairness in machine learning models based on the model's prediction sensitivity to perturbations in input features. The metric attempts to quantify the extent to which a single prediction depends on a protected attribute, where the protected attribute encodes the membership status of an individual in a protected group. We show that the metric can be theoretically linked with a specific notion of group fairness (statistical parity) and individual fairness. It also correlates well with humans' perception of fairness. We conduct experiments on two text classification datasets : JIGSAW TOXICITY, and BIAS IN BIOS, and evaluate the correlations between metrics and manual annotations on whether the model produced a fair outcome. We observe that the proposed fairness metric based on prediction sensitivity is statistically significantly more correlated with human annotation than the existing counterfactual fairness metric.

preprint2022arXiv

Combining Diverse Feature Priors

To improve model generalization, model designers often restrict the features that their models use, either implicitly or explicitly. In this work, we explore the design space of leveraging such feature priors by viewing them as distinct perspectives on the data. Specifically, we find that models trained with diverse sets of feature priors have less overlapping failure modes, and can thus be combined more effectively. Moreover, we demonstrate that jointly training such models on additional (unlabeled) data allows them to correct each other's mistakes, which, in turn, leads to better generalization and resilience to spurious correlations. Code available at https://github.com/MadryLab/copriors

preprint2022arXiv

Local Latin Hypercube Refinement for Multi-objective Design Uncertainty Optimization

Optimizing the reliability and the robustness of a design is important but often unaffordable due to high sample requirements. Surrogate models based on statistical and machine learning methods are used to increase the sample efficiency. However, for higher dimensional or multi-modal systems, surrogate models may also require a large amount of samples to achieve good results. We propose a sequential sampling strategy for the surrogate based solution of multi-objective reliability based robust design optimization problems. Proposed local Latin hypercube refinement (LoLHR) strategy is model-agnostic and can be combined with any surrogate model because there is no free lunch but possibly a budget one. The proposed method is compared to stationary sampling as well as other proposed strategies from the literature. Gaussian process and support vector regression are both used as surrogate models. Empirical evidence is presented, showing that LoLHR achieves on average better results compared to other surrogate based strategies on the tested examples.

preprint2022arXiv

Sampling Permutations for Shapley Value Estimation

Game-theoretic attribution techniques based on Shapley values are used to interpret black-box machine learning models, but their exact calculation is generally NP-hard, requiring approximation methods for non-trivial models. As the computation of Shapley values can be expressed as a summation over a set of permutations, a common approach is to sample a subset of these permutations for approximation. Unfortunately, standard Monte Carlo sampling methods can exhibit slow convergence, and more sophisticated quasi-Monte Carlo methods have not yet been applied to the space of permutations. To address this, we investigate new approaches based on two classes of approximation methods and compare them empirically. First, we demonstrate quadrature techniques in a RKHS containing functions of permutations, using the Mallows kernel in combination with kernel herding and sequential Bayesian quadrature. The RKHS perspective also leads to quasi-Monte Carlo type error bounds, with a tractable discrepancy measure defined on permutations. Second, we exploit connections between the hypersphere $\mathbb{S}^{d-2}$ and permutations to create practical algorithms for generating permutation samples with good properties. Experiments show the above techniques provide significant improvements for Shapley value estimates over existing methods, converging to a smaller RMSE in the same number of model evaluations.

preprint2022arXiv

Deep Capsule Encoder-Decoder Network for Surrogate Modeling and Uncertainty Quantification

We propose a novel \textit{capsule} based deep encoder-decoder model for surrogate modeling and uncertainty quantification of systems in mechanics from sparse data. The proposed framework is developed by adapting Capsule Network (CapsNet) architecture into image-to-image regression encoder-decoder network. Specifically, the aim is to exploit the benefits of CapsNet over convolution neural network (CNN) $-$ retaining pose and position information related to an entity to name a few. The performance of proposed approach is illustrated by solving an elliptic stochastic partial differential equation (SPDE), which also governs systems in mechanics such as steady heat conduction, ground water flow or other diffusion processes, based uncertainty quantification problem with an input dimensionality of $1024$. However, the problem definition does not the restrict the random diffusion field to a particular covariance structure, and the more strenuous task of response prediction for an arbitrary diffusion field is solved. The obtained results from performance evaluation indicate that the proposed approach is accurate, efficient, and robust.

preprint2025arXiv

Can machines think efficiently?

The Turing Test is no longer adequate for distinguishing human and machine intelligence. With advanced artificial intelligence systems already passing the original Turing Test and contributing to serious ethical and environmental concerns, we urgently need to update the test. This work expands upon the original imitation game by accounting for an additional factor: the energy spent answering the questions. By adding the constraint of energy, the new test forces us to evaluate intelligence through the lens of efficiency, connecting the abstract problem of thinking to the concrete reality of finite resources. Further, this proposed new test ensures the evaluation of intelligence has a measurable, practical finish line that the original test lacks. This additional constraint compels society to weigh the time savings of using artificial intelligence against its total resource cost.

preprint2012arXiv

Nonparametric Bayesian Estimation of Periodic Functions

Many real world problems exhibit patterns that have periodic behavior. For example, in astrophysics, periodic variable stars play a pivotal role in understanding our universe. An important step when analyzing data from such processes is the problem of identifying the period: estimating the period of a periodic function based on noisy observations made at irregularly spaced time points. This problem is still a difficult challenge despite extensive study in different disciplines. The paper makes several contributions toward solving this problem. First, we present a nonparametric Bayesian model for period finding, based on Gaussian Processes (GP), that does not make strong assumptions on the shape of the periodic function. As our experiments demonstrate, the new model leads to significantly better results in period estimation when the target function is non-sinusoidal. Second, we develop a new algorithm for parameter optimization for GP which is useful when the likelihood function is very sensitive to the setting of the hyper-parameters with numerous local minima, as in the case of period estimation. The algorithm combines gradient optimization with grid search and incorporates severa

preprint2019arXiv

Multiview Representation Learning for a Union of Subspaces

Canonical correlation analysis (CCA) is a popular technique for learning representations that are maximally correlated across multiple views in data. In this paper, we extend the CCA based framework for learning a multiview mixture model. We show that the proposed model and a set of simple heuristics yield improvements over standard CCA, as measured in terms of performance on downstream tasks. Our experimental results show that our correlation-based objective meaningfully generalizes the CCA objective to a mixture of CCA models.

preprint2023arXiv

MessageNet: Message Classification using Natural Language Processing and Meta-data

In this paper we propose a new Deep Learning (DL) approach for message classification. Our method is based on the state-of-the-art Natural Language Processing (NLP) building blocks, combined with a novel technique for infusing the meta-data input that is typically available in messages such as the sender information, timestamps, attached image, audio, affiliations, and more. As we demonstrate throughout the paper, going beyond the mere text by leveraging all available channels in the message, could yield an improved representation and higher classification accuracy. To achieve message representation, each type of input is processed in a dedicated block in the neural network architecture that is suitable for the data type. Such an implementation enables training all blocks together simultaneously, and forming cross channels features in the network. We show in the Experiments Section that in some cases, message's meta-data holds an additional information that cannot be extracted just from the text, and when using this information we achieve better performance. Furthermore, we demonstrate that our multi-modality block approach outperforms other approaches for injecting the meta data to the the text classifier.

preprint2022arXiv

Incident duration prediction using a bi-level machine learning framework with outlier removal and intra-extra joint optimisation

Predicting the duration of traffic incidents is a challenging task due to the stochastic nature of events. The ability to accurately predict how long accidents will last can provide significant benefits to both end-users in their route choice and traffic operation managers in handling of non-recurrent traffic congestion. This paper presents a novel bi-level machine learning framework enhanced with outlier removal and intra-extra joint optimisation for predicting the incident duration on three heterogeneous data sets collected for both arterial roads and motorways from Sydney, Australia and San-Francisco, U.S.A. Firstly, we use incident data logs to develop a binary classification prediction approach, which allows us to classify traffic incidents as short-term or long-term. We find the optimal threshold between short-term versus long-term traffic incident duration, targeting both class balance and prediction performance while also comparing the binary versus multi-class classification approaches. Secondly, for more granularity of the incident duration prediction to the minute level, we propose a new Intra-Extra Joint Optimisation algorithm (IEO-ML) which extends multiple baseline ML models tested against several regression scenarios across the data sets. Final results indicate that: a) 40-45 min is the best split threshold for identifying short versus long-term incidents and that these incidents should be modelled separately, b) our proposed IEO-ML approach significantly outperforms baseline ML models in $66\%$ of all cases showcasing its great potential for accurate incident duration prediction. Lastly, we evaluate the feature importance and show that time, location, incident type, incident reporting source and weather at among the top 10 critical factors which influence how long incidents will last.

preprint2022arXiv

Fractal Gaussian Networks: A sparse random graph model based on Gaussian Multiplicative Chaos

We propose a novel stochastic network model, called Fractal Gaussian Network (FGN), that embodies well-defined and analytically tractable fractal structures. Such fractal structures have been empirically observed in diverse applications. FGNs interpolate continuously between the popular purely random geometric graphs (a.k.a. the Poisson Boolean network), and random graphs with increasingly fractal behavior. In fact, they form a parametric family of sparse random geometric graphs that are parametrized by a fractality parameter which governs the strength of the fractal structure. FGNs are driven by the latent spatial geometry of Gaussian Multiplicative Chaos (GMC), a canonical model of fractality in its own right. We asymptotically characterize the expected number of edges, triangles, cliques and hub-and-spoke motifs in FGNs, unveiling a distinct pattern in their scaling with the size parameter of the network. We then examine the natural question of detecting the presence of fractality and the problem of parameter estimation based on observed network data, in addition to fundamental properties of the FGN as a random graph model. We also explore fractality in community structures by unveiling a natural stochastic block model in the setting of FGNs. Finally, we substantiate our results with phenomenological analysis of the FGN in the context of available scientific literature for fractality in networks, including applications to real-world massive network data.

preprint2022arXiv

Imbedding Deep Neural Networks

Continuous-depth neural networks, such as Neural ODEs, have refashioned the understanding of residual neural networks in terms of non-linear vector-valued optimal control problems. The common solution is to use the adjoint sensitivity method to replicate a forward-backward pass optimisation problem. We propose a new approach which explicates the network's `depth' as a fundamental variable, thus reducing the problem to a system of forward-facing initial value problems. This new method is based on the principle of `Invariant Imbedding' for which we prove a general solution, applicable to all non-linear, vector-valued optimal control problems with both running and terminal loss. Our new architectures provide a tangible tool for inspecting the theoretical--and to a great extent unexplained--properties of network depth. They also constitute a resource of discrete implementations of Neural ODEs comparable to classes of imbedded residual neural networks. Through a series of experiments, we show the competitive performance of the proposed architectures for supervised learning and time series prediction.

preprint2022arXiv

Enabling Reproducibility and Meta-learning Through a Lifelong Database of Experiments (LDE)

Artificial Intelligence (AI) development is inherently iterative and experimental. Over the course of normal development, especially with the advent of automated AI, hundreds or thousands of experiments are generated and are often lost or never examined again. There is a lost opportunity to document these experiments and learn from them at scale, but the complexity of tracking and reproducing these experiments is often prohibitive to data scientists. We present the Lifelong Database of Experiments (LDE) that automatically extracts and stores linked metadata from experiment artifacts and provides features to reproduce these artifacts and perform meta-learning across them. We store context from multiple stages of the AI development lifecycle including datasets, pipelines, how each is configured, and training runs with information about their runtime environment. The standardized nature of the stored metadata allows for querying and aggregation, especially in terms of ranking artifacts by performance metrics. We exhibit the capabilities of the LDE by reproducing an existing meta-learning study and storing the reproduced metadata in our system. Then, we perform two experiments on this metadata: 1) examining the reproducibility and variability of the performance metrics and 2) implementing a number of meta-learning algorithms on top of the data and examining how variability in experimental results impacts recommendation performance. The experimental results suggest significant variation in performance, especially depending on dataset configurations; this variation carries over when meta-learning is built on top of the results, with performance improving when using aggregated results. This suggests that a system that automatically collects and aggregates results such as the LDE not only assists in implementing meta-learning but may also improve its performance.

preprint2026arXiv

Tight Sample Complexity Bounds for Entropic Best Policy Identification

We study best-policy identification for finite-horizon risk-sensitive reinforcement learning under the entropic risk measure. Recent work established a constant gap in the exponential horizon dependence between lower and upper bounds on the number of samples required to identify an approximately optimal policy. Precisely, known lower bounds scale in $Ω(e^{|β| H})$ where $H$ is the horizon of the MDP, while the state-of-the-art upper bound achieves at best $O(e^{2|β| H})$ (arXiv:2506.00286v2) using a generative model. We show that this extra exponential factor can be traced to overly loose concentration control for exponential utilities. To close this open gap, we revisit the analysis of this problem through a forward-model based algorithm building on KL-based exploration bonuses that we adapt to the entropic criterion. The improvement we get is due to two main novel technical innovations. We leverage the smoothness properties of the exponential utility to derive sharper concentration bounds, and we propose a new stopping rule that exploits further this tightness to obtain a sample complexity that matches the lower bound.

preprint2022arXiv

A Survey on Active Deep Learning: From Model-driven to Data-driven

Which samples should be labelled in a large data set is one of the most important problems for trainingof deep learning. So far, a variety of active sample selection strategies related to deep learning havebeen proposed in many literatures. We defined them as Active Deep Learning (ADL) only if theirpredictor is deep model, where the basic learner is called as predictor and the labeling schemes iscalled selector. In this survey, three fundamental factors in selector designation were summarized. Wecategory ADL into model-driven ADL and data-driven ADL, by whether its selector is model-drivenor data-driven. The different characteristics of the two major type of ADL were addressed in indetail respectively. Furthermore, different sub-classes of data-driven and model-driven ADL are alsosummarized and discussed emphatically. The advantages and disadvantages between data-driven ADLand model-driven ADL are thoroughly analyzed. We pointed out that, with the development of deeplearning, the selector in ADL also is experiencing the stage from model-driven to data-driven. Finally,we make discussion on ADL about its uncertainty, explanatory, foundations of cognitive science etc.and survey on the trend of ADL from model-driven to data-driven.

preprint2022arXiv

Ginex: SSD-enabled Billion-scale Graph Neural Network Training on a Single Machine via Provably Optimal In-memory Caching

Recently, Graph Neural Networks (GNNs) have been receiving a spotlight as a powerful tool that can effectively serve various inference tasks on graph structured data. As the size of real-world graphs continues to scale, the GNN training system faces a scalability challenge. Distributed training is a popular approach to address this challenge by scaling out CPU nodes. However, not much attention has been paid to disk-based GNN training, which can scale up the single-node system in a more cost-effective manner by leveraging high-performance storage devices like NVMe SSDs. We observe that the data movement between the main memory and the disk is the primary bottleneck in the SSD-based training system, and that the conventional GNN training pipeline is sub-optimal without taking this overhead into account. Thus, we propose Ginex, the first SSD-based GNN training system that can process billion-scale graph datasets on a single machine. Inspired by the inspector-executor execution model in compiler optimization, Ginex restructures the GNN training pipeline by separating sample and gather stages. This separation enables Ginex to realize a provably optimal replacement algorithm, known as Belady's algorithm, for caching feature vectors in memory, which account for the dominant portion of I/O accesses. According to our evaluation with four billion-scale graph datasets, Ginex achieves 2.11x higher training throughput on average (up to 2.67x at maximum) than the SSD-extended PyTorch Geometric.

preprint2015arXiv

Beta diffusion trees and hierarchical feature allocations

We define the beta diffusion tree, a random tree structure with a set of leaves that defines a collection of overlapping subsets of objects, known as a feature allocation. A generative process for the tree structure is defined in terms of particles (representing the objects) diffusing in some continuous space, analogously to the Dirichlet diffusion tree (Neal, 2003), which defines a tree structure over partitions (i.e., non-overlapping subsets) of the objects. Unlike in the Dirichlet diffusion tree, multiple copies of a particle may exist and diffuse along multiple branches in the beta diffusion tree, and an object may therefore belong to multiple subsets of particles. We demonstrate how to build a hierarchically-clustered factor analysis model with the beta diffusion tree and how to perform inference over the random tree structures with a Markov chain Monte Carlo algorithm. We conclude with several numerical experiments on missing data problems with data sets of gene expression microarrays, international development statistics, and intranational socioeconomic measurements.

preprint2015arXiv

An Empirical Study of the L2-Boost technique with Echo State Networks

A particular case of Recurrent Neural Network (RNN) was introduced at the beginning of the 2000s under the name of Echo State Networks (ESNs). The ESN model overcomes the limitations during the training of the RNNs while introducing no significant disadvantages. Although the model presents some well-identified drawbacks when the parameters are not well initialised. The performance of an ESN is highly dependent on its internal parameters and pattern of connectivity of the hidden-hidden weights Often, the tuning of the network parameters can be hard and can impact in the accuracy of the models. In this work, we investigate the performance of a specific boosting technique (called L2-Boost) with ESNs as single predictors. The L2-Boost technique has been shown to be an effective tool to combine "weak" predictors in regression problems. In this study, we use an ensemble of random initialized ESNs (without control their parameters) as "weak" predictors of the boosting procedure. We evaluate our approach on five well-know time-series benchmark problems. Additionally, we compare this technique with a baseline approach that consists of averaging the prediction of an ensemble

preprint2020arXiv

Clustering via torque balance with mass and distance

Grouping similar objects is a fundamental tool of scientific analysis, ubiquitous in disciplines from biology and chemistry to astronomy and pattern recognition. Inspired by the torque balance that exists in gravitational interactions when galaxies merge, we propose a novel clustering method based on two natural properties of the universe: mass and distance. The concept of torque describing the interactions of mass and distance forms the basis of the proposed parameter-free clustering algorithm, which harnesses torque balance to recognize any cluster, regardless of shape, size, or density. The gravitational interactions govern the merger process, while the concept of torque balance reveals partitions that do not conform to the natural order for removal. Experiments on benchmark data sets show the enormous versatility of the proposed algorithm.

preprint2015arXiv

Optimum Statistical Estimation with Strategic Data Sources

We propose an optimum mechanism for providing monetary incentives to the data sources of a statistical estimator such as linear regression, so that high quality data is provided at low cost, in the sense that the sum of payments and estimation error is minimized. The mechanism applies to a broad range of estimators, including linear and polynomial regression, kernel regression, and, under some additional assumptions, ridge regression. It also generalizes to several objectives, including minimizing estimation error subject to budget constraints. Besides our concrete results for regression problems, we contribute a mechanism design framework through which to design and analyze statistical estimators whose examples are supplied by workers with cost for labeling said examples.

preprint2022arXiv

L$_0$onie: Compressing COINs with L$_0$-constraints

Advances in Implicit Neural Representations (INR) have motivated research on domain-agnostic compression techniques. These methods train a neural network to approximate an object, and then store the weights of the trained model. For example, given an image, a network is trained to learn the mapping from pixel locations to RGB values. In this paper, we propose L$_0$onie, a sparsity-constrained extension of the COIN compression method. Sparsity allows to leverage the faster learning of overparameterized networks, while retaining the desirable compression rate of smaller models. Moreover, our constrained formulation ensures that the final model respects a pre-determined compression rate, dispensing of the need for expensive architecture search.

preprint2020arXiv

Plan2Vec: Unsupervised Representation Learning by Latent Plans

In this paper we introduce plan2vec, an unsupervised representation learning approach that is inspired by reinforcement learning. Plan2vec constructs a weighted graph on an image dataset using near-neighbor distances, and then extrapolates this local metric to a global embedding by distilling path-integral over planned path. When applied to control, plan2vec offers a way to learn goal-conditioned value estimates that are accurate over long horizons that is both compute and sample efficient. We demonstrate the effectiveness of plan2vec on one simulated and two challenging real-world image datasets. Experimental results show that plan2vec successfully amortizes the planning cost, enabling reactive planning that is linear in memory and computation complexity rather than exhaustive over the entire state space.

preprint2021arXiv

Incorporating Interpretable Output Constraints in Bayesian Neural Networks

Domains where supervised models are deployed often come with task-specific constraints, such as prior expert knowledge on the ground-truth function, or desiderata like safety and fairness. We introduce a novel probabilistic framework for reasoning with such constraints and formulate a prior that enables us to effectively incorporate them into Bayesian neural networks (BNNs), including a variant that can be amortized over tasks. The resulting Output-Constrained BNN (OC-BNN) is fully consistent with the Bayesian framework for uncertainty quantification and is amenable to black-box inference. Unlike typical BNN inference in uninterpretable parameter space, OC-BNNs widen the range of functional knowledge that can be incorporated, especially for model users without expertise in machine learning. We demonstrate the efficacy of OC-BNNs on real-world datasets, spanning multiple domains such as healthcare, criminal justice, and credit scoring.

preprint2021arXiv

Learning to Schedule DAG Tasks

Scheduling computational tasks represented by directed acyclic graphs (DAGs) is challenging because of its complexity. Conventional scheduling algorithms rely heavily on simple heuristics such as shortest job first (SJF) and critical path (CP), and are often lacking in scheduling quality. In this paper, we present a novel learning-based approach to scheduling DAG tasks. The algorithm employs a reinforcement learning agent to iteratively add directed edges to the DAG, one at a time, to enforce ordering (i.e., priorities of execution and resource allocation) of "tricky" job nodes. By doing so, the original DAG scheduling problem is dramatically reduced to a much simpler proxy problem, on which heuristic scheduling algorithms such as SJF and CP can be efficiently improved. Our approach can be easily applied to any existing heuristic scheduling algorithms. On the benchmark dataset of TPC-H, we show that our learning based approach can significantly improve over popular heuristic algorithms and consistently achieves the best performance among several methods under a variety of settings.

preprint2014arXiv

The complexity of learning halfspaces using generalized linear methods

Many popular learning algorithms (E.g. Regression, Fourier-Transform based algorithms, Kernel SVM and Kernel ridge regression) operate by reducing the problem to a convex optimization problem over a vector space of functions. These methods offer the currently best approach to several central problems such as learning half spaces and learning DNF's. In addition they are widely used in numerous application domains. Despite their importance, there are still very few proof techniques to show limits on the power of these algorithms. We study the performance of this approach in the problem of (agnostically and improperly) learning halfspaces with margin $γ$. Let $\mathcal{D}$ be a distribution over labeled examples. The $γ$-margin error of a hyperplane $h$ is the probability of an example to fall on the wrong side of $h$ or at a distance $\leγ$ from it. The $γ$-margin error of the best $h$ is denoted $\mathrm{Err}_γ(\mathcal{D})$. An $α(γ)$-approximation algorithm receives $γ,ε$ as input and, using i.i.d. samples of $\mathcal{D}$, outputs a classifier with error rate $\le α(γ)\mathrm{Err}_γ(\mathcal{D}) + ε$. Such an algorithm is efficient if it uses $\mathrm{poly}(\frac{1}γ,\frac{1}

preprint2013arXiv

A scalable stage-wise approach to large-margin multi-class loss based boosting

We present a scalable and effective classification model to train multi-class boosting for multi-class classification problems. Shen and Hao introduced a direct formulation of multi- class boosting in the sense that it directly maximizes the multi- class margin [C. Shen and Z. Hao, "A direct formulation for totally-corrective multi- class boosting", in Proc. IEEE Conf. Comp. Vis. Patt. Recogn., 2011]. The major problem of their approach is its high computational complexity for training, which hampers its application on real-world problems. In this work, we propose a scalable and simple stage-wise multi-class boosting method, which also directly maximizes the multi-class margin. Our approach of- fers a few advantages: 1) it is simple and computationally efficient to train. The approach can speed up the training time by more than two orders of magnitude without sacrificing the classification accuracy. 2) Like traditional AdaBoost, it is less sensitive to the choice of parameters and empirically demonstrates excellent generalization performance. Experimental results on challenging multi-class machine learning and vision tasks demonstrate that the proposed approach substantiall

preprint2020arXiv

PCGRL: Procedural Content Generation via Reinforcement Learning

We investigate how reinforcement learning can be used to train level-designing agents. This represents a new approach to procedural content generation in games, where level design is framed as a game, and the content generator itself is learned. By seeing the design problem as a sequential task, we can use reinforcement learning to learn how to take the next action so that the expected final level quality is maximized. This approach can be used when few or no examples exist to train from, and the trained generator is very fast. We investigate three different ways of transforming two-dimensional level design problems into Markov decision processes and apply these to three game environments.

preprint2026arXiv

Detecting Unobserved Confounders: A Kernelized Regression Approach

Detecting unobserved confounders is crucial for reliable causal inference in observational studies. Existing methods require either linearity assumptions or multiple heterogeneous environments, limiting applicability to nonlinear single-environment settings. To bridge this gap, we propose Kernel Regression Confounder Detection (KRCD), a novel method for detecting unobserved confounding in nonlinear observational data under single-environment conditions. KRCD leverages reproducing kernel Hilbert spaces to model complex dependencies. By comparing standard and higherorder kernel regressions, we derive a test statistic whose significant deviation from zero indicates unobserved confounding. Theoretically, we prove two key results: First, in infinite samples, regression coefficients coincide if and only if no unobserved confounders exist. Second, finite-sample differences converge to zero-mean Gaussian distributions with tractable variance. Extensive experiments on synthetic benchmarks and the Twins dataset demonstrate that KRCD not only outperforms existing baselines but also achieves superior computational efficiency.

preprint2013arXiv

lil' UCB : An Optimal Exploration Algorithm for Multi-Armed Bandits

The paper proposes a novel upper confidence bound (UCB) procedure for identifying the arm with the largest mean in a multi-armed bandit game in the fixed confidence setting using a small number of total samples. The procedure cannot be improved in the sense that the number of samples required to identify the best arm is within a constant factor of a lower bound based on the law of the iterated logarithm (LIL). Inspired by the LIL, we construct our confidence bounds to explicitly account for the infinite time horizon of the algorithm. In addition, by using a novel stopping time for the algorithm we avoid a union bound over the arms that has been observed in other UCB-type algorithms. We prove that the algorithm is optimal up to constants and also show through simulations that it provides superior performance with respect to the state-of-the-art.