Source author record

Tshilidzi Marwala

Tshilidzi Marwala appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Artificial Intelligence; Machine Learning; Computational Engineering, Finance, and Science; Neural and Evolutionary Computing; Operating Systems; q-fin.GN; cs.CY; Data Structures and Algorithms; econ.GN; Human-Computer Interaction; Numerical Analysis; Other Computer Science; q-fin.EC; q-fin.PM

Catalog footprint

What is connected

24 works

14 topics

4 close collaborators

preprint · 2021 · arXiv

Healing Products of Gaussian Processes

Gaussian processes (GPs) are nonparametric Bayesian models that have been applied to regression and classification problems. One of the approaches to alleviate their cubic training cost is the use of local GP experts trained on subsets of the data. In particular, product-of-expert models combine the predictive distributions of local experts through a tractable product operation. While these expert models allow for massively distributed computation, their predictions typically suffer from erratic behaviour of the mean or uncalibrated uncertainty quantification. By calibrating predictions via a tempered softmax weighting, we provide a solution to these problems for multiple product-of-expert models, including the generalised product of experts and the robust Bayesian committee machine. Furthermore, we leverage the optimal transport literature and propose a new product-of-expert model that combines predictions of local experts by computing their Wasserstein barycenter, which can be applied to both regression and classification.
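
The combination rules mentioned in the abstract can be illustrated with a minimal sketch at a single test point. This is not the paper's implementation: the expert means and variances are made-up numbers, and scoring each expert by its negative predictive variance before the tempered softmax is an assumption chosen only for illustration. The barycenter function uses the standard closed form for one-dimensional Gaussians under the 2-Wasserstein distance (weighted average of means and of standard deviations).

```python
import numpy as np

def tempered_softmax(scores, temperature):
    """Softmax of temperature * scores, computed in a numerically stable way."""
    z = temperature * np.asarray(scores, dtype=float)
    z = z - z.max()
    w = np.exp(z)
    return w / w.sum()

def generalised_poe(means, variances, weights):
    """Weighted product of Gaussian expert predictions at one test point."""
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    weights = np.asarray(weights, dtype=float)
    precision = np.sum(weights / variances)
    mean = np.sum(weights * means / variances) / precision
    return mean, 1.0 / precision

def wasserstein_barycenter_1d(means, variances, weights):
    """2-Wasserstein barycenter of 1-D Gaussians: average means and std devs."""
    means = np.asarray(means, dtype=float)
    stds = np.sqrt(np.asarray(variances, dtype=float))
    weights = np.asarray(weights, dtype=float)
    return np.sum(weights * means), np.sum(weights * stds) ** 2

# Three hypothetical experts; the third is uncertain (far from the test point).
mu = np.array([1.0, 1.2, 5.0])
var = np.array([0.1, 0.2, 4.0])
w = tempered_softmax(-var, temperature=2.0)  # assumed score: negative variance

poe_mean, poe_var = generalised_poe(mu, var, w)
bary_mean, bary_var = wasserstein_barycenter_1d(mu, var, w)
```

With these weights the uncertain third expert contributes very little, so both combined means stay close to the two confident experts.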

preprint · 2020 · arXiv

An Automatic Relevance Determination Prior Bayesian Neural Network for Controlled Variable Selection

We present an Automatic Relevance Determination prior Bayesian Neural Network (BNN-ARD) weight l2-norm measure as a feature importance statistic for the Model-X knockoff filter. We show on both simulated data and the Norwegian wind farm dataset that the proposed feature importance statistic yields statistically significant improvements relative to similar feature importance measures in both variable selection power and predictive performance on a real-world dataset.
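
As a rough illustration of how a weight-norm statistic plugs into the knockoff filter, the sketch below computes a per-feature statistic as the difference between the l2-norm of the weights attached to a feature and to its knockoff copy, then applies the standard knockoff+ threshold. The weight matrices and the layout (one row of first-layer weights per input feature) are hypothetical; the abstract does not specify how the statistic is assembled, so treat this as one plausible reading rather than the authors' procedure.

```python
import numpy as np

def knockoff_statistics(w_orig, w_knock):
    """W_j = ||weights of feature j|| - ||weights of its knockoff|| (assumed layout)."""
    return np.linalg.norm(w_orig, axis=1) - np.linalg.norm(w_knock, axis=1)

def knockoff_plus_threshold(W, q):
    """Smallest t with (1 + #{W_j <= -t}) / max(1, #{W_j >= t}) <= q."""
    W = np.asarray(W, dtype=float)
    for t in np.sort(np.abs(W[W != 0])):
        fdp_estimate = (1 + np.sum(W <= -t)) / max(1, np.sum(W >= t))
        if fdp_estimate <= q:
            return t
    return np.inf  # no threshold meets the target level: select nothing

def select_features(W, q):
    """Indices of features whose statistic clears the knockoff+ threshold."""
    W = np.asarray(W, dtype=float)
    return np.flatnonzero(W >= knockoff_plus_threshold(W, q))
```

Large positive statistics indicate that the network relies on the real feature more than on its knockoff; `q` is the target false discovery rate.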

preprint · 2020 · arXiv

Relative Net Utility and the Saint Petersburg Paradox

The famous Saint Petersburg Paradox (St. Petersburg Paradox) shows that the theory of expected value does not capture the real-world economics of decision-making problems. Over the years, many economic theories were developed to resolve the paradox and explain gaps in the economic value theory in the evaluation of economic decisions, the subjective utility of the expected outcomes, and risk aversion as observed in the game of the St. Petersburg Paradox. In this paper, we use the concept of the relative net utility to resolve the St. Petersburg Paradox. Because the net utility concept is able to explain both behavioral economics and the St. Petersburg Paradox, it is deemed to be a universal approach to handling utility. This paper shows how the information content of the notion of net utility value allows us to capture a broader context of the impact of a decision's possible achievements. It discusses the necessary conditions that the utility function must satisfy to avoid the paradox. Combining these necessary conditions allows us to define the theorem of indifference in the evaluation of economic decisions and to present the role of the relative net utility and net utility polarity in a value-rational decision-making process.
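
The paradox itself can be made concrete with a short calculation. The sketch below assumes the common convention in which a fair coin is tossed until the first head appears on toss k and the payoff is 2^k. It contrasts the expected payoff, which grows without bound as more rounds are allowed, with the expected logarithmic utility, which converges. The logarithmic utility is Bernoulli's classical resolution and is used here only as a familiar reference point; it is not the relative net utility proposed in the paper.

```python
import math

def truncated_expected_payoff(max_rounds):
    """Expected payoff when the game is cut off after max_rounds tosses."""
    return sum((0.5 ** k) * (2 ** k) for k in range(1, max_rounds + 1))

def truncated_expected_log_utility(max_rounds):
    """Expected log-utility of the payoff under the same truncation."""
    return sum((0.5 ** k) * math.log(2 ** k) for k in range(1, max_rounds + 1))

for rounds in (10, 50, 200):
    print(rounds,
          truncated_expected_payoff(rounds),
          truncated_expected_log_utility(rounds))
```

Each round contributes exactly 1 to the expected payoff, so the first column of results equals the number of rounds, while the log-utility column approaches 2 ln 2.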

preprint · 2016 · arXiv

Missing Data Estimation in High-Dimensional Datasets: A Swarm Intelligence-Deep Neural Network Approach

In this paper, we examine the problem of missing data in high-dimensional datasets by taking into consideration the Missing Completely at Random and Missing at Random mechanisms, as well as the Arbitrary missing pattern. Additionally, this paper employs a methodology based on Deep Learning and Swarm Intelligence algorithms in order to provide reliable estimates for missing data. The deep learning technique is used to extract features from the input data via an unsupervised learning approach by modeling the data distribution based on the input. This deep learning technique is then used as part of the objective function for the swarm intelligence technique in order to estimate the missing data after a supervised fine-tuning phase by minimizing an error function based on the interrelationship and correlation between features in the dataset. The investigated methodology therefore has longer running times; however, the promising potential outcomes justify the trade-off. Also, basic knowledge of statistics is presumed.

preprint · 2015 · arXiv

Artificial Intelligence and Asymmetric Information Theory

When human agents come together to make decisions, it is often the case that one human agent has more information than the other. This phenomenon is called information asymmetry, and it distorts the market. Often, if one human agent intends to manipulate a decision in its favor, the agent can signal wrong or right information. Alternatively, one human agent can screen for information to reduce the impact of asymmetric information on decisions. With the advent of artificial intelligence, signaling and screening have been made easier. This paper studies the impact of artificial intelligence on the theory of asymmetric information. It is surmised that artificially intelligent agents reduce the degree of information asymmetry and thus the markets where these agents are deployed become more efficient. It is also postulated that the more artificially intelligent agents are deployed in the market, the lower the volume of trades in the market. This is because, for many trades to happen, asymmetry of information about the goods and services to be traded should exist, creating a sense of arbitrage.

preprint · 2015 · arXiv

Causal Model Analysis using Collider v-structure with Negative Percentage Mapping

A major problem of causal inference is the arrangement of dependent nodes in a directed acyclic graph (DAG) with path coefficients and observed confounders. Path coefficients do not provide the units to measure the strength of information flowing from one node to the other. Here we propose a method of causal structure learning using collider v-structures (CVS) with Negative Percentage Mapping (NPM) to get selective thresholds of information strength, to direct the edges and subjective confounders in a DAG. The NPM is used to scale the strength of information passed through nodes in units of percentage on the interval from 0 to 1. The causal structures are constructed by a bottom-up approach using path coefficients, causal directions and confounders, derived by implementing the collider v-structure and NPM. The method is self-sufficient to observe all the latent confounders present in the causal model and capable of detecting every responsible causal direction. The results are tested on simulated datasets with non-Gaussian distributions and compared with DirectLiNGAM and ICA-LiNGAM to check the efficiency of the proposed method.

preprint · 2015 · arXiv

Impact of Artificial Intelligence on Economic Theory

Artificial intelligence has impacted many aspects of human life. This paper studies the impact of artificial intelligence on economic theory. In particular, we study the impact of artificial intelligence on the theory of bounded rationality, the efficient market hypothesis and prospect theory.

preprint · 2015 · arXiv

Monte Carlo Dynamically Weighted Importance Sampling For Finite Element Model Updating

The Finite Element Method (FEM) is generally unable to accurately predict natural frequencies and mode shapes of structures (eigenvalues and eigenvectors). Engineers develop numerical methods and a variety of techniques to compensate for this misalignment of modal properties between experimentally measured data and the computed result from the FEM of structures. In this paper we compare two indirect updating methods, namely Adaptive Metropolis-Hastings and a newly applied algorithm called Monte Carlo Dynamically Weighted Importance Sampling (MCDWIS). The approximation of a posterior predictive distribution is based on Bayesian inference of continuous multivariate Gaussian probability density functions, defining the variability of physical properties affected by forced vibration. The motivation behind applying MCDWIS lies in the complexity of computing normalizing constants in higher-dimensional or multimodal systems. The MCDWIS accounts for this intractability by analytically computing importance sampling estimates at each time step of the algorithm. In addition, a dynamic weighting step with an Adaptive Pruned Enriched Population Control Scheme (APEPCS) allows for further control over weighted samples and population size. The performance of the MCDWIS simulation is graphically illustrated for all algorithm-dependent parameters and shows unbiased, stable sample estimates.
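
The point about normalizing constants can be illustrated with plain self-normalised importance sampling, which is a much simpler relative of MCDWIS and is shown here only as background. The target below is a made-up, unnormalised one-dimensional Gaussian standing in for a posterior over an updating parameter; none of the names or numbers come from the paper. Because the weights are normalised to sum to one, any constant factor in the target or proposal density cancels.

```python
import numpy as np

def log_target(x):
    """Unnormalised log-density: Gaussian with mean 1.0 and std 0.5 (hypothetical)."""
    return -0.5 * ((x - 1.0) / 0.5) ** 2

def log_proposal(x):
    """Log-density of the N(0, 2^2) proposal, up to an additive constant."""
    return -0.5 * (x / 2.0) ** 2

def self_normalised_weights(x):
    """Importance weights that sum to one; unknown constants cancel here."""
    log_w = log_target(x) - log_proposal(x)
    w = np.exp(log_w - log_w.max())  # subtract the max for numerical stability
    return w / w.sum()

rng = np.random.default_rng(0)
samples = rng.normal(0.0, 2.0, size=20000)   # draws from the proposal
weights = self_normalised_weights(samples)

posterior_mean = np.sum(weights * samples)
effective_sample_size = 1.0 / np.sum(weights ** 2)
```

The effective sample size gives a quick check on weight degeneracy, which is the kind of problem that population control schemes are designed to manage.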

preprint · 2015 · arXiv

Proposition of a Theoretical Model for Missing Data Imputation using Deep Learning and Evolutionary Algorithms

In the last couple of decades, there have been major advancements in the domain of missing data imputation. The techniques in the domain include, amongst others, Expectation Maximization, Neural Networks with Evolutionary Algorithms or optimization techniques, and K-Nearest Neighbor approaches. The presence of missing data entries in databases renders the tasks of decision-making and data analysis nontrivial. As a result, this area has attracted a lot of research interest, with the aim being to yield accurate and time-efficient missing data imputation techniques, especially for time-sensitive applications such as power plants and winding processes. In this article, considering arbitrary and monotone missing data patterns, we hypothesize that the use of deep neural networks built using autoencoders and denoising autoencoders in conjunction with genetic algorithms, swarm intelligence and maximum likelihood estimator methods as novel data imputation techniques will lead to better imputed values than existing techniques. Also considered are the missing at random, missing completely at random and missing not at random missing data mechanisms. We also intend to use fuzzy logic in tandem with deep neural networks to perform the missing data imputation tasks, as well as different building blocks for the deep neural networks like Stacked Restricted Boltzmann Machines and Deep Belief Networks to test our hypothesis. The motivation behind this article is the need for missing data imputation techniques that lead to better imputed values than existing methods with higher accuracies and lower errors.

preprint · 2014 · arXiv

Rational Counterfactuals

This paper introduces the concept of rational counterfactuals, which is the idea of identifying a counterfactual from the factual (whether perceived or real) that maximizes the attainment of the desired consequent. In counterfactual thinking, if we have a factual statement like: Saddam Hussein invaded Kuwait and consequently George Bush declared war on Iraq, then its counterfactual is: If Saddam Hussein did not invade Kuwait then George Bush would not have declared war on Iraq. The theory of rational counterfactuals is applied to identify the antecedent that gives the desired consequent necessary for rational decision making. The rational counterfactual theory is applied to identify the values of the variables Allies, Contingency, Distance, Major Power, Capability, Democracy, as well as Economic Interdependency that give the desired consequent Peace.

preprint · 2013 · arXiv

Applying the Negative Selection Algorithm for Merger and Acquisition Target Identification

In this paper, we propose a new methodology for identifying takeover targets based on the Negative Selection Algorithm, which belongs to the field of Computational Intelligence, specifically Artificial Immune Systems. Although considerable research based on customary statistical techniques and some contemporary Computational Intelligence techniques has been devoted to identifying takeover targets, most of the existing studies are based upon multiple previous mergers and acquisitions. Contrary to previous research, the novelty of this proposal lies in its ability to suggest takeover targets for novice firms that are at the beginning of their merger and acquisition spree. We first discuss the theoretical perspective and then provide a case study with details for practical implementation, both capitalizing on the unique generalization capabilities of artificial immune systems algorithms.
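
For readers unfamiliar with negative selection, the following is a generic real-valued sketch of the idea: random candidate detectors are kept only if they do not match any "self" sample, and a new point is flagged as non-self when some detector matches it. The two-dimensional feature vectors, the Euclidean matching rule and the radius are assumptions for illustration; the abstract does not describe the representation used for firms.

```python
import numpy as np

def train_detectors(self_samples, n_detectors, radius, rng, max_tries=100000):
    """Keep random candidates that lie farther than `radius` from every self sample."""
    dim = self_samples.shape[1]
    detectors = []
    for _ in range(max_tries):
        if len(detectors) == n_detectors:
            break
        candidate = rng.random(dim)  # uniform in the unit hypercube
        if np.min(np.linalg.norm(self_samples - candidate, axis=1)) > radius:
            detectors.append(candidate)
    return np.array(detectors)

def is_non_self(x, detectors, radius):
    """A point is flagged when at least one detector lies within `radius` of it."""
    return bool(np.any(np.linalg.norm(detectors - x, axis=1) <= radius))

rng = np.random.default_rng(0)
self_samples = rng.random((50, 2)) * 0.3   # hypothetical "self" profiles near the origin
detectors = train_detectors(self_samples, n_detectors=200, radius=0.15, rng=rng)

flag_self = is_non_self(self_samples[0], detectors, 0.15)
flag_other = is_non_self(np.array([0.8, 0.8]), detectors, 0.15)
```

Training needs only examples of the self class, which is why the approach suits settings with few or no labelled examples of the class being sought.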

preprint · 2013 · arXiv

Flexibly-bounded Rationality and Marginalization of Irrationality Theories for Decision Making

In this paper, the theory of flexibly-bounded rationality, which is an extension to the theory of bounded rationality, is revisited. Rational decision making involves using information, which is almost always imperfect and incomplete, together with some intelligent machine, which if it is a human being is inconsistent, to make decisions. In bounded rationality, this decision is made irrespective of the fact that the information to be used is incomplete and imperfect and that the human brain is inconsistent, and thus the decision is taken within the bounds of these limitations. In the theory of flexibly-bounded rationality, advanced information analysis is used, the correlation machine is applied to complete missing information and artificial intelligence is used to make more consistent decisions. Therefore, flexibly-bounded rationality expands the bounds within which rationality is exercised. Because human decision making is essentially irrational, this paper proposes the theory of marginalization of irrationality in decision making to deal with the problem of satisficing in the presence of irrationality.

preprint · 2013 · arXiv

Semi-bounded Rationality: A model for decision making

In this paper, the theory of semi-bounded rationality is proposed as an extension of the theory of bounded rationality. In particular, it is proposed that a decision-making process involves two components: the correlation machine, which estimates missing values, and the causal machine, which relates the cause to the effect. Rational decision making involves using information, which is almost always imperfect and incomplete, as well as some intelligent machine, which if it is a human being is inconsistent, to make decisions. In the theory of bounded rationality, this decision is made irrespective of the fact that the information to be used is incomplete and imperfect and the human brain is inconsistent, and thus the decision is taken within the bounds of these limitations. In the theory of semi-bounded rationality, signal processing is used to filter noise and outliers in the information, the correlation machine is applied to complete the missing information, and artificial intelligence is used to make more consistent decisions.

preprint · 2012 · arXiv

Soft Computing in Product Recovery: A Survey Focusing on Remanufacturing System

This paper focuses on the application of soft computing in remanufacturing systems, in which end-of-life products are disassembled into basic components and then remanufactured for both economic and environmental reasons. The disassembly activities include disassembly sequencing and planning, while the remanufacturing process is composed of product design, production planning and scheduling, and inventory management. This paper presents a review of the related articles and suggests corresponding directions for further research.

preprint · 2011 · arXiv

Fuzzy Inference Systems Optimization

This paper compares various optimization methods for fuzzy inference system optimization. The optimization methods compared are genetic algorithm, particle swarm optimization and simulated annealing. When these techniques were implemented it was observed that the performance of each technique within the fuzzy inference system classification was context dependent.

preprint · 2011 · arXiv

Improving the performance of the ripper in insurance risk classification : A comparitive study using feature selection

The Ripper algorithm is designed to generate rule sets for large datasets with many features. However, it has been shown that the algorithm struggles with classification performance in the presence of missing data: it struggles to classify instances as the quality of the data deteriorates with increasing amounts of missing data. In this paper, feature selection is used to help improve the classification performance of the Ripper model. Principal component analysis and evidence automatic relevance determination techniques are used to improve the performance. A comparison is done to see which technique helps the algorithm improve the most. Training datasets with completely observable data were used to construct the model, and testing datasets with missing values were used for measuring accuracy. The results showed that principal component analysis is the better feature selection technique for improving the classification performance of the Ripper.

preprint · 2011 · arXiv

Organizational adaptation to Complexity: A study of the South African Insurance Market as a Complex Adaptive System through Statistical Risk Analysis

South Africa assumes a significant position in the insurance landscape of Africa. The present research, based upon qualitative and quantitative analysis, shows that the South African insurance market exhibits the characteristics of a Complex Adaptive System. In addition, a statistical analysis of risk measures through Value at Risk and Conditional Tail Expectation is carried out to show how an individual insurance company copes under external complexities. The authors believe that an explanation of the coping strategies and the subsequent managerial implications would enrich our understanding of complexity in business.
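
The two risk measures named in the abstract have simple empirical estimators, sketched below under the assumption that losses are available as a plain sample (the numbers are synthetic, not from the study). Value at Risk is taken as the empirical quantile of the loss distribution at the chosen level, and Conditional Tail Expectation as the mean of the losses at or beyond that quantile.

```python
import numpy as np

def value_at_risk(losses, level):
    """Empirical quantile of the losses at the given confidence level."""
    return float(np.quantile(np.asarray(losses, dtype=float), level))

def conditional_tail_expectation(losses, level):
    """Mean of the losses that are at least as large as the Value at Risk."""
    losses = np.asarray(losses, dtype=float)
    threshold = value_at_risk(losses, level)
    return float(losses[losses >= threshold].mean())

# Synthetic example: losses of 1, 2, ..., 100 monetary units.
sample_losses = np.arange(1, 101)
var_95 = value_at_risk(sample_losses, 0.95)
cte_95 = conditional_tail_expectation(sample_losses, 0.95)
```

The Conditional Tail Expectation is never smaller than the Value at Risk at the same level, since it averages only the losses in the tail.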

preprint · 2011 · arXiv

Suitability of using technical indicators as potential strategies within intelligent trading systems

The potential of machine learning to automate and control nonlinear, complex systems is well established. These same techniques have always presented potential for use in the investment arena, specifically for the managing of equity portfolios. In this paper, the opportunity for such exploitation is investigated through analysis of potential simple trading strategies that can then be meshed together for the machine learning system to switch between. It is the eligibility of these strategies that is being investigated in this paper, rather than their application. In order to accomplish this, the underlying assumptions of each trading system are explored, and data is created in order to evaluate the efficacy of these systems when trading on data with the underlying patterns that they expect. The strategies are tested against a buy-and-hold strategy to determine whether the act of trading has actually produced any worthwhile results, or whether the results are simply facets of the underlying prices. These results are then used to produce targeted returns based upon either a desired return or a desired risk, as both are required within the portfolio-management industry. Results show a very viable opportunity for exploitation within the aforementioned industry, with the strategies performing well within their narrow assumptions, and the intelligent system combining them to perform without assumptions.

preprint · 2011 · arXiv

The fuzzy gene filter: A classifier performance assesment

The Fuzzy Gene Filter (FGF) is an optimised Fuzzy Inference System designed to rank genes in order of differential expression, based on expression data generated in a microarray experiment. This paper examines the effectiveness of the FGF for feature selection using various classification architectures. The FGF is compared to three of the most common gene ranking algorithms: t-test, Wilcoxon test and ROC curve analysis. Four classification schemes are used to compare the performance of the FGF vis-a-vis the standard approaches: K Nearest Neighbour (KNN), Support Vector Machine (SVM), Naive Bayesian Classifier (NBC) and Artificial Neural Network (ANN). A nested stratified Leave-One-Out Cross Validation scheme is used to identify the optimal number of top-ranking genes, as well as the optimal classifier parameters. Two microarray data sets are used for the comparison: a prostate cancer data set and a lymphoma data set.

preprint · 2010 · arXiv

Application of Global and One-Dimensional Local Optimization to Operating System Scheduler Tuning

This paper describes a comparative study of global and one-dimensional local optimization methods applied to operating system scheduler tuning. The operating system scheduler we use is the Linux 2.6.23 Completely Fair Scheduler (CFS) running in a simulator (LinSched). We have ported the Hackbench scheduler benchmark to this simulator and use this as the workload. The global optimization approach we use is Particle Swarm Optimization (PSO). We make use of Response Surface Methodology (RSM) to specify optimal parameters for our PSO implementation. The one-dimensional local optimization approach we use is the Golden Section method. In order to use this approach, we convert the scheduler tuning problem from one involving the setting of three parameters to one involving the manipulation of one parameter. Our results show that the global optimization approach yields a better response, but the one-dimensional optimization approach converges to a solution faster than the global optimization approach.
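
The Golden Section method mentioned above can be sketched in a few lines. The version below minimises a unimodal function of one variable on a bracketing interval, reusing one function evaluation per iteration. The quadratic `response` function is a hypothetical stand-in for a benchmark measurement as a function of a single tuning parameter; it is not the Hackbench workload or the paper's parameterisation.

```python
import math

def golden_section_minimise(f, lo, hi, tol=1e-6):
    """Minimise a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5) - 1) / 2           # about 0.618
    c = hi - inv_phi * (hi - lo)               # left interior point
    d = lo + inv_phi * (hi - lo)               # right interior point
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc < fd:
            # Minimum lies in [lo, d]; the old c becomes the new right point.
            hi, d, fd = d, c, fc
            c = hi - inv_phi * (hi - lo)
            fc = f(c)
        else:
            # Minimum lies in [c, hi]; the old d becomes the new left point.
            lo, c, fc = c, d, fd
            d = lo + inv_phi * (hi - lo)
            fd = f(d)
    return (lo + hi) / 2

def response(parameter):
    """Hypothetical response curve with its minimum at parameter = 2."""
    return (parameter - 2.0) ** 2 + 1.0

best_parameter = golden_section_minimise(response, 0.0, 5.0)
```

Each iteration shrinks the interval by a constant factor and costs a single evaluation, which matters when every evaluation is a simulator run.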

preprint · 2010 · arXiv

Use of Data Mining in Scheduler Optimization

The operating system's role in a computer system is to manage the various resources. One of these resources is the Central Processing Unit. It is managed by a component of the operating system called the CPU scheduler. Schedulers are optimized for typical workloads expected to run on the platform. However, a single scheduler may not be appropriate for all workloads. That is, a scheduler may schedule a workload such that the completion time is minimized, but when another type of workload is run on the platform, scheduling and therefore completion time will not be optimal; a different scheduling algorithm, or a different set of parameters, may work better. Several approaches to solving this problem have been proposed. The objective of this survey is to summarize the approaches based on data mining, which are available in the literature. In addition to solutions that can be directly utilized for solving this problem, we are interested in data mining research in related areas that have potential for use in operating system scheduling. We also explain general technical issues involved in scheduling in modern computers, including parallel scheduling issues related to multi-core CPUs. We propose a taxonomy that classifies the scheduling approaches we discuss into different categories.

preprint · 2008 · arXiv

An Intelligent Multi-Agent Recommender System for Human Capacity Building

This paper presents a Multi-Agent approach to the problem of recommending training courses to engineering professionals. The recommendation system is built as a proof of concept and limited to the electrical and mechanical engineering disciplines. Through user modelling and data collection from a survey, collaborative filtering recommendation is implemented using intelligent agents. The agents work together in recommending meaningful training courses and updating the course information. The system uses a user's profile and keywords from courses to rank courses. A ranking accuracy for courses of 90% is achieved, while flexibility is achieved using an agent that retrieves information autonomously from websites using data mining techniques. This manner of recommendation is scalable and adaptable. Further improvements can be made using clustering and recording user feedback.

preprint · 2008 · arXiv

Stochastic Optimization Approaches for Solving Sudoku

In this paper, the Sudoku problem is solved using the following stochastic search techniques: Cultural Genetic Algorithm (CGA), Repulsive Particle Swarm Optimization (RPSO), Quantum Simulated Annealing (QSA) and a Hybrid method that combines Genetic Algorithm with Simulated Annealing (HGASA). The results obtained show that the CGA, QSA and HGASA are able to solve the Sudoku puzzle, with CGA finding a solution in 28 seconds, QSA in 65 seconds and HGASA in 1.447 seconds. This is mainly because HGASA combines the parallel searching of GA with the flexibility of SA. The RPSO was found to be unable to solve the puzzle.

preprint · 2007 · arXiv

Using Images to create a Hierarchical Grid Spatial Index

This paper presents a hybrid approach to spatial indexing of two-dimensional data. It sheds new light on the age-old problem by thinking of the traditional algorithms as working with images. Inspiration is drawn from an analogous situation that is found in machine and human vision. Image processing techniques are used to assist in the spatial indexing of the data. A fixed grid approach is used, and bins with too many records are sub-divided hierarchically. Search queries are pre-computed for bins that do not contain any data records. This has the effect of dividing the search space up into non-rectangular regions which are based on the spatial properties of the data. The bucketing quad tree can be considered as an image with a resolution of two by two for each layer. The results show that this method performs better than the quad tree if there are more divisions per layer. This confirms our suspicions that the algorithm works better if it gets to look at the data with higher resolution images. An elegant class structure is developed where the implementation of concrete spatial indexes for a particular data type merely relies on rendering the data onto an image.
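
The fixed-grid-with-subdivision idea can be sketched as a small data structure. In the hypothetical `GridNode` class below, each layer is a `divisions` by `divisions` grid (a bucketing quad tree corresponds to `divisions=2`), and a bin that exceeds its capacity is split into a finer grid of its own. The class name, capacity, depth limit and lazy creation of child bins are illustrative choices; the image-processing steps and pre-computed queries for empty bins described in the abstract are not reproduced here.

```python
import random

class GridNode:
    """One grid layer; an overfull node splits into divisions x divisions children."""

    def __init__(self, bounds, divisions=4, capacity=8, depth=0, max_depth=6):
        self.bounds = bounds          # (x0, y0, x1, y1)
        self.divisions = divisions
        self.capacity = capacity
        self.depth = depth
        self.max_depth = max_depth    # guards against endless splitting on duplicates
        self.points = []
        self.children = None          # {(col, row): GridNode} once split

    def _child_for(self, x, y):
        x0, y0, x1, y1 = self.bounds
        n = self.divisions
        w, h = (x1 - x0) / n, (y1 - y0) / n
        col = min(max(int((x - x0) / w), 0), n - 1)
        row = min(max(int((y - y0) / h), 0), n - 1)
        if (col, row) not in self.children:   # child bins are created on demand
            child_bounds = (x0 + col * w, y0 + row * h,
                            x0 + (col + 1) * w, y0 + (row + 1) * h)
            self.children[(col, row)] = GridNode(
                child_bounds, n, self.capacity, self.depth + 1, self.max_depth)
        return self.children[(col, row)]

    def insert(self, x, y):
        if self.children is not None:
            self._child_for(x, y).insert(x, y)
            return
        self.points.append((x, y))
        if len(self.points) > self.capacity and self.depth < self.max_depth:
            pending, self.points, self.children = self.points, [], {}
            for px, py in pending:
                self._child_for(px, py).insert(px, py)

    def query(self, qx0, qy0, qx1, qy1):
        """Return all stored points inside the closed query rectangle."""
        x0, y0, x1, y1 = self.bounds
        if qx1 < x0 or qx0 > x1 or qy1 < y0 or qy0 > y1:
            return []                 # this bin cannot overlap the query
        if self.children is None:
            return [(x, y) for x, y in self.points
                    if qx0 <= x <= qx1 and qy0 <= y <= qy1]
        found = []
        for child in self.children.values():
            found.extend(child.query(qx0, qy0, qx1, qy1))
        return found

rnd = random.Random(0)
records = [(rnd.random(), rnd.random()) for _ in range(500)]
index = GridNode((0.0, 0.0, 1.0, 1.0))
for record in records:
    index.insert(*record)
hits = index.query(0.2, 0.2, 0.6, 0.6)
```

Raising `divisions` gives each layer a higher "resolution", which is the knob the abstract reports as improving on the two-by-two quad tree.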
