Source author record

Terence L. van Zyl

Terence L. van Zyl appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Machine Learning; Artificial Intelligence; Computation and Language; Computational Engineering, Finance, and Science; Computer Vision; Neural and Evolutionary Computing; q-fin.PM

Catalog footprint

What is connected

6 works

7 topics

4 close collaborators

Preprint · 2026 · arXiv

Multi-Object Tracking Consistently Improves Wildlife Inference

Camera traps have become a common tool for wildlife monitoring in ecological research and biodiversity conservation. Wildlife classification models have benefited from the growing volume of wildlife visual data and reach high accuracy on curated, high-quality datasets. However, their performance remains sensitive to real-world environmental constraints, and they often produce inconsistent predictions on temporally coherent sequences: the predicted label for a single individual shifts rapidly between frames. This study exploits the temporal nature of camera-trap data to refine the predictions of a wildlife classification model. Specifically, we adopt several standard Multi-Object Tracking (MOT) models to link detections across consecutive frames. The curated trajectories are used to fuse the softmax class probabilities, and the fused probability score yields a single consensus class label that overrides misclassifications caused by noise. The experimental results show that the proposed strategy outperforms a standalone classifier on all datasets and for every metric. Specifically, the best-performing MOT models improve the weighted F1-score over the classifier by 5.1%, 3.1% and 2.0% across three MOT datasets.
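The track-level fusion described in this abstract can be illustrated with a minimal sketch. The abstract does not state which fusion rule is used, so the plain averaging below is an assumption, and the function name `fuse_track_predictions`, the track IDs, the class names and the probabilities are all hypothetical.

```python
def fuse_track_predictions(detections, class_names):
    """Average per-frame softmax vectors within each track (assumed fusion rule)
    and return one consensus label plus the mean probabilities per track."""
    sums = {}
    counts = {}
    for track_id, probs in detections:
        if track_id not in sums:
            sums[track_id] = [0.0] * len(class_names)
            counts[track_id] = 0
        for i, p in enumerate(probs):
            sums[track_id][i] += p
        counts[track_id] += 1

    fused = {}
    for track_id, total in sums.items():
        mean = [s / counts[track_id] for s in total]
        best = max(range(len(mean)), key=mean.__getitem__)
        fused[track_id] = (class_names[best], mean)
    return fused


# Hypothetical detections: (track_id, softmax over [impala, kudu, zebra]).
detections = [
    (1, [0.70, 0.20, 0.10]),
    (1, [0.30, 0.60, 0.10]),  # a noisy frame that a per-frame classifier would mislabel
    (1, [0.80, 0.10, 0.10]),
    (2, [0.10, 0.20, 0.70]),
]
labels = fuse_track_predictions(detections, ["impala", "kudu", "zebra"])
print(labels[1][0], labels[2][0])
```

In this toy input the second frame of track 1 would be labelled "kudu" on its own, while the averaged probabilities for the whole track still favour "impala".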

Preprint · 2022 · arXiv

Deep Similarity Learning for Sports Team Ranking

Sports data is becoming more readily available, and consequently the literature contains a growing amount of sports analysis, predictions and rankings. Each sport has its own stochastic nature, which makes analysis and accurate predictions valuable to those involved in the sport. In response, we focus on Siamese Neural Networks (SNN) in unison with LightGBM and XGBoost models to predict the importance of matches and to rank teams in Rugby and Basketball. Six models were developed and compared: LightGBM, XGBoost, LightGBM (Contrastive loss), LightGBM (Triplet loss), XGBoost (Contrastive loss) and XGBoost (Triplet loss). The models that use a Triplet loss function perform better than those using Contrastive loss. LightGBM (Triplet loss) is the most effective model for ranking the NBA, producing state-of-the-art (SOTA) mAP (0.867) and NDCG (0.98). The SNN (Triplet loss) most effectively predicted the Super 15 Rugby, yielding SOTA mAP (0.921), NDCG (0.983) and $r_s$ (0.793). Triplet loss produces the best overall results, which shows the value of learning representations/embeddings for prediction and ranking in sports. Overall, no single model performs best consistently across the two sports, indicating that other ranking models should be considered in the future.
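For readers unfamiliar with the two loss functions compared in this abstract, the sketch below shows one common formulation of each on plain Python lists. The abstract does not give the exact variants, margins or distance metric used, so the Euclidean distance, the margin of 1.0 and the example embeddings are assumptions.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Zero once the negative is at least `margin` farther from the anchor than the positive."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def contrastive_loss(a, b, same, margin=1.0):
    """Pull similar pairs together; push dissimilar pairs at least `margin` apart."""
    d = euclidean(a, b)
    return d ** 2 if same else max(0.0, margin - d) ** 2


# Hypothetical 2-D embeddings.
anchor, positive = [0.0, 0.0], [0.0, 1.0]
easy_negative, hard_negative = [3.0, 4.0], [0.0, 1.5]

easy = triplet_loss(anchor, positive, easy_negative)   # negative is far away: no penalty
hard = triplet_loss(anchor, positive, hard_negative)   # negative is too close: penalised
print(easy, hard)
```

The triplet form compares a positive and a negative against the same anchor in a single term, whereas the contrastive form scores each pair independently.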

Preprint · 2022 · arXiv

Fusion of Sentiment and Asset Price Predictions for Portfolio Optimization

The fusion of public sentiment data in the form of text with stock price prediction is a topic of increasing interest within the financial community. However, the research literature seldom explores the application of investor sentiment to the Portfolio Selection problem. This paper aims to unpack and develop an enhanced understanding of the sentiment-aware portfolio selection problem. To this end, the study uses a Semantic Attention Model to predict sentiment towards an asset. We select the optimal portfolio through a sentiment-aware Long Short-Term Memory (LSTM) recurrent neural network for price prediction and a mean-variance strategy. On average, our sentiment portfolio strategies achieved a significant increase in revenue over the non-sentiment-aware models. However, the results show that our strategy does not outperform traditional portfolio allocation strategies from a stability perspective. We argue that an improved fusion of sentiment prediction with a combination of price prediction and portfolio optimization leads to an enhanced portfolio selection strategy.
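The mean-variance step mentioned in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: it uses the textbook unconstrained solution with weights proportional to the inverse covariance matrix times the expected returns, rescaled to sum to one. The expected returns stand in for the output of a price predictor, and all numbers and function names are hypothetical.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                for c in range(col, n + 1):
                    aug[r][c] -= factor * aug[col][c]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def mean_variance_weights(expected_returns, covariance):
    """Weights proportional to inverse(covariance) @ expected_returns, scaled to sum to 1."""
    raw = solve_linear_system(covariance, expected_returns)
    total = sum(raw)
    if total <= 0:
        raise ValueError("normalisation is only meaningful for a positive total")
    return [w / total for w in raw]


# Hypothetical predicted returns and covariance for two assets.
predicted_returns = [0.10, 0.05]
covariance = [[0.04, 0.00],
              [0.00, 0.01]]
weights = mean_variance_weights(predicted_returns, covariance)
print(weights)
```

Here the second asset receives the larger weight despite its lower predicted return, because its variance is four times smaller.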

Preprint · 2022 · arXiv

Knowledge Graph Fusion for Language Model Fine-tuning

Language Models such as BERT have grown in popularity because they can be pre-trained and perform robustly on a wide range of Natural Language Processing tasks. Often seen as an evolution over traditional word embedding techniques, they can produce semantic representations of text that are useful for tasks such as semantic similarity. However, state-of-the-art models often have high computational requirements and lack the global context or domain knowledge required for complete language understanding. To address these limitations, we investigate the benefits of knowledge incorporation into the fine-tuning stages of BERT. An existing K-BERT model, which enriches sentences with triplets from a Knowledge Graph, is adapted for the English language and extended to inject contextually relevant information into sentences. As a side effect, the changes made to K-BERT to accommodate the English language also extend to other word-based languages. Our experiments indicate that injected knowledge introduces noise. We see statistically significant improvements for knowledge-driven tasks when this noise is minimised. We show evidence that, given the appropriate task, modest injection with relevant, high-quality knowledge is most performant.
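The idea of enriching a sentence with Knowledge Graph triplets can be illustrated with a deliberately simplified sketch. It only shows the token-level injection and a cap on how much knowledge is added per entity; it does not reproduce how K-BERT or the adapted model handles the injected tokens inside the network. The function name `inject_knowledge`, the `max_triples` parameter and the toy graph are hypothetical.

```python
def inject_knowledge(tokens, knowledge_graph, max_triples=1):
    """Append up to `max_triples` (relation, object) pairs after each token
    that appears as an entity in the knowledge graph."""
    enriched = []
    for token in tokens:
        enriched.append(token)
        for relation, obj in knowledge_graph.get(token, [])[:max_triples]:
            enriched.extend([relation, obj])
    return enriched


# Hypothetical toy knowledge graph: entity -> list of (relation, object).
knowledge_graph = {
    "Paris": [("capital_of", "France"), ("located_in", "Europe")],
}
sentence = "Paris hosts the summit".split()

modest = inject_knowledge(sentence, knowledge_graph, max_triples=1)
heavy = inject_knowledge(sentence, knowledge_graph, max_triples=2)
print(modest)
print(heavy)
```

Raising `max_triples` lengthens the input with tokens that may be irrelevant to the task, which is one simple way to picture the noise the abstract refers to.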

Preprint · 2022 · arXiv

Surrogate Assisted Evolutionary Multi-objective Optimisation applied to a Pressure Swing Adsorption system

Chemical plant design and optimisation have proven challenging due to the complexity of these real-world systems. This complexity translates into high computational costs for these systems' mathematical formulations and simulation models. Research has illustrated the benefits of using machine learning surrogate models as substitutes for computationally expensive models during optimisation. This paper extends recent research into optimising chemical plant design and operation. The study further explores Surrogate Assisted Genetic Algorithms (SA-GA) in more complex variants of the original plant design and optimisation problems, such as the inclusion of parallel and feedback components. The novel extension to the original algorithm proposed in this study, Surrogate Assisted NSGA-II (SA-NSGA), was tested on a popular literature case, the Pressure Swing Adsorption (PSA) system. We further provide extensive experimentation, comparing various meta-heuristic optimisation techniques and numerous machine learning models as surrogates. The results for both sets of systems illustrate the benefits of using Genetic Algorithms as an optimisation framework for complex chemical plant system design and optimisation in both single- and multi-objective scenarios. We confirm that Random Forest surrogate assisted Evolutionary Algorithms can be scaled to increasingly complex chemical systems with parallel and feedback components. We further find that combining a Genetic Algorithm framework with Machine Learning surrogate models as a substitute for long-running simulation models yields significant computational efficiency improvements: a 1.7 to 1.84 times speedup for the increased-complexity examples and a 2.7 times speedup for the Pressure Swing Adsorption system.
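The general pattern of surrogate-assisted evolutionary search can be sketched in a few lines. This is a single-objective toy under stated assumptions, not SA-GA or SA-NSGA as published: a nearest-neighbour average stands in for the Random Forest surrogate, a quadratic function stands in for the plant simulation, and every name and parameter below is hypothetical.

```python
import random


def expensive_simulation(x):
    """Stand-in for a long-running simulation (hypothetical objective, minimised at x = 2)."""
    return (x - 2.0) ** 2


def surrogate_predict(archive, x, k=3):
    """Cheap estimate: mean objective of the k nearest already-evaluated points."""
    nearest = sorted(archive, key=lambda pair: abs(pair[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def surrogate_assisted_search(generations=20, pop_size=8, offspring=40, seed=0):
    rng = random.Random(seed)
    archive = [(x, expensive_simulation(x))
               for x in (rng.uniform(-10.0, 10.0) for _ in range(pop_size))]
    initial_best = min(y for _, y in archive)
    true_evals = pop_size
    for _ in range(generations):
        parents = sorted(archive, key=lambda pair: pair[1])[:pop_size]
        # Mutation-only variation: perturb randomly chosen good parents.
        candidates = [rng.choice(parents)[0] + rng.gauss(0.0, 1.0) for _ in range(offspring)]
        # Pre-screen all offspring with the surrogate; simulate only the most promising.
        candidates.sort(key=lambda x: surrogate_predict(archive, x))
        for x in candidates[:pop_size // 2]:
            archive.append((x, expensive_simulation(x)))
            true_evals += 1
    best_x, best_y = min(archive, key=lambda pair: pair[1])
    return best_x, best_y, true_evals, initial_best


best_x, best_y, true_evals, initial_best = surrogate_assisted_search()
print(best_x, best_y, true_evals)
```

With these settings the loop proposes 800 offspring but runs the expensive function only 88 times in total, which is where any speedup of this kind of scheme comes from.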

Preprint · 2021 · arXiv

Feature-weighted Stacking for Nonseasonal Time Series Forecasts: A Case Study of the COVID-19 Epidemic Curves

We investigate ensembling techniques in forecasting and examine their potential for use in nonseasonal time series similar to those in the early days of the COVID-19 pandemic. Developing improved forecast methods is essential because they support data-driven decisions by organisations and decision-makers during critical phases. We propose late data fusion through a stacked ensemble of two forecasting models and two meta-features that demonstrate their predictive power during a preliminary forecasting stage. The final ensembles include Prophet and a long short-term memory (LSTM) neural network as base models. The base models are combined by a multilayer perceptron (MLP) that takes into account the meta-features showing the highest correlation with each base model's forecast accuracy. We further show that the inclusion of meta-features generally improves the ensemble's forecast accuracy across two forecast horizons of seven and fourteen days. This research reinforces previous work and demonstrates the value of combining traditional statistical models with deep learning models to produce more accurate forecast models for time series from different domains and with different seasonality.
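As a rough illustration of how meta-features can steer the combination of base forecasts, the sketch below uses a linear feature-weighted blend, where each base model's weight is a linear function of the meta-features. This is a simplification swapped in for readability: the abstract describes an MLP combiner, and the coefficients, meta-feature values and forecasts here are hypothetical rather than learned.

```python
def feature_weighted_blend(base_forecasts, meta_features, coefficients):
    """Blend base forecasts with weights that are linear in the meta-features.

    coefficients[j][i] multiplies meta_features[i] in the weight of base model j.
    """
    blended = 0.0
    for forecast, row in zip(base_forecasts, coefficients):
        weight = sum(c * f for c, f in zip(row, meta_features))
        blended += weight * forecast
    return blended


# Hypothetical inputs: forecasts from two base models for one horizon step,
# a constant meta-feature (1.0) and one illustrative series-level meta-feature (0.5).
base_forecasts = [100.0, 120.0]
meta_features = [1.0, 0.5]
coefficients = [
    [0.8, -0.6],  # weight of base model 1 falls as the meta-feature grows
    [0.2, 0.6],   # weight of base model 2 rises as the meta-feature grows
]
blended = feature_weighted_blend(base_forecasts, meta_features, coefficients)
print(blended)
```

In a trained ensemble the coefficients would be fitted on held-out forecasts from the base models; an MLP replaces this linear weighting with a nonlinear one.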
