Researcher profile

Ivan Štajduhar

Ivan Štajduhar contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 15 - Unverified
Verification L1
Unclaimed author
3 works
0 followers
4 topics
4 close collaborators

Published work

3 published items

Preprint, 2022, arXiv

Deep embedded clustering algorithm for clustering PACS repositories

Creating large datasets of medical radiology images from several sources can be challenging because of the differences in the acquisition and storage standards. One possible way of controlling and/or assessing the image selection process is through medical image clustering. This, however, requires an efficient method for learning latent image representations. In this paper, we tackle the problem of fully-unsupervised clustering of medical images using pixel data only. We test the performance of several contemporary approaches, built on top of a convolutional autoencoder (CAE) - convolutional deep embedded clustering (CDEC) and convolutional improved deep embedded clustering (CIDEC) - and three approaches based on preset feature extraction - histogram of oriented gradients (HOG), local binary pattern (LBP) and principal component analysis (PCA). CDEC and CIDEC are end-to-end clustering solutions, involving simultaneous learning of latent representations and clustering assignments, whereas the remaining approaches rely on k-means clustering from fixed embeddings. We train the models on 30,000 images, and test them using a separate test set consisting of 8,000 images. We sampled the data from the PACS repository archive of the Clinical Hospital Centre Rijeka. For evaluation, we use silhouette score, homogeneity score and normalised mutual information (NMI) on two target parameters, closely associated with commonly occurring DICOM tags - Modality and anatomical region (adjusted BodyPartExamined tag). CIDEC attains an NMI score of 0.473 with respect to anatomical region, and CDEC attains an NMI score of 0.645 with respect to the tag Modality - both outperforming other commonly used feature descriptors.
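To illustrate the NMI metric named in the abstract above, here is a minimal pure-Python sketch. It assumes arithmetic-mean normalisation of the two entropies, which is one common convention; the abstract does not say which normalisation the authors used. The function names and the tag and cluster values in the example are made up for illustration and are not taken from the paper.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information between two labelings of the same items."""
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    joint = Counter(zip(a, b))
    return sum(
        (n_ij / n) * log(n * n_ij / (count_a[i] * count_b[j]))
        for (i, j), n_ij in joint.items()
    )


def nmi(a, b):
    """NMI normalised by the arithmetic mean of the two entropies (assumed)."""
    h_a, h_b = entropy(a), entropy(b)
    if h_a == 0 and h_b == 0:
        # Both labelings put everything in one group: treat as identical.
        return 1.0
    return mutual_information(a, b) / ((h_a + h_b) / 2)


# Hypothetical example: reference Modality tags vs. cluster assignments.
modality_tags = ["CT", "CT", "MR", "MR", "XA", "XA"]
cluster_ids = [0, 0, 1, 1, 1, 2]
print(round(nmi(modality_tags, cluster_ids), 3))
```

A score of 1 means the clusters reproduce the reference tags exactly (up to renaming), while a score of 0 means the two labelings are statistically independent.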

Preprint, 2022, arXiv

Intra-domain and cross-domain transfer learning for time series data -- How transferable are the features?

In practice, it is very demanding and sometimes impossible to collect datasets of tagged data large enough to successfully train a machine learning model, and one possible solution to this problem is transfer learning. This study aims to assess how transferable the features are between different domains of time series data, and under which conditions. The effects of transfer learning are observed in terms of predictive performance of the models and their convergence rate during training. In our experiment, we use reduced datasets of 1,500 and 9,000 data instances to mimic real-world conditions. Using the same scaled-down datasets, we trained two sets of machine learning models: those that were trained with transfer learning and those that were trained from scratch. Four machine learning models were used for the experiment. Transfer of knowledge was performed within the same domain of application (seismology), as well as between mutually different domains of application (seismology, speech, medicine, finance). We observe the predictive performance of the models and the convergence rate during the training. In order to confirm the validity of the obtained results, we repeated the experiments seven times and applied statistical tests to confirm the significance of the results. The general conclusion of our study is that transfer learning is very likely to either increase or not negatively affect the predictive performance of the model or its convergence rate. The collected data is analysed in more detail to determine which source and target domains are compatible for transfer of knowledge. We also analyse the effect of target dataset size and the selection of model and its hyperparameters on the effects of transfer learning.

Preprint, 2022, arXiv

ML-Based Approach for NFL Defensive Pass Interference Prediction Using GPS Tracking Data

Defensive Pass Interference (DPI) is one of the most impactful penalties in the NFL. DPI is a spot foul, yielding an automatic first down to the team in possession. With such an influence on the game, referees have no room for a mistake. It is also a very rare event, which happens 1-2 times per 100 pass attempts. With technology improving and many IoT wearables being put on the athletes to collect valuable data, there is a solid ground for applying machine learning (ML) techniques to improve every aspect of the game. The work presented here is the first attempt at predicting DPI using player tracking GPS data. The data we used was collected by NFL's Next Gen Stats throughout the 2018 regular season. We present ML models for highly imbalanced time-series binary classification: LSTM, GRU, ANN, and Multivariate LSTM-FCN. Results showed that using GPS tracking data to predict DPI has limited success. The best performing models had high recall with low precision, which resulted in the classification of many false positive examples. Looking closely at the data confirmed that there is just not enough information to determine whether a foul was committed. This study might serve as a filter for a multi-step pipeline for video sequence classification, which might be able to solve this problem.
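A small calculation shows what high recall with low precision means for an event this rare. The sketch below uses invented numbers: 100 pass plays, 2 of them DPI (in line with the stated base rate of 1-2 per 100), and a hypothetical classifier that flags 20 plays, including both true fouls. The function name and the classifier's behaviour are assumptions for illustration, not results from the paper.

```python
def precision_recall(y_true, y_pred):
    """Precision and recall for binary labels, where 1 marks the rare class."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical data: the first 2 of 100 plays are true DPI fouls.
y_true = [1, 1] + [0] * 98
# Hypothetical classifier: flags the first 20 plays, catching both fouls.
y_pred = [1] * 20 + [0] * 80

precision, recall = precision_recall(y_true, y_pred)
print(f"precision {precision:.2f}, recall {recall:.2f}")
# prints "precision 0.10, recall 1.00"
```

In this made-up case every foul is caught, yet nine out of ten flagged plays are false positives. That pattern is why such a model could suit a first-stage filter rather than a final decision.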