Researcher profile

Nino Antulov-Fantulin

Nino Antulov-Fantulin contributes to research discovery and scholarly infrastructure.


Trust snapshot

Quick read

Trust 21 - Emerging, Verification L1, Unclaimed author
17 works, 0 followers, 15 topics, 4 close collaborators


Published work

17 published item(s)

preprint, 2026, arXiv

Does Your Neural Network Extrapolate? Feature Engineering as Identifiability Bias for OOD Generalization

Successful deep neural networks discover salient features of data. We show when and why they fail to learn out-of-distribution (OOD)-relevant representations from an in-distribution (ID) training window. This requires decoupling feature learning from data-generating-process (DGP) identifiability. From a single training window, OOD extrapolation is non-identifiable: infinitely many DGPs are $\varepsilon$-observationally equivalent on the training data but diverge arbitrarily outside it, and no in-distribution criterion alone reliably breaks the tie. A structural commitment, the feature map, label map, and model class $(\varphi, \psi, \mathcal{M})$, dictates the assumed DGP and governs OOD generalization while leaving ID performance essentially unchanged. When architecture, pretraining, augmentation, input formats, or domain knowledge implicitly inject the missing commitment, the model succeeds. When it cannot infer OOD-relevant structure from ID evidence, it fails. Changing only the representation can make the same architecture, at the same in-distribution loss, differ by ${\sim}520\times$ out of distribution. When the commitment is correct and identifiable, OOD error vanishes. For example, Fourier coordinates turn periodic extrapolation into interpolation on $\mathbb{S}^1$. The same mechanism predicts outcomes in three natural-science settings (mass-action chemistry; Kepler's-third-law exoplanet prediction, $n=2{,}362$; and cross-species coding-DNA detection) and in a 264-run positional-encoding study across Transformer, Mamba, and S4D. Finally, a controlled study shows that correct features are necessary but not sufficient. The model class must express the target, and the transformed training data must cover the relevant representation space.
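The remark that Fourier coordinates turn periodic extrapolation into interpolation on the circle can be illustrated with a toy example. The sketch below is not the paper's experiment: it swaps the neural network for a plain least-squares linear fit, and the target function, window boundaries, and helper names (`fit_and_extrapolate`, `raw_features`, `fourier_features`) are all assumptions chosen for illustration.

```python
import numpy as np

def fit_and_extrapolate(features, t_train, t_test, target):
    """Least-squares linear fit on features(t_train); return the MSE on t_test."""
    X_train = features(t_train)
    w, *_ = np.linalg.lstsq(X_train, target(t_train), rcond=None)
    pred = features(t_test) @ w
    return float(np.mean((pred - target(t_test)) ** 2))

def raw_features(t):
    # Raw time coordinate plus a bias term: no periodic structure is assumed.
    return np.column_stack([t, np.ones_like(t)])

def fourier_features(t):
    # Map t onto the unit circle; points outside the training window
    # land on the same circle that the training points already cover.
    return np.column_stack([np.sin(t), np.cos(t), np.ones_like(t)])

def target(t):
    # Hypothetical periodic data-generating process.
    return np.sin(t + 0.5)

t_train = np.linspace(0.0, 4 * np.pi, 200)          # in-distribution window
t_test = np.linspace(20 * np.pi, 24 * np.pi, 200)   # far outside the window

mse_raw = fit_and_extrapolate(raw_features, t_train, t_test, target)
mse_fourier = fit_and_extrapolate(fourier_features, t_train, t_test, target)
print(f"raw: {mse_raw:.3e}, fourier: {mse_fourier:.3e}")
```

With the raw coordinate, the fitted line drifts away from the bounded target outside the window; with the circle coordinates, the test points fall inside the region already covered by training data, so the same linear model extrapolates essentially without error.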

preprint, 2022, arXiv

Can LSTM outperform volatility-econometric models?

Volatility prediction for financial assets is one of the essential questions for understanding financial risks and quadratic price variation. However, although many novel deep learning models were recently proposed, they still have a "hard time" surpassing strong econometric volatility models. Why is this the case? The volatility prediction task is of non-trivial complexity due to noise, market microstructure, heteroscedasticity, exogenous and asymmetric effect of news, and the presence of different time scales, among others. In this paper, we analyze the class of long short-term memory (LSTM) recurrent neural networks for the task of volatility prediction and compare it with strong volatility-econometric models.

preprint, 2022, arXiv

Simplifying Sparse Expert Recommendation by Revisiting Graph Diffusion

Community Question Answering (CQA) websites have become valuable knowledge repositories where individuals exchange information by asking and answering questions. With an ever-increasing number of questions and high migration of users in and out of communities, a key challenge is to design effective strategies for recommending experts for new questions. In this paper, we propose a simple graph-diffusion expert recommendation model for CQA that can outperform state-of-the-art deep learning representatives and collaborative models. Our proposed method learns users' expertise in the context of both semantic and temporal information to capture their changing interest and activity levels with time. Experiments on five real-world datasets from the Stack Exchange network demonstrate that our approach outperforms competitive baseline methods. Further, experiments on cold-start users (users with a limited historical record) show our model achieves an average performance gain of ~30% compared to the best baseline method.

preprint, 2022, arXiv

Topic Community Based Temporal Expertise for Question Routing

Question Routing (QR) in Community-based Question Answering (CQA) websites aims at recommending newly posted questions to potential users who are most likely to provide "accepted answers". Most of the existing approaches predict users' expertise based on their past question answering behavior and the content of new questions. However, these approaches suffer from challenges in three aspects: 1) sparsity of users' past records results in a lack of personalized recommendation that at times does not match users' interest or domain expertise, 2) modeling based on the content of all questions and answers makes periodic updates computationally expensive, and 3) while CQA sites are highly dynamic, they are mostly considered as static. This paper proposes a novel approach to QR that addresses the above challenges. It is based on dynamic modeling of users' activity on topic communities. Experimental results on three real-world datasets demonstrate that the proposed model significantly outperforms competitive baseline models.

preprint, 2022, arXiv

Volatility-inspired $\sigma$-LSTM cell

Volatility models of price fluctuations are well studied in the econometrics literature, with more than 50 years of theoretical and empirical findings. The recent advancements in neural networks (NN) in the deep learning field have naturally offered novel econometric modeling tools. However, there is still a lack of explainability and stylized knowledge about volatility modeling with neural networks; the use of stylized facts could help improve the performance of the NN for the volatility prediction task. In this paper, we investigate how the knowledge about the "physics" of the volatility process can be used as an inductive bias to design or constrain a cell state of long short-term memory (LSTM) for volatility forecasting. We introduce a new type of $\sigma$-LSTM cell with a stochastic processing layer, design its learning mechanism and show good out-of-sample forecasting performance.

preprint, 2021, arXiv

Implicit energy regularization of neural ordinary-differential-equation control

Although optimal control problems of dynamical systems can be formulated within the framework of variational calculus, their solution for complex systems is often analytically and computationally intractable. In this Letter we present a versatile neural ordinary-differential-equation control (NODEC) framework with implicit energy regularization and use it to obtain neural-network-generated control signals that can steer dynamical systems towards a desired target state within a predefined amount of time. We demonstrate the ability of NODEC to learn control signals that closely resemble those found by corresponding optimal control frameworks in terms of control energy and deviation from the desired target state. Our results suggest that NODEC is capable of solving a wide range of control and optimization problems, including those that are analytically intractable.

preprint, 2021, arXiv

Information dynamics of price and liquidity around the 2017 Bitcoin markets crash

We study the information dynamics between the largest Bitcoin exchange markets during the bubble in 2017-2018. By analysing high-frequency market-microstructure observables with different information theoretic measures for dynamical systems, we find temporal changes in information sharing across markets. In particular, we study the time-varying components of predictability, memory, and synchronous coupling, measured by transfer entropy, active information storage, and multi-information. By comparing these empirical findings with several models we argue that some results could relate to intra-market and inter-market regime shifts, and changes in direction of information flow between different market observables.

preprint, 2021, arXiv

Neural Ordinary Differential Equation Control of Dynamics on Graphs

We study the ability of neural networks to calculate feedback control signals that steer trajectories of continuous time non-linear dynamical systems on graphs, which we represent with neural ordinary differential equations (neural ODEs). To do so, we present a neural-ODE control (NODEC) framework and find that it can learn feedback control signals that drive graph dynamical systems into desired target states. While we use loss functions that do not constrain the control energy, our results show, in accordance with related work, that NODEC produces low energy control signals. Finally, we evaluate the performance and versatility of NODEC against well-known feedback controllers and deep reinforcement learning. We use NODEC to generate feedback controls for systems of more than one thousand coupled, non-linear ODEs that represent epidemic processes and coupled oscillators.

preprint, 2021, arXiv

On the accuracy of short-term COVID-19 fatality forecasts

Forecasting new cases, hospitalizations, and disease-induced deaths is an important part of infectious disease surveillance and helps guide health officials in implementing effective countermeasures. For disease surveillance in the U.S., the Centers for Disease Control and Prevention (CDC) combine more than 65 individual forecasts of these numbers in an ensemble forecast at national and state levels. We collected data on CDC ensemble forecasts of COVID-19 fatalities in the United States, and compared them with easily interpretable "Euler" forecasts serving as a model-free benchmark that is only based on the local rate of change of the incidence curve. The term "Euler method" is motivated by the eponymous numerical integration scheme that calculates the value of a function at a future time step based on the current rate of change. Our results show that CDC ensemble forecasts are not more accurate than "Euler" forecasts on short-term forecasting horizons of one week. However, CDC ensemble forecasts show a better performance on longer forecasting horizons. Using the current rate of change in incidences as estimates of future incidence changes is useful for epidemic forecasting on short time horizons. An advantage of the proposed method over other forecasting approaches is that it can be implemented with a very limited amount of work and without relying on additional data (e.g., human mobility and contact patterns) and high-performance computing systems.
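The idea of extrapolating from the current rate of change can be sketched in a few lines. This is a minimal illustration of a forward Euler step on a time series, not the paper's exact benchmark: the finite-difference rate estimate, the function name `euler_forecast`, and the sample numbers are assumptions.

```python
def euler_forecast(series, horizon=1):
    """Extrapolate the last value using the most recent one-step change."""
    if len(series) < 2:
        raise ValueError("need at least two observations")
    # Finite-difference estimate of the local rate of change.
    rate = series[-1] - series[-2]
    # Forward Euler step: assume the rate stays constant over the horizon.
    return series[-1] + horizon * rate

# Made-up weekly counts, for illustration only.
weekly_counts = [100.0, 120.0, 150.0]
one_week_ahead = euler_forecast(weekly_counts, horizon=1)
two_weeks_ahead = euler_forecast(weekly_counts, horizon=2)
print(one_week_ahead, two_weeks_ahead)
```

Because the rate is held constant, the error of such a forecast grows with the horizon, which is consistent with the abstract's observation that the benchmark is competitive only on short horizons.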

preprint, 2021, arXiv

Time-varying volatility in Bitcoin market and information flow at minute-level frequency

In this paper, we analyze the time series of minute price returns on the Bitcoin market using statistical models of the generalized autoregressive conditional heteroskedasticity (GARCH) family. Several mathematical models have been proposed in finance to model the dynamics of price returns, each of them introducing a different perspective on the problem, but none without shortcomings. We combine an approach that uses historical values of returns and their volatilities (the GARCH family of models) with the so-called "Mixture of Distribution Hypothesis", which states that the dynamics of price returns are governed by the information flow about the market. Using time series of Bitcoin-related tweets and volume of transactions as external information, we test for improvement in volatility prediction of several GARCH model variants on a minute-level Bitcoin price time series. Statistical tests show that the simplest model, GARCH(1,1), responds best to the addition of an external signal when modeling the volatility process on out-of-sample data.
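For readers unfamiliar with GARCH(1,1), the conditional variance follows the standard recursion sigma2[t] = omega + alpha * r[t-1]^2 + beta * sigma2[t-1]. The sketch below implements that recursion and, as an assumption, adds an external signal linearly through a hypothetical coefficient `gamma`; the abstract does not state how the paper incorporates tweets or volume, and parameter estimation is omitted entirely.

```python
def garch11_variance(returns, omega, alpha, beta, exog=None, gamma=0.0):
    """Conditional variance path of GARCH(1,1) with an optional linear external term."""
    if alpha + beta >= 1.0:
        raise ValueError("alpha + beta must be below 1 for a finite unconditional variance")
    n = len(returns)
    if exog is None:
        exog = [0.0] * n
    # Start from the unconditional variance of the plain GARCH(1,1) process.
    sigma2 = [omega / (1.0 - alpha - beta)]
    for t in range(1, n):
        sigma2.append(
            omega
            + alpha * returns[t - 1] ** 2   # reaction to the last squared return
            + beta * sigma2[t - 1]          # persistence of past variance
            + gamma * exog[t - 1]           # assumed linear effect of the external signal
        )
    return sigma2

# Made-up returns and parameters, for illustration only.
path = garch11_variance([1.0, 0.0, 0.0], omega=0.1, alpha=0.1, beta=0.8)
print(path)
```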

preprint, 2020, arXiv

Low-dimensional statistical manifold embedding of directed graphs

We propose a novel node embedding of directed graphs to statistical manifolds, which is based on a global minimization of pairwise relative entropy and graph geodesics in a non-linear way. Each node is encoded with a probability density function over a measurable space. Furthermore, we analyze the connection between the geometrical properties of such embedding and their efficient learning procedure. Extensive experiments show that our proposed embedding is better in preserving the global geodesic information of graphs, as well as outperforming existing embedding models on directed graphs in a variety of evaluation metrics, in an unsupervised setting.

preprint, 2020, arXiv

Unifying continuous, discrete, and hybrid susceptible-infected-recovered processes on networks

Waiting times between two consecutive infection and recovery events in spreading processes are often assumed to be exponentially distributed, which results in Markovian (i.e., memoryless) continuous spreading dynamics. However, this assumption does not take into account memory (correlation) effects and discrete interactions that have been identified as relevant in social, transportation, and disease dynamics. We introduce a framework to model continuous, discrete, and hybrid forms of (non-)Markovian susceptible-infected-recovered (SIR) stochastic processes on networks. The hybrid SIR processes that we study in this paper describe infections as discrete-time Markovian and recovery events as continuous-time non-Markovian processes, which mimic the distribution of cell cycles. Our results suggest that the effective-infection-rate description of epidemic processes fails to uniquely capture the behavior of such hybrid and also general non-Markovian disease dynamics. Providing a unifying description of general Markovian and non-Markovian disease outbreaks, we instead show that the mean transmissibility produces the same phase diagrams independent of the underlying inter-event-time distributions.

preprint, 2014, arXiv

Synthetic sequence generator for recommender systems - memory biased random walk on sequence multilayer network

Personalized recommender systems rely on each user's personal usage data in the system in order to assist in decision making. However, privacy policies protecting users' rights prevent these highly personal data from being publicly available to a wider research audience. In this work, we propose a memory biased random walk model on a multilayer sequence network as a generator of synthetic sequential data for recommender systems. We demonstrate the applicability of the synthetic data in training recommender system models for cases when privacy policies restrict clickstream publishing.

preprint, 2013, arXiv

Statistical inference framework for source detection of contagion processes on arbitrary network structures

In this paper we introduce a statistical inference framework for estimating the contagion source from a partially observed contagion spreading process on an arbitrary network structure. The framework is based on a maximum likelihood estimation of a partial epidemic realization and involves large-scale simulation of contagion spreading processes from the set of potential source locations. We present a number of different likelihood estimators that are used to determine the conditional probabilities associated with observing a partial epidemic realization given particular source location candidates. This statistical inference framework is also applicable to arbitrary compartment contagion spreading processes on networks. We compare the estimation accuracy of these approaches in a number of computational experiments performed with the SIR (susceptible-infected-recovered), SI (susceptible-infected) and ISS (ignorant-spreading-stifler) contagion spreading models on synthetic and real-world complex networks.

preprint, 2012, arXiv

Epidemic centrality - is there an underestimated epidemic impact of network peripheral nodes?

In the study of disease spreading on empirical complex networks in the SIR model, initially infected nodes can be ranked according to some measure of their epidemic impact. The highest ranked nodes, also referred to as "superspreaders", are associated with dominant epidemic risks and therefore deserve special attention. In simulations on the studied empirical complex networks, it is shown that the ranking depends on the dynamical regime of the disease spreading. A possible mechanism leading to this dependence is illustrated in an analytically tractable example. In systems where the allocation of resources to counter disease spreading to individual nodes is based on their ranking, the dynamical regime of disease spreading is frequently not known before the outbreak of the disease. Therefore, we introduce a quantity called epidemic centrality, an average over all relevant regimes of disease spreading, as a basis of the ranking. A recently introduced concept of the phase diagram of epidemic spreading is used as a framework in which several types of averaging are studied. The epidemic centrality is compared to structural properties of nodes such as node degree, k-cores and betweenness. Epidemic centrality tends to grow with degree and k-core values, but the variation of epidemic centrality is much smaller than the variation of degree or k-core values. It is found that the epidemic centrality of the structurally peripheral nodes is of the same order of magnitude as the epidemic centrality of the structurally central nodes. The implications of these findings for the distribution of resources to counter disease spreading are discussed.

preprint, 2012, arXiv

FastSIR Algorithm: A Fast Algorithm for simulation of epidemic spread in large networks by using SIR compartment model

Epidemic spreading on arbitrary complex networks is studied in the SIR (Susceptible-Infected-Recovered) compartment model. We propose our implementation of a Naive SIR algorithm for simulating epidemic spreading on networks that uses data structures efficiently to reduce running time. The Naive SIR algorithm models the full epidemic dynamics and can easily be upgraded to a parallel version. We also propose a novel algorithm for simulating epidemic spreading on networks, called the FastSIR algorithm, that has a better average-case running time than the Naive SIR algorithm. The FastSIR algorithm uses a novel approach to reduce the average-case running time by a constant factor by using probability distributions of the number of infected nodes. Moreover, the FastSIR algorithm does not follow epidemic dynamics in time, but still captures all infection transfers. Furthermore, we also propose an efficient recursive method for calculating probability distributions of the number of infected nodes. The average-case running time of both algorithms has also been derived, and an experimental analysis was made on five different empirical complex networks.
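To make the setting concrete, a time-stepped SIR simulation on a network can be sketched as follows. This is neither the authors' Naive SIR implementation nor FastSIR: it assumes a discrete-time model in which each infected node infects each susceptible neighbour with probability `p` per step and recovers with probability `q` per step, and the function name and adjacency-dictionary format are illustrative choices.

```python
import random

def simulate_sir(adjacency, source, p, q, rng):
    """Run one discrete-time SIR realization; return the set of nodes ever infected."""
    if q <= 0.0:
        raise ValueError("q must be positive so that the epidemic terminates")
    infected = {source}
    recovered = set()
    while infected:
        newly_infected = set()
        for node in infected:
            for neighbour in adjacency[node]:
                susceptible = neighbour not in infected and neighbour not in recovered
                if susceptible and rng.random() < p:
                    newly_infected.add(neighbour)
        # Only nodes infected before this step may recover in this step.
        newly_recovered = {node for node in infected if rng.random() < q}
        infected = (infected | newly_infected) - newly_recovered
        recovered |= newly_recovered
    return recovered

# Path graph 0 - 1 - 2, for illustration only.
path_graph = {0: [1], 1: [0, 2], 2: [1]}
full_outbreak = simulate_sir(path_graph, source=0, p=1.0, q=1.0, rng=random.Random(0))
contained = simulate_sir(path_graph, source=0, p=0.0, q=1.0, rng=random.Random(0))
print(full_outbreak, contained)
```

A simulation of this kind follows the epidemic step by step in time, which is the behaviour the abstract attributes to the naive approach and contrasts with FastSIR.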

preprint, 2009, arXiv

Phase diagram of epidemic spreading - unimodal vs. bimodal probability distributions

Disease spreading on complex networks is studied in the SIR model. Simulations on empirical complex networks reveal two specific regimes of disease spreading: local containment and epidemic outbreak. The variables measuring the extent of disease spreading are in general characterized by a bimodal probability distribution. Phase diagrams of disease spreading for empirical complex networks are introduced. A theoretical model of disease spreading on an m-ary tree is investigated both analytically and in simulations. It is shown that the model reproduces qualitative features of phase diagrams of disease spreading observed in empirical complex networks. The role of the tree-like structure of complex networks in disease spreading is discussed.