Source author record

Andrea Zaccaria

Andrea Zaccaria appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher

Unclaimed source record

physics.soc-ph, q-fin.EC, econ.GN, q-fin.TR, Computation and Language, cond-mat.dis-nn, Digital Libraries, Machine Learning, physics.data-an, q-fin.GN, q-fin.ST, Social and Information Networks

Catalog footprint

What is connected

12 works

12 topics

4 close collaborators

preprint 2022 arXiv

A Bayesian approach to translators' reliability assessment

Translation Quality Assessment (TQA) is a process conducted by human translators and is widely used, both for estimating the performance of (increasingly used) Machine Translation, and for finding an agreement between translation providers and their customers. While translation scholars are aware of the importance of having a reliable way to conduct the TQA process, it seems that there is limited literature that tackles the issue of reliability with a quantitative approach. In this work, we consider the TQA as a complex process from the point of view of physics of complex systems and approach the reliability issue from the Bayesian paradigm. Using a dataset of translation quality evaluations (in the form of error annotations), produced entirely by the Professional Translation Service Provider Translated SRL, we compare two Bayesian models that parameterise the following features involved in the TQA process: the translation difficulty, the characteristics of the translators involved in producing the translation, and of those assessing its quality - the reviewers. We validate the models in an unsupervised setting and show that it is possible to get meaningful insights into translators even with just one review per translation; subsequently, we extract information like translators' skills and reviewers' strictness, as well as their consistency in their respective roles. Using this, we show that the reliability of reviewers cannot be taken for granted even in the case of expert translators: a translator's expertise can induce a cognitive bias when reviewing a translation produced by another translator. The most expert translators, however, are characterised by the highest level of consistency, both in translating and in assessing the translation quality.

preprint 2022 arXiv

Machine learning to assess relatedness: the advantage of using firm-level data

The relatedness between a country or a firm and a product is a measure of the feasibility of that economic activity. As such, it is a driver for investments at a private and institutional level. Traditionally, relatedness is measured using networks derived from country-level co-occurrences of product pairs, that is, by counting how many countries export both. In this work, we compare networks and machine learning algorithms trained not only on country-level data but also on firm-level data, a comparison that has received little attention because firm-level data are scarce. We quantitatively compare the different measures of relatedness by using them to forecast exports at the country and firm level, assuming that more related products have a higher likelihood of being exported in the future. Our results show that relatedness is scale-dependent: the best assessments are obtained by using machine learning on the same type of data one wants to predict. Moreover, we found that while relatedness measures based on country data are not suitable for firms, firm-level data are also very informative for the development of countries. In this sense, models built on firm data provide a better assessment of relatedness. We also discuss the effect of using parameter optimization and community detection algorithms to identify clusters of related companies and products, finding that a partition into a higher number of blocks decreases the computational time while maintaining a prediction performance well above the network-based benchmarks.
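The country-level co-occurrence count mentioned in this abstract can be illustrated with a minimal sketch. The toy export data and the function name `cooccurrences` are assumptions made for illustration; this shows only the raw counting step, not the normalization or the machine learning models the authors compare.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: the set of products each country exports.
exports = {
    "A": {"wine", "cars", "shoes"},
    "B": {"wine", "shoes"},
    "C": {"cars", "chips"},
    "D": {"wine", "cars", "chips"},
}

def cooccurrences(exports_by_country):
    """Count, for every unordered product pair, how many countries export both."""
    counts = Counter()
    for products in exports_by_country.values():
        # Sorting gives each pair a single canonical key, e.g. ("cars", "wine").
        for pair in combinations(sorted(products), 2):
            counts[pair] += 1
    return counts

counts = cooccurrences(exports)
```

In this toy example `counts[("cars", "wine")]` is 2, because countries A and D export both products. A pair that no country exports together is simply absent from the counter and reads as 0.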

preprint 2022 arXiv

Meta-validation of bipartite network projections

Monopartite projections of bipartite networks are useful tools for modeling indirect interactions in complex systems. The standard approach to identify significant links is statistical validation using a suitable null network model, such as the popular configuration model (CM) that constrains node degrees and randomizes everything else. However, different CM formulations exist, depending on how the constraints are imposed and for which sets of nodes. Here we systematically investigate the application of these formulations in validating the same network, showing that they lead to different results even when the same significance threshold is used. Instead, a much better agreement is obtained for the same density of validated links. We thus propose a meta-validation approach that allows one to identify model-specific significance thresholds for which the signal is strongest, and at the same time to obtain results independent of the way in which the null hypothesis is formulated. We illustrate this procedure using data on scientific production of world countries.

preprint 2022 arXiv

The different structure of economic ecosystems at the scales of companies and countries

A key element to understand complex systems is the relationship between the spatial scale of investigation and the structure of the interrelation among its elements. When it comes to economic systems, it is now well-known that the country-product bipartite network exhibits a nested structure, which is the foundation of different algorithms that have been used to scientifically investigate countries' development and forecast national economic growth. Changing the subject from countries to companies, a significantly different scenario emerges. Through the analysis of a unique dataset of Italian firms' exports and a worldwide dataset comprising countries' exports, here we find that, while a globally nested structure is observed at the country level, a local, in-block nested structure emerges at the level of firms. Remarkably, this in-block nestedness is statistically significant with respect to suitable null models and the algorithmic partitions of products into blocks have a high correspondence with exogenous product classifications. These findings lay a solid foundation for developing a scientific approach based on the physics of complex systems to the analysis of companies, which has been lacking until now.

preprint 2022 arXiv

The trickle down from environmental innovation to productive complexity

We study the empirical relationship between green technologies and industrial production at very fine-grained levels by employing Economic Complexity techniques. Firstly, we use patent data on green technology domains as a proxy for competitive green innovation and data on exported products as a proxy for competitive industrial production. Secondly, with the aim of observing how green technological development trickles down into industrial production, we build a bipartite directed network linking single green technologies at time $t_1$ to single products at time $t_2 \ge t_1$ on the basis of their time-lagged co-occurrences in the technological and industrial specialization profiles of countries. Thirdly, we filter the links in the network by employing a maximum entropy null model. In particular, we find that the industrial sectors most connected to green technologies are related to the processing of raw materials, which we know to be crucial for the development of clean energy innovations. Furthermore, by looking at the evolution of the network over time, we observe that more complex green technological know-how requires more time to be transmitted to industrial production, and is also linked to more complex products.

preprint 2020 arXiv

Dynamical approach to Zipf's law

The rank-size plots of a large number of different physical and socio-economic systems are usually said to follow Zipf's law, but a unique framework for the comprehension of this ubiquitous scaling law is still lacking. Here we show that a dynamical approach is crucial: during their evolution, some systems are attracted towards Zipf's law, while others present Zipf's law only temporarily and, therefore, spuriously. A truly Zipfian dynamics is characterized by a dynamical constraint, or coherence, among the parameters of the generating PDF and the number of elements in the system. A clear-cut example of such coherence is natural language. Our framework allows us to derive some quantitative results that go well beyond the usual Zipf's law: i) earthquakes can evolve only incoherently and thus show Zipf's law spuriously; this allows an assessment of the largest possible magnitude of an earthquake occurring in a geographical region. ii) We prove that Zipfian dynamics are not additive, explaining analytically why US cities evolve coherently, while world cities do not. iii) Our concept of coherence can be used for model selection; for example, the Yule-Simon process can describe the dynamics of world countries' GDP. iv) World cities present spurious Zipf's law and we use this property for estimating the maximal population of an urban agglomeration.
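For readers unfamiliar with rank-size plots, the following minimal sketch shows the static check that the abstract argues is not sufficient on its own: sort the sizes, rank them, and fit a straight line in log-log space. The function name `rank_size_slope` and the synthetic data are assumptions for illustration; the dynamical coherence analysis described in the abstract is not reproduced here.

```python
import math

def rank_size_slope(sizes):
    """Least-squares slope of log(size) against log(rank), largest size ranked 1."""
    ordered = sorted(sizes, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ordered) + 1)]
    ys = [math.log(size) for size in ordered]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

# Hypothetical sizes that follow size = 1000 / rank exactly.
zipf_sizes = [1000 / r for r in range(1, 51)]
slope = rank_size_slope(zipf_sizes)
```

A slope close to -1 is what is usually reported as Zipf's law. A single snapshot with this slope cannot distinguish a system that is attracted to the law from one that shows it only temporarily, which is the distinction the abstract draws.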

preprint 2019 arXiv

Where is your field going? A Machine Learning approach to study the relative motion of the domains of Physics

We propose an original approach to describe the scientific progress in a quantitative way. Using innovative Machine Learning techniques we create a vector representation for the PACS codes and we use them to represent the relative movements of the various domains of Physics in a multi-dimensional space. This methodology unveils about 25 years of scientific trends, enables us to predict innovative couplings of fields, and illustrates how Nobel Prize papers and APS milestones drive the future convergence of previously unrelated fields.

preprint 2016 arXiv

Investigating the interplay between fundamentals of national research systems: performance, investments and international collaborations

We discuss, at the macro-level of nations, the contribution of research funding and rate of international collaboration to research performance, with important implications for the science of science policy. In particular, we cross-correlate suitable measures of these quantities with a scientometric-based assessment of scientific success, studying both the average performance of nations and their temporal dynamics in the space defined by these variables during the last decade. We find significant differences among nations in terms of efficiency in turning (financial) input into bibliometrically measurable output, and we confirm that growth of international collaboration positively correlates with scientific success, with significant benefits brought by EU integration policies. Various geo-cultural clusters of nations naturally emerge from our analysis. We critically discuss the factors that may determine the observed patterns.

preprint 2015 arXiv

How log-normal is your country? An analysis of the statistical distribution of the exported volumes of products

We have considered the statistical distributions of the volumes of the different products exported by 148 countries. We have found that the form of these distributions is not unique but heavily depends on the level of development of the nation, as expressed by macroeconomic indicators like GDP, GDP per capita, total export and a recently introduced measure for countries' economic complexity called fitness. We have identified three major classes: a) an incomplete log-normal shape, truncated on the left side, for the less developed countries, b) a complete log-normal, with a wider range of volumes, for nations characterized by intermediate economy, and c) a strongly asymmetric shape for countries with a high degree of development. The ranking curves of the exported volumes from each country seldom cross each other, showing a clear hierarchy of export volumes. Finally, the log-normality hypothesis has been checked for the distributions of all the 148 countries through different tests, Kolmogorov-Smirnov and Cramér-von Mises, confirming that it cannot be rejected only for the countries of intermediate economy.
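As a rough illustration of a Kolmogorov-Smirnov check of log-normality, the sketch below computes the KS distance between the logarithms of a sample and a normal distribution fitted to them. The function name `ks_lognormal_statistic` and the synthetic samples are assumptions; the abstract does not describe how the authors set up their tests. Note that when the mean and standard deviation are estimated from the same data, the standard KS critical values no longer apply, so the statistic is used here only for comparison between samples.

```python
import math
import random
from statistics import NormalDist, mean, stdev

def ks_lognormal_statistic(volumes):
    """KS distance between log(volumes) and a normal distribution fitted to them."""
    logs = sorted(math.log(v) for v in volumes)
    fitted = NormalDist(mean(logs), stdev(logs))
    n = len(logs)
    d = 0.0
    for i, x in enumerate(logs):
        cdf = fitted.cdf(x)
        # Compare the fitted CDF with the empirical CDF just before and at x.
        d = max(d, cdf - i / n, (i + 1) / n - cdf)
    return d

# Hypothetical samples: one log-normal by construction, one clearly not.
rng = random.Random(0)
lognormal_sample = [rng.lognormvariate(10.0, 2.0) for _ in range(2000)]
uniform_sample = [rng.uniform(1.0, 2.0) for _ in range(2000)]
d_lognormal = ks_lognormal_statistic(lognormal_sample)
d_uniform = ks_lognormal_statistic(uniform_sample)
```

The log-normal sample should give the smaller distance of the two. Turning the distance into a rejection decision would require critical values appropriate to fitted parameters, which this sketch does not compute.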

preprint 2015 arXiv

Liquidity crises on different time scales

We present an empirical analysis of the microstructure of financial markets and, in particular, of the static and dynamic properties of liquidity. We find that on relatively large time scales (15 minutes) large price fluctuations are connected to the failure of the subtle mechanism of compensation between the flows of market and limit orders: in other words, the missed revelation of the latent order book breaks the dynamical equilibrium between the flows, triggering the large price jumps. On smaller time scales (30 seconds), instead, the static depletion of the limit order book is an indicator of an intrinsic fragility of the system, which is related to a strongly nonlinear enhancement of the response. In order to quantify this phenomenon we introduce a measure of the liquidity imbalance present in the book and we show that it is correlated with both the sign and the magnitude of the next price movement. These findings provide a quantitative definition of the effective liquidity, which turns out to depend strongly on the time scale considered.
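The abstract does not define its liquidity imbalance measure. As a hedged illustration of the general idea only, the sketch below uses a commonly seen normalized volume imbalance between the bid and ask sides of a limit order book. The function name `volume_imbalance`, the `depth` parameter and the example volumes are all assumptions, and this should not be read as the authors' definition.

```python
def volume_imbalance(bid_volumes, ask_volumes, depth=5):
    """Normalized imbalance in [-1, 1] over the best `depth` levels per side."""
    bid = sum(bid_volumes[:depth])
    ask = sum(ask_volumes[:depth])
    total = bid + ask
    if total == 0:
        return 0.0  # empty book: no imbalance can be defined, report neutral
    return (bid - ask) / total

# Hypothetical resting volumes, best price level first.
bids = [300, 200, 100]
asks = [100, 50, 50]
imbalance = volume_imbalance(bids, asks)
```

Here the bid side holds 600 units against 200 on the ask side, so the imbalance is 0.5. In this convention a positive value means more resting volume on the bid side.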

preprint 2014 arXiv

How the Taxonomy of Products Drives the Economic Development of Countries

We introduce an algorithm able to reconstruct the relevant network structure on which the time evolution of country-product bipartite networks takes place. The significant links are obtained by selecting the largest values of the projected matrix. We first perform a number of tests of this filtering procedure on synthetic cases and a toy model. Then we analyze the bipartite network constituted by countries and exported products, using two databases for a total of almost 50 years. It is then possible to build a hierarchically directed network, in which the taxonomy of products emerges in a natural way. We study the influence of the structure of this taxonomy network on countries' development; in particular, guided by an example taken from the industrialization of South Korea, we link the structure of the taxonomy network to the empirical temporal connections between product activations, finding that the most relevant edges for countries' development are the ones suggested by our network. These results suggest paths in the product space which are easier to achieve, and so can drive countries' policies in the industrialization process.

preprint 2011 arXiv

Memory effects in stock price dynamics: evidences of technical trading

Technical trading represents a class of investment strategies for financial markets based on the analysis of trends and recurrent patterns of price time series. According to standard economic theories, these strategies should not be used because they cannot be profitable. On the contrary, it is well known that technical traders exist and operate on different time scales. In this paper we investigate whether technical trading produces detectable signals in price time series and whether some kind of memory effect is introduced in the price dynamics. In particular, we focus on a specific figure called supports and resistances. We first develop a criterion to detect the potential values of supports and resistances. As a second step, we show that memory effects in the price dynamics are associated with these selected values. In fact, we show that prices are more likely to rebound off these values than to cross them. Such an effect is quantitative evidence of the so-called self-fulfilling prophecy, that is, the self-reinforcement of agents' belief and sentiment about future stock prices' behavior.
