Researcher profile

Andrea Zaccaria

Andrea Zaccaria contributes to research discovery and scholarly infrastructure.

Researcher, Affiliation not imported, Open to collaborate

Trust snapshot

Quick read

Trust 21 (Emerging), Verification L1, Unclaimed author
7 works
0 followers
8 topics
4 close collaborators

Published work

7 published items

Preprint, 2022, arXiv

A Bayesian approach to translators' reliability assessment

Translation Quality Assessment (TQA) is a process conducted by human translators and is widely used, both for estimating the performance of (increasingly used) Machine Translation, and for finding an agreement between translation providers and their customers. While translation scholars are aware of the importance of having a reliable way to conduct the TQA process, it seems that there is limited literature that tackles the issue of reliability with a quantitative approach. In this work, we consider the TQA as a complex process from the point of view of physics of complex systems and approach the reliability issue from the Bayesian paradigm. Using a dataset of translation quality evaluations (in the form of error annotations), produced entirely by the Professional Translation Service Provider Translated SRL, we compare two Bayesian models that parameterise the following features involved in the TQA process: the translation difficulty, the characteristics of the translators involved in producing the translation, and of those assessing its quality - the reviewers. We validate the models in an unsupervised setting and show that it is possible to get meaningful insights into translators even with just one review per translation; subsequently, we extract information like translators' skills and reviewers' strictness, as well as their consistency in their respective roles. Using this, we show that the reliability of reviewers cannot be taken for granted even in the case of expert translators: a translator's expertise can induce a cognitive bias when reviewing a translation produced by another translator. The most expert translators, however, are characterised by the highest level of consistency, both in translating and in assessing the translation quality.

Preprint, 2022, arXiv

Machine learning to assess relatedness: the advantage of using firm-level data

The relatedness between a country or a firm and a product is a measure of the feasibility of that economic activity. As such, it is a driver for investments at a private and institutional level. Traditionally, relatedness is measured using networks derived from country-level co-occurrences of product pairs, that is, by counting how many countries export both. In this work, we compare networks and machine learning algorithms trained not only on country-level data but also on firm-level data, a setting that has been little studied because firm-level data are scarce. We quantitatively compare the different measures of relatedness by using them to forecast exports at the country and firm level, assuming that more related products have a higher likelihood of being exported in the future. Our results show that relatedness is scale-dependent: the best assessments are obtained by using machine learning on the same typology of data one wants to predict. Moreover, we found that while relatedness measures based on country data are not suitable for firms, firm-level data are also very informative for the development of countries. In this sense, models built on firm data provide a better assessment of relatedness. We also discuss the effect of using parameter optimization and community detection algorithms to identify clusters of related companies and products, finding that a partition into a higher number of blocks decreases the computational time while maintaining a prediction performance well above the network-based benchmarks.
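The traditional network measure mentioned in this abstract, counting how many countries export both products of a pair, can be illustrated with a minimal Python sketch. The function name `product_cooccurrences` and the toy export data are hypothetical and are not taken from the paper; the sketch covers only the raw co-occurrence count, not the machine learning models or any normalization the authors may apply.

```python
from collections import Counter
from itertools import combinations


def product_cooccurrences(exports):
    """Count, for each unordered product pair, how many countries export both.

    `exports` maps a country name to the set of products it exports.
    Pairs are stored as alphabetically sorted tuples.
    """
    counts = Counter()
    for products in exports.values():
        for pair in combinations(sorted(set(products)), 2):
            counts[pair] += 1
    return counts


# Hypothetical toy data, for illustration only.
exports = {
    "A": {"wine", "cheese", "cars"},
    "B": {"wine", "cheese"},
    "C": {"cars", "steel"},
}

cooc = product_cooccurrences(exports)
for pair, n in sorted(cooc.items()):
    print(pair, n)
```

Because `Counter` returns 0 for missing keys, pairs that never co-occur (such as cheese and steel in the toy data) can be queried without special handling.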

Preprint, 2022, arXiv

Meta-validation of bipartite network projections

Monopartite projections of bipartite networks are useful tools for modeling indirect interactions in complex systems. The standard approach to identify significant links is statistical validation using a suitable null network model, such as the popular configuration model (CM) that constrains node degrees and randomizes everything else. However, different CM formulations exist, depending on how the constraints are imposed and for which sets of nodes. Here we systematically investigate the application of these formulations in validating the same network, showing that they lead to different results even when the same significance threshold is used. Instead, a much better agreement is obtained for the same density of validated links. We thus propose a meta-validation approach that allows us to identify model-specific significance thresholds for which the signal is strongest, and at the same time to obtain results independent of the way in which the null hypothesis is formulated. We illustrate this procedure using data on scientific production of world countries.
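To make the idea of statistically validating a projected link concrete, here is a minimal Python sketch. It deliberately swaps in a simpler null than the configuration models compared in the paper: a hypergeometric test that fixes the degrees of the two nodes being compared and assumes their neighbors are drawn uniformly at random from the opposite layer. The function name `hypergeom_pvalue` and the numbers in the example are hypothetical.

```python
from math import comb


def hypergeom_pvalue(n_total, deg_a, deg_b, shared):
    """P(X >= shared) for X ~ Hypergeometric(n_total, deg_a, deg_b).

    n_total: number of nodes in the opposite layer
    deg_a, deg_b: degrees of the two nodes being compared
    shared: observed number of common neighbors
    """
    upper = min(deg_a, deg_b)
    favorable = sum(
        comb(deg_a, k) * comb(n_total - deg_a, deg_b - k)
        for k in range(shared, upper + 1)
    )
    return favorable / comb(n_total, deg_b)


# Hypothetical example: 10 nodes in the opposite layer, both nodes have
# degree 3, and all 3 neighbors are shared.
p_value = hypergeom_pvalue(10, 3, 3, 3)
print(p_value)
```

A link in the projection would be kept when its p-value falls below a chosen significance threshold, usually after a multiple-testing correction. The abstract's point is that the validated set depends on which null model and threshold are used, so this sketch should be read as one possible choice among several.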

Preprint, 2022, arXiv

The different structure of economic ecosystems at the scales of companies and countries

A key element to understand complex systems is the relationship between the spatial scale of investigation and the structure of the interrelation among its elements. When it comes to economic systems, it is now well-known that the country-product bipartite network exhibits a nested structure, which is the foundation of different algorithms that have been used to scientifically investigate countries' development and forecast national economic growth. Changing the subject from countries to companies, a significantly different scenario emerges. Through the analysis of a unique dataset of Italian firms' exports and a worldwide dataset comprising countries' exports, here we find that, while a globally nested structure is observed at the country level, a local, in-block nested structure emerges at the level of firms. Remarkably, this in-block nestedness is statistically significant with respect to suitable null models and the algorithmic partitions of products into blocks have a high correspondence with exogenous product classifications. These findings lay a solid foundation for developing a scientific approach based on the physics of complex systems to the analysis of companies, which has been lacking until now.

Preprint, 2022, arXiv

The trickle down from environmental innovation to productive complexity

We study the empirical relationship between green technologies and industrial production at very fine-grained levels by employing Economic Complexity techniques. Firstly, we use patent data on green technology domains as a proxy for competitive green innovation and data on exported products as a proxy for competitive industrial production. Secondly, with the aim of observing how green technological development trickles down into industrial production, we build a bipartite directed network linking single green technologies at time $t_1$ to single products at time $t_2 \ge t_1$ on the basis of their time-lagged co-occurrences in the technological and industrial specialization profiles of countries. Thirdly, we filter the links in the network by employing a maximum entropy null-model. In particular, we find that the industrial sectors most connected to green technologies are related to the processing of raw materials, which we know to be crucial for the development of clean energy innovations. Furthermore, by looking at the evolution of the network over time, we observe that more complex green technological know-how requires more time to be transmitted to industrial production, and is also linked to more complex products.
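The second step of this abstract, linking technologies at time $t_1$ to products at time $t_2 \ge t_1$ through time-lagged co-occurrences, can be sketched in Python as follows. The function name `lagged_cooccurrences`, the data layout, and the toy country profiles are assumptions made for illustration; the sketch only counts raw co-occurrences and does not include the maximum entropy null-model filter described in the third step.

```python
from collections import Counter


def lagged_cooccurrences(tech_at_t1, products_at_t2):
    """Count countries specialized in a technology at t1 and a product at t2.

    tech_at_t1: maps country -> set of technologies at time t1
    products_at_t2: maps country -> set of products at time t2 (t2 >= t1)
    Returns a Counter keyed by directed (technology, product) pairs.
    """
    counts = Counter()
    for country, techs in tech_at_t1.items():
        products = products_at_t2.get(country, set())
        for tech in techs:
            for product in products:
                counts[(tech, product)] += 1
    return counts


# Hypothetical toy specialization profiles, for illustration only.
tech_at_t1 = {
    "A": {"solar", "wind"},
    "B": {"solar"},
}
products_at_t2 = {
    "A": {"refined copper"},
    "B": {"refined copper", "glass"},
}

links = lagged_cooccurrences(tech_at_t1, products_at_t2)
for (tech, product), weight in sorted(links.items()):
    print(f"{tech} -> {product}: {weight}")
```

Each key is a directed edge from a technology to a product, and its weight is the number of countries supporting that edge. In the paper these raw weights are then compared against a null model before a link is retained.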

Preprint, 2020, arXiv

Dynamical approach to Zipf's law

The rank-size plots of a large number of different physical and socio-economic systems are usually said to follow Zipf's law, but a unique framework for the comprehension of this ubiquitous scaling law is still lacking. Here we show that a dynamical approach is crucial: during their evolution, some systems are attracted towards Zipf's law, while others present Zipf's law only temporarily and, therefore, spuriously. A truly Zipfian dynamics is characterized by a dynamical constraint, or coherence, among the parameters of the generating PDF, and the number of elements in the system. A clear-cut example of such coherence is natural language. Our framework allows us to derive some quantitative results that go well beyond the usual Zipf's law: i) earthquakes can evolve only incoherently and thus show Zipf's law spuriously; this allows an assessment of the largest possible magnitude of an earthquake occurring in a geographical region. ii) We prove that Zipfian dynamics are not additive, explaining analytically why US cities evolve coherently, while world cities do not. iii) Our concept of coherence can be used for model selection, for example, the Yule-Simon process can describe the dynamics of world countries' GDP. iv) World cities present spurious Zipf's law and we use this property for estimating the maximal population of an urban agglomeration.
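For readers unfamiliar with rank-size plots, the following Python sketch shows the static check that is commonly meant by "following Zipf's law": sizes sorted in decreasing order should scale roughly as the inverse of their rank, so a log-log fit of size against rank gives a slope near -1. This is a crude diagnostic based on ordinary least squares, not the dynamical framework of the paper, and the function name `rank_size_slope` and the synthetic data are hypothetical.

```python
import math


def rank_size_slope(sizes):
    """Least-squares slope of log(size) against log(rank).

    Sizes are ranked in decreasing order, with rank 1 for the largest.
    All sizes must be strictly positive.
    """
    ordered = sorted(sizes, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ordered) + 1)]
    ys = [math.log(size) for size in ordered]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic sizes that follow size = 1000 / rank exactly.
sizes = [1000.0 / rank for rank in range(1, 51)]
slope = rank_size_slope(sizes)
print(round(slope, 6))
```

A slope close to -1 at a single point in time does not distinguish between the coherent and spurious cases discussed in the abstract, which is why the authors argue that the evolution of the system has to be considered.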

Preprint, 2019, arXiv

Where is your field going? A Machine Learning approach to study the relative motion of the domains of Physics

We propose an original approach to describe the scientific progress in a quantitative way. Using innovative Machine Learning techniques we create a vector representation for the PACS codes and we use them to represent the relative movements of the various domains of Physics in a multi-dimensional space. This methodology unveils about 25 years of scientific trends, enables us to predict innovative couplings of fields, and illustrates how Nobel Prize papers and APS milestones drive the future convergence of previously unrelated fields.