Researcher profile

Paolo D'Alberto

Paolo D'Alberto contributes to research discovery and scholarly infrastructure.

Published work

5 published item(s)

Preprint, 2015, arXiv

Mapping and Matching Algorithms: Data Mining by Adaptive Graphs

Assume we have two bijective functions $U(x)$ and $M(x)$, with $U, M: \mathbb{N} \rightarrow \mathbb{N}$ and $M(x) \neq U(x)$ for all $x$. Every day and in different locations, we see the results of $U$ and $M$ without seeing $x$. We are assured of neither the time stamp nor the order within the day, but at least the location is fully defined. We want to find the matching between $U(x)$ and $M(x)$ without ever knowing $x$. We formulate this problem as adaptive graph mining: we develop the theory, the solution, and the implementation. This work stems from a practical problem, hence our definitions. The solution is simple and clear, and the implementation is parallel and efficient. In our experience, the problem and the solution are novel, and we want to share our findings.
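The abstract does not spell out the algorithm, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes that each (day, location) bucket exposes the set of observed $U$ values and the set of observed $M$ values for the same hidden inputs, and it prunes the candidate edges of a bipartite graph as buckets arrive. The function name `candidate_matches` and the input layout are hypothetical.

```python
def candidate_matches(observations):
    """Narrow down which M value can pair with each U value.

    observations: iterable of (u_values, m_values) pairs, one pair per
    (day, location) bucket. Assumption: both sets in a bucket come from
    the same hidden inputs x, in unknown order.
    """
    candidates = {}
    for u_values, m_values in observations:
        m_set = set(m_values)
        for u in set(u_values):
            if u in candidates:
                # Keep only the edges confirmed by this bucket as well.
                candidates[u] &= m_set
            else:
                # First sighting: every M value in the bucket is possible.
                candidates[u] = set(m_set)
    return candidates


buckets = [
    ({"u1", "u2"}, {"m1", "m2"}),
    ({"u1", "u3"}, {"m1", "m3"}),
    ({"u2", "u3"}, {"m2", "m3"}),
]
matches = candidate_matches(buckets)
```

In this toy input each $U$ value ends with exactly one surviving candidate. With real data some candidate sets may stay ambiguous until more buckets are seen.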

Preprint, 2015, arXiv

Multiple-Campaign Ad-Targeting Deployment: Parallel Response Modeling, Calibration and Scoring Without Personal User Information

We present a vertical introduction to campaign optimization; that is, the ability to predict the user response to an ad campaign, on average and for each exposed ad, without any user profiles. In practice, we present an approach to building a polytomous (multi-response) model composed of several hundred binary models, each a generalized linear model. The theory was introduced twenty years ago and has been applied in different fields since then. Here, we show how we optimize hundreds of campaigns and how this large number of campaigns may overcome a few characteristic caveats of single-campaign optimization. We discuss the problem and solution of training and calibration at scale. We present statistical performance as coverage, precision, and recall, as used in classification. We also discuss the potential performance as throughput: how many decisions can be made per second while streaming the bid auctions, including with dedicated hardware.
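As a rough illustration of a multi-response model assembled from independent binary models, the sketch below scores one feature vector against several logistic models (a common generalized linear model) and reports the highest-scoring campaign. The weights, campaign names, and the argmax selection rule are assumptions made for the example; the abstract does not state how scores are combined or calibrated.

```python
import math


def sigmoid(z):
    """Logistic link: maps a linear score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def score_campaigns(models, features):
    """Score one feature vector with every binary model.

    models: dict mapping a campaign name to (bias, weights); hypothetical layout.
    features: list of non-personal context features, same length as weights.
    """
    scores = {}
    for name, (bias, weights) in models.items():
        z = bias + sum(w * x for w, x in zip(weights, features))
        scores[name] = sigmoid(z)
    return scores


def best_campaign(models, features):
    """Pick the campaign with the highest predicted response (assumed rule)."""
    scores = score_campaigns(models, features)
    return max(scores, key=scores.get)


# Hypothetical, hand-written weights; a real system would fit them to data.
models = {
    "campaign_a": (0.0, [1.0, 0.0]),
    "campaign_b": (0.0, [0.0, 1.0]),
}
scores = score_campaigns(models, [2.0, -1.0])
winner = best_campaign(models, [2.0, -1.0])
```

Because each binary model is scored independently, the loop over campaigns parallelizes naturally, which is in line with the parallel scoring the abstract describes.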

Preprint, 2012, arXiv

A Heterogeneous Accelerated Matrix Multiplication: OpenCL + APU + GPU+ Fast Matrix Multiply

As users and developers, we are witnessing the opening of a new computing scenario: the introduction of hybrid processors into a single die, such as an accelerated processing unit (APU) processor, and the plug-and-play of additional graphics processing units (GPUs) onto a single motherboard. These APU processors provide multiple symmetric cores with their memory hierarchies and an integrated GPU. Moreover, these processors are designed to work with external GPUs that can push the peak performance towards the TeraFLOPS boundary. We present a case study for the development of dense Matrix Multiplication (MM) codes for matrix sizes up to $19K \times 19K$, thus using all of the above computational engines, and an achievable peak performance of 200 GFLOPS for, literally, a made-at-home build. We present the results of our experience, the quirks, the pitfalls, the achieved performance, and the achievable peak performance.
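To put the quoted figures in perspective, the short calculation below estimates the running time of one classical dense multiplication at the largest size and the stated peak rate. It assumes the standard count of roughly $2n^3$ floating-point operations and reads 19K as 19,000; both are assumptions of this sketch, and fast algorithms perform fewer operations than the classical count.

```python
def matmul_flops(n):
    """Approximate operation count of a classical n x n dense multiply."""
    return 2 * n ** 3


def seconds_at(n, gflops):
    """Time for one classical multiply at a sustained rate given in GFLOPS."""
    return matmul_flops(n) / (gflops * 1e9)


# Assumption: 19K is read as 19,000 and the 200 GFLOPS peak is sustained.
estimate = seconds_at(19_000, 200)
```

Under these assumptions the estimate is a little over a minute per multiplication. Since 200 GFLOPS is a peak figure, measured times would be at least this long.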

Preprint, 2012, arXiv

Non-Parametric Methods Applied to the N-Sample Series Comparison

Anomaly and similarity detection in multidimensional series have a long history and have found practical usage in many different fields such as medicine, networks, and finance. Anomaly detection is of great appeal for many different disciplines; for example, mathematicians searching for a unified mathematical formulation based on probability, statisticians searching for error bound estimates, and computer scientists who are trying to design fast algorithms, to name just a few. In summary, we have two contributions. First, we present a self-contained survey of the most promising methods being used in the fields of machine learning, statistics, and bio-informatics today. We include discussions of conformal prediction, kernels in the Hilbert space, Kolmogorov's information measure, and non-parametric cumulative distribution function comparison methods (NCDF). Second, building upon this foundation, we provide a powerful NCDF method for series with small dimensionality. Through a combination of data organization and statistical tests, we describe extensions that scale well with increased dimensionality.
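The abstract does not say which statistic its NCDF method uses. As a generic example of a non-parametric comparison of cumulative distribution functions, the sketch below computes the two-sample Kolmogorov-Smirnov statistic, the largest gap between two empirical CDFs. This is a standard test chosen for illustration, not necessarily the one used in the paper.

```python
from bisect import bisect_right


def ks_statistic(xs, ys):
    """Two-sample Kolmogorov-Smirnov statistic for one-dimensional samples.

    Returns the maximum absolute difference between the empirical CDFs
    of xs and ys, evaluated at every observed value.
    """
    xs_sorted = sorted(xs)
    ys_sorted = sorted(ys)
    n, m = len(xs_sorted), len(ys_sorted)
    if n == 0 or m == 0:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    for v in xs_sorted + ys_sorted:
        fx = bisect_right(xs_sorted, v) / n  # empirical CDF of xs at v
        fy = bisect_right(ys_sorted, v) / m  # empirical CDF of ys at v
        d = max(d, abs(fx - fy))
    return d


same = ks_statistic([1, 2, 3], [1, 2, 3])
shifted = ks_statistic([1, 2, 3, 4], [3, 4, 5, 6])
disjoint = ks_statistic([1, 2], [3, 4])
```

A value near 0 suggests similar distributions and a value near 1 suggests very different ones. Turning the statistic into a decision still requires a significance threshold, which is omitted here.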

Preprint, 2011, arXiv

On the Weakenesses of Correlation Measures used for Search Engines' Results (Unsupervised Comparison of Search Engine Rankings)

The correlation of the result lists provided by search engines is fundamental and has deep and multidisciplinary ramifications. Here, we present automatic and unsupervised methods to assess whether or not search engines provide results that are comparable or correlated. We have two main contributions. First, we provide evidence that for more than 80% of the input queries, independently of their frequency, the two major search engines share only three or fewer URLs in their search results, leading to an increasing divergence. In this scenario (divergence), we show that even the most robust measures based on comparing lists are useless to apply; that is, the small contribution of too few common items provides no confidence. Second, to overcome this problem, we propose the first content-based measures, i.e., direct comparison of the contents of search results; these measures are based on the Jaccard ratio and distribution similarity measures (CDF measures). We show that they are orthogonal to each other (i.e., Jaccard and distribution) and extend the discriminative power w.r.t. list-based measures. Our approach stems from the real need to compare search-engine results; it is automatic from the query selection to the final evaluation, and it applies to any geographical market. It is thus designed to scale and to serve as a first filtering step of query selection (necessary) for supervised methods.
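The Jaccard ratio mentioned above is the size of the intersection of two sets divided by the size of their union. The sketch below applies it to the words found in two sets of result snippets and also counts shared URLs. The whitespace tokenization, the sample data, and the function names are assumptions for illustration; the abstract does not describe how content is extracted.

```python
def jaccard(a, b):
    """Jaccard ratio |A & B| / |A | B| of two iterables treated as sets."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0  # convention chosen here: two empty sets count as identical
    return len(sa & sb) / len(sa | sb)


def shared_urls(results_a, results_b):
    """Number of URLs that appear in both result lists."""
    return len(set(results_a) & set(results_b))


def content_jaccard(snippets_a, snippets_b):
    """Jaccard ratio over the lowercase words of two snippet collections."""
    words_a = {w for s in snippets_a for w in s.lower().split()}
    words_b = {w for s in snippets_b for w in s.lower().split()}
    return jaccard(words_a, words_b)


# Hypothetical result lists and snippets for a single query.
urls_a = ["a.example/1", "b.example/2", "c.example/3"]
urls_b = ["b.example/2", "c.example/3", "d.example/4"]
url_ratio = jaccard(urls_a, urls_b)
overlap = shared_urls(urls_a, urls_b)
text_ratio = content_jaccard(["Fast matrix multiply"], ["fast matrix algorithms"])
```

Comparing content rather than URLs can still yield a usable signal when the two lists share few or no URLs, which is the divergence scenario the abstract describes.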