Source author record

Nikolaos Laoutaris

Nikolaos Laoutaris appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


cs.CY, Networking and Internet Architecture, Social and Information Networks, Computer Science and Game Theory, Databases, physics.soc-ph

Catalog footprint

What is connected

8 works

6 topics

4 close collaborators


Preprint, 2022, arXiv

A Survey of Data Marketplaces and Their Business Models

"Data" is becoming an indispensable production factor, just like land, infrastructure, labor or capital. As part of this, a myriad of applications in different sectors require huge amounts of information to feed models and algorithms responsible for critical roles in production chains and business processes. Tasks ranging from automating certain functions to facilitating decision-making in data-driven organizations increasingly benefit from acquiring data inputs from third parties. Responding to this demand, new entities and novel business models have appeared with the aim of matching such data requirements with the right providers and facilitating the exchange of information. In this paper, we present the results and conclusions of a comprehensive survey on the state of the art of entities trading data on the internet, as well as novel data marketplace designs from the research community.

Preprint, 2020, arXiv

Computing the Relative Value of Spatio-Temporal Data in Wholesale and Retail Data Marketplaces

Spatio-temporal information is used for driving a plethora of intelligent transportation, smart-city, and crowd-sensing applications. Since data is now considered a valuable production factor, data marketplaces have appeared to help individuals and enterprises bring it to market to satisfy the ever-growing demand. In such marketplaces, several sources may need to combine their data in order to meet the requirements of different applications. In this paper, we study the problem of estimating the relative value of different spatio-temporal datasets combined in wholesale and retail marketplaces for the purpose of predicting demand in metropolitan areas. Using as case studies large datasets of taxi rides from Chicago and New York, we ask questions such as "When does it make sense for different taxi companies to combine their data?", and "How should different companies be compensated for the data that they share?". We then turn our attention to the even harder problem of establishing the relative value of the data brought to retail marketplaces by individual drivers. Overall, we show that simplistic but popular approaches for estimating the relative value of data, such as using volume, or the "leave-one-out" heuristic, are inaccurate. Instead, more complex notions of value from economics and game theory, such as the Shapley value, need to be employed if one wishes to capture the complex effects of mixing different datasets on the accuracy of forecasting algorithms. Applying the Shapley value to large datasets from many sources is, of course, computationally challenging. We resort to structured sampling and manage to compute accurately the importance of thousands of data sources. We show that the relative value of the data held by different taxi companies and drivers may differ substantially, and that its relative ranking may change from district to district within a metropolitan area.
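To make the contrast between leave-one-out and the Shapley value concrete, here is a minimal Python sketch. It is not the structured sampling used in the paper: it swaps in plain Monte Carlo sampling of random orderings, a standard approximation. The three sources, the `COVERAGE` table and the `districts_covered` value function are hypothetical toy stand-ins for a forecasting-accuracy metric.

```python
import random
from typing import Callable, Dict, FrozenSet, Sequence

# Hypothetical toy data: which districts each source has rides for.
COVERAGE = {"A": {1, 2, 3}, "B": {1, 2, 3}, "C": {4}}


def districts_covered(coalition: FrozenSet[str]) -> float:
    """Toy value function (assumption): number of distinct districts covered."""
    covered = set()
    for source in coalition:
        covered |= COVERAGE[source]
    return float(len(covered))


def leave_one_out(players: Sequence[str],
                  value: Callable[[FrozenSet[str]], float]) -> Dict[str, float]:
    """Value lost when a single source is removed from the full set."""
    full = frozenset(players)
    return {p: value(full) - value(full - {p}) for p in players}


def shapley_monte_carlo(players: Sequence[str],
                        value: Callable[[FrozenSet[str]], float],
                        num_permutations: int = 2000,
                        seed: int = 0) -> Dict[str, float]:
    """Estimate Shapley values by averaging marginal contributions
    over randomly sampled orderings of the players."""
    rng = random.Random(seed)
    totals = {p: 0.0 for p in players}
    order = list(players)
    for _ in range(num_permutations):
        rng.shuffle(order)
        coalition = set()
        previous = value(frozenset(coalition))
        for p in order:
            coalition.add(p)
            current = value(frozenset(coalition))
            totals[p] += current - previous
            previous = current
    return {p: total / num_permutations for p, total in totals.items()}


sources = ["A", "B", "C"]
loo = leave_one_out(sources, districts_covered)
phi = shapley_monte_carlo(sources, districts_covered)
```

In this toy setting, sources A and B hold identical data, so leave-one-out assigns each of them zero value even though together they cover three of the four districts. The sampled Shapley estimate splits that contribution roughly evenly between them instead.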

Preprint, 2015, arXiv

From advertising profits to bandwidth prices - A quantitative methodology for negotiating premium peering

We have developed a first-of-its-kind methodology for deriving bandwidth prices for premium direct peering between Access ISPs (A-ISPs) and Content and Service Providers (CSPs) that want to deliver content and services in premium quality. Our methodology establishes a direct link between service profitability, e.g., from advertising, user- and subscriber-loyalty, interconnection costs, and finally bandwidth price for peering. Unlike existing work in both the networking and economics literature, our resulting computational model, built around Nash bargaining, can be used for deriving quantitative results comparable to actual market prices. We analyze the US market and derive prices for video that compare favorably with existing prices for transit and paid peering. We also observe that the fair prices returned by the model for high-profit/low-volume services, such as search, are orders of magnitude higher than current bandwidth prices. This implies that resolving existing (fierce) interconnection tussles may require per-service, instead of wholesale, peering between A-ISPs and CSPs. Our model can be used for deriving initial benchmark prices for such negotiations.

Preprint, 2015, arXiv

Testing for common sense (violation) in airline pricing or how complexity asymmetry defeated you and the web

We have collected and analysed prices for more than 1.4 million flight tickets involving 63 destinations and 125 airlines and have found that common sense violations, i.e., discrepancies between what consumers would expect and what truly holds for those prices, are far more frequent than one would think. For example, oftentimes the price of a single-leg flight is higher than two-leg flights that include it under similar terms of travel (class, luggage allowance, etc.). This happened for up to 24.5% of available fares on a specific route in our dataset, invalidating the common expectation that "further is more expensive". Likewise, we found several two-leg fares where buying each leg independently leads to lower overall cost than buying them together as a single ticket. This happened for up to 37% of available fares on a specific route, invalidating the common expectation that "bundling saves money". Last, several single-stop tickets in which the two legs were separated by 1-5 days (called multicity fares) were oftentimes found to cost more than corresponding back-to-back fares with a small transit time. This was found to occur in up to 7.5% of fares on a specific route, invalidating the expectation that "a short transit is better than a longer one".
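The first two expectations can be written as simple predicates over fares. The sketch below is illustrative only: the `Fare` record, the function names and the example prices are all hypothetical, and it ignores the "similar terms of travel" matching (class, luggage allowance, etc.) that a real comparison would need.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Fare:
    """Hypothetical fare record: airports visited in order, and the price."""
    airports: Tuple[str, ...]  # e.g. ("MAD", "LHR") or ("MAD", "LHR", "JFK")
    price: float


def contains_leg(two_leg: Fare, single: Fare) -> bool:
    """True if the single-leg fare is one of the legs of the two-leg fare."""
    legs = list(zip(two_leg.airports, two_leg.airports[1:]))
    return len(single.airports) == 2 and tuple(single.airports) in legs


def violates_further_is_more_expensive(single: Fare, two_leg: Fare) -> bool:
    """A single leg costs more than a two-leg itinerary that includes it."""
    return contains_leg(two_leg, single) and single.price > two_leg.price


def violates_bundling_saves_money(first: Fare, second: Fare, bundle: Fare) -> bool:
    """Buying the two legs separately is cheaper than the bundled ticket."""
    chained = first.airports + second.airports[1:]
    return (first.airports[-1] == second.airports[0]
            and chained == bundle.airports
            and first.price + second.price < bundle.price)


# Made-up prices, for illustration only.
mad_lhr = Fare(("MAD", "LHR"), 220.0)
lhr_jfk = Fare(("LHR", "JFK"), 400.0)
cheap_bundle = Fare(("MAD", "LHR", "JFK"), 210.0)
pricey_bundle = Fare(("MAD", "LHR", "JFK"), 700.0)

further_violation = violates_further_is_more_expensive(mad_lhr, cheap_bundle)
bundling_violation = violates_bundling_saves_money(mad_lhr, lhr_jfk, pricey_bundle)
```

Counting how often such predicates hold across all fares on a route would give per-route violation rates of the kind the abstract reports.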

Preprint, 2014, arXiv

Assessing the Potential of Ride-Sharing Using Mobile and Social Data

Ride-sharing on the daily home-work-home commute can help individuals save on gasoline and other car-related costs, while at the same time it can reduce traffic and pollution. This paper assesses the potential of ride-sharing for reducing traffic in a city, based on mobility data extracted from 3G Call Description Records (CDRs, for the cities of Barcelona and Madrid) and from Online Social Networks (Twitter, collected for the cities of New York and Los Angeles). We first analyze these data sets to understand mobility patterns, home and work locations, and social ties between users. We then develop an efficient algorithm for matching users with similar mobility patterns, considering a range of constraints. The solution provides an upper bound to the potential reduction of cars in a city that can be achieved by ride-sharing. We use our framework to understand the effect of different constraints and city characteristics on this potential benefit. For example, our study shows that traffic in the city of Madrid can be reduced by 59% if users are willing to share a ride with people who live and work within 1 km; if they can only accept a pick-up and drop-off delay up to 10 minutes, this potential benefit drops to 24%; if drivers also pick up passengers along the way, this number increases to 53%. If users are willing to ride only with people they know ("friends" in the CDR and OSN data sets), the potential of ride-sharing becomes negligible; if they are willing to ride with friends of friends, the potential reduction is up to 31%.

Preprint, 2013, arXiv

Crowd-assisted Search for Price Discrimination in E-Commerce: First results

After years of speculation, price discrimination in e-commerce driven by the personal information that users leave (involuntarily) online has started attracting the attention of privacy researchers, regulators, and the press. In our previous work we demonstrated instances of products whose prices varied online depending on the location and the characteristics of prospective online buyers. In an effort to scale up our study we have turned to crowd-sourcing. Using a browser extension we have collected the prices obtained by an initial set of 340 test users as they surf the web for products of their interest. This initial dataset has permitted us to identify a set of online stores where price variation is more pronounced. We have focused on this subset, and performed a systematic crawl of their products and logged the prices obtained from different vantage points and browser configurations. By analyzing this dataset we see that there exist several retailers that return prices for the same product that vary by 10%-30% whereas there also exist isolated cases that may vary up to a multiplicative factor, e.g., x2. To the best of our efforts we could not attribute the observed price gaps to currency, shipping, or taxation differences.

Preprint, 2011, arXiv

Deep Diving into BitTorrent Locality

A substantial amount of work has recently gone into localizing BitTorrent traffic within an ISP in order to avoid excessive and oftentimes unnecessary transit costs. Several architectures and systems have been proposed and the initial results from specific ISPs and a few torrents have been encouraging. In this work we attempt to deepen and scale our understanding of locality and its potential. Looking at specific ISPs, we consider tens of thousands of concurrent torrents, and thus capture ISP-wide implications that cannot be appreciated by looking at only a handful of torrents. Secondly, we go beyond individual case studies and present results for the top 100 ISPs in terms of number of users represented in our dataset of up to 40K torrents involving more than 3.9M concurrent peers and more than 20M in the course of a day spread in 11K ASes. We develop scalable methodologies that permit us to process this huge dataset and answer questions such as: "what is the minimum and the maximum transit traffic reduction across hundreds of ISPs?", "what are the win-win boundaries for ISPs and their users?", "what is the maximum amount of transit traffic that can be localized without requiring fine-grained control of inter-AS overlay connections?", "what is the impact to transit traffic from upgrades of residential broadband speeds?".

Preprint, 2007, arXiv

A bounded-degree network formation game

Motivated by applications in peer-to-peer and overlay networks, we define and study the Bounded Degree Network Formation (BDNF) game. In an $(n,k)$-BDNF game, we are given $n$ nodes, a bound $k$ on the out-degree of each node, and a weight $w_{vu}$ for each ordered pair $(v,u)$ representing the traffic rate from node $v$ to node $u$. Each node $v$ uses up to $k$ directed links to connect to other nodes with an objective to minimize its average distance, using weights $w_{vu}$, to all other destinations. We study the existence of pure Nash equilibria for $(n,k)$-BDNF games. We show that if the weights are arbitrary, then a pure Nash wiring may not exist. Furthermore, it is NP-hard to determine whether a pure Nash wiring exists for a given $(n,k)$-BDNF instance. A major focus of this paper is on uniform $(n,k)$-BDNF games, in which all weights are 1. We describe how to construct a pure Nash equilibrium wiring given any $n$ and $k$, and establish that in all pure Nash wirings the cost of individual nodes cannot differ by more than a factor of nearly 2, whereas the diameter cannot exceed $O(\sqrt{n \log_k n})$. We also analyze best-response walks on the configuration space defined by the uniform game, and show that starting from any initial configuration, strong connectivity is reached within $\Theta(n^2)$ rounds. Convergence to a pure Nash equilibrium, however, is not guaranteed. We present simulation results that suggest that loop-free best-response walks always exist, but may not be polynomially bounded. We also study a special family of regular wirings, the class of Abelian Cayley graphs, in which all nodes imitate the same wiring pattern, and show that if $n$ is sufficiently large no such regular wiring can be a pure Nash equilibrium.
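For small instances of the uniform game, the definitions above can be checked by brute force. The Python sketch below is an illustration, not the paper's construction or algorithm. It assumes a wiring is a dictionary mapping each node to a tuple of out-neighbours, uses the sum of hop distances as the cost (equivalent to the average distance when all weights are 1), and treats an unreachable destination as infinite cost, which is an assumption of this sketch. The function names are hypothetical.

```python
import math
from collections import deque
from itertools import combinations
from typing import Dict, Tuple

Wiring = Dict[int, Tuple[int, ...]]


def node_cost(wiring: Wiring, v: int) -> float:
    """Sum of hop distances from v to all other nodes (uniform weights).
    Assumption: the cost is infinite if some node is unreachable."""
    dist = {v: 0}
    queue = deque([v])
    while queue:  # breadth-first search over directed links
        u = queue.popleft()
        for w in wiring[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    if len(dist) < len(wiring):
        return math.inf
    return float(sum(dist.values()))


def best_response_cost(wiring: Wiring, v: int, k: int) -> float:
    """Lowest cost v can reach by rewiring its own (at most k) out-links."""
    others = [u for u in wiring if u != v]
    best = node_cost(wiring, v)
    for size in range(1, k + 1):
        for links in combinations(others, size):
            trial = dict(wiring)
            trial[v] = links
            best = min(best, node_cost(trial, v))
    return best


def is_pure_nash(wiring: Wiring, k: int) -> bool:
    """True if no node can strictly lower its cost by deviating alone."""
    return all(best_response_cost(wiring, v, k) >= node_cost(wiring, v)
               for v in wiring)


# Directed ring on 4 nodes with k = 1: each node links to its successor.
ring = {i: ((i + 1) % 4,) for i in range(4)}
# A wiring on 3 nodes where node 1 can improve by linking to node 2.
unstable = {0: (1,), 1: (0,), 2: (0,)}

ring_is_nash = is_pure_nash(ring, k=1)
unstable_is_nash = is_pure_nash(unstable, k=1)
```

The search enumerates every subset of up to $k$ out-links per node, so it is only practical for very small $n$ and $k$.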