Topic overview

cs.CY

4196 works · 13952 researchers


preprint · 2017 · arXiv

The leveled approach. Using and evaluating text mining tools AVResearcherXL and Texcavator for historical research on public perceptions of drugs

We introduce our explorative historical leveled approach that we use to understand drug debates in the Royal Dutch Library's digital newspaper archive. In this approach we alternate between distant reading and close reading. Furthermore, we use this approach to evaluate two text mining tools: AVResearcherXL and Texcavator.

preprint · 2016 · arXiv

Unique Sense: Smart Computing Prototype for Industry 4.0 Revolution with IOT and Bigdata Implementation Model

Today, computing architectures are one of the most complex and constrained developing areas in research, delivering solutions to computational problems in different domains from the stack above them. Constraints on architectural integration make it difficult to customize and modify a system for dynamic industrial and business needs. This model is a first step towards a solution for the needs of Industry 4.0 and big data. The Unique Sense smart computing implementation model for Industry 4.0 presents a smart computing prototype that is part of the UNIQUE SENSE computing architecture, which can deliver an alternative to today's computing architectures, satisfy the needs of future generations of diverse technologies and techniques, and extend support to ubiquitous environments. Industry 4.0 fundamentally involves many chained, interlinked processes that also hold valuable information, so the model is designed as an integrated, fault-tolerant data processing system. The implementation model is constructed to provide smart control and a self-accessible system for next-generation cyber-physical machines and automation control systems. Al

preprint · 2017 · arXiv

Optimized, Direct Sale of Privacy in Personal-Data Marketplaces

Very recently, a number of start-ups have emerged that enable individuals to sell their private data directly to brokers and businesses. While this new paradigm may shift the balance of power between individuals and companies that harvest data, it raises some practical, fundamental questions for users of these services: how they should decide which data must be vended and which data protected, and what a good deal is. In this work, we investigate a mechanism that aims at helping users address these questions. The investigated mechanism relies on a hard-privacy model and allows users to share partial or complete profile data with broker companies in exchange for an economic reward. The theoretical analysis of the trade-off between privacy and money posed by such a mechanism is the object of this work. We adopt a generic measure of privacy, although part of our analysis focuses on some important examples of Bregman divergences. We find a parametric solution to the problem of optimal exchange of privacy for money, obtain a closed-form expression, and characterize the trade-off between profile-disclosure risk and economic reward for several interesting cases.
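
The abstract above mentions Bregman divergences as privacy measures without saying which ones are studied. As a hedged illustration only, the sketch below uses the standard definition D_F(p, q) = F(p) - F(q) - <grad F(q), p - q> and checks numerically that choosing F as negative entropy recovers the Kullback-Leibler divergence between two probability distributions. The function names and the example profiles `p` and `q` are hypothetical and not taken from the paper.

```python
import math

def bregman_divergence(F, grad_F, p, q):
    """Generic Bregman divergence D_F(p, q) for a convex generator F."""
    inner = sum(g * (pi - qi) for g, pi, qi in zip(grad_F(q), p, q))
    return F(p) - F(q) - inner

def neg_entropy(p):
    """Generator F(p) = sum_i p_i log p_i (assumes all p_i > 0)."""
    return sum(x * math.log(x) for x in p)

def grad_neg_entropy(p):
    """Gradient of the negative entropy: log p_i + 1 per coordinate."""
    return [math.log(x) + 1.0 for x in p]

def kl_divergence(p, q):
    """Kullback-Leibler divergence between two distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# Hypothetical example: an actual profile p and a disclosed profile q,
# both modeled as probability distributions over three categories.
p = [0.5, 0.3, 0.2]
q = [0.2, 0.5, 0.3]
d_bregman = bregman_divergence(neg_entropy, grad_neg_entropy, p, q)
d_kl = kl_divergence(p, q)
```

Swapping in F(p) = sum of squares (with gradient 2p) would give the squared Euclidean distance instead, which is another common member of the family.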

preprint · 2017 · arXiv

Leveraging Multi-aspect Time-related Influence in Location Recommendation

Point-Of-Interest (POI) recommendation aims to mine a user's visiting history and find her/his potentially preferred places. Although location recommendation methods have been studied and improved extensively, challenges remain with respect to employing various influences, including the temporal aspect. Inspired by the fact that time includes numerous granular slots (e.g. minute, hour, day, week, etc.), in this paper we define a new problem: performing recommendation by exploiting all diversified temporal factors. In particular, we argue that most existing methods only focus on a limited number of time-related features and neglect others. Furthermore, considering a specific granularity (e.g. time of day) in recommendation cannot always apply to each user or each dataset. To address these challenges, we propose a probabilistic generative model, named Multi-aspect Time-related Influence (MATI), to promote POI recommendation. We also develop a novel optimization algorithm based on Expectation Maximization (EM). Our MATI model first detects a user's temporal multivariate orientation using her check-in log in Location-based Social Networks (LBSNs). It then performs reco

preprint · 2016 · arXiv

Automatic Data Deformation Analysis on Evolving Folksonomy Driven Environment

The Folksodriven framework makes it possible for data scientists to define an ontology environment in which to search for buried patterns that have some kind of predictive power, so that predictive models can be built more effectively. It accomplishes this through abstractions that isolate the parameters of the predictive modeling process, both in searching for patterns and in designing the feature set. To reflect evolving knowledge, this paper considers ontologies based on folksonomies according to a new concept structure called "Folksodriven" to represent folksonomies. Studies on the transformational regulation of Folksodriven tags are therefore regarded as important for adaptive folksonomy classification in an evolving environment used by Intelligent Systems to represent knowledge sharing. Folksodriven tags are used to categorize salient data points so that they can be fed to a machine-learning system, thereby "featurizing" the data.

preprint · 2016 · arXiv

Digital Advertising Traffic Operation: Machine Learning for Process Discovery

In a Web Advertising Traffic Operation it is necessary to manage the day-to-day trafficking, pacing and optimization of digital and paid social campaigns. The data analyst in Traffic Operation can not only provide answers quickly but also speak the language of the Process Manager and visually display the discovered process problems. In order to address a growing number of complaints in the customer service process, the weaknesses in the process itself must be identified and communicated to the department. With the help of Process Mining on the CRM data it is possible to identify unwanted loops and delays in the process. In this paper we propose a process discovery approach based on a Machine Learning technique to automatically discover variations, detect at first glance what the problem is, and undertake corrective measures.

preprint · 2017 · arXiv

Privacy-Preserving Data Analysis for the Federal Statistical Agencies

Government statistical agencies collect enormously valuable data on the nation's population and business activities. Wide access to these data enables evidence-based policy making, supports new research that improves society, facilitates training for students in data science, and provides resources for the public to better understand and participate in their society. These data also affect the private sector. For example, the Employment Situation in the United States, published by the Bureau of Labor Statistics, moves markets. Nonetheless, government agencies are under increasing pressure to limit access to data because of a growing understanding of the threats to data privacy and confidentiality. "De-identification" - stripping obvious identifiers like names, addresses, and identification numbers - has been found inadequate in the face of modern computational and informational resources. Unfortunately, the problem extends even to the release of aggregate data statistics. This counter-intuitive phenomenon has come to be known as the Fundamental Law of Information Recovery. It says that overly accurate estimates of too many statistics can completely destroy privacy. One

preprint · 2016 · arXiv

Managing Commercial HVAC Systems: What do Building Operators Really Need?

Buildings form an essential part of modern life; people spend a significant amount of their time in them, and they consume large amounts of energy. A variety of systems provide services such as lighting, air conditioning and security, which building operators manage using Building Management Systems (BMS). To better understand the capability of current BMS and characterize common practices of building operators, we investigated their use across five institutions in the US. We interviewed ten operators and discovered that BMS do not address a number of key concerns for the management of buildings. Our analysis is rooted in the everyday work of building operators and highlights a number of design suggestions to help improve the user experience and management of BMS, ultimately leading to improvements in productivity, as well as building comfort and energy efficiency.

preprint · 2016 · arXiv

Quantifying Retail Agglomeration using Diverse Spatial Data

Newly available data on the spatial distribution of retail activities in cities makes it possible to build models formalized at the level of the single retailer. Current models tackle consumer location choices at an aggregate level, and the opportunity that new data offers for modeling at the retail unit level lacks a theoretical framework. The model we present here helps to address these issues. It is a particular case of the Cross-Nested Logit model, based on random utility theory and built with the aim of quantifying the role of floor space and agglomeration in retail location choice. We test this model on the city of London: the results are consistent with a superlinear scaling of a retailer's attractiveness with its floor space, and with an agglomeration effect approximated as the total retail floor space within a 325 m radius of each shop.
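
The agglomeration term in the abstract above (total retail floor space within a fixed radius of each shop) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: shops are hypothetical `(x, y, floor_space)` tuples with coordinates in metres on a flat projected plane, and each shop's own floor space is counted in its total, which the abstract does not specify.

```python
import math

def agglomeration(shops, radius_m=325.0):
    """Total floor space within radius_m of each shop.

    shops: list of (x_m, y_m, floor_space) tuples (hypothetical layout).
    Assumption: a shop's own floor space is included in its total.
    """
    totals = []
    for x_i, y_i, _ in shops:
        total = 0.0
        for x_j, y_j, floor_j in shops:
            # Straight-line distance on the projected plane.
            if math.hypot(x_i - x_j, y_i - y_j) <= radius_m:
                total += floor_j
        totals.append(total)
    return totals

# Three made-up shops: two close together, one far away.
example_shops = [(0.0, 0.0, 100.0), (300.0, 0.0, 50.0), (1000.0, 0.0, 200.0)]
example_totals = agglomeration(example_shops)
```

The double loop is quadratic in the number of shops; for a city-scale dataset a spatial index would be the usual replacement, but the brute-force version keeps the definition easy to read.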

preprint · 2016 · arXiv

"420 Friendly": Revealing Marijuana Use via Craigslist Rental Ads

Recent studies have shown that information mined from Craigslist can be used for informing public health policy or monitoring risk behavior. This paper presents a text-mining method for conducting public health surveillance of marijuana use concerns in the U.S. using online classified ads in Craigslist. We collected more than 200 thousand rental ads in the housing categories in Craigslist and devised text-mining methods to efficiently and accurately extract rental ads associated with concerns about the use of marijuana in different states across the U.S. We linked the extracted ads to their geographic locations and computed summary statistics of the ads having marijuana use concerns. Our data is then compared with the State Marijuana Laws Map published by the U.S. government and marijuana-related keyword searches in Google to verify our collected data with respect to the demographics of marijuana use concerns. Our data not only indicates strong correlations between Craigslist ads, Google search and the State Marijuana Laws Map in states where marijuana use is legal, but also reveals some hidden world of marijuana use concerns in other states where marijuana use is illegal. O

preprint · 2016 · arXiv

Assisting humans to achieve optimal sleep by changing ambient temperature

Environment plays a vital role in human sleep. Many studies have shown that the sleeping and waking environment, waking time, and hours of sleep are highly important and can contribute to sleep disorders and a variety of diseases. This paper detects an individual's sleep cycle and changes the ambient temperature accordingly to maximize his/her sleep efficiency. We suggest a method that will assist in increasing sleep efficiency. We apply a Fast Fourier Transform (FFT) to heart rate signals to extract heart rate variability data such as the low frequency / high frequency (LF/HF) power ratio, detect sleep stages using an automated algorithm, and then apply a feedback mechanism that alters the ambient temperature depending on the sleep stage.
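
The LF/HF power ratio mentioned in the abstract above can be illustrated with a small, dependency-free sketch. Everything here is an assumption made for illustration rather than the paper's pipeline: the input is an evenly sampled heart rate series whose length is a power of two, the band limits are the conventional HRV bands (LF 0.04 to 0.15 Hz, HF 0.15 to 0.40 Hz), and the function names are hypothetical.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def lf_hf_ratio(signal, sample_rate_hz, lf_band=(0.04, 0.15), hf_band=(0.15, 0.40)):
    """Ratio of spectral power in the LF band to power in the HF band."""
    n = len(signal)
    mean = sum(signal) / n
    spectrum = fft([s - mean for s in signal])  # remove the DC offset first
    lf = hf = 0.0
    for k in range(1, n // 2):  # positive frequencies only
        freq = k * sample_rate_hz / n
        power = abs(spectrum[k]) ** 2
        if lf_band[0] <= freq < lf_band[1]:
            lf += power
        elif hf_band[0] <= freq < hf_band[1]:
            hf += power
    return lf / hf if hf > 0 else float("inf")

# Synthetic series sampled at 4 Hz: an LF tone (0.09375 Hz, amplitude 2)
# plus an HF tone (0.25 Hz, amplitude 1), so the power ratio is about 4.
synthetic = [2.0 * math.sin(2 * math.pi * 6 * i / 256)
             + math.sin(2 * math.pi * 16 * i / 256) for i in range(256)]
ratio = lf_hf_ratio(synthetic, sample_rate_hz=4.0)
```

Real beat-to-beat data is unevenly spaced and would need resampling before a transform like this applies; the sketch skips that step.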

preprint · 2016 · arXiv

A "Social Bitcoin" could sustain a democratic digital world

A multidimensional financial system could provide benefits for individuals, companies, and states. Instead of top-down control, which is destined to eventually fail in a hyperconnected world, a bottom-up creation of value can unleash creative potential and drive innovations. Multiple currency dimensions can represent different externalities and thus enable the design of incentives and feedback mechanisms that foster the ability of complex dynamical systems to self-organize and lead to a more resilient society and sustainable economy. Modern information and communication technologies play a crucial role in this process, as Web 2.0 and online social networks promote cooperation and collaboration on unprecedented scales. Within this contribution, we discuss how one dimension of a multidimensional currency system could represent socio-digital capital (Social Bitcoins) that can be generated in a bottom-up way by individuals who perform search and navigation tasks in a future version of the digital world. The incentive to mine Social Bitcoins could sustain digital diversity, which mitigates the risk of totalitarian control by powerful monopolies of information and can create new business

preprint · 2016 · arXiv

EchoWear: Smartwatch Technology for Voice and Speech Treatments of Patients with Parkinson's Disease

About 90 percent of people with Parkinson's disease (PD) experience decreased functional communication due to the presence of voice and speech disorders associated with dysarthria that can be characterized by monotony of pitch (or fundamental frequency), reduced loudness, irregular rate of speech, imprecise consonants, and changes in voice quality. Speech-language pathologists (SLPs) work with patients with PD to improve speech intelligibility using various intensive in-clinic speech treatments. SLPs also prescribe home exercises to enhance generalization of speech strategies outside of the treatment room. Even though speech therapies are found to be highly effective in improving vocal loudness and speech quality, patients with PD find it difficult to follow the prescribed exercise regimes outside the clinic and to continue exercises once the treatment is completed. SLPs need techniques to monitor compliance and accuracy of their patients' exercises at home and in ecologically valid communication situations. We have designed EchoWear, a smartwatch-based system, to remotely monitor speech and voice exercises as prescribed by SLPs. We conducted a study of 6 individuals; three with

preprint · 2016 · arXiv

Neighbor-Neighbor Correlations Explain Measurement Bias in Networks

In numerous physical models on networks, dynamics are based on interactions that exclusively involve properties of a node's nearest neighbors. However, a node's local view of its neighbors may systematically bias perceptions of network connectivity or the prevalence of certain traits. We investigate the strong friendship paradox, which occurs when the majority of a node's neighbors have more neighbors than does the node itself. We develop a model to predict the magnitude of the paradox, showing that it is enhanced by negative correlations between degrees of neighboring nodes. We then show that by including neighbor-neighbor correlations, which are degree correlations one step beyond those of neighboring nodes, we accurately predict the impact of the strong friendship paradox in real-world networks. Understanding how the paradox biases local observations can inform better measurements of network structure and our understanding of collective phenomena.
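
The definition of the strong friendship paradox given in the abstract above (a majority of a node's neighbors have more neighbors than the node itself) can be checked directly on a small graph. The sketch below only counts how often the paradox holds; it is not the authors' predictive model. The adjacency-dictionary representation and the function name are hypothetical choices, and isolated nodes are skipped by assumption.

```python
def strong_paradox_fraction(adjacency):
    """Fraction of non-isolated nodes for which a strict majority of
    neighbors have a higher degree than the node itself.

    adjacency: dict mapping each node to the set of its neighbors
    (undirected graph, so edges appear in both directions).
    """
    degree = {node: len(nbrs) for node, nbrs in adjacency.items()}
    in_paradox = 0
    counted = 0
    for node, nbrs in adjacency.items():
        if not nbrs:
            continue  # assumption: isolated nodes are left out
        counted += 1
        higher = sum(1 for n in nbrs if degree[n] > degree[node])
        if higher > len(nbrs) / 2:
            in_paradox += 1
    return in_paradox / counted if counted else 0.0

# Star graph: hub 0 connected to four leaves. Every leaf sees a neighbor
# with higher degree; the hub does not.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
star_fraction = strong_paradox_fraction(star)
```

The star is a case where neighboring degrees are as mismatched as possible (high-degree hub, low-degree leaves), which matches the abstract's observation that negative degree correlations enhance the paradox.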

preprint · 2016 · arXiv

Development of UMLS Based Health Care Web Services for Android Platform

In this fast-developing world of information, the amount of medical knowledge is rising exponentially. The UMLS (Unified Medical Language System) is a rich knowledge base consisting of files and software that provide many health and biomedical vocabularies and standards. A Web service is a web solution to facilitate machine-to-machine interaction over a network. A few UMLS web services are currently available for portable devices, but most of them lack efficiency and performance. We propose to develop Android-based web services for healthcare systems built on the rich knowledge source of UMLS. An experimental evaluation was made to analyse the effect on efficiency and performance with and without the designed prototype. Understandability and interaction were greater for users of the prototype than for those who used alternative sources to obtain answers to their questions. The overall performance indicates that the system is convenient and easy to use. The result of the evaluation indicated that the designed system retrieves all the pertinent information better than syntactic searches.

preprint · 2016 · arXiv

Panel dataset description for econometric analysis of the ISP-OTT relationship in the years 2008-2013

The latest technological advancements in the telecommunications domain (e.g., widespread adoption of mobile devices, introduction of 5G wireless communications, etc.) have brought new stakeholders into the spotlight. More specifically, Over-the-Top (OTT) providers have recently appeared, offering their services over the existing deployed telecommunication networks. The entry of the new players has changed the dynamics in the domain, as it creates conflicts with the Internet Service Providers (ISPs), who traditionally dominate the area, motivating the need for novel analytical studies of this relationship. However, despite the importance of accessing real observational data, there is no database with the aggregate information that can serve as a solid base for this research. To that end, this document provides a detailed summary report of financial and statistical data for the period 2008-2013 that can be exploited for realistic econometric models that will provide useful insights on this topic. The document summarizes data from various sources with regard to the ISP revenues and Capital Expenditures (CAPEX), the OTT revenues, the Internet penetration and the Gross

preprint · 2016 · arXiv

Handwritten Signature Verification Using Hand-Worn Devices

Online signature verification technologies, such as those available in banks and post offices, rely on dedicated digital devices such as tablets or smart pens to capture, analyze and verify signatures. In this paper, we suggest a novel method for online signature verification that relies on the increasingly available hand-worn devices, such as smartwatches or fitness trackers, instead of dedicated ad-hoc devices. Our method uses a set of known genuine and forged signatures, recorded using the motion sensors of a hand-worn device, to train a machine learning classifier. Then, given the recording of an unknown signature and a claimed identity, the classifier can determine whether the signature is genuine or forged. To validate our method, we applied it to 1980 recordings of genuine and forged signatures that we collected from 66 subjects in our institution. Using our method, we were able to successfully distinguish between genuine and forged signatures with a high degree of accuracy (0.98 AUC and 0.05 EER).
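
The abstract above reports its results as AUC and EER. For readers unfamiliar with the two metrics, here is a minimal sketch of how they can be computed from classifier scores. The score lists are made up, the function names are hypothetical, and the EER is approximated by scanning the observed scores as thresholds; the paper's own evaluation procedure is not described in the abstract.

```python
def auc(genuine_scores, forged_scores):
    """Probability that a random genuine sample scores higher than a
    random forged one (ties count as half)."""
    wins = 0.0
    for g in genuine_scores:
        for f in forged_scores:
            if g > f:
                wins += 1.0
            elif g == f:
                wins += 0.5
    return wins / (len(genuine_scores) * len(forged_scores))

def equal_error_rate(genuine_scores, forged_scores):
    """Approximate EER: try every observed score as a threshold and
    return the mean of FAR and FRR where the two rates are closest."""
    best_gap = None
    best_eer = None
    for t in sorted(set(genuine_scores) | set(forged_scores)):
        # False acceptance: forgeries scoring at or above the threshold.
        far = sum(1 for f in forged_scores if f >= t) / len(forged_scores)
        # False rejection: genuine signatures scoring below the threshold.
        frr = sum(1 for g in genuine_scores if g < t) / len(genuine_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer

# Made-up classifier scores (higher means "more likely genuine").
genuine = [0.9, 0.8, 0.7, 0.4]
forged = [0.6, 0.3, 0.2, 0.1]
example_auc = auc(genuine, forged)
example_eer = equal_error_rate(genuine, forged)
```

With these toy scores one genuine signature ranks below one forgery, so the AUC falls just short of 1 and the error rates meet at one in four.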

preprint · 2016 · arXiv

Online Actions with Offline Impact: How Online Social Networks Influence Online and Offline User Behavior

Many of today's most widely used computing applications utilize social networking features and allow users to connect, follow each other, share content, and comment on others' posts. However, despite the widespread adoption of these features, there is little understanding of the consequences that social networking has on user retention, engagement, and online as well as offline behavior. Here, we study how social networks influence user behavior in a physical activity tracking application. We analyze 791 million online and offline actions of 6 million users over the course of 5 years, and show that social networking leads to a significant increase in users' online as well as offline activities. Specifically, we establish a causal effect of how social networks influence user behavior. We show that the creation of new social connections increases user online in-application activity by 30%, user retention by 17%, and user offline real-world physical activity by 7% (about 400 steps per day). By exploiting a natural experiment we distinguish the effect of social influence of new social connections from the simultaneous increase in user's motivation to use the app and tak

preprint · 2016 · arXiv

How Do App Stores Challenge the Global Internet Governance Ecosystem?

App stores challenge the culture of openness and resistance to central authorities cultivated by the pioneers of the Internet. Could multistakeholder governance bodies bring more inclusivity into the global cyberspace governance ecosystem?

preprint · 2016 · arXiv

Smart Contract Templates: essential requirements and design options

Smart Contract Templates support legally-enforceable smart contracts, using operational parameters to connect legal agreements to standardised code. In this paper, we explore the design landscape of potential formats for storage and transmission of smart legal agreements. We identify essential requirements and describe a number of key design options, from which we envisage future development of standardised formats for defining and manipulating smart legal agreements. This provides a preliminary step towards supporting industry adoption of legally-enforceable smart contracts.

preprint · 2016 · arXiv

Prerequisites for International Exchanges of Health Information: Comparison of Australian, Austrian, Finnish, Swiss, and US Privacy Policies

Capabilities to exchange health information are critical to accelerate discovery and its diffusion to healthcare practice. However, the same ethical and legal policies that protect privacy hinder these data exchanges, and the issues accumulate when data is moved across geographical or organizational borders. This can be seen as one of the reasons why many health technologies and research findings are limited to very narrow domains. In this paper, we compare how using and disclosing personal data for research purposes is addressed in Australian, Austrian, Finnish, Swiss, and US policies with a focus on text data analytics. Our goal is to identify approaches and issues that enable or hinder international health information exchanges. As expected, the policies within each country are not as diverse as across countries. Most policies apply the principles of accountability and/or adequacy and are thereby fundamentally similar. The following requirements create complications with re-using and re-disclosing data and even secondary data: 1) informing data subjects about the purposes of data collection and use, before the dataset is collected; 2) assurance that the subjects are no longer iden

preprint · 2016 · arXiv

Virtual Breathalyzer

Driving under the influence of alcohol is a widespread phenomenon in the US, where it is considered a major cause of fatal accidents. In this research we present a novel approach and concept for detecting intoxication from motion differences obtained by the sensors of wearable devices. We formalize the problem of drunkenness detection as a supervised machine learning task, both as a binary classification problem (drunk or sober) and a regression problem (the breath alcohol content level). In order to test our approach, we collected data from 30 different subjects (patrons at three bars) using Google Glass and the LG G-watch, Microsoft Band, and Samsung Galaxy S4. We validated our results against an admissible breathalyzer used by the police. A system based on this concept successfully detected intoxication and achieved the following results: 0.95 AUC and 0.05 FPR, given a fixed TPR of 1.0. Applications based on our system can be used to analyze the free gait of drinkers when they walk from the car to the bar and vice versa, in order to alert people, or even a connected car, and prevent people from driving under the influence of alcohol.

preprint · 2016 · arXiv

Towards the Verification of Safety-critical Autonomous Systems in Dynamic Environments

There is an increasing need to deploy autonomous systems in highly heterogeneous, dynamic environments, e.g. service robots in hospitals or autonomous cars on highways. Due to the uncertainty in these environments, the verification results obtained with respect to the system and environment models at design time might not be transferable to the system behavior at run time. For autonomous systems operating in dynamic environments, safety of motion and collision avoidance are critical requirements. With regard to these requirements, Macek et al. [6] define the passive safety property, which requires that no collision can occur while the autonomous system is moving. To verify this property, we adopt a two-phase process which combines static verification methods, used at design time, with dynamic ones, used at run time. In the design phase, we exploit UPPAAL to formalize the autonomous system and its environment as timed automata and the safety property as a TCTL formula, and to verify the correctness of these models with respect to this property. For the runtime phase, we build a monitor to check whether the assumptions made at design time are also correct at run time. If the curren

preprint · 2016 · arXiv

An Open, Multi-Sensor, Dataset of Water Pollution of Ganga Basin and its Application to Understand Impact of Large Religious Gathering

Water is a crucial prerequisite for all human activities. Due to growing demand from the population and a shrinking supply of potable water, there is an urgent need to use computational methods to manage available water intelligently, especially in developing countries like India where even basic data to track water availability and physical infrastructure to process water are inadequate. In this context, we present a dataset of water pollution containing quantitative and qualitative data from a combination of modalities: real-time sensors, lab results, and estimates from people using mobile apps. The data on our API-accessible cloud platform covers more than 60 locations and consists both of what we have ourselves collected from multiple locations following a novel process, and of data from others (lab results) which were open but hitherto difficult to access. Further, we discuss an application of the released data to understand the spatio-temporal pollution impact of a large event with hundreds of millions of people converging on a river during a religious gathering (Ardh Khumbh 2016) spread over months. Such unprecedented details can help authorities manage an ongoing event or plan for future o

498 works