Source author record

Kristin Tufte

Kristin Tufte appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Databases; Other Computer Science

Catalog footprint

What is connected

3 works

2 topics

4 close collaborators

preprint, 2016, arXiv

Improving Data Quality in Intelligent Transportation Systems

Intelligent Transportation Systems (ITS) use data and information technology to improve the operation of our transportation network. ITS contributes to sustainable development by using technology to make the transportation system more efficient: it improves our environment by reducing emissions, reduces the need for new construction, and improves our daily lives through reduced congestion. A key component of ITS is traveler information. The Oregon Department of Transportation (ODOT) recently implemented a new traveler information system on selected freeways to provide drivers with travel time estimates that allow them to make more informed routing decisions. The ODOT project aims to improve traffic flow and promote efficient traffic movement, which can reduce emission rates and improve air quality. The new ODOT system is based on travel data collected from a recently expanded set of sensors installed on its freeways. Our current project investigates novel data cleaning methodologies and the integration of those methodologies into the prediction of travel times. We use machine learning techniques on our archive to identify suspect data, and we calculate revised travel times that exclude this suspect data. We compare the resulting travel time predictions to ground-truth data and to predictions based on simple, rule-based data cleaning. We report on the results of our study using qualitative and quantitative methods.
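To make the rule-based baseline mentioned in this abstract concrete, the following is a minimal sketch of how suspect sensor readings might be excluded before a corridor travel time is computed. Everything in it is an assumption for illustration: the `Reading` fields, the thresholds in `is_suspect`, and the choice to fill excluded segments with the mean speed of the remaining clean segments are not taken from the paper or from the ODOT system.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Reading:
    """One sensor report for a freeway segment (hypothetical schema)."""
    segment_miles: float  # length of the segment the sensor covers
    speed_mph: float      # reported average speed for the interval
    volume: int           # vehicles counted in the interval


def is_suspect(r: Reading) -> bool:
    """Illustrative rule-based checks; the thresholds are assumptions."""
    if r.speed_mph <= 0 or r.speed_mph > 100:
        return True  # physically implausible speed
    if r.volume == 0:
        return True  # a speed was reported although no vehicles were counted
    return False


def travel_time_minutes(readings: List[Reading]) -> Optional[float]:
    """Corridor travel time computed from clean readings only.

    Segments with suspect readings are assumed to move at the
    distance-weighted mean speed of the clean segments.
    """
    clean = [r for r in readings if not is_suspect(r)]
    if not clean:
        return None  # nothing trustworthy to base an estimate on
    clean_miles = sum(r.segment_miles for r in clean)
    clean_hours = sum(r.segment_miles / r.speed_mph for r in clean)
    total_miles = sum(r.segment_miles for r in readings)
    # Scale the clean travel time up to the full corridor length.
    return 60.0 * clean_hours * total_miles / clean_miles
```

A learned classifier could replace `is_suspect` without changing `travel_time_minutes`, which is one way to compare the two cleaning approaches on the same archive.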

preprint, 2015, arXiv

S-Store: Streaming Meets Transaction Processing

Stream processing addresses the needs of real-time applications. Transaction processing addresses the coordination and safety of short atomic computations. Heretofore, these two modes of operation existed in separate, stove-piped systems. In this work, we attempt to fuse the two computational paradigms in a single system called S-Store. In this way, S-Store can simultaneously accommodate OLTP and streaming applications. We present a simple transaction model for streams that integrates seamlessly with a traditional OLTP system. We chose to build S-Store as an extension of H-Store, an open-source, in-memory, distributed OLTP database system. By implementing S-Store in this way, we can make use of the transaction processing facilities that H-Store already supports, and we can concentrate on the additional implementation features that are needed to support streaming. Similar implementations could be done using other main-memory OLTP platforms. We show that we can actually achieve higher throughput for streaming workloads in S-Store than in an equivalent deployment in H-Store alone. We also show how this can be achieved within H-Store with the addition of a modest amount of new functionality. Furthermore, we compare S-Store to two state-of-the-art streaming systems, Spark Streaming and Storm, and show how S-Store matches and sometimes exceeds their performance while providing stronger transactional guarantees.
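The idea of running stream processing inside transactions can be sketched in a few lines. The example below is not S-Store's interface or its transaction model: it uses Python's standard `sqlite3` module as a stand-in for an in-memory OLTP engine, and the table names, the `process_batch` function, and the duplicate-batch check are all assumptions made for illustration. It shows one batch of stream tuples being applied to shared state atomically, so a failure partway through leaves no partial updates.

```python
import sqlite3
from typing import Iterable, Optional


def make_db() -> sqlite3.Connection:
    """Create an in-memory database holding the shared state (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE counts (key TEXT PRIMARY KEY, n INTEGER NOT NULL)")
    conn.execute("CREATE TABLE processed (batch_id INTEGER PRIMARY KEY)")
    conn.commit()
    return conn


def process_batch(conn: sqlite3.Connection, batch_id: int,
                  keys: Iterable[Optional[str]]) -> bool:
    """Apply one batch of stream tuples as a single transaction.

    Returns True if the whole batch committed, False if it was rolled back.
    """
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            # Recording the batch id rejects a batch that was already applied.
            conn.execute("INSERT INTO processed (batch_id) VALUES (?)", (batch_id,))
            for key in keys:
                if key is None:
                    raise ValueError("malformed tuple")
                cur = conn.execute(
                    "UPDATE counts SET n = n + 1 WHERE key = ?", (key,))
                if cur.rowcount == 0:
                    conn.execute(
                        "INSERT INTO counts (key, n) VALUES (?, 1)", (key,))
        return True
    except (ValueError, sqlite3.IntegrityError):
        return False
```

Because ordinary OLTP statements can read and update the `counts` table through the same connection, streaming updates and conventional transactions share one consistent state in this sketch.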

preprint, 2009, arXiv

Inter-Operator Feedback in Data Stream Management Systems via Punctuation

High-volume, high-speed data streams may overwhelm the capabilities of stream processing systems; techniques such as data prioritization, avoidance of unnecessary processing and on-demand result production may be necessary to reduce processing requirements. However, the dynamic nature of data streams, in terms of both rate and content, makes the application of such techniques challenging. Such techniques have been addressed in the context of static and centralized query optimization; however, they have not been fully addressed for data stream management systems. In this work, we present a comprehensive framework that supports prioritization, avoidance of unnecessary work, and on-demand result production over distributed, unreliable, bursty, disordered data sources, typical of many data streams. We propose a form of inter-operator feedback, which flows against the stream direction, to communicate the information needed to enable execution of these techniques. This feedback leverages punctuations to describe the subsets of interest. We identify potential sources of feedback information, characterize new types of punctuation to support feedback, and describe the roles of producers, exploiters, and relayers of feedback that query operators may implement. We present initial experimental observations using the NiagaraST data-stream system.
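The feedback mechanism described in this abstract can be illustrated with a small sketch. It is not NiagaraST code: the `FeedbackPunctuation` class, the `Select` operator, and the tuple layout are hypothetical names chosen for this example. A downstream consumer sends a punctuation-like message against the stream direction describing a subset it no longer needs, and the upstream operator exploits it by skipping matching tuples before doing any further work on them.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, float]


@dataclass
class FeedbackPunctuation:
    """Describes a subset of the stream that is no longer of interest."""
    description: str
    matches: Callable[[Row], bool]


class Select:
    """A selection operator that acts as an exploiter of feedback."""

    def __init__(self, predicate: Callable[[Row], bool]) -> None:
        self.predicate = predicate
        self.ignored: List[FeedbackPunctuation] = []
        self.skipped = 0  # tuples dropped because of feedback

    def receive_feedback(self, fp: FeedbackPunctuation) -> None:
        """Called by a downstream operator, against the stream direction."""
        self.ignored.append(fp)

    def process(self, rows: List[Row]) -> List[Row]:
        out = []
        for row in rows:
            # Avoid unnecessary work on subsets nobody downstream wants.
            if any(fp.matches(row) for fp in self.ignored):
                self.skipped += 1
                continue
            if self.predicate(row):
                out.append(row)
        return out
```

An operator acting as a relayer would additionally forward each `FeedbackPunctuation` to its own upstream inputs, which this sketch leaves out.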