Constant-time sliding window framework with reduced memory footprint and efficient bulk evictions
Cite as:
hdl:2117/121867
Document type: Article
Defense date: 2019-03
Rights access: Open Access
All rights reserved. This work is protected by the corresponding intellectual and industrial
property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public
communication or transformation of this work are prohibited without permission of the copyright holder
Project: COMPUTACION DE ALTAS PRESTACIONES VII (MINECO-TIN2015-65316-P)
Hi-EST - Holistic Integration of Emerging Supercomputing Technologies (EC-H2020-639595)
Abstract
The fast evolution of data analytics platforms has resulted in an increasing demand for real-time data stream processing. From Internet of Things applications to the monitoring of telemetry generated in large data centers, a common requirement of emerging scenarios is to process vast amounts of data with low latency, generally performing the analysis as close to the data source as possible. Stream processing platforms must be malleable and absorb the spikes caused by fluctuations in data generation rates. Data is usually produced as time series that have to be aggregated using multiple operators, with sliding windows being one of the most common abstractions used to process data in real time. To satisfy these demands, efficient stream processing techniques that aggregate data with minimal computational cost need to be developed. In this paper we present the Monoid Tree Aggregator, a general sliding window aggregation framework that seamlessly combines the following features: amortized O(1) time complexity, with a worst case of O(log n) between insertions; a window aggregation mechanism and a window slide policy that are both user programmable; enforcement of the window sliding policy with amortized O(1) computational cost for single evictions and support for bulk evictions with cost O(log n); and a local memory requirement of O(log n). The framework can compute aggregations over multiple data dimensions, and has been designed to support decoupling computation and data storage through the use of distributed Key-Value Stores to keep window elements and partial aggregations.
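To make the idea of monoid-based sliding window aggregation more concrete, the following is a minimal Python sketch. It is not the Monoid Tree Aggregator described in the abstract: it uses the classic two-stack technique, which also gives amortized O(1) insertions and single evictions for any associative operator, but needs O(n) memory and has no O(log n) bulk eviction. The names `TwoStackWindow` and `windowed_max`, and the count-based slide policy, are assumptions made for illustration only.

```python
from typing import Callable, Generic, Iterable, List, Tuple, TypeVar

T = TypeVar("T")


class TwoStackWindow(Generic[T]):
    """Sliding-window aggregation over a monoid using two stacks.

    Illustrative only: this is the classic two-stack scheme, not the
    paper's tree-based structure. Memory use is O(n), not O(log n).
    """

    def __init__(self, identity: T, combine: Callable[[T, T], T]) -> None:
        self.identity = identity
        self.combine = combine  # must be associative
        # Each stack entry is (value, running aggregate).
        self.front: List[Tuple[T, T]] = []  # oldest element on top
        self.back: List[Tuple[T, T]] = []   # newest element on top

    def __len__(self) -> int:
        return len(self.front) + len(self.back)

    def insert(self, value: T) -> None:
        """Append the newest element; O(1)."""
        prev = self.back[-1][1] if self.back else self.identity
        self.back.append((value, self.combine(prev, value)))

    def evict(self) -> T:
        """Remove and return the oldest element; amortized O(1)."""
        if not self.front:
            # Flip the back stack onto the front stack, rebuilding the
            # aggregates so each entry covers itself and all newer
            # entries in the front stack (window order is preserved).
            while self.back:
                value, _ = self.back.pop()
                newer = self.front[-1][1] if self.front else self.identity
                self.front.append((value, self.combine(value, newer)))
        if not self.front:
            raise IndexError("evict from an empty window")
        return self.front.pop()[0]

    def query(self) -> T:
        """Aggregate of the whole window, oldest to newest; O(1)."""
        older = self.front[-1][1] if self.front else self.identity
        newer = self.back[-1][1] if self.back else self.identity
        return self.combine(older, newer)


def windowed_max(stream: Iterable[float], size: int) -> List[float]:
    """Hypothetical count-based slide policy: keep the last `size` items."""
    window: TwoStackWindow[float] = TwoStackWindow(float("-inf"), max)
    results: List[float] = []
    for item in stream:
        window.insert(item)
        while len(window) > size:  # slide policy decides what to evict
            window.evict()
        results.append(window.query())
    return results
```

Because the aggregator only relies on an identity element and an associative `combine` function, the same class works for sums, maxima, string concatenation, or any other monoid; swapping the `while` condition in `windowed_max` for a timestamp check would give a time-based slide policy instead.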
Citation: Villalba, Á., Berral, J., Carrera, D. Constant-time sliding window framework with reduced memory footprint and efficient bulk evictions. "IEEE Transactions on Parallel and Distributed Systems", March 2019, vol. 30, no. 3, p. 486-500.
ISSN: 1045-9219
Publisher version: https://ieeexplore.ieee.org/document/8456588
Files | Description | Size | Format | View |
---|---|---|---|---|
08456588.pdf | Version published by the publisher. Open access at IEEE | 1,200Mb | PDF | View/Open |