About the System

Combining data from disparate sources makes it possible to explore different aspects of the phenomena under consideration. Doing so effectively poses several challenges, including heterogeneity in data representation and format, differing collection patterns, and the need to integrate foreign data attributes in a ready-to-use form.

Confluence is a distributed data integration framework that dynamically generates accurate interpolations for the targeted spatiotemporal scopes, along with an estimate of the uncertainty of each interpolation. Confluence handles integration among participating spatiotemporal datasets, which may be either vector or rasterised.

Confluence also supports dynamic modification of the interpolation parameters, so that the interpolation method can be fine-tuned to the spatiotemporal region being interpolated.
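As an illustration of interpolation with an accompanying uncertainty estimate, the following is a minimal sketch that assumes inverse distance weighting (IDW) over 2D points. The source does not state which interpolation method Confluence uses, so the function name `idw_interpolate`, the `power` parameter, and the use of a weighted standard deviation as the uncertainty measure are all assumptions made for this example.

```python
import math


def idw_interpolate(samples, target, power=2.0):
    """Estimate a value at `target` from nearby samples (hypothetical helper).

    samples: list of ((x, y), value) pairs
    target:  (x, y) location to interpolate at
    power:   tunable IDW exponent; larger values favour closer samples
    Returns (estimate, uncertainty), where uncertainty is the weighted
    standard deviation of the sample values around the estimate.
    """
    if not samples:
        raise ValueError("at least one sample is required")

    weights = []
    values = []
    for (x, y), value in samples:
        distance = math.hypot(x - target[0], y - target[1])
        if distance == 0.0:
            # The target coincides with an observation: no interpolation needed.
            return value, 0.0
        weights.append(1.0 / distance ** power)
        values.append(value)

    total = sum(weights)
    estimate = sum(w * v for w, v in zip(weights, values)) / total
    variance = sum(w * (v - estimate) ** 2 for w, v in zip(weights, values)) / total
    return estimate, math.sqrt(variance)
```

Here `power` plays the role of an interpolation parameter that could be adjusted per region; a higher value makes the estimate depend more strongly on the closest observations.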

Key Functionalities

Confluence allows spatiotemporal relaxation of its queries, enabling data integration when the data points of the participating datasets are not spatiotemporally aligned.
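The idea of a relaxed query can be sketched as follows: instead of requiring an exact match in space and time, a query accepts any observation within a spatial radius and a temporal window. This is an illustrative sketch only; the `Observation` type, the `relaxed_matches` function, and the `max_distance` and `max_time_gap` parameters are hypothetical names and not part of Confluence's actual interface.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    x: float      # spatial coordinate (units are an assumption of this sketch)
    y: float
    t: float      # timestamp, e.g. seconds since some epoch
    value: float


def relaxed_matches(query, candidates, max_distance, max_time_gap):
    """Return the candidates that fall within the relaxed bounds of `query`.

    A candidate matches when it lies within `max_distance` of the query
    location and within `max_time_gap` of the query timestamp.
    """
    matches = []
    for candidate in candidates:
        distance = math.hypot(candidate.x - query.x, candidate.y - query.y)
        time_gap = abs(candidate.t - query.t)
        if distance <= max_distance and time_gap <= max_time_gap:
            matches.append(candidate)
    return matches
```

With both bounds set to zero this reduces to an exact spatiotemporal match; widening either bound trades alignment precision for more candidate observations to integrate.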

Confluence supports generalized data integration by accommodating datasets of varying resolutions. Participating datasets can be any combination of vector and rasterised data.

Confluence enables low-latency distributed data integration over large spatiotemporal datasets by splitting each operation into smaller, independent, modular sub-operations that are distributed across a cluster.
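The decomposition into independent sub-operations can be illustrated with a small local sketch. It assumes a regular grid partitioning of space and uses a thread pool as a stand-in for a cluster; the partitioning scheme, the function names, and the per-tile operation (a simple mean) are assumptions for illustration and are not taken from Confluence itself.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def tile_key(x, y, tile_size):
    """Map a coordinate to the grid tile that contains it."""
    return (int(x // tile_size), int(y // tile_size))


def partition_by_tile(points, tile_size):
    """Group (x, y, value) points by grid tile."""
    tiles = defaultdict(list)
    for x, y, value in points:
        tiles[tile_key(x, y, tile_size)].append((x, y, value))
    return dict(tiles)


def summarize_tile(item):
    """Independent sub-operation: here, the mean value within one tile."""
    key, tile_points = item
    mean = sum(value for _, _, value in tile_points) / len(tile_points)
    return key, mean


def run_partitioned(points, tile_size, workers=4):
    """Run the per-tile sub-operation concurrently and collect the results."""
    tiles = partition_by_tile(points, tile_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(summarize_tile, tiles.items()))
```

Because each tile is processed without reference to any other tile, the same structure could in principle be mapped onto separate machines rather than threads.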

Confluence uses random sampling to generate training data for a machine learning model that dynamically fine-tunes the interpolation parameters at runtime based on the point of interpolation.
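One way to picture this sampling-based tuning is sketched below. It assumes an IDW interpolator with a single tunable exponent, randomly holds out known observations, and records which exponent best reconstructs each held-out value. A nearest-neighbour lookup then stands in for the machine learning model. The source does not specify the model, the parameters, or the sampling procedure, so every name and choice here is hypothetical.

```python
import math
import random


def idw(samples, target, power):
    """Plain inverse distance weighting over ((x, y), value) samples."""
    numerator = denominator = 0.0
    for (x, y), value in samples:
        distance = math.hypot(x - target[0], y - target[1])
        if distance == 0.0:
            return value
        weight = 1.0 / distance ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator


def build_training_rows(samples, n_rows, powers=(1.0, 2.0, 3.0), seed=0):
    """Randomly hold out observations and label each with its best exponent.

    Requires at least two samples and n_rows <= len(samples).
    Returns a list of ((x, y), best_power) training rows.
    """
    rng = random.Random(seed)
    rows = []
    for index in rng.sample(range(len(samples)), n_rows):
        location, truth = samples[index]
        others = samples[:index] + samples[index + 1:]
        best = min(powers, key=lambda p: abs(idw(others, location, p) - truth))
        rows.append((location, best))
    return rows


def predict_power(rows, target):
    """Stand-in model: reuse the exponent of the nearest training location."""
    nearest = min(
        rows,
        key=lambda row: math.hypot(row[0][0] - target[0], row[0][1] - target[1]),
    )
    return nearest[1]
```

At query time, `predict_power` would supply the exponent to use at a given interpolation point; a real deployment would likely replace the nearest-neighbour lookup with a trained regression or classification model.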

Publications

Mitra S, Pallickara SL. Confluence: Adaptive Spatiotemporal Data Integration Using Distributed Query Relaxation over Heterogeneous Observational Datasets. In 2018 IEEE/ACM 11th International Conference on Utility and Cloud Computing (UCC) 2018 Dec 17 (pp. 184-193). IEEE.

Contributors

Saptashwa Mitra

Graduate Student
Department of Computer Science
Colorado State University

Sangmi Lee Pallickara

Professor
Department of Computer Science
Colorado State University