Project Overview

The BigFastData project addresses a pressing need from emerging big data applications: besides the scale of processing, big data systems must also enable perpetual, low-latency processing for a broad set of analytical tasks, referred to as “big and fast data analysis”. Toward this goal, this project develops an algorithmic foundation that enables three necessary pillars of big and fast data analysis:

  1. Parallelism: We propose new approaches to integrate different forms of parallelism to maximize performance on high-volume datasets.
  2. Analytics: We propose new algorithmic solutions to enable critical temporal and sequence analytics under different forms of parallelism.
  3. Optimization: To respond to diverse performance goals and budgetary constraints of the user, we develop a principled optimization framework that suits the new characteristics of cloud data analytics and explores the best way to meet user objectives.

To learn more about our approach in detail, look here.

For the demo, code, and datasets used, please check out the download section.