Shrideep Pallickara   [Professor]


I am a Professor in the Department of Computer Science at Colorado State University. Agencies in the United States and the United Kingdom have funded my research. These include the Department of Homeland Security (including the Long Range program), the National Science Foundation, the Environmental Protection Agency, the Department of Agriculture, and the U.K.'s e-Science program. I am a recipient of the Monfort Professorship, the Board of Governors Award for Excellence in Undergraduate Teaching, the OLIE award, and the National Science Foundation's CAREER award.

My research encompasses methodological and algorithmic innovations in three broad areas: (1) spatiotemporal data management and analytics, (2) file systems, and (3) stream processing for Internet-of-Things and Cyber Physical Systems settings. Systems software resulting from these efforts has been deployed in domains such as brain-computer interfaces, epidemiology, earthquake science, environmental and ecological monitoring, health care systems, high energy physics, defense applications, geosciences, GIS, and commercial internet conferencing systems.


Spatiotemporal data analysis at scale
We have designed a suite of algorithms and software to simplify the management and analysis of voluminous spatiotemporal data. These algorithms are data-format agnostic, and our reference implementations can cope with data stored in over 20 different formats, including, inter alia, netCDF, HDF, XML, CSV, GRIB, BUFR, DMSP, NEXRAD, and SIGMET. Highlighted efforts include:

  • An extreme-scale file system, Galileo, designed specifically for spatiotemporal data. We are able to support ~1 trillion files, each with 1000 multidimensional observations.
  • An innovative sketching algorithm, Synopsis, designed specifically for spatiotemporal data. We are able to achieve 1000-fold compaction rates while preserving statistical representativeness at diverse spatiotemporal scopes.
    • We have designed a novel algorithm, Vantage, for provenance management over sketches. Vantage exploits the tree structure of sketches to encode targeted provenance metadata within the sketches.

File Systems
My research targets both the microscopic and macroscopic aspects of distributed file system design. At the individual machine level, this has targeted contention and the efficiency of disk scheduling algorithms. At distributed scales, this has involved metadata management, query support, preservation of timeliness and throughput, and overlay design. Ongoing research in this area has focused on designing file systems that facilitate high-performance training of ensemble-based data fitting algorithms such as random forests and gradient boosting. File systems and data stores that our efforts interoperate with include ext3/4, btrfs, NTFS, F2FS, ZFS, Google BigQuery, HDFS, and HBase.

Stream Processing
This effort targets the processing of data streams generated in IoT and Cyber Physical Systems settings. Optimal stream scheduling is NP-hard, and our algorithm based on interference scores and time-series models is currently the state of the art for single-stage stream processing. Our efforts have also targeted stream processing at edge devices such as the Raspberry Pi. As part of our VitalHome instrumentation project, we have been experimenting with a diverse set of physiological and environmental sensors to continuously and non-invasively harvest vital sign data and identify incipient signs of health problems.

Department of Computer Science
Colorado State University
1100 Center Avenue, Room 364
Fort Collins, CO 80523-1873 USA
Office: Computer Science Building, Room 364
Hours: 4:00-5:00 pm Tuesdays and 9:00-10:00 am Fridays [Fall 2018]

Phone: 970.492.4209