Christopher D. Krieger

Graduate Student
Colorado State University
Computer Science Department
1873 Campus Delivery
Fort Collins, CO 80523-1873
My CSU CS Email



Education

Graduate Student, Computer Science, Colorado State University, 2007 - present.
M.S., Electrical and Computer Engineering, University of Utah, 2002.
B.S., Electrical and Computer Engineering, Brigham Young University, 1995.

Research Interests

I'm now doing research in Computer Science at Colorado State. At a high level, my research deals with finding ways to exploit the increasing number of hardware threads available on modern microprocessors to increase the performance of single-threaded applications. Most recently, I've explored using virtual machines to dynamically detect parallelizable loops. I modified the Jikes RVM and measured the potential of this method on a range of benchmarks, including NAS/JavaGrande and DaCapo.

I'm currently looking into hardware data prefetchers. Many HPC scheduling algorithms, such as tiling, try to keep data accesses within a limited amount of memory, but these schedules often work counter to the data prefetching hardware on modern microprocessors. I am investigating the positive and negative interactions between hardware prefetching and tiling. I'll also propose different schedules that may gain the most performance by cooperating with the hardware prefetcher.

I'll next turn my attention to the use of helper threads. I'd like to study how well a helper thread can hide the latency of irregular or indirect memory accesses. I'd then like to compare those results with what can be achieved using an inspector-executor approach. I'd also like to see whether using a helper thread for data pre-loading achieves better performance than using that hardware thread for parallel computation in cases where such parallelization opportunities exist.

My master's thesis research focused on asynchronous hardware systems. Specifically, I worked on efficient state coding of partially coded asynchronous systems, with performance of the final circuit as the driving cost function. This work could be extended to include concurrency reduction, particularly methods of altering system timing to remove state coding violations without completely removing concurrency.


Current Industry Work

I am currently working at Intel's Fort Collins Design Center in Fort Collins, Colorado, on the Itanium Processor Family (IPF) Performance Team. I run different scenarios through the RTL Simulator to evaluate performance and architectural tradeoffs. I also work on defining and using the Performance Monitoring Units found in Itanium chips.

My previous work focused on improving timing convergence for Itanium® microprocessors and automating chip-wide power reduction. I worked on the PA-8700 and PA-8800 PA-RISC processors, and the McKinley (Itanium 2), Montecito (Dual Core), Tukwila (Quad Core), and Poulson (8 Core) IPF processors. I ported many EDA tools to the IPF platform and evaluated compilers, tuning, and optimizations on IPF.

Publications


Class Reports


Coursework