CS Colloquium Schedule

CS Colloquium Series Spring 2012: Abstracts

Super Manners: Collegiality and Professionalism
Elaine Regelson and Bruce Draper
Computer Science Department
Colorado State University

This mini-workshop will explore topics such as: why this is important; colleagues and friends (objectives, context); the value of "Hmm"; and how to give and receive feedback. Please come prepared to listen and to practice. I am hoping attendees will find this interesting and useful ... and maybe even fun.

Leveraging Model-Driven Engineering Techniques in Optimizing Compiler Research
Robert France and Tomofumi Yuki
Computer Science Department
Colorado State University

A primary goal of Model Driven Engineering (MDE) is to reduce the cost and effort of developing complex software systems using techniques for transforming abstract views of software to concrete implementations. The rich set of tools that have been developed, especially the growing maturity of model transformation technologies, opens the possibility of applying MDE technologies to transformation-based problems in other computer science research domains. In this talk, we present our experience with using MDE technologies to build and evolve compiler infrastructures in the optimizing compiler domain. We illustrate, through our two ongoing research compiler projects for C and a functional language, the challenging aspects of optimizing compiler research and show how mature MDE technologies can be used to address them.

Guidance of Autonomous Aerial Vehicles: Real-time Planning via POMDP
Edwin Chong
Electrical and Computer Engineering Department
Colorado State University

We consider the problem of planning the motion of unmanned aerial vehicles (UAVs) with on-board sensors, with the goal of tracking ground targets. We apply the theory of partially observable Markov decision processes (POMDPs) to this problem. While POMDPs are intractable to optimize exactly, principled approximation methods can be devised based on Bellman's principle. We show how application-specific approximations produce a practical design that coordinates the UAVs to achieve good long-term mean-squared-error tracking performance in the presence of occlusions and dynamic constraints.

Predictive Modeling of Metagenomes
Dan Knights
University of Colorado, Boulder

Human-associated microbial communities have been implicated in a variety of chronic diseases, including inflammatory bowel diseases, obesity, and autoimmune disorders like diabetes. Environmental communities are also important for bioconversion of waste products in biofuel production. However, microbiomes are highly complex systems involving mutualism and competition between many constituent organisms, and a variety of fundamental and interesting computational challenges remain in the modeling of pathogenicity and community-wide response to perturbations. In this talk I will discuss several computational and statistical approaches to predictive modeling of microbiome behavior using high-throughput metagenomic and transcriptomic sequencing data, including models that leverage biological structures such as phylogenies and gene ontologies to extract features and constrain model complexity.

Statistical Approaches to Infer Gene Regulatory Circuits
Majid Kazemian
University of Illinois at Urbana-Champaign

Current high-throughput technologies in molecular biology provide vast amounts of data, bringing new computational challenges and unprecedented opportunities for deeper understanding of biological processes and disease. The computational challenges are varied, and range from analyzing noisy, incomplete, heterogeneous data in a single organism to extrapolating discovered knowledge across species.

One of the most important biological questions that face modern computational biology is that of gene regulation: how are thousands of genes turned on and off at the right time and place, in a highly orchestrated way? The first part of my talk will explore a range of statistical approaches used to find the DNA sequences (called "enhancers") responsible for gene regulation in fruit flies and humans. I will focus on supervised learning methods that use fixed and variable order Markov models to realistically capture the biological features of a small set of enhancers, and then identify novel enhancers with similar function. I will demonstrate the effectiveness of this approach to computational enhancer prediction with results from experimental validations. The second part of my talk will focus on a deep mechanistic understanding of enhancers that are responsible for the appearance of segments in the fruit fly body. I will show that a logistic regression model is able to integrate features of genomic sequence and gene expression information to quantitatively predict the precise regulatory function of any enhancer. I will also demonstrate how this model can be used to identify novel enhancers and map out the gene regulatory circuit of fruit fly segmentation.

My talk will focus on statistical and computational approaches used to tackle these important questions of biology, and any necessary biological terminology will be explained.

Tools for Genomic Assembly and Analysis
Christina Boucher
University of California, San Diego

The assembly of next-generation sequencing data into full genomes is an open bioinformatics problem. Current methods result in a substantial number of errors that need to be corrected after the assembly process. We develop a tool, NGS-Refine, which corrects errors in the assembled sequences. The NGS-Refine algorithm involves three stages: positional de Bruijn graph construction, graph correction, and contig refinement. I will describe each of these stages in detail, and present results on the assemblies produced by Euler-SR, Velvet, and Velvet-SC, an assembler specifically tailored for single-cell data. NGS-Refine reduced the number of insertions and deletions in each assembly of standard multi-cell E. coli data by more than half, and corrected between 30% and 94% of the substitution errors. We show that it is imperative for improving single-cell assembly, which is inherently more challenging due to higher error rates and non-uniform coverage; over half of the insertions and deletions, and of the substitution errors, in the single-cell assemblies were corrected.

In the second half of this talk, I will discuss the development and application of sMCL-WMR, a software tool specifically designed to detect motifs in genomic data. sMCL-WMR is capable of distinguishing valid motif sets from decoy sets, which allows for efficient detection of motifs in very large datasets. sMCL-WMR represents the input data as a weighted graph, and uses graph clustering to narrow the search to smaller problems that can be solved with significantly less computation. I will describe several applications of sMCL-WMR, including the detection of transcription factor binding sites in the canola genome.

Designing Motion Gestures for Mobile Interaction
Jaime Ruiz
University of Waterloo

Hand motion -- pointing, gesturing, grasping, shaking, tapping -- is a rich channel of communication. We point and gesture while we talk; we grasp tools to extend our capabilities; we grasp, rotate, and shake items to explore them. Yet, the rich repertoire of hand motion is largely ignored in interfaces to mobile computation: the user of a modern smartphone generally holds the device stationary while tapping or swiping its surface. Why are so many possible affordances ignored? Certainly not for technical reasons, as smartphones contain an evolving set of sensors for recognizing movement of the phone, including accelerometers, gyroscopes and cameras. However, beyond rotating to change screen orientation or shaking to shuffle songs, little has been done to enable rich gestural input through device motion. In this talk, I will explore our on-going work in the design and implementation of motion gestures to control modern smartphones. I will present the results of a guessability study that we conducted that demonstrates that consensus exists among a group of prospective end-users on parameters of movement and on mappings of motion gestures onto commands. I will show how we have used this consensus to develop a taxonomy for motion gestures. Finally, I will discuss some of our current work in segmenting motion gestures from everyday smartphone movements.

Declarative Parallel Programming
William Byrd
Indiana University

Declarative languages allow programmers to specify what a program should do, without specifying how to do it. Expressing the "what", and leaving the "how" to the compiler and runtime system, leads to shorter, simpler, and safer programs. The declarative approach is especially promising for difficult programming tasks, which is why declarative programming has been popular in artificial intelligence research for decades. One notoriously difficult task is the programming of multi-core and multi-processor computers, which are now found not only in supercomputers, but also laptops, cell phones, and gaming consoles. Graphics processors (GPUs) are essentially parallel supercomputers, and are also difficult to program. The recent introduction of hybrid CPU/GPU clusters adds even more complexity.

To address the complexity of programming modern parallel hardware, my colleagues and I in the Declarative Parallel Programming project at Indiana University are developing two related declarative languages: Kanor, a language for specifying collective communication on clusters; and Harlan, a language for describing computational kernels on GPUs and other accelerators.

I will describe how Kanor allows programmers to declaratively, but explicitly, specify the essence of communication patterns. The programmer lets the implementation handle the details when appropriate, but retains the option to hand-encode communications when necessary, providing a balance between declarativeness and performance predictability and tunability. Similarly, Harlan allows the user to declaratively, but explicitly, describe computational kernels and to coordinate computation, data layout, and memory movement. As with Kanor, this approach gives the programmer enough control to write efficient code, while abstracting over the low-level details that make GPU programming so difficult. Integrating Harlan into Kanor results in a unified, high-level, flexible language suitable for efficiently programming hybrid clusters, traditional (CPU-based) clusters, and GPUs on a single machine.

Value of Rapid Prototyping in Building Complex Digital Systems
Arvind
MIT

Modern systems often contain special-purpose hardware for performance and power reasons. It is sometimes difficult to know a priori the best decomposition of the system from the performance point of view. Since different components are designed by different teams, it is also difficult to ensure that the whole system will function properly when the various components are put together. These risks can be mitigated substantially if one can rapidly build an accurate and fast prototype of the system being designed. Such prototyping does not extend the time-to-market if the design methodology ensures that there is an automatic or semiautomatic path from the prototype design to the real product design. Our methodology has three essential aspects: reusing complex blocks involving domain expertise; experimenting with designs to achieve goals such as cost, performance, and power; and conducting high-fidelity full-system simulation, including software, throughout the design process. We will illustrate this methodology using several prototypes we have built over the past few years: AirBlue, H.264, Multicore PowerPC simulator and BlueSSD.

A Random Walk on Image Patches
François Meyer
University of Colorado, Boulder

Algorithms that analyze patches extracted from time series or images have led to state-of-the-art techniques for classification, denoising, and the study of nonlinear dynamics. In the first part of the talk, we describe two examples of such algorithms: a novel method to estimate the arrival times of seismic waves from a seismogram, and a new patch-based method to denoise images. Both approaches combine the following two ingredients: the signals (time series or images) are first lifted into a high-dimensional space using time/space-delay embedding; the resulting phase space is then parameterized using a nonlinear method based on the eigenvectors of the graph Laplacian. Both algorithms outperform existing gold standards. In the second part of the talk, we provide a theoretical explanation for the success of algorithms that organize patches according to graph-based metrics. Our approach relies on a detailed analysis of the commute time on prototypical graph models that epitomize the geometry observed in general patch-graphs.

Biography: François Meyer graduated with Honors from Ecole Nationale Superieure d'Informatique et de Mathematiques Appliquees, Grenoble, in 1987, with an M.S. in applied mathematics. He received a Ph.D. degree in electrical engineering from INRIA, France, in 1993. Meyer worked on the thermonuclear fusion program of the French Nuclear Energy Agency during his military service. He is currently an Associate Professor with the Department of Electrical Engineering, University of Colorado, Boulder. He had previously been an Assistant Professor at Yale University, a Visiting Professor at the Institute Henri Poincaré (Paris), a Senior Fellow at the Institute of Pure and Applied Mathematics (UCLA), and a Visiting Research Scholar at Princeton University.

Models, Algorithms, and Software: Tradeoffs in the Design of High-Performance Computational Simulations in Science and Engineering
Phil Colella
Lawrence Berkeley National Lab

Many important problems for DOE, such as combustion, fusion, systems biology, and climate change, involve multiple physical processes operating on multiple space and time scales. In spite of the physical diversity of these problems, there is a great deal of coherence in the underlying mathematical representations. They are all described in terms of various versions of the elliptic, parabolic and hyperbolic partial differential equations (PDE) of classical mathematical physics. The enormous variety and subtlety in these applications comes from the way the PDE are coupled, generalized, and combined with models for other physical processes. The complexity of these models and the need to represent multiple scales lead to a diverse collection of requirements on the numerical methods, with many open questions about stability of coupled algorithms. Finally, the complexity of models and algorithms, combined with uncertainties about the correct combination to use, complicates the problem of designing high-performance software. In this talk, I will attempt to describe the tradeoffs between the models, the discretizations, and the software in the development of high-performance computational simulations in science and engineering involving PDE, including some motivating applications, and the combination of analysis and computational experiments that are used to explore the design space.

Simultaneous Segmentation of Multiple Functional Genomics Data Sets with Heterogeneous Patterns of Missing Data
Michael M. Hoffman
University of Washington

New functional genomics methods enabled by high-throughput DNA sequencing have begun to produce an unprecedented amount of data anchored to the genome of humans and other species. We have developed a method to identify joint patterns in the results of multiple classes of functional genomics experiments. The method partitions the genome into variable-length segments using a dynamic Bayesian network where the dynamic (or "time") axis represents genomic position. Segments are assigned one of a finite number of labels such that the vectors of observations are similar in segments with the same label. A multinet switching structure allows inference on sequences with combinations of missing data in different tracks that vary at each position, without downsampling or interpolation. This permits us to take full advantage of the high-resolution data generated by sequencing assays, working at up to 1-base-pair resolution. Our system can also incorporate other kinds of data into its classification, including lower-resolution continuous data such as microarray data, or discrete data such as the dinucleotide sequence beginning at each position. We demonstrate the use of the method in both unsupervised and semisupervised training of segment parameters.

Resilient Asymmetric Security (RAS)
Salim Hariri
University of Arizona

Current advances in computing, networking, software and services will lead to the development of cyberspace services (e.g., cloud services) that are ubiquitous and touch all aspects of our life. These pervasive services will revolutionize the way we do business, maintain our health, conduct education, and secure, protect, and entertain ourselves. However, along with these advances, we face grand challenges in ensuring that our cyberspace resources and services are highly resilient; that is, that they can effectively tolerate epidemic-style cyberattacks such as viruses, worms, spam, and denial-of-service attacks; deliver software systems and services that can survive hardware/software failures and attacks; and manage their resources and applications by being self-aware, self-adaptive, and self-healing, or in general self-*, i.e., by operating autonomically. To address these challenges, we are developing a Resilient Asymmetric Security (RAS) approach. The goals of RAS are to: 1) stop or eliminate the effectiveness of spam, viruses, worms and cyberattacks (known or unknown); 2) deliver uninterrupted software and cloud services in spite of attacks and failures; and 3) build "hassle-free" cyber services that are self-aware, self-adapting, self-healing, and self-protecting. In this presentation, I will discuss our approach to implementing RAS, which is based on three techniques: Software Behavior Encryption (SBE), Self Management (SM), and Collective Intelligence. I will also present experimental results and an evaluation of our RAS approach to securing and protecting applications as well as communications protocols.

Biography: Salim Hariri is a Professor in the Department of Electrical and Computer Engineering at The University of Arizona. He received his Ph.D. in computer engineering from University of Southern California in 1986, and an MSc from The Ohio State University in 1982. He is the UA site director of NSF Center for Autonomic Computing and he is the Editor-In-Chief for the CLUSTER COMPUTING JOURNAL (Springer).

Restoration of Soft X-Ray Laser Images of Nanostructures
Damir Sersic
University of Zagreb

Several advanced techniques for restoration of images obtained with the 46.9 nm soft x-ray (SXR) laser microscope will be presented. We developed two advanced denoising methods, one based on the wavelet transform and the other on adaptive zero-order modeling of the observed object. Due to the non-uniform distribution of noise, all methods use spatial noise modeling. The wavelet method is based on adaptive thresholding, while the other uses local Wiener filtering in the wavelet domain to achieve very high noise gains. The best results were obtained by adaptive noise modeling. To our knowledge, the results outperform state-of-the-art competing methods. Furthermore, the analysis is robust enough to enable image acquisition with significantly lower exposure times, which is critical for samples that are sensitive to radiation damage, as is the case for biological samples imaged by SXR microscopy.

Biography: Damir Sersic is an associate professor at the Department of Electronic Systems and Information Processing, University of Zagreb, Faculty of Electrical Engineering and Computing. In 1999, he received his Ph.D. in Electrical Engineering from the University of Zagreb. His research interests are in the field of digital signal processing, image processing, adaptive wavelets and filter banks, blind separation and deconvolution. Several applications of his research are in the area of adaptive control and bioinformatics. He is an active researcher or principal investigator for several projects financed by the EU and Croatian funding agencies. Among other courses, he teaches Signals and Systems and Advanced Digital Signal Processing Methods.

Technology Innovation at Linkabit and Qualcomm: Generating Leading-Edge Products
Irwin Jacobs
CEO Emeritus, Qualcomm

Biography: Dr. Irwin Mark Jacobs is co-founder and chairman of Qualcomm Incorporated, pioneer and world leader of Code Division Multiple Access (CDMA) digital wireless technology. Dr. Jacobs has led the commercialization of CDMA technology and its success as the world's fastest-growing, most advanced voice and data wireless communications technology. Now used by tens of millions of consumers worldwide, CDMA is the technology of choice for third-generation wireless communications services.

Dr. Jacobs holds several CDMA patents, contributing to QUALCOMM's extensive portfolio of more than 3,000 issued and pending U.S. patent applications. More than 115 companies have licensed CDMA for the manufacturing of wireless devices and network infrastructure equipment, integrated circuits and test equipment.

Dr. Jacobs previously served as co-founder, president, chairman and CEO of LINKABIT Corporation, directing its growth from a few part-time employees in 1969 to over 1,400 employees in 1985, and overseeing the first introduction of Ku-band Very Small Aperture Earth Terminals (VSATs), commercial TDMA wireless phones, and the VideoCipher satellite-to-home TV system. LINKABIT merged with M/A-COM in August 1980, after which Dr. Jacobs served on the company's board of directors until he resigned from M/A-COM in April 1985. Over 35 San Diego communications companies trace their roots back to LINKABIT.

From 1959 to 1966, Dr. Jacobs was an assistant/associate professor of electrical engineering at Massachusetts Institute of Technology (MIT). From 1966 to 1972 he served as a professor of computer science and engineering at the University of California, San Diego (UCSD). At MIT, Dr. Jacobs co-authored a basic textbook in digital communications entitled Principles of Communication Engineering. First published in 1965, the book remains in use today.

Dr. Jacobs received a bachelor's degree in electrical engineering in 1956 from Cornell University and master of science and doctor of science degrees in electrical engineering from MIT in 1957 and 1959, respectively.

Dr. Jacobs is a member of a number of industry and community boards and committees. He is a Fellow of the IEEE and a member of Sigma Xi, Phi Kappa Phi, Eta Kappa Nu, and Tau Beta Pi. Dr. Jacobs also serves on the Council on Competitiveness, the National Academy of Engineering Committee on Public Awareness of Engineering, the board of directors of Building Engineering & Science Talent, the visiting committee of the MIT Laboratory for Information and Decision Systems, and the California Council on Science and Technology, and is past chairman of the University of California President's Engineering Advisory Council.

An Overview of Research in Automatic Test Input Generation and Fault Localization
Sudipto Ghosh
Colorado State University

Detecting and fixing faults in programs remain challenging problems for software developers. While a number of test execution frameworks are available, testers still mostly depend on manual creation of test inputs. Once test execution results are obtained, programmers start the process of debugging, which is also mostly manual. In this talk, an approach to automatic test generation and an approach to automatic fault localization will be presented.

Robust Workflows for Science and Engineering
David Abramson
Monash University, Melbourne, Australia

Scientific workflow tools allow users to specify complex computational experiments and provide a good framework for robust science and engineering. Workflows can consist of pipelines of tasks that explore the behaviour of some system, involving computations that are either performed locally or on remote computers. Robust scientific methods require the exploration of the parameter space of a system (some of which can be run in parallel on distributed resources), and may involve complete state space exploration, experimental design or numerical optimization techniques. Whilst workflow engines provide an overall framework, they have not been developed with these concepts in mind, and in general, don't provide the necessary components to implement robust workflows. In this seminar I will discuss Nimrod/K - a set of add-in components and a new run time machine for a general workflow engine, Kepler. Nimrod/K provides an execution architecture based on the tagged dataflow concepts developed in the 1980s for highly parallel machines. This is embodied in a new Kepler 'Director' that orchestrates the execution on clusters, Grids and Clouds. Nimrod/K also provides a set of 'Actors' that facilitate the various modes of parameter exploration discussed above. I will demonstrate the power of Nimrod/K to solve real problems in science through a set of case studies.

Biography: David Abramson has been involved in computer architecture and high performance computing research since 1979. Prior to joining Monash University in 1997, he held appointments at Griffith University, CSIRO, and RMIT. At CSIRO he was the program leader of the Division of Information Technology High Performance Computing Program, and he was also an adjunct Associate Professor at RMIT in Melbourne. He was also a program manager in the Co-operative Research Centre for Intelligent Decisions Systems and the Co-operative Research Centre for Enterprise Distributed Systems. From 2007 to 2011, Prof. Abramson was an ARC Professorial Fellow. Currently Abramson is a Professor of Computer Science, Director of the Monash e-Education Centre and Science Director of the Monash e-Research Centre. His current interests are in high performance computer systems design and software engineering tools for programming parallel, distributed supercomputers and stained glass windows. He is a Fellow of the Association for Computing Machinery (ACM) and the Australian Academy of Technological Sciences and Engineering (ATSE), and a member of the IEEE. Prof. Abramson has served on committees for many conferences and workshops, and has published over 200 papers and technical documents. He has given seminars and received awards around Australia and internationally and has received over $8.8 million in research funding. He also has a keen interest in R&D commercialization and consults for Axceleon Inc., which produces an industry-strength version of Nimrod, and Guardsoft, a company focused on commercialising the Guard relative debugger.