I am generally interested in network security and traffic analysis, and in the challenges that arise when attempting to design security systems which are performant, effective, and usable. My interests cover:
- Web and cloud security
- Malware detection and understanding
- Usable security
- High-speed packet processing
- Efficient analysis of application-level network protocols
I am working on the following topics: correctness and security of cloud applications; online malware analysis; hardware-accelerated traffic processing; and web security, particularly the prevention of cross-site scripting attacks. I am also currently looking for capable students. If you are a current or prospective grad student, or an undergraduate willing to gain research experience, and you are interested in working on the topics above, feel free to reach out to me!
Generation of specifications for malware network protocols
Detecting malware communication from vantage points within the network is complex for various reasons. The rate at which new malware families are released makes it infeasible for analysts to gain a deep understanding of how each family communicates; furthermore, modern malware actively attempts to avoid detection by using custom communication protocols, which are often encrypted. In this project, we proposed a novel protocol inference algorithm which automatically generates a formal specification of the application-level protocol used by a malware family, along with detection procedures which can identify the protocol within network traffic. Our algorithm works in an automated fashion, requiring only the malware's binary and samples of the malware's network communication, and can circumvent the malware's use of encryption. If you are interested in this work, check out our INFOCOM 2017 paper.
Scope-aware scheduling for intrusion detection systems
Traffic analysis performed by intrusion detection systems (IDSs) presents unique challenges. On one hand, analysis has to sustain high throughput to search ever-increasing volumes of traffic; IDSs should therefore be scalable, and able to parallelize their workload over an arbitrary number of processing units. On the other hand, such scalability should not impose excessive constraints on developers of traffic analysis algorithms, so as not to limit the functionality and effectiveness of IDSs. In this context, we developed a domain-specific concurrency model based on the notion of detection scope: a unit for partitioning network traffic such that the traffic contained in each resulting "slice" is independent for detection purposes. We then developed a program analysis technique that can automatically infer the appropriate scope given an analysis algorithm. The overall vision is that of an IDS where the operator can develop analyses as she sees fit, while the system automatically reasons about the best way to parallelize them. The results of this work were presented in our CCS 2014 paper.
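To make the idea of a detection scope more concrete, here is a minimal Python sketch of scope-based partitioning. It is an illustration only, not the system described in the paper: the `Packet` type, the two scope functions, and the CRC-based dispatch are all assumptions made for this example, and the scope is chosen by hand here rather than inferred by program analysis.

```python
import zlib
from collections import defaultdict
from typing import Callable, Dict, Hashable, Iterable, List, NamedTuple


class Packet(NamedTuple):
    """Hypothetical, simplified packet record used only for this sketch."""
    src: str
    dst: str
    sport: int
    dport: int
    payload: bytes


def per_source_scope(pkt: Packet) -> Hashable:
    # Suitable for analyses whose state depends only on the sender,
    # e.g. counting how many distinct ports one host has probed.
    return pkt.src


def per_connection_scope(pkt: Packet) -> Hashable:
    # Direction-independent key: both directions of a connection map
    # to the same scope value.
    a, b = (pkt.src, pkt.sport), (pkt.dst, pkt.dport)
    return (a, b) if a <= b else (b, a)


def partition(packets: Iterable[Packet],
              scope: Callable[[Packet], Hashable],
              n_workers: int) -> Dict[int, List[Packet]]:
    """Assign every packet to a worker so that packets sharing a scope
    value always land in the same slice."""
    slices: Dict[int, List[Packet]] = defaultdict(list)
    for pkt in packets:
        key = repr(scope(pkt)).encode()
        # A stable hash keeps the mapping identical across runs.
        slices[zlib.crc32(key) % n_workers].append(pkt)
    return dict(slices)
```

Because all packets with the same scope value reach the same worker, each worker can keep its detection state locally, without synchronizing with the others. Picking a coarser scope than an analysis needs reduces available parallelism, while picking a finer one would split state that belongs together, which is why inferring the right scope automatically is valuable.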
As part of this work I also participated in the design of HILTI, a framework for deep packet inspection (DPI) consisting of a domain-specific intermediate representation (IR) and a runtime. The goal of HILTI is to provide a generic way to express DPI programs, making their functionality easy to reuse. Indeed, we used HILTI's program analysis capabilities as a foundation to implement our IDS parallelization approach. HILTI was presented in our IMC 2014 paper.
Flexible lookup module for network devices
As part of my research I participated in the development of PLUG (Pipelined LookUp Grid) - a flexible network lookup module designed to be employed in network devices. The goal of our research is to create an "intelligent memory" that can perform network lookups (e.g. for IP forwarding) at hardware speed while being easily reconfigurable to support different data structures (hash tables, trees) and protocols. The core idea to achieve this goal is to express lookup algorithms as dataflow graphs incorporating both lookup data structures and computation. This algorithmic representation can then be compiled and mapped to an array of microcores (the PLUG hardware), which executes it efficiently in a pipelined fashion. The results of this work were the subject of our SIGCOMM 2009 paper. A detailed description of the PLUG hardware architecture was presented at PACT 2010, and a PLUG-based solution for efficient packet classification was presented at ANCS 2011.
I then contributed to further research exploring alternative implementations of the PLUG dataflow concept. LEAP (ANCS 2012) replaces PLUG microcores with fixed-function units, reducing latency and power consumption at the price of some flexibility. SWSL (ANCS 2013) is a compiler that can synthesize hardware lookup pipelines (expressed as PLUG-style dataflow graphs) from C++. In this case, the goal is to reduce hardware design/verification overhead while retaining high performance and low power consumption.
Constraint-based architectural scheduling
PLUG and other similar communication-exposed architectures allow the compiler significant freedom in mapping instructions to cores (and organizing the necessary communication). Compilers typically use heuristics to derive efficient mappings; however, such an approach is complex, architecture-specific, and does not provide guarantees on the quality of the resulting mappings. Generalizing insights from the PLUG work, we designed a general framework that can produce schedules for any spatial architecture. In this approach, the architecture is described as a set of constraints on the placement of computation on cores and the routing of on-chip data. Many constraints are common to most spatial processors; support for a new processor can be implemented by adding a handful of constraints specific to its architecture. Our work on constraint-based scheduling was presented at PLDI 2013 and further detailed in our TOPLAS article.
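The flavor of this approach can be conveyed with a toy Python sketch. Everything in it is an assumption made for illustration: the grid of cores, the two constraints, and in particular the exhaustive search, which stands in for whatever solver a real framework would use and only works for tiny inputs.

```python
from itertools import product
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Core = Tuple[int, int]              # (row, column) of a core in a grid
Placement = Dict[str, Core]         # operation name -> core
Edge = Tuple[str, str]              # producer -> consumer dependency
Constraint = Callable[[Placement, Sequence[Edge]], bool]


def one_op_per_core(placement: Placement, edges: Sequence[Edge]) -> bool:
    # Hypothetical capacity constraint: a core hosts at most one operation.
    return len(set(placement.values())) == len(placement)


def neighbours_only(placement: Placement, edges: Sequence[Edge]) -> bool:
    # Hypothetical routing constraint: communicating operations must sit
    # on cores that are adjacent in the grid (Manhattan distance 1).
    def dist(a: Core, b: Core) -> int:
        return abs(a[0] - b[0]) + abs(a[1] - b[1])
    return all(dist(placement[u], placement[v]) == 1 for u, v in edges)


def schedule(ops: Sequence[str],
             edges: Sequence[Edge],
             cores: Sequence[Core],
             constraints: List[Constraint]) -> Optional[Placement]:
    """Return the first placement satisfying every constraint, or None.

    Brute force over all assignments: exponential, for illustration only.
    """
    for assignment in product(cores, repeat=len(ops)):
        placement = dict(zip(ops, assignment))
        if all(check(placement, edges) for check in constraints):
            return placement
    return None


# Example: a three-stage lookup pipeline on a 2x2 grid of cores.
grid = [(r, c) for r in range(2) for c in range(2)]
pipeline = schedule(["load", "hash", "compare"],
                    [("load", "hash"), ("hash", "compare")],
                    grid,
                    [one_op_per_core, neighbours_only])
```

In this sketch the architecture is described entirely by the list of constraint functions, so targeting a different processor amounts to swapping or adding entries in that list rather than rewriting the scheduler.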
Fingerprinting network problems
During my internship at Microsoft Research India, I contributed to the development of a tool - Deja Vu - for identifying and classifying network problems. The principle is to generate simple, human-readable signatures by extracting a series of features from network traces. Deja Vu then uses a novel learning algorithm to group problem signatures into clusters. The advantage of such an approach is that signatures can be effectively used to map failures to known problems (existing clusters), and to detect the occurrence of new problems (new clusters). Also, the signatures - being simple and human-readable - can be used by experts to diagnose problems instead of time-consuming network capture analysis. Deja Vu was presented at the CoNEXT 2011 conference.
Parallel data transfers
For my master's thesis, I participated in the development of a session-layer protocol - called PATTHEL - that aggregates multiple TCP connections into a single logical channel. PATTHEL is similar in spirit to SCTP but it is implemented on top of TCP, which makes it easier to deploy and more flexible. For example, it can be employed to aggregate the throughput of multiple NICs, but also to spread the load among multiple low-bandwidth relays in a P2P network. PATTHEL was presented at the ISCC 2009 conference.
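The general idea of aggregating several connections into one logical channel can be sketched in a few lines of Python. This is not PATTHEL's actual design, whose framing and scheduling are not described here: the round-robin striping, the `(sequence number, payload)` chunk format, and the in-memory lists standing in for TCP connections are all assumptions made for this example.

```python
from typing import List, Tuple

Chunk = Tuple[int, bytes]  # (sequence number, payload)


def stripe(data: bytes, n_channels: int, chunk_size: int) -> List[List[Chunk]]:
    """Split a byte stream into numbered chunks and spread them
    round-robin over n_channels (each list stands in for one connection)."""
    channels: List[List[Chunk]] = [[] for _ in range(n_channels)]
    for seq, offset in enumerate(range(0, len(data), chunk_size)):
        channels[seq % n_channels].append((seq, data[offset:offset + chunk_size]))
    return channels


def reassemble(channels: List[List[Chunk]]) -> bytes:
    """Rebuild the original stream by ordering chunks on their sequence
    number, regardless of which channel delivered them."""
    chunks = sorted((c for channel in channels for c in channel),
                    key=lambda c: c[0])
    return b"".join(payload for _, payload in chunks)


message = b"aggregate multiple connections into one logical channel"
assert reassemble(stripe(message, n_channels=3, chunk_size=4)) == message
```

Each underlying TCP connection already delivers its own chunks in order; the sequence numbers are only needed to interleave chunks that arrive over different connections.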