Finding Malware on a Web Scale
About a year ago, we published a paper on Nozzle, a runtime heap-spraying detector. Nozzle examines individual objects in the heap, interpreting them as code and performing a static analysis on that code to detect malicious intent. To reduce false positives, we aggregate measurements across all heap objects and define a global heap health metric.
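The aggregation step can be illustrated with a minimal sketch. This is not Nozzle's implementation: the `HeapObject` type, the `suspicious_bytes` field (standing in for whatever the per-object analysis reports), the ratio used as the metric, and the 0.5 threshold are all assumptions made for illustration. The point is only that a single suspicious object barely moves a heap-wide measure, while a spray that fills the heap with similar objects does.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class HeapObject:
    size: int              # total bytes in the object
    suspicious_bytes: int  # hypothetical: bytes flagged by a per-object analysis


def heap_health(objects: Iterable[HeapObject]) -> float:
    """Return the fraction of all heap bytes flagged as suspicious.

    An empty heap yields 0.0. This ratio is an assumed stand-in for a
    global heap health metric, not the metric defined in the paper.
    """
    total = 0
    flagged = 0
    for obj in objects:
        total += obj.size
        # Clamp so a bad per-object count can never exceed the object size.
        flagged += min(obj.suspicious_bytes, obj.size)
    return flagged / total if total else 0.0


def looks_like_spray(objects: Iterable[HeapObject], threshold: float = 0.5) -> bool:
    """Flag the heap only when the aggregate crosses the (assumed) threshold."""
    return heap_health(objects) > threshold


# One suspicious object among many benign ones stays below the threshold.
benign = [HeapObject(1000, 0)] * 9 + [HeapObject(1000, 900)]
# A heap dominated by suspicious objects crosses it.
sprayed = [HeapObject(1000, 950)] * 10
```

Judging the heap as a whole rather than object by object is what keeps an isolated false match from raising an alarm in this sketch.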
We measured the effectiveness of Nozzle by demonstrating that it successfully detects 12 published and 2,000 synthetically generated heap-spraying exploits. We also showed that, even with a detection threshold set six times lower than is required to detect published malicious attacks, Nozzle reports no false positives when run over 150 popular Internet sites. Using sampling and concurrent scanning to reduce overhead, we showed that the performance overhead of Nozzle is less than 7% on average.
Since then, however, we have discovered that even overheads this low might not be acceptable for in-browser deployment, so we have switched our strategy to offline scanning. We have also discovered that static analysis is much more successful at detecting malware than we previously believed. In the remainder of the talk I’ll describe our experience finding and analyzing thousands of malware attacks in the wild and show some exciting demos.
Ben Livshits is a researcher at Microsoft Research in Redmond, WA and an affiliate faculty member at the University of Washington. Originally from St. Petersburg, Russia, he received a bachelor’s degree in Computer Science and Math from Cornell University in 1999, and his M.S. and Ph.D. in Computer Science from Stanford University in 2002 and 2006, respectively.
He is known for his work in software reliability and especially tools to improve software security, with a primary focus on approaches to finding buffer overruns in C programs and a variety of security vulnerabilities (cross-site scripting, SQL injections, etc.) in Web-based applications. He is the author of several dozen academic papers and patents. Lately he has been focusing on how the reliability, performance, and security of Web applications and browsers can be improved through a combination of static and runtime techniques.