Abstract

The field of computational mechanics applies ideas from statistical mechanics, information theory, automata theory, and machine learning to create minimally sized, optimal predictors of stochastic processes. These predictors, called ε-machines, are a subset of a well-known class of statistical models, the Hidden Markov Models (HMMs). Despite being a subset, ε-machines have several important advantages over traditional HMMs. This dissertation illustrates these advantages by applying ε-machines to several problems in computer security: anomaly-based intrusion detection in High Performance Computing (HPC) environments, automated protocol reverse engineering, and structural drift.

Intrusion detection systems (IDSs) detect attacks on computer systems at the host or network level. IDS research is largely ad hoc, and often produces systems that cannot generalize to new attacks or that raise a prohibitive number of alerts. Our first application attempts to address these shortcomings for HPC environments. We construct ε-machine classifiers from the communication patterns of cluster nodes, as well as from hardware counters including floating-point and integer operation counts. We find these features are sufficient for accurate classification of parallel computation as well as detection of anomalous behavior.

Next, consider computers on a network exchanging data using some protocol whose specification is unknown, for example a botnet command-and-control channel. Our work in automated protocol reverse engineering constructs a protocol ε-machine using only observed network traffic. The ε-machine captures both the topological and probabilistic structure of the protocol and is used for anomaly detection, traffic generation, and fuzzing without requiring access to binaries or source code.
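
To make the distinction between topological and probabilistic structure concrete, the following is a minimal sketch and not the dissertation's implementation. It assumes a toy two-state machine over a binary alphabet (the Golden Mean Process, chosen here purely for illustration), a known start state, and a hypothetical helper named log_likelihood. A sequence that uses a transition absent from the machine is a topological anomaly; an allowed sequence receives a log-probability that could be compared against a threshold.

```python
import math

# Hypothetical toy machine (Golden Mean Process), not taken from the dissertation.
# Each state maps a symbol to (probability, next state). Every state has at most
# one outgoing transition per symbol, so the path through the machine is
# determined by the start state and the observed symbols.
GOLDEN_MEAN = {
    "A": {"1": (0.5, "A"), "0": (0.5, "B")},
    "B": {"1": (1.0, "A")},  # "0" is not allowed after a "0"
}

def log_likelihood(machine, start, symbols):
    """Return the log-probability of `symbols` starting from state `start`.

    Returns -inf if the sequence uses a transition the machine lacks
    (a topological anomaly).
    """
    state = start
    total = 0.0
    for symbol in symbols:
        edge = machine[state].get(symbol)
        if edge is None:
            return float("-inf")
        prob, state = edge
        total += math.log(prob)
    return total

if __name__ == "__main__":
    print(log_likelihood(GOLDEN_MEAN, "A", "1011"))  # allowed sequence
    print(log_likelihood(GOLDEN_MEAN, "A", "100"))   # forbidden "00"
```

In a real protocol setting the alphabet would be message types or fields rather than bits, and the machine would be inferred from traffic rather than written by hand; both are outside the scope of this sketch.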

Finally, we introduce a model of sequential inference to study the propagation of errors in chains of ε-machine learners. This model, called structural drift, is a generalization of memoryless drift models found in the field of population dynamics. We examine the drift of memoryful models in process space and discuss the impact of model structure on the propagation of errors through time. This propagation has implications for all finite-data applications of the ε-machine.
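
The memoryless special case mentioned above can be sketched in a few lines. This is an illustrative assumption, not the dissertation's model: each learner in the chain estimates the bias of a coin from a finite sample generated with the previous learner's estimate, then passes its own estimate on. The function name drift_chain and its parameters are hypothetical. Sampling error accumulates from one learner to the next, and the estimate can eventually be absorbed at 0 or 1, where no variation remains.

```python
import random

def drift_chain(p0, sample_size, generations, seed=0):
    """Simulate a chain of learners passing on a coin-bias estimate.

    Each learner draws `sample_size` flips using the previous estimate,
    takes the observed frequency of heads as its own estimate, and hands
    that estimate to the next learner. Returns the list of estimates,
    starting with `p0`.
    """
    rng = random.Random(seed)
    p = p0
    history = [p]
    for _ in range(generations):
        heads = sum(1 for _ in range(sample_size) if rng.random() < p)
        p = heads / sample_size  # maximum-likelihood estimate from finite data
        history.append(p)
        if p in (0.0, 1.0):
            break  # absorbing estimate: every later sample is identical
    return history

if __name__ == "__main__":
    for estimate in drift_chain(p0=0.5, sample_size=20, generations=200):
        print(estimate)
```

Structural drift, as described in the abstract, generalizes this picture from a single memoryless parameter to memoryful models, where the structure of the model itself can change along the chain.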

Details

Title: Security applications of the ε-machine
Author: Whalen, Sean Harrison
Year: 2010
Publisher: ProQuest Dissertations Publishing
ISBN: 978-1-124-31900-1
Source type: Dissertation or Thesis
Language of publication: English
ProQuest document ID: 808524323
Copyright: Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.