System Call Dataset
The system call dataset contains strace captures of four Linux programs, recorded both under normal conditions and under attack. I collected this data to train system call-based anomaly detectors such as Stide and pH, and to test how well they worked against naive and stealthy (mimicry) attacks.
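As a rough illustration of how strace captures can be turned into detector input, the sketch below pulls the system call names out of strace-style text. It assumes the default strace line layout (an optional PID prefix, then `name(args) = result`) with no timestamps; the files in this dataset may be laid out differently, so check the README files first. The function name `syscall_names` is hypothetical.

```python
import re

# Assumed layout: optional "[pid N]" or bare PID prefix, then the call name
# immediately followed by "(". Signal and exit lines do not match.
_CALL = re.compile(r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?([A-Za-z_]\w*)\(")

def syscall_names(lines):
    """Return the sequence of system call names found in strace-style lines."""
    names = []
    for line in lines:
        match = _CALL.match(line.strip())
        if match:
            names.append(match.group(1))
    return names
```

Lines that do not start with a call, such as signal notifications or exit markers, are skipped, so the result is just the ordered list of call names.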
Red Hat 6.2 is a great vintage if you want to see some buffer overflow vulnerabilities in action. The dataset was collected from a slightly weakened Red Hat host on which root login was enabled. I came up with a few "normal" use cases to capture different behavior, and used some of the publicly available exploits to launch buffer overflow attacks. The publication at the bottom of this page contains more details.
The data was used to train anomaly detectors, against which a Genetic Programming attacker then generated stealthy attacks.
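To make the training step concrete, here is a minimal sketch of a Stide-style detector. It is a simplification for illustration, not the implementation used in the thesis. It stores every length-k window of system calls seen in normal traces and reports the fraction of windows in a test trace that were never seen. The names `windows`, `train`, and `mismatch_rate` are hypothetical, and the default window length of 6 is an assumption.

```python
def windows(trace, k):
    """All contiguous length-k windows of a system call sequence."""
    return [tuple(trace[i:i + k]) for i in range(len(trace) - k + 1)]

def train(normal_traces, k=6):
    """Build the database of windows observed in normal behavior."""
    db = set()
    for trace in normal_traces:
        db.update(windows(trace, k))
    return db

def mismatch_rate(db, trace, k=6):
    """Fraction of windows in the trace that are absent from the database."""
    seen = windows(trace, k)
    if not seen:
        return 0.0  # trace shorter than k: nothing to compare
    misses = sum(1 for w in seen if w not in db)
    return misses / len(seen)
```

A trace made entirely of windows seen during training scores 0.0, while a naive attack that introduces unfamiliar call sequences raises the score. A mimicry attack, in these terms, tries to reach its goal while keeping every window inside the database.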
Each file below contains several normal use cases and an attack. README files provide more detail on how the normal behavior was generated.
- Download Traceroute system call dataset
- Download FTP system call dataset
- Download Samba system call dataset
- Download Restore system call dataset
- Kayacik H. G., "Can The Best Defense Be A Good Offense? Evolving (Mimicry) Attacks For Detector Vulnerability Testing Under A 'Black-Box' Assumption", PhD Thesis, Dalhousie University, May 2009.
If the dataset proves useful to your research or if you discover anything interesting, please let me know :-)