Güneş Kayacık


System Call Dataset

The system call dataset contains strace captures of four Linux programs, both under normal conditions and under attack. I collected this data to train system call-based anomaly detectors such as Stide and pH, and to test how well they held up against naive and stealthy (mimicry) attacks.
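For readers who have not met Stide before, here is a minimal sketch of the idea in Python: slide a fixed-length window over normal traces, remember every window seen, and then report the fraction of windows in a new trace that were never seen. This is an illustration only, not the code used with this dataset. The function names, the window length of 6, and the assumed strace line shape (an optional PID prefix followed by `name(args) = result`) are all assumptions, and the locality frame count used by the original Stide is left out for brevity.

```python
import re
from typing import Iterable, List, Sequence, Set, Tuple

# Assumed line shape: optional "1234 " or "[pid 1234] " prefix, then "name(".
SYSCALL_RE = re.compile(r"^\s*(?:\[pid\s+\d+\]\s+|\d+\s+)?([A-Za-z_]\w*)\(")


def parse_strace(lines: Iterable[str]) -> List[str]:
    """Extract system call names, skipping lines that do not look like calls."""
    calls = []
    for line in lines:
        match = SYSCALL_RE.match(line)
        if match:
            calls.append(match.group(1))
    return calls


def windows(calls: Sequence[str], n: int) -> List[Tuple[str, ...]]:
    """Return every contiguous window of n system calls."""
    return [tuple(calls[i:i + n]) for i in range(len(calls) - n + 1)]


def train_stide(traces: Iterable[Sequence[str]], n: int = 6) -> Set[Tuple[str, ...]]:
    """Build the set of windows observed in normal traces."""
    normal: Set[Tuple[str, ...]] = set()
    for trace in traces:
        normal.update(windows(trace, n))
    return normal


def mismatch_rate(normal: Set[Tuple[str, ...]], trace: Sequence[str], n: int = 6) -> float:
    """Fraction of windows in the trace that never appeared during training."""
    seen = windows(trace, n)
    if not seen:
        return 0.0  # trace shorter than one window: nothing to compare
    misses = sum(1 for window in seen if window not in normal)
    return misses / len(seen)
```

A trace made only of windows seen during training scores 0.0, while a naive attack that injects unfamiliar call sequences pushes the rate up. A mimicry attack tries to keep that rate low by reusing sequences the detector already considers normal.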

Red Hat 6.2 is a great vintage if you want to see some buffer overflow vulnerabilities in action. The dataset was collected from a slightly weakened Red Hat host on which root login was enabled. I came up with a few "normal" use cases in order to capture a range of behavior, and I used some of the publicly available exploits to launch buffer overflow attacks. The publication at the bottom of this page contains more details.

The data was used to train anomaly detectors, against which a Genetic Programming attacker then generated stealthy attacks. For more information, please see the paper referenced at the end of this page.

Each file below contains several normal use cases and an attack. The README files provide more detail on how the normal behavior was generated.

If you end up using the dataset in an academic project, please consider citing it as:

If the dataset proves useful to your research or if you discover anything interesting, please let me know :-)