Intermediate · Malware · Machine Learning · Detection

Malware Data Science

Attack Detection and Attribution

4 / 5

Saxe and Sanders apply machine-learning techniques (classification, clustering, deep learning) to malware detection and attribution, with working Python code and real corpora.

Published: 2018
Publisher: No Starch Press
Pages: 272
Language: English

Read this if

You are a malware analyst or detection engineer who wants to scale beyond manual triage. Saxe and Sanders apply classification, clustering, similarity analysis, and deep learning to malware corpora, with working Python code throughout.

Skip this if

Your work is one sample at a time, or you are not yet comfortable with basic Python and statistics. The book is for telemetry-rich environments where ML at scale matters.

Key takeaways

  • Static-feature classifiers can route a triage queue effectively even at scale; the book's chapters on feature engineering repay the effort of working through them.
  • Similarity analysis (locality-sensitive hashing, ssdeep, imphash, function-level fuzzy hashing) is the analyst's lever for clustering campaigns and tracking actor evolution.
  • Deep learning is overhyped for malware in many contexts and exactly the right tool in others; the book is honest about the trade-offs in a way most ML/security books aren't.
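To make the similarity-analysis takeaway concrete, here is a minimal sketch of the general idea. It is not the book's code and not ssdeep or imphash. It assumes printable strings are the features, compares two byte blobs with the Jaccard index, and approximates that score with a simple MinHash signature built from salted BLAKE2b hashes. All function names and the sample bytes are hypothetical, chosen only for illustration.

```python
import hashlib
import re


def extract_strings(data: bytes, min_len: int = 5) -> set:
    """Return the set of printable-ASCII runs of at least min_len bytes."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return set(re.findall(pattern, data))


def jaccard(a: set, b: set) -> float:
    """Exact Jaccard index: shared features divided by all features."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def minhash_signature(features: set, num_hashes: int = 64) -> list:
    """For each salted hash function, keep the smallest digest over the set."""
    if not features:
        raise ValueError("need at least one feature")
    signature = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(4, "little")  # one salt per hash function
        signature.append(
            min(hashlib.blake2b(f, digest_size=8, salt=salt).digest()
                for f in features)
        )
    return signature


def estimate_similarity(sig_a: list, sig_b: list) -> float:
    """Fraction of matching positions approximates the Jaccard index."""
    matches = sum(x == y for x, y in zip(sig_a, sig_b))
    return matches / len(sig_a)


# Hypothetical byte blobs standing in for two related samples.
sample_a = b"\x00\x01GetProcAddress\x00LoadLibraryA\x00http://example.test/gate.php\x00"
sample_b = b"\x90GetProcAddress\x00LoadLibraryA\x00http://example.test/panel.php\x00"

strings_a = extract_strings(sample_a)
strings_b = extract_strings(sample_b)

print(jaccard(strings_a, strings_b))  # 0.5 (2 shared strings out of 4 distinct)
print(estimate_similarity(minhash_signature(strings_a),
                          minhash_signature(strings_b)))
```

The exact Jaccard score needs both full feature sets, while the fixed-length signatures can be stored and compared cheaply across a large corpus, which is the property that makes this family of techniques useful for clustering.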

Notes

Pair with Practical Malware Analysis (Sikorski/Honig) for the manual-analysis foundation and with Joshua Saxe's later work at Sophos for the production-deployment view. The code samples are on GitHub, which makes this one of the rare security books whose examples are still runnable years later. Useful for anyone designing detection at scale; less useful for boutique malware analysts.