// Comparison

Malware Data Science vs Network Security Through Data Analysis: Which Should You Read?

Two cybersecurity books on Detection, compared honestly: who each is for, what each does best, and which to read first.

Intermediate
4/5
2018
Malware Data Science

Attack Detection and Attribution

Joshua Saxe, Hillary Sanders

Saxe and Sanders apply machine-learning techniques (classification, clustering, deep learning) to malware detection and attribution, with working Python code and real corpora.

Intermediate
4/5
2017
Network Security Through Data Analysis

From Data to Action

Michael Collins

Michael Collins on building situational awareness from network telemetry: collection architecture, statistical baseline-setting, and the analytic patterns that turn raw flows into detection.

Read this if

Malware Data Science: malware analysts and detection engineers who want to scale beyond manual triage. Saxe and Sanders apply classification, clustering, similarity analysis, and deep learning to the malware corpus, with working Python code throughout.
Network Security Through Data Analysis: detection engineers and SOC analysts who've graduated from "what alert is this" to "is this alert worth triaging at all." Collins is the quantitative-detection text the field needed.

Skip this if

Malware Data Science: analysts whose work is one-sample-at-a-time, or readers who are not comfortable with basic Python and statistics. The book is for telemetry-rich environments where machine learning at scale matters.
Network Security Through Data Analysis: beginners with no NSM background, or readers who only do log-based detection. The book leans heavily on flow data and statistical thinking; if you're new to the discipline, read The Practice of Network Security Monitoring (Bejtlich) first.

Key takeaways

  • Static-feature classifiers can route a triage queue effectively even at scale; the feature-engineering chapters in Malware Data Science repay the effort they demand.
  • Similarity analysis (locality-sensitive hashing, ssdeep, imphash, function-level fuzzy hashing) is the analyst's lever for clustering campaigns and tracking actor evolution.
  • Deep learning is overhyped for malware in many contexts and exactly the right tool in others; Saxe and Sanders are honest about the trade-offs in a way most ML/security books aren't.
  • Detection engineering at scale is a statistical problem; Network Security Through Data Analysis teaches the framing every modern SOC eventually reinvents.
  • Flow-data analytics (NetFlow / IPFIX / sFlow) catch lateral movement that packet-based detection misses; Collins's book is the cleanest treatment in print.
  • Time-series anomaly detection can be done well with off-the-shelf tooling and clear thinking; Collins's chapters on baseline calibration are the practical core.

How they compare

Malware Data Science and Network Security Through Data Analysis are both rated 4/5 in our catalog, and both target intermediate-level readers. The choice therefore comes down to topic and reading style rather than rating or difficulty: Saxe and Sanders work from the malware corpus, Collins from network telemetry.

Both books cover detection, so reading them in sequence approaches the same problem from two different angles.
