25.8.20

Applied Data Science and Machine Learning for Cybersecurity

This interactive course teaches security professionals how to use data science techniques to quickly manipulate and analyze network and security data and uncover valuable insights from it. The course covers the entire data science process: data preparation, feature engineering and selection, exploratory data analysis, data visualization, machine learning, model evaluation and optimization, and finally implementation at scale, all with a focus on security-related problems. Participants will learn how to read in data in a variety of common formats, then write scripts to analyze and visualize that data.

A non-exhaustive list of topics covered:

  • Using machine learning to detect network attacks within your organization
  • Hunting anomalous indicators of compromise and reducing false positives
  • Quickly and efficiently parsing executables, log files, and pcap files, and extracting artifacts from them
  • Writing scripts to efficiently read and manipulate CSV, XML, and JSON files
  • Using the Pandas library to quickly manipulate tabular data
  • Preprocessing raw security data for machine learning and feature engineering
  • Building, applying, and evaluating machine learning algorithms to identify potential threats
  • Automating the process of tuning and optimizing machine learning models
  • Using supervised learning algorithms such as Random Forests, Naive Bayes, K-Nearest Neighbors (K-NN), and Support Vector Machines (SVM) to classify malicious URLs and identify SQL injection
  • Applying unsupervised learning algorithms such as K-Means clustering to detect anomalous behavior
  • Rapidly and effectively visualizing data using Python

Finally, we will introduce students to cutting-edge Big Data tools, including Apache Spark (PySpark), Apache Drill, and GPU-accelerated parallel computing frameworks, and demonstrate how to apply these techniques to extremely large datasets.
A real-time capture-the-flag (CTF) event will run throughout the class to help you sharpen your data science skills.

Skills / Knowledge

  • Defense
  • Network