The blue team challenge
Ask anyone who has interacted with a Security Operations Center (SOC) and they will tell you that noisy detections (false positives) are one of the biggest challenges. Many companies have tried to solve this problem, but almost all attempts have fallen short. This article promotes a better solution using artificial intelligence (AI) and machine learning (ML) while remaining easy to understand.
First, to understand the challenge facing blue teams – the defenders tasked with identifying and responding to attacks – recognize that almost any indicator fits into one of two buckets: all detections/indicators can be categorized as signature-based or anomaly-based.
Signature-based detections manifest themselves in things like:
- Look for a running process named “mimikatz.exe”
- Find 50 failed logins in less than 60 minutes
Signature-based detections are easy for attackers to bypass in most cases. Using the two examples above, an attacker can rename the malicious executable mimikatz.exe to notepad.exe to avoid detection. Likewise, if they attempt only 30 failed logins per hour, they stay under the radar because the alert threshold is 50.
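As an illustration, the two rules above can be sketched as simple predicates (the names and thresholds come from the examples; the functions themselves are hypothetical):

```python
# Sketch of the two signature-based rules above. The rule logic is
# illustrative, not taken from any specific product.

def detect_process_name(process_name: str) -> bool:
    """Alert if a process matches a known-bad name."""
    return process_name.lower() == "mimikatz.exe"

def detect_failed_logins(failed_logins_per_hour: int, threshold: int = 50) -> bool:
    """Alert if failed logins exceed a fixed threshold."""
    return failed_logins_per_hour >= threshold

# Both rules are trivially bypassed once the attacker knows them:
print(detect_process_name("notepad.exe"))  # renamed binary -> False (missed)
print(detect_failed_logins(30))            # slow brute force -> False (missed)
```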
The effectiveness of signature-based detections depends heavily on their scope and on keeping secret what is being monitored. A non-technical analogy is seeding a field with tripwires and landmines: if attackers know where your defenses are, they can walk right past them.
A second set of detections are anomaly-based detections. Anomaly-based detections don’t rely on signatures, but instead look for things that aren’t normal. Using the two examples above, the anomaly detections might look like:
- Look for unusual running process names
- Look for statistically significant volumes of failed logins
These anomaly detections are more difficult for attackers to circumvent, but present their own challenges. Specifically, just because something is abnormal doesn’t mean it’s malicious.
Actions like quarterly backups can look statistically similar to data exfiltration, for example. If defenders make these anomaly detections too sensitive, they are bombarded with noise; if they set thresholds too high, they risk missing attacks.
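A minimal sketch of such an anomaly rule, assuming a hypothetical hourly baseline of failed-login counts and an illustrative z-score cutoff:

```python
# Flag failed-login counts that are statistically unusual relative to a
# baseline. Both the baseline data and the cutoff are illustrative
# assumptions; real systems would learn these per user or per host.
from statistics import mean, stdev

baseline = [3, 5, 4, 6, 2, 5, 4, 3, 5, 4]  # hypothetical hourly counts
mu, sigma = mean(baseline), stdev(baseline)

def is_anomalous(count: int, cutoff: float = 3.0) -> bool:
    # Lower the cutoff and normal variation starts alerting (noise);
    # raise it and slow attacks slip through (misses).
    return (count - mu) / sigma > cutoff

print(is_anomalous(30))  # far above baseline -> True
print(is_anomalous(6))   # normal variation -> False
```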
Over the years, some companies have tried to solve this problem by aggregating these indicators. Examples include:
- A provider that groups first-time events, such as “the first time a user logs in from a foreign country,” “the first time a user creates a scheduled task,” and “the first time a user sends 1 GB of data.”
- A provider that assigns points to indicators and watches which entities accumulate the most points.
- A provider that maps detections to an industry framework (e.g., MITRE ATT&CK) and identifies actors leveraging multiple tactics/techniques.
But advances in computing have given us a better way. Artificial intelligence and machine learning solutions are within reach and less complicated than you might think. To demonstrate this, let’s turn to an example from outside cybersecurity.
A “Dummies’ Introduction to Machine Learning”
Consider the question “Will my spouse be home from work before 6:00 p.m.?”, where my spouse leaves work at 5:00 p.m. and the drive home takes 30 minutes. To answer it, several sub-questions need to be considered, such as “Did they leave work on time?” or “Was there traffic on the way home?” These questions are called FEATURES.
Mapping the features to the outcome is fairly intuitive:
Programmatically, this can be expressed as follows:
SELECT COLLECT_SET(actual_result) FROM trips GROUP BY F1, F2
As long as each combination of feature values collects to a single result, we can [in theory] accurately predict the outcome: my spouse will arrive home on time (Result = Yes).
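For readers who prefer Python, the same grouping can be sketched in a few lines (the sample trip rows are illustrative):

```python
# Group observed trips by their feature values (F1 = left on time?,
# F2 = traffic?) and collect the set of outcomes per group -- the Python
# equivalent of the COLLECT_SET ... GROUP BY query above.
from collections import defaultdict

trips = [
    {"left_on_time": True,  "traffic": False, "home_by_6": True},
    {"left_on_time": True,  "traffic": False, "home_by_6": True},
    {"left_on_time": False, "traffic": True,  "home_by_6": False},
]

outcomes = defaultdict(set)
for t in trips:
    outcomes[(t["left_on_time"], t["traffic"])].add(t["home_by_6"])

# Each feature combination maps to a single outcome, so prediction
# is just a lookup.
print(outcomes[(True, False)])  # {True}
```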
However, the problem grows in complexity when the results conflict. Consider this scenario: my spouse did NOT leave work at 5:00 p.m., but traffic was light and they still made it home by 6:00 p.m. In this scenario, we have the same values for feature 1 (F1) and feature 2 (F2), but the actual result is different.
In other words, the predicted result and the actual result disagree. A hypothetical explanation is that the question allows an hour for the trip, while without delays it is actually a 30-minute drive. Effectively, we have 30 minutes of “cushion.”
In this case, the model would be more accurate if we expressed the features as numerical values, such as “How many minutes after 5:00 p.m. did my spouse leave?” (F1) or “How many minutes was my spouse held up in traffic?” (F2).
In our scenario, because my spouse left only 15 minutes after 5:00 p.m., there is enough cushion to predict that they will still arrive before 6:00 p.m. Our model therefore improves when we replace the yes/no values with numerical values. Now we have a working model:
LESSON #1 – How you define features impacts the accuracy of the result.
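The numeric model behind Lesson #1 can be sketched as follows, assuming the 30-minute cushion described above (the function name is hypothetical):

```python
# Spouse leaves at 5:00 p.m., the drive takes 30 minutes, and "on time"
# means home by 6:00 p.m. -- so there is a 30-minute cushion. Feature
# names F1 and F2 follow the article; the rule itself is illustrative.

def home_by_six(f1_minutes_late_leaving: int, f2_minutes_in_traffic: int) -> bool:
    cushion = 60 - 30  # 60-minute window minus the 30-minute drive
    return f1_minutes_late_leaving + f2_minutes_in_traffic <= cushion

print(home_by_six(15, 0))   # left 15 min late, no traffic -> True
print(home_by_six(15, 20))  # total delay 35 > 30 -> False
```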
Even more powerful, I can now derive additional features by combining F1 and F2. I will add a new feature (F99) called “Total Delay,” which is the sum of F1 and F2. The outcome is now determined by this combined feature, which allows the system to “guess” the answer for never-before-seen, never-before-considered scenarios.
Suppose my spouse left 15 minutes late (F1) and was then delayed 20 minutes in traffic (F2). Even though this exact combination has never been seen, the system accurately predicts the outcome based on the similarity of the F99 values:
LESSON #2 – Features can be combined to create additional features to improve accurate results of unknown scenarios.
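Lesson #2 can be sketched as a lookup on the combined feature; the previously seen F99 values below are illustrative assumptions:

```python
# Predict a new scenario from previously seen values of the combined
# feature F99 = F1 + F2. The training mapping is illustrative.

seen = {0: True, 10: True, 25: True, 35: False, 45: False}  # F99 -> home by 6?

def predict(f1_minutes_late: int, f2_minutes_traffic: int) -> bool:
    f99 = f1_minutes_late + f2_minutes_traffic
    # Use the most similar previously seen F99 value.
    nearest = min(seen, key=lambda k: abs(k - f99))
    return seen[nearest]

print(predict(15, 20))  # F99 = 35 matches a seen "No" -> False
```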
There is another consideration when building an AI/ML model. Suppose my spouse stopped at the grocery store for 35 minutes on the way home. Even leaving on time and with no traffic, the resulting table has a conflict: where F99 matches, the actual result and the predicted result differ.
This is because there is additional information that was not reflected in our original model. We need to add a third feature, “How many minutes did they stop on the way home?” (F3), and redefine F99 as F1 + F2 + F3. The resulting table becomes:
With the new feature added, our F99 values map cleanly and once again the model works.
LESSON #3 – When results are not accurate, the most common explanation is that a necessary additional feature was not accounted for in the model.
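A small sketch of Lesson #3, using illustrative trip rows to show the collision and how F3 resolves it:

```python
# Two trips with identical F1 and F2 but different outcomes, because a
# grocery stop (F3) was not modeled. Rows are illustrative.

trips = [
    # (F1 minutes late, F2 minutes traffic, F3 minutes stopped, home by 6?)
    (0, 0, 0,  True),
    (0, 0, 35, False),  # grocery stop: same F1 + F2, different outcome
]

# With F99 = F1 + F2, both rows collapse onto one value -- a conflict.
without_f3 = {f1 + f2 for f1, f2, f3, ok in trips}
print(len(without_f3))  # 1 -- two outcomes collide on a single F99 value

# Redefining F99 = F1 + F2 + F3 separates them again.
with_f3 = {f1 + f2 + f3: ok for f1, f2, f3, ok in trips}
print(with_f3[0], with_f3[35])  # True False -- conflict resolved
```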
Finally, even when the numbers don’t match exactly, we can still make predictions based on the closest match, a principle called “nearest neighbor.” Suppose we add two more scenarios.
Notice that the nearest neighbor of 37 is 35, so we predict an outcome of “No.” In contrast, the nearest neighbor of 14 is 15, so we predict an outcome of “Yes.” In both scenarios, we were right. When our nearest-neighbor estimates are incorrect, we can simply increase the size of our training data to get more accurate predictions.
LESSON #4 – Increasing the size of training data is another way to increase the accuracy of predictions.
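A one-dimensional nearest-neighbor predictor over F99 can be sketched as follows (the training values are illustrative, chosen to match the 37 -> 35 and 14 -> 15 examples above):

```python
# Nearest-neighbor prediction on the total-delay feature F99: return the
# outcome of the closest previously seen value. Training data is
# illustrative; adding more rows improves accuracy (Lesson #4).

training = {0: True, 15: True, 35: False, 50: False}  # F99 -> home by 6?

def predict(f99: int) -> bool:
    nearest = min(training, key=lambda k: abs(k - f99))
    return training[nearest]

print(predict(37))  # nearest neighbor is 35 -> False
print(predict(14))  # nearest neighbor is 15 -> True
```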
Application and next steps
It is the position of this author and LogicHub that the industry could significantly improve the quality of detection if we take additional steps beyond the initial signature/anomaly detection.
Rather than simply aggregating the indicators or trying to respond directly to individual indicators, we would benefit from building a knowledge base of the characteristics associated with the indicators. By using these features in machine learning and artificial intelligence systems, we can better predict what is actionable for the SOC.
LogicHub offers a platform that allows users to create detections, determine features, and take advantage of pre-written machine learning functions like nearest neighbor. The platform includes integrations with hundreds of security tools for enrichment and actionable response.
LogicHub harnesses the power of AI and automation for superior detection and response at a fraction of the cost. From small teams with security challenges to large teams automating SOCs, LogicHub makes advanced detection and response simple and effective for everyone.
*** This is a syndicated blog from the Security Bloggers Network of Blog | LogicHub® written by Anthony Morris. Read the original post at: https://www.logichub.com/blog/using-ai/ml-to-create-better-security-detections