Threat Detection: 4 Machine Learning Challenges

Near Learn
4 min read · Dec 7, 2022


The growth of machine learning and its ability to provide deep insights using big data remains a hot topic. Many C-level executives are intentionally developing ML initiatives to see how their companies can benefit, and cyber security is no exception. Most data security vendors have adopted ML of some sort, but it is clear that it is not the silver bullet some have made it out to be.

While ML solutions for cyber security can and will provide a significant return on investment, they face some challenges today. Organizations must be aware of some potential pitfalls and set realistic goals to realize the full potential of ML.

False positives and alert fatigue

The biggest criticism of ML detection software is the overwhelming number of alerts it generates — think millions of alerts per day, effectively mounting a denial-of-service attack against analysts. This is especially true of “static analysis” approaches that rely heavily on what threats look like.


Even an ML-based detection solution that is 97% accurate may not help, because the underlying math works against it.

Suppose there is one real threat among 10,000 users on the organization’s network. A 97% accurate detector will almost certainly flag the true attacker, but it will also misfire on roughly 3% of the 9,999 benign users, generating around 300 false alarms. Applying Bayes’ rule, the probability that any given alert is a real attack is about 0.97 / (0.97 + 300), or roughly 0.3%. In other words, even at 97% accuracy, more than 99% of alerts are false positives.
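Plugging the numbers into Bayes’ rule makes the base-rate problem concrete. A minimal sketch, under the simplifying assumption (not spelled out in the original) that “97% accurate” means both a 97% true-positive rate and a 3% false-positive rate:

```python
# One real threat among 10,000 users; detector assumed to have a
# 97% true-positive rate and a 3% false-positive rate.
p_threat = 1 / 10_000
p_alert_given_threat = 0.97
p_alert_given_benign = 0.03

# Bayes' rule: P(threat | alert) = P(alert | threat) P(threat) / P(alert)
p_alert = (p_alert_given_threat * p_threat
           + p_alert_given_benign * (1 - p_threat))
p_threat_given_alert = p_alert_given_threat * p_threat / p_alert

print(f"P(alert is a real attack) = {p_threat_given_alert:.2%}")  # about 0.32%
```

Raising accuracy barely helps here; the base rate of 1 in 10,000 dominates the result.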

Since accuracy much beyond 97% may not be achievable, the best way to address this is to shrink the population under evaluation, using allow-listing or prior filtering informed by domain expertise. This may mean focusing on highly trusted, privileged users or on a specific critical business unit.
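As a toy illustration of that prior filtering, alerts can be triaged against a scoped, high-value population before they ever reach an analyst. The usernames, scores, and the `privileged_users` set below are all invented:

```python
# Hypothetical alert stream: (username, alert_score) pairs.
alerts = [
    ("alice", 0.91),   # privileged admin
    ("bob", 0.88),
    ("carol", 0.95),   # service account with elevated rights
    ("dave", 0.90),
]

# Domain expertise: restrict evaluation to privileged accounts,
# shrinking the population the detector must reason about.
privileged_users = {"alice", "carol"}

triaged = [(user, score) for user, score in alerts
           if user in privileged_users]
print(triaged)   # only alerts on privileged accounts survive triage
```

A smaller, higher-risk population raises the base rate of real attacks among the alerts an analyst actually sees, which is exactly what the Bayes arithmetic above demands.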

Dynamic environments

ML algorithms work by learning the environment and establishing baseline parameters before monitoring for anomalous events that may indicate a compromise. However, if the IT environment is constantly changing to meet the needs of business agility, there is no stable baseline: the algorithm cannot effectively determine what is normal and will raise alerts on perfectly benign events.
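A toy example makes the failure mode concrete: a detector that learns a fixed baseline and flags three-sigma outliers will flag every normal day after a legitimate change shifts what “normal” means. All numbers below are invented for illustration:

```python
import statistics

# Baseline learned before the environment changed, e.g. daily
# outbound traffic per host in GB (invented numbers).
baseline = [1.0, 1.2, 0.9, 1.1, 1.0, 0.95, 1.05]
mean = statistics.mean(baseline)
stdev = statistics.stdev(baseline)

def is_anomalous(value, threshold=3.0):
    """Flag values more than `threshold` standard deviations out."""
    return abs(value - mean) / stdev > threshold

# After a legitimate migration, hosts now routinely push ~5 GB/day.
# Against the stale baseline, every normal day looks like an attack.
new_normal_traffic = [4.8, 5.1, 5.0]
print([is_anomalous(v) for v in new_normal_traffic])  # [True, True, True]
```

Unless the baseline is retrained after the change, every one of these benign values fires an alert.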

To help mitigate this impact, security teams working in DevOps environments need to know what changes are being made and update their tooling accordingly. The term DevSecOps (development, security, and operations) is gaining traction because each of these elements must stay synchronized and operate with shared awareness.


Context

The power of ML comes from its ability to correlate many variables at scale to develop its predictions. However, when an actual alert reaches a security analyst’s queue, that powerful correlation becomes a black box, leaving little more than a ticket that says, “Warning.” From there, an analyst must sift through logs and events to work out why the tool fired.

The best way to mitigate this challenge is to enable a security operations center with tools that can quickly filter through log data on the triggering unit. This is one area where artificial intelligence can help automate and accelerate data contextualization. Data visualization tools can also help by providing a faster timeline of events combined with an understanding of a specific environment. A security analyst can then rapidly determine why the ML software sent the alert and whether it is legitimate.
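The first step of that triage is usually a time-window query around the triggering event. A minimal sketch, with hypothetical log records and field layout:

```python
from datetime import datetime, timedelta

# Hypothetical log store: (timestamp, host, message) tuples.
logs = [
    (datetime(2022, 12, 7, 9, 58), "web01", "login succeeded"),
    (datetime(2022, 12, 7, 10, 1), "web01", "new admin user created"),
    (datetime(2022, 12, 7, 10, 3), "db01",  "bulk export started"),
    (datetime(2022, 12, 7, 11, 30), "web01", "routine backup"),
]

def context_window(logs, alert_time, host, minutes=10):
    """Return the events on `host` within `minutes` of the alert."""
    delta = timedelta(minutes=minutes)
    return [entry for entry in logs
            if entry[1] == host and abs(entry[0] - alert_time) <= delta]

alert_time = datetime(2022, 12, 7, 10, 0)  # when the ML tool fired
for entry in context_window(logs, alert_time, "web01"):
    print(entry)
```

In a real SOC the same query runs against a SIEM rather than an in-memory list, but the principle is identical: narrow the haystack to the minutes and machine around the alert.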

Anti-ML attacks

The ultimate challenge for ML is attackers who adapt quickly and bypass detection. When this happens, the effects can be catastrophic, as researchers recently demonstrated by subtly altering a 35 MPH road sign so that a Tesla accelerated toward 85 MPH.

ML in security is no different. A classic example is an ML network-detection algorithm that used byte-frequency analysis to determine, very effectively, whether traffic was benign or shellcode. Attackers adapted quickly with polymorphic blending attacks, padding their shellcode with extra bytes to change the byte frequency and bypass the detection algorithm entirely. It is a standing reminder that no tool is bulletproof and that security teams need to continually assess their security posture and stay current on the latest attack trends.
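The byte-frequency idea, and how padding defeats it, can be sketched in a few lines. Here a naive detector scores payloads by the fraction of non-printable bytes; padding the same payload with bytes common in benign text drags the score down. The payload, padding, and any threshold are all invented for illustration:

```python
def high_byte_ratio(data: bytes) -> float:
    """Fraction of bytes outside printable ASCII: a crude
    stand-in for 'looks like shellcode rather than text'."""
    return sum(1 for b in data if b < 0x20 or b > 0x7E) / len(data)

# Invented stand-in payload: NOP sled plus a few opcode-like bytes.
shellcode = bytes([0x90] * 20 + [0xCC, 0xEB, 0xFE] * 10)
print(round(high_byte_ratio(shellcode), 2))   # 1.0 -- pure "binary" bytes

# Polymorphic blending: pad the payload with bytes that are common
# in benign traffic so the overall byte profile looks like text.
blended = shellcode + b" the quick brown fox " * 50
print(round(high_byte_ratio(blended), 2))     # 0.05 -- slips under a naive cutoff
```

The attack payload is unchanged; only its statistical fingerprint moved, which is exactly why frequency-based detectors alone are brittle.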

ML can be extremely effective in enabling and advancing security teams. The ability to automate identification and correlate data can save significant time for security practitioners.

However, the key to a better security posture is human-machine teaming: a symbiotic relationship between machine (an evolving library of indicators of compromise) and human (penetration testers and a cadre of white-hat hackers).
