Threat Hunting Techniques, Checklist, Examples, Execution, Metrics

Threat Hunting Techniques Most Commonly Used in the Industry

1. Searching

Searching is the simplest hunting method: it is the process of querying data for specific results, and it can be performed using many tools. To avoid result overload, searching requires well-specified search criteria. There are two primary pitfalls to consider when searching:

Searching too broadly may produce far too many results to be useful.
Searching too narrowly, for example only on specific hosts, may produce too few results to be useful.
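
As a minimal sketch of a well-specified search, assume log events are available as Python dictionaries. The field names (host, process, cmdline) and the sample events are hypothetical and only serve the illustration. Combining several criteria narrows the results, and a result cap guards against overload:

```python
def search_events(events, max_results=100, **criteria):
    """Return events whose fields contain every given criterion (case-insensitive)."""
    hits = []
    for event in events:
        if all(str(value).lower() in str(event.get(field, "")).lower()
               for field, value in criteria.items()):
            hits.append(event)
            if len(hits) >= max_results:
                break  # stop early instead of flooding the analyst
    return hits

# Hypothetical sample data
events = [
    {"host": "ws-01", "process": "powershell.exe", "cmdline": "powershell -enc AAAA"},
    {"host": "ws-02", "process": "cmd.exe", "cmdline": "cmd /c whoami"},
    {"host": "ws-01", "process": "cmd.exe", "cmdline": "cmd /c dir"},
]

broad = search_events(events, process="exe")                        # matches every event
narrow = search_events(events, host="ws-01", process="powershell")  # matches one event
```

The broad query returns all three events, while the narrower one returns a single event that is worth reading.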

2. Clustering

Clustering is a statistical approach, sometimes implemented with machine learning, that separates groups (clusters) of related data points from a larger collection of data based on specific criteria. Hunters may use clustering for many applications, including outlier detection, because it can accurately find aggregate behaviors such as an uncommon number of instances of a certain occurrence. This technique works well when dealing with a sizable set of data points that do not share immediately apparent behavioral features.
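
The idea can be sketched with a very simple one-dimensional clustering (single-linkage with a fixed gap), which is only one of many possible clustering methods. The connection counts and the max_gap value below are hypothetical. Small clusters that sit far from the bulk of the data are candidates for review:

```python
def cluster_1d(values, max_gap):
    """Single-linkage clustering in one dimension.

    Sorted neighbours that are at most max_gap apart end up in the same cluster.
    """
    clusters = []
    for value in sorted(values):
        if clusters and value - clusters[-1][-1] <= max_gap:
            clusters[-1].append(value)
        else:
            clusters.append([value])
    return clusters

# Hypothetical daily outbound connection counts, one value per host
counts = [12, 15, 11, 14, 13, 16, 240, 12, 250]
clusters = cluster_1d(counts, max_gap=10)

# Clusters with very few members are potential outliers.
small_clusters = [c for c in clusters if len(c) <= 2]
```

In practice a hunter would more likely use several features per data point and a library implementation, but the principle of looking at small or distant clusters stays the same.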

3. Grouping

Grouping takes a set of several unique artifacts and determines when several of them show up together according to predetermined criteria. The major difference between grouping and clustering is that grouping starts from an explicit set of items that are already of interest, which may represent a tool or a tactic, technique, or procedure (TTP) used by an attacker. Deciding on the precise criteria used to group items, such as events occurring within a certain time window, is a crucial step in applying this approach. This technique works best when the hunter is looking for multiple related instances of unique artifacts, for example when isolating reconnaissance commands that were executed within a specific timeframe.
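
The reconnaissance example can be sketched as follows. The command set, the five-minute window, the threshold of three distinct commands, and the event tuples are all assumptions chosen for illustration:

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Illustrative set of commands of interest; a real hunt would tailor this list.
RECON_COMMANDS = {"whoami", "ipconfig", "net", "nltest", "systeminfo"}

def group_recon(events, window=timedelta(minutes=5), min_distinct=3):
    """Return hosts where min_distinct different recon commands ran within one window.

    events is an iterable of (timestamp, host, command) tuples.
    """
    by_host = defaultdict(list)
    for ts, host, command in events:
        if command in RECON_COMMANDS:
            by_host[host].append((ts, command))

    flagged = set()
    for host, items in by_host.items():
        items.sort()
        for i, (start, _) in enumerate(items):
            # Distinct commands seen from this event up to the end of the window
            seen = {cmd for ts, cmd in items[i:] if ts - start <= window}
            if len(seen) >= min_distinct:
                flagged.add(host)
                break
    return flagged

t0 = datetime(2024, 1, 1, 9, 0)
events = [
    (t0, "ws-01", "whoami"),
    (t0 + timedelta(minutes=1), "ws-01", "ipconfig"),
    (t0 + timedelta(minutes=2), "ws-01", "systeminfo"),
    (t0, "ws-02", "whoami"),
    (t0 + timedelta(hours=2), "ws-02", "ipconfig"),
    (t0 + timedelta(hours=4), "ws-02", "net"),
]
flagged_hosts = group_recon(events)
```

Only ws-01 is flagged, because ws-02 ran the same kind of commands spread over several hours.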

4. Stack Counting

Stack counting, commonly known as stacking, is one of the methods that hunters use most frequently to test a hypothesis. Stacking is the process of counting the frequency of values of a particular type and examining the extremes or outliers of those results. The method is less effective on vast or diverse data sets and works best on carefully filtered input, such as the endpoints of a single organizational unit. Analysts should understand the input well enough to estimate the output volume. For instance, stack counting the contents of the Windows\Temp folder on every endpoint throughout an organization can produce a huge number of results if the dataset includes 100k endpoints. Filters for the input can be created using friendly intelligence.
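
A minimal stacking sketch, assuming the values to stack (here hypothetical file names from Temp folders in one organizational unit) have already been collected into a list:

```python
from collections import Counter

def stack_count(values, n_rarest=5):
    """Count how often each value occurs and return the n_rarest least frequent ones."""
    counts = Counter(values)
    # Sort by ascending frequency, then by name so that ties are ordered deterministically.
    return sorted(counts.items(), key=lambda item: (item[1], item[0]))[:n_rarest]

# Hypothetical input, already filtered to one organizational unit
temp_files = ["setup.tmp"] * 40 + ["update.log"] * 25 + ["svch0st.exe"]
rarest = stack_count(temp_files, n_rarest=2)
```

The value that appears only once (svch0st.exe) surfaces at the top of the stack, which is where a hunter would start looking.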

5. Machine Learning

Machine learning uses algorithms and statistical models to progressively improve performance on a specific task. In hunting, that task is identifying anomalous data that could indicate adversary activity. In supervised machine learning, the algorithm is fed a set of training data in which each data point is labeled with the desired output. Unsupervised machine learning is given unlabeled data, so the algorithm uses techniques like clustering and grouping to categorize the data instead.

Example Hypotheses

Threat Actor

An organizational threat assessment identified Lazarus Group as a high-priority threat. The MITRE ATT&CK Navigator contains a description of the techniques ascribed to this threat actor.

We, therefore, hypothesize that if this threat actor is present in our network, we would be able to detect evidence of multiple techniques being deployed, in a manner consistent with their known attack paths.

Tool

CTI and our situational awareness suggest that our organization is currently vulnerable to a variant of the WannaCry ransomware, as SMBv1 is still used. We, therefore, hypothesize that if our network is infected with WannaCry, we will see an increase in the rate of file renaming.
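
One way to test this hypothesis is to count file rename events per minute and compare them with a known baseline. The sketch below assumes that rename events have already been reduced to minute buckets (for example "HH:MM" strings). The baseline of 4 renames per minute and the factor of 10 are hypothetical values, not recommendations:

```python
from collections import Counter

def rename_spikes(rename_minutes, baseline_per_minute, factor=10):
    """Return minute buckets whose rename count exceeds factor * baseline_per_minute."""
    per_minute = Counter(rename_minutes)
    threshold = factor * baseline_per_minute
    return {minute: count for minute, count in per_minute.items() if count > threshold}

# Hypothetical observations: one list entry per file rename event
observed = ["09:00"] * 3 + ["09:01"] * 2 + ["09:02"] * 500
spikes = rename_spikes(observed, baseline_per_minute=4)
```

Here only the 09:02 bucket is reported, because 500 renames far exceed the threshold of 40.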

Technique

Lateral Movement via Exploitation of Remote Services can be performed by exploiting the vulnerability MS17-010. Specifically, this can be done via the Metasploit framework with a module that uses a Server Message Block (SMB) request of a specific size to attempt a compromise.

We, therefore, hypothesize that we can see evidence of this technique being used by isolating this SMB request in our network logs.

Example Hypothesis

An adversary has gained access to one or more of the organization's Microsoft Windows endpoints. PowerShell is one of the tools used by the adversary to perform unauthorized activities.

Looking for anomalous PowerShell activity might reveal the breach and thereby prove the hypothesis. Evidence of compromise may include the following:

Suspicious encoded PowerShell command
Suspicious execution of unsigned PowerShell scripts without warning
A process with suspicious PowerShell arguments
A suspicious PowerShell parent process
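
Two of the indicators above can be sketched in a few lines. The code assumes that process command lines and parent image paths are available as strings, for example from process creation logs. The list of suspicious parents is illustrative and not exhaustive. PowerShell expects the -EncodedCommand value to be Base64-encoded UTF-16LE text and accepts shortened forms of the flag such as -enc:

```python
import base64
import re

# Matches "-Flag value" pairs where the value consists of Base64 characters.
FLAG_AND_VALUE = re.compile(r"[-/](\w+)\s+([A-Za-z0-9+/=]+)")

# Illustrative (not exhaustive) parents that rarely have a reason to start PowerShell.
SUSPICIOUS_PARENTS = {"winword.exe", "excel.exe", "outlook.exe"}

def decode_encoded_command(cmdline):
    """Return the decoded script if cmdline carries an encoded command, else None."""
    for flag, value in FLAG_AND_VALUE.findall(cmdline):
        flag = flag.lower()
        # Accept prefixes of -EncodedCommand (-e, -enc, ...) and the alias -ec.
        if flag == "ec" or "encodedcommand".startswith(flag):
            try:
                return base64.b64decode(value, validate=True).decode("utf-16-le")
            except ValueError:  # invalid Base64 or invalid UTF-16
                return None
    return None

def has_suspicious_parent(parent_image):
    """True if the parent executable name is on the illustrative watch list."""
    name = parent_image.replace("/", "\\").rsplit("\\", 1)[-1].lower()
    return name in SUSPICIOUS_PARENTS

# Build a sample command line instead of hard-coding a Base64 string
encoded = base64.b64encode("whoami".encode("utf-16-le")).decode("ascii")
sample = "powershell.exe -NoProfile -enc " + encoded
decoded = decode_encoded_command(sample)
```

Decoding the argument lets the hunter read what was actually executed instead of only noting that an encoded command was present.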

A hunt can end in one of three outcomes.

1. Hypothesis proven: The analysis of the data collected during the hunt confirms the hypothesis. In this case, the hunt uncovered a security incident.

2. Hypothesis disproven: The analysis of the data collected during the hunt shows the hypothesis to be incorrect. In this case, the hunt found no evidence of a security incident.

3. Inconclusive: There is not yet sufficient evidence to either confirm or disprove the hypothesis. This outcome can result from several factors, including inadequate data, unsuitable tools, and scope restrictions.

Executing the Threat Hunt

Executing a threat hunt may take an hour or it may take a week, depending on several factors, including:

1. Initial suspicious activities: The number of initial use cases to execute in search of the first set of clues.

2. Data: The amount of data to search, the complexity of the search, and the tools' performance. For example, running a search against 1TB of data on hot storage (disks with high input/output operations per second) would be much faster than running the same search on cold storage (disks with low input/output operations per second).

3. Threat complexity: Advanced Persistent Threats (APTs) are sophisticated, well-resourced adversaries whose complex attacks can take weeks or even months to fully examine. This doesn't mean that the hunt will go on for months, just that it will probably take longer than usual.

4. Access to data and systems: Not having timely access to systems or data during a hunt can make the hunt last longer. For instance, if access to network flows that are managed by a separate team is delayed, the hunter must either wait or fall back on more expensive and less dependable alternatives, and the hunt may end unsuccessfully. Failing to prove the hypothesis does not necessarily mean that the threat does not exist. It means that the hunter could not uncover the threat with the skillset, data, and tools available.

Threat Hunt Checklist

1. Prepare and execute threat hunting

a. Search for signs of Command and Control

Look for beacons using a tool like Real Intelligence Threat Analytics (RITA) by Black Hills Information Security, with its patented analysis engine.
If RITA is not in place, review the top 20 IPs with the greatest number of connections, the longest connection time, and the largest amount of data moved. Ensure that any system that appears in all three lists has a well-understood communication pattern.
Look for long-running transactions (>8 hrs).
Look for DNS responses with high-entropy domain names.
Look for unknown user agents.
Look for SSL interactions with known-malicious or self-signed sites.
Look for dynamic DNS queries to D-DNS providers.
Look for long DNS queries, DNS TXT queries, and excessive failed DNS queries.
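
Two of the checks above lend themselves to a short sketch: finding IPs that appear in all three "top talker" lists, and scoring domain names by Shannon entropy. The dictionaries mapping an IP to a metric are assumed inputs, the sample values are hypothetical, and any entropy threshold is left for the analyst to choose:

```python
import math
from collections import Counter

def shannon_entropy(text):
    """Shannon entropy of a string in bits per character."""
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def top_n(metric_by_ip, n=20):
    """Return the set of the n IPs with the highest metric value."""
    ranked = sorted(metric_by_ip.items(), key=lambda item: item[1], reverse=True)
    return {ip for ip, _ in ranked[:n]}

def in_all_top_lists(connections, duration, bytes_moved, n=20):
    """IPs that rank in the top n for connection count, duration, and data moved."""
    return top_n(connections, n) & top_n(duration, n) & top_n(bytes_moved, n)

# Hypothetical per-IP metrics (n=2 keeps the example small)
connections = {"10.0.0.5": 900, "10.0.0.7": 850, "10.0.0.9": 10}
duration = {"10.0.0.5": 30000, "10.0.0.9": 28000, "10.0.0.7": 5}
bytes_moved = {"10.0.0.5": 10**9, "10.0.0.7": 10**8, "10.0.0.9": 10}
review = in_all_top_lists(connections, duration, bytes_moved, n=2)
```

A name made of many different characters scores higher than a repetitive one, so sorting queried domains by entropy puts likely machine-generated names at the top of the review list.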

2. Observe a potential adversary as they would go after our “crown jewels”

a. An adversary acting on objectives is after sensitive or valuable data, so review the event types and alerts generated by the systems that contain the most sensitive data.

b. An adversary will often compromise the environment through a desktop and move laterally, so search for Event ID 4624 authentication events from systems within the network and look for odd patterns.

c. Today's adversaries will “live off the land”, meaning that they use PowerShell and other built-in commands. Review the output of Sysmon and Event ID 4688 data for the invoking process, the invoked process, and the command line used for PowerShell and cmd.exe processes.
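
One "odd pattern" in 4624 events (item b above) is a single account logging on from one source to many different hosts. The sketch below assumes the events have been parsed into dictionaries. The key names (event_id, account, source_host, target_host), the sample events, and the threshold of three targets are hypothetical:

```python
from collections import defaultdict

def logon_fanout(events, threshold=3):
    """Return (account, source_host) pairs that logged on to at least threshold hosts."""
    targets = defaultdict(set)
    for event in events:
        # Only successful logons (4624) that cross from one host to another
        if event["event_id"] == 4624 and event["source_host"] != event["target_host"]:
            targets[(event["account"], event["source_host"])].add(event["target_host"])
    return {key: sorted(hosts) for key, hosts in targets.items() if len(hosts) >= threshold}

# Hypothetical parsed events
events = [
    {"event_id": 4624, "account": "svc-backup", "source_host": "ws-14", "target_host": "srv-01"},
    {"event_id": 4624, "account": "svc-backup", "source_host": "ws-14", "target_host": "srv-02"},
    {"event_id": 4624, "account": "svc-backup", "source_host": "ws-14", "target_host": "srv-03"},
    {"event_id": 4624, "account": "alice", "source_host": "ws-02", "target_host": "srv-01"},
    {"event_id": 4688, "account": "alice", "source_host": "ws-02", "target_host": "ws-02"},
]
fanout = logon_fanout(events)
```

Whether a given fan-out is suspicious depends on the account's role, so the result is a starting point for review rather than a verdict.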

3. Leverage strong “egress detection”

a. Document which systems should be used for specific services such as DNS, FTP, and email (SMTP, IMAP, etc.), and look for systems that violate those rules.

b. Monitor all DMZ assets for outbound connection attempts that they initiate. They should normally only respond to inbound traffic, so any outbound traffic should be very well understood.

c. Monitor privileged accounts: get the current membership of elevated groups and then review the actions taken by these users in aggregate. Activities like scheduling tasks should be related to system management.

d. Ensure that account life cycle events for elevated groups are fully monitored and justified by job roles.

e. Newly registered domains: there are several sources, such as whoisxmlapi. The cost for these services runs about $100/month. Conceptually, pull the list at 1 AM and run the prior day’s queried domains from our proxy or URL filter against this list to determine whether a user successfully connected to one of these.
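
The newly registered domain check in item e comes down to a set lookup. The sketch below assumes both inputs are plain lists of domain names that have already been exported from the feed and from the proxy or URL filter logs. The domain names are hypothetical, and no specific feed format or API is implied:

```python
def hits_on_new_domains(queried_domains, newly_registered):
    """Return queried names that equal or fall under a newly registered domain."""
    nrd = {d.strip().lower().rstrip(".") for d in newly_registered}
    hits = set()
    for fqdn in queried_domains:
        labels = fqdn.strip().lower().rstrip(".").split(".")
        # Check the full name and each parent domain, stopping before the bare TLD,
        # so that cdn.fresh-site.example matches fresh-site.example.
        for i in range(len(labels) - 1):
            if ".".join(labels[i:]) in nrd:
                hits.add(fqdn)
                break
    return hits

# Hypothetical inputs
newly_registered = ["fresh-site.example", "Another-New.example."]
queried = ["cdn.fresh-site.example", "www.established.example", "intranet"]
hits = hits_on_new_domains(queried, newly_registered)
```

Each hit still needs to be checked against the proxy logs to see whether the connection actually succeeded.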

Threat Hunting Metrics

Threat hunting metrics measure performance to help drive improvements. They can also evidence the Return on Security Investment (ROSI) to senior managers within the organization, helping to build the business case for further investment (financial and time) in people and tools. Below is an example set of metrics that could be adopted:

Number of incidents identified proactively (vs. reactively)
Number of vulnerabilities identified proactively (vs. vulnerability assessments)
Dwell time of proactively discovered incidents (vs. reactively)
Containment time of proactively discovered incidents (vs. reactively)
Effort per remediation of proactively discovered incidents (vs. reactively)
Data coverage (data types and coverage of estate)
Hypotheses per MITRE ATT&CK tactic
Hunts per MITRE ATT&CK tactic
Incidents per MITRE ATT&CK tactic
Percentage of successful hunts that result in a new detection analytic or rule
Sensitivity and specificity of analytics or rules derived from hunts (true & false positive rates)
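
Several of these metrics are simple ratios. The sketch below shows how a few of them could be computed; all counts in the usage lines are hypothetical:

```python
def rate(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None

def sensitivity(true_pos, false_neg):
    """True positive rate of a detection rule: TP / (TP + FN)."""
    return rate(true_pos, true_pos + false_neg)

def specificity(true_neg, false_pos):
    """True negative rate of a detection rule: TN / (TN + FP)."""
    return rate(true_neg, true_neg + false_pos)

def proactive_share(proactive_incidents, reactive_incidents):
    """Fraction of all incidents that were identified proactively."""
    return rate(proactive_incidents, proactive_incidents + reactive_incidents)

def detection_yield(successful_hunts, hunts_with_new_rule):
    """Percentage of successful hunts that produced a new detection analytic or rule."""
    share = rate(hunts_with_new_rule, successful_hunts)
    return None if share is None else 100 * share

# Hypothetical figures for one reporting period
rule_sensitivity = sensitivity(true_pos=45, false_neg=5)
rule_specificity = specificity(true_neg=900, false_pos=100)
share = proactive_share(proactive_incidents=3, reactive_incidents=9)
yield_pct = detection_yield(successful_hunts=8, hunts_with_new_rule=6)
```

Note that the false positive rate mentioned in the list is 1 minus the specificity.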

Threat Hunting Step-by-Step Process

Step-1: Determining the Objectives for Threat Hunting

The threat hunter must first define the scope of the hunt and its precise goals before beginning to pursue threats. This includes identifying the areas of the system in which to hunt.

For known threats, the hunter can use an intel-based threat-hunting model, relying on IoCs such as hash values, IP addresses, domain names, and network or host artifacts provided by intelligence-sharing platforms such as a Computer Emergency Response Team (CERT).

For unknown threats, the hunter uses the hypothesis-based threat-hunting model, relying on IoAs and attacker TTPs. The hunter forms a hypothesis in line with the MITRE ATT&CK framework and identifies the likely threat actors based on the environment, the domain, and the attack behavior patterns.

For specific customer requirements, the hunter can use the custom threat-hunting model, in which the hunter identifies anomalies in the SIEM and EDR tools. Hunts are executed proactively based on situations such as geopolitical issues and targeted attacks. These hunts can draw on both the intel-based and hypothesis-based models, using IoA and IoC information.

Step-2: Collecting Data and Defining a Normal

An efficient threat-hunting process should include a monitoring mechanism that can observe a wide range of actions across operating systems and devices. This telemetry needs to capture user behavior, system activity, activity logs, and patterns of network traffic, so that a baseline of normal behavior can be established.
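
One simple way to define a normal is to summarize historical values of a metric by their mean and standard deviation and to flag new values that deviate strongly. This is a minimal sketch of that idea. The history values and the z-score threshold of 3 are assumptions, and real telemetry often needs more robust statistics:

```python
from statistics import mean, stdev

def build_baseline(history):
    """Summarize historical observations as (mean, sample standard deviation)."""
    return mean(history), stdev(history)

def is_anomalous(value, baseline, z_threshold=3.0):
    """True if value lies more than z_threshold standard deviations from the mean."""
    mu, sigma = baseline
    if sigma == 0:
        return value != mu  # any change from a constant history is a deviation
    return abs(value - mu) / sigma > z_threshold

# Hypothetical daily logon counts for one user over a week
history = [100, 110, 95, 105, 90, 100, 100]
baseline = build_baseline(history)

normal_day = is_anomalous(104, baseline)
unusual_day = is_anomalous(400, baseline)
```

With this history the mean is 100, so a day with 104 logons is within the normal range while a day with 400 is flagged.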

Step-3: Trigger

When advanced detection tools spot unexpected behavior that might be a sign of malicious activity, a trigger directs threat hunters to a particular system or area of the network for further investigation. Narrowing the hunt to a specific part of the network in this way makes threat detection quicker and more accurate.

Triggers for intel-based threat hunting may come from threat intelligence platforms, previous use cases, historical incidents, other hunting organizations, etc.

Triggers for hypothesis-based threat hunting may come from domain expertise, situational awareness, crown jewel analysis, TTPs from MITRE ATT&CK, etc.

Step-4: Developing a Hypothesis

After the trigger is received, the hunter makes predictions about the activity that might be going on in the system. Most of the time, security professionals base their hypotheses on social intelligence, prior knowledge, open-source intelligence (OSINT) techniques, and frameworks like MITRE ATT&CK.

One method of threat hunting is incident-driven. In this case, the incident and its consequences are studied directly while developing the hypothesis.

Step-5: Investigation

After formulating the hypothesis, the next step is investigating different tactics, techniques, and procedures (TTPs) to uncover new adversary activities and patterns in the collected data.

To investigate possible compromises within the infrastructure, threat hunters use tools such as EDR, XDR, SIEM, dnstwist, and YARA. They apply hunting techniques such as searching, clustering, grouping, and stacking, and can also use machine learning algorithms and models. The threat hunter can prepare a checklist to work through.

The investigation continues until the hypothesis is either proven or disproven. If the hypothesis is proven, the hunter can expand the scope and continue the hunting process. If the hypothesis is disproven, the hunter refines the search and scope, rechecks the triggers, tactics, and techniques used in the hunt, and adjusts the hypothesis accordingly before restarting the hunt.

Step-6: Enrichment & Response

When the hypothesis is proven, the threat hunter should act to neutralize the incident through an immediate response. Threat hunters should immediately share the threat details with relevant teams, such as the Incident Response Team and the Vulnerability Management Team, and update the threat definitions in threat intelligence platforms.

This step's main objective is to stop the ongoing attack as quickly as possible to prevent the detected threat from harming the system.

It is also crucial for threat hunters to understand the vulnerability and its root cause in order to strengthen security and prevent similar cyberattacks in the future.

Step-7: Automating Routine Tasks

Modern threat hunting involves automating tasks as much as possible. By automating processes such as data gathering and outlier analysis, a successful threat-hunting strategy creates a strong foundation for further hunts. The information gathered throughout the process helps organizations establish a strong cybersecurity architecture and improves their EDR systems.

Step-8: Creating a Consolidated Threat Hunting Report

The report describes the actions taken by the threat hunter and adversary. A threat-hunting report is useful for capturing the key details and documenting them in a comprehensive and well-structured manner. We can also use the Threat Hunting metrics to measure the performance of the threat hunting. Below are some key fields that should be included in a Threat Hunting Report.

Title:
Description / Objective:
Roles:
Resources:
Hypothesis:
Scope:
Process Triggers:
Threat Technique:
Procedure:
Exit criteria:
Workflow:
Adversary’s Actions and Tactics:
Courses of Action During Threat Hunting:
Courses of Action During Incident Response:
Threat Analysis:
Impact of the Threat on Organization:
Resolution: