Detecting an incident means seeing one of two things: a known problem, such as high-risk malware infecting one or more client endpoints, or something suspicious. But how do you know whether suspicious activity is good or bad unless you know what is normal for the enterprise and for the specific computer?
“Normal” is relative
Knowing normal is hard enough for an organization’s IT staff, harder still for an external party. Each organization is different; technical infrastructures depend on organizational hierarchies and operational modes. Whether the organization is beholden to regulatory requirements can affect its infrastructure as well. Different companies even within the same industry may choose to comply with rules and regulations in different ways.
Additionally, different roles in a company use systems differently; for example, people who work in finance should connect to finance servers, engineers to engineering servers, and so on. Likewise, while it may be normal for a web developer to have a web server running on their computer, it would not be normal for an administrative assistant.
Finally, different users, apart from the roles they hold, have different behaviors and programs that they use. These can depend on their position, level of training and experience, responsibilities, and other variables.
Thus, along with lack of infrastructure, one of an incident response consultant’s chief challenges is understanding what constitutes “normal” on the client’s network and on each endpoint. The ports that one client uses may differ, to a greater or lesser extent, from another’s; likewise, remote hosts that one client considers suspicious may raise no red flags for a different client.
The challenges of finding normal
It’s unlikely that clients will be able to communicate “normal” when they first hire your consulting firm, and especially not after they’ve detected an incident.
First, they may not know normal. As we noted in our last post, their security posture may be relatively immature; as a result, many organizations do not know what their networks look like and what devices should be there.
Second, knowing what is normal may require input from several people. The intrusion could span departments; knowing normal for a device may involve mapping the device to a person, mapping that person to a department, and talking to someone in that department to learn what is expected.
Depending on the organization’s size and culture, this process can add time to an already time-critical engagement. People may have forgotten or filtered important information, or may be altogether uncooperative. In addition, IT may not be fully trained or prepared to help respond to a security incident, and thus may not be able to help your team answer questions about what is normal.
To avoid problems arising from these situations, you need a way to use the data to learn the client’s “normal” on the fly.
Using triage to achieve situational awareness
Even if there is no way to know normal when you first respond to a client’s security incident, you can nonetheless obtain, and maintain, situational awareness by collecting data from endpoints using a fast triage process.
Once you have the collected data in a central location, you can begin to determine for yourself what is normal, without having to rely on the client. This way, you can see whether (for example) a particular process is running on all the endpoints, or only some of them.
From the collective data, you can determine what you think is suspicious and prioritize what to investigate further. For example, you may first investigate the startup item that is on only one of the finance computers rather than the startup item that is on all of them.
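To make the idea concrete, here is a minimal sketch of that kind of prevalence ranking, sometimes called stacking or frequency analysis. It is not how any particular tool implements it. The function name `rank_by_rarity`, the hostnames, and the startup item names are all hypothetical, and it assumes the collected data has already been reduced to one set of item names per host.

```python
from collections import defaultdict


def rank_by_rarity(collected):
    """Return (item, hosts, prevalence) tuples, rarest first.

    `collected` maps a hostname to the set of items (startup items,
    process names, and so on) seen on that host. This input shape is an
    assumption made for the sketch.
    """
    total_hosts = len(collected)
    hosts_by_item = defaultdict(set)
    for host, items in collected.items():
        for item in items:
            hosts_by_item[item].add(host)

    ranked = [
        (item, sorted(hosts), len(hosts) / total_hosts)
        for item, hosts in hosts_by_item.items()
    ]
    # Rarest first; ties are broken by item name so the order is stable.
    ranked.sort(key=lambda row: (row[2], row[0]))
    return ranked


# Hypothetical triage results for three finance computers.
finance_startup_items = {
    "fin-01": {"OneDriveSync", "AcctSuiteUpdater"},
    "fin-02": {"OneDriveSync", "AcctSuiteUpdater"},
    "fin-03": {"OneDriveSync", "AcctSuiteUpdater", "svch0st_helper"},
}

for item, hosts, prevalence in rank_by_rarity(finance_startup_items):
    print(f"{prevalence:.0%}  {item}  {hosts}")
```

With this sample data, the startup item that appears on only one finance computer sorts to the top of the list, ahead of the two items present on all three. Rarity alone does not make an item malicious; it only suggests where to look first.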
Global attack trends are important, too. It’s wise to have a way to integrate threat intelligence feeds so that you can compare what you see in the client’s organization to global trends as your response proceeds. It’s also good practice to leverage host data results from other investigations, in case you find indicators of a brand-new type of attack that the feeds do not yet cover.
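As a rough illustration, checking collected data against both kinds of sources can be as simple as a set intersection per host. The sketch below assumes indicators are plain strings, such as domains or file hashes, that have already been normalized to the same form as the collected values. The function name `match_indicators`, the source labels, and all sample values are hypothetical.

```python
def match_indicators(collected, indicator_sources):
    """Return (host, value, source) hits, sorted for stable output.

    `collected` maps a hostname to the set of values observed on that host.
    `indicator_sources` maps a source label to a set of known-bad values.
    Both input shapes are assumptions made for this sketch.
    """
    hits = []
    for host, observed in collected.items():
        for source, indicators in indicator_sources.items():
            # Set intersection finds observed values known to this source.
            for value in observed & indicators:
                hits.append((host, value, source))
    return sorted(hits)


# Hypothetical remote domains seen on two endpoints.
observed_domains = {
    "fin-01": {"intranet.example.com", "updates.example.net"},
    "eng-07": {"intranet.example.com", "cdn.badhost.example"},
}

# Hypothetical indicator sets: one from a feed, one from earlier engagements.
indicator_sources = {
    "threat_feed": {"cdn.badhost.example"},
    "past_engagements": {"updates.example.net", "cdn.badhost.example"},
}

for host, value, source in match_indicators(observed_domains, indicator_sources):
    print(f"{host}: {value} (seen in {source})")
```

In this sample, one domain is flagged only because it appeared in an earlier engagement, which is the case the feed alone would miss.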
Ensure that you can quickly get a sense of your clients’ “normal” even when you have limited time and visibility. Using a system like Cyber Triage is useful for situational awareness because its automated collection is thorough and fast, while its automated analysis correlates the data from all endpoints in the system.
This gives you visibility into where else a given piece of data was found, both within the current incident and in past incidents you were involved with. Because its collection and analysis are automated, it can scale to many endpoints in the company.
To learn more about how Cyber Triage meets your needs when responding to client calls, follow the link below.