What is Digital Forensics & Incident Response?
Was a computer used in a crime? Did someone break into it? Did it malfunction or fail? The process of Digital Forensics & Incident Response or DFIR emerged when computer scientists needed to understand what happened with the machine in the past. Digital forensics involves analyzing the data stored inside the computer and Incident Response is the strategy of the best way to handle any breach that’s been detected. Together the processes formalized the best possible approaches to answering the questions of how and when someone accessed the data inside a computer.
DFIR investigators use a variety of techniques and specialized software packages to examine the data inside the computer to look for clues or virtual footprints. The operating systems include many layers and data files that adapt and change as the users start up software, visit web sites, fire up apps, or just interact with the computer. Some programs keep logs to help debugging. Others change their state. All of these digital traces can help investigators piece together a history of what happened. Even when people try to hide their behavior, there are often traces left behind that can be revealing.
When the DFIR investigation is successful, the investigators can answer questions like whether the computer has been used to store illicit information or if an intruder has downloaded copies of the data inside it. The investigation can reveal when the computer was used, for how long and by whom. If it interacted with other computers locally or across the globe, there may be traces left behind. If files were deleted, the investigator may be able to restore them. If data was accessed, these artifacts will be flagged and analyzed.
Not all investigations completely succeed. Sometimes the malicious actors can cover their tracks. Sometimes continued use can delete important data when the computer recycles storage. In many cases, there are limits to how much data is available. Many investigations produce partial reports that only answer some questions.
How did DFIR evolve?
The work of DFIR is closely linked to developing software and building computers. Reconstructing what a computer did depends upon understanding how its software works. It might be thought of as historical computer science or deep reverse engineering.
A great deal of digital forensic work depends upon understanding the software itself. Good forensic software engineers know the structure of the operating system and various programs that are common on machines. They understand how these packages work and, often most importantly, which details they may cache or store to do their work. The records in the file system and the database often contain timestamps and details of the tasks.
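As a small illustration of the timestamps a file system keeps, the sketch below reads them with Python's standard `os.stat` call. The helper name `file_times` is hypothetical, and a real investigation would run this against a read-only copy of the evidence rather than a live disk.

```python
import os
from datetime import datetime, timezone


def file_times(path):
    """Return the timestamps the file system records for one file, in UTC."""
    info = os.stat(path)

    def to_utc(seconds):
        return datetime.fromtimestamp(seconds, tz=timezone.utc)

    return {
        "modified": to_utc(info.st_mtime),
        "accessed": to_utc(info.st_atime),
        # st_ctime is the metadata-change time on Unix and the
        # creation time on Windows, so the label is left generic.
        "changed_or_created": to_utc(info.st_ctime),
    }
```

Converting everything to UTC makes it easier to line these values up with records from other machines, which may have been set to different time zones.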
In many cases, DFIR requires working with criminal investigators who want to understand how the machine may have been used to commit a crime. This requires extra care to ensure that the evidence is collected appropriately so it might be used by the police and legal authorities.
While DFIR techniques are generally used after a suspected attack or failure, there’s no reason why they can’t help with traditional debugging. Some developers use the tools to uncover failure modes in their software during development.
What is Digital Forensics (DF)?
While the terminology often joins digital forensics together with incident response, the two areas are often thought of separately. Digital forensics involves the work of extracting the best and most trustworthy information from a device. It usually requires a deep understanding of the architecture and engineering of the devices because the most useful details are often buried deeply inside the file systems.
In some cases, the forensics experts are called in without a specific event or incursion. Human resources teams, for example, may want to investigate the computers of some employees. Tiger teams may regularly examine the network logs and computer file systems searching for evidence of stealthy subversion of security.
What is Incident Response (IR)?
The work of detecting and responding to a problem, gathering any evidence and then resolving any problems is becoming a separate area of specialization. While a good knowledge of digital forensics is important, some teams are focusing on solving the often-urgent problems that can appear quickly after an intrusion or an event.
Many incident response teams are structured like fire departments or other first responders. They focus on stopping any dangerous behavior, mitigating harm, and restoring normal operations.
Large organizations often have multi-layer internal incident response teams that build and maintain security infrastructure, monitor alerts, and investigate suspicious activity on their networks.
What happens after an incident?
When a DFIR team approaches a computer, they will try to isolate it and freeze the data as much as possible. In an ideal situation, the team takes control of the hardware and removes access to the outside world by shutting down network connections through the Internet, the cellular networks and other local connections like Bluetooth. They will take snapshots of the memory and database for later analysis. If this is a legal investigation, they will isolate the copy of the hard drive to ensure that the evidence is collected with a proper chain of custody.
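One common way to support a chain of custody is to record a cryptographic hash of the image at the moment it is captured and to check every later copy against it. The sketch below is a minimal illustration using Python's standard `hashlib`. The function names are hypothetical, and dedicated imaging tools handle this with far more care.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Fixed-size reads mean a large disk image never has to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copy(original_path, copy_path):
    """Return True when the copy hashes to the same value as the original."""
    return sha256_of_file(original_path) == sha256_of_file(copy_path)
```

If the digest recorded at capture time still matches months later, the investigators can argue that the copy was not altered in between.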
Many investigations, though, aren’t ideal. Perhaps the computer cannot be isolated because it’s in regular use. Perhaps the hardware is not nearby and so any investigation must take place remotely. Sometimes the report comes long after the intrusion and the team must recover whatever it can. The techniques can still work and reveal the past, often in great detail, but the conclusions will not be as trustworthy as if there were perfect isolation.
In some cases, investigators will want to leave a computer in use so they can monitor how it evolves. They may want to track the criminals to gather better evidence. Periodic data copies can be more revealing than a single incident response.
What types of incidents can be studied?
DFIR techniques are useful anytime someone wants to know what a computer did in the past. The most common events usually involve broken laws because the computer or smartphone may reveal the guilt or innocence of the subjects of the investigation. The police or private investigators can take a complete image of the data stored on the computer so it can be examined.
Another common incident is a digital breach of a company’s server. An investigation may reveal details about how much information was compromised. It may also uncover the identity of the attacker in some cases. Understanding the scale of the data loss allows businesses to notify customers and employees when their personal information was part of the loss.
Some incidents are part of audits, often random, of a company’s IT infrastructure. These investigations may reveal nothing or they can expose undetected intrusions.
The most traumatic investigations usually involve ransomware because businesses or users are called to make difficult and sometimes expensive decisions on short notice. The DFIR investigation can reveal the true scope of the intrusion and the amount of data at risk so that a better decision can be made. In some cases, the damage is not as great as the criminals claim and it’s possible to reassemble much of the data from backups and careful forensic work.
What can be learned?
An investigation can reveal a variety of details.
- Which applications a user started recently.
- Which data is stored in files or databases on the machine.
- Which files were recently deleted and what data was stored inside them.
- Any evidence of malicious software like a virus or a rootkit.
- Any new accounts that might have been created, sometimes surreptitiously.
- Any recent changes to system settings like access permissions.
- Any new software that was installed.
- Any old software that was removed or deleted.
- Any information that might have been removed or downloaded or shipped to another location on the Internet.
- What other systems on a network interacted with the machine.
Many of these questions are answered independently and so each investigation may reveal different details.
What are the types of important tools?
DFIR investigators call upon a collection of tools when there is an event. They can roughly be divided into these categories, although some tools are hybrids that can operate in multiple ways:
Collect – These tools gather data from the machines, effectively freezing the state of the machine so it can be studied later. They largely focus on the main data storage systems like the disks or the memory but there are some that can download bytes from subsystems like the network cards.
These tools often try to collect complete copies of the data although they can be focused on particular areas when time or storage limits make it impractical to store complete copies. Predicting which sections of the data storage may be useful is often hard at the beginning of investigations.
Teams devoted to criminal or national security investigations use tools designed to preserve a chain of custody for the evidence. The data collected may be used in trials, so the tools must ensure it is a trustworthy record of the machine’s state.
Some of the most common tools in this category include: Autopsy, Cyber Triage, Sleuthkit, CAINE, Exiftool, Neo, Kape, FTK Imager and Bulk Extractor.
Categorize – In most cases, the data collected can be identified and sorted based on its use. Cyber Triage, for instance, automatically groups the artifacts into general categories to guide the investigators. The categorization algorithms depend upon an understanding of software design to separate the gathered data according to its role. Some of the most useful information is found in the log files and the caches which include records of the most recent events and communications.
Most of the data is often of little interest or importance for understanding what just happened. The databases and software that handle the machine’s everyday tasks are often not important because the investigation is trying to understand the anomalies like an intrusion or a virus. The categorization routines leave this aside but preserve it because it can explain just what information might have been compromised or exposed.
Searching and Scoring – Some intrusions or failures involve well-known malware like viruses, rootkits, and ransomware. DFIR systems maintain catalogs of data snippets that identify this malware, and they search through the collected data looking for these digital fingerprints. Cyber Triage, for example, creates scores for the artifacts based upon their potential value to any investigations. It also maintains a hash database of past work in case there may be some connection with the data collected in other investigations.
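The idea of matching collected data against a catalog of fingerprints can be sketched with a simple hash lookup. This is only an illustration and not how Cyber Triage or any other product works internally. The catalog entry, the label, and the 0 or 100 scoring rule are all invented for the example.

```python
import hashlib

# Hypothetical catalog: SHA-256 digests of known-bad files mapped to a label.
# Real catalogs come from vendors or agencies and hold far more entries.
KNOWN_BAD = {
    hashlib.sha256(b"pretend malware payload").hexdigest(): "example-ransomware",
}


def score_artifact(name, content, known_bad=KNOWN_BAD):
    """Hash one artifact and score it 100 on a catalog match, else 0."""
    digest = hashlib.sha256(content).hexdigest()
    label = known_bad.get(digest)
    return {
        "name": name,
        "sha256": digest,
        "match": label,
        "score": 100 if label else 0,
    }


def rank_artifacts(artifacts, known_bad=KNOWN_BAD):
    """Score (name, bytes) pairs and list the highest scores first."""
    scored = [score_artifact(name, content, known_bad) for name, content in artifacts]
    return sorted(scored, key=lambda item: item["score"], reverse=True)
```

An exact hash match only catches files that are byte-for-byte identical to a catalog entry, which is one reason malware authors change their files slightly and why real tools combine several kinds of signals.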
Designing these systems and maintaining a current list of the most virulent and destructive malware is an on-going process. Keeping these catalogs up-to-date is often the most expensive and time-consuming task, and it depends upon knowledgeable and curious researchers. The malware authors are often trying hard to obscure their work and hide effectively.
Some of the best known sources for this information are the archives of DFIR tool vendors like Cyber Triage, government agencies and non-profits.
Report – This software collects the results from the categorization and analysis into a report that summarizes what was found and presents the evidence for any conclusions. Preliminary reports may just offer a quick summary that’s useful for an ongoing investigation. The final report may include additional details about the approach and the quality of the evidence so it can be presented in any court.
How is DFIR software designed?
The tools used in an investigation can be divided into three major categories:
- Capture – These create a digital image from the data that is stored on the hard disk, the memory or inside some of the subsystems like the network card. They may collect all possible information or they may focus on a particular area to save space and time. The process is designed to create copies that are as perfect as possible to preserve any evidence. When the results are going to be used in a legal proceeding, they can be sealed with digital signatures.
- Analysis – These tools encapsulate all of the knowledge of computer architecture and design to simplify and automate the search for clues or evidence. The tools use a collection of known patterns to identify potential traces in log files, caches or other parts of the file system. They also carry catalogs of vectors for known viruses, ransomware, rootkits or other malicious tools that may embed themselves inside the computer.
- Reporting – This creates a useful summary of what may have happened to the machine in a concise timeline. The output is designed to be easily accessible to other criminal investigators and corporate leaders who may need to understand what went wrong with the computer.
Many of the tools include features from each of these areas.
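The timeline produced by the reporting stage can be pictured as a merge of events from several sources, sorted by time. The sketch below assumes each source supplies ISO 8601 timestamp strings and treats any timestamp without an offset as UTC. Both of these are assumptions for the example, and the function names are hypothetical.

```python
from datetime import datetime, timezone


def build_timeline(sources):
    """Merge {source name: [(timestamp, description), ...]} into one sorted list."""
    events = []
    for source_name, records in sources.items():
        for timestamp, description in records:
            moment = datetime.fromisoformat(timestamp)
            if moment.tzinfo is None:
                # Assumption: timestamps without an offset are already in UTC.
                moment = moment.replace(tzinfo=timezone.utc)
            events.append((moment, source_name, description))
    return sorted(events, key=lambda event: event[0])


def format_timeline(timeline):
    """Render each event as one readable line for a report."""
    return [
        f"{moment.isoformat()} [{source}] {description}"
        for moment, source, description in timeline
    ]
```

Keeping the source name on every line lets a reader of the report trace each entry back to the log or cache where it was found.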
How can data be saved or recovered?
While many start DFIR investigations to understand what happened and create a strategic plan for preventing it in the future, another part of the job is restoring the system to an untouched state. The DFIR team must preserve and restore the data so a business can return to normal.
This role has grown more serious in recent years as ransomware attacks grow more common. The DFIR response must both identify the security flaws that made any attack possible and restore as much of the information as possible as soon as possible. In the ideal scenario, a strong backup plan coupled with good defenses can allow the leadership to ignore the ransom demands.
While backups and archival records are not always considered part of computer forensics, they can help answer questions and restore normal operations. The DFIR teams often rely upon backups and off-site records for comparison. This can reveal what’s changed as well as offering a way to restore order.
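Comparing a machine against a backup can be sketched as a comparison of two manifests that map each file path to a hash of its contents. The helpers below are hypothetical and skip the details a real tool would handle, such as permissions, symbolic links and files too large to read in one call.

```python
import hashlib
from pathlib import Path


def build_manifest(root):
    """Map each file's path relative to root to the SHA-256 of its contents."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # read_bytes loads the whole file, which is fine for a sketch only.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


def compare_manifests(backup, current):
    """Report which paths were added, removed or modified since the backup."""
    backup_paths, current_paths = set(backup), set(current)
    return {
        "added": sorted(current_paths - backup_paths),
        "removed": sorted(backup_paths - current_paths),
        "modified": sorted(
            path for path in backup_paths & current_paths
            if backup[path] != current[path]
        ),
    }
```

The three lists give investigators a starting point: added files may be tools an intruder left behind, while removed or modified files point to what needs to be restored.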
What are some limitations to DFIR?
In ideal cases, DFIR teams will have perfect backups and extensive logs that make it possible to recover all data and understand exactly what happened and when. In practice, everyone must make do with imperfect records and these can limit a DFIR investigation. Too much logging and copying can slow a system to a crawl while boosting the cost for data storage. All businesses must balance these costs with need when planning how many records and copies to keep. They often limit the size of the logs to save money and speed processing, a choice that can limit the data available for analysis.
When records are imperfect, DFIR investigators can still make educated guesses that can provide useful paths forward. The details and anomalies gathered by investigators may not offer a perfect explanation of what happened, but they can still offer details on timing, breadth and depth of the intrusion.
To make matters more complicated, some intruders are able to manipulate the record keeping to their advantage. If they gain deep access, they may edit the log files to remove other hints of their actions and cover their tracks. Occasionally, they may be able to do this so well that there’s no evidence left behind. Some teams fight this with append-only (write-once) record keeping, using special databases or blockchains, which can also help with legal compliance.
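The tamper-resistant record keeping described above is often built on a hash chain, where each entry stores the hash of the one before it. The sketch below shows the idea with Python's standard library. It is a simplified illustration with hypothetical function names and not a description of any particular database or blockchain product.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _hash_body(body):
    """Hash an entry's fields in a stable, sorted-key JSON form."""
    encoded = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_entry(chain, message):
    """Append a log entry that commits to the hash of the previous entry."""
    previous_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    body = {"index": len(chain), "message": message, "previous_hash": previous_hash}
    chain.append({**body, "hash": _hash_body(body)})
    return chain


def verify_chain(chain):
    """Return True only if every entry still matches its hash and its predecessor."""
    previous_hash = GENESIS_HASH
    for index, entry in enumerate(chain):
        body = {
            "index": entry["index"],
            "message": entry["message"],
            "previous_hash": entry["previous_hash"],
        }
        if (
            entry["index"] != index
            or entry["previous_hash"] != previous_hash
            or entry["hash"] != _hash_body(body)
        ):
            return False
        previous_hash = entry["hash"]
    return True
```

Editing or deleting an earlier entry breaks every hash after it. An intruder who can rewrite the whole chain could still recompute all of the hashes, so the scheme only helps when the most recent hash is also stored somewhere the intruder cannot reach.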
What are the key questions for a DFIR investigator to answer?
What happened? Which events or symptoms triggered the investigation? Did they affect any users? Was there any data loss? Was there some crime?
Who was involved? Who had physical access to the hardware? Who might have had digital access? Is it possible to reduce the size of the set effectively? Can we reach any firm conclusions about the perpetrators?
Are outside intruders responsible? Does this involve a breach of physical or digital security?
How did it happen? Was there some event that started the process? Were there any particular malware packages, viruses, or compromised software involved?
Are there innocent victims? Was any personal information released? Was anyone targeted in particular?
What can be done to prevent it from happening again? Can any software holes be closed? Can better education prevent lapses in judgment?
The answers presented in any report can improve practices and encourage better data care.
How can a company prepare?
The best IT teams don’t wait for an intrusion or an event to happen. They design their systems architectures to be both secure and auditable in case the security fails. They ensure that the logging system records the most information that is practical and legal to keep.
Some teams add in extra tools for surveillance and protection. Instead of just relying upon the basic logs and data collected, they add extra layers and hardware to the software stacks and networks to create secondary records that can provide an independent source. Some of these extra layers can watch for anomalies or other indicators of security failure so that the response can begin sooner.
One technique that is often useful is to create an extra system with fake data that’s easily accessed by potential hackers. These attractive targets, sometimes called “honeypots”, can be an ideal early warning system for intrusions. They can be designed with extra telemetry to make reconstructing any attacker’s actions much easier than it might be on a production system.
Appropriate backups are often an important part of any DFIR plan. While they are often considered as insurance against physical damage like a fire or hardware failure like a disk crash, they can be very useful after a cyber security incident. They offer a chance to compare the affected computers against their past configurations which can help investigators understand what data may have changed. They can also be invaluable for companies facing ransomware demands because good backups make it possible to stand up replacements quickly.
Good teams also prepare and test plans in advance, enumerating the best practices. When an event occurs, they can quickly choose the right plan and start isolating and investigating immediately.
What are the most important high-level questions for CIOs and the team to ask as a takeaway?
Preparing for a DFIR investigation involves all layers of an IT team. Many of the different approaches and solutions can be summarized with a few key questions:
- Which data does the system collect?
- What happens if the data leaks?
- Who has access?
- Who has responsibility for the systems?