Your network is unique. It’s a living, breathing system evolving over time. Data is created. Data is processed. Data is accessed. Data is manipulated. Data can be forgotten. The applications and users performing these actions are all unique parts of the system, adding degrees of disorder and entropy to your operating environment. No two networks on the planet are exactly the same, even if they operate within the same industry, utilize the exact same applications, and even hire workers from one another. In fact, the only attribute your network may share with another network is simply how unique they are from one another.
If we follow the analogy of an organization or network as a living being, it’s logical to drill down deeper, into the individual computers, applications, and users that function as cells within our organism. Each cell is unique in how it’s configured, how it operates, the knowledge or data it brings to the network, and even the vulnerabilities each piece carries with it. It’s important to note that cancer begins at the cellular level and can ultimately bring down the entire system. But where incident response and recovery are accounted for, the greater the level of entropy and chaos across a system, the more difficult it becomes to locate potentially harmful entities. Incident Response is about locating the source of cancer in a system in an effort to remove it and make the system healthy once more.
Let’s take the human body for example. A body that remains at rest 8-10 hours a day, working from a chair in front of a computer, and with very little physical activity, will start to develop health issues. The longer the body remains in this state, the further it drifts from an ideal state, and small problems begin to manifest. Perhaps it’s diabetes. Maybe it’s high blood pressure. Or it could be weight gain creating fatigue within the joints and muscles of the body. Your network is similar to the body. The longer we leave the network unattended, the more it will drift from an ideal state to a state where small problems begin to manifest, putting the entire system at risk.
Why is this important? Let’s consider an incident response process where a network has been compromised. As a responder and investigator, we want to discover what has happened, what the cause was, what the damage is, and determine how best we can fix the issue and get back on the road to a healthy state. This entails looking for clues or anomalies; things that stand out from the normal background noise of an operating network. In essence, let’s identify what’s truly unique in the system, and drill down on those items. Are we able to identify cancerous cells because they look and act so differently from the vast majority of the other healthy cells?
Consider a medium-size organization with 5,000 computer systems. Last week, the organization was notified by a law enforcement agency that customer data was discovered on the dark web, dated from two weeks ago. We start our investigation on the date we know the data likely left the network. What computer systems hold that data? What users have access to those systems? What windows of time are normal for those users to interact with the system? What processes or services are running on those systems? Forensically we want to know what system was impacted, who was logging in to the system around the timeframe in question, what actions were performed, where those logins came from, and whether there are any unique indicators. Unique indicators are items that stand out from the normal operating environment. Unique users, system interaction times, protocols, binary files, data files, services, registry keys, and configurations (such as rogue registry keys).
Our investigation reveals a unique service running on a member server with SQL Server. In fact, analysis shows that service has an autostart entry in the registry and starts the service from a file in the c:\windows\perflogs directory, which is an unusual location for an autostart, every time the system is rebooted. We haven’t seen this service before, so we investigate against all the systems on the network to locate other instances of the registry startup key or the binary files we’ve identified. Out of 5,000 systems, we locate these pieces of evidence on only three systems, one of which is a Domain Controller.
This process of identifying what is unique allows our investigative team to highlight the systems, users, and data at risk during a compromise. It also helps us potentially identify the source of attacks, what data may have been pilfered, and foreign Internet computers calling the shots and allowing access to the environment. Additionally, any recovery efforts will require this information to be successful.
This all sounds like common sense, so why cover it here? Remember we discussed how unique your network is, and how there are no other systems exactly like it elsewhere in the world? That means every investigative process into a network compromise is also unique, even if the same attack vector is being used to attack multiple organizational entities. We want to provide the best foundation for a secure environment and the investigative process, now, while we’re not in the middle of an active investigation.
The unique nature of a system isn’t inherently a bad thing. Your network can be unique from other networks. In many cases, it may even provide a strategic advantage over your competitors. Where we run afoul of security best practice is when we allow too much entropy to build upon the network, losing the ability to differentiate “normal” from “abnormal.” In short, will we be able to easily locate the evidence of a compromise because it stands out from the rest of the network, or are we hunting for the proverbial needle in a haystack? Clues related to a system compromise don’t stand out if everything we look at appears abnormal. This can exacerbate an already tense response situation, extending the timeframe for investigation and dramatically increasing the costs required to return to a trusted operating state.
To tie this back to our human body analogy, when a breathing problem appears, we need to be able to understand whether this is new, or whether it’s something we already know about, such as asthma. It’s much more difficult to correctly identify and recover from a problem if it blends in with the background noise, such as difficulty breathing because of air quality, lack of exercise, smoking, or allergies. You can’t know what’s unique if you don’t already know what’s normal or healthy.
To counter this problem, we pre-emptively bring the background noise on the network to a manageable level. All systems move towards entropy unless acted upon. We must put energy into the security process to counter the growth of entropy, which would otherwise exponentially complicate our security problem set. Standardization and control are the keys here. If we limit what users can install on their systems, we quickly notice when an untrusted application is being installed. If it’s against policy for a Domain Administrator to log in to Tier 2 workstations, then any attempts to do this will stand out. If it’s unusual for Domain Controllers to create outgoing web traffic, then it stands out when this occurs or is attempted.
Centralize the security process. Enable that process. Standardize security configuration, monitoring, and expectations across the organization. Enforce those standards. Enforce the tenet of least privilege across all user levels. Understand your ingress and egress network traffic patterns, and when those are allowed or blocked.
In the end, your success in investigating and responding to inevitable security incidents depends on what your organization does on the network today, not during an active investigation. By reducing entropy on your network and defining what “normal” looks like, you’ll be better prepared to quickly identify questionable activity on your network and respond appropriately. Bear in mind that security is a continuous process and should not stop. The longer we ignore the security problem, the further the state of the network will drift from “standardized and controlled” back into disorder and entropy. And the further we sit from that state of normal, the more difficult and time consuming it will be to bring our network back to a trusted operating environment in the event of an incident or compromise.