Overcoming Obstacles to Data Classification

April, 2006

One of the foundational elements of an information security program is the existence of and adherence to a formal data classification scheme. Yet, many organizations–even those that profess a commitment to protecting company and customer information–fail to implement data classification. This article looks at the reasons that data classification can be difficult and offers several practical guidelines to overcome these obstacles.

What is Data Classification?
Data classification is a simple concept. It is a scheme by which the organization assigns a level of sensitivity and an owner to each piece of information that it owns and maintains. In a hospital, for example, a data classification scheme would identify the sensitivity of every piece of data in the hospital, from the cafeteria menu to patient medical records.

The most widely recognized data classification scheme is the kind used by governments. The U.S. government, for example, assigns classifications such as:

  • Top secret
  • Secret
  • Confidential

When a document, letter, memo, or other piece of information is created, the owner assigns it a classification level, which, among other things, defines the security clearance that individuals need in order to access that information.

Similarly, in business, organizations adopt data classification schemes to define the levels of confidentiality that are required for each piece of information created or maintained by the organization. A corporate data classification scheme might comprise information classifications such as:

  • Company confidential
  • Private
  • Sensitive
  • Public

Such a scheme greatly facilitates data security, because it instantly identifies and communicates the level of protection required for any piece of data as well as the audience that may view it. For example, a document that is tagged as “company confidential” is easily recognized as not to be released outside of the company. Further, it limits those who may access the information to a defined group.

A good data classification scheme also includes a time element, which allows a piece of information to change its status on a certain date. An example would be a public company’s earnings announcement, which might be “company confidential” until the date of the announcement, at which time it becomes “public.”
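The time element can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a reference implementation: the `ClassifiedItem` record, its field names, the owner, and the announcement date are all hypothetical. Only the four labels come from the corporate scheme listed above.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class Level(Enum):
    # Labels taken from the example corporate scheme above.
    COMPANY_CONFIDENTIAL = "company confidential"
    PRIVATE = "private"
    SENSITIVE = "sensitive"
    PUBLIC = "public"


@dataclass
class ClassifiedItem:
    # Hypothetical record: every item carries an owner and a level.
    name: str
    owner: str
    level: Level
    # Optional time element: on and after this date the item takes the new level.
    changes_on: Optional[date] = None
    level_after: Optional[Level] = None

    def effective_level(self, today: date) -> Level:
        """Return the level that applies on the given date."""
        if (
            self.changes_on is not None
            and self.level_after is not None
            and today >= self.changes_on
        ):
            return self.level_after
        return self.level


earnings = ClassifiedItem(
    name="Quarterly earnings announcement",
    owner="Investor Relations",      # hypothetical owner
    level=Level.COMPANY_CONFIDENTIAL,
    changes_on=date(2006, 4, 20),    # hypothetical announcement date
    level_after=Level.PUBLIC,
)

print(earnings.effective_level(date(2006, 4, 19)).value)  # company confidential
print(earnings.effective_level(date(2006, 4, 20)).value)  # public
```

Storing the change date with the item, rather than relying on someone to relabel it by hand, is one way to keep the classification current without extra effort from employees.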

There are many other attributes to data classification schemes, but these few points are sufficient to establish why data classification is fundamental to information security. Without a data classification scheme, an organization treats all information the same. This increases the probability that sensitive data will not have adequate security controls, increasing the risk of sensitive data being compromised. It also means that less sensitive data will have more security controls than necessary, leading to unnecessary restrictions and loss of efficiency for operational personnel.

Consequences of Failure in Data Classification
Two high-profile cases in 2005 show the severe losses that can arise when data is not properly classified, when the scheme is not adhered to in practice, or when the scheme is not used to drive security controls appropriate for each class of data.

In early 2005, ChoicePoint, a U.S. firm that provides information on consumers to insurance companies and other types of businesses and government agencies, revealed that criminals had fraudulently obtained valid customer accounts that enabled access to approximately 150,000 consumer names, addresses, Social Security numbers, and credit reports. Clearly, the security controls that ChoicePoint had in place for its new customer account setup process were not adequate for the class of data that it allowed such customers to access.

Around the same time, Bank of America disclosed that it had lost several backup tapes in transit to a backup center. The tapes contained financial information on 1.2 million government employees who were members of the U.S. government’s SmartPay credit card program. Although the Bank’s data classification scheme may have recognized the confidential nature of such information when it resided on the Bank’s primary systems, the scheme did not, in this case, appear to extend to the same information when it was contained on backup media.

Although ChoicePoint and Bank of America can be faulted for not adequately protecting confidential information, it is likely that both organizations had a data classification scheme in place. The problem was that they did not have adequate security controls based on the classification, at least in these instances.

Why Implementing Data Classification is Difficult
Many organizations have an even more fundamental problem: they do not have any data classification scheme at all. If data classification is a foundational requirement for information security, what explains this failure?

First, data classification is one place where the old maxim is true: perfection is the enemy of the good. Some security professionals insist upon a scheme that is perfect in theory but difficult to implement in all but the most disciplined of organizations. For example, if most users are ignorant of basic security practices, successfully implementing a robust data classification scheme will be extremely challenging. A data classification program will be effective only if employees are willing to properly classify each piece of information and maintain the classification. If the scheme is overly complex or too restrictive, it will fail for lack of use. An organization will be better served by a simple data classification scheme that is put into practice, even one that is theoretically imperfect, than by a perfect scheme that exists in name only.

Second, the development and implementation of data classification can be downright expensive. The costs are twofold: developing the data classification scheme, with appropriate controls for each class of data, and then training all employees to recognize and classify data accordingly. The development and training effort can be significant, but even more effort is required to classify existing data and to continue classifying new data on an ongoing basis. For healthcare organizations, financial services firms, and others that are required by law to classify data, the cost of these efforts may be rationalized in terms of regulatory compliance. But for non-regulated organizations, it is often difficult for management to justify such efforts as a necessary part of doing business when they do not directly lead to revenue generation.

Finally, the leaders of the security program, such as the chief information security officer, often lack the authority to drive a data classification program through to full implementation. In many companies, the security program does not have the political clout required to gain acceptance for such an ambitious initiative. This type of effort affects the entire organization, because it mandates changes to the way work is accomplished.

Practical Tips for Implementing a Data Classification Scheme
With these challenges now on the table, let us look at some practical approaches to implementing a data classification scheme.

  • Understand what is realistically achievable. As indicated earlier, not all organizations are ready to accept the disciplines required for a complete data classification scheme. Therefore, a realistic assessment is needed concerning the readiness of the organization. Let this understanding guide your development of the data classification scheme. If the organization is far along in accepting the disciplines required, the data classification scheme can be more detailed. If there is likely to be resistance, then a simpler scheme is better.
  • Bring in key influencers early. Regardless of the approach chosen, it is important that key stakeholders be part of the data classification strategy and design. Individuals who feel they are part of the strategy are more likely to support it during implementation.
  • Get the data classification strategy approved as soon as possible, even if full implementation is slow. Be sure that the classification strategy is formally documented and approved at the necessary levels of management early on. This can benefit the security program in several ways. First, it is an important milestone, and it costs much less than the implementation. Second, any new systems that are built can reference the data classification scheme in their design, narrowing the implementation burden to existing systems. Finally, it can protect the security program in the event of a future security incident. If confidential information is inadvertently disclosed, the security program can point to the classification strategy and push accountability to the line-of-business managers who have not yet implemented the strategy.
  • Appeal to regulatory requirements, shamelessly. The growing body of local, state, and federal regulations that directly or indirectly mandate data classification has become one of the most effective tools available to a security program. For example, in order for a healthcare provider to be HIPAA compliant, it must be able to identify the elements that constitute Protected Health Information (PHI). You can point to these regulations to raise awareness of the need for data classification and to give the security program the clout that it might not otherwise have to get data classification implemented. Make friends with your legal department to learn which regulations are applicable, and use them for all they are worth to do what is right for the business.
  • Align with best-practice frameworks. Two of the most popular are ISO/IEC 17799 and COBIT. Both of these standards require the identification of a data owner and the classification of each piece of data. Appealing to these frameworks, and to other industry-specific schemes, can elevate the authority of the security office in designing the data classification scheme.
  • Classify networks instead of data. For organizations where classification of data appears to be an unreachable goal, try classifying the networks instead of the data. If a network contains sensitive data, then it is classified as a confidential or sensitive network. The network classification then mandates the type of security controls that the network must possess. For example, a network that provides access to confidential information might demand firewalls that protect the perimeter, limited connectivity from other networks, and more extensive monitoring and logging. Network classification is not a trivial exercise, but in large organizations it is often easier than implementing a comprehensive classification scheme for all digitally stored data.
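
The network-classification tip lends itself to a simple lookup from network class to required controls. The Python sketch below is illustrative only. The control names for a confidential network follow the example in the tip above, while the `GENERAL` class, its baseline control, and the function names are assumptions introduced for the example.

```python
from enum import Enum


class NetworkClass(Enum):
    CONFIDENTIAL = "confidential"
    GENERAL = "general"  # hypothetical label for networks holding no sensitive data


# Controls mandated by each network class. The CONFIDENTIAL entry mirrors the
# example in the text; the GENERAL baseline is an assumption.
REQUIRED_CONTROLS = {
    NetworkClass.CONFIDENTIAL: {
        "perimeter_firewall",
        "limited_connectivity",
        "extensive_monitoring_and_logging",
    },
    NetworkClass.GENERAL: {"perimeter_firewall"},
}


def classify_network(holds_sensitive_data: bool) -> NetworkClass:
    """A network that contains sensitive data is classified as confidential."""
    if holds_sensitive_data:
        return NetworkClass.CONFIDENTIAL
    return NetworkClass.GENERAL


def missing_controls(network_class: NetworkClass, controls_in_place: set) -> set:
    """Return the mandated controls that the network does not yet have."""
    return REQUIRED_CONTROLS[network_class] - set(controls_in_place)


# Hypothetical network that holds sensitive data but has only a firewall.
billing_class = classify_network(holds_sensitive_data=True)
gaps = missing_controls(billing_class, {"perimeter_firewall"})
print(sorted(gaps))  # ['extensive_monitoring_and_logging', 'limited_connectivity']
```

A table like this gives network owners a concrete checklist per class, which is usually an easier conversation than labeling every individual document or record.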

As we have seen, data classification is a fundamental requirement for information security, and the consequences of not fully implementing a data classification scheme can be severe. Nevertheless, many organizations do not implement data classification. Therefore, the chief information security officer must exercise wisdom in proposing and developing the scheme, based on realistic expectations. In the end, the practical guidelines outlined in this article will pay dividends, even if the ideal data classification scheme is not immediately achievable.

This article was written by Contributing Research Analysts Ron Collette, CISSP, and Mike Gentile, CISSP. They are the authors of The CISO Handbook: A Practical Guide to Securing Your Company, published by Auerbach. For more information, please visit http://www.cisohandbook.com.

For a more complete analysis of current trends in IT security spending, staffing, technology, and management best practices, please refer to our 2006 IT Security Study.