The Metadata Era
How to Prepare Your Organization for the Metadata Era
Varonis Systems, Inc.

CONTENTS OF THIS WHITE PAPER
Introduction
The Dawn of the Metadata Era
Metadata Collection
Metadata Analysis
The Varonis Metadata Framework
Summary

INTRODUCTION

Over the past two decades, the widespread interconnectivity and availability of computing resources have precipitated rapid growth in digital collaboration and an exponential increase in the amount of data that is created, shared, streamed, and stored. We are now entering a new era, in which organizations hold more digital data than ever, and that data must be continuously managed and protected to remain safe and retain its value. To do so, organizations need continuous, up-to-date information about the data. To manage data comprehensively, you need metadata. Use and analysis of metadata is already more common than we realize; automated collection, storage, analysis, and presentation of metadata will become a necessity in the new metadata era.

The digital revolution shares similarities with the transportation revolution. When there were fewer than 10 automobiles in the world, traffic lights weren’t necessary; it took about a hundred years after cars showed up for the first documented electric traffic light and motor vehicle speed limits to appear in the US. Wilbur Wright didn’t file a flight plan when he took The Flyer out for a spin in 1903; it took more than 30 years for the first Airway Traffic Control Center to be built in 1935, a year that saw over 30 airplane crashes (Source: planecrashinfo.com, centennialofflight.gov). The amount of data that IT organizations must manage daily has reached that same watershed.

IT is working at capacity to manage and protect data manually as best it can: responding to authorization requests, migrating data, and cleaning up excessive access. Despite this effort, IT has been falling further and further behind for the past 15 years. There is simply too much data being created too quickly to manage, protect, and realize its full value without automated collection, analysis, storage, and presentation of metadata. The Metadata Era has arrived.

THE DAWN OF THE METADATA ERA

When objects have value and they start to multiply, we must begin to cope with their increasing numbers. We must observe them, manage them, and create and enforce rules in order to fully realize their benefits and potential. Not only do the rules provide safety, but they also provide a framework that enhances the value of the objects: we can drive faster with designated lanes and fly more planes with air traffic control.

Prior to the digital revolution, the number of information items was relatively small, and distribution was slow. Organizations grew to the capacities of their human networks; hierarchies of humans and file cabinets created a pool of tribal knowledge to draw from and analyze. Rules about who had access to what information pertained to verbal and physical (paper) distribution.

As individuals and organizations started to use computers and create files, they naturally began to organize, manage, and protect them using the available controls. They put files in folders on servers and protected them with access control lists, mirroring the methods they used with their physical files, like paper folders in a shared file cabinet with locks on some of the drawers. This worked reasonably well while the set of files and number of users was still relatively small, though even small workgroups could lose track of their data assets before long.
Fast forward 15 years, and the amount of available digital information has increased by several orders of magnitude, the criticality of the data has grown, and the need for collaboration through digital information has never been greater. Gartner estimates that 80% of all data is unstructured, and that it will grow by 650% in the next 5 years, or roughly 50% year over year (50% growth compounded over five years is roughly a 7.6-fold increase). Not only are personnel within organizations digitally collaborating on a daily basis, but interdepartmental digital collaboration at scale is a vital necessity.

Data is more valuable when it is organized, managed, and protected; it is more available and less at risk of loss, theft, or tampering. Organizations now store more and more information about their customers and partners, and have a responsibility to safeguard it. Failure to protect this data can damage not only the organization storing it but also other organizations and individuals; partners and customers now expect assurance that their information is being consistently protected before they will conduct business.

Managing and protecting millions of individual files by hand is unrealistic, so more containers (shared folders, sites, etc.) are needed to share the files among changing, cross-functional teams. More containers mean more access decisions and reviews under more intense pressure. It can be almost impossible to mentally calculate the many complex functional relationships between users, groups, and data, and wrong decisions mean lost productivity and increased risk.

Currently, an average terabyte of data contains roughly 50,000 folders. Of those 50,000 folders, 2,500 usually have unique permissions applied to them. These folders’ permissions usually refer to several groups, each containing anywhere from a few to dozens of users; an organization of 1,000 users often has 1,000 or more groups stored in its directory service (e.g., Active Directory).
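To illustrate why these relationships are hard to calculate mentally, here is a minimal Python sketch that resolves which users can reach each folder through group membership. The group names, folder paths, and function names are hypothetical sample data invented for this illustration. A real directory service such as Active Directory would be queried rather than hard-coded, and real access control lists carry more detail (permission types, inheritance) than this sketch models.

```python
# Hypothetical sample data: group membership as it might be exported from a
# directory service, and folder ACLs that refer to groups rather than users.
GROUP_MEMBERS = {
    "finance": {"alice", "bob"},
    "auditors": {"carol"},
    "all-staff": {"finance", "auditors", "dave"},  # contains nested groups
}

FOLDER_ACLS = {
    "/shares/finance/payroll": {"finance"},
    "/shares/finance/reports": {"finance", "auditors"},
    "/shares/public": {"all-staff"},
}


def expand_group(group, members=GROUP_MEMBERS, _seen=None):
    """Return the set of users in a group, following nested groups."""
    seen = _seen if _seen is not None else set()
    if group in seen:  # guard against circular group nesting
        return set()
    seen.add(group)
    users = set()
    for member in members.get(group, ()):
        if member in members:  # the member is itself a group
            users |= expand_group(member, members, seen)
        else:
            users.add(member)
    return users


def effective_access(acls=FOLDER_ACLS, members=GROUP_MEMBERS):
    """Map each folder to the users who can reach it through any group."""
    access = {}
    for folder, groups in acls.items():
        users = set()
        for group in groups:
            users |= expand_group(group, members)
        access[folder] = users
    return access


def folders_for_group(group, acls=FOLDER_ACLS):
    """Reverse lookup: list the folders whose ACL names the given group."""
    return sorted(folder for folder, groups in acls.items() if group in groups)


if __name__ == "__main__":
    for folder, users in sorted(effective_access().items()):
        print(folder, "->", sorted(users))
```

Even with three groups and three folders, the answer depends on nested membership. At 2,500 uniquely permissioned folders per terabyte, the same calculation is only practical when it is automated.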
All of these folder permissions and groups need to be maintained and updated as employees change roles, yet 91% of organizations can’t identify business owners for their folders (Source: Ponemon Institute Study, June 2008), nor can they determine which folders their groups grant access to. IT simply cannot keep up by manually creating, updating, and reviewing spreadsheets.

Manual techniques are acceptable when the numbers of objects and the functional relationships between them are relatively small, but as the numbers of objects and relationships grow, a person’s ability to effectively observe and analyze them diminishes very quickly. Luckily, computers are excellent tools for analyzing large amounts of information, as long as we can program them to collect, process, and analyze relevant metadata.

METADATA COLLECTION

In order to manage data, we need metadata that will help us determine, for example, who it belongs to, who has access to it, who uses it, and what kind of content it contains. Metadata comes in many forms: files and folders have names, sizes, access timestamps, and permissions. Personnel don’t usually annotate the files they create with useful information, so many organizations are automating content analysis to discover files that contain interesting data, like those that concern special projects or contain regulated content such as credit card numbers or other private customer information. This metadata is commonly called “file classification.”

Another useful metadata element is a record of who is using each file, or audit trail.
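As a rough illustration of these metadata elements, the following Python sketch reads a file's basic attributes from the file system and applies a simple content check for credit card numbers. It is a sketch under stated assumptions, not a description of any product: the function names and the "regulated" label are hypothetical, and the Luhn checksum is a common screening technique substituted here for a real classification engine. Note that the file system reports when a file was last accessed, but not who accessed it.

```python
import os
import re
import stat
from datetime import datetime, timezone

# Candidate card numbers: 13 to 16 digits, each optionally followed by a
# single space or hyphen. This pattern is an assumption for illustration.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){12,15}\d\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:  # double every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return len(digits) >= 13 and total % 10 == 0


def contains_card_number(text: str) -> bool:
    """Return True if any candidate in the text passes the Luhn check."""
    return any(luhn_valid(m.group()) for m in CARD_PATTERN.finditer(text))


def collect_metadata(path: str) -> dict:
    """Gather file system attributes plus a hypothetical classification label."""
    info = os.stat(path)
    with open(path, "r", errors="ignore") as handle:
        text = handle.read()
    return {
        "name": os.path.basename(path),
        "size_bytes": info.st_size,
        "last_accessed": datetime.fromtimestamp(info.st_atime, tz=timezone.utc),
        "last_modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
        "permissions": stat.filemode(info.st_mode),
        # Hypothetical labels; real classification uses many more rules.
        "classification": "regulated" if contains_card_number(text) else "unclassified",
    }
```

Calling `collect_metadata` on every file in a share yields one record per file at a single point in time; tracking how those records change is what turns them into a stream.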
Unfortunately, most organizations currently have no audit trail of data usage because native operating system auditing is too taxing on disk and CPU resources. Imagine trying to manage your finances without a record of expenses!

Each file and folder has many metadata elements associated with it at any given point in time. If we track changes and access activity, the associated metadata grows very quickly. The constantly changing files and folders generate streams of metadata, and the combined metadata streams become a torrent. Capturing, analyzing, storing, and understanding so much metadata requires technology specifically designed for this purpose.

Adding to the complexity, these files and folders have interesting functional relationships between them: folders contain many files, some users access the same folders and files, some files contain the same types of content, and some folders are accessible by the same people. The number of functional relationships between metadata elements is another order of magnitude greater than the number of elements themselves.

METADATA ANALYSIS

Simply collecting the metadata will not be enough to help us visualize and understand the complex functional relationships that surround our data. The metadata must be synthesized and analyzed to help us determine where sensitive data is exposed, who it belongs to, and who has excessive permissions to it, and to identify other data management and protection concerns. The torrent of metadata elements and the functional relationships between them are far too numerous and complex for humans to analyze effectively, so we must turn to automated analysis.

Automated analysis already plays a large part in how we interact with the world. For example:
• You probably used an Internet search engine today, and more than once.
• Amazon.com now makes recommendations about books I might like based on what I’ve previously ordered.
• Credit card companies analyze transactions to spot possible fraudulent activity.
• I met my wife on Match.com: my profile popped up on her monitor with the words, “If you liked him, you might also like this guy,” after she had contacted some other fellow (tough luck, pal). iTunes and other online shopping engines have similar functionality.

Automated analysis transforms an overwhelming set of objects into a digestible one, picking out items of high interest so we don’t have to ferret through them manually. There are simply too many websites, books, songs, people using credit cards, and potential mates for any human to go through them all, much less analyze them. Automating the analysis of metadata will help us find the data and access rights that require our attention. One technology is built for this purpose: the Varonis Metadata Framework.

THE VARONIS METADATA FRAMEWORK

The Varonis Metadata Framework non-intrusively collects critical metadata, generates metadata where existing metadata is lacking (e.g., through its file system filters and content inspection technologies), pre-processes it, normalizes it, analyzes it, stores it, and presents it to IT administrators and data owners in an interactive, dynamic interface.

Four distinct metadata streams are currently collected:
• User and Group Information – From Active Directory, LDAP, NIS, SharePoint, etc.
• Permissions Information – Knowing who can access what data in which containers
• Access Activity – Knowing which users actually access what data, when, and what they’ve done
• Sensitive Content Indicators – Knowing which files contain items of sensitivity and importance, and where they reside

With these metadata streams collected, synthesized, processed, and presented intelligently by the Varonis framework, organizations can regularly answer the numerous pressing questions that arise in data governance:
• Who has access to a data set?
• Who should have access to a data set?
• Who has been accessing it?
• What other data have they been accessing?
• Who is the likely data owner?
• Which data is sensitive?
• Where is my sensitive data overexposed and how do I fix it?
• What data is unused?

Like search engines and online stores, Varonis uses sophisticated analytics to identify objects of interest, such as users whose access activity indicates that they have changed roles yet still have access to data sets that are no longer relevant to them, or users who suddenly access a statistically significant number of files. Varonis also uses automation to help identify data owners: the most active users of a high-level container where the business has write access are very likely candidates. Once data owners are identified, they are empowered to make informed authorization and permissions maintenance decisions through a web-based interface; those decisions are then executed with no IT overhead or manual backend processes.

By collecting, processing, analyzing, and presenting metadata to IT and the business, Varonis completes a full value cycle for IT and the organization: comprehensive visibility into access rights, auditing of all access and authorization activities, automated recommendations for where access should be restricted and activity scrutinized, and clear, robust interfaces and reports.
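The kinds of analysis described above can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the Varonis implementation: the event tuples, the permissions map, the function names, and the three-standard-deviation threshold are all hypothetical choices made for this example.

```python
from collections import Counter
from statistics import mean, stdev

# Hypothetical audit events: (user, folder, operation).
EVENTS = [
    ("alice", "/shares/finance", "write"),
    ("alice", "/shares/finance", "read"),
    ("alice", "/shares/finance", "write"),
    ("bob", "/shares/finance", "read"),
    ("carol", "/shares/hr", "read"),
]

# Hypothetical effective permissions: folder -> users who can access it.
PERMISSIONS = {
    "/shares/finance": {"alice", "bob", "dave"},
    "/shares/hr": {"carol", "alice"},
}


def likely_owner(events, folder):
    """Suggest the most active user of a folder as a candidate data owner."""
    counts = Counter(user for user, target, _ in events if target == folder)
    return counts.most_common(1)[0][0] if counts else None


def unused_access(permissions, events):
    """For each folder, list users who have access but no recorded activity."""
    active = {}
    for user, folder, _ in events:
        active.setdefault(folder, set()).add(user)
    return {
        folder: users - active.get(folder, set())
        for folder, users in permissions.items()
    }


def is_spike(history, today, threshold=3.0):
    """Flag today's access count if it sits far above the user's own history.

    `history` is a list of earlier daily access counts for one user. The
    threshold (in standard deviations) is an assumed value, not a standard.
    """
    if len(history) < 2:
        return False  # not enough data to estimate normal behavior
    average = mean(history)
    spread = stdev(history)
    if spread == 0:
        return today > average
    return (today - average) / spread > threshold


if __name__ == "__main__":
    print("Candidate owner:", likely_owner(EVENTS, "/shares/finance"))
    print("Unused access:", unused_access(PERMISSIONS, EVENTS))
    print("Spike today:", is_spike([10, 12, 9, 11, 10], 250))
```

Each function combines two of the metadata streams: `likely_owner` and `is_spike` use access activity, while `unused_access` compares access activity against permissions. The results are candidates for review by a person, not final decisions.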
Effective access control, comprehensive auditing, and data ownership are the foundations of data management and protection. Not only will they address most current data governance issues, they will also enable successful execution of future data management and protection initiatives.

The Varonis Metadata Framework will scale to present and future requirements using standard computing infrastructure, even as the number of functional relationships between metadata entities grows exponentially. As new platforms and metadata streams emerge, they will be seamlessly absorbed into the Varonis framework and the productive methodologies it enables for data management and protection.

SUMMARY

To fully realize the benefits of the digital information revolution, organizations will need governance, automation, and analysis; data is simply more valuable when it is organized, managed, and protected. Managing and protecting data without automation will be as inefficient and ineffective as trying to find information on the Internet without a decent search engine. With exponential growth of mutually-critical data shared by organizations, customers, partners, and employees, organizations that do not protect and manage their data with automation will struggle to remain competitive and survive.

Those that do protect and manage data with automation will have significant advantages: the right data will be more promptly available to the right people, and only the right people. Intellectual property will be secure, and secrets will stay secret. Customers and partners will have confidence that shared information is protected.
To keep up with its already overwhelming data-related tasks, like permissions management, data auditing, data ownership, data classification, data migrations, and archiving, IT will inevitably need metadata and automated analysis. Together they will provide actionable intelligence and workflows that augment and accelerate existing business processes. Automation will accelerate tasks that IT labors over on a daily basis, like creating permissions reports, finding lost and deleted files, securing sensitive data, and remediating folders and SharePoint sites that are incorrectly permissioned. IT will also be able to perform tasks that it cannot do today, such as identifying data owners and providing them with actionable information about who has access to their data, who accesses their data, and who has access that they probably shouldn’t.

Those organizations that adopt and embrace metadata technology will have a distinct advantage over those that do not: they will be more efficient, secure, and cost effective. Those organizations that can harness the power of metadata will be leaders in the era following the digital information revolution, the era of metadata.

FOR MORE INFORMATION
Phone: 877-292-8767
www.varonis.com/product

WORLDWIDE HEADQUARTERS
499 7th Ave., 23rd Floor, South Tower
New York, NY 10018
Phone: 877-292-8767
[email protected]

EUROPE, MIDDLE EAST AND AFRICA
1 Northumberland Ave., Trafalgar Square
London, United Kingdom WC2N 5BW
Phone: +44-0-800-756-9784
[email protected]
© Copyright 2024