Some of today’s most successful companies are using Apache Hadoop* to create a competitive advantage by extracting real-time insights from petabytes of structured, semi-structured, and unstructured data. Yet data security is a concern for many of these businesses. How much personally identifiable information (PII) and intellectual property is buried within these companies’ massive data sets, and what would a data breach do to their business? Big data repositories often contain sensitive information, much of it obtained by monitoring customer behavior and employee output.

A data breach could be catastrophic, yet strong security safeguards have not been integrated into widely used Apache Hadoop distributions, and for good reason. Big data technologies are implemented precisely because they enable fast analysis of large data sets. Data encryption, the method of choice for protecting sensitive business data, is a compute-intensive process that can slow analysis, negating the primary value of an Apache Hadoop cluster.

Proven in production at some of the most demanding enterprise deployments in the world, the Intel Distribution is supported by experts at Intel with deep optimization experience in the Apache Hadoop software stack as well as knowledge of the underlying processor, storage, and networking components.