Running BigDL on Amazon Web Services* (AWS)

Published: 04/24/2017  

Last Updated: 04/24/2017

In recent years, deep learning has significantly improved several AI applications, such as recommendation engines, voice and speech recognition, and image and video recognition. Many customers process the massive amounts of data that feed these deep neural networks in Apache Spark*, only to move that data later into a separate infrastructure to train models with popular frameworks such as Apache MXNet* and TensorFlow*. Because Apache Spark is so popular, with more than a thousand contributors, the developer community has expressed interest in uniting big data infrastructure and deep learning into a single workflow under Apache Spark.

Apache Spark is an open-source cluster-computing framework. Originally developed at the University of California, Berkeley’s AMPLab, the Spark codebase was later donated to the Apache Software Foundation, which maintains it. Spark provides an interface for programming entire clusters with implicit data parallelism and fault tolerance.

BigDL is a distributed deep learning framework for Apache Spark that was developed by Intel and contributed to the open source community to unite big data processing and deep learning. BigDL makes deep learning more accessible to the big data community by letting developers keep using familiar tools and infrastructure to build deep learning applications. BigDL is licensed under the Apache 2.0 license.

For the full walkthrough, read the complete blog post, Running BigDL on Amazon Web Services (AWS).
