RayOnSpark: Running Emerging AI Applications on Big Data Clusters with Ray* and Analytics Zoo

Published: 07/31/2019

By Jinquan Dai

AI has evolved significantly in recent years. To gain insight and make decisions based on massive amounts of data, we need to embrace advanced and emerging AI technologies such as deep learning, reinforcement learning (RL), and AutoML.

Ray* is a distributed framework for emerging AI applications, open-sourced by UC Berkeley* RISELab. It implements a unified interface, a distributed scheduler, and a distributed, fault-tolerant store to address the new and demanding systems requirements of advanced AI technologies. Ray allows users to easily and efficiently run many emerging AI applications, such as deep reinforcement learning using RLlib, scalable hyperparameter search using Ray Tune, and automatic program synthesis using AutoPandas.

Check out this RISELab blog post, which describes how to use RayOnSpark, a feature recently added to Analytics Zoo (an end-to-end data analytics and AI platform open-sourced by Intel), to run Ray programs directly on existing Big Data clusters in a distributed fashion.
