Intel® Distribution of Modin*
Scale your pandas workflows by changing a single line of code.
Accelerate pandas DataFrame Processing
Modin* is a drop-in replacement for pandas that enables data scientists to scale to distributed DataFrame processing without rewriting their API code. Intel® Distribution of Modin* adds optimizations that further accelerate processing on Intel hardware.
Using this library, you can:
- Process terabytes of data on a single workstation
- Scale from a single workstation to the cloud using the same code
- Focus more on data analysis and less on learning new APIs
Intel Distribution of Modin is part of the end-to-end suite of Intel® AI and machine learning development tools and resources.
Download as Part of the Toolkit
Intel Distribution of Modin is included as part of the Intel® AI Analytics Toolkit, which provides accelerated machine learning and data analytics pipelines with optimized deep learning frameworks and high-performing Python* libraries.
Download the Stand-Alone Version
A stand-alone download of Intel Distribution of Modin is available. You can download binaries from your preferred repository or install the package using PIP* or Anaconda*.
Conda-Forge | GitHub* | PIP
Features
Accelerated DataFrame Processing
- Speed up the extract, transform, and load (ETL) process for large DataFrames.
- Automatically use all of the processing cores available on your machine.
Optimized for Intel Hardware
- Scale to terabytes of data on a single data science workstation.
- Analyze large datasets (over one billion rows) using heterogeneous data kernels (HDK) and performant end-to-end analytics frameworks that take advantage of the compute power of current and future Intel hardware.
Compatible with Existing APIs and Engines
- Change one line of code to use your existing pandas API calls, no matter the scale: instead of `import pandas as pd`, use `import modin.pandas as pd`.
- Use the Dask*, Ray, or HEAVY.AI* compute engines to distribute the data without writing any parallelization code.
- Continue to use the rest of your Python ecosystem code, such as NumPy, XGBoost, and scikit-learn*.
- Use the same notebook to scale from your local machine to the cloud.
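The one-line swap described above can be sketched as follows. The fallback to stock pandas and the sample data are illustrative additions for this sketch, not part of the product documentation; the only change Modin requires is the import line itself.

```python
# Optionally pick the compute backend before importing Modin, e.g.:
#   import os; os.environ["MODIN_ENGINE"] = "ray"   # or "dask"

# One-line change: swap the pandas import for Modin's drop-in replacement.
try:
    import modin.pandas as pd  # distributes DataFrame operations across all cores
except ImportError:
    import pandas as pd        # stock pandas: identical API, single-core fallback

# Everything below is unchanged pandas code.
df = pd.DataFrame({"city": ["NYC", "SF", "NYC"], "fare": [12.5, 30.0, 8.0]})
mean_fare = df.groupby("city")["fare"].mean()
print(mean_fare.to_dict())  # {'NYC': 10.25, 'SF': 30.0}
```

Because the API is identical, the same script runs under stock pandas on a laptop and under Modin on a multicore workstation or cluster.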
Benchmarks
Demos
Use Case: Fraud Detection
Follow this step-by-step tutorial to learn how to use Intel Distribution of Modin to preprocess, analyze, and transform a credit card transaction dataset for use in a fraud detection application.
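A hedged sketch of the kind of preprocessing such a tutorial covers: the column names, toy data, and steps here are illustrative, not the tutorial's actual code. These are exactly the ETL operations (imputation, one-hot encoding) that Modin parallelizes transparently.

```python
# Illustrative fraud-detection preprocessing; falls back to stock pandas
# so the sketch runs even without Modin installed.
try:
    import modin.pandas as pd  # drop-in replacement: same API, parallel execution
except ImportError:
    import pandas as pd

# Toy stand-in for a credit card transaction dataset.
df = pd.DataFrame({
    "amount":   [25.0, 3100.0, None, 12.5],
    "category": ["food", "travel", "food", "retail"],
    "is_fraud": [0, 1, 0, 0],
})

# Impute missing transaction amounts with the median.
df["amount"] = df["amount"].fillna(df["amount"].median())

# One-hot encode the merchant category for model training.
features = pd.get_dummies(df[["amount", "category"]], columns=["category"])
print(features.shape)  # (4, 4): amount + three one-hot category columns
```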
Seamlessly Scale pandas Workloads with a Single Code-Line Change
Learn how the Intel Distribution of Modin scales pandas workloads using the same APIs, with a live demonstration that walks you through the tools and process.
Intel and Anaconda*: Python* Data Science at Scale
Intel and Anaconda have partnered to bring high-performance Python optimizations with simple installations. With minimal code changes, you can accelerate preprocessing, model training, and model inference. See the power of this end-to-end solution in action using the New York City Taxi dataset.
Scale Your pandas Workflow with Modin
Data scientists no longer have to learn new APIs and rewrite code when their datasets grow to terabytes or require parallel processing. See benchmark results that show speedups for a variety of datasets.
In The News
Data Science at Scale with Modin
Get started with the Intel Distribution of Modin by installing via your preferred method, then changing your pandas import statement to use Modin. This brief tutorial then shows how to try it yourself using the New York City Taxi dataset.
Scale Interactive Data Science with Modin and Ray
Learn about the technology that underpins the ability of Modin to scale, how to apply Modin in practice, and how it compares to alternative solutions.
Unleash The Power Of Dataframes At Any Scale With Modin
This podcast provides insight into how Modin is architected to scale from small to large volumes of data, and how to get started using it.
The Modin View of Scaling pandas
This article provides background on the vision for Modin and its key architectural decisions, and compares Modin to alternative solutions for distributed data processing.
Documentation & Code Samples
Specifications
Processors:
- Intel® Core™ processors
- Intel® Xeon® processors
Operating systems:
- Linux*
- Windows*
Languages:
- Python
Get Help
Your success is our success. Access this support resource when you need assistance.
For additional help, see our general oneAPI Support.
Related Products

Stay Up to Date on AI Workload Optimizations
Sign up to receive hand-curated technical articles, tutorials, developer tools, training opportunities, and more to help you accelerate and optimize your end-to-end AI and data science workflows.
Take a chance and subscribe. You can change your mind at any time.