Transfer Learning
Humans inherently apply knowledge across tasks. The more closely related two tasks are, the easier it is to transfer knowledge from one to the other. For example, if you know how to speak French, it may be easier for you to learn Spanish, as both belong to the same language family.
Transfer learning is a machine learning technique that uses a previously trained model as the starting point for a model on a new, related task. Reusing a model saves time compared with training from scratch.
Real-World Use Cases
Transfer learning is widely used in computer vision and natural language processing (NLP). The following examples show how these technologies can be used to improve health and well-being.
Drug Discovery
Deep learning helps bring new drugs to market more quickly by predicting drug properties and possible interactions, and by helping formulate new compounds.
Mental Health Therapy
Mental health professionals use NLP to augment screening and diagnosis techniques, evaluate therapy effectiveness, and monitor patient progress.
How Transfer Learning Works
The most common transfer learning techniques are feature extraction and fine-tuning. The techniques differ in how the pretrained base layers are treated: feature extraction freezes them, while fine-tuning continues to adjust them.
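A minimal PyTorch* sketch of the difference, assuming a torchvision* (0.13 or later) ResNet-50 as the base model; the block chosen for fine-tuning is illustrative, not prescribed:

```python
import torchvision.models as models

# Load a pretrained ResNet-50 as the base model.
model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)

# Feature extraction: freeze every pretrained parameter so that
# only layers added later are trained.
for param in model.parameters():
    param.requires_grad = False

# Fine-tuning: leave some (or all) pretrained parameters trainable,
# here the last residual block, so they adapt to the new dataset.
for param in model.layer4.parameters():
    param.requires_grad = True
```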
What Makes a Model Similar?
Often the best strategy for transfer learning is to start with a proven architecture for your network. When selecting a model, consider the following questions.
- How similar is the dataset in terms of categories? (For example, dogs versus cats.)
- How similar is the type of model architecture? (For example, a convolutional backbone such as ResNet* that learns low-level features like edges.)
- How similar is the type of task? (For example, image classification versus object detection.)
Use Transfer Learning for Predictions
The Hunting Dinosaurs AI project uses a standard ResNet computer vision model to classify topography images by their likelihood of containing bones. With transfer learning, this same pretrained model can be used as a starting point for predicting wildfires.
What is an Epoch?
An epoch is one complete pass of the entire dataset through a machine learning algorithm. Data is often divided into batches. If your dataset has 1,000 rows and your batch size is 100, then one epoch requires 10 iterations through the machine learning algorithm.
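The arithmetic, sketched in Python (the numbers mirror the example above):

```python
dataset_size = 1_000   # rows in the dataset
batch_size = 100       # rows processed per iteration

# One epoch = enough iterations to cover the whole dataset once.
iterations_per_epoch = dataset_size // batch_size
print(iterations_per_epoch)  # 10
```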
Select a Pretrained Model
Begin with a ResNet-50 model, which has 50 trainable layers.
Freeze Layers (Feature Extraction)
The model parameters are frozen (kept as is) for 49 of the 50 trainable layers; only the final layer is retrained on the new task.
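A hedged sketch of this step in PyTorch*, assuming a torchvision* ResNet-50; unfreezing only the final fully connected layer leaves the other 49 layers untouched:

```python
import torchvision.models as models

model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)

# Freeze every pretrained layer ...
for param in model.parameters():
    param.requires_grad = False

# ... then unfreeze only the final fully connected layer,
# so 49 of the 50 layers keep their pretrained weights.
for param in model.fc.parameters():
    param.requires_grad = True
```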
Fine-Tune
The parameters of the batch normalization layers, which standardize the inputs to each layer, are kept frozen so their learned statistics are not disturbed. The rest of the layers remain trainable.
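One way to express this in PyTorch*, again assuming a torchvision* ResNet-50 (a sketch, not the project's exact code):

```python
import torch.nn as nn
import torchvision.models as models

model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)

# Freeze only the batch normalization layers; every other layer
# remains trainable for fine-tuning.
for module in model.modules():
    if isinstance(module, nn.BatchNorm2d):
        module.eval()  # keep the stored running statistics fixed
        for param in module.parameters():
            param.requires_grad = False

# Note: model.train() switches BatchNorm back to batch statistics,
# so the eval() calls above must be re-applied after calling it.
```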
Add Layers
The last layer of the model is replaced with a new classification layer sized to the classes in the wildfire dataset.
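A sketch of the replacement in PyTorch*; the two wildfire classes are a hypothetical example:

```python
import torch.nn as nn
import torchvision.models as models

model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)

# Hypothetical: assume the wildfire dataset has two classes
# ("fire risk" / "no fire risk").
num_wildfire_classes = 2

# Swap the 1000-class ImageNet head for a new, randomly
# initialized layer sized to the wildfire classes.
model.fc = nn.Linear(model.fc.in_features, num_wildfire_classes)
```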
Train
Train your model for a few epochs. Normally, you would train a model for many more epochs, but because you are not training all the layers, fewer epochs are required. This is a major advantage of transfer learning.
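Putting the steps together, a minimal feature extraction training loop might look like the following; the random tensors are placeholders for a real wildfire DataLoader:

```python
import torch
import torch.nn as nn
import torchvision.models as models

model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
for param in model.parameters():               # freeze the base layers
    param.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, 2)  # new trainable head

# Placeholder batch; in practice, iterate over a DataLoader
# built from the wildfire dataset.
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, 2, (8,))

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)

for epoch in range(3):                         # a few epochs suffice here
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()                            # gradients flow only to the head
    optimizer.step()
    print(f"epoch {epoch}: loss {loss.item():.4f}")
```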
Intel® AI & Machine Learning Portfolio
AI use cases and workloads continue to grow and diversify across vision, speech, recommender systems, and more. Intel offers an unparalleled AI development and deployment ecosystem combined with a heterogeneous portfolio of hardware optimized for AI. Intel's goal is to make it as seamless as possible for every developer, data scientist, researcher, and data engineer to accelerate their AI journey from the edge to the cloud.
Intel® AI Analytics Toolkit (AI Kit)
Data scientists, AI developers, and researchers can use familiar Python* tools and frameworks to accelerate end-to-end data science and analytics pipelines on Intel architecture. The components are built using oneAPI libraries for low-level compute optimizations. This toolkit maximizes performance from preprocessing through machine learning, and provides interoperability for efficient model development.
Using this toolkit, you can:
- Deliver high-performance deep learning training and integrate fast inference into your AI development workflow with Intel®-optimized deep learning frameworks for TensorFlow* and PyTorch*, as well as pretrained models and low-precision tools.
- Achieve drop-in acceleration for data preprocessing and machine learning workflows with compute-intensive Python* packages such as Modin*, scikit-learn*, and XGBoost, optimized for Intel hardware (see the sketch after this list).
- Gain direct access to analytics and AI optimizations from Intel to help ensure your software works together seamlessly.
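As a hedged illustration of the drop-in acceleration pattern, the Intel® Extension for Scikit-learn* patches standard scikit-learn* estimators in place; the dataset below is synthetic:

```python
# Patch must run before importing scikit-learn estimators.
from sklearnex import patch_sklearn
patch_sklearn()

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=10_000, n_features=20, random_state=0)

# Same scikit-learn API; compute now dispatches to the optimized backend.
clf = LogisticRegression().fit(X, y)
print(clf.score(X, y))
```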
Hugging Face*
Hugging Face* hosts a very large transformer community of researchers, data scientists, and machine learning engineers. Intel and Hugging Face collaborated to democratize AI and machine learning by building a state-of-the-art solution to train, fine-tune, and predict with transformers.
Intel works with Hugging Face to bring the latest innovations of Intel® Xeon® processors and Intel AI software to the transformer community, through open source integration and consistent integrated experiences. With Intel®-optimized Hugging Face, you can:
- Train and fine-tune transformer models on a single node or a distributed cluster that includes Intel Xeon processors and Habana Gaudi* platforms (see the sketch after this list).
- Automatically perform hyperparameter optimization for training and fine-tuning with the integrated SigOpt HPO feature in Hugging Face transformers.
- Quantize, prune, and distill transformer models after fine-tuning through Optimum for Intel, which improves inference performance on Intel platforms without sacrificing accuracy.
- Fine-tune downstream tasks with Intel-optimized pretrained models (from Intel and Habana Gaudi).
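A minimal sketch of the fine-tuning workflow using standard Hugging Face transformers APIs; the checkpoint name and toy dataset are placeholders:

```python
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification,
                          AutoTokenizer, Trainer, TrainingArguments)

model_name = "distilbert-base-uncased"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name,
                                                           num_labels=2)

# Toy dataset standing in for a real downstream task.
data = Dataset.from_dict({"text": ["great", "terrible"], "label": [1, 0]})
data = data.map(lambda x: tokenizer(x["text"], truncation=True,
                                    padding="max_length", max_length=32))

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=data,
)
trainer.train()
```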
Developer Resources from Intel and Hugging Face
Scale Transformer Model Performance with Intel AI
Habana Labs and Hugging Face Partner to Accelerate Transformer Model Training
Recommended Resources
SetFit: Efficient Few-Shot Learning without Prompts
Learn how SetFit works with Sentence Transformer models on data that has few or no labels.
Faster, Easier Optimization with Intel® Neural Compressor
See what is possible with the Intel® Neural Compressor, including knowledge distillation, where a student model learns features from a teacher model.
GitHub* Repository for Transfer Learning
Try out transfer learning with the Model Zoo for Intel architecture.