
Why Choose the FP16 Model in Weight Compression Using Optimum Intel / Neural Network Compression Framework (NNCF)?

Content Type: Troubleshooting   |   Article ID: 000098174   |   Last Reviewed: 03/21/2024

Description

Unable to determine the reason for choosing the FP16 model in weight compression using Optimum Intel / NNCF.

Resolution

FP16 (half precision) stores each weight in 16 bits instead of the 32 bits used by FP32, halving the model size. Inference results are nearly identical to FP32, while memory footprint and GPU resource usage drop by roughly half.
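The size/accuracy trade-off can be illustrated without any deep-learning framework, using only Python's standard `struct` module (format `"e"` is IEEE 754 half precision, `"f"` is single precision). This is a minimal sketch of the principle, not the Optimum Intel / NNCF implementation itself:

```python
import struct

def fp16_roundtrip(value: float) -> float:
    # Pack the value into IEEE 754 half precision (2 bytes) and unpack it,
    # simulating the precision loss of storing a weight in FP16.
    return struct.unpack("e", struct.pack("e", value))[0]

# Hypothetical weight values, for illustration only.
weights = [0.12345, -1.5, 3.14159, 0.0009765]

fp32_bytes = len(weights) * struct.calcsize("f")  # 4 bytes per weight
fp16_bytes = len(weights) * struct.calcsize("e")  # 2 bytes per weight

# FP16 storage is exactly half the size of FP32 storage.
assert fp16_bytes * 2 == fp32_bytes

for w in weights:
    r = fp16_roundtrip(w)
    # Half precision keeps ~11 significand bits, so the relative error
    # stays within roughly 0.05% for values in its normal range.
    assert abs(r - w) <= abs(w) * 1e-3 + 1e-7
```

This mirrors why the FP16 model is preferred as the starting point for weight compression: the representation is half the size, yet each stored value is still within a fraction of a percent of its FP32 original, so inference outcomes remain almost identical.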

Related Products

This article applies to 1 product.