Description
In this whitepaper, we demonstrate how to apply hardware platform-specific optimizations to improve the inference speed of the LLaMA2 large language model (LLM) when running it with llama.cpp (an open-source LLaMA inference engine) on Intel® CPU platforms.
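As a minimal sketch of the kind of CPU-side tuning the whitepaper discusses (this example is not taken from the whitepaper itself), the snippet below loads a quantized GGUF model through the llama-cpp-python bindings and pins the thread count to roughly the number of physical cores, which is a common starting point on Intel CPUs with Hyper-Threading. The model path and prompt are hypothetical placeholders, and the assumption that physical cores equal half of `os.cpu_count()` is a rough heuristic rather than a guarantee.

```python
import os

from llama_cpp import Llama  # pip install llama-cpp-python

# Rough heuristic: assume Hyper-Threading, so physical cores ~= logical cores / 2.
# On a real system, confirm the physical core count (e.g. via lscpu) instead.
physical_cores = max(1, (os.cpu_count() or 2) // 2)

# Hypothetical path to a quantized GGUF build of the LLaMA2 model.
llm = Llama(
    model_path="./llama-2-7b.Q4_K_M.gguf",
    n_ctx=2048,            # context window size
    n_threads=physical_cores,  # threads used for token generation
)

output = llm("Q: What is the capital of France? A:", max_tokens=32)
print(output["choices"][0]["text"])
```

Thread count is only one of several CPU-level knobs; the whitepaper itself covers the platform-specific optimizations in detail.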