Description

In this whitepaper, we demonstrate how you can perform hardware platform-specific optimizations to improve the inference speed of your LLaMA2 large language model with llama.cpp (an open-source LLaMA inference framework) running on the Intel® CPU platform.
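The whitepaper's concrete optimization steps are not included in this record description. As a rough, hedged illustration of the kind of platform-specific tuning it refers to, the sketch below uses the llama-cpp-python bindings to match the inference thread count to the physical core count of an Intel CPU; the model path, context size, and prompt are placeholders, not values taken from the whitepaper.

```python
# Illustrative sketch only: the whitepaper's actual optimization steps are not
# part of this record description. Assumes the llama-cpp-python bindings are
# installed (pip install llama-cpp-python) and a quantized LLaMA2 GGUF model
# file is available locally.
import os

from llama_cpp import Llama

# Hypothetical model path; substitute your own LLaMA2 GGUF file.
MODEL_PATH = "llama-2-7b-chat.Q4_0.gguf"

# On Intel CPUs with Hyper-Threading, setting the thread count to the number
# of physical cores (rather than logical cores) is a common starting point
# for CPU inference tuning; os.cpu_count() reports logical cores, so halve it.
physical_cores = max(1, (os.cpu_count() or 2) // 2)

llm = Llama(
    model_path=MODEL_PATH,
    n_ctx=2048,               # context window; adjust to your workload
    n_threads=physical_cores, # threads used for token generation
)

# Simple generation call, useful for timing end-to-end latency on a fixed prompt.
output = llm("Explain CPU cache blocking in one sentence.", max_tokens=64)
print(output["choices"][0]["text"])
```

From there, the thread count, batch size, and quantization format can be varied and benchmarked on the target Intel CPU to find the best-performing configuration.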