How Prediction Guard Delivers Trustworthy AI on Intel® Gaudi® 2 AI Accelerators
Overview
Large language models (LLM) promise to revolutionize how enterprises operate, but making them production-ready means solving privacy risks, security vulnerabilities, and performance bottlenecks.
Not so easy.
This session focuses on how AI startup Prediction Guard found a solution to these challenges by using the processing power of Intel® Gaudi® 2 AI accelerators in the Intel® Tiber™ AI Cloud.1 The topics include:
- Prediction Guard’s pioneering work with hosting open source LLMs like Llama 2 and neural-chat-7B in a secure, privacy-preserving environment with filters for PII, prompt-injection attacks, toxic outputs, and factual inconsistencies.
- How Prediction Guard optimized batching, model replication, tensor shaping, and hyperparameters for 2x throughput gains and industry-leading time to first token for streaming.
- Architectural insights and best practices for capitalizing on LLMs.
Skill level: Expert
Featured Software
This session showcases the Intel Tiber AI Cloud: Learn More | Sign Up
Download Code Samples
Other Resources
Related Webinar