How Prediction Guard Delivers Trustworthy AI on Intel® Gaudi® 2 AI Accelerators
Overview
Large language models (LLMs) promise to revolutionize how enterprises operate, but making them production-ready means solving privacy risks, security vulnerabilities, and performance bottlenecks.
Not so easy.
This session focuses on how AI startup Prediction Guard solved these challenges by using the processing power of Intel® Gaudi® 2 AI accelerators in the Intel® Tiber™ AI Cloud. Topics include:
- Prediction Guard’s pioneering work hosting open source LLMs such as Llama 2 and neural-chat-7B in a secure, privacy-preserving environment, with filters for PII, prompt-injection attacks, toxic outputs, and factual inconsistencies (see the filtering sketch after this list).
- How Prediction Guard tuned batching, model replication, tensor shaping, and hyperparameters to double throughput and deliver industry-leading time to first token for streaming (see the batching sketch after this list).
- Architectural insights and best practices for putting LLMs to work in production.
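
To make the first bullet concrete, here is a minimal sketch of the input/output filtering pattern: scrub PII and screen for prompt injection before a prompt ever reaches the model, then hang output-side checks off the response. Every name here (redact_pii, looks_like_injection, guarded_completion, the regex patterns, and the call_llm stand-in) is an illustrative assumption, not Prediction Guard's actual API.

```python
# Minimal sketch of input/output filtering around an LLM call.
# All names and patterns are illustrative, not Prediction Guard's API.
import re

PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

INJECTION_MARKERS = ("ignore previous instructions", "disregard the system prompt")

def redact_pii(text: str) -> str:
    """Replace common PII patterns with typed placeholders before the
    prompt reaches the model."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"<{label.upper()}>", text)
    return text

def looks_like_injection(text: str) -> bool:
    """Cheap heuristic screen for prompt-injection attempts."""
    lowered = text.lower()
    return any(marker in lowered for marker in INJECTION_MARKERS)

def guarded_completion(prompt: str, call_llm) -> str:
    """Wrap a raw model call with input filters; output-side checks
    (toxicity, factual consistency) would vet the return value the same way."""
    if looks_like_injection(prompt):
        raise ValueError("Prompt rejected: possible injection attempt")
    return call_llm(redact_pii(prompt))

if __name__ == "__main__":
    echo = lambda p: f"[model output for] {p}"  # stand-in for a hosted LLM
    print(guarded_completion("Summarize the email from jane@example.com", echo))
```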
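
The throughput work in the second bullet centers on batching: co-scheduling several concurrent requests into one model forward pass. The sketch below shows the micro-batching idea with asyncio; the MAX_BATCH and WINDOW_MS knobs and the run_batch stand-in are assumptions for illustration, not the tuned values discussed in the session.

```python
# Minimal sketch of request micro-batching; knobs are illustrative only.
import asyncio

MAX_BATCH = 8   # illustrative cap on prompts per forward pass
WINDOW_MS = 10  # illustrative wait for more requests to co-schedule

async def batch_worker(queue: asyncio.Queue, run_batch):
    """Drain the queue in micro-batches so one forward pass serves
    several requests at once."""
    while True:
        prompt, future = await queue.get()
        batch = [(prompt, future)]
        loop = asyncio.get_running_loop()
        deadline = loop.time() + WINDOW_MS / 1000
        while len(batch) < MAX_BATCH:
            timeout = deadline - loop.time()
            if timeout <= 0:
                break
            try:
                batch.append(await asyncio.wait_for(queue.get(), timeout))
            except asyncio.TimeoutError:
                break
        # One batched call amortizes model overhead across the whole batch.
        for (_, fut), out in zip(batch, run_batch([p for p, _ in batch])):
            fut.set_result(out)

async def main():
    queue: asyncio.Queue = asyncio.Queue()
    run_batch = lambda prompts: [f"[out] {p}" for p in prompts]  # model stand-in
    worker = asyncio.create_task(batch_worker(queue, run_batch))
    loop = asyncio.get_running_loop()
    futures = [loop.create_future() for _ in range(5)]
    for i, fut in enumerate(futures):
        await queue.put((f"prompt {i}", fut))
    print(await asyncio.gather(*futures))
    worker.cancel()

asyncio.run(main())
```

In a production serving stack the same trade-off applies: a longer window grows batches and raw throughput, while a shorter window protects time to first token for streaming clients.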
Skill level: Expert
Featured Software
This session showcases the Intel® Tiber™ AI Cloud.