Power DeepSeek* Models and Applications on Intel® Hardware
Subscribe Now
Stay in the know on all things CODE. Updates are delivered to your inbox.
Overview
Run DeepSeek* models on Intel® hardware to experience the advantages of open source freedom, advanced reasoning capabilities, and a lightweight footprint. This video uses the virtual large language model (vLLM) inferencing engine and the ChatQnA application to illustrate the essential qualities of DeepSeek. The session also demonstrates the low cost of running AI on the CPUs of Intel® Xeon® processors and Intel® Gaudi® AI accelerators, an alternative to GPUs that is efficient and cost-effective.
Gain familiarity with the Open Platform for Enterprise AI (OPEA), which is used to power the ChatQnA application and provides a useful tool to show the fundamentals of constructing chatbots. Developers viewing the tutorials provided can see how the DeepSeek models can be run on relatively modest hardware and get excited about the potential for running them successfully on Intel hardware.
Other topics include optimization techniques using the Intel® Extension for PyTorch* and the process by which ChatQnA can be deployed in minutes on most cloud service providers.
This novice-level video focuses on enterprise customers and partners, C-level executives, AI application developers, and technical decision makers.
The session covers these topics:
- Survey DeepSeek R1 distill models, witnessing how they can be run with the vLLM inference serving engine delivering high throughput and efficiency.
- See how a basic ChatQnA application can be built in minutes with DeepSeek R1 distill models using OPEA with just a single change to one environment variable.
- Discover how Intel Xeon processors and Intel Gaudi AI accelerators are cost-effective platforms relative to GPUs for running lightweight models such as DeepSeek.
Featured software:
- vLLM, a fast and easier-to-use library for LLM inference and serving.
- ChatQnA application – OPEA.