Lenovo Validated Design for AI Infrastructure on ThinkSystem Servers

Authors

Peter Seidel
Ajay Dholakia
Nathan Stewart

Updated

9 Nov 2018

Form Number

LP0892

PDF size

48 pages, 289 KB

Abstract

This document describes the reference architecture for a flexible and scalable Artificial Intelligence (AI) infrastructure on Lenovo ThinkSystem servers. It provides a predefined and optimized hardware infrastructure for data access, model training, and inference under various usage scenarios. The reference architecture also covers planning and design considerations and best practices for implementing the AI infrastructure with Lenovo products.

The AI adoption journey involves the following key steps:

Data access
Model training
Inference

Data access entails connecting to various data repositories. Typical models are based on deep neural networks (DNNs) and require a significant amount of computational resources for training. Hardware infrastructure designed as a scale-out cluster is therefore a key requirement for model training and for enabling deep learning (DL) adoption. Inference deploys and uses the trained model in the target application environment.
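The three steps can be sketched in a few lines of Python. This is only a toy illustration and not part of the reference architecture: the function names (`load_data`, `train`, `infer`) are hypothetical, the "repository" is synthetic in-memory data, and a one-variable linear model stands in for a DNN so that the example runs with the standard library alone.

```python
import random


def load_data(n=200, seed=0):
    """Data access: stand-in for reading samples from a data repository."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    # Synthetic, noise-free targets following y = 3x + 1 (an assumption for the demo).
    return [(x, 3.0 * x + 1.0) for x in xs]


def train(data, epochs=500, lr=0.1):
    """Model training: fit y = w*x + b by full-batch gradient descent on squared error."""
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def infer(model, x):
    """Inference: apply the trained parameters to a new input."""
    w, b = model
    return w * x + b


data = load_data()             # step 1: data access
model = train(data)            # step 2: model training
prediction = infer(model, 0.5) # step 3: inference
print(f"w={model[0]:.3f}, b={model[1]:.3f}, prediction={prediction:.3f}")
```

In a real deployment each step maps to a different part of the infrastructure: the training loop is the compute-intensive part that benefits from a scale-out cluster, while inference typically runs in the target application environment.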

This reference architecture is intended for IT professionals, technical architects, sales engineers, and consultants who plan, design, and implement advanced analytics solutions with Lenovo hardware.

1 Introduction
2 Business problem and business value
3 Requirements
4 Architectural overview
5 Component model
6 Operational model
7 Deployment considerations
8 Appendix: Bill of Material
9 Appendix: Example Training Workload
Resources
Document history

Change History

Changes in the November 9, 2018 update (Version 2.0):

Added inference
Added big data storage
Updated BOM tables to include configurations with 25Gb Ethernet switch

Lenovo Press