Lenovo Big Data Validated Design for Cloudera Enterprise with Local and Decoupled SAS Storage
Reference Architecture
Abstract
This document describes the reference architecture for Cloudera Enterprise on bare-metal infrastructure and on infrastructure virtualized with VMware. It covers solutions with internal storage and solutions with decoupled SAS storage. It provides a predefined and optimized hardware infrastructure for Cloudera Enterprise, a distribution of Apache Hadoop and Apache Spark with enterprise-ready capabilities from Cloudera.
This reference architecture provides the planning, design considerations, and best practices for implementing Cloudera Enterprise with Lenovo products, including ThinkSystem servers. Jointly tested and validated by Lenovo, Cloudera, and VMware, the predefined configuration provides a baseline for a big data solution that can be modified to meet specific customer requirements, such as improved performance, optimized performance/cost, or increased reliability.
This document is intended for IT professionals, technical architects, sales engineers, and consultants who plan, design, and implement the big data solution with Lenovo hardware.
Table of Contents
Introduction
Business problem and business value
Requirements
Architectural overview
Component model
Operational model
Deployment considerations
Appendix: Bill of Materials
Acknowledgements
Resources
Change History
Changes in the October 24 update (version 1.3):
- Added decoupled external JBOD SAS storage configuration with dense compute nodes
Related product families
Product families related to this document are the following: