
CCA 131 – Cloudera Certified Associate – Administrator


Are you planning to take the CCA Administrator certification exam? Do you want to learn how to build a Big Data cluster with confidence and prepare well for this scenario-based exam?

We strive to give you the skills you need, at an affordable cost, to prepare for and take the exam with confidence.

Training Design

Let us give you an idea of how the training is designed. Before going into details, let us consider a few of the challenges.

  • Each aspirant comes from a different background. One may be an expert in Linux system administration, while another may be working in a production support role.
  • Aspirants live in different time zones in different parts of the world
  • Each aspirant’s budget might be significantly different
  • Each aspirant might have access to a different type of environment (such as servers within their organization, or they might have to rent a server to get real experience)
  • Each aspirant might be targeting a different certification
  • For these reasons, the traditional one-size-fits-all approach will not work

Keeping all of this in mind, we have come up with a plan that should work for most people:

  • Self-paced content with appropriate reference material and scripts
  • Support from our team of 6 to 7 well-trained administrators, with a 24-hour turnaround time for complex issues
  • Collaborative learning
  • Focus on automation, so that you can concentrate on the skills that are required
  • Support for different platforms: GCP, AWS, and any other hosting platform, as long as we get sudo access to the host

Pre-requisites

Let us look at the pre-requisites for learning Big Data administration with a hands-on approach.

  • macOS, Linux, or Windows 10 with the Ubuntu subsystem or a Linux-based virtual machine running in VirtualBox
  • A server with at least 32 GB of RAM and 8 cores. If you do not have one, you will have to spend money on renting a server or on cloud-based services such as GCP or AWS
As of now, GCP offers a $300 credit, which is enough to set up an 8-node cluster that covers all the exam scenarios. Hence the main demo will be given on GCP.
If you have to buy a computer and you are a DevOps aspirant, we recommend buying a server with 32 GB of RAM and 16 cores, or building one from scratch.
Platform | ~Cost Per Month | Advantages
OVH 32 GB server | $80 | Unlimited access
OVH 64 GB server | $160 | Unlimited access
AWS EC2 instances (7 c5.large) | ~$60 + ~$25 (100 hours) | Quick setup
GCP VM instances (8 machines, each with 2 vCPUs and 8 GB RAM) | ~$120 for 8 instances and 800 GB storage | Quick setup and $300 credit
AWS has a fixed cost for storage and a variable cost for compute (CPU and memory). Storage costs approximately $25 per month for 500 GB. If you plan to practice for less than 40 hours per month, AWS might be a good choice. The same applies to GCP.

OVH can be highly cost-effective if you plan to spend at least 100 hours. You can also share costs with like-minded people to keep the infrastructure cost of learning Hadoop under control.
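The trade-off between hourly and flat-rate pricing can be sketched with a little arithmetic. The snippet below is only an illustration built from the approximate figures quoted on this page. It assumes that the ~$60 AWS compute figure corresponds to 100 hours (about $0.60 per hour for all 7 instances), that the ~$25 AWS storage charge applies every month regardless of usage, and that GCP costs roughly $1.20 per hour including storage. The function and constant names are made up for this example, and actual cloud prices will differ.

```python
# Rough monthly cost model using the approximate figures quoted on this page.
# All constants are assumptions for illustration, not current list prices.
OVH_32GB_FLAT = 80.0         # USD per month, flat rate, unlimited hours
AWS_STORAGE_FIXED = 25.0     # USD per month for ~500 GB, paid even when instances are stopped
AWS_COMPUTE_PER_HOUR = 0.60  # assumed: ~$60 for 100 hours across 7 c5.large instances
GCP_PER_HOUR = 1.20          # assumed: ~$120 for 100 hours, storage included


def aws_cost(hours: float) -> float:
    """Fixed storage charge plus compute billed per hour of practice."""
    return AWS_STORAGE_FIXED + AWS_COMPUTE_PER_HOUR * hours


def gcp_cost(hours: float) -> float:
    """Treats GCP as purely hourly, as the ~$1.20 per hour figure does."""
    return GCP_PER_HOUR * hours


def ovh_cost(hours: float) -> float:
    """A rented server costs the same no matter how many hours it is used."""
    return OVH_32GB_FLAT


def break_even_hours(fixed: float, per_hour: float, flat: float) -> float:
    """Hours per month at which an hourly plan costs as much as a flat plan."""
    return (flat - fixed) / per_hour


if __name__ == "__main__":
    for hours in (20, 40, 100, 150):
        print(
            f"{hours:>4} h  AWS ${aws_cost(hours):7.2f}  "
            f"GCP ${gcp_cost(hours):7.2f}  OVH ${ovh_cost(hours):7.2f}"
        )
    aws_even = break_even_hours(AWS_STORAGE_FIXED, AWS_COMPUTE_PER_HOUR, OVH_32GB_FLAT)
    gcp_even = break_even_hours(0.0, GCP_PER_HOUR, OVH_32GB_FLAT)
    print(f"AWS matches the OVH 32 GB server at about {aws_even:.0f} hours per month")
    print(f"GCP matches the OVH 32 GB server at about {gcp_even:.0f} hours per month")
```

Under these assumptions, the hourly platforms are cheaper for light use and the flat-rate server becomes the cheaper option somewhere below 100 hours per month, which is in line with the guidance above. The GCP $300 credit is not modelled here.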

Agenda

We do not want to focus only on the tasks in the CCA 131 curriculum; we want to train you to build Big Data clusters from the ground up.

  • Overview of the Cloudera QuickStart VM. It is not enough on its own to build real skills, but it is useful as a reference later on.
  • Understand GCP and provision 8 VM instances
    • Cost can be as high as $1.2 per hour, including storage, for all 8 instances (if you run them for 100 hours in a month, it will be $120)
    • You can control costs significantly by making sure the VM instances are shut down when not in use
  • We will also support AWS as well as bare-metal servers from any provider.
  • Set up Ansible to automate mundane tasks, pre-requisites on all nodes, and a MySQL database on the gateway node
  • Set up the httpd service and a yum repository server on the gateway node, and configure yum repositories on the other nodes to point to that server
  • Set up Cloudera Manager and Cloudera reporting service
  • Set up ZooKeeper
  • Set up HDFS
  • Deep dive into HDFS
    • Important properties
    • Important commands
    • Concepts such as block size, replication factor, compression codecs, etc.
    • Rack Awareness
  • Set up YARN + MR2 and Spark
    • YARN – Resource Manager, Node Manager, App Timeline Server
    • MR2 – Job History Server and submitting MapReduce jobs
    • Spark UI and History Server
  • Deep dive into YARN + MR2
    • Role of Resource Manager, Node Manager, and Application Master
    • FIFO Scheduler
    • Fair Scheduler
    • Capacity Scheduler
  • Set up High Availability for HDFS and YARN
  • Set up Pig, Sqoop, Hive, Oozie and Impala
  • Set up HBase
  • Set up Kafka
  • Capacity Planning
  • Day to Day Operations
  • Map to certification curriculum
