Setting up HBase

HBase is a distributed data store that ships as part of the Hadoop ecosystem in most distributions, such as Cloudera and Hortonworks. HBase follows a master-slave architecture, with the HBase Master process as the master and Region Servers as the slaves.

Key Components

Here are the key components of HBase.

  • HBase Master
  • HBase Region Servers
    • WAL – Write-Ahead Log – Stores new data that hasn’t yet been persisted to permanent storage.
    • BlockCache – Stores frequently read data in memory.
    • MemStore – Holds new data in memory until it is written to disk.
    • HFiles – Store the rows as sorted KeyValues on disk.
  • HBase uses HDFS as the file system to persist the data. We can perform real-time operations on HBase tables, such as insert, update and delete.
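
To make the roles of these components concrete, here is a minimal Python sketch of how a write and a read might flow through a Region Server. It is a toy model and not HBase code: the class name ToyRegionServer, the flush_threshold parameter and the per-row cache are all assumptions made for illustration, and real HBase caches HFile blocks rather than individual rows.

```python
import bisect


class ToyRegionServer:
    """Toy model of a Region Server's write/read path (illustrative only)."""

    def __init__(self, flush_threshold=3):
        self.wal = []         # append-only log of edits not yet flushed to disk
        self.memstore = {}    # in-memory buffer holding the newest writes
        self.hfiles = []      # immutable sorted lists of (row, value), oldest first
        self.blockcache = {}  # values recently read from HFiles
        self.flush_threshold = flush_threshold  # hypothetical tuning knob

    def put(self, row, value):
        """Insert or update: log to the WAL first, then buffer in the MemStore."""
        self.wal.append((row, value))
        self.memstore[row] = value
        self.blockcache.pop(row, None)  # toy simplification: drop any stale cached value
        if len(self.memstore) >= self.flush_threshold:
            self.flush()

    def delete(self, row):
        """Delete is modelled as writing a tombstone (None) for the row."""
        self.put(row, None)

    def flush(self):
        """Persist the MemStore as a new sorted HFile; the WAL entries are no longer needed."""
        self.hfiles.append(sorted(self.memstore.items()))
        self.memstore.clear()
        self.wal.clear()

    def get(self, row):
        """Read order: MemStore, then BlockCache, then HFiles from newest to oldest."""
        if row in self.memstore:
            return self.memstore[row]
        if row in self.blockcache:
            return self.blockcache[row]
        for hfile in reversed(self.hfiles):
            keys = [k for k, _ in hfile]
            i = bisect.bisect_left(keys, row)  # binary search works because HFiles are sorted
            if i < len(keys) and keys[i] == row:
                self.blockcache[row] = hfile[i][1]
                return hfile[i][1]
        return None


rs = ToyRegionServer(flush_threshold=2)
rs.put("row1", "a")     # insert
rs.put("row1", "b")     # update (same row, newer value)
rs.put("row2", "c")     # second distinct row triggers a flush to an HFile
print(rs.get("row1"))   # read served from the HFile, then cached
rs.delete("row1")       # tombstone lands in the WAL and the MemStore
print(rs.get("row1"))
```

In this sketch the WAL is what would allow unflushed edits to be replayed after a crash, which is why it is written before the MemStore and cleared only once the data is in an HFile.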

Add HBase Service

  • Go to the Cloudera Manager Dashboard
  • Make sure ZooKeeper is already installed, since HBase depends on it
  • Click Add Service in the cluster's drop-down menu
  • Choose HBase from the list of services
  • We will be using bigdataserver – 2, 3 and 4 as HBase Masters and bigdataserver – 5, 6 and 7 as Region Servers.
  • Review properties and complete the setup process.
  • After the setup, we can review important properties as well as the service log files to troubleshoot any issues with the HBase service.
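
Before running the wizard, it can help to write the planned role assignment down and sanity-check it. The following is a small Python sketch of that idea. The hostname spelling "bigdataserver-N", the function check_layout and the installed_services set are assumptions for illustration. Nothing here calls Cloudera Manager. Flagging hosts that hold both roles reflects the layout chosen in this post, not an HBase requirement.

```python
# Planned layout from the steps above; the hostname format is assumed.
masters = {f"bigdataserver-{n}" for n in (2, 3, 4)}
region_servers = {f"bigdataserver-{n}" for n in (5, 6, 7)}


def check_layout(masters, region_servers, installed_services):
    """Return a list of problems found in the planned HBase role layout."""
    problems = []
    if "zookeeper" not in installed_services:
        problems.append("ZooKeeper must be installed before adding HBase")
    if not masters:
        problems.append("at least one HBase Master is needed")
    if not region_servers:
        problems.append("at least one Region Server is needed")
    shared = sorted(masters & region_servers)
    if shared:
        # Not an HBase rule, only a deviation from the layout used in this post.
        problems.append("hosts with both roles: " + ", ".join(shared))
    return problems


# Hypothetical list of services already present on the cluster.
print(check_layout(masters, region_servers, {"hdfs", "zookeeper"}))
```

An empty list means the plan matches the layout described above, and any returned message points at the step to revisit.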
