HBase is a distributed data store that ships as part of the Hadoop ecosystem in most distributions, such as Cloudera and Hortonworks. It follows a master-slave architecture: the HBase Master process coordinates the cluster, and the Region Servers act as slaves that serve the data.
Key Components
Here are the key components of HBase.
- HBase Master
- HBase Region Servers
- WAL (Write Ahead Log) – Records new data that has not yet been persisted to permanent storage, so that it can be recovered after a failure.
- BlockCache – Stores frequently read data in memory.
- MemStore – Holds new data in memory until it is written to disk.
- HFiles – Store the rows as sorted KeyValues on disk.
HBase uses HDFS as the file system to persist its data. We can perform real-time operations on HBase tables, such as insert, update and delete.
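To make the interplay between these components concrete, here is a minimal toy sketch in Python of how a write and a read might move through the WAL, the in-memory write buffer, the on-disk files and the read cache. It is not HBase code: the `ToyRegion` class, its method names and the `flush_threshold` parameter are all made up for illustration, everything lives in memory, and details such as cache invalidation are simplified.

```python
TOMBSTONE = object()  # marker for a deleted row (hypothetical, for this toy model)


class ToyRegion:
    """Toy in-memory model of one region's write and read path. Not real HBase."""

    def __init__(self, flush_threshold=3):
        self.flush_threshold = flush_threshold
        self.wal = []         # log of edits that have not been flushed yet
        self.memstore = {}    # newest edits, held in memory
        self.hfiles = []      # immutable sorted lists of (row, value), newest last
        self.blockcache = {}  # values recently read from the files

    def put(self, row, value):
        self.wal.append((row, value))   # 1. record the edit in the log first
        self.memstore[row] = value      # 2. then apply it in memory
        self.blockcache.pop(row, None)  # toy shortcut: drop any stale cached value
        if len(self.memstore) >= self.flush_threshold:
            self.flush()

    def delete(self, row):
        # A delete is just another write that stores a marker.
        self.put(row, TOMBSTONE)

    def flush(self):
        # Write the buffered edits out as one sorted, immutable file.
        self.hfiles.append(sorted(self.memstore.items(), key=lambda kv: kv[0]))
        self.memstore.clear()
        self.wal.clear()  # flushed edits no longer need the log

    def get(self, row):
        if row in self.memstore:        # newest data wins
            value = self.memstore[row]
        elif row in self.blockcache:    # then recently read data
            value = self.blockcache[row]
        else:                           # finally the files, newest first
            value = TOMBSTONE
            for hfile in reversed(self.hfiles):
                cells = dict(hfile)
                if row in cells:
                    value = cells[row]
                    break
            self.blockcache[row] = value
        return None if value is TOMBSTONE else value


region = ToyRegion(flush_threshold=2)
region.put("row1", "a")
region.put("row2", "b")    # second edit reaches the threshold and triggers a flush
region.put("row1", "c")    # update: a newer value for the same row
region.delete("row2")      # delete: stored as a marker, triggers a second flush
print(region.get("row1"))  # prints c
print(region.get("row2"))  # prints None
```

The point of the sketch is the ordering: every edit is logged before it is applied in memory, and a read consults the newest data first, so updates and deletes take effect immediately even though files on disk are never modified in place.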
Add HBase Service
- Go to the Cloudera Manager Dashboard
- Make sure the ZooKeeper service is already installed, since HBase depends on it
- Click Add Service in the cluster's drop-down menu
- Choose HBase from the list of services
- We will be using bigdataserver – 2, 3 and 4 as HBase Masters and bigdataserver – 5, 6 and 7 as Region Servers.
- Review properties and complete the setup process.
- To troubleshoot any issues with the HBase service, we can review its important properties as well as the service log files.
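When reviewing the service log files, it often helps to filter out everything except warnings and errors. The following is a minimal Python sketch of such a filter. It assumes a log4j-style layout in which each entry starts with a date, a time and a level; the actual layout depends on the logging configuration of your cluster. The function name `problem_lines` and the sample lines are made up for illustration and are not real HBase output.

```python
import re

# Assumed layout: "<date> <time> <LEVEL> <rest of the message>"
LINE = re.compile(r"^(\S+ \S+)\s+(TRACE|DEBUG|INFO|WARN|ERROR|FATAL)\s+(.*)$")


def problem_lines(lines, levels=("WARN", "ERROR", "FATAL")):
    """Yield (timestamp, level, message) for entries at the given levels."""
    for line in lines:
        match = LINE.match(line.rstrip("\n"))
        # Lines that do not match (for example stack trace lines) are skipped.
        if match and match.group(2) in levels:
            yield match.groups()


# Placeholder lines for illustration only, not real HBase log output.
sample = [
    "2020-01-01 00:00:00,000 INFO  [main] example.Component: started",
    "2020-01-01 00:00:01,000 ERROR [main] example.Component: failed",
]
for timestamp, level, message in problem_lines(sample):
    print(timestamp, level, message)
```

To use it on a real log, pass an open file object instead of the sample list; the location of the log files depends on how the service was configured.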