Unlike the plain vanilla Apache Hadoop distribution and other vendor distributions, Cloudera manages configuration files a bit differently. Typically, configuration files are in /etc/hadoop/conf. With Cloudera, however, /etc/hadoop/conf only has templates; the actual properties files are managed under /var/run/cloudera-scm-agent/process on each node. The main files to know are:
- hadoop-env.sh – memory settings for daemons such as the ResourceManager and NodeManager.
- core-site.xml – NameNode URI (fs.defaultFS) as well as compression codecs (io.compression.codecs).
- yarn-site.xml – Parameters related to the ResourceManager and NodeManagers. Using appropriate values for YARN is very important; we will review them later as part of cluster planning.
- mapred-site.xml – Parameters related to the MapReduce framework.
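
The *-site.xml files listed above all share the same layout: a `<configuration>` root containing `<property>` elements, each with a `<name>` and a `<value>`. As a minimal sketch of how to inspect them, the Python below parses that layout into a dictionary. The sample document, the host name namenode.example.com, and the helper names `parse_hadoop_config` and `load_hadoop_config` are illustrative assumptions, not values from any real cluster.

```python
import xml.etree.ElementTree as ET

# Hypothetical core-site.xml content, used only for illustration.
SAMPLE_CORE_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.example.com:8020</value>
  </property>
  <property>
    <name>io.compression.codecs</name>
    <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.SnappyCodec</value>
  </property>
</configuration>
"""


def parse_hadoop_config(xml_text):
    """Return {name: value} for every <property> in a Hadoop *-site.xml document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            # Missing or empty <value> elements are stored as an empty string.
            props[name.strip()] = (value or "").strip()
    return props


def load_hadoop_config(path):
    """Read a *-site.xml file from disk and parse it."""
    with open(path, encoding="utf-8") as handle:
        return parse_hadoop_config(handle.read())


if __name__ == "__main__":
    for name, value in sorted(parse_hadoop_config(SAMPLE_CORE_SITE).items()):
        print(f"{name} = {value}")
```

To read a real file, pass its path to `load_hadoop_config`, for example a core-site.xml under /etc/hadoop/conf or under one of the process directories in /var/run/cloudera-scm-agent/process. The exact directory names there vary by cluster and role, so they are left for you to look up on the node.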