Now let us look into how we can troubleshoot Hive issues.
- We have server components such as Hive Metastore and HiveServer2 running on bigdataserver-4. To troubleshoot any issues related to them, we need to log in to bigdataserver-4 and go to /var/log/hive. There will be a separate log file for each service.
- As with many of the other services, Hive Server logs are controlled by properties defined in log4j.properties.
- Whenever we run a Hive query or command, information related to the query is logged to a file in a subdirectory of /tmp. The subdirectory is named after the OS user who submitted the job, and the file name is hive.log.
- We can change the location as well as the name of the log file using log4j.properties.
- When we run Hive queries, Hive uses the underlying processing engine to run most of them (especially SELECT queries).
- Many times the log file contains only a very high-level exception and does not give us the actual details of the issue.
- Depending on the underlying processing engine, we need to go to the job logs to get details about the actual issue. For example, if the data is processed using MapReduce, we need to check the job logs using the Job History Server.
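
The workflow above (check hive.log first, then move on to the job logs) can be sketched with a small Python helper. This is only a minimal sketch under a few assumptions: the client log sits at /tmp/<user>/hive.log as described above, error lines can be spotted by the text ` ERROR ` or `Exception`, and job or application IDs follow the usual `job_<digits>_<digits>` and `application_<digits>_<digits>` patterns. The function names `summarize_hive_log` and `scan_client_log` are made up for illustration and are not part of Hive.

```python
import re
from pathlib import Path

# Assumption: MapReduce job IDs and YARN application IDs look like
# job_<digits>_<digits> and application_<digits>_<digits>.
ID_PATTERN = re.compile(r"\b(?:application|job)_\d+_\d+\b")


def summarize_hive_log(lines):
    """Return (error_lines, ids) found in an iterable of log lines.

    error_lines: lines that look like errors (a simple heuristic).
    ids: unique job/application IDs in order of first appearance.
    """
    errors = []
    ids = []
    for line in lines:
        line = line.rstrip("\n")
        # Heuristic only: real logs may need a stricter filter.
        if " ERROR " in line or "Exception" in line:
            errors.append(line)
        for match in ID_PATTERN.findall(line):
            if match not in ids:
                ids.append(match)
    return errors, ids


def scan_client_log(user, base_dir="/tmp", file_name="hive.log"):
    """Summarize <base_dir>/<user>/<file_name>, or return None if it is missing.

    The defaults mirror the location described in the notes; they will
    differ if log4j.properties has been changed.
    """
    log_path = Path(base_dir) / user / file_name
    if not log_path.exists():
        return None
    with log_path.open(errors="replace") as handle:
        return summarize_hive_log(handle)
```

For a hypothetical OS user `hiveuser`, `scan_client_log("hiveuser")` returns the error lines plus any job or application IDs found in the log. Those IDs are what we would then look up in the Job History Server, or pass to `yarn logs -applicationId <id>` if log aggregation is enabled on the cluster.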