Let us see how we can configure a proxy for Impala.
- We have seen how to launch Impala Shell (impala-shell -i bigdataserver-5) and run commands. However, reporting tools like Tableau connect to Impala over JDBC, so we need to validate JDBC connectivity as well.
- As part of the cluster, we get a tool called beeline, which we can use to validate connecting to an Impala Daemon via JDBC (using the Hive2 JDBC driver):
beeline -u 'jdbc:hive2://bigdataserver-5:21050/default;auth=noSasl' --silent=true
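- As of now, Impala Daemons are running on all worker nodes, but we are hard-coding the address of a single server (bigdataserver-5) when connecting. If that daemon goes down, clients cannot connect even though the other daemons are still available.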
- Configuring HA
- Servers – All Servers on which impalad is running
- Proxy Server – bigdataserver-1
- Proxy Ports – 21000 for impala-shell and 21050 for JDBC
- Port Numbers (on the Impala Daemons) – 21000 for impala-shell and 21050 for JDBC
- Update the haproxy.cfg file (/etc/haproxy/haproxy.cfg) with the servers and ports listed above
- Restart the service:
sudo /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg
- Validate by running impala-shell and connecting to Impala through the proxy on bigdataserver-1 (impala-shell -i bigdataserver-1).
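
The haproxy.cfg update is the step that carries the real configuration, so here is a minimal sketch of what it might look like. It writes a config fragment to a scratch file instead of touching /etc/haproxy/haproxy.cfg directly. Several things are assumptions and not from the cluster above: the worker names bigdataserver-6 and bigdataserver-7, the output path, the timeout values, and the balance algorithms (leastconn for impala-shell, source for JDBC so that a client tends to stay on the same daemon). Adjust them to your own setup.

```shell
#!/bin/sh
# Sketch only: generate an HAProxy fragment that fronts the Impala Daemons.
# ASSUMPTIONS (not from the original setup): host list, output path,
# timeouts, and balance algorithms.
set -eu

# Hypothetical scratch file; review it before merging into /etc/haproxy/haproxy.cfg
PROXY_CFG="${PROXY_CFG:-./haproxy-impala.cfg}"
# bigdataserver-5 comes from the steps above; -6 and -7 are hypothetical workers
IMPALA_HOSTS="${IMPALA_HOSTS:-bigdataserver-5 bigdataserver-6 bigdataserver-7}"

# Print one "server" line per impalad host for the given port.
server_lines() {
  port="$1"
  for host in $IMPALA_HOSTS; do
    printf '    server %s %s:%s check\n' "$host" "$host" "$port"
  done
}

{
  cat <<'EOF'
defaults
    mode tcp
    timeout connect 5s
    timeout client  1h
    timeout server  1h

# impala-shell traffic: proxy port 21000 -> impalad port 21000
listen impala-shell
    bind *:21000
    balance leastconn
EOF
  server_lines 21000
  cat <<'EOF'

# JDBC traffic (beeline, Tableau): proxy port 21050 -> impalad port 21050
listen impala-jdbc
    bind *:21050
    balance source
EOF
  server_lines 21050
} > "$PROXY_CFG"

# Show the generated fragment for review.
cat "$PROXY_CFG"
```

Once the fragment has been merged into /etc/haproxy/haproxy.cfg on bigdataserver-1, haproxy -c -f /etc/haproxy/haproxy.cfg checks the syntax before the restart. After the restart, the same beeline command as before should work with bigdataserver-1 in place of bigdataserver-5 in the JDBC URL.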