Configure proxy for Impala

Let us see how we can configure a proxy for Impala.

  • We have seen how to launch Impala Shell (impala-shell -i bigdataserver-5) and run commands. However, reporting tools like Tableau connect to Impala over JDBC, so we need JDBC connectivity as well.
  • As part of the cluster, we get a tool called beeline, which we can use to validate connecting to an Impala Daemon via JDBC (using the Hive2 JDBC driver): beeline -u 'jdbc:hive2://bigdataserver-5:21050/default;auth=noSasl' --silent=true
  • As of now, Impala Daemons are running on all worker nodes, but we are hard-coding the address of a single server (bigdataserver-5) when connecting. If that daemon goes down, clients cannot connect even though the other daemons are still running.
  • Configuring HA
    • Servers – All Servers on which impalad is running
    • Proxy Server – bigdataserver-1
    • Proxy Ports – 21000 for impala-shell and 21050 for JDBC
    • Port Numbers (on the Impala Daemons) – 21000 for impala-shell and 21050 for JDBC
    • Update haproxy.cfg file

https://gist.github.com/dgadiraju/6e440d160d92e3fb526ea4a18de79839
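
The gist above holds the actual configuration. As a rough orientation, here is a minimal sketch of what the Impala-related part of haproxy.cfg might look like. It assumes that bigdataserver-1 does not itself run impalad (otherwise the proxy could not bind ports 21000 and 21050), and bigdataserver-6 is a hypothetical placeholder for the remaining worker nodes. The timeout values and balance modes are illustrative choices, not taken from the gist.

```
# Sketch only: bigdataserver-6 is a placeholder; list every host running impalad.
defaults
    mode    tcp
    timeout connect 5s
    timeout client  1h
    timeout server  1h

# impala-shell traffic: proxy port 21000 -> impalad port 21000
listen impala-shell
    bind *:21000
    balance leastconn
    server impalad5 bigdataserver-5:21000 check
    server impalad6 bigdataserver-6:21000 check

# JDBC traffic: proxy port 21050 -> impalad port 21050
listen impala-jdbc
    bind *:21050
    balance source
    server impalad5 bigdataserver-5:21050 check
    server impalad6 bigdataserver-6:21050 check
```

The check keyword makes HAProxy probe each daemon and stop routing to one that is down. balance source keeps a given JDBC client on the same daemon, which is commonly preferred for JDBC sessions.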

  • Restart the service – sudo /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg
  • Validate by running impala-shell and connecting to Impala with bigdataserver-1 as the proxy (impala-shell -i bigdataserver-1).
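
The validation step can be sketched as a small script that connects through the proxy on both ports. The variable names are hypothetical, and the query (select 1) is only a placeholder to confirm that a daemon answers. Each client is run only if it is installed on the machine.

```shell
#!/bin/sh
# Hypothetical variable names; host and ports come from the steps above.
PROXY_HOST="bigdataserver-1"
SHELL_PORT=21000
JDBC_PORT=21050
JDBC_URL="jdbc:hive2://${PROXY_HOST}:${JDBC_PORT}/default;auth=noSasl"

# Connect with impala-shell through the proxy and run a trivial query.
if command -v impala-shell >/dev/null 2>&1; then
    impala-shell -i "${PROXY_HOST}:${SHELL_PORT}" -q 'select 1'
fi

# Connect over JDBC through the proxy, as a reporting tool would.
if command -v beeline >/dev/null 2>&1; then
    beeline -u "${JDBC_URL}" --silent=true -e 'select 1'
fi
```

If both commands return a result, clients no longer need to know which worker node they end up on; only the proxy address is configured.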
