Let us see how to configure a proxy for HiveServer2 as well as Impala. To set up the proxy, we will use a software package called HAProxy (installed as haproxy).
- We can run multiple servers for clients to connect to.
- Running multiple servers behind a proxy (High Availability) has several advantages:
  - Load balancing
  - Transparent failover
- When we have multiple servers for client connectivity, we should not hard-code a specific server’s IP address in our applications. If that server goes down, all client traffic using that IP address is affected.
- Using a proxy, we can hide this complexity. Clients connect to the proxy's IP address, and the proxy decides which server each connection is forwarded to.
- As root, or as a user with sudo privileges, install haproxy:
sudo yum -y install haproxy
- We will set up High Availability across at least two servers.
- Enable haproxy so that it starts at boot:
sudo chkconfig haproxy on
- For each service, we need to update the haproxy configuration file, /etc/haproxy/haproxy.cfg.
- Once the services are added, haproxy has to be restarted (or reloaded) for the new configuration to take effect.
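
As a rough illustration of what "adding a service" could look like, the sketch below generates two haproxy listen sections, one for HiveServer2 and one for Impala JDBC/ODBC clients. Everything specific in it is an assumption rather than part of the setup above: the host names, the front-end ports 10001 and 21051, the output file name, and the choice of balance source (so that a given client keeps landing on the same back-end server). The back-end ports 10000 and 21050 are the usual defaults for HiveServer2 and Impala's HiveServer2-protocol port, so adjust them if your cluster differs.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical values: replace with your own hosts, ports and file name.
FRAGMENT="${FRAGMENT:-./haproxy-hive-impala.cfg}"
HS2_HOSTS=("hs2-1.example.com" "hs2-2.example.com")
IMPALAD_HOSTS=("impalad-1.example.com" "impalad-2.example.com")

# Print one "listen" section.
# Arguments: section name, front-end port, back-end port, back-end hosts...
listen_section() {
  local name="$1" front_port="$2" back_port="$3"
  shift 3
  echo "listen ${name}"
  echo "    bind 0.0.0.0:${front_port}"
  echo "    mode tcp"            # Thrift traffic is proxied as plain TCP
  echo "    balance source"      # same client IP -> same back-end server
  local i=1 host
  for host in "$@"; do
    # "check" enables health checks so a failed server is taken out of rotation
    echo "    server ${name}_${i} ${host}:${back_port} check"
    i=$((i + 1))
  done
  echo
}

# Write both sections to the fragment file, then show the result.
{
  listen_section hiveserver2 10001 10000 "${HS2_HOSTS[@]}"
  listen_section impala_jdbc 21051 21050 "${IMPALAD_HOSTS[@]}"
} > "$FRAGMENT"

cat "$FRAGMENT"
```

After reviewing the generated sections, they can be appended to /etc/haproxy/haproxy.cfg. Running sudo haproxy -c -f /etc/haproxy/haproxy.cfg checks the file for syntax errors, and sudo service haproxy restart applies the change. Long-running queries may also need longer client and server timeouts than the packaged defaults, which this sketch does not set.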