Apache Drill High Availability using HAProxy

Discussion:

drill

2018-08-20 12:47:46 UTC

Hi Team,

Good evening. I am Satish, working as a big data developer. I need your help with Drill high availability using the HAProxy load balancer.
Does Apache Drill support high availability? If yes, please let me know the process.

-Thanks,
Satish


Paul Rogers

2018-08-20 18:01:51 UTC


Hi Satish,

You did not say if you are using HAProxy for the RESTful API or the native Drill RPC (as used by the Drill client, JDBC and ODBC.)

To understand the use of proxies and load balancers, it is helpful to remember that Drill is a stateful SQL engine. Drill encourages the use of many stateful commands such as USE, CTTAS, and ALTER SESSION.

Session state is lost when connecting to a new Drillbit, or reconnecting to the same Drillbit. Thus, a query that runs fine before the reconnect can fail afterwards.

This issue is not unique to Drill; it is a common constraint of all old-school SQL engines.

If state were not an issue, then the Drill client itself could handle HA. The client is given a list of ZK nodes. The client, on encountering a disconnect, could ask ZK for a new node and reconnect. Since ZK is HA, the client can also recover from a ZK node failure by trying another.
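The reconnect loop described above could look roughly like the following. This is only a minimal sketch, not Drill client code: the function name `connect_with_failover`, the `connect` callable, and the idea that the list of Drillbit endpoints has already been read from ZK by some other means are all assumptions made for illustration. It deliberately does nothing about session state, which is the real problem.

```python
from typing import Callable, Iterable, Tuple, TypeVar

Conn = TypeVar("Conn")
Endpoint = Tuple[str, int]


def connect_with_failover(
    endpoints: Iterable[Endpoint],
    connect: Callable[[str, int], Conn],
) -> Tuple[Endpoint, Conn]:
    """Try each candidate Drillbit in order; return the first that accepts.

    `endpoints` is assumed to come from ZK (hypothetical, not shown here).
    `connect` is whatever opens a connection and raises OSError on failure.
    """
    errors = []
    for host, port in endpoints:
        try:
            return (host, port), connect(host, port)
        except OSError as exc:  # refused, timed out, unreachable, ...
            errors.append(f"{host}:{port}: {exc}")
    raise ConnectionError("no Drillbit reachable: " + "; ".join(errors))
```

On a disconnect, the caller would fetch a fresh endpoint list and call the function again. Any USE, CTTAS, or ALTER SESSION state from the old connection is gone at that point.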

We discussed this client-based HA approach multiple times, but each time, the SQL state has been a show-stopper.

In short, the issue is not whether to use HAProxy to solve the problem; Drill can do it internally in the client. The issue is how to handle session state.

A possible solution would be to store user session state in ZK so that we could re-establish the same logical session after a physical reconnection. In particular a unique session ID could be used to key connections to session state in ZK.
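To make the idea concrete, here is a rough sketch of keying session state by a session ID and replaying it after a reconnect. Nothing like this exists in Drill today as far as this thread describes; the class `SessionStateStore`, its methods, and the use of an in-memory dict as a stand-in for ZK are all hypothetical.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class SessionStateStore:
    """In-memory stand-in for a shared store such as ZK (assumption)."""

    def __init__(self) -> None:
        # session ID -> stateful statements, in the order they were issued
        self._statements: DefaultDict[str, List[str]] = defaultdict(list)

    def record(self, session_id: str, statement: str) -> None:
        """Remember a stateful statement (e.g. USE or ALTER SESSION)."""
        self._statements[session_id].append(statement)

    def replay(self, session_id: str, execute: Callable[[str], None]) -> int:
        """Re-run the recorded statements on a new physical connection.

        Returns the number of statements replayed.
        """
        statements = list(self._statements.get(session_id, []))
        for statement in statements:
            execute(statement)
        return len(statements)
```

Replaying statements is only one possible design. It would likely work for USE and ALTER SESSION, but temporary tables created with CTTAS hold data, so they would need different handling.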

Making this change would be a good contributor project: it involves detailed knowledge of how the Drill session and ZK state work, but is pretty isolated to just those specific areas.
Thanks,
- Paul


John Omernik

2018-08-27 13:22:57 UTC


This is a great topic, and one I have run into while running Drill on
Apache Mesos, where my Drillbits sit behind what is essentially a DNS load
balancer (one DNS name with multiple Drillbit IPs assigned to it). I've hit
a few issues and have a few workarounds. Note that I am talking about the
REST API here; I am not sure how this would work for the other interfaces
(perhaps the same way).

The best way, if you are using HAProxy, is to use sticky connections.
Essentially, when a user connects to HAProxy, the connection to the
backend Drillbit stays pinned there until a timeout period elapses or the
session is closed. This should give you the best user experience while
keeping HA. I am not sure how HAProxy balances things; however, with a
decent Drill cluster size, it shouldn't be an issue.
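As a toy model of what "sticky" means here (this is not HAProxy configuration, and not how HAProxy is implemented), the behaviour can be sketched as a table that maps each client to a backend and forgets the mapping after an idle timeout. The class name `StickyRouter`, the round-robin choice for new clients, and the injectable clock are assumptions for illustration.

```python
import time
from typing import Callable, Dict, List, Tuple


class StickyRouter:
    """Pin each client to one backend until it has been idle for ttl_seconds."""

    def __init__(
        self,
        backends: List[str],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        if not backends:
            raise ValueError("at least one backend is required")
        self._backends = list(backends)
        self._ttl = ttl_seconds
        self._clock = clock
        self._next = 0  # round-robin cursor for clients with no live pin
        self._table: Dict[str, Tuple[str, float]] = {}  # client -> (backend, last seen)

    def route(self, client_id: str) -> str:
        now = self._clock()
        entry = self._table.get(client_id)
        if entry is not None and now - entry[1] < self._ttl:
            backend = entry[0]  # still sticky: reuse the same Drillbit
        else:
            backend = self._backends[self._next % len(self._backends)]
            self._next += 1
        self._table[client_id] = (backend, now)  # refresh the idle timer
        return backend
```

The point of the model is that every request in a session reaches the same Drillbit, so session state survives, while new or expired clients are still spread across the cluster.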

I didn't have HAProxy set up, so what I did in my jupyter_drill module (
https://github.com/johnomernik/jupyter_drill) was handle it at the
application level: prior to connecting to Drill, I did a DNS lookup and
grabbed the first IP returned. Then I connected directly to that Drillbit
for the duration of the session. It's not perfect, and I have not tested
this at scale, but it has worked on a small scale. I even used some Python
requests module magic to use the host name in the SSL verification even
though I am connecting by IP.
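The DNS part of that workaround can be sketched with the standard library alone. This is not the code from jupyter_drill; the function names `resolve_ips` and `pin_drillbit` are made up for this example, and the SSL host name handling is left out.

```python
import socket
from typing import Callable, List


def resolve_ips(hostname: str, port: int) -> List[str]:
    """Return the distinct IPs behind a DNS name, in the order DNS gave them."""
    infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    ips: List[str] = []
    for info in infos:
        ip = info[4][0]  # sockaddr is the 5th field; its first item is the address
        if ip not in ips:
            ips.append(ip)
    return ips


def pin_drillbit(
    hostname: str,
    port: int,
    resolver: Callable[[str, int], List[str]] = resolve_ips,
) -> str:
    """Pick the first IP once; the caller reuses it for the whole session."""
    ips = resolver(hostname, port)
    if not ips:
        raise LookupError(f"no addresses found for {hostname}")
    return ips[0]
```

The application would call `pin_drillbit` once at session start and send every later request to the returned IP, so a second DNS lookup cannot silently move the session to a different Drillbit.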

So there are a few options; if you are already looking at HAProxy, check
into sticky connections.

John
