Discussion:
Apache Drill Question
Edgardo Robles
2017-06-15 00:04:07 UTC
Hi,

I set up a 3-node ZooKeeper/Drill cluster and would like to test Parquet files, but I do not want to set up HDFS. Would Drill work if I used SSHFS or GlusterFS to store the Parquet files, and would the cluster be able to query them with performance similar to HDFS? Or do SSHFS and GlusterFS work in a fundamentally different way, and I am trying to do something stupid? Thank you for any feedback. I searched on Google but did not find any links about Drill with GlusterFS or SSHFS.

-Edgardo Robles
Kunal Khatua
2017-06-15 06:54:17 UTC
I'm not familiar with the SSHFS or GlusterFS specs, but it should, in theory, work out of the box.

You can start Drill with the underlying storage plugin pointed at the local file system. I'm presuming SSHFS / GlusterFS can expose the files through a local NFS-like mount.
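
As a rough illustration of pointing a storage plugin at a local mount, here is a minimal sketch that builds a file-type plugin configuration as JSON. The mount path `/mnt/shared/parquet` and the workspace name `shared` are assumptions made for the example, not something from this thread; the mount would need to appear at the same path on every node.

```python
import json

# Hypothetical mount point; assumed to be identical on every node.
SHARED_MOUNT = "/mnt/shared/parquet"


def build_plugin_config(mount_path: str) -> dict:
    """Return a file-type storage plugin config that reads from a local mount."""
    return {
        "type": "file",
        "enabled": True,
        # file:/// selects the local file system instead of HDFS.
        "connection": "file:///",
        "workspaces": {
            # "shared" is an illustrative workspace name.
            "shared": {
                "location": mount_path,
                "writable": False,
                "defaultInputFormat": "parquet",
            }
        },
        "formats": {"parquet": {"type": "parquet"}},
    }


# Serialize the config so it can be pasted into the storage plugin editor.
config_json = json.dumps(build_plugin_config(SHARED_MOUNT), indent=2)
print(config_json)
```

The printed JSON could be pasted into the storage plugin page of the Drill Web UI; whether SSHFS or GlusterFS behaves well behind such a mount still has to be tested.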

However, if all three nodes expose the same file to their local Drillbits, it is likely that the cluster as a whole will treat it as a single file (similar to HDFS). It's something you'll need to try. A simple test would be to run a row count on a Parquet file. If you get 3x the actual count, my theory is wrong and you'll need to find a way to ensure that the 3 Drillbits don't each scan the file independently. Otherwise, you're good.
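
The row-count test could be scripted roughly as follows. This is only a sketch: it assumes Drill's REST endpoint is reachable on the default web port 8047, that the response lists rows as objects keyed by column name, and that the file path shown (hypothetical) is visible on every node. The helper names are made up for the example.

```python
import json
import urllib.request

# Assumptions: default Drill web port, illustrative file path.
DRILL_URL = "http://localhost:8047/query.json"
PARQUET_PATH = "/mnt/shared/parquet/sample.parquet"


def count_payload(path: str) -> dict:
    """Build the REST request body for a COUNT(*) over one Parquet file."""
    query = f"SELECT COUNT(*) AS row_count FROM dfs.`{path}`"
    return {"queryType": "SQL", "query": query}


def fetch_row_count(url: str, path: str) -> int:
    """POST the count query to Drill and return the reported row count."""
    request = urllib.request.Request(
        url,
        data=json.dumps(count_payload(path)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # Assumed response shape: {"rows": [{"row_count": "<n>"}], ...}
    return int(body["rows"][0]["row_count"])


def interpret(observed: int, expected: int, drillbits: int = 3) -> str:
    """Classify the observed count against the known row count."""
    if observed == expected:
        return "ok"  # the cluster scanned the file once
    if observed == expected * drillbits:
        return "duplicated"  # each Drillbit scanned the file independently
    return "unexpected"
```

With a file whose row count is known in advance, `interpret(fetch_row_count(DRILL_URL, PARQUET_PATH), expected)` returns `"duplicated"` when the count comes back multiplied by the number of Drillbits.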

Let us know how it works out! :)

~ Kunal

P.S.: There's no such thing as a stupid question if you don't already know the answer to it.
