Discussion:
S3 configuration for ceph or atmos
Raz Baluchi
2017-05-25 00:58:35 UTC
I was able to connect to the endpoint by setting the property
'fs.s3a.endpoint' to the appropriate url 'https://storage.xxx.com:8181'
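To illustrate where that property can go, here is a minimal sketch that builds a storage plugin definition to paste into the Drill Web UI. It assumes Drill's file-type plugin passes the entries of its "config" map through to the Hadoop S3A client, as the S3 storage plugin documentation describes. The helper name build_s3_plugin_config, the placeholder credentials, and the workspace and format entries are illustrative assumptions, not settings taken from this thread. The property fs.s3a.connection.ssl.enabled is a Hadoop S3A setting that already defaults to true, so spelling it out is optional.

```python
import json


def build_s3_plugin_config(bucket, endpoint, access_key, secret_key):
    """Build a Drill file-type storage plugin definition for an S3-compatible store.

    Hypothetical helper for illustration only; it is not part of Drill.
    """
    return {
        "type": "file",
        "enabled": True,
        # The bucket is addressed through the S3A connector.
        "connection": "s3a://" + bucket,
        "config": {
            # Point S3A at the non-AWS endpoint (Ceph, Atmos, ...).
            "fs.s3a.endpoint": endpoint,
            # Hadoop's default is already true; shown here only for clarity.
            "fs.s3a.connection.ssl.enabled": "true",
            "fs.s3a.access.key": access_key,
            "fs.s3a.secret.key": secret_key,
        },
        # Assumed minimal workspace and format entries; adjust as needed.
        "workspaces": {
            "root": {"location": "/", "writable": False, "defaultInputFormat": None}
        },
        "formats": {"parquet": {"type": "parquet"}},
    }


# Endpoint and bucket are the examples from this thread; the keys are placeholders.
config = build_s3_plugin_config(
    bucket="xyz",
    endpoint="https://storage.xxx.com:8181",
    access_key="YOUR_ACCESS_KEY",
    secret_key="YOUR_SECRET_KEY",
)
print(json.dumps(config, indent=2))
```

The same properties could instead be placed in core-site.xml; keeping them in the plugin definition just keeps the endpoint next to the bucket it applies to.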


I am now able to query the data in the bucket. However, as soon as I enable
the S3 plugin, the response from Drill becomes extremely slow. This is
true even if I am not querying the S3 bucket. As an example, just issuing a
'use' command takes several minutes:

with the S3 plugin disabled:

0: jdbc:drill:zk=local> use cp;
+-------+---------------------------------+
|  ok   |             summary             |
+-------+---------------------------------+
| true  | Default schema changed to [cp]  |
+-------+---------------------------------+
1 row selected (0.543 seconds)


with the S3 plugin enabled:


0: jdbc:drill:zk=local> use cp;
+-------+---------------------------------+
|  ok   |             summary             |
+-------+---------------------------------+
| true  | Default schema changed to [cp]  |
+-------+---------------------------------+
1 row selected (221.293 seconds)


The S3 bucket configured in the plugin has approximately 20,000 objects. My
assumption is that some sort of metadata scan occurs any time a command is
executed. Is that correct? Any suggestions on how to improve performance?


Thanks
I'm not sure if anyone has ever tried that. Connecting to S3 buckets (AWS)
works via the S3A library. You could file an enhancement request on JIRA
[1].
If someone has any experience with it, they can share details on the JIRA,
or work on it. You are welcome to contribute yourself.
[1] https://issues.apache.org/jira/browse/DRILL
Where would I specify to use SSL since the endpoint is https?
Hi Raz,
Please see here for an example:
https://drill.apache.org/docs/s3-storage-plugin/
Gautam
________________________________
Sent: Wednesday, May 24, 2017 7:03:12 AM
Subject: S3 configuration for ceph or atmos
Is there a guide for configuring the S3 storage plugin for non-AWS S3
storage?
As an example, we have Ceph storage that is accessible via the S3 API at
an endpoint like: "https://storage.xxx.com:8181" and bucket:"xyz"
How would I go about configuring the S3 storage plugin?
Thanks
Abhishek Girish
2017-05-25 01:08:03 UTC
Hey, thanks for sharing!

Regarding your degraded query performance, it's a known issue [1]. Please
add a comment to it, so that someone can verify this scenario when working
on it.

[1] DRILL-5089 <https://issues.apache.org/jira/browse/DRILL-5089>
Raz Baluchi
2017-05-24 13:54:26 UTC
Is there a guide for configuring the S3 storage plugin for non-AWS S3
storage?

As an example, we have Ceph storage that is accessible via the S3 API at
an endpoint like: "https://storage.xxx.com:8181" and bucket:"xyz"

How would I go about configuring the S3 storage plugin?

Thanks
