R interface to Drill heading to CRAN (last call for issues/features)
Bob Rudis
2017-06-18 01:11:38 UTC
Hey folks,

I've mentioned sergeant - <https://github.com/hrbrmstr/sergeant> -
before. It's an R package that provides an RJDBC driver, R DBI driver,
dplyr interface (with some custom functions mapped) and a REST
interface client to Apache Drill. Most of the focus/dev has been on
the dplyr interface since it provides the most "modern R-like"
experience for Drill.

If folks are unfamiliar with R's dplyr, you can get a feel for the
dplyr interface at <https://rud.is/rpubs/yelp.html> (it's a
mostly-dplyr port of the official Yelp analysis tutorial on the Drill
site-proper; some bits, such as pulling from nested JSON columns,
can't be 100% dplyr).

I have plans to submit sergeant to CRAN (the official R package
repository) this week and wanted to do a "last call" for anyone using
the package to file any issues they may be encountering or features
they would like implemented before the CRAN release.

CRAN discourages more than one update a month, hence my desire to get
everything I can into the initial CRAN release.

Major thx to Edward Visel who assisted with the dplyr 0.7.0 conversion
(not sure if he's on the list but his efforts were greatly
appreciated).

Most recently, Drill + sergeant & R were used to analyze the results
of 30 TCP port scans of over 160 million internet hosts in one of our
annual cybersecurity research efforts at Rapid7 (ref:
https://www.rapid7.com/data/national-exposure/2017.html).

Many thanks, also, to the Drill dev team. It's an awesome tool & ecosystem.

-Bob
Parth Chandra
2017-06-19 17:27:56 UTC
Hi Bob,

This is cool stuff. Glad you posted a link to it.
If you have any thoughts on improvements to Drill's APIs that would help
your effort, please post on the dev list.

Parth