Discussion:
Custom UDFs not found
Knapp, Michael
2017-05-08 21:10:40 UTC
I have followed all of the instructions here<https://drill.apache.org/docs/tutorial-develop-a-simple-function/> and also here<https://drill.apache.org/docs/manually-adding-custom-functions-to-drill/> as closely as possible, but unfortunately Drill is still not finding my custom UDF.

I have checked that:

· My source and binary jars are present in jars/3rdparty

· My jars both have “drill-module.conf” in their root, and that file’s contents are:

o drill.classpath.scanning.packages += "path.to.my.package"

o (with my real package name, which holds the Drill functions, substituted in)

· I have removed the drill.exec.udf section from my drill-override.conf file.

· I have configured my pom to build using ‘jar-no-fork’ like in your example.

· My function implements DrillSimpleFunc and is annotated with FunctionTemplate. Its scope is SIMPLE, and it uses “NULL_IF_NULL”

· My function has a NullableVarCharHolder input parameter, a NullableVarCharHolder output parameter, and also accepts an @Inject DrillBuf parameter. It is expected to be called with a single string argument.
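For reference, a minimal skeleton matching that description (function, class, and package names below are placeholders; the holder handling follows the pattern in Drill’s simple-function tutorial) would look something like this. It needs the drill-java-exec dependency on the classpath to compile:

```java
package path.to.my.package;  // must match drill.classpath.scanning.packages

import io.netty.buffer.DrillBuf;
import javax.inject.Inject;
import org.apache.drill.exec.expr.DrillSimpleFunc;
import org.apache.drill.exec.expr.annotations.FunctionTemplate;
import org.apache.drill.exec.expr.annotations.Output;
import org.apache.drill.exec.expr.annotations.Param;
import org.apache.drill.exec.expr.holders.NullableVarCharHolder;

@FunctionTemplate(
    name = "my_func",  // placeholder function name
    scope = FunctionTemplate.FunctionScope.SIMPLE,
    nulls = FunctionTemplate.NullHandling.NULL_IF_NULL)
public class MyFunc implements DrillSimpleFunc {

  @Param  NullableVarCharHolder input;
  @Output NullableVarCharHolder out;
  @Inject DrillBuf buffer;

  public void setup() {}

  public void eval() {
    // Copy the incoming bytes out of the value vector.
    byte[] bytes = new byte[input.end - input.start];
    input.buffer.getBytes(input.start, bytes);

    // ... transform bytes here ...

    // Write the result into Drill-managed off-heap memory.
    buffer = buffer.reallocIfNeeded(bytes.length);
    buffer.setBytes(0, bytes);
    out.buffer = buffer;
    out.start = 0;
    out.end = bytes.length;
  }
}
```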



From the Drill UI, I keep getting this error:



VALIDATION ERROR: From line 1, column 8 to line 1, column 29: No match found for function signature …()



SQL Query null



I have tried issuing the query with my function in all caps and also lower-case. In the logs I see that my 3rdparty jar is the first in the list of scanning jars, and the appropriate package is listed in the scanning packages. The logs indicate that 433 functions were loaded upon startup. For some reason the logs mention loading functions from the hive UDF jars, but not mine.



Other details:

· I am running zookeeper separately from Drill, but on the same node. I use drillbit.sh to run, so it’s like a cluster of one.

· This is on AWS.

· I did have a drill.exec.udf section defined previously, but it is not defined in drill-override now. I wonder if ZK persisted those values from a previous run and that is still getting used.

· I am not running Hadoop, there is no HDFS that I can add dynamic UDF jars to.

· I am using drill 1.10.



I have also tried setting “exec.udf.enable_dynamic_support” to false and restarting, but that did not resolve the issue.



I have noticed one unrelated problem: the paths used for UDFs on the file system do not match what I set in drill-override.conf. I think Drill is prepending them with a temp directory even though I provided an absolute path.

Questions:

1. Does anybody know what I am doing wrong?

2. Can I use dynamic UDFs without HDFS?

3. Are there more troubleshooting techniques I can use here? How can I list all of the known UDFs and their jars?



Michael Knapp
________________________________________________________

The information contained in this e-mail is confidential and/or proprietary to Capital One and/or its affiliates and may only be used solely in performance of work or services for Capital One. The information transmitted herewith is intended only for use by the individual or entity to which it is addressed. If the reader of this message is not the intended recipient, you are hereby notified that any review, retransmission, dissemination, distribution, copying or other use of, or taking of any action in reliance upon this information is strictly prohibited. If you have received this communication in error, please contact the sender and delete the material from your computer.
Charles Givre
2017-05-09 04:13:14 UTC
Hi Michael,
I’ve encountered this issue when developing Drill UDFs and sometimes it can mean that there is an error in the UDF itself. What is particularly insidious about these kinds of errors is that the UDF will compile and build just fine, but when you try to use it in a query, Drill can’t find the function.

I would recommend first testing the UDF on Drill in embedded mode so that way you can minimize the things which can go wrong. Next, I would comment out the entirety of the eval() and next() functions, build the UDF and see if Drill recognizes the function. If it does, then slowly start uncommenting lines to see what is breaking it.

One other thing, I believe the drill-module.conf is supposed to be in the resources folder in your project. Mine are always in <project>/src/main/resources.

Can you share any of your code?
— C
Knapp, Michael
2017-05-09 20:47:59 UTC
Thanks for your advice Charles.

First, I did have drill-module.conf in src/main/resources, which is the reason that it’s in the root of my jar file. I thought it would be better to check the jar file itself instead of my project.

I did back out all my UDF code and reached a point where it was recognized. I even got it to work where it just appends one string to the input. What I am trying to do now is to encrypt the input with a cipher. Drill is p***ing me off with how it never works and never delivers a useful error message in the logs when this fails. I believe it was a poor choice to use runtime compilation and off-heap memory for this.

At first I was depending on an external jar to do this, but in my troubleshooting, I decided to copy in the core logic.

I attached the latest code I have tried that is failing.

Michael Knapp

Ted Dunning
2017-05-10 15:49:11 UTC
Michael,

One of the key issues that I have had is that the dynamic compilation imposes a number of restrictions on how my UDFs have to work.

I have had the best luck working with very simple UDFs that refer to other classes (with fully expanded class references). The surprising constraints come up when the UDF gets dissected during the dynamic compilation, but I can avoid this entirely by putting another class in the jar. Since that class isn't dynamically compiled at all, I have full freedom to write real Java there. In such cases, I consider the UDF code to be pretty much just argument marshalling.
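A concrete (hypothetical) version of that pattern: all real logic lives in an ordinary companion class in the same jar, compiled normally, and the UDF's eval() does nothing but unpack the holders and call it with a fully qualified name. Here ROT13 stands in for whatever the real cipher logic is; the class and method names are illustrative, not from the thread:

```java
// Companion class: plain Java, never touched by Drill's runtime compiler.
// A UDF's eval() would call it as path.to.my.package.CipherHelper.transform(...).
public class CipherHelper {

    // Illustrative stand-in for real cipher logic (ROT13 on ASCII letters).
    public static String transform(String input) {
        StringBuilder sb = new StringBuilder(input.length());
        for (char c : input.toCharArray()) {
            if (c >= 'a' && c <= 'z') {
                sb.append((char) ('a' + (c - 'a' + 13) % 26));
            } else if (c >= 'A' && c <= 'Z') {
                sb.append((char) ('A' + (c - 'A' + 13) % 26));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

Because the helper is ordinary compiled bytecode, you can unit-test and debug it in isolation, and the dynamically compiled UDF body stays trivially small.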
Jim Bates
2017-05-09 21:35:50 UTC
Attachments get removed. The code will need to be in the email.

Paul Rogers
2017-05-09 22:11:55 UTC
Hi Michael,

I hear your pain with debugging Drill. Many folks are brave, strong and patient and debug Drill as a server: build the whole thing, deploy the code, adjust the config files, start the server, attach a remote debugger, find a bug, and repeat the whole shebang. I admire their fortitude.

Those of us who are too lazy for that go another route. We define unit tests that run an embedded Drillbit, so the whole thing is built and run directly from your favorite IDE (in my case Eclipse, but most prefer IntelliJ).

In embedded mode, you can make a change, press “Run” and, within 20-30 seconds, be at your breakpoint figuring out what’s what.

I’ve not tried this for UDFs, but no reason it should not work.

If you want to try this, use the latest Drill source code and look for the file ExampleTest.java. It has a collection of example unit tests that spin up an embedded Drill server and client, with a variety of config options. You can use these as examples for how to set the config options needed for your case.
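A sketch of what such a test can look like, modeled on the fixtures that ExampleTest.java demonstrates (the fixture class names here are from that test framework and have shifted between Drill releases, so verify them against the file in your source tree; my_func is a placeholder for your UDF):

```java
public class MyUdfTest {
  @Test
  public void testMyUdf() throws Exception {
    // Spin up an embedded Drillbit plus a client, entirely in-process.
    FixtureBuilder builder = ClusterFixture.builder();
    try (ClusterFixture cluster = builder.build();
         ClientFixture client = cluster.clientFixture()) {
      // Run a query against the embedded Drillbit; a breakpoint inside
      // the UDF's eval() will be hit right here in your IDE.
      client.queryBuilder()
            .sql("SELECT my_func(first_name) FROM cp.`employee.json` LIMIT 3")
            .printCsv();
    }
  }
}
```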

Plus, you can use the resulting JUnit test as the unit test for your UDF to make sure that it works as you expect in all cases. We’ve recently added a few tools to help with that also, but let’s defer that until the basics work.

If anyone tries this, feel free to contact me for the details. And, once we find the exact steps needed to do this to debug UDFs, we can post them back here to help others.

Thanks,

- Paul
Jim Bates
2017-05-08 22:53:12 UTC
It has been a while since I wrote a Drill UDF, but checking the Drill error and output log files will normally point you to any errors Drill finds while loading the UDF. Did you see anything in the logs?





-------- Original message --------
From: "Knapp, Michael" <***@capitalone.com>
Date: 5/8/17 16:11 (GMT-06:00)
To: ***@drill.apache.org
Subject: Custom UDFs not found
