Discussion:
Queries getting CANCELED
Rahul Raj
2017-10-18 02:09:35 UTC
I have a web app that generates CSV files using Drill. When the CSV size
gets larger, the query state moves to CANCELED and the results are always
partial/truncated. The same happens with larger Parquet files too, while
smaller data sets work fine.

Code snippet is similar to:

try (Connection connection = ctx.getConnection()) {
    try (Statement st = connection.createStatement()) {
        st.executeQuery("alter session set `store.format` ='csv'");
        st.executeQuery(query);
        st.executeQuery("alter session set `store.format` ='parquet'");
    }
}

The connections are wrapped in a DBCP connection pool, and I suspect the
pool is cancelling the queries. I set queryTimeout to 0 and tried adding
some delays to see if it is related to the CSV writer finishing, but the
queries still get cancelled.

When the same queries are executed from the Drill Web Console, complete
results are generated. Has anyone faced similar issues? What could be wrong
with the scenario above?

Regards,
Rahul
--
**** This email and any files transmitted with it are confidential and
intended solely for the use of the individual or entity to whom it is
addressed. If you are not the named addressee then you should not
disseminate, distribute or copy this e-mail. Please notify the sender
immediately and delete this e-mail from your system.****
Khurram Faraaz
2017-10-18 05:56:48 UTC
Can you please share the query that generates/creates the CSV?

What is the size of the CSV file?

What version of Drill are you on?


Thanks,

Khurram

Rahul Raj
2017-10-18 06:45:20 UTC
I think I found the issue: I was not reading the result set back. Just
reading the number of records written fixes the problem.

try (Connection connection = ctx.getConnection()) {
    try (Statement st = connection.createStatement()) {
        st.executeQuery("alter session set `store.format` ='csv'");

        ResultSet rs = st.executeQuery(ctasQuery);
        while (rs.next()) {
            rs.getObject(1); // Read the number of records written
        }

        st.executeQuery("alter session set `store.format` ='parquet'");
    }
}

However, in the case of CTAS, is it required to read the result set for the
records written?

Regards,
Rahul
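
For anyone landing on this thread later, the workaround above can be wrapped in a small helper so that no statement is closed with an unread result. This is only a sketch using plain JDBC calls. The class and method names (CtasRunner, executeAndDrain, runCtasAsCsv) are made up for illustration and are not part of Drill or DBCP. Whether Drill strictly requires the result to be read for CTAS is not confirmed in this thread, so the helper simply assumes that draining is the safe choice. The finally block that restores the session option is also an addition, not something from the original snippet.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper class; the names here are illustrative only.
public final class CtasRunner {
    private CtasRunner() {}

    /**
     * Executes the SQL and reads every row of its result before closing it,
     * so the statement is never closed while rows are still pending.
     * Returns the number of result rows consumed (for CTAS these are
     * summary rows, not the records written to the output files).
     */
    public static long executeAndDrain(Statement st, String sql) throws SQLException {
        long rows = 0;
        try (ResultSet rs = st.executeQuery(sql)) {
            while (rs.next()) {
                rs.getObject(1); // touch the first column, as in the fix above
                rows++;
            }
        }
        return rows;
    }

    /**
     * Switches the session to CSV output, runs the CTAS query, and restores
     * the parquet format afterwards even if the CTAS fails.
     */
    public static long runCtasAsCsv(Connection connection, String ctasQuery) throws SQLException {
        try (Statement st = connection.createStatement()) {
            executeAndDrain(st, "alter session set `store.format` = 'csv'");
            try {
                return executeAndDrain(st, ctasQuery);
            } finally {
                executeAndDrain(st, "alter session set `store.format` = 'parquet'");
            }
        }
    }
}
```

With the pooled connection from the original snippet, a call would look like CtasRunner.runCtasAsCsv(ctx.getConnection(), ctasQuery), with the connection itself still closed by the caller's try-with-resources.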