Discussion:
Exception while reading parquet data
PROJJWAL SAHA
2017-10-11 08:50:22 UTC
I get the exception below when querying Parquet data on the Oracle Storage Cloud
service.
Any pointers on what this points to?

Regards,
Projjwal


ERROR o.a.d.e.u.f.BufferedDirectBufInputStream - Error reading from stream
part-00006-25a9ae4b-fd9e-4770-b17e-9a29b270a4c2.parquet. Error was : null
2017-10-09 09:42:18,516 [scan-2] INFO o.a.d.e.s.p.c.AsyncPageReader - User
Error Occurred: Exception occurred while reading from disk.
(java.lang.IndexOutOfBoundsException)
org.apache.drill.common.exceptions.UserException: DATA_READ ERROR:
Exception occurred while reading from disk.

File:
/data25GB/storereturns/part-00006-25a9ae4b-fd9e-4770-b17e-9a29b270a4c2.parquet
Column: sr_return_time_sk
Row Group Start: 479751

[Error Id: 10680bb8-d1d6-43a1-b5e0-ef15bd8a9406 ]
at
org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:550)
~[drill-common-1.11.0.jar:1.11.0]
at
org.apache.drill.exec.store.parquet.columnreaders.AsyncPageReader.handleAndThrowException(AsyncPageReader.java:185)
[drill-java-exec-1.11.0.jar:1.11.0]
at
org.apache.drill.exec.store.parquet.columnreaders.AsyncPageReader.access$700(AsyncPageReader.java:82)
[drill-java-exec-1.11.0.jar:1.11.0]
at
org.apache.drill.exec.store.parquet.columnreaders.AsyncPageReader$AsyncPageReaderTask.call(AsyncPageReader.java:461)
[drill-java-exec-1.11.0.jar:1.11.0]
at
org.apache.drill.exec.store.parquet.columnreaders.AsyncPageReader$AsyncPageReaderTask.call(AsyncPageReader.java:381)
[drill-java-exec-1.11.0.jar:1.11.0]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_121]
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
[na:1.8.0_121]
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
[na:1.8.0_121]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
Caused by: java.io.IOException: java.lang.IndexOutOfBoundsException
at
org.apache.drill.exec.util.filereader.BufferedDirectBufInputStream.getNextBlock(BufferedDirectBufInputStream.java:185)
~[drill-java-exec-1.11.0.jar:1.11.0]
at
org.apache.drill.exec.util.filereader.BufferedDirectBufInputStream.readInternal(BufferedDirectBufInputStream.java:212)
~[drill-java-exec-1.11.0.jar:1.11.0]
at
org.apache.drill.exec.util.filereader.BufferedDirectBufInputStream.read(BufferedDirectBufInputStream.java:277)
~[drill-java-exec-1.11.0.jar:1.11.0]
at
org.apache.drill.exec.util.filereader.DirectBufInputStream.getNext(DirectBufInputStream.java:111)
~[drill-java-exec-1.11.0.jar:1.11.0]
at
org.apache.drill.exec.store.parquet.columnreaders.AsyncPageReader$AsyncPageReaderTask.call(AsyncPageReader.java:421)
[drill-java-exec-1.11.0.jar:1.11.0]
... 5 common frames omitted
Caused by: java.lang.IndexOutOfBoundsException: null
at java.nio.Buffer.checkBounds(Buffer.java:567) ~[na:1.8.0_121]
at java.nio.ByteBuffer.put(ByteBuffer.java:827) ~[na:1.8.0_121]
at java.nio.DirectByteBuffer.put(DirectByteBuffer.java:379) ~[na:1.8.0_121]
at
org.apache.parquet.hadoop.util.CompatibilityUtil.getBuf(CompatibilityUtil.java:110)
~[parquet-hadoop-1.8.1-drill-r0.jar:1.8.1-drill-r0]
at
org.apache.drill.exec.util.filereader.BufferedDirectBufInputStream.getNextBlock(BufferedDirectBufInputStream.java:182)
~[drill-java-exec-1.11.0.jar:1.11.0]
... 9 common frames omitted
2017-10-09 09:42:20,533 [26248359-2fc8-d177-c3a6-507f6857e0ea:frag:2:3]
INFO o.a.d.e.w.fragment.FragmentExecutor -
26248359-2fc8-d177-c3a6-507f6857e0ea:2:3: State change requested
AWAITING_ALLOCATION --> RUNNING
2017-10-09 09:42:20,533 [26248359-2fc8-d177-c3a6-507f6857e0ea:frag:2:3]
INFO o.a.d.e.w.f.FragmentStatusReporter -
26248359-2fc8-d177-c3a6-507f6857e0ea:2:3: State to report: RUNNING
2017-10-09 09:42:20,534 [26248359-2fc8-d177-c3a6-507f6857e0ea:frag:2:3]
INFO o.a.d.e.w.fragment.FragmentExecutor -
26248359-2fc8-d177-c3a6-507f6857e0ea:2:3: State change requested RUNNING
--> CANCELLATION_REQUESTED
2017-10-09 09:42:20,534 [26248359-2fc8-d177-c3a6-507f6857e0ea:frag:2:3]
INFO o.a.d.e.w.f.FragmentStatusReporter -
26248359-2fc8-d177-c3a6-507f6857e0ea:2:3: State to report:
CANCELLATION_REQUESTED
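For context on the root cause frames above: `java.nio.Buffer.checkBounds` throws `IndexOutOfBoundsException` when a bulk `put` is asked to copy a range that does not fit inside the source array, which is consistent with `CompatibilityUtil.getBuf` requesting more bytes than the underlying Hadoop stream actually returned. A minimal sketch of that bounds check (the class name and buffer/array sizes are illustrative, not taken from the log):

```java
import java.nio.ByteBuffer;

public class BoundsDemo {
    // Attempts to copy `length` bytes from an 8-byte source array into a
    // direct buffer and reports which exception the bounds check raises.
    static String copyResult(int length) {
        ByteBuffer direct = ByteBuffer.allocateDirect(64);
        byte[] src = new byte[8];
        try {
            // put(src, offset, length) runs Buffer.checkBounds first; a
            // length larger than src.length fails that precondition before
            // any data is moved -- the same frame seen in the trace above.
            direct.put(src, 0, length);
            return "ok";
        } catch (IndexOutOfBoundsException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(copyResult(8));   // prints: ok
        System.out.println(copyResult(16));  // prints: IndexOutOfBoundsException
    }
}
```

Note this is distinct from `BufferOverflowException` (destination too small): the exception in the trace points at an inconsistent offset/length versus the source array.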
Arjun kr
2017-10-12 01:51:30 UTC
Can you try disabling the async Parquet reader to see if the problem gets resolved?


alter session set `store.parquet.reader.pagereader.async`=false;

Thanks,

Arjun


________________________________
From: PROJJWAL SAHA <***@gmail.com>
Sent: Wednesday, October 11, 2017 2:20 PM
To: ***@drill.apache.org
Subject: Exception while reading parquet data

Kunal Khatua
2017-10-12 04:09:46 UTC
If disabling the async reader resolves the issue, could you share some additional details, such as the metadata of the Parquet files, the OS, etc.? Details describing the setup are also very helpful in identifying the cause of the error.

We had observed some similar DATA_READ errors in early iterations of the async Parquet reader, but those have been resolved. I'm presuming you're already on the latest release (i.e., Apache Drill 1.11.0).
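One piece of metadata that can be checked independently of Drill is the Parquet trailer: the format requires every file to end with a 4-byte little-endian footer length followed by the ASCII magic `PAR1`, so a damaged or truncated object-store download fails this check immediately. A small self-contained sketch (the class name and the synthetic file are mine, for illustration only):

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class ParquetTrailerCheck {
    // Returns the footer length if the file ends with the 4-byte "PAR1"
    // magic required by the Parquet format, or -1 if the trailer is damaged.
    static long footerLength(Path file) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long len = raf.length();
            if (len < 12) return -1;            // head magic + footer length + tail magic
            raf.seek(len - 8);
            byte[] trailer = new byte[8];
            raf.readFully(trailer);
            String magic = new String(trailer, 4, 4, StandardCharsets.US_ASCII);
            if (!"PAR1".equals(magic)) return -1;
            // Footer length is a 4-byte little-endian int before the magic.
            return (trailer[0] & 0xFFL) | (trailer[1] & 0xFFL) << 8
                 | (trailer[2] & 0xFFL) << 16 | (trailer[3] & 0xFFL) << 24;
        }
    }

    public static void main(String[] args) throws IOException {
        // Synthetic file with a well-formed trailer, for illustration.
        Path p = Files.createTempFile("demo", ".parquet");
        Files.write(p, new byte[] {'P','A','R','1', 0,0,0,0, 9,0,0,0, 'P','A','R','1'});
        System.out.println(footerLength(p));   // prints: 9
    }
}
```

A passing check only shows the trailer is intact; it says nothing about the column chunks in between, but it is a cheap first test when a remote store is suspected.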

-----Original Message-----
From: Arjun kr [mailto:***@outlook.com]
Sent: Wednesday, October 11, 2017 6:52 PM
To: ***@drill.apache.org
Subject: Re: Exception while reading parquet data


PROJJWAL SAHA
2017-10-12 08:19:37 UTC
Sure, I can try disabling the async Parquet reader.
Will this, however, impact the performance of queries on Parquet data?
Post by Kunal Khatua
PROJJWAL SAHA
2017-10-12 09:08:54 UTC
Hi,

Disabling the async Parquet reader doesn't solve the problem; I am getting a
similar exception.
I don't see any issue with the Parquet file itself, since the same file works
when loaded into Alluxio.

2017-10-12 04:19:50,502
[2620da63-4efb-47e2-5e2c-29b48c0194c0:frag:0:0] ERROR
o.a.d.e.u.f.BufferedDirectBufInputStream - Error reading from stream
part-00000-7ce26fde-f342-4aae-a727-71b8b7a60e63.parquet. Error was :
null
2017-10-12 04:19:50,506
[2620da63-4efb-47e2-5e2c-29b48c0194c0:frag:0:0] ERROR
o.a.d.exec.physical.impl.ScanBatch - SYSTEM ERROR:
IndexOutOfBoundsException


[Error Id: 3b7c4587-c1b8-4e79-bdaa-b2aa1516275b ]
org.apache.drill.common.exceptions.UserException: SYSTEM ERROR:
IndexOutOfBoundsException


[Error Id: 3b7c4587-c1b8-4e79-bdaa-b2aa1516275b ]
at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:550)
~[drill-common-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:249)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext(RemovingRecordBatch.java:93)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:133)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.buildSchema(HashAggBatch.java:111)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.buildSchema(ExternalSortBatch.java:264)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext(RemovingRecordBatch.java:93)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:133)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.limit.LimitRecordBatch.innerNext(LimitRecordBatch.java:115)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext(RemovingRecordBatch.java:93)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:133)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:105)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext(ScreenCreator.java:81)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:95)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:234)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:227)
[drill-java-exec-1.11.0.jar:1.11.0]
at java.security.AccessController.doPrivileged(Native Method) [na:1.8.0_121]
at javax.security.auth.Subject.doAs(Subject.java:422) [na:1.8.0_121]
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
[hadoop-common-2.7.1.jar:na]
at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:227)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
[drill-common-1.11.0.jar:1.11.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
[na:1.8.0_121]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
[na:1.8.0_121]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
Caused by: org.apache.drill.common.exceptions.DrillRuntimeException:
Error in parquet record reader.
Message:
Hadoop path: /data1GBparquet/storereturns/part-00000-7ce26fde-f342-4aae-a727-71b8b7a60e63.parquet
Total records read: 0
Row group index: 0
Records in row group: 287514
Parquet Metadata: ParquetMetaData{FileMetaData{schema: message spark_schema {
optional int32 sr_returned_date_sk;
optional int32 sr_return_time_sk;
optional int32 sr_item_sk;
optional int32 sr_customer_sk;
optional int32 sr_cdemo_sk;
optional int32 sr_hdemo_sk;
optional int32 sr_addr_sk;
optional int32 sr_store_sk;
optional int32 sr_reason_sk;
optional int32 sr_ticket_number;
optional int32 sr_return_quantity;
optional double sr_return_amt;
optional double sr_return_tax;
optional double sr_return_amt_inc_tax;
optional double sr_fee;
optional double sr_return_ship_cost;
optional double sr_refunded_cash;
optional double sr_reversed_charge;
optional double sr_store_credit;
optional double sr_net_loss;
optional binary sr_dummycol (UTF8);
}
, metadata: {org.apache.spark.sql.parquet.row.metadata={"type":"struct","fields":[{"name":"sr_returned_date_sk","type":"integer","nullable":true,"metadata":{}},{"name":"sr_return_time_sk","type":"integer","nullable":true,"metadata":{}},{"name":"sr_item_sk","type":"integer","nullable":true,"metadata":{}},{"name":"sr_customer_sk","type":"integer","nullable":true,"metadata":{}},{"name":"sr_cdemo_sk","type":"integer","nullable":true,"metadata":{}},{"name":"sr_hdemo_sk","type":"integer","nullable":true,"metadata":{}},{"name":"sr_addr_sk","type":"integer","nullable":true,"metadata":{}},{"name":"sr_store_sk","type":"integer","nullable":true,"metadata":{}},{"name":"sr_reason_sk","type":"integer","nullable":true,"metadata":{}},{"name":"sr_ticket_number","type":"integer","nullable":true,"metadata":{}},{"name":"sr_return_quantity","type":"integer","nullable":true,"metadata":{}},{"name":"sr_return_amt","type":"double","nullable":true,"metadata":{}},{"name":"sr_return_tax","type":"double","nullable":true,"metadata":{}},{"name":"sr_return_amt_inc_tax","type":"double","nullable":true,"metadata":{}},{"name":"sr_fee","type":"double","nullable":true,"metadata":{}},{"name":"sr_return_ship_cost","type":"double","nullable":true,"metadata":{}},{"name":"sr_refunded_cash","type":"double","nullable":true,"metadata":{}},{"name":"sr_reversed_charge","type":"double","nullable":true,"metadata":{}},{"name":"sr_store_credit","type":"double","nullable":true,"metadata":{}},{"name":"sr_net_loss","type":"double","nullable":true,"metadata":{}},{"name":"sr_dummycol","type":"string","nullable":true,"metadata":{}}]}}},
blocks: [BlockMetaData{287514, 18570101 [ColumnMetaData{UNCOMPRESSED
[sr_returned_date_sk] INT32 [RLE, PLAIN_DICTIONARY, BIT_PACKED], 4},
ColumnMetaData{UNCOMPRESSED [sr_return_time_sk] INT32 [RLE,
PLAIN_DICTIONARY, BIT_PACKED], 417866}, ColumnMetaData{UNCOMPRESSED
[sr_item_sk] INT32 [RLE, PLAIN_DICTIONARY, BIT_PACKED], 1096347},
ColumnMetaData{UNCOMPRESSED [sr_customer_sk] INT32 [RLE,
PLAIN_DICTIONARY, BIT_PACKED], 1708118}, ColumnMetaData{UNCOMPRESSED
[sr_cdemo_sk] INT32 [RLE, PLAIN, BIT_PACKED], 2674001},
ColumnMetaData{UNCOMPRESSED [sr_hdemo_sk] INT32 [RLE,
PLAIN_DICTIONARY, BIT_PACKED], 3812205}, ColumnMetaData{UNCOMPRESSED
[sr_addr_sk] INT32 [RLE, PLAIN_DICTIONARY, BIT_PACKED], 4320246},
ColumnMetaData{UNCOMPRESSED [sr_store_sk] INT32 [RLE,
PLAIN_DICTIONARY, BIT_PACKED], 5102635}, ColumnMetaData{UNCOMPRESSED
[sr_reason_sk] INT32 [RLE, PLAIN_DICTIONARY, BIT_PACKED], 5235151},
ColumnMetaData{UNCOMPRESSED [sr_ticket_number] INT32 [RLE, PLAIN,
BIT_PACKED], 5471579}, ColumnMetaData{UNCOMPRESSED
[sr_return_quantity] INT32 [RLE, PLAIN_DICTIONARY, BIT_PACKED],
6621731}, ColumnMetaData{UNCOMPRESSED [sr_return_amt] DOUBLE [RLE,
PLAIN_DICTIONARY, BIT_PACKED], 6893357}, ColumnMetaData{UNCOMPRESSED
[sr_return_tax] DOUBLE [RLE, PLAIN_DICTIONARY, BIT_PACKED], 8419465},
ColumnMetaData{UNCOMPRESSED [sr_return_amt_inc_tax] DOUBLE [RLE,
PLAIN, PLAIN_DICTIONARY, BIT_PACKED], 9201856},
ColumnMetaData{UNCOMPRESSED [sr_fee] DOUBLE [RLE, PLAIN_DICTIONARY,
BIT_PACKED], 11366007}, ColumnMetaData{UNCOMPRESSED
[sr_return_ship_cost] DOUBLE [RLE, PLAIN_DICTIONARY, BIT_PACKED],
11959880}, ColumnMetaData{UNCOMPRESSED [sr_refunded_cash] DOUBLE
[RLE, PLAIN_DICTIONARY, BIT_PACKED], 13218730},
ColumnMetaData{UNCOMPRESSED [sr_reversed_charge] DOUBLE [RLE,
PLAIN_DICTIONARY, BIT_PACKED], 14635937}, ColumnMetaData{UNCOMPRESSED
[sr_store_credit] DOUBLE [RLE, PLAIN_DICTIONARY, BIT_PACKED],
15824898}, ColumnMetaData{UNCOMPRESSED [sr_net_loss] DOUBLE [RLE,
PLAIN_DICTIONARY, BIT_PACKED], 17004301}, ColumnMetaData{UNCOMPRESSED
[sr_dummycol] BINARY [RLE, PLAIN, BIT_PACKED], 18570072}]}]}
at org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.handleException(ParquetRecordReader.java:272)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.next(ParquetRecordReader.java:299)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:180)
[drill-java-exec-1.11.0.jar:1.11.0]
... 60 common frames omitted
Caused by: java.io.IOException: java.lang.IndexOutOfBoundsException
at org.apache.drill.exec.util.filereader.BufferedDirectBufInputStream.getNextBlock(BufferedDirectBufInputStream.java:185)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.util.filereader.BufferedDirectBufInputStream.readInternal(BufferedDirectBufInputStream.java:212)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.util.filereader.BufferedDirectBufInputStream.read(BufferedDirectBufInputStream.java:277)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.util.filereader.DirectBufInputStream.getNext(DirectBufInputStream.java:111)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.PageReader.readPage(PageReader.java:216)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.PageReader.nextInternal(PageReader.java:283)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.PageReader.next(PageReader.java:307)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.NullableColumnReader.processPages(NullableColumnReader.java:69)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.BatchReader.readAllFixedFieldsSerial(BatchReader.java:63)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.BatchReader.readAllFixedFields(BatchReader.java:56)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.BatchReader$FixedWidthReader.readRecords(BatchReader.java:143)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.BatchReader.readBatch(BatchReader.java:42)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.next(ParquetRecordReader.java:297)
~[drill-java-exec-1.11.0.jar:1.11.0]
... 61 common frames omitted
Caused by: java.lang.IndexOutOfBoundsException: null
at java.nio.Buffer.checkBounds(Buffer.java:567) ~[na:1.8.0_121]
at java.nio.ByteBuffer.put(ByteBuffer.java:827) ~[na:1.8.0_121]
at java.nio.DirectByteBuffer.put(DirectByteBuffer.java:379) ~[na:1.8.0_121]
at org.apache.parquet.hadoop.util.CompatibilityUtil.getBuf(CompatibilityUtil.java:110)
~[parquet-hadoop-1.8.1-drill-r0.jar:1.8.1-drill-r0]
at org.apache.drill.exec.util.filereader.BufferedDirectBufInputStream.getNextBlock(BufferedDirectBufInputStream.java:182)
~[drill-java-exec-1.11.0.jar:1.11.0]
... 73 common frames omitted
2017-10-12 04:19:50,506
[2620da63-4efb-47e2-5e2c-29b48c0194c0:frag:0:0] INFO
o.a.d.e.w.fragment.FragmentExecutor -
2620da63-4efb-47e2-5e2c-29b48c0194c0:0:0: State change requested
RUNNING --> FAILED
2017-10-12 04:19:50,507
[2620da63-4efb-47e2-5e2c-29b48c0194c0:frag:0:0] INFO
o.a.d.e.w.fragment.FragmentExecutor -
2620da63-4efb-47e2-5e2c-29b48c0194c0:0:0: State change requested
FAILED --> FINISHED
2017-10-12 04:19:50,533 [BitServer-2] WARN
o.a.drill.exec.work.foreman.Foreman - Dropping request to move to
COMPLETED state as query is already at FAILED state (which is
terminal).
2017-10-12 04:19:50,533 [BitServer-2] WARN
o.a.d.e.w.b.ControlMessageHandler - Dropping request to cancel
fragment. 2620da63-4efb-47e2-5e2c-29b48c0194c0:0:0 does not exist.
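For reference, the innermost frames in the trace above come from java.nio.Buffer.checkBounds, which throws IndexOutOfBoundsException when a put is asked to copy more bytes than its source array actually holds (this check runs before any capacity check on the destination buffer). A minimal standalone sketch of that behavior, with hypothetical sizes unrelated to Drill's actual buffers:

```java
import java.nio.ByteBuffer;

public class BoundsCheckDemo {
    public static void main(String[] args) {
        // Destination direct buffer with plenty of room.
        ByteBuffer buf = ByteBuffer.allocateDirect(64);
        byte[] src = new byte[16];
        try {
            // Buffer.checkBounds rejects this call because offset + length
            // (0 + 24) runs past the end of the 16-byte source array,
            // yielding the same IndexOutOfBoundsException seen in the trace.
            buf.put(src, 0, src.length + 8);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("IndexOutOfBoundsException: length exceeds source array");
        }
    }
}
```

This suggests the read path asked CompatibilityUtil.getBuf to copy a length inconsistent with the backing array, rather than the destination buffer simply being too small.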
Post by PROJJWAL SAHA
Sure, I can try disabling the async parquet reader.
Will this, however, impact the performance of queries on parquet data?
Post by Kunal Khatua
If this resolves the issue, could you share some additional details, such
as the metadata of the Parquet files, the OS, etc.? Details describing the
setup are also very helpful in identifying what could be the cause of the
error.
We had observed some similar DATA_READ errors in the early iterations of
the Async Parquet reader, but those have been resolved. I'm presuming
you're already on the latest (i.e., Apache Drill 1.11.0).
-----Original Message-----
Sent: Wednesday, October 11, 2017 6:52 PM
Subject: Re: Exception while reading parquet data
Can you try disabling the async parquet reader to see if the problem gets resolved?
alter session set `store.parquet.reader.pagereader.async`=false;
Thanks,
Arjun
________________________________
Sent: Wednesday, October 11, 2017 2:20 PM
Subject: Exception while reading parquet data
I get the below exception when querying parquet data on the Oracle Storage Cloud service.
Any pointers on what this points to?
Regards,
Projjwal
ERROR o.a.d.e.u.f.BufferedDirectBufInputStream - Error reading from
stream part-00006-25a9ae4b-fd9e-4770-b17e-9a29b270a4c2.parquet. Error
was : null
2017-10-09 09:42:18,516 [scan-2] INFO o.a.d.e.s.p.c.AsyncPageReader -
User Error Occurred: Exception occurred while reading from disk.
(java.lang.IndexOutOfBoundsException)
Exception occurred while reading from disk.
File: /data25GB/storereturns/part-00006-25a9ae4b-fd9e-4770-b17e-9a29b270a4c2.parquet
Column: sr_return_time_sk
Row Group Start: 479751
[Error Id: 10680bb8-d1d6-43a1-b5e0-ef15bd8a9406 ]
at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:550) ~[drill-common-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.AsyncPageReader.handleAndThrowException(AsyncPageReader.java:185) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.AsyncPageReader.access$700(AsyncPageReader.java:82) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.AsyncPageReader$AsyncPageReaderTask.call(AsyncPageReader.java:461) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.AsyncPageReader$AsyncPageReaderTask.call(AsyncPageReader.java:381) [drill-java-exec-1.11.0.jar:1.11.0]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_121]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_121]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_121]
Caused by: java.io.IOException: java.lang.IndexOutOfBoundsException
at org.apache.drill.exec.util.filereader.BufferedDirectBufInputStream.getNextBlock(BufferedDirectBufInputStream.java:185) ~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.util.filereader.BufferedDirectBufInputStream.readInternal(BufferedDirectBufInputStream.java:212) ~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.util.filereader.BufferedDirectBufInputStream.read(BufferedDirectBufInputStream.java:277) ~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.util.filereader.DirectBufInputStream.getNext(DirectBufInputStream.java:111) ~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.AsyncPageReader$AsyncPageReaderTask.call(AsyncPageReader.java:421) [drill-java-exec-1.11.0.jar:1.11.0]
... 5 common frames omitted
Caused by: java.lang.IndexOutOfBoundsException: null
at java.nio.Buffer.checkBounds(Buffer.java:567) ~[na:1.8.0_121]
at java.nio.ByteBuffer.put(ByteBuffer.java:827) ~[na:1.8.0_121]
at java.nio.DirectByteBuffer.put(DirectByteBuffer.java:379) ~[na:1.8.0_121]
at org.apache.parquet.hadoop.util.CompatibilityUtil.getBuf(CompatibilityUtil.java:110) ~[parquet-hadoop-1.8.1-drill-r0.jar:1.8.1-drill-r0]
at org.apache.drill.exec.util.filereader.BufferedDirectBufInputStream.getNextBlock(BufferedDirectBufInputStream.java:182) ~[drill-java-exec-1.11.0.jar:1.11.0]
... 9 common frames omitted
2017-10-09 09:42:20,533 [26248359-2fc8-d177-c3a6-507f6857e0ea:frag:2:3]
INFO o.a.d.e.w.fragment.FragmentExecutor -
26248359-2fc8-d177-c3a6-507f6857e0ea:2:3: State change requested
AWAITING_ALLOCATION --> RUNNING
2017-10-09 09:42:20,533 [26248359-2fc8-d177-c3a6-507f6857e0ea:frag:2:3]
INFO o.a.d.e.w.f.FragmentStatusReporter -
26248359-2fc8-d177-c3a6-507f6857e0ea:2:3: State to report: RUNNING
2017-10-09 09:42:20,534 [26248359-2fc8-d177-c3a6-507f6857e0ea:frag:2:3]
INFO o.a.d.e.w.fragment.FragmentExecutor -
26248359-2fc8-d177-c3a6-507f6857e0ea:2:3: State change requested RUNNING
--> CANCELLATION_REQUESTED
2017-10-09 09:42:20,534 [26248359-2fc8-d177-c3a6-507f6857e0ea:frag:2:3]
INFO o.a.d.e.w.f.FragmentStatusReporter -
CANCELLATION_REQUESTED
Parth Chandra
2017-10-12 17:28:51 UTC
Permalink
Seems like a bug in BufferedDirectBufInputStream. Is it possible to share
a minimal data file that triggers this?

You can also try turning off the buffering reader:
store.parquet.reader.pagereader.bufferedread=false

With the async reader on and buffering off, you might not see any
degradation in performance in most cases.
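Taken together with the earlier suggestion, both page-reader options from this thread can be toggled at the session level, e.g.:

```sql
-- Disable the asynchronous page reader (suggested earlier in the thread).
alter session set `store.parquet.reader.pagereader.async`=false;

-- Or keep async on but turn off the buffering reader.
alter session set `store.parquet.reader.pagereader.bufferedread`=false;
```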
Post by PROJJWAL SAHA
Hi,
Disabling the async parquet reader doesn't solve the problem; I am getting
a similar exception.
I don't see any issue with the parquet file itself, since the same file
works when loaded via Alluxio.
PROJJWAL SAHA
2017-10-15 15:07:07 UTC
Permalink
Is there any place where I can upload the 12 MB parquet data? I am not able
to send the file through mail to the user group.
Post by Parth Chandra
Seems like a bug in BufferedDirectBufInputStream. Is it possible to share
a minimal data file that triggers this?
You can also try turning off the buffering reader.
store.parquet.reader.pagereader.bufferedread=false
With async reader on and buffering off, you might not see any degradation
in performance in most cases.
Post by PROJJWAL SAHA
hi,
disabling sync parquet reader doesnt solve the problem. I am getting
similar exception
I dont see any issue with the parquet file since the same file works on
loading the same on alluxio.
2017-10-12 04:19:50,502
[2620da63-4efb-47e2-5e2c-29b48c0194c0:frag:0:0] ERROR
o.a.d.e.u.f.BufferedDirectBufInputStream - Error reading from stream
null
2017-10-12 04:19:50,506
[2620da63-4efb-47e2-5e2c-29b48c0194c0:frag:0:0] ERROR
IndexOutOfBoundsException
[Error Id: 3b7c4587-c1b8-4e79-bdaa-b2aa1516275b ]
IndexOutOfBoundsException
[Error Id: 3b7c4587-c1b8-4e79-bdaa-b2aa1516275b ]
at org.apache.drill.common.exceptions.UserException$
Builder.build(UserException.java:550)
~[drill-common-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.ScanBatch.next(
ScanBatch.java:249)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:119)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:109)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractSingleRecordBatch.
innerNext(AbstractSingleRecordBatch.java:51)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:162)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:119)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:109)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractSingleRecordBatch.
innerNext(AbstractSingleRecordBatch.java:51)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.svremover.
RemovingRecordBatch.innerNext(RemovingRecordBatch.java:93)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:162)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:119)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:109)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractSingleRecordBatch.
innerNext(AbstractSingleRecordBatch.java:51)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.project.
ProjectRecordBatch.innerNext(ProjectRecordBatch.java:133)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:162)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:119)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:109)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.aggregate.
HashAggBatch.buildSchema(HashAggBatch.java:111)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:142)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:119)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:109)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.xsort.
ExternalSortBatch.buildSchema(ExternalSortBatch.java:264)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:142)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:119)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:109)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractSingleRecordBatch.
innerNext(AbstractSingleRecordBatch.java:51)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.svremover.
RemovingRecordBatch.innerNext(RemovingRecordBatch.java:93)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:162)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:119)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:109)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractSingleRecordBatch.
innerNext(AbstractSingleRecordBatch.java:51)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.project.
ProjectRecordBatch.innerNext(ProjectRecordBatch.java:133)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:162)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:119)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:109)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractSingleRecordBatch.
innerNext(AbstractSingleRecordBatch.java:51)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.limit.
LimitRecordBatch.innerNext(LimitRecordBatch.java:115)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:162)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:119)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:109)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractSingleRecordBatch.
innerNext(AbstractSingleRecordBatch.java:51)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.svremover.
RemovingRecordBatch.innerNext(RemovingRecordBatch.java:93)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:162)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:119)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:109)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractSingleRecordBatch.
innerNext(AbstractSingleRecordBatch.java:51)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.project.
ProjectRecordBatch.innerNext(ProjectRecordBatch.java:133)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:162)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.BaseRootExec.
next(BaseRootExec.java:105)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.ScreenCreator$
ScreenRoot.innerNext(ScreenCreator.java:81)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.BaseRootExec.
next(BaseRootExec.java:95)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.work.fragment.FragmentExecutor$1.
run(FragmentExecutor.java:234)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.work.fragment.FragmentExecutor$1.
run(FragmentExecutor.java:227)
[drill-java-exec-1.11.0.jar:1.11.0]
at java.security.AccessController.doPrivileged(Native Method) [na:1.8.0_121]
at javax.security.auth.Subject.doAs(Subject.java:422) [na:1.8.0_121]
at org.apache.hadoop.security.UserGroupInformation.doAs(
UserGroupInformation.java:1657)
[hadoop-common-2.7.1.jar:na]
at org.apache.drill.exec.work.fragment.FragmentExecutor.run(
FragmentExecutor.java:227)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.common.SelfCleaningRunnable.run(
SelfCleaningRunnable.java:38)
[drill-common-1.11.0.jar:1.11.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(
ThreadPoolExecutor.java:1142)
[na:1.8.0_121]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(
ThreadPoolExecutor.java:617)
[na:1.8.0_121]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
Error in parquet record reader.
Hadoop path: /data1GBparquet/storereturns/part-00000-7ce26fde-f342-4aae-
a727-71b8b7a60e63.parquet
Total records read: 0
Row group index: 0
Records in row group: 287514
Parquet Metadata: ParquetMetaData{FileMetaData{schema: message spark_schema {
optional int32 sr_returned_date_sk;
optional int32 sr_return_time_sk;
optional int32 sr_item_sk;
optional int32 sr_customer_sk;
optional int32 sr_cdemo_sk;
optional int32 sr_hdemo_sk;
optional int32 sr_addr_sk;
optional int32 sr_store_sk;
optional int32 sr_reason_sk;
optional int32 sr_ticket_number;
optional int32 sr_return_quantity;
optional double sr_return_amt;
optional double sr_return_tax;
optional double sr_return_amt_inc_tax;
optional double sr_fee;
optional double sr_return_ship_cost;
optional double sr_refunded_cash;
optional double sr_reversed_charge;
optional double sr_store_credit;
optional double sr_net_loss;
optional binary sr_dummycol (UTF8);
}
, metadata: {org.apache.spark.sql.parquet.row.metadata={"type":"struct",
"fields":[{"name":"sr_returned_date_sk","type":"
integer","nullable":true,"
metadata":{}},{"name":"sr_return_time_sk","type":"
integer","nullable":true,"metadata":{}},{"name":"sr_
item_sk","type":"integer","nullable":true,"metadata":{}},
true,"metadata":{}},{"name":"sr_cdemo_sk","type":"integer",
"integer","nullable":true,"metadata":{}},{"name":"sr_
addr_sk","type":"integer","nullable":true,"metadata":{}},
{"name":"sr_store_sk","type":"integer","nullable":true,"
metadata":{}},{"name":"sr_reason_sk","type":"integer","
nullable":true,"metadata":{}},{"name":"sr_ticket_number","
type":"integer","nullable":true,"metadata":{}},{"name":"
sr_return_quantity","type":"integer","nullable":true,"
metadata":{}},{"name":"sr_return_amt","type":"double","
nullable":true,"metadata":{}},{"name":"sr_return_tax","type"
:"double","nullable":true,"metadata":{}},{"name":"sr_
return_amt_inc_tax","type":"double","nullable":true,"
true,"metadata":{}},{"name":"sr_return_ship_cost","type":"
double","nullable":true,"metadata":{}},{"name":"sr_
refunded_cash","type":"double","nullable":true,"metadata":{}
true,"metadata":{}},{"name":"sr_store_credit","type":"
double","nullable":true,"metadata":{}},{"name":"sr_net_
loss","type":"double","nullable":true,"metadata":{}},
{"name":"sr_dummycol","type":"string","nullable":true,"
metadata":{}}]}}},
blocks: [BlockMetaData{287514, 18570101 [ColumnMetaData{UNCOMPRESSED
[sr_returned_date_sk] INT32 [RLE, PLAIN_DICTIONARY, BIT_PACKED], 4},
ColumnMetaData{UNCOMPRESSED [sr_return_time_sk] INT32 [RLE,
PLAIN_DICTIONARY, BIT_PACKED], 417866}, ColumnMetaData{UNCOMPRESSED
[sr_item_sk] INT32 [RLE, PLAIN_DICTIONARY, BIT_PACKED], 1096347},
ColumnMetaData{UNCOMPRESSED [sr_customer_sk] INT32 [RLE,
PLAIN_DICTIONARY, BIT_PACKED], 1708118}, ColumnMetaData{UNCOMPRESSED
[sr_cdemo_sk] INT32 [RLE, PLAIN, BIT_PACKED], 2674001},
ColumnMetaData{UNCOMPRESSED [sr_hdemo_sk] INT32 [RLE,
PLAIN_DICTIONARY, BIT_PACKED], 3812205}, ColumnMetaData{UNCOMPRESSED
[sr_addr_sk] INT32 [RLE, PLAIN_DICTIONARY, BIT_PACKED], 4320246},
ColumnMetaData{UNCOMPRESSED [sr_store_sk] INT32 [RLE,
PLAIN_DICTIONARY, BIT_PACKED], 5102635}, ColumnMetaData{UNCOMPRESSED
[sr_reason_sk] INT32 [RLE, PLAIN_DICTIONARY, BIT_PACKED], 5235151},
ColumnMetaData{UNCOMPRESSED [sr_ticket_number] INT32 [RLE, PLAIN,
BIT_PACKED], 5471579}, ColumnMetaData{UNCOMPRESSED
[sr_return_quantity] INT32 [RLE, PLAIN_DICTIONARY, BIT_PACKED],
6621731}, ColumnMetaData{UNCOMPRESSED [sr_return_amt] DOUBLE [RLE,
PLAIN_DICTIONARY, BIT_PACKED], 6893357}, ColumnMetaData{UNCOMPRESSED
[sr_return_tax] DOUBLE [RLE, PLAIN_DICTIONARY, BIT_PACKED], 8419465},
ColumnMetaData{UNCOMPRESSED [sr_return_amt_inc_tax] DOUBLE [RLE,
PLAIN, PLAIN_DICTIONARY, BIT_PACKED], 9201856},
ColumnMetaData{UNCOMPRESSED [sr_fee] DOUBLE [RLE, PLAIN_DICTIONARY,
BIT_PACKED], 11366007}, ColumnMetaData{UNCOMPRESSED
[sr_return_ship_cost] DOUBLE [RLE, PLAIN_DICTIONARY, BIT_PACKED],
11959880}, ColumnMetaData{UNCOMPRESSED [sr_refunded_cash] DOUBLE
[RLE, PLAIN_DICTIONARY, BIT_PACKED], 13218730},
ColumnMetaData{UNCOMPRESSED [sr_reversed_charge] DOUBLE [RLE,
PLAIN_DICTIONARY, BIT_PACKED], 14635937}, ColumnMetaData{UNCOMPRESSED
[sr_store_credit] DOUBLE [RLE, PLAIN_DICTIONARY, BIT_PACKED],
15824898}, ColumnMetaData{UNCOMPRESSED [sr_net_loss] DOUBLE [RLE,
PLAIN_DICTIONARY, BIT_PACKED], 17004301}, ColumnMetaData{UNCOMPRESSED
[sr_dummycol] BINARY [RLE, PLAIN, BIT_PACKED], 18570072}]}]}
at org.apache.drill.exec.store.parquet.columnreaders.
ParquetRecordReader.handleException(ParquetRecordReader.java:272)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.
ParquetRecordReader.next(ParquetRecordReader.java:299)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.ScanBatch.next(
ScanBatch.java:180)
[drill-java-exec-1.11.0.jar:1.11.0]
... 60 common frames omitted
Caused by: java.io.IOException: java.lang.IndexOutOfBoundsException
at org.apache.drill.exec.util.filereader.
BufferedDirectBufInputStream.getNextBlock(BufferedDirectBufInputStream.
java:185)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.util.filereader.
BufferedDirectBufInputStream.readInternal(BufferedDirectBufInputStream.
java:212)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.util.filereader.
BufferedDirectBufInputStream.read(BufferedDirectBufInputStream.java:277)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.util.filereader.
DirectBufInputStream.getNext(DirectBufInputStream.java:111)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.
PageReader.readPage(PageReader.java:216)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.
PageReader.nextInternal(PageReader.java:283)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.
PageReader.next(PageReader.java:307)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.
NullableColumnReader.processPages(NullableColumnReader.java:69)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.
BatchReader.
readAllFixedFieldsSerial(BatchReader.java:63)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.
BatchReader.
readAllFixedFields(BatchReader.java:56)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.
BatchReader$FixedWidthReader.readRecords(BatchReader.java:143)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.
BatchReader.readBatch(BatchReader.java:42)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.
ParquetRecordReader.next(ParquetRecordReader.java:297)
~[drill-java-exec-1.11.0.jar:1.11.0]
... 61 common frames omitted
Caused by: java.lang.IndexOutOfBoundsException: null
at java.nio.Buffer.checkBounds(Buffer.java:567) ~[na:1.8.0_121]
at java.nio.ByteBuffer.put(ByteBuffer.java:827) ~[na:1.8.0_121]
at java.nio.DirectByteBuffer.put(DirectByteBuffer.java:379) ~[na:1.8.0_121]
at org.apache.parquet.hadoop.util.CompatibilityUtil.getBuf(
CompatibilityUtil.java:110)
~[parquet-hadoop-1.8.1-drill-r0.jar:1.8.1-drill-r0]
at org.apache.drill.exec.util.filereader.
BufferedDirectBufInputStream.getNextBlock(BufferedDirectBufInputStream.
java:182)
~[drill-java-exec-1.11.0.jar:1.11.0]
... 73 common frames omitted
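The innermost frames (java.nio.Buffer.checkBounds via ByteBuffer.put) are the generic JDK bounds check: a put() whose offset/length fall outside the source array's bounds. A minimal standalone illustration of that failure mode (plain JDK only, not Drill code; the buffer sizes here are made up):

```java
import java.nio.ByteBuffer;

public class PutBoundsDemo {
    // Mirrors the checkBounds frame above: put(byte[], int, int) throws
    // IndexOutOfBoundsException when offset + length exceeds the source
    // array's bounds, before any bytes are copied.
    static boolean triggersBoundsCheck() {
        ByteBuffer buf = ByteBuffer.allocateDirect(64); // plenty of room in the destination
        byte[] src = new byte[4];
        try {
            buf.put(src, 0, 16); // length 16 > src.length 4
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(triggersBoundsCheck()); // true
    }
}
```

In the Drill trace the same check fires inside CompatibilityUtil.getBuf, which suggests the stream reported a read length inconsistent with the buffer it was copying from or into — consistent with Parth's suspicion of a bug in BufferedDirectBufInputStream rather than a corrupt file.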
2017-10-12 04:19:50,506
[2620da63-4efb-47e2-5e2c-29b48c0194c0:frag:0:0] INFO
o.a.d.e.w.fragment.FragmentExecutor -
2620da63-4efb-47e2-5e2c-29b48c0194c0:0:0: State change requested
RUNNING --> FAILED
2017-10-12 04:19:50,507
[2620da63-4efb-47e2-5e2c-29b48c0194c0:frag:0:0] INFO
o.a.d.e.w.fragment.FragmentExecutor -
2620da63-4efb-47e2-5e2c-29b48c0194c0:0:0: State change requested
FAILED --> FINISHED
2017-10-12 04:19:50,533 [BitServer-2] WARN
o.a.drill.exec.work.foreman.Foreman - Dropping request to move to
COMPLETED state as query is already at FAILED state (which is
terminal).
2017-10-12 04:19:50,533 [BitServer-2] WARN
o.a.d.e.w.b.ControlMessageHandler - Dropping request to cancel
fragment. 2620da63-4efb-47e2-5e2c-29b48c0194c0:0:0 does not exist.
Post by PROJJWAL SAHA
Sure, I can try disabling the async parquet reader. Will this, however, impact the performance of queries on parquet data?
Post by Kunal Khatua
If this resolves the issue, could you share some additional details, such as the metadata of the Parquet files, the OS, etc.? Details describing the setup are also very helpful in identifying what could be the cause of the error.
We had observed some similar DATA_READ errors in the early iterations of the Async Parquet reader, but those have been resolved. I'm presuming you're already on the latest (i.e. Apache Drill 1.11.0)
-----Original Message-----
Sent: Wednesday, October 11, 2017 6:52 PM
Subject: Re: Exception while reading parquet data
Can you try disabling async parquet reader to see if problem gets resolved.
alter session set `store.parquet.reader.pagereader.async`=false;
Thanks,
Arjun
________________________________
Sent: Wednesday, October 11, 2017 2:20 PM
Subject: Exception while reading parquet data
I get the below exception when querying parquet data on the Oracle Storage Cloud service.
Any pointers on what this points to?
Regards,
Projjwal
2017-10-09 09:42:20,533 [26248359-2fc8-d177-c3a6-507f6857e0ea:frag:2:3] INFO o.a.d.e.w.fragment.FragmentExecutor - 26248359-2fc8-d177-c3a6-507f6857e0ea:2:3: State change requested AWAITING_ALLOCATION --> RUNNING
2017-10-09 09:42:20,533 [26248359-2fc8-d177-c3a6-507f6857e0ea:frag:2:3] INFO o.a.d.e.w.f.FragmentStatusReporter - 26248359-2fc8-d177-c3a6-507f6857e0ea:2:3: State to report: RUNNING
2017-10-09 09:42:20,534 [26248359-2fc8-d177-c3a6-507f6857e0ea:frag:2:3] INFO o.a.d.e.w.fragment.FragmentExecutor - 26248359-2fc8-d177-c3a6-507f6857e0ea:2:3: State change requested RUNNING --> CANCELLATION_REQUESTED
2017-10-09 09:42:20,534 [26248359-2fc8-d177-c3a6-507f6857e0ea:frag:2:3] INFO o.a.d.e.w.f.FragmentStatusReporter - CANCELLATION_REQUESTED
Kunal Khatua
2017-10-15 15:29:53 UTC
Permalink
You could try uploading to Google Drive (since you have a Gmail account) and share the link.

Did Parth's suggestion of
store.parquet.reader.pagereader.bufferedread=false
resolve the issue?

Also share the details of the hardware setup... #nodes, Hadoop version, etc.

-----Original Message-----
From: PROJJWAL SAHA [mailto:***@gmail.com]
Sent: Sunday, October 15, 2017 8:07 AM
To: ***@drill.apache.org
Subject: Re: Exception while reading parquet data

Is there any place where I can upload the 12MB parquet data. I am not able to send the file through mail to the user group.
Post by Parth Chandra
Seems like a bug in BufferedDirectBufInputStream. Is it possible to
share a minimal data file that triggers this?
You can also try turning off the buffering reader.
store.parquet.reader.pagereader.bufferedread=false
With async reader on and buffering off, you might not see any
degradation in performance in most cases.
Post by PROJJWAL SAHA
hi,
Disabling the async parquet reader doesn't solve the problem; I am getting a similar exception. I don't see any issue with the parquet file itself, since the same file works when loaded on Alluxio.
2017-10-12 04:19:50,502
[2620da63-4efb-47e2-5e2c-29b48c0194c0:frag:0:0] ERROR
o.a.d.e.u.f.BufferedDirectBufInputStream - Error reading from stream
null
2017-10-12 04:19:50,506
[2620da63-4efb-47e2-5e2c-29b48c0194c0:frag:0:0] ERROR
IndexOutOfBoundsException
[Error Id: 3b7c4587-c1b8-4e79-bdaa-b2aa1516275b ]
IndexOutOfBoundsException
[Error Id: 3b7c4587-c1b8-4e79-bdaa-b2aa1516275b ]
at org.apache.drill.common.exceptions.UserException$
Builder.build(UserException.java:550)
~[drill-common-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.ScanBatch.next(
ScanBatch.java:249)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:119)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:109)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractSingleRecordBatch.
innerNext(AbstractSingleRecordBatch.java:51)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:162)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:119)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:109)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractSingleRecordBatch.
innerNext(AbstractSingleRecordBatch.java:51)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.svremover.
RemovingRecordBatch.innerNext(RemovingRecordBatch.java:93)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:162)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:119)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:109)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractSingleRecordBatch.
innerNext(AbstractSingleRecordBatch.java:51)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.project.
ProjectRecordBatch.innerNext(ProjectRecordBatch.java:133)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:162)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:119)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:109)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.aggregate.
HashAggBatch.buildSchema(HashAggBatch.java:111)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:142)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:119)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:109)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.xsort.
ExternalSortBatch.buildSchema(ExternalSortBatch.java:264)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:142)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:119)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:109)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractSingleRecordBatch.
innerNext(AbstractSingleRecordBatch.java:51)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.svremover.
RemovingRecordBatch.innerNext(RemovingRecordBatch.java:93)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:162)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:119)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:109)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractSingleRecordBatch.
innerNext(AbstractSingleRecordBatch.java:51)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.project.
ProjectRecordBatch.innerNext(ProjectRecordBatch.java:133)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:162)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:119)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:109)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractSingleRecordBatch.
innerNext(AbstractSingleRecordBatch.java:51)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.limit.
LimitRecordBatch.innerNext(LimitRecordBatch.java:115)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:162)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:119)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:109)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractSingleRecordBatch.
innerNext(AbstractSingleRecordBatch.java:51)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.svremover.
RemovingRecordBatch.innerNext(RemovingRecordBatch.java:93)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:162)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:119)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:109)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractSingleRecordBatch.
innerNext(AbstractSingleRecordBatch.java:51)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.project.
ProjectRecordBatch.innerNext(ProjectRecordBatch.java:133)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:162)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.BaseRootExec.
next(BaseRootExec.java:105)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.ScreenCreator$
ScreenRoot.innerNext(ScreenCreator.java:81)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.BaseRootExec.
next(BaseRootExec.java:95)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.work.fragment.FragmentExecutor$1.
run(FragmentExecutor.java:234)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.work.fragment.FragmentExecutor$1.
run(FragmentExecutor.java:227)
[drill-java-exec-1.11.0.jar:1.11.0]
at java.security.AccessController.doPrivileged(Native Method) [na:1.8.0_121]
at javax.security.auth.Subject.doAs(Subject.java:422) [na:1.8.0_121]
at org.apache.hadoop.security.UserGroupInformation.doAs(
UserGroupInformation.java:1657)
[hadoop-common-2.7.1.jar:na]
at org.apache.drill.exec.work.fragment.FragmentExecutor.run(
FragmentExecutor.java:227)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.common.SelfCleaningRunnable.run(
SelfCleaningRunnable.java:38)
[drill-common-1.11.0.jar:1.11.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(
ThreadPoolExecutor.java:1142)
[na:1.8.0_121]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(
ThreadPoolExecutor.java:617)
[na:1.8.0_121]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
Error in parquet record reader.
/data1GBparquet/storereturns/part-00000-7ce26fde-f342-4aae-
a727-71b8b7a60e63.parquet
Total records read: 0
Row group index: 0
Records in row group: 287514
Parquet Metadata: ParquetMetaData{FileMetaData{schema: message spark_schema {
optional int32 sr_returned_date_sk;
optional int32 sr_return_time_sk;
optional int32 sr_item_sk;
optional int32 sr_customer_sk;
optional int32 sr_cdemo_sk;
optional int32 sr_hdemo_sk;
optional int32 sr_addr_sk;
optional int32 sr_store_sk;
optional int32 sr_reason_sk;
optional int32 sr_ticket_number;
optional int32 sr_return_quantity;
optional double sr_return_amt;
optional double sr_return_tax;
optional double sr_return_amt_inc_tax;
optional double sr_fee;
optional double sr_return_ship_cost;
optional double sr_refunded_cash;
optional double sr_reversed_charge;
optional double sr_store_credit;
optional double sr_net_loss;
{org.apache.spark.sql.parquet.row.metadata={"type":"struct",
"fields":[{"name":"sr_returned_date_sk","type":"
integer","nullable":true,"
metadata":{}},{"name":"sr_return_time_sk","type":"
integer","nullable":true,"metadata":{}},{"name":"sr_
item_sk","type":"integer","nullable":true,"metadata":{}},
true,"metadata":{}},{"name":"sr_cdemo_sk","type":"integer",
"integer","nullable":true,"metadata":{}},{"name":"sr_
addr_sk","type":"integer","nullable":true,"metadata":{}},
{"name":"sr_store_sk","type":"integer","nullable":true,"
metadata":{}},{"name":"sr_reason_sk","type":"integer","
nullable":true,"metadata":{}},{"name":"sr_ticket_number","
type":"integer","nullable":true,"metadata":{}},{"name":"
sr_return_quantity","type":"integer","nullable":true,"
metadata":{}},{"name":"sr_return_amt","type":"double","
nullable":true,"metadata":{}},{"name":"sr_return_tax","type"
:"double","nullable":true,"metadata":{}},{"name":"sr_
return_amt_inc_tax","type":"double","nullable":true,"
true,"metadata":{}},{"name":"sr_return_ship_cost","type":"
double","nullable":true,"metadata":{}},{"name":"sr_
refunded_cash","type":"double","nullable":true,"metadata":{}
true,"metadata":{}},{"name":"sr_store_credit","type":"
double","nullable":true,"metadata":{}},{"name":"sr_net_
loss","type":"double","nullable":true,"metadata":{}},
{"name":"sr_dummycol","type":"string","nullable":true,"
metadata":{}}]}}},
blocks: [BlockMetaData{287514, 18570101 [ColumnMetaData{UNCOMPRESSED
[sr_returned_date_sk] INT32 [RLE, PLAIN_DICTIONARY, BIT_PACKED],
4}, ColumnMetaData{UNCOMPRESSED [sr_return_time_sk] INT32 [RLE,
PLAIN_DICTIONARY, BIT_PACKED], 417866}, ColumnMetaData{UNCOMPRESSED
[sr_item_sk] INT32 [RLE, PLAIN_DICTIONARY, BIT_PACKED], 1096347},
ColumnMetaData{UNCOMPRESSED [sr_customer_sk] INT32 [RLE,
PLAIN_DICTIONARY, BIT_PACKED], 1708118}, ColumnMetaData{UNCOMPRESSED
[sr_cdemo_sk] INT32 [RLE, PLAIN, BIT_PACKED], 2674001},
ColumnMetaData{UNCOMPRESSED [sr_hdemo_sk] INT32 [RLE,
PLAIN_DICTIONARY, BIT_PACKED], 3812205}, ColumnMetaData{UNCOMPRESSED
[sr_addr_sk] INT32 [RLE, PLAIN_DICTIONARY, BIT_PACKED], 4320246},
ColumnMetaData{UNCOMPRESSED [sr_store_sk] INT32 [RLE,
PLAIN_DICTIONARY, BIT_PACKED], 5102635}, ColumnMetaData{UNCOMPRESSED
[sr_reason_sk] INT32 [RLE, PLAIN_DICTIONARY, BIT_PACKED], 5235151},
ColumnMetaData{UNCOMPRESSED [sr_ticket_number] INT32 [RLE, PLAIN,
BIT_PACKED], 5471579}, ColumnMetaData{UNCOMPRESSED
[sr_return_quantity] INT32 [RLE, PLAIN_DICTIONARY, BIT_PACKED],
6621731}, ColumnMetaData{UNCOMPRESSED [sr_return_amt] DOUBLE [RLE,
PLAIN_DICTIONARY, BIT_PACKED], 6893357}, ColumnMetaData{UNCOMPRESSED
[sr_return_tax] DOUBLE [RLE, PLAIN_DICTIONARY, BIT_PACKED],
8419465}, ColumnMetaData{UNCOMPRESSED [sr_return_amt_inc_tax] DOUBLE
[RLE, PLAIN, PLAIN_DICTIONARY, BIT_PACKED], 9201856},
ColumnMetaData{UNCOMPRESSED [sr_fee] DOUBLE [RLE, PLAIN_DICTIONARY,
BIT_PACKED], 11366007}, ColumnMetaData{UNCOMPRESSED
[sr_return_ship_cost] DOUBLE [RLE, PLAIN_DICTIONARY, BIT_PACKED],
11959880}, ColumnMetaData{UNCOMPRESSED [sr_refunded_cash] DOUBLE
[RLE, PLAIN_DICTIONARY, BIT_PACKED], 13218730},
ColumnMetaData{UNCOMPRESSED [sr_reversed_charge] DOUBLE [RLE,
PLAIN_DICTIONARY, BIT_PACKED], 14635937},
ColumnMetaData{UNCOMPRESSED [sr_store_credit] DOUBLE [RLE,
PLAIN_DICTIONARY, BIT_PACKED], 15824898},
ColumnMetaData{UNCOMPRESSED [sr_net_loss] DOUBLE [RLE,
PLAIN_DICTIONARY, BIT_PACKED], 17004301}, ColumnMetaData{UNCOMPRESSED [sr_dummycol] BINARY [RLE, PLAIN, BIT_PACKED], 18570072}]}]}
at org.apache.drill.exec.store.parquet.columnreaders.
ParquetRecordReader.handleException(ParquetRecordReader.java:272)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.
ParquetRecordReader.next(ParquetRecordReader.java:299)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.ScanBatch.next(
ScanBatch.java:180)
[drill-java-exec-1.11.0.jar:1.11.0]
Caused by: java.io.IOException: java.lang.IndexOutOfBoundsException
at org.apache.drill.exec.util.filereader.
BufferedDirectBufInputStream.getNextBlock(BufferedDirectBufInputStream.
java:185)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.util.filereader.
BufferedDirectBufInputStream.readInternal(BufferedDirectBufInputStream.
java:212)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.util.filereader.
BufferedDirectBufInputStream.read(BufferedDirectBufInputStream.java:
277) ~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.util.filereader.
DirectBufInputStream.getNext(DirectBufInputStream.java:111)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.
PageReader.readPage(PageReader.java:216)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.
PageReader.nextInternal(PageReader.java:283)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.
PageReader.next(PageReader.java:307)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.
NullableColumnReader.processPages(NullableColumnReader.java:69)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.
BatchReader.readAllFixedFieldsSerial(BatchReader.java:63)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.
BatchReader.readAllFixedFields(BatchReader.java:56)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.
BatchReader$FixedWidthReader.readRecords(BatchReader.java:143)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.
BatchReader.readBatch(BatchReader.java:42)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.
ParquetRecordReader.next(ParquetRecordReader.java:297)
~[drill-java-exec-1.11.0.jar:1.11.0]
Caused by: java.lang.IndexOutOfBoundsException: null
at java.nio.Buffer.checkBounds(Buffer.java:567) ~[na:1.8.0_121]
at java.nio.ByteBuffer.put(ByteBuffer.java:827) ~[na:1.8.0_121]
at java.nio.DirectByteBuffer.put(DirectByteBuffer.java:379) ~[na:1.8.0_121]
at org.apache.parquet.hadoop.util.CompatibilityUtil.getBuf(
CompatibilityUtil.java:110)
~[parquet-hadoop-1.8.1-drill-r0.jar:1.8.1-drill-r0]
at org.apache.drill.exec.util.filereader.
BufferedDirectBufInputStream.getNextBlock(BufferedDirectBufInputStream.
java:182)
~[drill-java-exec-1.11.0.jar:1.11.0]
... 73 common frames omitted
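A note on what the innermost frames mean: java.nio.Buffer.checkBounds (the frame above CompatibilityUtil.getBuf) throws IndexOutOfBoundsException from ByteBuffer.put(byte[], int, int) when offset + length overruns the *source* array, before the destination's remaining capacity is even checked (a full destination would raise BufferOverflowException instead). So the reader appears to be asking put() to copy more bytes than the heap array it read into actually holds. A minimal, self-contained sketch of that failure mode (the class name is ours, not Drill's):

```java
import java.nio.ByteBuffer;

public class PutBounds {
    // Returns the simple name of the exception thrown when put() is asked to
    // copy a range that does not fit inside the source array.
    static String overRead() {
        ByteBuffer direct = ByteBuffer.allocateDirect(16); // plenty of room in the target
        byte[] src = new byte[4];                          // ...but the source is short
        try {
            direct.put(src, 0, 8); // offset + length > src.length
            return "no exception";
        } catch (IndexOutOfBoundsException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(overRead()); // prints IndexOutOfBoundsException
    }
}
```

Note the destination buffer has room for 16 bytes, yet the put still fails: the bounds check is entirely about the source array, which matches a read path that over-reports how many bytes landed in its temporary buffer.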
2017-10-12 04:19:50,506
[2620da63-4efb-47e2-5e2c-29b48c0194c0:frag:0:0] INFO
o.a.d.e.w.fragment.FragmentExecutor -
2620da63-4efb-47e2-5e2c-29b48c0194c0:0:0: State change requested
RUNNING --> FAILED
2017-10-12 04:19:50,507
[2620da63-4efb-47e2-5e2c-29b48c0194c0:frag:0:0] INFO
o.a.d.e.w.fragment.FragmentExecutor -
2620da63-4efb-47e2-5e2c-29b48c0194c0:0:0: State change requested
FAILED --> FINISHED
2017-10-12 04:19:50,533 [BitServer-2] WARN
o.a.drill.exec.work.foreman.Foreman - Dropping request to move to
COMPLETED state as query is already at FAILED state (which is
terminal).
2017-10-12 04:19:50,533 [BitServer-2] WARN
o.a.d.e.w.b.ControlMessageHandler - Dropping request to cancel
fragment. 2620da63-4efb-47e2-5e2c-29b48c0194c0:0:0 does not exist.
Post by PROJJWAL SAHA
Sure, I can try disabling the async parquet reader.
Will this, however, impact the performance of queries on parquet data?
Post by Kunal Khatua
If this resolves the issue, could you share some additional details, such as
the metadata of the Parquet files, the OS, etc.? Details describing the
setup are also very helpful in identifying what could be the cause of the
error.
We had observed some similar DATA_READ errors in the early iterations of
the Async Parquet reader, but those have been resolved. I'm
presuming you're already on the latest (i.e. Apache Drill 1.11.0)
-----Original Message-----
Sent: Wednesday, October 11, 2017 6:52 PM
Subject: Re: Exception while reading parquet data
Can you try disabling the async parquet reader to see if the problem gets resolved?
alter session set `store.parquet.reader.pagereader.async`=false;
Thanks,
Arjun
________________________________
Sent: Wednesday, October 11, 2017 2:20 PM
Subject: Exception while reading parquet data
I get the below exception when querying parquet data on Oracle Storage Cloud
service.
Any pointers on what this points to?
Regards,
Projjwal
ERROR o.a.d.e.u.f.BufferedDirectBufInputStream - Error reading from stream
part-00006-25a9ae4b-fd9e-4770-b17e-9a29b270a4c2.parquet. Error was : null
2017-10-09 09:42:18,516 [scan-2] INFO o.a.d.e.s.p.c.AsyncPageReader - User Error Occurred: Exception occurred while reading from disk.
(java.lang.IndexOutOfBoundsException)
org.apache.drill.common.exceptions.UserException: DATA_READ ERROR: Exception occurred while reading from disk.
File: /data25GB/storereturns/part-00006-25a9ae4b-fd9e-4770-b17e-9a29b270a4c2.parquet
Column: sr_return_time_sk
Row Group Start: 479751
[Error Id: 10680bb8-d1d6-43a1-b5e0-ef15bd8a9406 ]
at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:550) ~[drill-common-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.AsyncPageReader.handleAndThrowException(AsyncPageReader.java:185) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.AsyncPageReader.access$700(AsyncPageReader.java:82) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.AsyncPageReader$AsyncPageReaderTask.call(AsyncPageReader.java:461) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.AsyncPageReader$AsyncPageReaderTask.call(AsyncPageReader.java:381) [drill-java-exec-1.11.0.jar:1.11.0]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_121]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_121]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_121]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
Caused by: java.io.IOException: java.lang.IndexOutOfBoundsException
at org.apache.drill.exec.util.filereader.BufferedDirectBufInputStream.getNextBlock(BufferedDirectBufInputStream.java:185) ~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.util.filereader.BufferedDirectBufInputStream.readInternal(BufferedDirectBufInputStream.java:212) ~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.util.filereader.BufferedDirectBufInputStream.read(BufferedDirectBufInputStream.java:277) ~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.util.filereader.DirectBufInputStream.getNext(DirectBufInputStream.java:111) ~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.AsyncPageReader$AsyncPageReaderTask.call(AsyncPageReader.java:421) [drill-java-exec-1.11.0.jar:1.11.0]
... 5 common frames omitted
Caused by: java.lang.IndexOutOfBoundsException: null
at java.nio.Buffer.checkBounds(Buffer.java:567) ~[na:1.8.0_121]
at java.nio.ByteBuffer.put(ByteBuffer.java:827) ~[na:1.8.0_121]
at java.nio.DirectByteBuffer.put(DirectByteBuffer.java:379) ~[na:1.8.0_121]
at org.apache.parquet.hadoop.util.CompatibilityUtil.getBuf(CompatibilityUtil.java:110) ~[parquet-hadoop-1.8.1-drill-r0.jar:1.8.1-drill-r0]
at org.apache.drill.exec.util.filereader.BufferedDirectBufInputStream.getNextBlock(BufferedDirectBufInputStream.java:182) ~[drill-java-exec-1.11.0.jar:1.11.0]
... 9 common frames omitted
2017-10-09 09:42:20,533 [26248359-2fc8-d177-c3a6-507f6857e0ea:frag:2:3] INFO o.a.d.e.w.fragment.FragmentExecutor - 26248359-2fc8-d177-c3a6-507f6857e0ea:2:3: State change requested AWAITING_ALLOCATION --> RUNNING
2017-10-09 09:42:20,533 [26248359-2fc8-d177-c3a6-507f6857e0ea:frag:2:3] INFO o.a.d.e.w.f.FragmentStatusReporter - 26248359-2fc8-d177-c3a6-507f6857e0ea:2:3: State to report: RUNNING
2017-10-09 09:42:20,534 [26248359-2fc8-d177-c3a6-507f6857e0ea:frag:2:3] INFO o.a.d.e.w.fragment.FragmentExecutor - 26248359-2fc8-d177-c3a6-507f6857e0ea:2:3: State change requested RUNNING --> CANCELLATION_REQUESTED
2017-10-09 09:42:20,534 [26248359-2fc8-d177-c3a6-507f6857e0ea:frag:2:3] INFO o.a.d.e.w.f.FragmentStatusReporter - CANCELLATI
PROJJWAL SAHA
2017-10-16 15:19:38 UTC
Permalink
Here is the link for the parquet data:
https://drive.google.com/file/d/0BzZhvMHOeao1S2Rud2xDS1NyS00/view?usp=sharing

Setting store.parquet.reader.pagereader.bufferedread=false did not solve
the issue.

I am using Drill 1.11. The parquet data is fetched from Oracle Storage
Cloud Service using the Swift driver.

Here is the error on the drill command prompt -
Error: DATA_READ ERROR: Exception occurred while reading from disk.

File:
/data1GBparquet/storereturns/part-00000-7ce26fde-f342-4aae-a727-71b8b7a60e63.parquet
Column: sr_return_time_sk
Row Group Start: 417866
Fragment 0:0
Post by Kunal Khatua
You could try uploading to Google Drive (since you have a Gmail account)
and sharing the link.
Did Parth's suggestion of
store.parquet.reader.pagereader.bufferedread=false
resolve the issue?
Also share the details of the hardware setup... #nodes, Hadoop version, etc.
-----Original Message-----
Sent: Sunday, October 15, 2017 8:07 AM
Subject: Re: Exception while reading parquet data
Is there any place where I can upload the 12 MB parquet data? I am not able
to send the file through mail to the user group.
Post by Parth Chandra
Seems like a bug in BufferedDirectBufInputStream. Is it possible to
share a minimal data file that triggers this?
You can also try turning off the buffering reader.
store.parquet.reader.pagereader.bufferedread=false
With async reader on and buffering off, you might not see any
degradation in performance in most cases.
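Parth's suspicion of BufferedDirectBufInputStream fits the java.nio.Buffer.checkBounds frames in the traces above: if a buffered reader trusts a reported read length without clamping it to the valid region of its temporary heap array, ByteBuffer.put trips exactly that bounds check. Below is a defensive copy loop written as a sketch under that assumption (our code, not Drill's), which only ever puts the bytes each read() call actually returned:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

public class SafeCopy {
    // Copies up to `limit` bytes from `in` into a direct buffer, trusting only
    // the per-call return value of read() rather than any precomputed length.
    static ByteBuffer copy(InputStream in, int limit) throws IOException {
        ByteBuffer dst = ByteBuffer.allocateDirect(limit);
        byte[] tmp = new byte[4096];
        while (dst.hasRemaining()) {
            int want = Math.min(tmp.length, dst.remaining());
            int got = in.read(tmp, 0, want); // may return fewer bytes than asked
            if (got < 0) break;              // end of stream
            dst.put(tmp, 0, got);            // never exceeds tmp's valid region
        }
        dst.flip();
        return dst;
    }

    public static void main(String[] args) throws IOException {
        ByteBuffer b = copy(new ByteArrayInputStream(new byte[10]), 16);
        System.out.println(b.remaining()); // prints 10
    }
}
```

The key point is that `got`, not `want`, is passed to put(): an InputStream is allowed to return fewer bytes than requested (common with object-store streams such as Swift), so any length computed ahead of the read is only an upper bound.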
Post by PROJJWAL SAHA
hi,
disabling the async parquet reader doesn't solve the problem; I am getting a
similar exception. I don't see any issue with the parquet file itself, since
the same file works when loaded into Alluxio.
2017-10-12 04:19:50,502
[2620da63-4efb-47e2-5e2c-29b48c0194c0:frag:0:0] ERROR
o.a.d.e.u.f.BufferedDirectBufInputStream - Error reading from stream
null
2017-10-12 04:19:50,506
[2620da63-4efb-47e2-5e2c-29b48c0194c0:frag:0:0] ERROR
IndexOutOfBoundsException
[Error Id: 3b7c4587-c1b8-4e79-bdaa-b2aa1516275b ]
IndexOutOfBoundsException
[Error Id: 3b7c4587-c1b8-4e79-bdaa-b2aa1516275b ]
at org.apache.drill.common.exceptions.UserException$
Builder.build(UserException.java:550)
~[drill-common-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.ScanBatch.next(
ScanBatch.java:249)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:119)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:109)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractSingleRecordBatch.
innerNext(AbstractSingleRecordBatch.java:51)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:162)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:119)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:109)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractSingleRecordBatch.
innerNext(AbstractSingleRecordBatch.java:51)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.svremover.
RemovingRecordBatch.innerNext(RemovingRecordBatch.java:93)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:162)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:119)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:109)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractSingleRecordBatch.
innerNext(AbstractSingleRecordBatch.java:51)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.project.
ProjectRecordBatch.innerNext(ProjectRecordBatch.java:133)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:162)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:119)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:109)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.aggregate.
HashAggBatch.buildSchema(HashAggBatch.java:111)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:142)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:119)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:109)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.xsort.
ExternalSortBatch.buildSchema(ExternalSortBatch.java:264)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:142)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:119)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:109)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractSingleRecordBatch.
innerNext(AbstractSingleRecordBatch.java:51)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.svremover.
RemovingRecordBatch.innerNext(RemovingRecordBatch.java:93)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:162)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:119)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:109)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractSingleRecordBatch.
innerNext(AbstractSingleRecordBatch.java:51)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.project.
ProjectRecordBatch.innerNext(ProjectRecordBatch.java:133)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:162)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:119)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:109)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractSingleRecordBatch.
innerNext(AbstractSingleRecordBatch.java:51)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.limit.
LimitRecordBatch.innerNext(LimitRecordBatch.java:115)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:162)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:119)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:109)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractSingleRecordBatch.
innerNext(AbstractSingleRecordBatch.java:51)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.svremover.
RemovingRecordBatch.innerNext(RemovingRecordBatch.java:93)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:162)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:119)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:109)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractSingleRecordBatch.
innerNext(AbstractSingleRecordBatch.java:51)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.project.
ProjectRecordBatch.innerNext(ProjectRecordBatch.java:133)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(
AbstractRecordBatch.java:162)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.BaseRootExec.
next(BaseRootExec.java:105)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.ScreenCreator$
ScreenRoot.innerNext(ScreenCreator.java:81)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.BaseRootExec.
next(BaseRootExec.java:95)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.work.fragment.FragmentExecutor$1.
run(FragmentExecutor.java:234)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.work.fragment.FragmentExecutor$1.
run(FragmentExecutor.java:227)
[drill-java-exec-1.11.0.jar:1.11.0]
at java.security.AccessController.doPrivileged(Native
Method) [na:1.8.0_121]
at javax.security.auth.Subject.doAs(Subject.java:422) [na:1.8.0_121]
at org.apache.hadoop.security.UserGroupInformation.doAs(
UserGroupInformation.java:1657)
[hadoop-common-2.7.1.jar:na]
at org.apache.drill.exec.work.fragment.FragmentExecutor.run(
FragmentExecutor.java:227)
[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.common.SelfCleaningRunnable.run(
SelfCleaningRunnable.java:38)
[drill-common-1.11.0.jar:1.11.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(
ThreadPoolExecutor.java:1142)
[na:1.8.0_121]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(
ThreadPoolExecutor.java:617)
[na:1.8.0_121]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
Error in parquet record reader.
/data1GBparquet/storereturns/part-00000-7ce26fde-f342-4aae-
a727-71b8b7a60e63.parquet
Total records read: 0
Row group index: 0
Records in row group: 287514
Parquet Metadata: ParquetMetaData{FileMetaData{schema: message spark_schema {
optional int32 sr_returned_date_sk;
optional int32 sr_return_time_sk;
optional int32 sr_item_sk;
optional int32 sr_customer_sk;
optional int32 sr_cdemo_sk;
optional int32 sr_hdemo_sk;
optional int32 sr_addr_sk;
optional int32 sr_store_sk;
optional int32 sr_reason_sk;
optional int32 sr_ticket_number;
optional int32 sr_return_quantity;
optional double sr_return_amt;
optional double sr_return_tax;
optional double sr_return_amt_inc_tax;
optional double sr_fee;
optional double sr_return_ship_cost;
optional double sr_refunded_cash;
optional double sr_reversed_charge;
optional double sr_store_credit;
optional double sr_net_loss;
{org.apache.spark.sql.parquet.row.metadata={"type":"struct",
"fields":[{"name":"sr_returned_date_sk","type":"
integer","nullable":true,"
Post by PROJJWAL SAHA
metadata":{}},{"name":"sr_return_time_sk","type":"
integer","nullable":true,"metadata":{}},{"name":"sr_
item_sk","type":"integer","nullable":true,"metadata":{}},
true,"metadata":{}},{"name":"sr_cdemo_sk","type":"integer",
"integer","nullable":true,"metadata":{}},{"name":"sr_
addr_sk","type":"integer","nullable":true,"metadata":{}},
{"name":"sr_store_sk","type":"integer","nullable":true,"
metadata":{}},{"name":"sr_reason_sk","type":"integer","
nullable":true,"metadata":{}},{"name":"sr_ticket_number","
type":"integer","nullable":true,"metadata":{}},{"name":"
sr_return_quantity","type":"integer","nullable":true,"
metadata":{}},{"name":"sr_return_amt","type":"double","
nullable":true,"metadata":{}},{"name":"sr_return_tax","type"
:"double","nullable":true,"metadata":{}},{"name":"sr_
return_amt_inc_tax","type":"double","nullable":true,"
true,"metadata":{}},{"name":"sr_return_ship_cost","type":"
double","nullable":true,"metadata":{}},{"name":"sr_
refunded_cash","type":"double","nullable":true,"metadata":{}
true,"metadata":{}},{"name":"sr_store_credit","type":"
double","nullable":true,"metadata":{}},{"name":"sr_net_
loss","type":"double","nullable":true,"metadata":{}},
{"name":"sr_dummycol","type":"string","nullable":true,"
metadata":{}}]}}},
Post by PROJJWAL SAHA
blocks: [BlockMetaData{287514, 18570101 [ColumnMetaData{UNCOMPRESSED
[sr_returned_date_sk] INT32 [RLE, PLAIN_DICTIONARY, BIT_PACKED],
4}, ColumnMetaData{UNCOMPRESSED [sr_return_time_sk] INT32 [RLE,
PLAIN_DICTIONARY, BIT_PACKED], 417866}, ColumnMetaData{UNCOMPRESSED
[sr_item_sk] INT32 [RLE, PLAIN_DICTIONARY, BIT_PACKED], 1096347},
ColumnMetaData{UNCOMPRESSED [sr_customer_sk] INT32 [RLE,
PLAIN_DICTIONARY, BIT_PACKED], 1708118}, ColumnMetaData{UNCOMPRESSED
[sr_cdemo_sk] INT32 [RLE, PLAIN, BIT_PACKED], 2674001},
ColumnMetaData{UNCOMPRESSED [sr_hdemo_sk] INT32 [RLE,
PLAIN_DICTIONARY, BIT_PACKED], 3812205}, ColumnMetaData{UNCOMPRESSED
[sr_addr_sk] INT32 [RLE, PLAIN_DICTIONARY, BIT_PACKED], 4320246},
ColumnMetaData{UNCOMPRESSED [sr_store_sk] INT32 [RLE,
PLAIN_DICTIONARY, BIT_PACKED], 5102635}, ColumnMetaData{UNCOMPRESSED
[sr_reason_sk] INT32 [RLE, PLAIN_DICTIONARY, BIT_PACKED], 5235151},
ColumnMetaData{UNCOMPRESSED [sr_ticket_number] INT32 [RLE, PLAIN,
BIT_PACKED], 5471579}, ColumnMetaData{UNCOMPRESSED
[sr_return_quantity] INT32 [RLE, PLAIN_DICTIONARY, BIT_PACKED],
6621731}, ColumnMetaData{UNCOMPRESSED [sr_return_amt] DOUBLE [RLE,
PLAIN_DICTIONARY, BIT_PACKED], 6893357}, ColumnMetaData{UNCOMPRESSED
[sr_return_tax] DOUBLE [RLE, PLAIN_DICTIONARY, BIT_PACKED],
8419465}, ColumnMetaData{UNCOMPRESSED [sr_return_amt_inc_tax] DOUBLE
[RLE, PLAIN, PLAIN_DICTIONARY, BIT_PACKED], 9201856},
ColumnMetaData{UNCOMPRESSED [sr_fee] DOUBLE [RLE, PLAIN_DICTIONARY,
BIT_PACKED], 11366007}, ColumnMetaData{UNCOMPRESSED
[sr_return_ship_cost] DOUBLE [RLE, PLAIN_DICTIONARY, BIT_PACKED],
11959880}, ColumnMetaData{UNCOMPRESSED [sr_refunded_cash] DOUBLE
[RLE, PLAIN_DICTIONARY, BIT_PACKED], 13218730},
ColumnMetaData{UNCOMPRESSED [sr_reversed_charge] DOUBLE [RLE,
PLAIN_DICTIONARY, BIT_PACKED], 14635937},
ColumnMetaData{UNCOMPRESSED [sr_store_credit] DOUBLE [RLE,
PLAIN_DICTIONARY, BIT_PACKED], 15824898},
ColumnMetaData{UNCOMPRESSED [sr_net_loss] DOUBLE [RLE,
PLAIN_DICTIONARY, BIT_PACKED], 17004301}, ColumnMetaData{UNCOMPRESSED
[sr_dummycol] BINARY [RLE, PLAIN, BIT_PACKED], 18570072}]}]}
Post by Parth Chandra
Post by PROJJWAL SAHA
at org.apache.drill.exec.store.parquet.columnreaders.
ParquetRecordReader.handleException(ParquetRecordReader.java:272)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.
ParquetRecordReader.next(ParquetRecordReader.java:299)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.ScanBatch.next(
ScanBatch.java:180)
[drill-java-exec-1.11.0.jar:1.11.0]
java.lang.IndexOutOfBoundsException
at org.apache.drill.exec.util.filereader.
BufferedDirectBufInputStream.getNextBlock(
BufferedDirectBufInputStream.
Post by Parth Chandra
Post by PROJJWAL SAHA
java:185)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.util.filereader.
BufferedDirectBufInputStream.readInternal(
BufferedDirectBufInputStream.
Post by Parth Chandra
Post by PROJJWAL SAHA
java:212)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.util.filereader.
277) ~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.util.filereader.
DirectBufInputStream.getNext(DirectBufInputStream.java:111)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.
PageReader.readPage(PageReader.java:216)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.
PageReader.nextInternal(PageReader.java:283)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.
PageReader.next(PageReader.java:307)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.
NullableColumnReader.processPages(NullableColumnReader.java:69)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.
BatchReader.
Post by PROJJWAL SAHA
readAllFixedFieldsSerial(BatchReader.java:63)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.
BatchReader.
Post by PROJJWAL SAHA
readAllFixedFields(BatchReader.java:56)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.
BatchReader$FixedWidthReader.readRecords(BatchReader.java:143)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.
BatchReader.readBatch(BatchReader.java:42)
~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.
ParquetRecordReader.next(ParquetRecordReader.java:297)
~[drill-java-exec-1.11.0.jar:1.11.0]
java.lang.IndexOutOfBoundsException: null
at java.nio.Buffer.checkBounds(Buffer.java:567)
~[na:1.8.0_121]
Post by Parth Chandra
Post by PROJJWAL SAHA
at java.nio.ByteBuffer.put(ByteBuffer.java:827)
~[na:1.8.0_121]
Post by Parth Chandra
Post by PROJJWAL SAHA
at java.nio.DirectByteBuffer.put(DirectByteBuffer.java:379) ~[na:1.8.0_121]
at org.apache.parquet.hadoop.util.CompatibilityUtil.getBuf(
CompatibilityUtil.java:110)
~[parquet-hadoop-1.8.1-drill-r0.jar:1.8.1-drill-r0]
at org.apache.drill.exec.util.filereader.
BufferedDirectBufInputStream.getNextBlock(
BufferedDirectBufInputStream.
Post by Parth Chandra
Post by PROJJWAL SAHA
java:182)
~[drill-java-exec-1.11.0.jar:1.11.0]
... 73 common frames omitted
2017-10-12 04:19:50,506
[2620da63-4efb-47e2-5e2c-29b48c0194c0:frag:0:0] INFO
o.a.d.e.w.fragment.FragmentExecutor -
2620da63-4efb-47e2-5e2c-29b48c0194c0:0:0: State change requested
RUNNING --> FAILED
2017-10-12 04:19:50,507
[2620da63-4efb-47e2-5e2c-29b48c0194c0:frag:0:0] INFO
o.a.d.e.w.fragment.FragmentExecutor -
2620da63-4efb-47e2-5e2c-29b48c0194c0:0:0: State change requested
FAILED --> FINISHED
2017-10-12 04:19:50,533 [BitServer-2] WARN
o.a.drill.exec.work.foreman.Foreman - Dropping request to move to
COMPLETED state as query is already at FAILED state (which is
terminal).
2017-10-12 04:19:50,533 [BitServer-2] WARN
o.a.d.e.w.b.ControlMessageHandler - Dropping request to cancel
fragment. 2620da63-4efb-47e2-5e2c-29b48c0194c0:0:0 does not exist.
Post by PROJJWAL SAHA
sure, I can try disabling sync parquet reader.
Will this however, impact the performance of queries on parquet data
?
Post by Parth Chandra
Post by PROJJWAL SAHA
Post by PROJJWAL SAHA
Post by Kunal Khatua
If this resolves the issue, could you share some additional details,
such
Post by PROJJWAL SAHA
Post by Kunal Khatua
as the metadata of the Parquet files, the OS, etc.? Details describing
the
Post by PROJJWAL SAHA
Post by Kunal Khatua
setup is also very helpful in identifying what could be the cause of
the
Post by PROJJWAL SAHA
Post by PROJJWAL SAHA
Post by Kunal Khatua
error.
We had observed some similar DATA_READ errors in the early iterations
of
Post by PROJJWAL SAHA
Post by PROJJWAL SAHA
Post by Kunal Khatua
the Async Parquet reader, but those have been resolved. I'm
presuming you're already on the latest (i.e. Apache Drill 1.11.0)
-----Original Message-----
Sent: Wednesday, October 11, 2017 6:52 PM
Subject: Re: Exception while reading parquet data
Can you try disabling async parquet reader to see if problem gets resolved.
alter session set `store.parquet.reader.pagereader.async`=false;
Thanks,
Arjun
________________________________
Sent: Wednesday, October 11, 2017 2:20 PM
Subject: Exception while reading parquet data
I get below exception when querying parquet data on Oracle Storage
Cloud
Post by PROJJWAL SAHA
Post by PROJJWAL SAHA
Post by Kunal Khatua
service.
Any pointers on what does this point to ?
Regards,
Projjwal
ERROR o.a.d.e.u.f.BufferedDirectBufInputStream - Error reading from stream
part-00006-25a9ae4b-fd9e-4770-b17e-9a29b270a4c2.parquet. Error was : null
2017-10-09 09:42:18,516 [scan-2] INFO o.a.d.e.s.p.c.AsyncPageReader - User Error Occurred: Exception occurred while reading from disk. (java.lang.IndexOutOfBoundsException)
org.apache.drill.common.exceptions.UserException: DATA_READ ERROR: Exception occurred while reading from disk.

File: /data25GB/storereturns/part-00006-25a9ae4b-fd9e-4770-b17e-9a29b270a4c2.parquet
Column: sr_return_time_sk
Row Group Start: 479751

[Error Id: 10680bb8-d1d6-43a1-b5e0-ef15bd8a9406 ]
at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:550) ~[drill-common-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.AsyncPageReader.handleAndThrowException(AsyncPageReader.java:185) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.AsyncPageReader.access$700(AsyncPageReader.java:82) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.AsyncPageReader$AsyncPageReaderTask.call(AsyncPageReader.java:461) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.AsyncPageReader$AsyncPageReaderTask.call(AsyncPageReader.java:381) [drill-java-exec-1.11.0.jar:1.11.0]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_121]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_121]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_121]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
Caused by: java.io.IOException: java.lang.IndexOutOfBoundsException
at org.apache.drill.exec.util.filereader.BufferedDirectBufInputStream.getNextBlock(BufferedDirectBufInputStream.java:185) ~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.util.filereader.BufferedDirectBufInputStream.readInternal(BufferedDirectBufInputStream.java:212) ~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.util.filereader.BufferedDirectBufInputStream.read(BufferedDirectBufInputStream.java:277) ~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.util.filereader.DirectBufInputStream.getNext(DirectBufInputStream.java:111) ~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.AsyncPageReader$AsyncPageReaderTask.call(AsyncPageReader.java:421) [drill-java-exec-1.11.0.jar:1.11.0]
... 5 common frames omitted
Caused by: java.lang.IndexOutOfBoundsException: null
at java.nio.Buffer.checkBounds(Buffer.java:567) ~[na:1.8.0_121]
at java.nio.ByteBuffer.put(ByteBuffer.java:827) ~[na:1.8.0_121]
at java.nio.DirectByteBuffer.put(DirectByteBuffer.java:379) ~[na:1.8.0_121]
at org.apache.parquet.hadoop.util.CompatibilityUtil.getBuf(CompatibilityUtil.java:110) ~[parquet-hadoop-1.8.1-drill-r0.jar:1.8.1-drill-r0]
at org.apache.drill.exec.util.filereader.BufferedDirectBufInputStream.getNextBlock(BufferedDirectBufInputStream.java:182) ~[drill-java-exec-1.11.0.jar:1.11.0]
... 9 common frames omitted
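The innermost frames point at java.nio's bounds checking: ByteBuffer.put(byte[] src, int offset, int length) throws IndexOutOfBoundsException when offset + length exceeds the source array. The following is a minimal, hypothetical sketch, not Drill's actual code path; it only illustrates the bounds check that CompatibilityUtil.getBuf appears to trip over (for example, if a short read left fewer bytes available than the length requested).

```java
import java.nio.ByteBuffer;

// Hypothetical reproduction, NOT Drill's code: it only demonstrates the
// java.nio bounds check seen in the innermost frames of the stack trace.
// ByteBuffer.put(byte[] src, int offset, int length) calls Buffer.checkBounds
// and throws IndexOutOfBoundsException when offset + length > src.length.
public class PutBoundsDemo {
    /** Returns true if copying `length` bytes from a `srcLen`-byte array throws. */
    public static boolean putOverruns(int srcLen, int offset, int length) {
        ByteBuffer dst = ByteBuffer.allocateDirect(64); // destination has ample room
        byte[] src = new byte[srcLen];
        try {
            dst.put(src, offset, length);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true; // same exception class as in the trace above
        }
    }

    public static void main(String[] args) {
        System.out.println(putOverruns(8, 0, 8));  // length fits the source: no exception
        System.out.println(putOverruns(8, 0, 16)); // length exceeds the source: throws
    }
}
```

Note the destination buffer is large enough in both calls; only the source-array bounds are violated, which matches the failure surfacing in checkBounds rather than a BufferOverflowException.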
2017-10-09 09:42:20,533 [26248359-2fc8-d177-c3a6-507f6857e0ea:frag:2:3] INFO o.a.d.e.w.fragment.FragmentExecutor - 26248359-2fc8-d177-c3a6-507f6857e0ea:2:3: State change requested AWAITING_ALLOCATION --> RUNNING
2017-10-09 09:42:20,533 [26248359-2fc8-d177-c3a6-507f6857e0ea:frag:2:3] INFO o.a.d.e.w.f.FragmentStatusReporter - 26248359-2fc8-d177-c3a6-507f6857e0ea:2:3: State to report: RUNNING
2017-10-09 09:42:20,534 [26248359-2fc8-d177-c3a6-507f6857e0ea:frag:2:3] INFO o.a.d.e.w.fragment.FragmentExecutor - 26248359-2fc8-d177-c3a6-507f6857e0ea:2:3: State change requested RUNNING --> CANCELLATION_REQUESTED
2017-10-09 09:42:20,534 [26248359-2fc8-d177-c3a6-507f6857e0ea:frag:2:3] INFO o.a.d.e.w.f.FragmentStatusReporter - CANCELLATION_REQUESTED
Parth Chandra
2017-10-16 18:53:54 UTC
Permalink
Hi Projjwal,

Unfortunately, I did not get a crash when I tried with your sample file.
Also, if turning off the buffered reader did not help, did you get a different
stack trace?

Any more information you can provide will be useful. Is this part of a
larger query with more parquet files being read? Are you reading all the
columns? Is there some specific column that appears to trigger the issue?

You can mail this info directly to me if you are not comfortable sharing
your data on the public list.

Thanks

Parth
Post by PROJJWAL SAHA
Here is the link for the parquet data:
https://drive.google.com/file/d/0BzZhvMHOeao1S2Rud2xDS1NyS00/view?usp=sharing
Setting store.parquet.reader.pagereader.bufferedread=false did not solve
the issue.
I am using Drill 1.11. The parquet data is fetched from Oracle Storage
Cloud Service using the Swift driver.
Here is the error on the Drill command prompt:
Error: DATA_READ ERROR: Exception occurred while reading from disk.

File: /data1GBparquet/storereturns/part-00000-7ce26fde-f342-4aae-a727-71b8b7a60e63.parquet
Column: sr_return_time_sk
Row Group Start: 417866
Fragment 0:0
Post by Kunal Khatua
You could try uploading to Google Drive (since you have a Gmail account)
and share the link.
Did Parth's suggestion of
store.parquet.reader.pagereader.bufferedread=false
resolve the issue?
Also share the details of the hardware setup... #nodes, Hadoop version, etc.
-----Original Message-----
Sent: Sunday, October 15, 2017 8:07 AM
Subject: Re: Exception while reading parquet data
Is there any place where I can upload the 12 MB parquet data? I am not
able to send the file through mail to the user group.
Post by Parth Chandra
Seems like a bug in BufferedDirectBufInputStream. Is it possible to
share a minimal data file that triggers this?
You can also try turning off the buffering reader.
store.parquet.reader.pagereader.bufferedread=false
With async reader on and buffering off, you might not see any
degradation in performance in most cases.
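For reference, the two session options mentioned in this thread (the option names are exactly as quoted above; this just collects them in one place) can be toggled per session:

```sql
-- Disable the async page reader (Arjun's suggestion):
ALTER SESSION SET `store.parquet.reader.pagereader.async` = false;
-- Disable the buffering reader (Parth's suggestion):
ALTER SESSION SET `store.parquet.reader.pagereader.bufferedread` = false;
```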
Post by PROJJWAL SAHA
Hi,
Disabling the async parquet reader doesn't solve the problem. I am getting
a similar exception. I don't see any issue with the parquet file, since
the same file works when loaded on Alluxio.
2017-10-12 04:19:50,502 [2620da63-4efb-47e2-5e2c-29b48c0194c0:frag:0:0] ERROR o.a.d.e.u.f.BufferedDirectBufInputStream - Error reading from stream
null
2017-10-12 04:19:50,506 [2620da63-4efb-47e2-5e2c-29b48c0194c0:frag:0:0] ERROR
IndexOutOfBoundsException
[Error Id: 3b7c4587-c1b8-4e79-bdaa-b2aa1516275b ]
at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:550) ~[drill-common-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:249) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext(RemovingRecordBatch.java:93) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:133) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.buildSchema(HashAggBatch.java:111) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.buildSchema(ExternalSortBatch.java:264) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext(RemovingRecordBatch.java:93) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:133) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.limit.LimitRecordBatch.innerNext(LimitRecordBatch.java:115) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext(RemovingRecordBatch.java:93) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:133) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:105) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext(ScreenCreator.java:81) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:95) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:234) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:227) [drill-java-exec-1.11.0.jar:1.11.0]
at java.security.AccessController.doPrivileged(Native Method) [na:1.8.0_121]
at javax.security.auth.Subject.doAs(Subject.java:422) [na:1.8.0_121]
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) [hadoop-common-2.7.1.jar:na]
at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:227) [drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-1.11.0.jar:1.11.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_121]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_121]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
Error in parquet record reader.
/data1GBparquet/storereturns/part-00000-7ce26fde-f342-4aae-a727-71b8b7a60e63.parquet
Total records read: 0
Row group index: 0
Records in row group: 287514
Parquet Metadata: ParquetMetaData{FileMetaData{schema: message spark_schema {
optional int32 sr_returned_date_sk;
optional int32 sr_return_time_sk;
optional int32 sr_item_sk;
optional int32 sr_customer_sk;
optional int32 sr_cdemo_sk;
optional int32 sr_hdemo_sk;
optional int32 sr_addr_sk;
optional int32 sr_store_sk;
optional int32 sr_reason_sk;
optional int32 sr_ticket_number;
optional int32 sr_return_quantity;
optional double sr_return_amt;
optional double sr_return_tax;
optional double sr_return_amt_inc_tax;
optional double sr_fee;
optional double sr_return_ship_cost;
optional double sr_refunded_cash;
optional double sr_reversed_charge;
optional double sr_store_credit;
optional double sr_net_loss;
{org.apache.spark.sql.parquet.row.metadata={"type":"struct",
"fields":[{"name":"sr_returned_date_sk","type":"
integer","nullable":true,"
metadata":{}},{"name":"sr_return_time_sk","type":"
integer","nullable":true,"metadata":{}},{"name":"sr_
item_sk","type":"integer","nullable":true,"metadata":{}},
true,"metadata":{}},{"name":"sr_cdemo_sk","type":"integer",
"integer","nullable":true,"metadata":{}},{"name":"sr_
addr_sk","type":"integer","nullable":true,"metadata":{}},
{"name":"sr_store_sk","type":"integer","nullable":true,"
metadata":{}},{"name":"sr_reason_sk","type":"integer","
nullable":true,"metadata":{}},{"name":"sr_ticket_number","
type":"integer","nullable":true,"metadata":{}},{"name":"
sr_return_quantity","type":"integer","nullable":true,"
metadata":{}},{"name":"sr_return_amt","type":"double","
nullable":true,"metadata":{}},{"name":"sr_return_tax","type"
:"double","nullable":true,"metadata":{}},{"name":"sr_
return_amt_inc_tax","type":"double","nullable":true,"
true,"metadata":{}},{"name":"sr_return_ship_cost","type":"
double","nullable":true,"metadata":{}},{"name":"sr_
refunded_cash","type":"double","nullable":true,"metadata":{}
true,"metadata":{}},{"name":"sr_store_credit","type":"
double","nullable":true,"metadata":{}},{"name":"sr_net_
loss","type":"double","nullable":true,"metadata":{}},
{"name":"sr_dummycol","type":"string","nullable":true,"
metadata":{}}]}}},
blocks: [BlockMetaData{287514, 18570101 [ColumnMetaData{UNCOMPRESSED
[sr_returned_date_sk] INT32 [RLE, PLAIN_DICTIONARY, BIT_PACKED],
4}, ColumnMetaData{UNCOMPRESSED [sr_return_time_sk] INT32 [RLE,
PLAIN_DICTIONARY, BIT_PACKED], 417866}, ColumnMetaData{UNCOMPRESSED
[sr_item_sk] INT32 [RLE, PLAIN_DICTIONARY, BIT_PACKED], 1096347},
ColumnMetaData{UNCOMPRESSED [sr_customer_sk] INT32 [RLE,
PLAIN_DICTIONARY, BIT_PACKED], 1708118}, ColumnMetaData{UNCOMPRESSED
[sr_cdemo_sk] INT32 [RLE, PLAIN, BIT_PACKED], 2674001},
ColumnMetaData{UNCOMPRESSED [sr_hdemo_sk] INT32 [RLE,
PLAIN_DICTIONARY, BIT_PACKED], 3812205}, ColumnMetaData{UNCOMPRESSED
[sr_addr_sk] INT32 [RLE, PLAIN_DICTIONARY, BIT_PACKED], 4320246},
ColumnMetaData{UNCOMPRESSED [sr_store_sk] INT32 [RLE,
PLAIN_DICTIONARY, BIT_PACKED], 5102635}, ColumnMetaData{UNCOMPRESSED
[sr_reason_sk] INT32 [RLE, PLAIN_DICTIONARY, BIT_PACKED], 5235151},
ColumnMetaData{UNCOMPRESSED [sr_ticket_number] INT32 [RLE, PLAIN,
BIT_PACKED], 5471579}, ColumnMetaData{UNCOMPRESSED
[sr_return_quantity] INT32 [RLE, PLAIN_DICTIONARY, BIT_PACKED],
6621731}, ColumnMetaData{UNCOMPRESSED [sr_return_amt] DOUBLE [RLE,
PLAIN_DICTIONARY, BIT_PACKED], 6893357}, ColumnMetaData{UNCOMPRESSED
[sr_return_tax] DOUBLE [RLE, PLAIN_DICTIONARY, BIT_PACKED],
8419465}, ColumnMetaData{UNCOMPRESSED [sr_return_amt_inc_tax] DOUBLE
[RLE, PLAIN, PLAIN_DICTIONARY, BIT_PACKED], 9201856},
ColumnMetaData{UNCOMPRESSED [sr_fee] DOUBLE [RLE, PLAIN_DICTIONARY,
BIT_PACKED], 11366007}, ColumnMetaData{UNCOMPRESSED
[sr_return_ship_cost] DOUBLE [RLE, PLAIN_DICTIONARY, BIT_PACKED],
11959880}, ColumnMetaData{UNCOMPRESSED [sr_refunded_cash] DOUBLE
[RLE, PLAIN_DICTIONARY, BIT_PACKED], 13218730},
ColumnMetaData{UNCOMPRESSED [sr_reversed_charge] DOUBLE [RLE,
PLAIN_DICTIONARY, BIT_PACKED], 14635937},
ColumnMetaData{UNCOMPRESSED [sr_store_credit] DOUBLE [RLE,
PLAIN_DICTIONARY, BIT_PACKED], 15824898},
ColumnMetaData{UNCOMPRESSED [sr_net_loss] DOUBLE [RLE,
PLAIN_DICTIONARY, BIT_PACKED], 17004301}, ColumnMetaData{UNCOMPRESSED
[sr_dummycol] BINARY [RLE, PLAIN, BIT_PACKED], 18570072}]}]}
at org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.handleException(ParquetRecordReader.java:272) ~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.next(ParquetRecordReader.java:299) ~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:180) [drill-java-exec-1.11.0.jar:1.11.0]
Caused by: java.io.IOException: java.lang.IndexOutOfBoundsException
at org.apache.drill.exec.util.filereader.BufferedDirectBufInputStream.getNextBlock(BufferedDirectBufInputStream.java:185) ~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.util.filereader.BufferedDirectBufInputStream.readInternal(BufferedDirectBufInputStream.java:212) ~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.util.filereader.BufferedDirectBufInputStream.read(BufferedDirectBufInputStream.java:277) ~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.util.filereader.DirectBufInputStream.getNext(DirectBufInputStream.java:111) ~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.PageReader.readPage(PageReader.java:216) ~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.PageReader.nextInternal(PageReader.java:283) ~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.PageReader.next(PageReader.java:307) ~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.NullableColumnReader.processPages(NullableColumnReader.java:69) ~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.BatchReader.readAllFixedFieldsSerial(BatchReader.java:63) ~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.BatchReader.readAllFixedFields(BatchReader.java:56) ~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.BatchReader$FixedWidthReader.readRecords(BatchReader.java:143) ~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.BatchReader.readBatch(BatchReader.java:42) ~[drill-java-exec-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.next(ParquetRecordReader.java:297) ~[drill-java-exec-1.11.0.jar:1.11.0]
Caused by: java.lang.IndexOutOfBoundsException: null
at java.nio.Buffer.checkBounds(Buffer.java:567) ~[na:1.8.0_121]
at java.nio.ByteBuffer.put(ByteBuffer.java:827) ~[na:1.8.0_121]
at java.nio.DirectByteBuffer.put(DirectByteBuffer.java:379) ~[na:1.8.0_121]
at org.apache.parquet.hadoop.util.CompatibilityUtil.getBuf(CompatibilityUtil.java:110) ~[parquet-hadoop-1.8.1-drill-r0.jar:1.8.1-drill-r0]
at org.apache.drill.exec.util.filereader.BufferedDirectBufInputStream.getNextBlock(BufferedDirectBufInputStream.java:182) ~[drill-java-exec-1.11.0.jar:1.11.0]
... 73 common frames omitted
2017-10-12 04:19:50,506 [2620da63-4efb-47e2-5e2c-29b48c0194c0:frag:0:0] INFO o.a.d.e.w.fragment.FragmentExecutor - 2620da63-4efb-47e2-5e2c-29b48c0194c0:0:0: State change requested RUNNING --> FAILED
2017-10-12 04:19:50,507 [2620da63-4efb-47e2-5e2c-29b48c0194c0:frag:0:0] INFO o.a.d.e.w.fragment.FragmentExecutor - 2620da63-4efb-47e2-5e2c-29b48c0194c0:0:0: State change requested FAILED --> FINISHED
2017-10-12 04:19:50,533 [BitServer-2] WARN o.a.drill.exec.work.foreman.Foreman - Dropping request to move to COMPLETED state as query is already at FAILED state (which is terminal).
2017-10-12 04:19:50,533 [BitServer-2] WARN o.a.d.e.w.b.ControlMessageHandler - Dropping request to cancel fragment. 2620da63-4efb-47e2-5e2c-29b48c0194c0:0:0 does not exist.