Discussion:
Drill “VALIDATION ERROR: A table or view with given name already exists in schema” for empty directory
Reed Villanueva
2018-12-04 20:50:15 UTC
After upgrading Drill on our cluster to drill-1.12.0-mapr, I tested our
daily ETL scripts (which all use Drill to convert Parquet files to TSV).
A validation error ("*table or view with given name already exists*") is
now always thrown when running a `CREATE TABLE` statement against some
empty directories in a writable workspace.


[Error Id: 6ea46737-8b6a-4887-a671-4bddbea02476 on mapr002.ucera.local:31010]
at org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:489)
at org.apache.drill.jdbc.impl.DrillCursor.loadInitialSchema(DrillCursor.java:561)
:
:
:
Caused by: org.apache.drill.common.exceptions.UserRemoteException: VALIDATION ERROR: A table or view with given name [/internal_etl/project/version-2/stages/storage/ACCOUNT/tsv] already exists in schema [dfs.etl_internal]


After some brief debugging, I see that the directory in question under the
workspace (i.e. /internal_etl/project/version-2/stages/storage/ACCOUNT/tsv)
*is in fact empty*, yet the statement still throws this error.

Looking up the error in the drillbit.log file on the node named in the
error message above, we see:

2018-12-04 10:13:25,285 [23f92019-db56-862f-e7b9-cd51b3e174ae:foreman] INFO o.a.drill.exec.work.foreman.Foreman - Query text for query id 23f92019-db56-862f-e7b9-cd51b3e174ae: create table dfs.etl_internal.`/internal_etl/project/version-2/stages/storage/ACCOUNT/tsv`
as
select <a bunch of fields>
from dfs.etl_internal.`/internal_etl/project/version-2/stages/storage/ACCOUNT/parquet`
2018-12-04 10:13:25,406 [23f92019-db56-862f-e7b9-cd51b3e174ae:foreman] INFO o.a.d.exec.store.dfs.FileSelection - FileSelection.getStatuses() took 0 ms, numFiles: 1
2018-12-04 10:13:25,408 [23f92019-db56-862f-e7b9-cd51b3e174ae:foreman] INFO o.a.d.exec.store.dfs.FileSelection - FileSelection.getStatuses() took 0 ms, numFiles: 1
2018-12-04 10:13:25,893 [23f92019-db56-862f-e7b9-cd51b3e174ae:foreman] INFO o.a.d.exec.store.dfs.FileSelection - FileSelection.getStatuses() took 0 ms, numFiles: 1
2018-12-04 10:13:25,894 [23f92019-db56-862f-e7b9-cd51b3e174ae:foreman] INFO o.a.d.exec.store.dfs.FileSelection - FileSelection.getStatuses() took 0 ms, numFiles: 1
2018-12-04 10:13:25,898 [23f92019-db56-862f-e7b9-cd51b3e174ae:foreman] INFO o.a.d.exec.store.dfs.FileSelection - FileSelection.getStatuses() took 0 ms, numFiles: 1
2018-12-04 10:13:25,898 [23f92019-db56-862f-e7b9-cd51b3e174ae:foreman] INFO o.a.d.exec.store.dfs.FileSelection - FileSelection.getStatuses() took 0 ms, numFiles: 1
2018-12-04 10:13:25,905 [23f92019-db56-862f-e7b9-cd51b3e174ae:foreman] INFO o.a.d.e.p.s.h.CreateTableHandler - User Error Occurred: A table or view with given name [/internal_etl/project/version-2/stages/storage/ACCOUNT/tsv] already exists in schema [dfs.etl_internal]
org.apache.drill.common.exceptions.UserException: VALIDATION ERROR: A table or view with given name [/internal_etl/project/version-2/stages/storage/ACCOUNT/tsv] already exists in schema [dfs.etl_internal]

[Error Id: 45177abc-7e9f-4678-959f-f9e0e38bc564 ]
at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:586) ~[drill-common-1.12.0-mapr.jar:1.12.0-mapr]
at org.apache.drill.exec.planner.sql.handlers.CreateTableHandler.checkTableCreationPossibility(CreateTableHandler.java:326) [drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
at org.apache.drill.exec.planner.sql.handlers.CreateTableHandler.getPlan(CreateTableHandler.java:90) [drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
at org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan(DrillSqlWorker.java:131) [drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
at org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:79) [drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:567) [drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:264) [drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_151]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_151]
at java.lang.Thread.run(Thread.java:748) [na:1.8.0_151]
2018-12-04 10:13:25,924 [23f92019-db56-862f-e7b9-cd51b3e174ae:foreman] INFO o.apache.drill.exec.work.WorkManager - Waiting for 0 queries to complete before shutting down
2018-12-04 10:13:25,924 [23f92019-db56-862f-e7b9-cd51b3e174ae:foreman] INFO o.apache.drill.exec.work.WorkManager - Waiting for 0 running fragments to complete before shutting down

This error occurs even when running
`DROP TABLE [IF EXISTS] <workspace>.<table path name>` before the
`CREATE TABLE` statement. Furthermore, the configuration of the dfs
workspace itself does not appear to have changed since before the upgrade
to drill-1.12; see below:

:
:
"workspaces": {
  "root": {
    "location": "/",
    "writable": false,
    "defaultInputFormat": null,
    "allowAccessOutsideWorkspace": false
  },
  "tmp": {
    "location": "/tmp",
    "writable": true,
    "defaultInputFormat": null,
    "allowAccessOutsideWorkspace": false
  },
  "etl_internal": {
    "location": "/etl/internal",
    "writable": true,
    "defaultInputFormat": null,
    "allowAccessOutsideWorkspace": false
  }
},
:
:
Note that the full process in question is intended to `mv` the directory
contents every day and then run `CREATE TABLE` with new data from the
current day (in case that makes a difference). This process had been
working fine when we were using drill-1.11.

If anyone with more experience using Drill knows what could be happening
here, any opinions or advice would be appreciated.
--
This electronic message is intended only for the named recipient, and may
contain information that is confidential or privileged. If you are not the
intended recipient, you are hereby notified that any disclosure, copying,
distribution or use of the contents of this message is strictly
prohibited. If you have received this message in error or are not the
named recipient, please notify us immediately by contacting the sender at
the electronic mail address noted above, and delete and destroy all copies
of this message. Thank you.
Vitalii Diravka
2018-12-06 00:51:03 UTC
Hi Reed,

It looks like a bug. Could you please create a Jira ticket with the above
description?
https://issues.apache.org/jira/projects/DRILL/issues

Kind regards
Vitalii
Khurram Faraaz
2018-12-06 01:22:58 UTC
Vitalii, this could be related to
https://issues.apache.org/jira/browse/DRILL-2775

Regards,
Khurram
Vitalii Diravka
2018-12-06 15:17:04 UTC
@Khurram Thank you for pointing out the Jira ticket. It describes a
different problem, and that problem is no longer an issue, so I have
resolved the ticket.

@Reed I looked into your case, and it appears to be expected behavior. An
empty directory can be a regular (but "schemaless") Drill table, so you
can't create a table with the same name under the same workspace.
If you specify the empty directory as a workspace for Drill, then you can
create new tables inside it.
I have also checked that drill-1.11.0-mapr and drill-1.10.0-mapr behave
the same way.

For more on querying empty directories, see:
https://drill.apache.org/docs/data-sources-and-file-formats-introduction/#schemaless-tables
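
To illustrate the point about empty directories, here is a minimal sketch
(not taken from this thread) of a step an ETL script could run before
`CREATE TABLE`. It assumes the workspace is reachable as an ordinary POSIX
path, for example through an NFS mount, and the function name
`clear_empty_target` is hypothetical. It removes the target directory only
when it exists and is empty, which presumably leaves Drill with no
schemaless table of that name to collide with; a directory that holds data
is left alone.

```python
import os


def clear_empty_target(path):
    """Remove `path` if it is an existing, empty directory.

    Returns True when the directory was removed or did not exist, and
    False when it holds entries and was left untouched.
    Assumption: `path` is a local or POSIX-mounted directory path.
    """
    if not os.path.exists(path):
        return True  # nothing to clear
    if not os.path.isdir(path):
        raise NotADirectoryError(path)
    if os.listdir(path):
        return False  # directory holds data; do not delete it
    os.rmdir(path)  # os.rmdir only succeeds on an empty directory
    return True
```

A script would call this on the target path and issue the `CREATE TABLE`
statement only when it returns True.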

Kind regards
Vitalii
Vitalii Diravka
2018-12-06 15:36:55 UTC
Reed,

I see you have asked the same question on Stack Overflow [1] and found the
root cause of the problem.
I have added a comment there.

[1]
https://stackoverflow.com/questions/53604950/drill-validation-error-a-table-or-view-with-given-name-already-exists-in-schem/53654748#53654748

Kind regards
Vitalii
Post by Vitalii Diravka
@Khurram Thank you for pointing out the Jira ticket. It differs and it is
no more the issue. I have resolved the ticket.
@Reed I looked to your case and it looks like it is expected
behavior. Empty directory can be regular (but "schemaless") Drill table.
So you can't create the table with the same name under the same workspace.
If you specify the empty directory as a workspace for Drill, then you can
create new tables inside it.
I have also checked that the same behavior was for drill-1.11.0-mapr and
drill-1.10.0-mapr versions.
https://drill.apache.org/docs/data-sources-and-file-formats-introduction/#schemaless-tables
Kind regards
Vitalii
Post by Khurram Faraaz
Vitalii, this could be related to
https://issues.apache.org/jira/browse/DRILL-2775
Regards,
Khurram
Post by Vitalii Diravka
Hi Reed,
It looks like a bug. Could you please create a jira ticket with an above
description?
https://urldefense.proofpoint.com/v2/url?u=https-3A__issues.apache.org_jira_projects_DRILL_issues&d=DwIBaQ&c=cskdkSMqhcnjZxdQVpwTXg&r=H5JEl9vb-mBIjic10QAbDD2vkUUKAxjO6wZO322RtdI&m=WD8U6gAg6W1XWQoDvaWyzjdcvuEAdui_jCDb-8JOQhM&s=gNgq2I5icxbflP6fLgeNK5U2fF8N9vGBuokgEb03H6I&e=
Post by Vitalii Diravka
Kind regards
Vitalii
Post by Reed Villanueva
After upgrading drill on our cluster to drill-1.12.0-mapr, testing our
daily ETL scripts (which all use drill for converting parquet files to
tsv), a validation error ("*table or view with given name already
exists*")
Post by Reed Villanueva
is always thrown when trying to run a `CREATE TABLE` statement on some
empty directories in a writable workspace.
[Error Id: 6ea46737-8b6a-4887-a671-4bddbea02476 on
mapr002.ucera.local:31010]
at
org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:489)
Post by Vitalii Diravka
Post by Reed Villanueva
at
org.apache.drill.jdbc.impl.DrillCursor.loadInitialSchema(DrillCursor.java:561)
Post by Vitalii Diravka
Post by Reed Villanueva
VALIDATION ERROR: A table or view with given name
[/internal_etl/project/version-2/stages/storage/ACCOUNT/tsv] already
exists
Post by Reed Villanueva
in schema [dfs.etl_internal]
After some brief debugging, I see that the directory in question under
the
Post by Reed Villanueva
workspace (ie.
/internal_etl/project/version-2/stages/storage/ACCOUNT/tsv)
Post by Reed Villanueva
*is in fact empty*, yet still throwing these errors.
Looking for the error ID in the drillbit.log file in the associated
node
Post by Vitalii Diravka
in
Post by Reed Villanueva
the error message above, we see
2018-12-04 10:13:25,285 [23f92019-db56-862f-e7b9-cd51b3e174ae:foreman] INFO o.a.drill.exec.work.foreman.Foreman - Query text for query id 23f92019-db56-862f-e7b9-cd51b3e174ae: create table
dfs.etl_internal.`/internal_etl/project/version-2/stages/storage/ACCOUNT/tsv`
as
select <a bunch of fields>
from
dfs.etl_internal.`/internal_etl/project/version-2/stages/storage/ACCOUNT/parquet`
2018-12-04 10:13:25,406 [23f92019-db56-862f-e7b9-cd51b3e174ae:foreman] INFO o.a.d.exec.store.dfs.FileSelection - FileSelection.getStatuses() took 0 ms, numFiles: 1
2018-12-04 10:13:25,408 [23f92019-db56-862f-e7b9-cd51b3e174ae:foreman] INFO o.a.d.exec.store.dfs.FileSelection - FileSelection.getStatuses() took 0 ms, numFiles: 1
2018-12-04 10:13:25,893 [23f92019-db56-862f-e7b9-cd51b3e174ae:foreman] INFO o.a.d.exec.store.dfs.FileSelection - FileSelection.getStatuses() took 0 ms, numFiles: 1
2018-12-04 10:13:25,894 [23f92019-db56-862f-e7b9-cd51b3e174ae:foreman] INFO o.a.d.exec.store.dfs.FileSelection - FileSelection.getStatuses() took 0 ms, numFiles: 1
2018-12-04 10:13:25,898 [23f92019-db56-862f-e7b9-cd51b3e174ae:foreman] INFO o.a.d.exec.store.dfs.FileSelection - FileSelection.getStatuses() took 0 ms, numFiles: 1
2018-12-04 10:13:25,898 [23f92019-db56-862f-e7b9-cd51b3e174ae:foreman] INFO o.a.d.exec.store.dfs.FileSelection - FileSelection.getStatuses() took 0 ms, numFiles: 1
2018-12-04 10:13:25,905 [23f92019-db56-862f-e7b9-cd51b3e174ae:foreman] INFO o.a.d.e.p.s.h.CreateTableHandler - User Error Occurred: A table or view with given name [/internal_etl/project/version-2/stages/storage/ACCOUNT/tsv] already exists in schema [dfs.etl_internal]
org.apache.drill.common.exceptions.UserException: VALIDATION ERROR: A table or view with given name [/internal_etl/project/version-2/stages/storage/ACCOUNT/tsv] already exists in schema [dfs.etl_internal]
[Error Id: 45177abc-7e9f-4678-959f-f9e0e38bc564 ]
at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:586) ~[drill-common-1.12.0-mapr.jar:1.12.0-mapr]
at org.apache.drill.exec.planner.sql.handlers.CreateTableHandler.checkTableCreationPossibility(CreateTableHandler.java:326) [drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
at org.apache.drill.exec.planner.sql.handlers.CreateTableHandler.getPlan(CreateTableHandler.java:90) [drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
at org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan(DrillSqlWorker.java:131) [drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
at org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:79) [drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:567) [drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:264) [drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_151]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_151]
at java.lang.Thread.run(Thread.java:748) [na:1.8.0_151]
2018-12-04 10:13:25,924 [23f92019-db56-862f-e7b9-cd51b3e174ae:foreman] INFO o.apache.drill.exec.work.WorkManager - Waiting for 0 queries to complete before shutting down
2018-12-04 10:13:25,924 [23f92019-db56-862f-e7b9-cd51b3e174ae:foreman] INFO o.apache.drill.exec.work.WorkManager - Waiting for 0 running fragments to complete before shutting down
This error occurs even when using `DROP TABLE [IF EXISTS]
<workspace>.<table path name>` before the `CREATE TABLE` statement.
Furthermore, the configurations for the dfs workspace itself does not
"workspaces": {
"root": {
"location": "/",
"writable": false,
"defaultInputFormat": null,
"allowAccessOutsideWorkspace": false
},
"tmp": {
"location": "/tmp",
"writable": true,
"defaultInputFormat": null,
"allowAccessOutsideWorkspace": false
},
"etl_internal": {
"location": "/etl/internal",
"writable": true,
"defaultInputFormat": null,
"allowAccessOutsideWorkspace": false
}
},
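One thing that may be worth double-checking from the config above is which filesystem directory the table name actually maps to. The sketch below is only an illustration, not Drill code: it assumes the table name is appended to the workspace "location" with its leading slash dropped, and `resolve_table_path` is a hypothetical helper name.

```python
import json
import posixpath

# Workspace settings copied from the storage plugin config quoted above
# (only the two fields this sketch needs).
WORKSPACES = json.loads("""
{
  "root": {"location": "/", "writable": false},
  "tmp": {"location": "/tmp", "writable": true},
  "etl_internal": {"location": "/etl/internal", "writable": true}
}
""")


def resolve_table_path(workspaces, workspace, table_name):
    """Return the filesystem path a table name would map to.

    Assumption (not confirmed in this thread): the table name is joined
    onto the workspace location after dropping its leading slash.
    Hypothetical helper, not part of any Drill API.
    """
    location = workspaces[workspace]["location"]
    return posixpath.join(location, table_name.lstrip("/"))


def is_writable(workspaces, workspace):
    """Report whether the workspace is marked writable in the config."""
    return bool(workspaces[workspace]["writable"])


if __name__ == "__main__":
    table = "/internal_etl/project/version-2/stages/storage/ACCOUNT/tsv"
    print(resolve_table_path(WORKSPACES, "etl_internal", table))
    print(is_writable(WORKSPACES, "etl_internal"))
```

Under that assumption, the directory to inspect is the resolved path under /etl/internal rather than the bare table name, so it is worth confirming that this is the directory that was checked for emptiness.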
Note that the full process in question is intended to `mv` the directory
contents every day and `CREATE TABLE` with new data from current day (in
case that makes a difference) and this process had been working fine when
we were using drill-1.11.
If anyone with more experience using drill knows what could be happening
here, any opinions or advice would be appreciated.