Dropping default location database in hive deletes everything
search cancel

Dropping default location database in hive deletes everything

book

Article ID: 294620

calendar_today

Updated On:

Products

Services Suite

Issue/Introduction

Symptoms:
Show databases in Hive using the following command:
hive> show databases;
OK
default
sample1
Time taken: 0.024 seconds, Fetched: 2 row(s)

Create a database explicitly with the default location:

hive> create database sample location '/hive/gphd/warehouse';
OK
Time taken: 0.059 seconds

Check the location of the database in metastore UI (Postgres or MYSQL). Observe that the sample.db directory is not created but it is created under default directory.

metastore=# select * from "DBS";
 DB_ID | DESC | DB_LOCATION_URI | NAME
-------+-----------------------+--------------------------------------------+---------
 1 | Default Hive database | hdfs://hulk/hive/gphd/warehouse | default
 22 | | hdfs://hulk/hive/gphd/warehouse | sample
 23 | | hdfs://hulk/hive/gphd/warehouse/sample1.db | sample1
(3 rows)

 Check the new database created and the corresponding tables:

[root@hdm1 ~]# hadoop fs -ls /hive/gphd/warehouse/
Found 1 items
drwxr-xr-x - hive hadoop 0 2015-02-26 13:46 /hive/gphd/warehouse/sample1.db

Drop the database sample with the following command. Cascade removes all the tables underneath the database.

hive> drop database sample cascade;
OK
Time taken: 0.06 seconds

Check the data in HDFS using the following command. This removes all the databases and corresponding tables from HDFS.

[root@hdm1 ~]# hadoop fs -ls /hive/gphd/warehouse/
ls: `/hive/gphd/warehouse/': No such file or directory

Confirm that metastore still contains the metadata for the deleted databases. The metadata needs to be removed manually:

metastore=# select * from "DBS";
 DB_ID | DESC | DB_LOCATION_URI | NAME
-------+-----------------------+--------------------------------------------+---------
 1 | Default Hive database | hdfs://hulk/hive/gphd/warehouse | default
 23 | | hdfs://hulk/hive/gphd/warehouse/sample1.db | sample1
(2 rows)

Environment


Cause

In Hive, it is not possible to alter the database location, nor update the existing properties.

Resolution

Never explicitly create a database in the default Hive location. A workaround for this problem is to update the value of the database location in the Postgresql metastore database as follows: 

[root@hdm1 ~]# psql -p 10432 -U postgres
psql (8.4.20)
Type "help" for help.
postgres=# \c metastore
psql (8.4.20)
You are now connected to database "metastore".
postgres=# UPDATE "DBS" SET DB_LOCATION_URI = '/hive/gphd/warehouse/<db_name>.db' WHERE NAME = '<db_name>';