HIVE-29241: Distinguish the default location of the database by catalog #6267
base: master
Pull request overview
This pull request introduces catalog-aware warehouse paths for Hive metastore, allowing databases and tables under non-default catalogs to have distinct storage locations. The changes ensure that catalog locations are properly isolated while maintaining backward compatibility for the default Hive catalog.
Changes:
- Added new configuration parameters (WAREHOUSE_CATALOG and WAREHOUSE_CATALOG_EXTERNAL) for catalog-specific warehouse directories
- Modified catalog creation to require a 'type' parameter and made location optional with automatic defaults
- Updated database path resolution throughout the codebase to use catalog-aware logic via new Warehouse methods that accept catalog names
Reviewed changes
Copilot reviewed 23 out of 23 changed files in this pull request and generated 21 comments.
| File | Description |
|---|---|
| MetastoreConf.java | Added WAREHOUSE_CATALOG and WAREHOUSE_CATALOG_EXTERNAL configuration variables |
| HiveConf.java | Added corresponding Hive configuration variables for catalog warehouses |
| Warehouse.java | Enhanced path resolution methods to accept catalog names; added deprecated markers for old methods |
| CatalogUtil.java | New utility class for catalog type validation (NATIVE, ICEBERG) |
| CreateCatalogAnalyzer.java | Updated to validate catalog type property and make location optional |
| CreateCatalogOperation.java | Modified to generate default location from WAREHOUSE_CATALOG config |
| CreateDatabaseAnalyzer.java | Added validation to check if catalog supports database creation |
| CreateDatabaseOperation.java | Updated path building logic to include catalog names for non-default catalogs |
| DDLUtils.java | Added helper method to check catalog support for database creation |
| HiveParser.g | Modified grammar to make catalog location optional and properties required |
| EximUtil.java | Updated to use catalog-aware warehouse paths for replication |
| Multiple test files | Updated test queries to include required 'type' property in CREATE CATALOG statements |
| HMSHandler.java | Updated default catalog and database creation to use catalog-aware paths |
| MetastoreDefaultTransformer.java | Updated to pass Database object instead of just name |
| Migration/Authorization files | Updated to use new Database-based path methods |
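The type validation that the summary attributes to CatalogUtil.java could look roughly like the sketch below. It is not taken from the PR: the class name CatalogTypeSketch, the method requireType, the error messages, and the case-insensitive matching are all assumptions made for illustration. Only the 'type' property key and the NATIVE and ICEBERG values come from the text above.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch of catalog type validation; not the actual CatalogUtil code.
final class CatalogTypeSketch {

    // The two catalog types named in the review summary.
    enum CatalogType { NATIVE, ICEBERG }

    private CatalogTypeSketch() {}

    // Reads the 'type' property and maps it to a known catalog type.
    // Assumption: matching is case-insensitive and a missing type is an error.
    static CatalogType requireType(Map<String, String> properties) {
        String raw = properties.get("type");
        if (raw == null || raw.trim().isEmpty()) {
            throw new IllegalArgumentException("CREATE CATALOG requires a 'type' property");
        }
        try {
            return CatalogType.valueOf(raw.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            throw new IllegalArgumentException("Unsupported catalog type: " + raw);
        }
    }

    public static void main(String[] args) {
        // Mirrors PROPERTIES('type'='NATIVE') from a CREATE CATALOG statement.
        System.out.println(requireType(Map.of("type", "NATIVE"))); // prints NATIVE
    }
}
```

Rejecting an unknown or missing type up front means later steps, such as deciding whether a default location is needed, only ever see a known type.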
5940a60 to 49c343a

What changes were proposed in this pull request?
The main objectives of this PR are:
- Allow the location to be omitted when creating a catalog. This also requires identifying catalog types that do not need a location, such as JDBC catalogs.
- For newly created catalogs, the location of their databases is determined by two new parameters: metastore.warehouse.catalog.dir and metastore.warehouse.catalog.external.dir. This keeps the locations of the default Hive catalog's databases and tables unaffected. For example, if metastore.warehouse.catalog.dir is hdfs://ns1/testdir, the location of a newly created catalog named testcat is hdfs://ns1/testdir/testcat, and the default path of a database testdb created under that catalog is hdfs://ns1/testdir/testcat/testdb.
- A type parameter must be specified when creating a catalog. The type determines whether the catalog's databases and tables require a location, and whether the catalog supports creating databases and tables at all. For example:
  CREATE CATALOG test_cat COMMENT 'Hive test catalog' PROPERTIES('type'='NATIVE');

Why are the changes needed?
The paths of databases and tables created under a non-default catalog may interfere with those under the default Hive catalog. We need to introduce parameters to distinguish the paths for databases and tables under non-default catalogs, ensuring that the paths of the default Hive catalog remain unaffected.
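The path layout from the PR description's example can be sketched as follows. This is a minimal illustration, not the PR's Warehouse code: the class CatalogPathSketch and its methods are hypothetical, plain string joining stands in for Hadoop's Path handling, and the default Hive catalog (which keeps its existing warehouse directory) is deliberately not modeled.

```java
// Hypothetical sketch of catalog-aware default locations; not the actual Warehouse API.
final class CatalogPathSketch {

    private CatalogPathSketch() {}

    // Joins a parent directory and a child name with exactly one slash between them.
    static String join(String parent, String child) {
        String base = parent.endsWith("/")
            ? parent.substring(0, parent.length() - 1)
            : parent;
        return base + "/" + child;
    }

    // Default location of a non-default catalog:
    // <metastore.warehouse.catalog.dir>/<catalog name>
    static String catalogLocation(String catalogWarehouseDir, String catalogName) {
        return join(catalogWarehouseDir, catalogName);
    }

    // Default location of a database under a non-default catalog:
    // <metastore.warehouse.catalog.dir>/<catalog name>/<database name>
    static String databaseLocation(String catalogWarehouseDir, String catalogName, String dbName) {
        return join(catalogLocation(catalogWarehouseDir, catalogName), dbName);
    }

    public static void main(String[] args) {
        // Values taken from the example in the PR description.
        String dir = "hdfs://ns1/testdir";
        System.out.println(catalogLocation(dir, "testcat"));            // hdfs://ns1/testdir/testcat
        System.out.println(databaseLocation(dir, "testcat", "testdb")); // hdfs://ns1/testdir/testcat/testdb
    }
}
```

Because every non-default catalog gets its own subdirectory under a separate root, two databases with the same name in different catalogs resolve to different paths, and neither lands under the default Hive catalog's warehouse directory.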
Does this PR introduce any user-facing change?
No
How was this patch tested?
Existing tests.