HIVE-28437: Add documentation for initializing the system schemas for HiveServer2 for Docker Image
linghengqian committed Feb 10, 2025
1 parent 9002aba commit 6c77f0c
Showing 2 changed files with 60 additions and 2 deletions.
58 changes: 58 additions & 0 deletions packaging/src/docker/README.md
@@ -210,3 +210,61 @@ docker compose down
select count(distinct a) from hive_example;
select sum(b) from hive_example;
```

#### `sys` Schema and `information_schema` Schema

The Hive Schema Tool is located in the Docker image at `/opt/hive/bin/schematool`.

By default, the system schemas for HiveServer2 (`sys` and `information_schema`) are not created.
To create them for a HiveServer2 instance,
configure HiveServer2 to use a remote Hive Metastore Server, and configure the Hive Metastore Server to use a database other than embedded Derby.

Assuming `Maven` and `Docker CE` are installed, one possible setup is as follows.
Create a `compose.yaml` file in the current directory:

```yaml
services:
  some-postgres:
    image: postgres:17.2-bookworm
    environment:
      POSTGRES_PASSWORD: "example"
  metastore-standalone:
    image: apache/hive:4.0.1
    depends_on:
      - some-postgres
    environment:
      SERVICE_NAME: metastore
      DB_DRIVER: postgres
      SERVICE_OPTS: >-
        -Djavax.jdo.option.ConnectionDriverName=org.postgresql.Driver
        -Djavax.jdo.option.ConnectionURL=jdbc:postgresql://some-postgres:5432/postgres
        -Djavax.jdo.option.ConnectionUserName=postgres
        -Djavax.jdo.option.ConnectionPassword=example
    volumes:
      - ~/.m2/repository/org/postgresql/postgresql/42.7.5/postgresql-42.7.5.jar:/opt/hive/lib/postgres.jar
  hiveserver2-standalone:
    image: apache/hive:4.0.1
    depends_on:
      - metastore-standalone
    environment:
      SERVICE_NAME: hiveserver2
      IS_RESUME: true
      SERVICE_OPTS: >-
        -Djavax.jdo.option.ConnectionDriverName=org.postgresql.Driver
        -Djavax.jdo.option.ConnectionURL=jdbc:postgresql://some-postgres:5432/postgres
        -Djavax.jdo.option.ConnectionUserName=postgres
        -Djavax.jdo.option.ConnectionPassword=example
        -Dhive.metastore.uris=thrift://metastore-standalone:9083
    volumes:
      - ~/.m2/repository/org/postgresql/postgresql/42.7.5/postgresql-42.7.5.jar:/opt/hive/lib/postgres.jar
```
Then run the following shell commands to initialize the system schemas in HiveServer2.
```shell
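# Download the PostgreSQL JDBC driver into the local Maven repository (~/.m2).
mvn dependency:get -Dartifact=org.postgresql:postgresql:42.7.5
docker compose up -d
# Open an interactive shell in the HiveServer2 container; the next two lines run inside it.
docker compose exec hiveserver2-standalone /bin/bash
/opt/hive/bin/schematool -initSchema -dbType hive -metaDbType postgres -url jdbc:hive2://localhost:10000/default
exit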
```
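
The same steps can also be run without an interactive shell. The following is a minimal sketch, assuming the `compose.yaml` above is already up: the function names and the `HS2_SERVICE` / `HS2_URL` variables are hypothetical helpers, and the `beeline` path is assumed to sit next to `schematool` in the image. Only the `schematool` flags come from the steps above.

```shell
# Illustrative helpers, not part of the image.
# HS2_SERVICE and HS2_URL are hypothetical variables with defaults taken from the steps above.
HS2_SERVICE="${HS2_SERVICE:-hiveserver2-standalone}"
HS2_URL="${HS2_URL:-jdbc:hive2://localhost:10000/default}"

# Print the schematool invocation so it can be inspected before it is run.
schematool_command() {
  echo "/opt/hive/bin/schematool -initSchema -dbType hive -metaDbType postgres -url ${HS2_URL}"
}

# Run schematool inside the HiveServer2 container.
# -T disables pseudo-TTY allocation, which suits non-interactive use.
init_system_schemas() {
  docker compose exec -T "$HS2_SERVICE" sh -c "$(schematool_command)"
}

# List databases through HiveServer2 to check the result.
# Assumption: beeline is available at /opt/hive/bin/beeline.
show_databases() {
  docker compose exec -T "$HS2_SERVICE" /opt/hive/bin/beeline -u "$HS2_URL" -e 'SHOW DATABASES;'
}
```

After sourcing these definitions, `init_system_schemas && show_databases` should list `sys` and `information_schema` if initialization succeeded. HiveServer2 may need some time after `docker compose up -d` before it accepts connections, so a retry may be necessary.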
4 changes: 2 additions & 2 deletions packaging/src/docker/entrypoint.sh
@@ -31,9 +31,9 @@ function initialize_hive {
   fi
   $HIVE_HOME/bin/schematool -dbType $DB_DRIVER $COMMAND $VERBOSE_MODE
   if [ $? -eq 0 ]; then
-    echo "Initialized schema successfully.."
+    echo "Initialized Hive Metastore Server schema successfully.."
   else
-    echo "Schema initialization failed!"
+    echo "Hive Metastore Server schema initialization failed!"
     exit 1
   fi
 }