HIVE-28437: Add documentation for initializing the system schemas for HiveServer2 for Docker Image #5629

Merged (1 commit) on Feb 11, 2025

@linghengqian linghengqian commented Feb 2, 2025

What changes were proposed in this pull request?

Why are the changes needed?

import org.junit.jupiter.api.Test;
import org.testcontainers.containers.Container.ExecResult;
import org.testcontainers.containers.GenericContainer;
import org.testcontainers.junit.jupiter.Container;
import org.testcontainers.junit.jupiter.Testcontainers;
import org.testcontainers.utility.DockerImageName;
import java.io.IOException;
import java.sql.*;
import java.time.Duration;
import java.time.temporal.ChronoUnit;
import static org.awaitility.Awaitility.await;
import static org.hamcrest.MatcherAssert.assertThat;
import static org.hamcrest.Matchers.is;
import static org.junit.jupiter.api.Assertions.assertThrows;
@SuppressWarnings({"SqlNoDataSourceInspection", "resource"})
@Testcontainers
public class InformationSchemaTest {
    @Container
    public static final GenericContainer<?> CONTAINER = new GenericContainer<>(DockerImageName.parse("apache/hive:4.0.1"))
            .withEnv("SERVICE_NAME", "hiveserver2")
            .withExposedPorts(10000);

    @Test
    void test() throws SQLException, IOException, InterruptedException {
        String jdbcUrlPrefix = "jdbc:hive2://" + CONTAINER.getHost() + ":" + CONTAINER.getMappedPort(10000);
        await().atMost(Duration.of(30L, ChronoUnit.SECONDS)).ignoreExceptions().until(() -> {
            DriverManager.getConnection(jdbcUrlPrefix).close();
            return true;
        });
        try (Connection connection = DriverManager.getConnection(jdbcUrlPrefix);
             Statement statement = connection.createStatement()) {
            statement.execute("CREATE DATABASE demo_ds_0");
        }
        try (Connection connection = DriverManager.getConnection(jdbcUrlPrefix + "/demo_ds_0");
             Statement statement = connection.createStatement()) {
            statement.execute("CREATE TABLE IF NOT EXISTS t_order (\n" +
                    "    order_id   BIGINT NOT NULL,\n" +
                    "    order_type INT,\n" +
                    "    user_id    INT    NOT NULL,\n" +
                    "    address_id BIGINT NOT NULL,\n" +
                    "    status     string,\n" +
                    "    PRIMARY KEY (order_id) disable novalidate\n" +
                    ") STORED BY ICEBERG STORED AS ORC TBLPROPERTIES ('format-version' = '2')");
            statement.execute("TRUNCATE TABLE t_order");
            statement.executeUpdate("INSERT INTO t_order (order_id, user_id, order_type, address_id, status) VALUES (1, 1, 1, 1, 'INSERT_TEST')");
            // Close the ResultSet deterministically instead of relying on the Statement to do it.
            try (ResultSet resultSet = statement.executeQuery("select * from t_order")) {
                assertThat(resultSet.next(), is(true));
            }
        }
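        // Before schematool has run, the information_schema database does not exist, so connecting to it fails.
        assertThrows(SQLException.class, () -> DriverManager.getConnection(jdbcUrlPrefix + "/information_schema").close());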
        ExecResult infoResult = CONTAINER.execInContainer(
                "/opt/hive/bin/schematool",
                "-info",
                "-dbType", "hive",
                "-metaDbType", "derby",
                "-url", "jdbc:hive2://localhost:10000/default"
        );
        assertThat(infoResult.getStdout(), is("Metastore connection URL:\t jdbc:hive2://localhost:10000/default\n" +
                "Metastore connection Driver :\t org.apache.hive.jdbc.HiveDriver\n" +
                "Metastore connection User:\t APP\n"));
        // Initialize the system schemas (including information_schema) through the running HiveServer2.
        ExecResult initResult = CONTAINER.execInContainer(
                "/opt/hive/bin/schematool",
                "-initSchema",
                "-dbType", "hive",
                "-metaDbType", "derby",
                "-url", "jdbc:hive2://localhost:10000/default"
        );
        assertThat(initResult.getStdout(), is("Initializing the schema to: 4.0.0\n" +
                "Metastore connection URL:\t jdbc:hive2://localhost:10000/default\n" +
                "Metastore connection Driver :\t org.apache.hive.jdbc.HiveDriver\n" +
                "Metastore connection User:\t APP\n" +
                "Starting metastore schema initialization to 4.0.0\n" +
                "Initialization script hive-schema-4.0.0.hive.sql\n" +
                "Initialization script completed\n"));
        try (Connection connection = DriverManager.getConnection(jdbcUrlPrefix + "/information_schema");
             Statement statement = connection.createStatement()) {
            // After initialization, information_schema is queryable.
            try (ResultSet resultSet = statement.executeQuery("select * from information_schema.COLUMNS limit 100")) {
                assertThat(resultSet.next(), is(true));
            }
        }
    }
}
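
The two `schematool` invocations in the test differ only in the action flag, so the argument list can be built in one place. The following is a minimal sketch, not part of this PR: `SchemaToolCommand` and its method names are hypothetical, and it assumes the same binary path, `-dbType hive`, and `jdbc:hive2://localhost:<port>/default` URL used in the test above.

```java
import java.util.List;

/** Hypothetical helper (not part of Hive): builds schematool argument lists like those in the test above. */
final class SchemaToolCommand {
    // Path of schematool inside the apache/hive image, as used in the test above.
    private static final String SCHEMATOOL = "/opt/hive/bin/schematool";

    private SchemaToolCommand() {
    }

    /** Arguments for `schematool -info` against the HiveServer2 listening on the given in-container port. */
    static List<String> info(String metaDbType, int port) {
        return build("-info", metaDbType, port);
    }

    /** Arguments for `schematool -initSchema` against the HiveServer2 listening on the given in-container port. */
    static List<String> initSchema(String metaDbType, int port) {
        return build("-initSchema", metaDbType, port);
    }

    private static List<String> build(String action, String metaDbType, int port) {
        if (metaDbType == null || metaDbType.isEmpty()) {
            throw new IllegalArgumentException("metaDbType must not be empty");
        }
        // -dbType hive targets HiveServer2 itself; -metaDbType names the database backing the metastore.
        return List.of(
                SCHEMATOOL, action,
                "-dbType", "hive",
                "-metaDbType", metaDbType,
                "-url", "jdbc:hive2://localhost:" + port + "/default");
    }
}
```

With Testcontainers, the result could be passed as `CONTAINER.execInContainer(SchemaToolCommand.initSchema("derby", 10000).toArray(new String[0]))`.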

Does this PR introduce any user-facing change?

  • This will affect all users of the Hive Docker Image.

Is the change a dependency upgrade?

  • No.

How was this patch tested?

@linghengqian linghengqian changed the title HIVE-28437: Support for initializing system database of Hive-Server 2 for Docker Image HIVE-28437: Add documentation for initializing the system databases for HiveServer2 for Docker Image Feb 7, 2025
@dengzhhu653

Hi @linghengqian, thank you for the contribution! Could we add these details to the quickstart as well? Link: https://github.com/apache/hive-site/blob/main/content/Development/quickStart.md

@linghengqian linghengqian changed the title HIVE-28437: Add documentation for initializing the system databases for HiveServer2 for Docker Image HIVE-28437: Add documentation for initializing the system schemas for HiveServer2 for Docker Image Feb 8, 2025

@linghengqian linghengqian left a comment

services:
  some-postgres:
    image: postgres:17.2-bookworm
    environment:
      POSTGRES_PASSWORD: "example"
  hiveserver2-standalone:
    image: apache/hive:4.0.1
    depends_on:
      - some-postgres
    environment:
      SERVICE_NAME: hiveserver2
      DB_DRIVER: postgres
      SERVICE_OPTS: >-
        -Djavax.jdo.option.ConnectionDriverName=org.postgresql.Driver
        -Djavax.jdo.option.ConnectionURL=jdbc:postgresql://some-postgres:5432/postgres
        -Djavax.jdo.option.ConnectionUserName=postgres
        -Djavax.jdo.option.ConnectionPassword=example
    volumes:
      - ~/.m2/repository/org/postgresql/postgresql/42.7.5/postgresql-42.7.5.jar:/opt/hive/lib/postgres.jar
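# Download the PostgreSQL JDBC driver into the local Maven repository so the volume mount above resolves.
mvn dependency:get -Dartifact=org.postgresql:postgresql:42.7.5
docker compose up -d
# Open a shell in the HiveServer2 container and initialize the system schemas.
docker compose exec hiveserver2-standalone /bin/bash
/opt/hive/bin/schematool -initSchema -dbType hive -metaDbType postgres -url jdbc:hive2://localhost:10000/default
exit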
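
The `SERVICE_OPTS` value in the compose file uses a YAML folded scalar (`>-`), so the four `-D` lines reach the container as a single space-separated string. As a rough illustration of what that string looks like, here is a sketch that assembles it from the connection settings; `ServiceOpts` and `forMetastore` are hypothetical names, not anything provided by Hive or the Docker image.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical helper: assembles the SERVICE_OPTS string used in the compose file above. */
final class ServiceOpts {
    private ServiceOpts() {
    }

    static String forMetastore(String driver, String url, String user, String password) {
        // LinkedHashMap keeps the options in the same order as the compose file.
        Map<String, String> props = new LinkedHashMap<>();
        props.put("javax.jdo.option.ConnectionDriverName", driver);
        props.put("javax.jdo.option.ConnectionURL", url);
        props.put("javax.jdo.option.ConnectionUserName", user);
        props.put("javax.jdo.option.ConnectionPassword", password);
        // Render each entry as -Dkey=value and join with single spaces, as the folded scalar does.
        return props.entrySet().stream()
                .map(e -> "-D" + e.getKey() + "=" + e.getValue())
                .collect(Collectors.joining(" "));
    }
}
```

Calling `ServiceOpts.forMetastore("org.postgresql.Driver", "jdbc:postgresql://some-postgres:5432/postgres", "postgres", "example")` yields the same value the compose file passes to the container.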

@dengzhhu653

It also works

@linghengqian

It also works

  • My understanding is that I should document both ways of writing compose.yaml.

@dengzhhu653 dengzhhu653 merged commit b914b6a into apache:master Feb 11, 2025
3 of 4 checks passed
@dengzhhu653

This PR doesn't change any code, so I'm merging it without a green build to free up resources.

@linghengqian linghengqian deleted the docker-infra branch February 11, 2025 02:37