BQ Error in getTableNames on DatabaseConnector inside CohortDiagnostic package #5

Open
jdposada opened this issue Feb 1, 2021 · 7 comments

Comments

jdposada commented Feb 1, 2021

Hi,

When running the package, it appears to be trying to access a public BigQuery (BQ) dataset, which is very strange.

Thank you for all your help. @schuemie @jennifercelane

This is the output from the console:

Executing SQL took 1.58 mins
Fetching data from server
Fetching data took 22.2 secs
Cohort characterization took 2.72 mins
Running Temporal Characterization took 2.8 mins
Retrieving concept information
Error in rJava::.jcall(resultSet, "Z", "next") : 
  java.sql.SQLException: [Simba][BigQueryJDBCDriver](100090) Dataset operations error. Message: 403 Forbidden
GET https://bigquery.googleapis.com/bigquery/v2/projects/chc-nih-chest-xray/datasets?maxResults=1000
{
  "code" : 403,
  "errors" : [ {
    "domain" : "global",
    "message" : "VPC Service Controls: Request is prohibited by organization's policy. vpcServiceControlsUniqueIdentifier: 469805ea0e8f8332.",
    "reason" : "policyViolation"
  } ],
  "message" : "VPC Service Controls: Request is prohibited by organization's policy. vpcServiceControlsUniqueIdentifier: 469805ea0e8f8332.",
  "status" : "PERMISSION_DENIED"
}
Calls: <Anonymous> ... exportConceptInformation -> tolower -> <Anonymous> -> <Anonymous> -> .jcheck
An error report has been created at  /opt/workdir/output_folder/errorReportR.txt
Merging 0 zip files.
- Unzipping NA
Error in zip::unzip(zipFiles$zipFile[i], exdir = unzipFolder) : 
  is_string(zipfile) is not TRUE
Calls: <Anonymous> -> <Anonymous> -> stopifnot

This is the full errorReportR.txt:

Thread: Main
Message:  Error in rJava::.jcall(resultSet, "Z", "next") :    java.sql.SQLException: [Simba][BigQueryJDBCDriver](100090) Dataset operations error. Message: 403 Forbidden GET https://bigquery.googleapis.com/bigquery/v2/projects/chc-nih-chest-xray/datasets?maxResults=1000 {   "code" : 403,   "errors" : [ {     "domain" : "global",     "message" : "VPC Service Controls: Request is prohibited by organization's policy. vpcServiceControlsUniqueIdentifier: 469805ea0e8f8332.",     "reason" : "policyViolation"   } ],   "message" : "VPC Service Controls: Request is prohibited by organization's policy. vpcServiceControlsUniqueIdentifier: 469805ea0e8f8332.",   "status" : "PERMISSION_DENIED" } Calls: <Anonymous> ... exportConceptInformation -> tolower -> <Anonymous> -> <Anonymous> -> .jcheck 
Level:  FATAL
Time:  2021-02-01 20:41:26

Stack trace:
8: stop(list("java.sql.SQLException: [Simba][BigQueryJDBCDriver](100090) Datas
7: .jcheck()
6: rJava::.jcall(resultSet, "Z", "next")
5: DatabaseConnector::getTableNames(connection, cdmDatabaseSchema)
4: tolower(DatabaseConnector::getTableNames(connection, cdmDatabaseSchema))
3: exportConceptInformation(connection = connection, cdmDatabaseSchema = cdmDa
2: CohortDiagnostics::runCohortDiagnostics(packageName = "MSKAI", connectionDe
1: MSKAI::runCohortDiagnostics(connectionDetails = connectionDetails, cdmDatab

R version:
R version 4.0.2 (2020-06-22)

Platform:
x86_64-pc-linux-gnu

Attached base packages:
- parallel
- stats
- graphics
- grDevices
- utils
- datasets
- methods
- base

Other attached packages:
- MSKAI (0.0.1)
- CohortDiagnostics (2.0.0)
- OhdsiSharing (0.2.2)
- FeatureExtraction (3.1.0)
- Andromeda (0.4.0)
- rJava (0.9-13)
- DatabaseConnector (3.0.0)
- SqlRender (1.6.6)
- ggplot2 (3.3.3)
- dplyr (1.0.3)
- devtools (2.3.0)
- usethis (1.6.1)
- MASS (7.3-51.6)

Contributor

schuemie commented Feb 2, 2021

DatabaseConnector is simply calling getMetaData() and subsequently getTables() on the JDBC connection. This works fine on the BigQuery testing instance I have, so perhaps your admin needs to adjust some security settings?
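
For readers unfamiliar with the JDBC side, here is a minimal Java sketch of the two metadata calls described above. It is an illustration only: the class and method names (`TableNameLister`, `listTableNames`) are hypothetical, and the arguments passed to `getTables()` are assumptions, since the thread does not show what DatabaseConnector actually passes. Note that in the stack trace above the exception surfaces when the result set is iterated with `next()`, which is the loop in this sketch.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

public class TableNameLister {

    /**
     * Lists the table names in one schema using standard JDBC metadata calls.
     * Assumption: for BigQuery, "schema" is treated as the dataset name.
     */
    public static List<String> listTableNames(Connection connection, String schema)
            throws SQLException {
        // Step 1: obtain the metadata object from the open connection.
        DatabaseMetaData metaData = connection.getMetaData();

        List<String> names = new ArrayList<>();
        // Step 2: ask for tables. A null catalog means "do not filter by catalog";
        // "%" matches any table name. Both values are assumptions for this sketch.
        try (ResultSet tables = metaData.getTables(null, schema, "%", new String[] {"TABLE"})) {
            // Step 3: iterate the result set. Driver-side errors may only
            // appear here, when rows are actually fetched.
            while (tables.next()) {
                names.add(tables.getString("TABLE_NAME"));
            }
        }
        return names;
    }
}
```

Running a call like this directly against the same connection can help narrow down whether the failure comes from the driver's metadata lookup or from anything specific to CohortDiagnostics.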

konstjar commented Feb 2, 2021

@jdposada The message "VPC Service Controls: Request is prohibited by organization's policy." says that your environment is not able to connect to the GCP BQ dataset. You may need a VPN connection, or you may need to run the execution inside the VPC network/environment.

Author

jdposada commented Feb 2, 2021

Hi @schuemie,

Thank you for your prompt answer. Unfortunately, I am not the owner of the project so I cannot adjust my permissions to the cdmDataset. Thank you for that valuable info. Knowing that the functions are working in your BQ environment is a great relief. I did not even know that you had one. What an awesome thing!

Hi @konstjar ,

I have executed everything up to this point, including cohort creation, incidence rates, cohort overlap, etc., so I do have permission to read and retrieve data from the dataset. What kind of permissions do getMetaData() and getTables() need? I have been able to create tables in the project where the dataset lives. Right now the cohort table has 169k rows. I have BQ Data Viewer permissions on the cdmDataset. Also, the project mentioned in this URL is not even related to my project:

https://bigquery.googleapis.com/bigquery/v2/projects/chc-nih-chest-xray/datasets?maxResults=1000

In any case, if VPC were a big issue, I could not have run the study up to this point, unless the functions mentioned above need a special set of permissions such as BQ Data Owner, which may be the case. Could you confirm?

Thanks

Jose

konstjar commented Feb 4, 2021

The issue was solved by granting a more powerful role. I will add a description of the required BigQuery roles to the DatabaseConnector repo.

jdposada closed this as completed Feb 4, 2021
jdposada reopened this Feb 5, 2021

Author

jdposada commented Feb 5, 2021

I thought the issue was solved, but even with elevated permissions I still see the same error.

ablack3 commented Apr 15, 2022

@jdposada did you find a solution?

Author

jdposada commented Apr 16, 2022

I did find a solution: use a service account whose access is limited to the datasets you are going to use.
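
As a rough illustration of what connecting with a service account can look like, the sketch below builds a JDBC URL for the Simba BigQuery driver. The thread does not show the actual connection settings, so everything here is an assumption: the property names (`ProjectId`, `OAuthType=0` for service-account authentication, `OAuthServiceAcctEmail`, `OAuthPvtKeyPath`) are written as I recall them from the Simba driver documentation and should be checked against the guide for your driver version, and the project, e-mail, and key path are placeholders. The class name `BigQueryServiceAccountUrl` is hypothetical.

```java
public class BigQueryServiceAccountUrl {

    /**
     * Builds a Simba BigQuery JDBC URL that authenticates as a service account.
     * Property names are assumptions based on the Simba driver documentation;
     * verify them against the guide shipped with your driver version.
     */
    public static String build(String projectId, String serviceAccountEmail, String keyFilePath) {
        return "jdbc:bigquery://https://www.googleapis.com/bigquery/v2:443;"
                + "ProjectId=" + projectId + ";"
                + "OAuthType=0;" // 0 = service-account authentication (assumed)
                + "OAuthServiceAcctEmail=" + serviceAccountEmail + ";"
                + "OAuthPvtKeyPath=" + keyFilePath + ";";
    }

    public static void main(String[] args) {
        // Placeholder values only; substitute your own project, account, and key file.
        System.out.println(build(
                "my-cdm-project",
                "study-runner@my-cdm-project.iam.gserviceaccount.com",
                "/path/to/key.json"));
    }
}
```

The resulting string could then be supplied as the connection string when creating the connection details, with the service account granted access only to the CDM and results datasets.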
