MongoDbToBigQuery - ISO standard for date and time #2132

set paramater default to true
OreOreDa committed Jan 13, 2025
commit fc7c64b0b5ceed91bedd9dfc0255a829349d491c
@@ -164,9 +164,9 @@ public interface JavascriptDocumentTransformerOptions extends PipelineOptions {
optional = true,
description = "Use legacy time format.",
helpText =
Review comment (Contributor): The description reads "Use legacy time format", but the option is named "useIsoTimeFormat". The helpText's "(default is true)" statement is also inaccurate. Please fix.

- "Legacy document conversion does not use ISO format for DateTime (https://github.com/mongodb/mongo-java-driver/blob/main/bson/src/main/org/bson/json/ExtendedJsonDateTimeConverter.java) and Timestamp (https://github.com/mongodb/mongo-java-driver/blob/main/bson/src/main/org/bson/json/ExtendedJsonTimestampConverter.java). Set this parameter to false to use ISO standard for these data types (default is true).")
- Boolean getUseLegacyTimeFormat();
+ "Legacy document conversion does not use ISO format for DateTime (https://github.com/mongodb/mongo-java-driver/blob/main/bson/src/main/org/bson/json/ExtendedJsonDateTimeConverter.java) and Timestamp (https://github.com/mongodb/mongo-java-driver/blob/main/bson/src/main/org/bson/json/ExtendedJsonTimestampConverter.java). Set this parameter to false to use legacy conversion (default is true).")
+ Boolean getUseIsoTimeFormat();

- void setUseLegacyTimeFormat(Boolean useLegacyTimeFormat);
+ void setUseIsoTimeFormat(Boolean useIsoTimeFormat);
}
}
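
The helpText above contrasts an ISO rendering of date values with a legacy one. As a rough illustration of what "ISO standard" means for a BSON date (stored as milliseconds since the Unix epoch), here is a minimal sketch using only java.time. The class and method names are hypothetical and are not part of this PR, and the non-ISO branch is only a placeholder: the real legacy output depends on Gson and the MongoDB driver and is not reproduced here.

```java
import java.time.Instant;
import java.time.format.DateTimeFormatter;

// Hypothetical illustration only; not the PR's JsonDateTimeConverter.
final class TimeFormatSketch {

    // Render epoch milliseconds as an ISO-8601 instant in UTC.
    static String toIso(long epochMillis) {
        return DateTimeFormatter.ISO_INSTANT.format(Instant.ofEpochMilli(epochMillis));
    }

    // Placeholder for a non-ISO rendering: the raw epoch value as text.
    // Assumption: this stands in for "legacy" output, it does not mimic it.
    static String toRawEpoch(long epochMillis) {
        return Long.toString(epochMillis);
    }

    // Select a rendering the same way the PR's flag selects a conversion path.
    static String render(long epochMillis, boolean useIsoTimeFormat) {
        return useIsoTimeFormat ? toIso(epochMillis) : toRawEpoch(epochMillis);
    }

    public static void main(String[] args) {
        long millis = 1736726400000L;
        System.out.println(render(millis, true));  // 2025-01-13T00:00:00Z
        System.out.println(render(millis, false)); // 1736726400000
    }
}
```

A unit test for the real option could follow the same shape: build a document with a known date, convert it with the flag set to true and to false, and assert on both outputs.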
@@ -107,7 +107,7 @@ public static boolean run(Options options)
throws ScriptException, IOException, NoSuchMethodException {
Pipeline pipeline = Pipeline.create(options);
String userOption = options.getUserOption();
- Boolean useLegacyTimeFormat = options.getUseLegacyTimeFormat();
+ Boolean useIsoTimeFormat = options.getUseIsoTimeFormat();

TableSchema bigquerySchema;

@@ -169,7 +169,7 @@ public void process(ProcessContext c) {
MongoDbUtils.getTableSchema(
document,
userOption,
- Optional.ofNullable(useLegacyTimeFormat).orElse(Boolean.TRUE));
+ Optional.ofNullable(useIsoTimeFormat).orElse(Boolean.TRUE));
c.output(row);
}
}))
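
The Optional.ofNullable(...).orElse(Boolean.TRUE) call in this hunk treats an unset option as true before it reaches getTableSchema, which matters because that method takes a boxed Boolean and would throw a NullPointerException on unboxing a null. A minimal sketch of the defaulting, assuming a hypothetical helper name (resolveUseIsoTimeFormat is not part of the PR):

```java
import java.util.Optional;

// Hypothetical helper mirroring the null-defaulting pattern in the diff.
final class IsoFlagDefault {

    // A null (unset) option falls back to true; explicit values pass through.
    static boolean resolveUseIsoTimeFormat(Boolean useIsoTimeFormat) {
        return Optional.ofNullable(useIsoTimeFormat).orElse(Boolean.TRUE);
    }

    public static void main(String[] args) {
        System.out.println(resolveUseIsoTimeFormat(null));          // true
        System.out.println(resolveUseIsoTimeFormat(Boolean.TRUE));  // true
        System.out.println(resolveUseIsoTimeFormat(Boolean.FALSE)); // false
    }
}
```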
@@ -79,7 +79,7 @@ public class MongoDbUtils implements Serializable {

private static final Gson GSON = new Gson();

- private static final JsonWriterSettings JSON_WRITER_SETTINGS =
+ private static final JsonWriterSettings JSON_WRITER_SETTINGS_ISO_FORMAT =
JsonWriterSettings.builder()
.dateTimeConverter(new JsonDateTimeConverter())
.timestampConverter(new JsonTimestampConverter())
@@ -133,7 +133,7 @@ public static Document getMongoDbDocument(String uri, String dbName, String coll
}

public static TableRow getTableSchema(
Review comment (Contributor): Could you please add a unit test for the legacy and ISO time formats?

- Document document, String userOption, Boolean useLegacyTimeFormat) {
+ Document document, String userOption, Boolean useIsoTimeFormat) {
TableRow row = new TableRow();
LocalDateTime localDate = LocalDateTime.now(ZoneId.of("UTC"));
if (userOption.equals("FLATTEN")) {
@@ -155,9 +155,9 @@ public static TableRow getTableSchema(
break;
case "org.bson.Document":
String data =
- useLegacyTimeFormat
- ? GSON.toJson(value)
- : ((Document) value).toJson(JSON_WRITER_SETTINGS);
+ useIsoTimeFormat
+ ? ((Document) value).toJson(JSON_WRITER_SETTINGS_ISO_FORMAT)
+ : GSON.toJson(value);
row.set(key, data);
break;
default:
@@ -167,9 +167,9 @@ public static TableRow getTableSchema(
row.set("timestamp", localDate.format(TIMEFORMAT));
} else if (userOption.equals("JSON")) {
JsonObject sourceDataJsonObject =
- useLegacyTimeFormat
- ? GSON.toJsonTree(document).getAsJsonObject()
- : GSON.fromJson(document.toJson(JSON_WRITER_SETTINGS), JsonObject.class);
+ useIsoTimeFormat
+ ? GSON.fromJson(document.toJson(JSON_WRITER_SETTINGS_ISO_FORMAT), JsonObject.class)
+ : GSON.toJsonTree(document).getAsJsonObject();

// Convert to a Map
Map<String, Object> sourceDataMap =
@@ -180,7 +180,9 @@ public static TableRow getTableSchema(
.set("timestamp", localDate.format(TIMEFORMAT));
} else {
String sourceData =
- useLegacyTimeFormat ? GSON.toJson(document) : document.toJson(JSON_WRITER_SETTINGS);
+ useIsoTimeFormat
+ ? document.toJson(JSON_WRITER_SETTINGS_ISO_FORMAT)
+ : GSON.toJson(document);

row.set("id", document.get("_id").toString())
.set("source_data", sourceData)