Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MIN-2432] Enum schema evolution Python fork #1

Draft
wants to merge 18 commits into
base: json_unknown_enum_fork_baseline
Choose a base branch
from

Conversation

antongrbin
Copy link

@antongrbin antongrbin commented Apr 19, 2022

DON'T MERGE THIS PR

This is PR is just a visual representation of our JSON enum fork. Learn more in the original Tech Doc.

Motivation

This PR helps us visualize code changes that we've made to protobuf library wrt JSON schema evolution.

Tech Doc:
https://docs.google.com/document/d/1p5LUSTWrVBcT9F2wXBgHBDT3OJyOm5BpXAyTpwwDVts/edit#

Setup

The baseline branch json_unknown_enum_fork_baseline is pinned to a version of the upstream repository on the version specified by:
https://github.com/noom/noom-contracts/blob/master/PROTOBUF_VERSION

This can be checked by running a command similar to git diff json_unknown_enum_fork_baseline v3.18.1, which should always produce empty results.

The head branch for this PR json_unknown_enum_fork, shows all changes that we've made to the upstream repository.

Changes

If you want to make changes to our fork, create a PR that merges to json_unknown_enum_fork.

Install

If you want to install or update the fork to our noom-contracts repo, run the following script:

  • ./lib/python/fetch_forked_json_format.sh, create a PR, review it and merge it.

Testing in protobuf repo

# Compile the libprotoc library
$ brew install automake libtool
$ ./autogen.sh
$ ./configure
$ make check
$ make
$ ./tests.sh cpp

# Run python tests, make sure you have python3.9 available in your environment
$ virtualenv ~/.protobuf_venv
$ source ~/.protobuf_venv/bin/activate
$ pip install tox
$ ./tests.sh python

When does the fork end?

Once protobuf team decides what to do with JSON enum schema evolution, we will end the fork and implement whatever they decide upstream. They will announce it here:
protocolbuffers#7392

@antongrbin antongrbin marked this pull request as draft April 19, 2022 12:28
@antongrbin antongrbin changed the title [MIN-2432] Enum schema evolution fork [MIN-2432] Enum schema evolution Python fork Apr 22, 2022
antongrbin and others added 6 commits April 22, 2022 16:58
…terface

# Note to reviewer

* The baseline here is a fork, please review documentation on #1 and [tech doc section](https://docs.google.com/document/d/1p5LUSTWrVBcT9F2wXBgHBDT3OJyOm5BpXAyTpwwDVts/edit#heading=h.b8v2uff99l8v).

# Motivation

Reducing the dependencies that `json_format.py` is making is helping us create a more stable fork of that file in noom-contracts.

See concrete problem [here](https://github.com/noom/noom-contracts/runs/6651503347?check_suite_focus=true), where `mypy` fails because it can't find type hints on the `google.protobuf.internal` library.

See code comment above `TruncateToFourByteFloat`

# Testing completed

Ran python unit tests as described in test plan in #1.
antongrbin and others added 6 commits June 20, 2022 17:53
# Motivation

Tech doc:
https://docs.google.com/document/d/1p5LUSTWrVBcT9F2wXBgHBDT3OJyOm5BpXAyTpwwDVts/edit#

# Changes

In this PR we actually fix the Python protobuf library not to break when unknown enum value is encountered if ignoreUnknownFields is true.

The design principles for the change were:
* make the change as small as possible, so it's easy to keep up to date with the upstream
* if one element is unknown in an array of elements, other elements should still parse (see unit tests)

Changes:
* Change `_ConvertScalarFieldValue` so it throws a special exception when unknown enum string value is encountered.
* Wrap every invocation of `_ConvertScalarFieldValue` to optionally ignore the new exception thrown.
* Add unit tests

To write this PR, I reached out for input on #python-club [on slack](https://noom.slack.com/archives/C020854DJDU/p1655740069429799)

# Tested

Executed the test plan from here: #1 to test this within the `protobuf` repo.

Also tested as part of `noom-contracts` repo: https://github.com/noom/noom-contracts/pull/293
antongrbin pushed a commit that referenced this pull request Jan 10, 2024
antongrbin pushed a commit that referenced this pull request Jan 10, 2024
Before, every charAt would emit (on android):
```
    0x00002104    adrp x17, #+0x1000 (addr 0x3000)
    0x00002108    ldr w17, [x17, protocolbuffers#20]
    0x0000210c    ldr x0, [x0, protocolbuffers#128]
    0x00002110    ldr x0, [x0, protocolbuffers#328]
    0x00002114    ldr lr, [x0, protocolbuffers#24]
    0x00002118    blr lr <-- Call into String.charAt(int)
```
Now, it emits the inlined implementation of charAt (branch is for possibly compressed strings):
```
    0x000020b4    ldur w16, [x4, #-8]
    0x000020b8    tbnz w16, #0, #+0xc (addr 0x20c4)
    0x000020bc    ldrb w4, [x4, x0]
    0x000020c0    b #+0x8 (addr 0x20c8)
    0x000020c4    ldrh w4, [x4, x0, lsl #1]
```

PiperOrigin-RevId: 591147406
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant