Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

xds: introduce simple grpc transport for generic xds clients #8066

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

purnesh42H
Copy link
Contributor

@purnesh42H purnesh42H commented Feb 5, 2025

This PR is adding a simple grpc-based transport implementation of generic xds clients transport builder which can be provided as transport builder in xDS and LRS client.

The build() method currently creates a new grpc channel every time. However, in future we will incorporate reference count map for existing transports and deduplicate transports based on provided ServerConfig so that transport channel to same server can be shared among xDS and LRS client.

POC
Internal Design

RELEASE NOTES: None

Copy link

codecov bot commented Feb 5, 2025

Codecov Report

Attention: Patch coverage is 84.90566% with 8 lines in your changes missing coverage. Please review.

Project coverage is 82.48%. Comparing base (c7db760) to head (5050360).
Report is 2 commits behind head on master.

Files with missing lines Patch % Lines
...s/internal/clients/grpctransport/grpc_transport.go 84.90% 6 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #8066      +/-   ##
==========================================
+ Coverage   82.23%   82.48%   +0.25%     
==========================================
  Files         387      388       +1     
  Lines       38947    38995      +48     
==========================================
+ Hits        32028    32165     +137     
+ Misses       5601     5532      -69     
+ Partials     1318     1298      -20     
Files with missing lines Coverage Δ
...s/internal/clients/grpctransport/grpc_transport.go 84.90% <84.90%> (ø)

... and 30 files with indirect coverage changes

@purnesh42H purnesh42H added Area: xDS Includes everything xDS related, including LB policies used with xDS. Type: Feature New features or improvements in behavior labels Feb 5, 2025
@purnesh42H purnesh42H added this to the 1.71 Release milestone Feb 5, 2025
@purnesh42H purnesh42H force-pushed the generic-xds-client-grpc-transport branch from ad889b4 to 377a648 Compare February 5, 2025 06:06
@purnesh42H purnesh42H requested a review from dfawley February 5, 2025 06:07
@purnesh42H purnesh42H force-pushed the generic-xds-client-grpc-transport branch 4 times, most recently from fcbf470 to efeddd1 Compare February 5, 2025 09:24
@purnesh42H purnesh42H force-pushed the generic-xds-client-grpc-transport branch 6 times, most recently from a6648d7 to c63118f Compare February 6, 2025 19:31
wantErr bool
}{
{
name: "ServerURI_empty",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: s/_/ / throughout? t.Run will convert them to underscores to make it a legal test name -- just make sure they're human-readable here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have made it more like sentence. Was trying to follow go/go-style/decisions#subtest-names

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is something that I learnt recently as well, and though it feels a little weird in the beginning to have subtest names that have underscrores, when it comes time to run them individually, it makes a lot of sense as direct copy paste works.

Comment on lines 195 to 274
if err := stream.Send(req); err != nil {
t.Fatalf("failed to send message: %v", err)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also make this do a Recv? Codecov shows that's not covered.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah. I wanted to check on this that for Recv, we need a grpc enabled xDS server with byte-based handlers. Should we add an xDS server with byte-based handler or just have a test grpc server to test this? Eventually for e2e tests I think we do need a test xDS server because go-control-plane one proto based.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this test, we could do literally any kind of grpc server. It could be like the stub server, or whatever. Byte-based is not relevant. The server should ideally do proto as is the default, and we can serialize/deserialize proto messages into byte buffers on the client manually for the testing.

True, for e2e tests of xdsclient, we will need to use go-control-plane. If we have testing helpers that make it easy to use, that's fine but, I'd prefer to copy them into the clients subtree and try to make sure we don't import things from the rest of the grpc tree. Otherwise it will be really hard to separate when it's time to move it out.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this test. I have created a test server using the same byteCodec to receive and send bytes and written a single test for send and receive. Let me know if that's good enough.

For e2e tests I think we will need to have management server using go-control-plane but that can be separate PR

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The server should ideally do proto as is the default

I have created a test server using the same byteCodec

Is there a reason you didn't follow the way I suggested? A normal proto server would be less testing code (which is good) and better proof the byte codec is working as intended.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kept the test logic same and change the test server to implement AggregatedDiscoveryServiceServer and handle StreamAggregatedResources

"time"

"google.golang.org/grpc/credentials/insecure"
"google.golang.org/grpc/internal/grpctest"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should avoid using anything from grpc/internal. That includes grpctest and e2e. We won't be able to move this out like this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah intend it remove. Regarding grpctest, should we just remove subtests for now or later? Regarding e2e, as i mentioned I think we need to add grpc enabeld xds server.

@dfawley dfawley assigned purnesh42H and unassigned dfawley Feb 6, 2025
@purnesh42H purnesh42H force-pushed the generic-xds-client-grpc-transport branch from e892817 to 5db47df Compare February 7, 2025 11:41
@purnesh42H purnesh42H requested a review from dfawley February 7, 2025 11:48
@purnesh42H purnesh42H force-pushed the generic-xds-client-grpc-transport branch 5 times, most recently from 4d47a17 to 4ca2a7c Compare February 10, 2025 05:48
@purnesh42H purnesh42H assigned dfawley and unassigned purnesh42H Feb 10, 2025
@purnesh42H purnesh42H requested a review from easwars February 18, 2025 04:39
@purnesh42H purnesh42H force-pushed the generic-xds-client-grpc-transport branch from 3bc9087 to f1abdf0 Compare February 18, 2025 07:12
}

func (c *byteCodec) Name() string {
return "byteCodec"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we use a more specific name to avoid possible name conflicts with a similar byte codec used by a user? Maybe something like "grpc.xds.internal.clients.grpctransport.byte_codec" that mirrors the package path?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sense. Changed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Upon thinking more about this, I'm not sure how I feel about the name that I suggested. This string will be part of every HEADERS frame, sent for every new stream being created. I'm wondering if we should at all be concerned about sending such a long name in the header, for performance reasons.

wantErr bool
}{
{
name: "ServerURI_empty",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is something that I learnt recently as well, and though it feels a little weird in the beginning to have subtest names that have underscrores, when it comes time to run them individually, it makes a lot of sense as direct copy paste works.

if err := proto.Unmarshal(res, &gotRes); err != nil {
t.Fatalf("failed to unmarshal response from ts.requestChan to DiscoveryRequest: %v", err)
}
if !cmp.Equal(testDiscoverResponse, &gotRes, protocmp.Transform()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is just way too much coupling between the test logic and the logic in the test server. This results in very brittle tests.

What is the eventual plan? Is it to use the go-control-plane as the management server here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, once we have xds client implemented, we will be able to use go-control-plane in e2e tests. For now, having the code coverage with simple grpc server is enough.

After last commit, response to be returned by StreamAggregatedResources handler can be controlled by test as it is a param in testServer. Regarding having test response as part of testServer struct, we can have implementation of StreamAggregatedResources within test inline so then we don't need response field in testServer.

@easwars easwars assigned purnesh42H and unassigned easwars and dfawley Feb 18, 2025
@arjan-bal arjan-bal modified the milestones: 1.71 Release, 1.72 Release Feb 19, 2025
@purnesh42H purnesh42H assigned easwars and unassigned purnesh42H Feb 19, 2025
@purnesh42H purnesh42H requested a review from easwars February 19, 2025 19:33
@purnesh42H purnesh42H force-pushed the generic-xds-client-grpc-transport branch from 3122cda to 9049d97 Compare February 20, 2025 18:04
// The Extension field of the ServerConfig must be a ServerConfigExtension.
func (b *Builder) Build(sc clients.ServerConfig) (clients.Transport, error) {
if sc.ServerURI == "" {
return nil, fmt.Errorf("ServerConfig's ServerURI field cannot be empty")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

go/go-style/decisions#error-strings

The style guide does say "Error strings should not be capitalized unless beginning with an exported name". But this error is happening deep inside the stack, and will definitely be wrapped in some other error. So, this capitalization will not look great when the final error is passed to the user.

Can we do better here?

It is customary for Go packages to prefix their errors with their package name. So, one option is to do that for all errors generated by this package. So, errors here can be fmt.Errorf("grpctransport: missing field ServerConfig.ServerURI"). Or if we take a leaf out of the tls package, they could be fmt.Errorf("grpctransport: ServerURI not set in ServerConfig")

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Modified all the error messages in format "grpctransport: ....not set in ....."

if b, ok := v.([]byte); ok {
return b, nil
}
return nil, fmt.Errorf("message must be a byte slice")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we identify the type of v in the error message?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

*b = data
return nil
}
return fmt.Errorf("target must be a pointer to a byte slice")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. Identify the offending type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

})
cc, err := grpc.NewClient(sc.ServerURI, kpCfg, grpc.WithCredentialsBundle(sce.Credentials), grpc.WithDefaultCallOptions(grpc.ForceCodec(&byteCodec{})))
if err != nil {
return nil, fmt.Errorf("error creating grpc client for server uri %s, %v", sc.ServerURI, err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This error string could be grpctransport: failed to create transport to server %q: %v", sc.ServerURI, err

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

// setupTestServer set up the gRPC server for AggregatedDiscoveryService. It
// creates an instance of testServer that returns the provided response from
// the StreamAggregatedResources() handler and registers it with a gRPC server.
func setupTestServer(t *testing.T, response *v3discoverypb.DiscoveryResponse) *testServer {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please mark as a test helper. See: go/go-style/decisions#mark-test-helpers

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

if (err != nil) != test.wantErr {
t.Fatalf("Build() error = %v, wantErr %v", err, test.wantErr)
}
if tr != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think we can simplify all this complex logic by splitting the success and failure cases into separate tests? See: https://g3doc.corp.google.com/company/teams/go-community/gotip/episodes/50.md?cl=head

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simplified. For success, also added check to verify if grpc.ClientConn is not nil.

}

// TestBuild verifies that the grpctransport.Builder creates a new
// grpc.ClientConn every time Build() is called.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

every time Build() is called
I don't see this being verified or even exercised here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not applicable now.

}
}

// TestStream_SendAndRecv verifies that grpcTransport.Stream.Send()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor nits. I don't think we have to identify function/method names with their package and type prefixes like grpcTransport.Stream.Send. Just Send and Recv should be good enough I feel.

Applies to many such comments across this file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

// on the stream to and from the gRPC server.
//
// It starts a gRPC test server using setupTestServer(). The test then sends a
// testDiscoverRequest on the stream and verifies that the received discovery
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: s/received discovery/received discovery request/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

t.Fatalf("<-ts.requestChan = %v, want %v", &gotReq, &testDiscoverRequest)
}
case <-ctx.Done():
t.Fatalf("timeout waiting for request to reach server")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/timeout/Timeout

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

select {
case gotReq := <-ts.requestChan:
if !cmp.Equal(gotReq, testDiscoverRequest, protocmp.Transform()) {
t.Fatalf("<-ts.requestChan = %v, want %v", &gotReq, &testDiscoverRequest)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test error strings should be readable. <-ts.requestChan is not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching. Modified to display diff.

@easwars easwars assigned purnesh42H and unassigned easwars Feb 21, 2025
@purnesh42H purnesh42H force-pushed the generic-xds-client-grpc-transport branch from 2077260 to 5050360 Compare February 21, 2025 08:52
@purnesh42H purnesh42H requested a review from easwars February 21, 2025 08:56
@purnesh42H purnesh42H assigned easwars and unassigned purnesh42H Feb 21, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: xDS Includes everything xDS related, including LB policies used with xDS. Type: Feature New features or improvements in behavior
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants