document has bytes remaining that were not visited #481

Closed
LinusU opened this issue Jul 3, 2024 · 7 comments
@LinusU

LinusU commented Jul 3, 2024

Versions/Environment

  1. What version of Rust are you using? 1.78.0
  2. What operating system are you using? Amazon Linux 2
  3. What versions of the driver and its dependencies are you using?
    • registry+https://github.com/rust-lang/crates.io-index#mongodb@3.0.0
    • registry+https://github.com/rust-lang/crates.io-index#bson@2.11.0
  4. What version of MongoDB are you using? 5.0.26
  5. What is your MongoDB topology (standalone, replica set, sharded cluster, serverless)? replica set

Describe the bug

For one specific document in our database, we are getting the following error:

document has bytes remaining that were not visited: 2344

I have tried dumping the document as BSON from the database, and I can read that just fine. We are, however, using a projection, so my working theory is that our database server is returning invalid BSON for that specific projection on that specific query 🤔

We have ~6.8 million documents that work, and one that doesn't 😅

We also tried removing the document, and inserting it again, but the same error is still happening when we query it out with a projection.

I would love to get some more help on how to debug this! e.g. how I can dump the exact BSON that's being parsed.

BE SPECIFIC:

  • What is the expected behavior and what is actually happening?
    • Expected behaviour is for the BSON to parse, but the actual behavior is that it gives the error pasted above
  • Do you have any particular output that demonstrates this problem?
    • Unfortunately not at this time, but would love help on how to get this!
  • Do you have any ideas on why this may be happening that could give us a
    clue in the right direction?
    • My working theory is that there is an incompatibility with the serializer used in MongoDB 5.0.26, and this BSON deserializer
  • Did this issue arise out of nowhere, or after an update (of the driver,
    server, and/or Rust)?
    • The issue arose out of nowhere; that is, no versions have been upgraded in a long time
  • Are there multiple ways of triggering this bug (perhaps more than one
    function produces a crash)?
    • So far, this is the only document out of ~6.8 million that we have observed this error on
  • If you know how to reproduce this bug, please include a code snippet here:
    • Unfortunately, not at this time...

To Reproduce
Unfortunately, I haven't figured out how to reproduce this yet.

@isabelatkinson
Contributor

Hey @LinusU, thanks for opening this issue! To clarify my understanding, are the following correct?

  • The document in question can be deserialized properly when you read it from your database without a projection
  • The document in question cannot be deserialized when your query uses a projection

To determine whether the issue is coming from the driver specifically, can you run the query with the projection that's giving you problems in mongosh? If that succeeds, then there's likely a bug in our deserialization logic; the error you're receiving comes from this line in the Rust BSON library. If that's the case, whatever information you can provide for us about the projected document would help in diagnosing what's going wrong on our end.

@isabelatkinson
Contributor

One more question: what is the generic type of the collection you're using for the query? Does the behavior change when using Collection<Document> or Collection<RawDocumentBuf>?
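
On dumping the exact BSON: the sketch below is a dependency-free illustration, not driver code. It assumes the raw bytes of the returned document are already in hand (for example from a `Collection<RawDocumentBuf>` query via `RawDocumentBuf::as_bytes`, which is not shown here). The helper names `has_valid_framing` and `hex_dump` are hypothetical. The first checks the two framing rules every BSON document follows (a little-endian int32 length prefix equal to the total size, and a trailing 0x00 byte), which is a quick way to test the "server sent invalid BSON" theory. The second prints the bytes for inspection.

```rust
/// Check the framing of a BSON document: the first four bytes are a
/// little-endian int32 holding the total length, and the last byte is 0x00.
fn has_valid_framing(bytes: &[u8]) -> bool {
    if bytes.len() < 5 {
        return false; // the empty document is already 5 bytes long
    }
    let declared = i32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]);
    declared as i64 == bytes.len() as i64 && bytes[bytes.len() - 1] == 0x00
}

/// Render bytes as lines of up to 16 hex pairs, each prefixed by its offset.
fn hex_dump(bytes: &[u8]) -> String {
    let mut out = String::new();
    for (i, chunk) in bytes.chunks(16).enumerate() {
        let hex: Vec<String> = chunk.iter().map(|b| format!("{:02x}", b)).collect();
        out.push_str(&format!("{:08x}  {}\n", i * 16, hex.join(" ")));
    }
    out
}
```

If the framing check passes and the dump looks sane, the bytes on the wire are probably fine and the mismatch is more likely between the document's shape and the target type.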

There has not been any recent activity on this ticket, so we are marking it as stale. If we do not hear anything further from you, this issue will be automatically closed in one week.

@github-actions github-actions bot added the Stale label Jul 11, 2024

There has not been any recent activity on this ticket, so we are closing it. Thanks for reaching out and please feel free to file a new issue if you have further questions.

@LinusU
Author

LinusU commented Jul 30, 2024

@isabelatkinson, so sorry for the late reply on this; unfortunately, it came up right at the start of my vacation.

This has now happened to another document.

It seems the projection was a red herring: after more investigation, I've concluded that this happens with or without the projection. I've managed to trim it down to the code below:

use mongodb::{bson::doc, Client};

// `Order` is our custom Serde struct (definition omitted here).
#[tokio::main]
async fn main() -> Result<(), lambda_runtime::Error> {
    let client = Client::with_uri_str("mongodb+srv://......").await?;
    let db = client.default_database().unwrap();

    let mut cursor = db
        .collection::<Order>("Order")
        .find(doc! { "_id": "xyz123" })
        .await?;

    while cursor.advance().await? {
        println!("Error is on the line below");
        cursor.deserialize_current()?;
        println!("Error is on the line above");
    }

    Ok(())
}

This code gives the following output:

Error is on the line below
Error: Error { kind: BsonDeserialization(DeserializationError { message: "document has bytes remaining that were not visited: 2109" }), labels: {}, wire_version: None, source: None }

Order is a custom struct using Serde deserialization traits. Using Document or RawDocumentBuf works without an error.

With this information I was able to further narrow it down to the following line:

  #[serde(borrow)]
  pub refund_receipts: Option<[OrderReceipt<'a>; 1]>,

This code wasn't built to handle anything other than 0 or 1 refund receipts, but the document in question had two. So in a sense this was user error, just a bit hard to understand and track down 😅
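
For anyone hitting the same message, here is a toy model of why a fixed-size target produces it. This is only a sketch under stated assumptions, not the bson crate's actual implementation: `encode_i32_array` and `read_exactly` are hypothetical helpers, the array holds plain int32 values instead of receipts, and the way the real deserializer counts leftover bytes may differ. The reader consumes exactly `n` elements, as a `[T; 1]` target would, and then complains about whatever element bytes it never visited.

```rust
/// Encode i32 values as a BSON array document (keys are "0", "1", ...).
fn encode_i32_array(values: &[i32]) -> Vec<u8> {
    let mut body = Vec::new();
    for (i, v) in values.iter().enumerate() {
        body.push(0x10); // element type tag: int32
        body.extend_from_slice(i.to_string().as_bytes());
        body.push(0x00); // key terminator
        body.extend_from_slice(&v.to_le_bytes());
    }
    body.push(0x00); // document terminator
    let total = (body.len() + 4) as i32; // the length prefix counts itself
    let mut out = total.to_le_bytes().to_vec();
    out.extend_from_slice(&body);
    out
}

/// Read exactly `n` int32 elements, then fail if element bytes are left over.
fn read_exactly(bytes: &[u8], n: usize) -> Result<Vec<i32>, String> {
    let total = i32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]) as usize;
    let mut pos = 4;
    let mut out = Vec::with_capacity(n);
    for _ in 0..n {
        if bytes[pos] != 0x10 {
            return Err(format!("expected an int32 element at offset {}", pos));
        }
        pos += 1;
        while bytes[pos] != 0x00 {
            pos += 1; // skip the key
        }
        pos += 1; // skip the key terminator
        out.push(i32::from_le_bytes([
            bytes[pos],
            bytes[pos + 1],
            bytes[pos + 2],
            bytes[pos + 3],
        ]));
        pos += 4;
    }
    let remaining = total - pos - 1; // exclude the document terminator
    if remaining > 0 {
        return Err(format!("bytes remaining that were not visited: {}", remaining));
    }
    Ok(out)
}
```

In this model, `read_exactly(&encode_i32_array(&[7]), 1)` succeeds, while `read_exactly(&encode_i32_array(&[7, 9]), 1)` fails because the second element is never read. A variable-length target such as `Vec<T>` avoids the problem because it accepts any number of elements.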


I think that it would be awesome if the error message could be updated, but otherwise I'm able to work around this myself now. Thanks for your help!

@isabelatkinson
Contributor

Thanks for the additional information! Agreed that this is not a useful error message; I filed RUST-2007 to investigate how to improve it.

@LinusU
Author

LinusU commented Jul 31, 2024

Thanks! 🙏
