Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Program deletes items from array incorrectly, fixes if we print the array (RC issue?) #7192

Open
sirasistant opened this issue Jan 27, 2025 · 8 comments
Labels
bug Something isn't working

Comments

@sirasistant
Copy link
Contributor

sirasistant commented Jan 27, 2025

Aim

Doing unrelated changes to a kernel circuit, triggered this bug.

Expected Behavior

No corruption

Bug

  • Clone this branch https://github.com/AztecProtocol/aztec-packages/compare/arv/bug_repro
  • Build the nargo under noir/noir-repo (cargo build --workspace --release)
  • With the nargo present in the subrepo, do in noir-projects/noir-protocol-circuits ../../noir/noir-repo/target/release/nargo execute --package private_kernel_inner_simulated
  • It should pass (the Prover.toml is commited)
  • Now search for the print under "BUG: If we delete this print, the resoluting note_hashes bounded vec is missing the original items"
  • Delete it, run the inner simulated again, should fail.
  • Now run the inner not simulated (acir) ../../noir/noir-repo/target/release/nargo execute --package private_kernel_inner_simulated
  • It should pass

Possibly related to:
AztecProtocol/aztec-packages#10558

To Reproduce

Workaround

None

Workaround Description

No response

Additional Context

No response

Project Impact

None

Blocker Context

No response

Nargo Version

No response

NoirJS Version

No response

Proving Backend Tooling & Version

No response

Would you like to submit a PR for this Issue?

None

Support Needs

No response

@sirasistant sirasistant added the bug Something isn't working label Jan 27, 2025
@github-project-automation github-project-automation bot moved this to 📋 Backlog in Noir Jan 27, 2025
@TomAFrench
Copy link
Member

Hey Alvaro, could you try and minimize this reproduction so that we can pull it out of aztec-packages to track it here?

Also have you confirmed that this issue exists in the builds of nargo from this repository?

@sirasistant
Copy link
Contributor Author

sirasistant commented Jan 27, 2025

The behavior with current noir repo master nargo is "worse": always fails in unconstrained nargo execute --package private_kernel_inner_simulated and still passes in constrained nargo execute --package private_kernel_inner. So the oracle addition doesn't work anymore.
I'll try to minimize it but is probably nontrivial and depends on random parts of the code, since I didn't modify this part

@aakoshh
Copy link
Contributor

aakoshh commented Jan 27, 2025

Sounds very similar to #7163

@aakoshh
Copy link
Contributor

aakoshh commented Jan 28, 2025

NOTE: I merged sync-noir @ 39f42b898fafb66a0d4d4ffd79a39cd2938e97ed into this branch locally to test it.

After some debugging, the problem now isn't that the item is deleted; it's that it's not added to a copy, but rather the original array. It's a bit difficult to follow the route of the data through the code, but roughly:

  1. The circuit gets the previous kernel data and a new private call
  2. Builds a composer by copying the note hashes from previous kernel data from arrays into a BoundedVec
  3. Appends the note hashes from the private call to the BoundedVec
  4. Creates the new output by turning the BoundedVec back into an array
  5. Validates that it has done the right thing by taking the count of how many notes there were in the previous kernel data, then
  6. Check that the notes in the output past the previous count are equal to the notes in the private call

Where things go wrong is that in step 3 instead of appending to a new array, we append to the previous data array as storage. Then, in step 5, it looks like the previous data storage already had those items, and then in step 6 we try to compare notes beyond where the data was appended, skipping the added items, comparing empty ones to the private call.

As a concrete example the Prover.toml has 0 previous notes, 1 new note, which gets correctly added to the output, but then instead of comparing output[0] == new[0] we compare output[1] == new[0].

As a reminder in #7163 the problem was that the array was similarly used as storage for the BoundedVec, but in that case the call_stack item was pop()'ed from the original, instead of a acting on a copy.

I'll now try to see what this looks like in the SSA, and then whether I can add it to the test case.

@aakoshh
Copy link
Contributor

aakoshh commented Jan 28, 2025

For reference what propagate_note_hashes (where the BUG: is) is trying to do:

    fn propagate_note_hashes(&mut self, private_call_public_inputs: PrivateCircuitPublicInputs) {
        // BUG: If we delete this print, the circuit will fail or pass, depending on nargo version
        // println(self.public_inputs.end.note_hashes);
        let note_hashes = private_call_public_inputs.note_hashes;
        for i in 0..note_hashes.len() {
            let note_hash = note_hashes[i];
            if note_hash.value != 0 {
                self.public_inputs.end.note_hashes.push(note_hash.scope(
                    private_call_public_inputs.call_context.contract_address,
                ));
            }
        }
    }

Here's a skeleton of its initial SSA:

brillig(inline) fn propagate_note_hashes f177 {
  b0(..., v49: &mut [(Field, u32, Field); 64], ..., v84: [(Field, u32); 16], ...):
    ...
    v135 = load v49 -> [(Field, u32, Field); 64]  // self.public_inputs.end.note_hashes
    inc_rc v135
    ...
    jmp b1(u32 0)
  b1(v133: u32):
    v209 = lt v133, u32 16
    jmpif v209 then: b3, else: b2
  b2():
    v210 = load v49 -> [(Field, u32, Field); 64]
    dec_rc v210
    ...
    return                                                                 // Nothing returned
  b3():
    v213 = unchecked_mul v133, u32 2
    v214 = array_get v84, index v213 -> Field     // private_call_public_inputs.note_hashes[i]
    v216 = unchecked_add v213, u32 1
    v217 = array_get v84, index v216 -> u32
    v219 = eq v214, Field 0
    v220 = not v219
    jmpif v220 then: b4, else: b5
  b4():
    v222, v223, v224 = call f180(v214, v217, v72) -> (Field, u32, Field)
    call f179(v49, v50, v222, v223, v224)        // self.public_inputs.end.note_hashes.push(...) 
    jmp b5()
  b5():
    v226 = unchecked_add v133, u32 1
    jmp b1(v226)
}

What's different from the previous case is that the array in question does not appear in an output position in this function, which means the fix applied in that issue, which is to preserve inc_rc for arrays in the terminator, does not apply to the inc_rc v135 in b0.

@aakoshh
Copy link
Contributor

aakoshh commented Jan 28, 2025

After preprocessing function, the array does seem to appear in an output position of generate_output:

brillig(inline) fn generate_output f3 {
  b0(..., v46: [(Field, u32, Field); 64], ...):
  ...
  b66():
    ...
    inc_rc v46
    ...
    v613 = allocate -> &mut [(Field, u32, Field); 64]      // self.public_inputs.end.note_hashes
    store v46 at v613
    ...
    jmp b67(u32 0)
  b67(v167: u32):
    v636 = lt v167, u32 16
    jmpif v636 then: b79, else: b68
  b68():
    v637 = load v613 -> [(Field, u32, Field); 64]
    dec_rc v637
    ...
    jmp b69(u32 0)
  b69(v168: u32):                                 // for i in 0..note_hashes.len() 
    v639 = lt v168, u32 16
    jmpif v639 then: b76, else: b70
  b70():
    v640 = load v613 -> [(Field, u32, Field); 64] // self.public_inputs.end.note_hashes
    dec_rc v640
    ...
    inc_rc v640                                   // inc_rc preserved, but it follows dec_rc and IIRC it's a signed counter
    v642 = load v614 -> u32                       // len of the BoundedVec
    v683 = make_array b"{\"kind\":\"struct\",\"name\":\"BoundedVec\",\"fields\":[[\"storage\",{\"kind\":\"array\",\"length\":64,\"type\":{\"kind\":\"struct\",\"name\":\"ScopedNoteHash\",\"fields\":[[\"note_hash\",{\"kind\":\"struct\",\"name\":\"NoteHash\",\"fields\":[[\"value\",{\"kind\":\"field\"}],[\"counter\",{\"kind\":\"unsignedinteger\",\"width\":32}]]}],[\"contract_address\",{\"kind\":\"struct\",\"name\":\"AztecAddress\",\"fields\":[[\"inner\",{\"kind\":\"field\"}]]}]]}}],[\"len\",{\"kind\":\"unsignedinteger\",\"width\":32}]]}"
    call print(u1 1, v640, v642, v683, u1 0)
    jmp b71(u32 0)
  b71(v169: u32):
    v685 = lt v169, u32 16
    jmpif v685 then: b73, else: b72
  b72():                                          // return block
    v686 = load v613 -> [(Field, u32, Field); 64] // self.public_inputs.end.note_hashes
    dec_rc v686
    ...
    dec_rc v686
    ...
    dec_rc v686
    ...
    inc_rc v686
    ...
    inc_rc v686
    ...
    return ..., v686, ...                         // array in return position
  b73():                                          // let note_hash = note_hashes[i];
    v746 = unchecked_mul v169, u32 2
    v747 = array_get v78, index v746 -> Field
    v748 = unchecked_add v746, u32 1
    v749 = array_get v78, index v748 -> u32
    v750 = eq v747, Field 0
    jmpif v750 then: b75, else: b74
  b74():                                           // if note_hash.value != 0
    v751 = load v613 -> [(Field, u32, Field); 64]  // self.public_inputs.end.note_hashes
    v752 = load v614 -> u32                        // len of the BoundedVec
    v753 = lt v752, u32 64
    constrain v753 == u1 1, "push out of bounds"
    v754 = unchecked_mul v752, u32 3
    v755 = array_set v751, index v754, value v747  // self.public_inputs.end.note_hashes.push(...)
    v756 = unchecked_add v754, u32 1
    v757 = array_set v755, index v756, value v749
    v758 = unchecked_add v756, u32 1
    v759 = array_set v757, index v758, value v66
    v760 = add v752, u32 1
    store v759 at v613
    store v760 at v614
    jmp b75()
  b75():                                           // else of `if note_hash.value != 0``
    v761 = unchecked_add v169, u32 1
    jmp b71(v761)

inc_rcs are here and there, alternating with dec_rc. IIRC the RC can be negative and go back to 0; not sure, but could be 0 when the array_set is called, perhaps. Actually if I go through it top to bottom, it seems to be that the RC would be 1 when it mutates.

I'll see what's different when the print is removed, and the circuit passes.

@aakoshh
Copy link
Contributor

aakoshh commented Jan 28, 2025

Here's the difference when the print is removed, as far as I can see it:

brillig(inline) fn generate_output f3 {
  b0(..., v46: [(Field, u32, Field); 64], ...):
  ...
  b66():
    ...
    inc_rc v46
    ...
    v613 = allocate -> &mut [(Field, u32, Field); 64]
    store v46 at v613
    ...
    jmp b67(u32 0)
  b67(v167: u32):
    v636 = lt v167, u32 16
    jmpif v636 then: b79, else: b68
  b68():
    v637 = load v613 -> [(Field, u32, Field); 64]
    dec_rc v637
    ...
    jmp b69(u32 0)
  b69(v168: u32):
    v639 = lt v168, u32 16
    jmpif v639 then: b76, else: b70
  b70():
    v640 = load v613 -> [(Field, u32, Field); 64]
    dec_rc v640
    ...                                            // inc_rc removed because `print` is removed
    jmp b71(u32 0)
  b71(v169: u32):
  ...
  // array_store in b74 is the same as above

It's literally just once inc_rc missing, which I think results in the RC being 0 or negative when the array_store happens. IIRC it only mutates the array if the RC is exactly 1, in which case it won't be, as there are inc_rc, dec_rc, dec_rc by the time we get there.

@aakoshh
Copy link
Contributor

aakoshh commented Jan 28, 2025

I fixed this with AztecProtocol/aztec-packages@4af0186 on the af/fix-7192 branch of aztec-packages, but it contains all other fixes from sync-noir, and does not work when cherry picked into this branch on its own.

I'm now checking if it fixes the test failure on sync-noir as well. (EDIT: No, it doesn't 😩)

BTW the refcount being equal to 1 was confirmed by using

println(std::mem::array_refcount(self.public_inputs.end.note_hashes.storage());

EDIT: Ignore this, it looks like it broke other tests in sync-noir.
EDIT2: sync-noir errors were fixed by AztecProtocol/aztec-packages@4608c49 but I haven't tried it with this yet.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
Status: 📋 Backlog
Development

No branches or pull requests

3 participants