Discovered by Ashwin and his team. This causes a bug in ipld-eth-server:

If a contract deployed at some address addr, with a non-empty storage trie s, is destroyed at some block n, and a contract (same or different) is then redeployed at addr (using the CREATE2 opcode) at some block m, the query currently used in RetrieveStorageAtByAddressAndStorageSlotAndBlockHash, when performed at some block m+x (where x >= 0), will behave as if the new contract inherited s as its initial state.
In particular, if the specific storage leaf key being queried doesn't exist in the redeployed contract's storage trie at m+x, but it did exist in s, the query will incorrectly return the value held at that key in s instead of an empty value.
Fixes
During statediffing
This could be fixed on the indexing end by adjusting the statediffing code to iterate s and insert a "removed" node-type entry into eth.storage_cids at block n for every node in the trie when generating the statediff between n-1 and n. This would be done here. Having to iterate the entire storage trie is potentially very expensive, and this wouldn't fix the data that has already been processed into Postgres.
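For concreteness, here is a hedged sketch of the rows such a diff would add for a contract destroyed at block n whose storage trie s still held two nodes. All values are placeholders, and the column list is limited to the columns referenced elsewhere in this issue; a real insert would also need to populate the table's remaining columns.

```sql
-- Sketch only: one "removed" (node_type = 3) row per node still present in s,
-- keyed to block n's header and to addr's state leaf path. All values are placeholders.
INSERT INTO eth.storage_cids (header_id, state_path, storage_path, node_type)
VALUES
    ('<block n hash>', '<state path of addr''s leaf>', '<storage path 1>', 3),
    ('<block n hash>', '<state path of addr''s leaf>', '<storage path 2>', 3);
```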
Adapt the query
Alternatively, the query used here could be adjusted to account for this issue.
Create a stored function (e.g. was_state_leaf_ever_removed) that checks whether a node has ever been removed for a specific state_leaf_key, returning the block height(s) it was removed at if so.
Create another stored function (e.g. retrieve_storage_by_address_and_storage_slot_and_blockHash) that first calls was_state_leaf_ever_removed to check if the storage key was ever removed. If it wasn't, it proceeds with the current access pattern. If it was, it finds the height at which it was (most recently) removed and limits its query to eth.storage_cids records above this block height.
This would add additional overhead to the query. It also only fixes this symptom, not the underlying issue; the issue will produce a similar symptom when, for example, trying to create a new storage snapshot from a previous snapshot plus all the interim diffs, and we would need an adapted query for that as well.
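A minimal sketch of what the first helper might look like, assuming the schema used elsewhere in this issue (eth.state_cids joins eth.header_cids on header_id = block_hash, and node_type = 3 marks a removed node); the parameter name and return type are illustrative:

```sql
-- Sketch only: returns every block height at which the state leaf for the given
-- key was marked as removed (node_type = 3). No rows means it was never removed.
CREATE OR REPLACE FUNCTION was_state_leaf_ever_removed(v_state_leaf_key VARCHAR)
RETURNS SETOF BIGINT AS
$$
    SELECT header_cids.block_number
    FROM eth.state_cids
        INNER JOIN eth.header_cids ON (state_cids.header_id = header_cids.block_hash)
    WHERE state_cids.state_leaf_key = v_state_leaf_key
      AND state_cids.node_type = 3
    ORDER BY header_cids.block_number DESC;
$$ LANGUAGE SQL;
```

The second function, retrieve_storage_by_address_and_storage_slot_and_blockHash, would then take the maximum height returned (if any) and add a block_number > &lt;that height&gt; predicate to the existing access pattern.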
Post-processing of data in Postgres
This issue could also be fixed in post-processing; the algorithm for this would look like:
Query the database to find "Removed" node-type eth.state_cids records that correspond to a contract destruction
That gets you all removed leaf nodes and the block heights they were removed at, but that includes EOAs and contracts.
To figure out whether the removed leaves contained contracts, you need to find the latest non-removed-type entry for those state_leaf_keys and check whether it contains a non-"zero" storage root (the "zero" value of the root is keccak256(rlp([]byte{}))).
Find the latest eth.storage_cids record at every storage_path in the trie for those contracts
```sql
SELECT DISTINCT ON (storage_path) storage_cids.*
FROM eth.storage_cids
    INNER JOIN eth.state_cids ON (storage_cids.state_path = state_cids.state_path)
        AND (storage_cids.header_id = state_cids.header_id)
    INNER JOIN eth.header_cids ON (state_cids.header_id = header_cids.block_hash)
WHERE state_cids.state_leaf_key = $1
    AND storage_cids.node_type != 3 -- can ignore nodes whose latest state is already "removed"
    AND block_number < $2
ORDER BY storage_path, block_number DESC;
```
$1 = state_leaf_key for the removed contract, found in step 1
$2 = block_number at which the contract was destroyed, found in step 1
At the block height at which the corresponding contract was destroyed, create new eth.storage_cids records with node_type = 3 for all of the records found in the above query.
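A hedged sketch of what that backfill could look like, reusing the step-2 query as a subselect. The column list is limited to the columns referenced in this issue, so a real migration would also need to populate the table's remaining columns; the parameter numbering follows the convention above, with a third parameter added for the destruction block's hash:

```sql
-- Sketch only: $1 = state_leaf_key of the destroyed contract (step 1),
--              $2 = block_number at which it was destroyed (step 1),
--              $3 = block_hash (header_id) of block $2.
-- Writes a node_type = 3 ("removed") entry at the destruction height for every
-- storage_path that was still live below it.
INSERT INTO eth.storage_cids (header_id, state_path, storage_path, node_type)
SELECT $3, latest.state_path, latest.storage_path, 3
FROM (
    SELECT DISTINCT ON (storage_path) storage_cids.*
    FROM eth.storage_cids
        INNER JOIN eth.state_cids ON (storage_cids.state_path = state_cids.state_path)
            AND (storage_cids.header_id = state_cids.header_id)
        INNER JOIN eth.header_cids ON (state_cids.header_id = header_cids.block_hash)
    WHERE state_cids.state_leaf_key = $1
      AND storage_cids.node_type != 3
      AND header_cids.block_number < $2
    ORDER BY storage_path, header_cids.block_number DESC
) AS latest;
```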
Proposal
Make the updates described in 1) (during statediffing) to fix newly indexed data. Use the approach described in 3) (post-processing) to fix the data already in Postgres.
These contract destructions aren't very frequent, so the performance impact on the head-tracking process should be minimal in aggregate, but it could introduce significant latency in processing the specific statediffs where large contract destructions do occur. Due to this infrequency, the historical processing required would be sparse, and with the index on node_type it should be fast to find the (relatively few) contracts we need to fix the issue for.
On Mar 10, 2022, i-norden changed the title from "Storage nodes are not being marked as removed when the contract, as a whole, is destroyed." to "Storage nodes are not being marked as removed when the contract is destroyed."
Ashwin and his team have identified that the problem described here extends to the storage trie. To alleviate this, we need to introduce a new "moved" node type to represent, in our Postgres DB, leaf nodes that have been moved to a new path versus leaf nodes that have been removed altogether.
Closing this: we no longer need to post-process v4 since we are moving to v5, and the missing integration test scenario is no longer relevant in v5. Because we don't index intermediate nodes there, there is no need to distinguish between "moved" and "removed" nodes, although we do still need better testing in place for when accounts or slots are removed, e.g. cerc-io/tx-spammer#18.