Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(DRAFT PR - NOT FOR REVIEW) mt stress fail fix (WFP transaction issue) #3685

Draft
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

dv-msft
Copy link
Collaborator

@dv-msft dv-msft commented Jul 2, 2024

Description

Describe the purpose of and changes within this Pull Request.

Testing

Do any existing tests cover this change? Are new tests needed?

Documentation

Is there any documentation impact for this change?

Installation

Is there any installer impact for this change?

Fixes #3602

NET_EBPF_EXT_TRACELOG_KEYWORD_EXTENSION, "FwpmCalloutDeleteByKey", status);
}

status = FwpmSubLayerDeleteByKey(_wfp_engine_handle, _net_ebpf_ext_wfp_callout_states[index].layer_guid);
Copy link
Collaborator

@shankarseal shankarseal Jul 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bug: the sublayer key is ebpf_hook_sub_layer.subLayerKey.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The second parameter for FwpmSubLayerDeleteByKey is a pointer to a GUID (https://learn.microsoft.com/en-us/windows/win32/api/fwpmu/nf-fwpmu-fwpmsublayerdeletebykey0). How'd we be able to use an instance of ebpf_hook_sub_layer which is a FWPM_SUBLAYER struct?

Per that same link, this GUID needs to be the same one that was used to create the sub-layer, as I've done here. Not sure what I'm missing here?

@@ -638,27 +700,47 @@ net_ebpf_extension_initialize_wfp_components(_Inout_ void* device_object)

NET_EBPF_EXT_LOG_ENTRY();

if (_fwp_engine_handle != NULL) {
if (_wfp_engine_handle != NULL) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please do not change the identifier.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took it to be an accidental typo ('fwp' instead of 'wfp'), so fixed it. I can certainly revert it back but was curious why you'd prefer the original.

NET_EBPF_EXT_BAIL_ON_API_FAILURE_STATUS(NET_EBPF_EXT_TRACELOG_KEYWORD_EXTENSION, "FwpmEngineOpen", status);
is_engine_opened = TRUE;

status = FwpmTransactionBegin(_fwp_engine_handle, 0);
// Clean up stale WFP persisted state, if any.
_net_ebpf_extension_cleanup_state(_wfp_engine_handle);
Copy link
Collaborator

@shankarseal shankarseal Jul 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wrong about sessions being persistent by default. It is not. Please read this. So, I think we do not need to call cleanup_state in driver entry. Sorry about the back and forth.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries. Thanks for that update. So would I be right in assuming that NetEbpfExt would hang forever at unload if it did not perform any of these clean-up actions? If not, we'd still need to clean-up stale state in DriverEntry, right?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify I want you to try reverting back the global session to be dynamic and let the states be cleaned up automatically at driver unload. But keep the per-filter-context session as static. I think that session can still be associated with the states created in the global dynamic session.

If that does not work, whatever you have put here will work.

@dv-msft dv-msft force-pushed the 3602-mt-stress-fail-fix branch from c5d1633 to 6b08d60 Compare July 3, 2024 07:44
// already registered
goto Exit;
}

// Details @ https://learn.microsoft.com/en-us/windows/win32/fwp/object-management#object-associations.
RtlZeroMemory(&session, sizeof(FWPM_SESSION));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove the dynamic flag.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Workflow failed - km_mt_stress_tests_restart_extension
2 participants