
[wgpu-hal] Blas compaction #7101

Merged 3 commits into gfx-rs:trunk from compaction-hal on Feb 12, 2025
Conversation

@Vecvec (Contributor) commented Feb 10, 2025

Connections
Hal part of #6609

Description
Adds support for compacting BLASes (bottom-level acceleration structures) in wgpu-hal.

Testing
No testing, but the successful tests in #6609 tested this API.

Checklist

  • Run cargo fmt.
  • Run cargo clippy.
  • Run cargo xtask test to run tests.
  • Add change to CHANGELOG.md. See simple instructions inside file.
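To make the hal-level flow concrete, here is a minimal, simplified model of how BLAS compaction typically works: the device writes the compacted size into a query slot, the host reads it back, and the structure is copied into a smaller allocation. This is an illustrative sketch only; the struct and function names are hypothetical and not wgpu-hal's actual API (on Vulkan the real backend drives this with `vkCmdWriteAccelerationStructuresPropertiesKHR` and `vkCmdCopyAccelerationStructureKHR`).

```rust
// Hypothetical, host-side model of the BLAS compaction flow.
// All names are illustrative, not the actual wgpu-hal API.

#[derive(Debug)]
struct Blas {
    backing_size: u64, // bytes of GPU memory backing the structure
}

/// Step 1: the device writes the compacted size (a single u64, i.e. the
/// 8 bytes mentioned in the discussion below) into a query pool slot.
fn query_compacted_size(blas: &Blas) -> u64 {
    // Stand-in for the GPU query; drivers typically report a size
    // smaller than the conservatively allocated build size.
    blas.backing_size / 2
}

/// Step 2: after reading the size back on the host, allocate a new,
/// smaller BLAS and copy the old one into it in "compact" mode.
fn compact(blas: &Blas) -> Blas {
    let compacted_size = query_compacted_size(blas);
    assert!(compacted_size <= blas.backing_size);
    Blas { backing_size: compacted_size }
}

fn main() {
    let built = Blas { backing_size: 4096 };
    let compacted = compact(&built);
    println!("{} -> {}", built.backing_size, compacted.backing_size);
}
```

Note the extra synchronization this implies: the build must finish before the size query, and the query result must be read back before the compacting copy can be recorded.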

@Vecvec Vecvec requested a review from a team as a code owner February 10, 2025 19:56
@Vecvec Vecvec changed the title Initial. [wgpu-hal] Blas compaction Feb 10, 2025
@nical (Contributor) left a comment
LGTM. I suspect that it would be better to have fewer/larger query pools in the vulkan backend, but we can cross that bridge if it shows up in profiles.

@nical nical enabled auto-merge (squash) February 12, 2025 13:09
@nical nical merged commit 3a4a40a into gfx-rs:trunk Feb 12, 2025
33 checks passed
@cwfitzgerald cwfitzgerald deleted the compaction-hal branch February 12, 2025 18:29
@Vecvec (Author) commented Feb 12, 2025

> LGTM. I suspect that it would be better to have fewer/larger query pools in the vulkan backend, but we can cross that bridge if it shows up in profiles.

@JMS55 mentioned this on Matrix too, but I'd need to look at the specifics (and how expensive they are). For Mesa at least this is 8 bytes read back from the acceleration structure, so I would be surprised if this is the main bottleneck (especially since my current plan is to combine this into a build command, which is not very fast itself).

@nical (Contributor) commented Feb 12, 2025

It's not the cost of the copy as much as the driver overhead of managing a lot of tiny pools (and maybe the per-pool memory overhead). The recommendation is generally to group things into large-ish pools but to be honest I don't know how impactful it is.
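The grouping nical describes can be sketched as a simple slot allocator: instead of creating one tiny query pool per compaction query, hand out slots from shared fixed-capacity pools so driver-side object creation scales with the pool count, not the query count. The names and the capacity constant below are hypothetical, not wgpu-hal's actual Vulkan backend code.

```rust
// Sketch of the "fewer, larger query pools" idea: a set of shared pools
// of fixed capacity, created lazily. `pools_created` stands in for calls
// to something like vkCreateQueryPool. Names are illustrative only.

const POOL_CAPACITY: u32 = 64; // queries per pool (tunable)

#[derive(Default)]
struct QueryPoolSet {
    pools_created: u32, // driver objects created so far
    used_in_last: u32,  // slots handed out from the newest pool
}

impl QueryPoolSet {
    /// Returns (pool index, slot index) for one compacted-size query.
    fn allocate(&mut self) -> (u32, u32) {
        if self.pools_created == 0 || self.used_in_last == POOL_CAPACITY {
            self.pools_created += 1; // only here do we touch the driver
            self.used_in_last = 0;
        }
        let slot = self.used_in_last;
        self.used_in_last += 1;
        (self.pools_created - 1, slot)
    }
}

fn main() {
    let mut set = QueryPoolSet::default();
    // 100 BLAS compaction queries need only 2 pools instead of 100.
    let last = (0..100).map(|_| set.allocate()).last().unwrap();
    println!("pools = {}, last slot = {:?}", set.pools_created, last);
}
```

The trade-off is the per-pool memory overhead mentioned above: a mostly-empty 64-slot pool wastes slots, so the capacity would want tuning against real workloads.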

@Vecvec (Author) commented Feb 12, 2025

I think I'll add it to

// Ray tracing
// Major missing optimizations (no api surface changes needed):
// - use custom tracker to track build state
// - no forced rebuilt (build mode deduction)
// - lazy instance buffer allocation
// - maybe share scratch and instance staging buffer allocation
// - partial instance buffer uploads (api surface already designed with this in mind)
// - ([non performance] extract function in build (rust function extraction with guards is a pain))
when doing wgpu-core compaction.

@JMS55 (Collaborator) commented Feb 12, 2025

Yeah this is something we can stress test once I start the actual RT work in Bevy and have something to measure.

marcpabst pushed a commit to marcpabst/wgpu that referenced this pull request Feb 19, 2025