Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Correctly implement QUIC sniffer when handling multiple initial packets #3310

Conversation

Vigilans
Copy link
Contributor

@Vigilans Vigilans commented Feb 12, 2025

In recent chromium browser, a QUIC connection may contain multiple initial packets. This PR fixes this issue, to:

  1. Cache and parse multiple UDP packets until a ClientHello can be detected from CRYPTO stream.
  2. Increase max sniffing packets for UDP from 2 to 5, to avoid missing any initial packet after 2 packets (+3 simply because TCP handshake contains 3 packets, so the 2th sniffed packet for TCP is actually 5th packet)
  3. io.EOF will not cause sniffing to fail. Though it maybe irrelavent for QUIC here because UDP packets have determined boundary. This fix may be helpful for TCP stream.

This fix has been proved to work without error for around 9 months in my local setup.

Addresses #3303

@Vigilans Vigilans force-pushed the vigilans/quic-sniffer-multiple-initial-packet branch from ad0be0a to 9afb021 Compare February 12, 2025 15:17
Comment on lines +57 to +62
for len(b) > 0 {
buffer := buf.FromBytes(b)
typeByte, err := buffer.ReadByte()
if err != nil {
return nil, errNotQuic
}
Copy link
Contributor Author

@Vigilans Vigilans Feb 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of changes in this file are just indented under for len(b) > 0 { block. For a better review experience check Hide whitespace settings in File Changes tab.

@Vigilans Vigilans force-pushed the vigilans/quic-sniffer-multiple-initial-packet branch from 9afb021 to 45aaf35 Compare February 12, 2025 15:22
@dyhkwong
Copy link
Contributor

It seems that sniffer.Sniff never returns io.EOF?

Copy link
Contributor

@xiaokangwang xiaokangwang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have reviewed the code and think it is in general nice, although I did find it failed to process one corner case when I was trying to add the test(pacp attached).

I am happy to merge it once the EOF thing is resolved.

Anther thing I think we don't need to worry too much but should keep an eye for is that because we are waiting for 100 ms for receiving the packet for sniffing, the worst case time out is now increased to 500ms. It is still too early to say it would break anything, but I am ready to fix this will a separate timeout timer as needed.

quic-chromebook-2.zip

return nil, errSniffingTimeout
}

cReader.Cache(payload)
if !payload.IsEmpty() {
result, err := sniffer.Sniff(ctx, payload.Bytes(), network)
if err != common.ErrNoClue {
if err != common.ErrNoClue && err != io.EOF {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as #3310 (comment) , I didn't find the reason why io.EOF need to be checked here. Maybe it is from another change?

Copy link
Contributor Author

@Vigilans Vigilans Feb 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After navigating through the code, I can confirm that io.EOF is added because quic sniffer's Buffer.Read/ReadBytes will return io.EOF. However io.EOF is not handled by sniffer, instead errQuic is returned. Therefore I removed check of io.EOF in my update.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Besides, I found increased attempts and io.EOF check are controdictary, because for non-valid payload, returning io.EOF and not finish sniffing is just waste of resource, especially regarding the 500ms timeout.

Vigilans and others added 3 commits February 14, 2025 22:26
Signed-off-by: Vigilans <vigilans@foxmail.com>
* Third packet in `Handshake[2]; packet 1-3` mistakenly copied UDP header into payload, making the payload length 1278 instead of 1250
@Vigilans Vigilans force-pushed the vigilans/quic-sniffer-multiple-initial-packet branch from 32d6403 to d5c5f6d Compare February 14, 2025 15:05
@@ -126,7 +126,7 @@ func TestSniffQUICComplex(t *testing.T) {
},
{
name: "QUIC Chromebook Handshake[2]; packet 1-3",
hexData: "cf0000000108452af27900a1723300404600ca8530029a6bc59ff3e85beb5fc838ac3147ba5c2f6421ddcffdd85167d8de70eff2b1fa016dad4918337dec0feb660edb98e078dea51fa914055984b7957bd732fa4c831e448967af34752fd95835e3caba1e022d6d164f3f53f1bd7f60d560a8684079e90626aa1a4d3fe728158f7e1055ff76d1566072113982b193fb932265381e4de7afb35caa4ec56f31595a33fa2eb0bc84feb9f273224050938825fd21aa7317042ad00785ffd36151aee566a5dfe17d72591af1235059171568e5af0d13fc56e7897c3d632be753d8dea184c3d96d92bc56978cc669d94dd4c5e8dc3dcba7f0a39368fb1e87981e54bba7b86fbd8e8023a94d84f0290f402a5244cb4b0eeaaa57610ea59711a43932c521f10edb4560375693cbea60240389b8cebfd94035cabe4fc96ce8a726b979775e06c3bb0e3c4c866fe82e89fb725499e711e39310b93c785b313459f22d4ba37f90b19447165c2584269d98bf47d1f7ca89585797e4d6f1a4a1db7d2b0ae91a93fb15c3bb0ab953c3656b3b2ca20833d15e95329baff6d2ade1b0921b5ed3ae96648bf123b5265e27b049e9a8674455ff5f763f039568026e4fbe9882fef761c573d8f12e342c274a8dd3ad9854a688ce57cdddb52c758161ae3a59f67fc0d5b85f12e27617e7f4366e97a61fcda084e620dde35686f01dac49ce4bd76b986e3223c215919a1b228beeb74b7fcf32827d55be8f1b3b5fed24df2db023faecbb313b18a151cc4af8199d4bb08f8127b8207a0286d52758eaca87fd476ece0e3b17bcd8afb0289e8fd33c4455d4db6f058826c301ea303bfe2c0a6651a8fb6a2e1897852d758076adb04ad907077c5d5f94089da78d8923a34f1022ed672f378fe0dd81a709b372c0a2042a42e683c051c653e42b43c4a0ea8e961074d2901d4157ac9878b13a207b05ec471cff10d922b74d05623513cd6a4ea192ad21d4089de269633d4d2d1388d98d7c8a9e29848d5558b8aa2b73b437446a640230e6adb7f4b317ee5d66681c4aae11f69b1e5f96cb32ca6331405426cb706167d86f6f8fd588a72d7b2a6906798b81f174d808e1e3fc461e598e797c41bced26b87d09282d7b6d95076c285462e0c420a6f0e171ffe2791b5d221c03520409fe36622ff77796d9b7ef82babb25313acda9c621b22bf45ed909f9365b508860645af4c3aca78e6abca2d3a65c9159fbcd577438505d3f65a57c9412c12c069ad4d6db450beb08603abef621a9e029593fb5881dbd524ea2953b4acaaf59269b584c754e88c033247bb7c032e548d34fd9b2678e62fdf953dabf2be21c3e2d7b18ec7e3aedaf2cd082e19a369c1bcd4ca67e3d464e2200ecc3df98b0aa7f349415d68bcab0441ac3366607eff024bb786aec031a4619f8a24f554fe93c8520a03affcf11e40b6d5002f98c1708cac6c56e77eccba85ea6600d1391cfd202cc7914bfbaa3303266d1a820bf2dc84d2dfcdc4cdb79e6de3fbe3c02b288dcf955652f674f3f59b50849ea7dbf755bdafa27fba3db1267fb1354d8bf25a60cacb900b4d7ba913f9ba5f6b00559ad58b2f34a658ff7ef7f7d1ceeffd9c8325f271e6b5ba44d89685b744306963aa5e05ac0e8b00ada772dd5ae5ffb7043109afea86593743564c7acb4c8e7ef0e57d081eb1b9c0916078b113ece8a6036264a9b9781183c035342d50c7b069f3a01a40230e37ed8efde073c07d0e68066541d78c2f3cbe1e603cfcaaac40000000108452af27900a1723300404600ca8530029a6bc59ff3e85beb5fc838ac3147ba5c2f6421ddcffdd85167d8de70eff2b1fa016dad4918337dec0feb660edb98e078dea51fa914055984b7957bd732fa4c831e4489522d29bb5c84749f83c8e1edfd9da8d1738164a8a9c59e37a5c9994d90bb982dcfa69b20f868960dc139618f1adc2546d34340ae13d826260c54a456bbf7469ee37b1be1d7177004468d7e92cac62a0b165d6a114ad479861dd58959e094b5a6250359301d4a614d529660760e3d1cdec9bf444a3761309bab40e4a977bc749e0dae431952f5f7e6b1ebc1383d343359a387da4301f7fa4b400475e9b82367e56278376dd1c80349f083988945a13649008109cc12a3acf569ffcc5481fbcd86b544e7dc8434e9dd42bd8e5716a844d37879568db046857389d36cc7550c75f94e314db6749aa987f0fc730fae0fcf465d01c2fb745269dfc10132ddb5404dda2f9455780f5818730834aa9db4740793359884b9927b0bd1a5ca96052b4f17397d8b78aa891401bb8bed6726ea2229d919798c50e24d5f40576ac204847be9244aadbe5c773684c37475036541d209c177d4e9c22a1253292ce4ffb886b925b6cf83cc251976a68887eda2777590f51804b790b51eee77e717b7ef0eea71634594df36e6ae9e7574d65c51ac3196f0b2a3b0f023c81f05f7807f958dda03418ed49e14b645e814b9aa55b37c809be3172ca21fe4c7a78e17e9ece8def2dd2949310ecaa41b1b477f4e85db5288aa144e333f47ef291d0e822941181c13859d9fd6d640904ee764c9276125228c932dff3fb12f564f039b52f5ba1ab4d119641df8fe13f784802b99347f0046da63f471e34b1d12d3111cffe7b5d90cf5999879f6f23e7785f09cb10df32821bb68dd8fdcfcdbedd63f2428b2292b9f0e76ff36403c9644fd43e01112ee6218d0ec1c86f6d147e4b802293e906750c7046f53bf05a144e321d3b45e08e4064fd3828fdd1b5d1ceed74081f61319dc0ad9a6e8a3b9cc802e952d24e2271712e2c2cda7daca2f835e6c804feeef8d918404cc82a1aa9534bddff68a472b208a0d0a7fd68a08fbc411132af47a6b67a32617b7b9991524c21599e8e3cb9395cdab87a3f5bf5d1833a9c7ea021b29cf428c877c6b21d62f99340ac7f85ae721acc10968e7d79f111ca40c75e14060d07cffa046d71151a0b00eab657300344b04bd1a8871650c34ceda8610d7c1ba8d37673da6aaa580400e0230c69fba8ba21927de2f5897656144694550d1df3d268804adc707e7b236501734aeabb2e61cb08012bd96eca5a486d7a55f996992c36233815abd71c30e263ba0c5d9456fe0828df16f6af7929390bb143c426d9dfeaf4bb373554479ebe609b36b4bc3dd08ce216b9cdc5726edb458c5e4036d0edc688d3e39d20f8254b5d1f174518f15b344efc27fc56572c0159aa593d5b46bc33818f986e3df8caebd4c7b702ef50dd582714a2b94ecd1c4e90af37d388445c478a32ff6e8f5852ee115966b708eed04da322b98813a69423e95f90b89ce85518e39bcef36fdd5bd312b2c6c5ee85962675274c18f39ee35155517f70fd74b31bb2de6b5108d369252e6fb289e453833132ef7960da1cc0934790c039b9a1b0c74f23eb3b61fe9b4d0ea67de757b93af451eef303b1373199af446a0fa98d5991bbd4771ee63317e6da86efbe213dfff595c41b98e0e89f4f2df110104e760feebf4cb3361171c9fceb1e1c809a268450004fe559040003f11e3c62a2a2d09d83ace2aad1d01bb04ea83d1c60000000108452af27900a1723300404600ca8530029a6bc59ff3e85beb5fc838ac3147ba5c2f6421ddcffdd85167d8de70eff2b1fa016dad4918337dec0feb660edb98e078dea51fa914055984b7957bd732fa4c831e44892ff5e6b16d8a259a9128c2c0c3c525462781a344c3df7f19a747e0e79ca8714995c867fc697a3cb87b35e769465a8e966bcb35b7e897ad036aa23a6c021e2445a0eb79962151cd20dbb43ae1231847de01caf4e5589dfebf026e95f7d1d742e140d9dda849396a70cc0798f1eef06fd5f4cfbc9a190ddf04cc332c5b7b15e53af311190ced92a1291c12b8799f2b50e076539a8370ee667e1791a78f38e565a48acbaa1c78ba941dba8b0d040f8fb8bbcc9f6bf5705efa613a24b12d6ac9cebb4f3fac1b09a07b49d8a3a62808eb0a324629f13a012e6ad0feb11ad97c1572983c713b62f27584809ba43e64e4af9845af807c0783104838f4e2ac33fa848866f3cc64a7b6203a5c09e8ad231f0f06ae2fb7b39a64cedd823b0ff297ad9be1ccac436777ccb3e22ef6b9c12e6d5e34926f50e8ca8c8c0532c810b074d001c11791a01bf25786b57a5da54065dcee4962822e929f47ee44d3b8c83d45a8b7a936dc2a6fa396e4194fa032d1627eca59f69857fc40dab5835d3613dade1c74b09c345bd32c509e9545d2330b157a7acb76409f3ac8eaa22802414f38c5422fe4c5189caaf5c1b93ce7c0892f0cfc477490d335aa78961d632a973cf106bd974c2714176fb0f98cf12f2887a0d7bd491756dd374331eb3e6adb9f2bd0d6b273403fd14b314eb27ebbb6f6e78ce310437004b757c048149cf04429ae4a6d6e65c9b3e0b9c9c4d4ef52007eaaad9670320f10cd5317b3d3edc374d45c98b217dd28fb3c2c2fb6e74a3aced143e3242084b192ba6df24e69fdb883e850714fe27a45f43883486a986574fd1fc10f259fe90786441554514c8dade1f3b86fdaf5f54ab655e2d803c98aa56073b00c32148a1ed367dff3a2bd934ecba55141389990b661bbd9ce1ef1def13747d45500daf92cec9e60908274703e761cd46affd46622f2a2192a79425ebf51c875fc7ba3598e15e0ba2465fc3e87c8a5da1915d3b8abe4b16d21259f311183eee1e7d2b808a91a7c89b284df0eb6a2a79c610bbe47722b3e04d5a6c0e574816a94d97349b6976010eb8c7debf42210982f78de482b7cd068051bf57908dbf46b5ceaf64f5fb33ede4412c1ce81eb1dfb4e99e10dd9b57ebc6e62ecbf4ee2db04d9e48c62bd45f8fa51704d414296a2d51d25ced6a192034a44c67e09d8985b573f98e03fa36dd8dcce2c04b4d5b1f276b6a642aadbdcafcf09de1d234bf8bbbf64aeadf01519ddafb419b3e62d204e04c3d7ebaf54b09e387ac3e9c4781c11625a2f44fddb7a1886f21929bd01c283f64903b6ccbb463984dfadc00f6af2a421517da023fd319f528195cac5fe907624b70c0172479d07d78e266dbf20ab8fc302228f279ffdba7395a839c4a9d7a4e001a260e1702393968f1e9722f023b204cf09cfee9a7bba045e4a2a449ee9fbb5c36e93028cfc87a2e34914b1b4f01beeded175ac0fa73fea9292f2bc3b1247164d8e05cddc3981bdc24e5f596571c418f6fa00fd9d4d0898cbf0d2f5413bed5f100f1854903017b6bc88bd7e303b5e0e2417bbcc984731128eda550d31f9af0e6e743eb6916466bbd435617d56fa60b05cd7dca66a9f6f4be23d3c5ff5d900822c6d1d8d71b0bab24f57d9682381a87c",
hexData: "cf0000000108452af27900a1723300404600ca8530029a6bc59ff3e85beb5fc838ac3147ba5c2f6421ddcffdd85167d8de70eff2b1fa016dad4918337dec0feb660edb98e078dea51fa914055984b7957bd732fa4c831e448967af34752fd95835e3caba1e022d6d164f3f53f1bd7f60d560a8684079e90626aa1a4d3fe728158f7e1055ff76d1566072113982b193fb932265381e4de7afb35caa4ec56f31595a33fa2eb0bc84feb9f273224050938825fd21aa7317042ad00785ffd36151aee566a5dfe17d72591af1235059171568e5af0d13fc56e7897c3d632be753d8dea184c3d96d92bc56978cc669d94dd4c5e8dc3dcba7f0a39368fb1e87981e54bba7b86fbd8e8023a94d84f0290f402a5244cb4b0eeaaa57610ea59711a43932c521f10edb4560375693cbea60240389b8cebfd94035cabe4fc96ce8a726b979775e06c3bb0e3c4c866fe82e89fb725499e711e39310b93c785b313459f22d4ba37f90b19447165c2584269d98bf47d1f7ca89585797e4d6f1a4a1db7d2b0ae91a93fb15c3bb0ab953c3656b3b2ca20833d15e95329baff6d2ade1b0921b5ed3ae96648bf123b5265e27b049e9a8674455ff5f763f039568026e4fbe9882fef761c573d8f12e342c274a8dd3ad9854a688ce57cdddb52c758161ae3a59f67fc0d5b85f12e27617e7f4366e97a61fcda084e620dde35686f01dac49ce4bd76b986e3223c215919a1b228beeb74b7fcf32827d55be8f1b3b5fed24df2db023faecbb313b18a151cc4af8199d4bb08f8127b8207a0286d52758eaca87fd476ece0e3b17bcd8afb0289e8fd33c4455d4db6f058826c301ea303bfe2c0a6651a8fb6a2e1897852d758076adb04ad907077c5d5f94089da78d8923a34f1022ed672f378fe0dd81a709b372c0a2042a42e683c051c653e42b43c4a0ea8e961074d2901d4157ac9878b13a207b05ec471cff10d922b74d05623513cd6a4ea192ad21d4089de269633d4d2d1388d98d7c8a9e29848d5558b8aa2b73b437446a640230e6adb7f4b317ee5d66681c4aae11f69b1e5f96cb32ca6331405426cb706167d86f6f8fd588a72d7b2a6906798b81f174d808e1e3fc461e598e797c41bced26b87d09282d7b6d95076c285462e0c420a6f0e171ffe2791b5d221c03520409fe36622ff77796d9b7ef82babb25313acda9c621b22bf45ed909f9365b508860645af4c3aca78e6abca2d3a65c9159fbcd577438505d3f65a57c9412c12c069ad4d6db450beb08603abef621a9e029593fb5881dbd524ea2953b4acaaf59269b584c754e88c033247bb7c032e548d34fd9b2678e62fdf953dabf2be21c3e2d7b18ec7e3aedaf2cd082e19a369c1bcd4ca67e3d464e2200ecc3df98b0aa7f349415d68bcab0441ac3366607eff024bb786aec031a4619f8a24f554fe93c8520a03affcf11e40b6d5002f98c1708cac6c56e77eccba85ea6600d1391cfd202cc7914bfbaa3303266d1a820bf2dc84d2dfcdc4cdb79e6de3fbe3c02b288dcf955652f674f3f59b50849ea7dbf755bdafa27fba3db1267fb1354d8bf25a60cacb900b4d7ba913f9ba5f6b00559ad58b2f34a658ff7ef7f7d1ceeffd9c8325f271e6b5ba44d89685b744306963aa5e05ac0e8b00ada772dd5ae5ffb7043109afea86593743564c7acb4c8e7ef0e57d081eb1b9c0916078b113ece8a6036264a9b9781183c035342d50c7b069f3a01a40230e37ed8efde073c07d0e68066541d78c2f3cbe1e603cfcaaac40000000108452af27900a1723300404600ca8530029a6bc59ff3e85beb5fc838ac3147ba5c2f6421ddcffdd85167d8de70eff2b1fa016dad4918337dec0feb660edb98e078dea51fa914055984b7957bd732fa4c831e4489522d29bb5c84749f83c8e1edfd9da8d1738164a8a9c59e37a5c9994d90bb982dcfa69b20f868960dc139618f1adc2546d34340ae13d826260c54a456bbf7469ee37b1be1d7177004468d7e92cac62a0b165d6a114ad479861dd58959e094b5a6250359301d4a614d529660760e3d1cdec9bf444a3761309bab40e4a977bc749e0dae431952f5f7e6b1ebc1383d343359a387da4301f7fa4b400475e9b82367e56278376dd1c80349f083988945a13649008109cc12a3acf569ffcc5481fbcd86b544e7dc8434e9dd42bd8e5716a844d37879568db046857389d36cc7550c75f94e314db6749aa987f0fc730fae0fcf465d01c2fb745269dfc10132ddb5404dda2f9455780f5818730834aa9db4740793359884b9927b0bd1a5ca96052b4f17397d8b78aa891401bb8bed6726ea2229d919798c50e24d5f40576ac204847be9244aadbe5c773684c37475036541d209c177d4e9c22a1253292ce4ffb886b925b6cf83cc251976a68887eda2777590f51804b790b51eee77e717b7ef0eea71634594df36e6ae9e7574d65c51ac3196f0b2a3b0f023c81f05f7807f958dda03418ed49e14b645e814b9aa55b37c809be3172ca21fe4c7a78e17e9ece8def2dd2949310ecaa41b1b477f4e85db5288aa144e333f47ef291d0e822941181c13859d9fd6d640904ee764c9276125228c932dff3fb12f564f039b52f5ba1ab4d119641df8fe13f784802b99347f0046da63f471e34b1d12d3111cffe7b5d90cf5999879f6f23e7785f09cb10df32821bb68dd8fdcfcdbedd63f2428b2292b9f0e76ff36403c9644fd43e01112ee6218d0ec1c86f6d147e4b802293e906750c7046f53bf05a144e321d3b45e08e4064fd3828fdd1b5d1ceed74081f61319dc0ad9a6e8a3b9cc802e952d24e2271712e2c2cda7daca2f835e6c804feeef8d918404cc82a1aa9534bddff68a472b208a0d0a7fd68a08fbc411132af47a6b67a32617b7b9991524c21599e8e3cb9395cdab87a3f5bf5d1833a9c7ea021b29cf428c877c6b21d62f99340ac7f85ae721acc10968e7d79f111ca40c75e14060d07cffa046d71151a0b00eab657300344b04bd1a8871650c34ceda8610d7c1ba8d37673da6aaa580400e0230c69fba8ba21927de2f5897656144694550d1df3d268804adc707e7b236501734aeabb2e61cb08012bd96eca5a486d7a55f996992c36233815abd71c30e263ba0c5d9456fe0828df16f6af7929390bb143c426d9dfeaf4bb373554479ebe609b36b4bc3dd08ce216b9cdc5726edb458c5e4036d0edc688d3e39d20f8254b5d1f174518f15b344efc27fc56572c0159aa593d5b46bc33818f986e3df8caebd4c7b702ef50dd582714a2b94ecd1c4e90af37d388445c478a32ff6e8f5852ee115966b708eed04da322b98813a69423e95f90b89ce85518e39bcef36fdd5bd312b2c6c5ee85962675274c18f39ee35155517f70fd74b31bb2de6b5108d369252e6fb289e453833132ef7960da1cc0934790c039b9a1b0c74f23eb3b61fe9b4d0ea67de757b93af451eef303b1373199af446a0fa98d5991bbd4771ee63317e6da86efbe213dfff595c41b98e0e89f4f2df110104e760feebf4cb3361171c9fceb1e1c809a268c60000000108452af27900a1723300404600ca8530029a6bc59ff3e85beb5fc838ac3147ba5c2f6421ddcffdd85167d8de70eff2b1fa016dad4918337dec0feb660edb98e078dea51fa914055984b7957bd732fa4c831e44892ff5e6b16d8a259a9128c2c0c3c525462781a344c3df7f19a747e0e79ca8714995c867fc697a3cb87b35e769465a8e966bcb35b7e897ad036aa23a6c021e2445a0eb79962151cd20dbb43ae1231847de01caf4e5589dfebf026e95f7d1d742e140d9dda849396a70cc0798f1eef06fd5f4cfbc9a190ddf04cc332c5b7b15e53af311190ced92a1291c12b8799f2b50e076539a8370ee667e1791a78f38e565a48acbaa1c78ba941dba8b0d040f8fb8bbcc9f6bf5705efa613a24b12d6ac9cebb4f3fac1b09a07b49d8a3a62808eb0a324629f13a012e6ad0feb11ad97c1572983c713b62f27584809ba43e64e4af9845af807c0783104838f4e2ac33fa848866f3cc64a7b6203a5c09e8ad231f0f06ae2fb7b39a64cedd823b0ff297ad9be1ccac436777ccb3e22ef6b9c12e6d5e34926f50e8ca8c8c0532c810b074d001c11791a01bf25786b57a5da54065dcee4962822e929f47ee44d3b8c83d45a8b7a936dc2a6fa396e4194fa032d1627eca59f69857fc40dab5835d3613dade1c74b09c345bd32c509e9545d2330b157a7acb76409f3ac8eaa22802414f38c5422fe4c5189caaf5c1b93ce7c0892f0cfc477490d335aa78961d632a973cf106bd974c2714176fb0f98cf12f2887a0d7bd491756dd374331eb3e6adb9f2bd0d6b273403fd14b314eb27ebbb6f6e78ce310437004b757c048149cf04429ae4a6d6e65c9b3e0b9c9c4d4ef52007eaaad9670320f10cd5317b3d3edc374d45c98b217dd28fb3c2c2fb6e74a3aced143e3242084b192ba6df24e69fdb883e850714fe27a45f43883486a986574fd1fc10f259fe90786441554514c8dade1f3b86fdaf5f54ab655e2d803c98aa56073b00c32148a1ed367dff3a2bd934ecba55141389990b661bbd9ce1ef1def13747d45500daf92cec9e60908274703e761cd46affd46622f2a2192a79425ebf51c875fc7ba3598e15e0ba2465fc3e87c8a5da1915d3b8abe4b16d21259f311183eee1e7d2b808a91a7c89b284df0eb6a2a79c610bbe47722b3e04d5a6c0e574816a94d97349b6976010eb8c7debf42210982f78de482b7cd068051bf57908dbf46b5ceaf64f5fb33ede4412c1ce81eb1dfb4e99e10dd9b57ebc6e62ecbf4ee2db04d9e48c62bd45f8fa51704d414296a2d51d25ced6a192034a44c67e09d8985b573f98e03fa36dd8dcce2c04b4d5b1f276b6a642aadbdcafcf09de1d234bf8bbbf64aeadf01519ddafb419b3e62d204e04c3d7ebaf54b09e387ac3e9c4781c11625a2f44fddb7a1886f21929bd01c283f64903b6ccbb463984dfadc00f6af2a421517da023fd319f528195cac5fe907624b70c0172479d07d78e266dbf20ab8fc302228f279ffdba7395a839c4a9d7a4e001a260e1702393968f1e9722f023b204cf09cfee9a7bba045e4a2a449ee9fbb5c36e93028cfc87a2e34914b1b4f01beeded175ac0fa73fea9292f2bc3b1247164d8e05cddc3981bdc24e5f596571c418f6fa00fd9d4d0898cbf0d2f5413bed5f100f1854903017b6bc88bd7e303b5e0e2417bbcc984731128eda550d31f9af0e6e743eb6916466bbd435617d56fa60b05cd7dca66a9f6f4be23d3c5ff5d900822c6d1d8d71b0bab24f57d9682381a87c",
Copy link
Contributor Author

@Vigilans Vigilans Feb 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have reviewed the code and think it is in general nice, although I did find it failed to process one corner case when I was trying to add the test(pacp attached).

In this testcase (QUIC Chromebook Handshake[2]; packet 1-3), the third packet's hex included the UDP header (28 bytes) by mistake, making the payload length to be 1278 instead of 1250 (all three initial packets' payload are 1250 bytes long). After fixing it the test passed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for finding the cause... I did try to copy and paste multiple times but maybe I made the same mistake each time...

@Vigilans
Copy link
Contributor Author

Vigilans commented Feb 14, 2025

Anther thing I think we don't need to worry too much but should keep an eye for is that because we are waiting for 100 ms for receiving the packet for sniffing, the worst case time out is now increased to 500ms. It is still too early to say it would break anything, but I am ready to fix this will a separate timeout timer as needed.

I have following observations for your pcap:

  • Domain sniffed from first 3 initial packets sent by client
  • 3 initial packets sent within 1ms (6 initial packets in 3ms in another handshake)

The 100ms is controlled by mb, _ := r.reader.ReadMultiBufferTimeout(time.Millisecond * 100). ReadMultiBufferTimeout return at least one packet in each read. In worst case, it may return only one packet in each call, making the 3rd initial packet not sniffed.

So the decision is now based on whether ReadMultiBufferTimeout is guaranteed to return all subsequent initial packets in second call. Since initial packets are sent within 1ms, it seems very likely to happen.

Btw, I removed the increased attempt in my updated commit.

@dyhkwong
Copy link
Contributor

dyhkwong commented Feb 14, 2025

The 100ms is controlled by mb, _ := r.reader.ReadMultiBufferTimeout(time.Millisecond * 100). ReadMultiBufferTimeout return at least one packet in each read. In worst case, it may return only one packet in each call, making the 3rd initial packet not sniffed.

So the decision is now based on whether ReadMultiBufferTimeout is guaranteed to return all subsequent initial packets in second call. Since initial packets are sent within 1ms, it seems very likely to happen.

Maybe let the QUIC sniffer (and other sniffers) returns an error like errRemainingData for incomplete packets, and disable the retry limit only if errRemainingData is encountered, and do not ignore the error of cReader.Cache. For UDP, ReadMultiBufferTimeout should be okay to return all subsequent packets, but for TCP this is not the case. (I'm trying to fix sniffing fragmented TLS records for the TLS sniffer, and found ReadMultiBufferTimeout only returns exactly one packet for TCP. See dyhkwong@3b7adc1)

@Vigilans
Copy link
Contributor Author

Vigilans commented Feb 15, 2025

(I'm trying to fix sniffing fragmented TLS records for the TLS sniffer, and found ReadMultiBufferTimeout only returns exactly one packet for TCP. See dyhkwong@3b7adc1

I am organizing your commit into this PR. A very interesting observation occured: your update to ReadClientHello makes much of new tests in this PR failed:

--- FAIL: TestSniffQUICComplex (0.00s)
    --- FAIL: TestSniffQUICComplex/QUIC_Chromebook_Handshake[1];_packet_1 (0.00s)
        .../v2ray-core/common/protocol/quic/sniff_test.go:228: SniffQUIC() error = <nil>, wantErr true
    --- FAIL: TestSniffQUICComplex/QUIC_Chromebook_Handshake[2];_packet_1 (0.00s)
        .../v2ray-core/common/protocol/quic/sniff_test.go:228: SniffQUIC() error = <nil>, wantErr true
    --- FAIL: TestSniffQUICComplex/QUIC_Chromebook_Handshake[2];_packet_1-2 (0.00s)
        .../v2ray-core/common/protocol/quic/sniff_test.go:228: SniffQUIC() error = <nil>, wantErr true
    --- FAIL: TestSniffQUICComplex/QUIC_Chromebook_Handshake[3];_packet_1 (0.00s)
        .../v2ray-core/common/protocol/quic/sniff_test.go:228: SniffQUIC() error = <nil>, wantErr true
    --- FAIL: TestSniffQUICComplex/QUIC_Chromebook_Handshake[3];_packet_1-2 (0.00s)
        .../v2ray-core/common/protocol/quic/sniff_test.go:228: SniffQUIC() error = <nil>, wantErr true
    --- FAIL: TestSniffQUICComplex/QUIC_Chromebook_Handshake[4];_packet_1 (0.00s)
        .../v2ray-core/common/protocol/quic/sniff_test.go:228: SniffQUIC() error = <nil>, wantErr true
    --- FAIL: TestSniffQUICComplex/QUIC_Chromebook_Handshake[5];_packet_1 (0.00s)
        .../v2ray-core/common/protocol/quic/sniff_test.go:228: SniffQUIC() error = <nil>, wantErr true
    --- FAIL: TestSniffQUICComplex/QUIC_Chromebook_Handshake[7];_packet_1 (0.00s)
        .../v2ray-core/common/protocol/quic/sniff_test.go:228: SniffQUIC() error = <nil>, wantErr true
    --- FAIL: TestSniffQUICComplex/QUIC_Chromebook_Handshake[7];_packet_1-2 (0.00s)
        .../v2ray-core/common/protocol/quic/sniff_test.go:228: SniffQUIC() error = <nil>, wantErr true

This is because your update makes ReadClientHello read partial of QUIC's crypto data and discovered domain in it! Not all initial packets need to be ready before domain can be sniffed out.

I think your update to TLS can be maintained in a separate PR, and let this PR focus on update of QUIC and your new errRemainingData?

@dyhkwong
Copy link
Contributor

dyhkwong commented Feb 15, 2025

I think your update to TLS can be maintained in a separate PR, and let this PR focus on update of QUIC and your new errRemainingData?

Never mind, it is only an unfinished fix and we should not be off-topic here. Just my thoughts about the discussion of maximum retry times.

@Vigilans Vigilans force-pushed the vigilans/quic-sniffer-multiple-initial-packet branch from 04d482f to 7f50592 Compare February 15, 2025 18:11
@Vigilans
Copy link
Contributor Author

I've updated with your commit. It looks generally solid, with some modifications and fixes:

  • errRemainingData renamed to ErrProtoNeedsMoreData. It means "Protocol matches, but need more data to complete sniffing". In contrast, ErrNoClue means "Protocol not matches, and sniffer cannot determine whether there will be a match or not".
  • ErrProtoNeedsMoreData also timeout with 2 failed attempts, but only error of cReader.Cache is counted as one failed attempt. This is just a more permissive version of your check. In contrast, each ErrNoClude counts as failed attempt.
  • Fixed Sniffer.Sniff logic in app/dispatcher/sniffer.go. Your commit did not handle pendingSniffer, which controls sniffers used in next attempt. Before this commit it only includes sniffers returning ErrNoClue. I've fixed it to exclusively use the sniffer if returning ErrProtoNeedMoreData, because this sniffer's protocol matched.

@Vigilans
Copy link
Contributor Author

Btw, I noticed your comment in XTLS/Xray-core#3363 (comment), I've checked code, and it seems sniffer.Sniff is blocking routedDispatch call, so routing is happened only after all packets sniffing attempts are done. In this case the worse case 500ms timeout mentioned by @xiaokangwang seems more important now.

In the meantime, it proves your update to TLS ReadClientHello worths another PR, because it effectively reduces required initial packets to sniff out domain (in optimal case only first packet is enough).

@xiaokangwang
Copy link
Contributor

@Vigilans I think this pull request is ready to be merged. Do you think we should proceed with merge or there is any known issue you would like to fix before it get merged?

@Vigilans
Copy link
Contributor Author

@xiaokangwang I've been running v2ray binary with new commits in this PR and did not witness issue after running for a day. I think we can proceed with the merge.

@xiaokangwang
Copy link
Contributor

Thanks for your contribution. I will merge this now.

@xiaokangwang xiaokangwang merged commit 8ceba34 into v2fly:master Feb 16, 2025
39 of 41 checks passed
@dyhkwong
Copy link
Contributor

dyhkwong commented Feb 18, 2025

7f50592 caused high CPU comsumption and high battery usage. Reverting it can solve this. pprof

@Vigilans
Copy link
Contributor Author

Vigilans commented Feb 18, 2025

7f50592 caused high CPU comsumption and high battery usage. Reverting it can solve this. pprof

Confirmed my instance also got high CPU consumption. Looking into it.

An assumption is that payload indefinitely grows and memory allocation & copy consumes cpu continuously.

@xiaokangwang
Copy link
Contributor

See also: #3316
There might be a buffer corruption issue.

@xiaokangwang
Copy link
Contributor

See also: #3316 There might be a buffer corruption issue.
@Vigilans
I saw your comment on the referenced issue. Maybe a full buffer analysis was necessary to find out the root cause of issue. Sadly my hope for quick victory didn’t last long.

@Vigilans
Copy link
Contributor Author

See also: #3316 There might be a buffer corruption issue.

Fix PR for it: #3320

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants