Replies: 9 comments 3 replies
-
p.s. the network itself is capable of 2-3 Gb/s
-
Have you tried the priority boost option that increases the process priority? That can help it get more consistent real-time scheduling from the OS, which can help.
-
chkseq boost flush active
-
increasing the rx/tx/in/out/stream buffer sizes only delays the inevitable udp input packet drop. i am listening for udp packets on the local interface and transmitting out multicast to 225.x.x.x as advised by my network administrator. we will never need the streaming feature (our data will always be packets), but normDataSend seems to do a "new" object for every packet if you go that route. not ideal for high speed stuff.
-
that was with my rate set to 8100000.0, which is just faster than 8 Mb/s. i increased my rate to 810,000,000.0 and am trying it again
-
better results - no dropped packets in a few minutes here. maybe i'll try my 800 Mb/s test with the rate set at 1 Gb/s instead of 810000000.0
-
thanks for your response, brian.
we have a udp transmitter ("Sender" in your diagram) that sends test data containing a sequence counter that happens to sit exactly where CheckSequenceNumber64 looks for it.
and our "Receiver" also does its own sequence check.
for weeks i ran the Sender at 800Mb/s with the 'rate' normStreamer command line parameter set to 810Mb/s (810000000.0) with the same results.
i'm losing the packets on the normStreamer insockbuffer, as the diagnostic is showing flurries like this, after which there will be periods MINUTES long with no errors -
normStreamer: SendData dropped 77 packets seq:1915035441 seq_prev:1915035364 totalDropped:18446744058720315928
normStreamer: SendData dropped 3 packets seq:1915035444 seq_prev:1915035441 totalDropped:18446744058720315931
normStreamer: SendData dropped 62 packets seq:1915035604 seq_prev:1915035542 totalDropped:18446744058720315993
normStreamer: SendData dropped 2 packets seq:1915035606 seq_prev:1915035604 totalDropped:18446744058720315995
normStreamer: SendData dropped 1461 packets seq:1915059565 seq_prev:1915058104 totalDropped:18446744058720317456
normStreamer: SendData dropped 3 packets seq:1915059568 seq_prev:1915059565 totalDropped:18446744058720317459
normStreamer: SendData dropped 64 packets seq:1915059730 seq_prev:1915059666 totalDropped:18446744058720317523
normStreamer: SendData dropped 3 packets seq:1915059733 seq_prev:1915059730 totalDropped:18446744058720317526
normStreamer: SendData dropped 266 packets seq:1915060397 seq_prev:1915060131 totalDropped:18446744058720317792
normStreamer: inputNeeded:1 inputReady:1 txPending:0 txReady:1 inputCount:16943558210466 txCount:16943558210466 fdMask:0
normStreamer: inputNeeded:1 inputReady:1 txPending:0 txReady:1 inputCount:16951339978938 txCount:16951339978938 fdMask:0
normStreamer: inputNeeded:1 inputReady:1 txPending:0 txReady:1 inputCount:16959047827866 txCount:16959047827866 fdMask:0
normStreamer: inputNeeded:1 inputReady:1 txPending:0 txReady:1 inputCount:16966755454350 txCount:16966755454350 fdMask:0
normStreamer: inputNeeded:1 inputReady:1 txPending:0 txReady:1 inputCount:16974463019712 txCount:16974463019712 fdMask:1
normStreamer: inputNeeded:1 inputReady:1 txPending:0 txReady:1 inputCount:16982170457820 txCount:16982170457820 fdMask:1
normStreamer: inputNeeded:1 inputReady:1 txPending:0 txReady:1 inputCount:16989878445024 txCount:16989878445024 fdMask:1
:
:
that was a test i left running all weekend. the Sender is sending to NORM Sender on the lo interface.
i am the only one on the boxes i'm using (one for "Sender" & "NORM Sender" and the other for "NORM Receiver" and "Receiver"). only my Sender and normStreamer are running on the first box, and only the normStreamer and Receiver are running on the 2nd box.
these boxes were specifically set up to do norm testing/development and i don't know of any cron job that runs every few minutes and chews up excessive CPU or floods the network these are on.
i am aware of the unguaranteed paradigm of UDP delivery, but the sender and NORM Sender are on the same box, and there is such a flurry of data drops that it doesn't look random.
the periods of good transmissions end-to-end last far longer than 'a full buffer's worth of packets' at this data rate.
i repeated my test with the rate set to 2000000000.0 and again at 1600000000.0 (2 Gb/s and 1.6 Gb/s) with the Sender txing at 600 Mb/s and still got the occasional flurry of dropped packets on the input. i'm focusing on the input side for now. it is rare that the NORM Receiver reports dropped packets that weren't dropped on the input, so nearly all (if not all) the drops occur getting the data from the lo interface.
i have tried reducing the data rates, the */buffer sizes, and r/wmem_max down to 8 Mb/s with 100,000-byte buffers to reproduce the situation at a slower speed and more consistently, to hopefully make debugging easier. a little more than 7000 packets in is when i get my flurry. i put monotonic timer debugs in for every packet and found that all the "HandleNormEvents" happen when WriteToStream returns 0 - like no norm events happen until WriteToStream returns 0. the time to handle all the norm events grows from 0.001497 seconds at the beginning (no drops) to 0.065230 seconds when it drops the inputs. that's at 8 Mb/s. i can't put a timestamp diagnostic on every packet of an 800 Mb/s stream, but i've seen normStreamer cruise for MINUTES without dropping packets at 800 Mb/s, so clearly the
while (NormGetNextEvent(normInstance, &event, false))
normStreamer.HandleNormEvent(event);
loop is not taking 0.065230 seconds at 800Mb/s. puzzling.
i'm trying to get the fastest result possible without having to dive into the guts of the protocol implementation. so i'm only playing with normStreamer.
BTW - one side note on this setup: having UDP as an unreliable protocol in the middle of the mix is a concern if your end application needs full reliability. Since there is no flow control, it's tricky to achieve full end-to-end reliability at high transfer rates. If there were some way to use the normStreamer approach more directly, you could take advantage of the option for ack-based flow control, etc., achieve reliability with a higher degree of assurance, and even provide different reliability/flow control as needed for different flow types (e.g., full reliability for content that needs it and quasi-reliability for things like video streaming flows that prefer unimpeded forward progress over perfectly lossless delivery).
i understand that, but for safety reasons we need to keep the normStreamer separate from our process (if one goes down it doesn't bring down the other).
i'm sitting here watching a flurry of chunks of dropped packets happen as above after minutes and minutes of no problems (at 600Mb/s, 128M buffers and r/wmem_max).
these are 8-core boxes. the normStreamer i'm currently running has no added pthread_setaffinity calls to lock onto any particular core. i have tried adding threading to the input (where one thread does the udp receive into a packet array and the other thread empties the packet array delivering each packet to WriteToStream), with and without core affinity, and had no joy there either, oddly enough with a significant increase in CPU. i've been doing multithreaded fill-the-buffer-empty-the-buffer mutexed stuff since 1998 and don't know why threading doesn't help. normstreamer itself seems to be a single thread (with an additional thread running the norm processing).
anyway.... i'll keep hammering away. apparently there are some higher-ups who think that
<hyperbole>
if you just say the magic word 'norm' you can get guaranteed one-to-twelve multicast throughput at any speed.
</hyperbole>
i'm working to see what the limits are and be able to demonstrate them when i hit them. just wanna make sure i'm not doing something stupid.
thanks for your time. apparently it's working in lots of places.
kevan
From: Brian Adamson ***@***.***>
Sent: Saturday, April 22, 2023 11:30 AM
To: USNavalResearchLaboratory/norm ***@***.***>
Cc: Moore, Kevan L. (MSFC-HP27)[MOSSI2] ***@***.***>; Author ***@***.***>
Subject: [EXTERNAL] [BULK] Re: [USNavalResearchLaboratory/norm] high speed send-only normStreamer (Discussion #78)
Hi Kevan - I put together a block diagram to illustrate the message/packet flow for normStreamer when its UDP input/output option is used to gateway/proxy streams of UDP packets end-to-end:
[image]<https://user-images.githubusercontent.com/6934297/233795741-c67c170c-6788-452b-a949-b89ee53bfc5d.png>
Where is your packet loss occurring? If it is at the UDP "Receiver" (at the right of this diagram), there is a feature to rate shape the NORM Receiver UDP stream to smooth the bursty traffic that can occur when NORM reliability recovers missing packets and outputs a burst of received messages as UDP packets. The command-line option for that is the "limit" command with units of bits/sec (e.g. "limit 1000000" for 1 Mbps limit on UDP output by the normStreamer receiver).
Another question that comes to mind after thinking about this more is to double check what your UDP messaging load rate is and what you have your NORM tx rate set. Your NORM transmit rate needs to be somewhat higher than your UDP messaging load rate to allow margin for NORM protocol overhead and also for any retransmission in response to packet loss between the NORM sender and receiver. What are you using there?
-
sorry - the third thread was the original norm main thread that did all the norm event processing.
thread 1 - rx udp packets into a pkt array
thread 2 - empty pkt array delivering each packet to WriteToStream
thread 3 - handle all the norm events
with appropriate norm instance thread suspension
kevan
-
i'm using normStreamer to try to get high speed data transfer. i've ripped out everything recv-related on my sender and everything send-related on my receiver. while higher speeds have been achieved by setting txbuffer, rxbuffer, inbuffer, outbuffer, and streambuffer all to 128000000 and adjusting rmem_max and wmem_max on both send and recv boxes, i am still dropping packets on the udp input side. it's as if it can't get back to the spigot fast enough after delivering the udp packets to norm. it will drop packets on startup, settle down and run for five minutes with no errors, then start dropping input packets (SendData reports sequence errors), then settle down and run with no errors for a few more minutes, lather, repeat. that is running at 800 Mb/s with 1000-byte pkts from one sender to one receiver. note that it runs with no errors FAR LONGER than the 128,000,000-byte buffer size.
i have managed to reproduce the drop-the-udp-input-packets error at a lower data rate with smaller buffers. at 8 Mb/s with 100k rx/tx/in/out/stream buffers it takes 8 seconds before it drops its input packets.
one would think that if normStreamer could go minutes at a time handling all its norm events without dropping the input packets at 800Mb/s the handle-the-norm-events wouldn't be an issue at 8Mb/s.
i'm out of ideas as to how to make sure the 'give-the-packet-to-norm-and-handle-norm-events' keeps up with the input packets at 8Mb/s. i've been working on this for weeks. i've only been able to test on a one-to-two setup and will ultimately need to run on a one-to-twelve (all separate boxes) setup.
do you have any advice on what to look at? i'm trying not to have to invade the norm implementation and only tweak normStreamer and its parameters.