STM32 Ethernet stops receiving under heavy load #79066

Closed
biglben opened this issue Sep 26, 2024 · 8 comments
Labels: area: Ethernet, bug, platform: STM32, priority: low

Comments

@biglben
Contributor

biglben commented Sep 26, 2024

Describe the bug
On the STM32H7, high incoming traffic combined with a busy application can cause the Ethernet peripheral to enter a state where it fails to receive data but continues transmitting. The Ethernet receive DMA channel becomes stuck in the suspended state (visible in the RPS0 field of the ETH_DMADSR register).
I identified a fix in the stm32h7xx_hal_driver repository that addresses this issue by correctly setting the tail pointer (commit ceda3ce). With this fix applied, I tested various burst patterns, and the Ethernet functionality remained stable.
I am raising this issue to highlight that the STM32 HAL needs to be updated and to ensure that other STM32 series (likely H7 and H5) receive the same fix. Sharing this information may save others considerable debugging time (it took me about 2 days).
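
To check whether a board is in the same state, the raw ETH_DMADSR value can be read through a debugger and decoded on the host. The following is a minimal sketch, not part of the HAL or of Zephyr: the helper names `decode_rps0` and `rx_dma_suspended` are hypothetical, and the field position (bits 11:8) and state encodings are assumptions based on the STM32H7 reference manual that should be verified for the exact part in use.

```python
# Assumed layout: RPS0 occupies bits 11:8 of ETH_DMADSR (verify in the reference manual).
RPS0_SHIFT = 8
RPS0_MASK = 0xF

# Assumed encodings of the receive DMA process state.
RPS0_SUSPENDED = 0x4
RPS0_STATES = {
    0x0: "stopped",
    0x1: "running: fetching Rx descriptor",
    0x3: "running: waiting for Rx packet",
    RPS0_SUSPENDED: "suspended: Rx descriptor unavailable",
    0x5: "running: closing Rx descriptor",
    0x6: "timestamp write",
    0x7: "running: transferring Rx data to memory",
}


def rps0_field(dmadsr):
    """Extract the RPS0 field from a raw ETH_DMADSR value."""
    return (dmadsr >> RPS0_SHIFT) & RPS0_MASK


def decode_rps0(dmadsr):
    """Return a human-readable name for the receive DMA state."""
    field = rps0_field(dmadsr)
    return RPS0_STATES.get(field, f"reserved/unknown (0x{field:X})")


def rx_dma_suspended(dmadsr):
    """True if the receive DMA channel reports the suspended state."""
    return rps0_field(dmadsr) == RPS0_SUSPENDED
```

If the register keeps decoding as suspended while the link is up and transmit still works, that matches the stuck state described above.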

To Reproduce
I reproduced the bug by applying the following patch to simulate the application performing other tasks or being blocked:

diff --git a/samples/net/sockets/echo_server/src/udp.c b/samples/net/sockets/echo_server/src/udp.c
index 6847ebd3eb6..222db5ea8d8 100644
--- a/samples/net/sockets/echo_server/src/udp.c
+++ b/samples/net/sockets/echo_server/src/udp.c
@@ -119,6 +119,7 @@ static int process_udp(struct data *data)
                received = recvfrom(data->udp.sock, data->udp.recv_buffer,
                                    sizeof(data->udp.recv_buffer), 0,
                                    &client_addr, &client_addr_len);
+               k_sleep(K_MSEC(110));
 
                if (received < 0) {
                        /* Socket error */

Build with: west build -p -b nucleo_h743zi/stm32h743xx zephyr/samples/net/sockets/echo_server
After target was ready to receive data, I ran this script:

import socket
import time
import random
import argparse

def send_udp_flood(target_ip, target_port, packet_size, bursts, delay):
    # Create a UDP socket
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    
    # Generate random payload
    payload = bytes(random.getrandbits(8) for _ in range(packet_size))
    
    try:
        while True:
            for _ in range(bursts):
                # Send packet
                sock.sendto(payload, (target_ip, target_port))
            
            # Delay between bursts
            time.sleep(delay)
    except KeyboardInterrupt:
        print("Flooding stopped.")
    finally:
        sock.close()

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="UDP flood script.")
    parser.add_argument("ip", type=str, help="Target IP address.")
    parser.add_argument("port", type=int, help="Target port number.")
    parser.add_argument("packet_size", type=int, help="Size of each UDP packet in bytes.")
    parser.add_argument("bursts", type=int, help="Number of packets to send in each burst.")
    parser.add_argument("delay", type=float, help="Delay time between bursts in seconds.")
    
    args = parser.parse_args()
    
    send_udp_flood(args.ip, args.port, args.packet_size, args.bursts, args.delay)

using python udp_flood.py 192.0.2.1 4242 143 1000 0.1
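
A rough back-of-the-envelope calculation shows why this combination exhausts the RX buffers. The sketch below is only an estimate under stated assumptions: both figures are upper bounds, the sender's own transmit time is ignored, and the patched loop is assumed to consume at most one datagram per 110 ms sleep. The function names are hypothetical.

```python
def offered_rate_pps(bursts, delay_s):
    """Upper bound on packets per second sent by the flood script.

    Ignores the time spent inside sendto(), so the real rate is lower.
    """
    return bursts / delay_s


def drain_rate_pps(sleep_ms):
    """Upper bound on datagrams per second the patched loop can consume,

    assuming one recvfrom() per k_sleep(K_MSEC(sleep_ms)).
    """
    return 1000.0 / sleep_ms


# Parameters taken from the command line and the patch above.
offered = offered_rate_pps(bursts=1000, delay_s=0.1)
drained = drain_rate_pps(sleep_ms=110)
print(f"offered <= {offered:.0f} pkt/s, drained <= {drained:.1f} pkt/s")
```

With these parameters the sender can offer on the order of 10,000 packets per second while the application drains roughly 9 per second, so RX buffer exhaustion is expected. The bug is that reception never recovers afterwards.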

Expected behavior
The STM32 Ethernet, under heavy incoming traffic, should simply lose some packets but continue operating without interruption.

Impact
None, as I have forked the STM32 HAL module and applied commit ceda3ce.

Logs and console output

[00:00:03.653,000] <inf> net_echo_server_sample: Network disconnected
[00:00:03.915,000] <inf> net_echo_server_sample: Network connected
[00:00:04.015,000] <inf> net_config: IPv6 address: 2001:db8::1
[00:00:04.015,000] <inf> net_config: IPv6 address: 2001:db8::1
[00:00:07.035,000] <err> eth_stm32_hal: Failed to obtain RX buffer

After this point, nothing is received anymore.
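
One way to confirm from the host that reception has stopped (rather than the server merely being slow) is to send a single datagram after the flood ends and wait for the echo. This is a minimal sketch, assuming the echo_server sample is listening for UDP on the port used above; the function name `probe_udp_echo` and the timeout value are illustrative choices, not part of the sample.

```python
import socket


def probe_udp_echo(target_ip, target_port, payload=b"ping", timeout=1.0):
    """Return True if the target echoes the payload back within the timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(payload, (target_ip, target_port))
        try:
            data, _ = sock.recvfrom(len(payload) + 64)
        except OSError:
            # Covers timeouts as well as ICMP-induced errors on some platforms.
            return False
        return data == payload


if __name__ == "__main__":
    # Address and port as used with the flood script above.
    alive = probe_udp_echo("192.0.2.1", 4242, timeout=2.0)
    print("echo received" if alive else "no echo: receive path may be stuck")
```

The timeout should be comfortably longer than the 110 ms sleep added by the patch, and the probe is best run a few seconds after stopping the flood so that queued packets have drained.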

Environment:

  • OS: Linux
  • Toolchain: Zephyr SDK 0.16.8
  • Commit SHA: 36940db

@biglben added the bug label Sep 26, 2024
@erwango added the priority: low, platform: STM32, and area: Ethernet labels Sep 26, 2024
@kevinior
Contributor

kevinior commented Oct 1, 2024

We're also seeing this, for example when there is incoming network traffic while mcumgr is erasing internal flash pages (which causes program execution to stall).

After the "Failed to obtain RX buffer" message we can't communicate on the network any more.

Thanks @biglben, you've saved me a lot of debugging time.

@kevinior
Contributor

kevinior commented Oct 1, 2024

I can confirm that the fix suggested by @biglben works for us too.

@jukkar changed the title from "STM32 Ethernet stops receiving und heavy load" to "STM32 Ethernet stops receiving under heavy load" Oct 1, 2024
@marwaiehm-st
Collaborator

marwaiehm-st commented Oct 20, 2024

I have tested the modification described in the commit and can confirm that it works as expected.
Thank you @biglben

@marwaiehm-st
Collaborator

Hi @biglben
Don't hesitate to open a PR containing the fix.

@biglben
Contributor Author

biglben commented Oct 29, 2024

Hi @marwaiehm-st
I can open a PR with the fix for STM32H7, but I am not sure which other series have the same issue (I assume H5 too, but cannot test).
I am also not sure whether this fix should instead be included in a HAL update. There are more fixes in the stm32h7xx-hal-driver repo that are not included in the Zephyr fork.

@erwango
Member

erwango commented Nov 6, 2024

@biglben Sure, you can open a PR with a commit cherry-picked from the STM32H7 HAL.
See zephyrproject-rtos/hal_stm32#226 as an example.

@marwaiehm-st
Collaborator

The fix is integrated in the latest STM32H7 HAL update: zephyrproject-rtos/hal_stm32@5a5a61d#diff-4c8a8466c6ed93130ca63c7b91b20aafc46249458faebeb3d6247c5aabf3765c

@erwango
Member

erwango commented Dec 11, 2024

@biglben I'm closing this since the fix has been merged. Please open a new issue if the problem is still there.

@erwango closed this as completed Dec 11, 2024
4 participants