Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ble-scan switch -t not working #30

Closed
mzdaniel opened this issue May 26, 2021 · 7 comments
Closed

ble-scan switch -t not working #30

mzdaniel opened this issue May 26, 2021 · 7 comments
Labels
bug Something isn't working

Comments

@mzdaniel
Copy link

Thank you @Jakeler for this great project. For years I was trying to do something very similar like this following a endless rabbit hole I am hopping to streamline.

Describe the bug
If my understanding is correct the semantics of -t should be a timeout of how much time is allowed for a possible advertisement to be listened and logged. That is not the case on a couple of systems I tried.

Log messages
ble-scan -d 01:02:03:04:05:06 -t 30 --help
usage: ble-scan [-h] [-t SEC] [-d ADDR]

time ble-scan -d 01:02:03:04:05:06 -t 30
Started deep scan of 01:02:03:04:05:06
Traceback (most recent call last):
...
bleak.exc.BleakError: Device with address 01:02:03:04:05:06 was not found.
real 0m10.621s

ble-serial -v -d 01:02:03:04:05:06
08:13:28.674 | DEBUG | main.py: Running: Namespace(verbose=True, device='01:02:03:04:05:06', addr_type='public', adapter='hci0', mtu=20, write_uuid=None, filename=None, binlog=False, port='/tmp/ttyBLE', read_uuid=None)
08:13:28.675 | DEBUG | selector_events.py: Using selector: EpollSelector
08:13:28.675 | INFO | linux_pty.py: Slave created on /tmp/ttyBLE -> /dev/pts/3
08:13:28.676 | INFO | ble_interface.py: Receiver set up
08:13:28.679 | INFO | ble_interface.py: Trying to connect with 01:02:03:04:05:06
08:13:39.020 | ERROR | main.py: Bluetooth connection failed: Device with address 01:02:03:04:05:06 was not found.
08:13:39.020 | WARNING | main.py: Shutdown initiated
08:13:39.020 | INFO | linux_pty.py: Serial reader and symlink removed
08:13:39.020 | INFO | main.py: Shutdown complete.

Setup (please complete the following information):

  • OS: Linux delphy 5.6.5 Debian
  • Bluetooth Hardware: Intel AC 9560
  • BlueZ Version:bluetoothctl: 5.55
  • Python Version: Python 3.9.2
  • ble-serial and dependency versions: ble-serial 2.1.0, dbus-next 0.2.2, bleak 0.11.0
@Jakeler Jakeler added the bug Something isn't working label May 26, 2021
@Jakeler
Copy link
Owner

Jakeler commented May 26, 2021

Hi @mzdaniel,
that is indeed a bug, thanks for the report.
Currently the custom timeout is only applied to the normal scan function (ble-scan without -d), not the deep scan that actually connects to the device. Then it just uses the default timeout from bleak, which is 10 seconds long. Same for ble-serial.

I will fix this soon and make a new release.

Jakeler added a commit that referenced this issue May 26, 2021
Jakeler added a commit that referenced this issue May 26, 2021
use -t, rename address-type to -a
ref #30
@mzdaniel
Copy link
Author

Fantastic. Thank you for looking into this issue.

@Jakeler
Copy link
Owner

Jakeler commented May 27, 2021

Version 2.2.0 with the fix is released now. You can update to it with pip install -U ble-serial.

@Jakeler Jakeler closed this as completed May 27, 2021
@mzdaniel
Copy link
Author

Thank you again @Jakeler for looking into this issue

Unfortunately my systems are still experiencing instability. The timeout is not truly respected by bleak, as it just relays a kryptic dbus message: bleak.exc.BleakDBusError: org.bluez.Error.Failed expecting the higher levels to address lower level issues or even worse, the user doing manual retries before an explicit timeout.
In this sense, bluepy seem to work better though it uses C code plus all the other important considerations in #5

Temporarely removing the try block from the latest ble-serial release gives this traceback

time ble-scan -d 01:02:03:04:05:06 -t 30
Started deep scan of 01:02:03:04:05:06

Traceback (most recent call last):
File "/home/admin/.local/bin/ble-scan", line 8, in
sys.exit(main())
File "/home/admin/.local/lib/python3.9/site-packages/ble_serial/scan/init.py", line 55, in main
asyncio.run(scan(args))
File "/usr/lib/python3.9/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "/usr/lib/python3.9/asyncio/base_events.py", line 642, in run_until_complete
return future.result()
File "/home/admin/.local/lib/python3.9/site-packages/ble_serial/scan/init.py", line 9, in scan
await deep_scan(args.addr, args.sec)
File "/home/admin/.local/lib/python3.9/site-packages/ble_serial/scan/init.py", line 28, in deep_scan
async with BleakClient(dev, timeout=time) as client:
File "/home/admin/.local/lib/python3.9/site-packages/bleak/backends/client.py", line 61, in aenter
await self.connect()
File "/home/admin/.local/lib/python3.9/site-packages/bleak/backends/bluezdbus/client.py", line 296, in connect
assert_reply(reply)
File "/home/admin/.local/lib/python3.9/site-packages/bleak/backends/bluezdbus/utils.py", line 23, in assert_reply
raise BleakDBusError(reply.error_name)
bleak.exc.BleakDBusError: org.bluez.Error.Failed

real 0m5.314s

@Jakeler
Copy link
Owner

Jakeler commented May 27, 2021

Hmm, I can't reproduce this, it works as expected here:

$ ble-scan -d 01:02:03:04:05:06 -t 30
Started deep scan of 01:02:03:04:05:06

ERROR: Device with address 01:02:03:04:05:06 was not found.
                                                                 [ 0 ] [ 30s091 | 21:15:54 ]

$ ble-scan -d 01:02:03:04:05:06 -t 1
Started deep scan of 01:02:03:04:05:06

ERROR: Device with address 01:02:03:04:05:06 was not found.
                                                                 [ 0 ] [ 1s088 | 21:20:27 ]

I have tested it with two adapters (Intel AC 7260, Realtek RTL8761B) on latest Arch Linux, BlueZ 5.58.
Maybe you can experiment with other hardware/software and find out what causes this?
Does it fail with all timeout values (including the default 5s)?

@mzdaniel
Copy link
Author

mzdaniel commented May 28, 2021 via email

Jakeler added a commit that referenced this issue May 28, 2021
based on deep scan, especially for `org.bluez.Error.Failed` #30
@Jakeler
Copy link
Owner

Jakeler commented May 28, 2021

Okay I see, sorry that I misunderstood you. So it has not much to do with the timeout (at least not the scan or advertisement timeout), it is a general connection problem with a valid device address.

I made a another test script now: https://github.com/Jakeler/ble-serial/blob/master/tests/test_conn_reliability.py
Let it run for 100 attempts, but zero errors again... About 4s average runtime.
The test device was on my desk, only a a few cm away from the Bluetooth adapter on the PC, which results in a strong signal (RSSI about -35).
Then I moved the device to another room and tried again with the weak signal (RSSI about -85). Now the avg. runtime increased to 8s and sure enough 5% result in org.bluez.Error.Failed! Close at the range limit I get the error always in 5-10 tries.

Looking at what happens in Wireshark shows that after this request:

Bluetooth HCI Event - Command Status
    Event Code: Command Status (0x0f)
    Parameter Total Length: 4
    Status: Pending (0x00)
    Number of Allowed Command Packets: 2
    Command Opcode: LE Read Remote Features (0x2016)
    [Command in frame: 603]
    [Command-Pending Delta: 1,773ms]

It fails exactly 225 ms later:

Bluetooth HCI Event - Disconnect Complete
    Event Code: Disconnect Complete (0x05)
    Parameter Total Length: 4
    Status: Success (0x00)
    Connection Handle: 0x0010
    Reason: Connection Failed to be Established (0x3e)

So the host receives an advertisment, opens a connection, tries to read remote features and never gets a response. The 225 ms (7.5ms*30) could be the connection interval, but I am not sure about that.
That means the root cause is very low level, probably even the physical layer, which causes packet loss. I agree that the BlueZ errors are pretty cryptic and unhelpful, but it might not be a bug.

Side note, one time I got this unusual error message:

ERROR: Device with address 20:91:48:DF:76:D9 could not be found. Try increasing `timeout` value or moving the device closer.

A valid attempt would instead follow up with some data:

Bluetooth HCI Event - LE Meta
    Event Code: LE Meta (0x3e)
    Parameter Total Length: 12
    Sub Event: LE Read Remote Features Complete (0x04)
    Status: Success (0x00)
    Connection Handle: 0x0010
    Supported LE Features: 0x0000000000000001, LE Encryption
    [Command in frame: 22]
    [Pending in frame: 23]
    [Pending-Response Delta: 300,165ms]
    [Command-Response Delta: 301,935ms]

(+ many other packets)
After all it normally disconnects with Reason: Connection Terminated by Local Host (0x16) and then Reason: Remote User Terminated Connection (0x13).


Can you confirm the correlation between RSSI and connection failures?
If so, how is the success rate? If I understand your last comment correctly it works most of the time? Because the bleak issue mentions <10% working. Wireshark captures could be also interesting.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants