[BUG]: TPI doesn't fully start up after first or second RPI reboot, possibly related to kasa or temp probes #977
Comments
Hmm, too bad that this is happening. I updated the code so that it continues even when a Kasa relay fails, so it should start again in your case. But there are some things to consider.
So, since I get the feeling you have a backup of the database, I suggest a full reinstall with Bullseye OS and the latest code, using the manual installation.
Thanks for the reply. Re point 3: I can try again with my current install, but I wanted to dig out some other 1-wire sensors I have lying around to see whether they cause the same issue or whether it could be a corrupted sensor. Since this is a live system that needs to manage my tank, I am hesitant to break it again. I might get a second Pi for redundancy and experiment on that one, to make sure I don't end up doing night shifts to keep the temps correct. I'll do another reinstall later today and will post logs to check whether the error persists. My current version is below: PRETTY_NAME="Raspbian GNU/Linux 11 (bullseye)"
The first concern is the animals, so only do testing if it is possible. It is not needed on my part. The OS looks good. About the 1-wire sensors, just do what you want, but they should just work in Bullseye OS. About the relay, it is a bit strange, as there are 6 relays on 1 IP? I cannot test with Kasa, so I cannot optimize it. Normally this would be optimized by loading all 6 relays at once, and then only 1 connection is needed. Right now it is requesting the same device 6 times. That takes more time and may have side effects like this one. For now, I have only updated the hardware loading code so that a Kasa error should not stop the system from loading.
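To illustrate the "load all relays at once" idea, here is a minimal sketch, not TerrariumPI's actual code. It only groups relay addresses of the form `ip,index` (as seen in the logs) by IP, so that a loader could open one connection per device instead of one per plug. The function name `group_by_host` is hypothetical.

```python
from collections import defaultdict


def group_by_host(addresses):
    """Group 'ip,index' relay addresses by IP (hypothetical helper).

    Returns a dict mapping each IP to the list of plug indexes on it.
    An address without an index is recorded as None.
    """
    groups = defaultdict(list)
    for address in addresses:
        host, _, index = address.partition(",")
        index = index.strip()
        groups[host.strip()].append(int(index) if index else None)
    return dict(groups)


# The six plugs of the power strip from the logs share one IP.
addresses = [f"192.168.1.251,{i}" for i in range(1, 7)]
grouped = group_by_host(addresses)
for host, plugs in grouped.items():
    # A real loader would query the device once here and reuse the result
    # for every plug; that part depends on the Kasa library and is omitted.
    print(f"{host}: one query for plugs {plugs}")
```

With the addresses from this issue, the six relays collapse into a single group, so only one device query would be needed instead of six.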
So the 6 relays under 1 IP are a Kasa-side thing: it is a single power strip with 6 plugs that can be individually addressed. Since it only creates one connection to the router, I am not sure there is a different way to address it. The Power Strip for your reference. I'll try again with the 1-wire sensors, but probably tomorrow. If it fails I want to have time to roll back 😄
Some additional points:
I just did a fresh install and am getting mixed results. I did a startup in debug mode, and everything including the web interface loaded in about a minute. I also noticed that when I follow the logs for longer after startup, another TPI startup eventually begins and the loop starts all over again. Logs from the run are below. They are lengthy but show when the web interface becomes available.
On another restart (after loading my backup db onto this SD card), TPI eventually ended up in a loop of the below, with the web interface not loading. What's unfortunate in this state is that it doesn't run any of the automations. E.g. the heat should be running on a 6 min on, 12 min off timer, and it doesn't seem to turn on or off at all. I did a few more runs, and even after 2h the web interface is not running properly.
I also did a run in debugging mode, unsuccessfully:
Sorry for the late response. It looks like the 6th relay of the Kasa device is somehow having issues. I cannot see why, as I do not have a Kasa device. And I think this is strange, as it is the same device as the other 5 relays... so I am also a bit confused... The problem is that scanning for new relays is taking too much time or is hanging. I am looking into the code to see if I can fix that. When you start up TP, it has 3 minutes to start up fully. If that does not happen, TP is restarted to try the startup again. As you can see here:
Between the log line about scanning for new relays and the log line where TP starts up again, there are about 2.5 minutes. That is where it hangs, probably on the same Kasa device. But there is a way to start it up. First install the program 'screen': https://www.howtogeek.com/662422/how-to-use-linuxs-screen-command/
I have updated the code to scan for new hardware for at most 30 seconds. That should make sure that TP starts within the 3-minute startup window. But I still do not know why the last Kasa relay is causing issues.
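As a rough illustration of capping a scan at a fixed duration, the sketch below wraps an arbitrary scan coroutine in `asyncio.wait_for`. It is not the actual TerrariumPI change; `scan_with_timeout`, `slow_scan` and `fast_scan` are hypothetical names, and the short timeout in the demo only stands in for the 30 seconds mentioned above.

```python
import asyncio


async def scan_with_timeout(scan, timeout=30.0):
    """Run scan() and give up after `timeout` seconds (hypothetical helper).

    Returns the scan result, or an empty list if the scan did not finish
    in time, so that startup can continue without the new hardware.
    """
    try:
        return await asyncio.wait_for(scan(), timeout=timeout)
    except asyncio.TimeoutError:
        return []


async def slow_scan():
    # Stand-in for a device that hangs during discovery.
    await asyncio.sleep(5)
    return ["relay"]


async def fast_scan():
    # Stand-in for a device that answers immediately.
    return ["relay"]


print(asyncio.run(scan_with_timeout(fast_scan, timeout=0.05)))
print(asyncio.run(scan_with_timeout(slow_scan, timeout=0.05)))
```

The hanging scan is cancelled when the timeout expires, and the caller gets an empty result rather than blocking the rest of the startup.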
Any updates on this?
Hi, sorry for the delayed response. Also, for anyone encountering this in the future: faulty 1-wire sensors will mess with the weirdest parts of the system. They corrupted my SD card several times, resulted in the system not loading, etc. If you're facing an issue similar to mine, try disconnecting all 1-wire sensors and then slowly reconnecting them. Thank you!
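When reconnecting sensors, it can help to check each one outside TerrariumPI. The sketch below is not part of TerrariumPI; it assumes the standard Linux w1 sysfs layout (`/sys/bus/w1/devices/28-*/w1_slave`) and the usual two-line `w1_slave` format with a CRC status and a `t=` value in millidegrees Celsius. The function names are hypothetical.

```python
from pathlib import Path

# Assumption: standard location of the kernel's 1-wire devices.
W1_DEVICES = Path("/sys/bus/w1/devices")


def parse_w1_slave(text):
    """Return the temperature in Celsius, or None if the reading is invalid."""
    lines = text.strip().splitlines()
    # The first line ends with YES when the CRC check passed.
    if len(lines) < 2 or not lines[0].strip().endswith("YES"):
        return None
    _, sep, raw = lines[1].rpartition("t=")
    if not sep:
        return None
    try:
        return int(raw) / 1000.0
    except ValueError:
        return None


def check_sensors(base=W1_DEVICES):
    """Map each 28-* device name under `base` to a reading or None."""
    results = {}
    for device in sorted(base.glob("28-*")):
        try:
            text = (device / "w1_slave").read_text()
        except OSError:
            # Unreadable device file: treat as a failed reading.
            results[device.name] = None
            continue
        results[device.name] = parse_w1_slave(text)
    return results


sample = (
    "4b 01 4b 46 7f ff 05 10 e1 : crc=e1 YES\n"
    "4b 01 4b 46 7f ff 05 10 e1 t=20687\n"
)
print(parse_w1_slave(sample))
```

Running `check_sensors()` on the Pi after each reconnected sensor shows which addresses give a valid reading and which return `None`.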
More info
Please look first at the FAQ or Current Issues for more information.
Or search the closed tickets to see if it has been fixed in the past.
Setup:
Describe the bug
Recently, my setup of TerrariumPI (4.7) suddenly stopped working. My two 1-wire probes reported 0F. Once I attempted to restart the system, it wouldn't load the web interface anymore, and SSH wouldn't work either.
I have spent the last 3 days trying to get back up and running - unsuccessfully.
I seem to face a recurring error where something related to my kasa relays fails (according to the logs) and fully bricks the RPI.
I have tried the following changes:
With most attempts I can get to a running TPI, log in once or twice, remove / ignore relays (most of my relays are kasa relays that are auto-added on startup, many of which have nothing to do with my terrarium). Then when I perform a reboot via sudo reboot, TPI doesn't reach healthy state and I get the below logs.
I should add that while my Kasa relays have the address structure 192.168.1.251,1 (through 6), there is no 9999 relay.
Even if I physically disconnect all probes and accessories (e.g. webcam), the error persists and my docker container can't reach healthy state.
And both temp probes provide accurate readings when I manage to connect them. But the system crashes irrespective of that.
I am completely lost at this point. Any help is much appreciated!
Logfiles
2025-01-06 07:09:04,096 - INFO - terrariumEngine - Starting up TerrariumPI 4.12.2 on a Raspberry Pi 4 Model B Rev 1.4 ...
2025-01-06 07:09:04,131 - INFO - terrariumEngine - Loaded 32 settings in 0.03 seconds.
2025-01-06 07:09:04,132 - INFO - terrariumEngine - Loaded language 'en_US'.
2025-01-06 07:09:04,858 - INFO - terrariumEngine - Start loading total power and water usage
2025-01-06 07:09:05,881 - INFO - terrariumEngine - Loaded total power and water usage in 1.02 seconds.
2025-01-06 07:09:05,882 - INFO - terrariumEngine - Loading existing sensors from database.
2025-01-06 07:09:05,945 - WARNING - hardware.sensor - Unable to load sensor 1-Wire sensor temperature named '1-Wire sensor measuring temperature' at address '28-01204fba80d8': Unable to load sensor 1wire 1-Wire sensor measuring temperature at address 28-01204fba80d8: Invalid path., retrying in 0.5 seconds...
2025-01-06 07:09:06,454 - WARNING - hardware.sensor - Unable to load sensor 1-Wire sensor temperature named '1-Wire sensor measuring temperature' at address '28-01204fba80d8': Unable to load sensor 1wire 1-Wire sensor measuring temperature at address 28-01204fba80d8: Invalid path., retrying in 0.5 seconds...
2025-01-06 07:09:06,957 - ERROR - terrariumSensor - Error loading 1wire temperature named '1-Wire sensor measuring temperature' at address '28-01204fba80d8' with error: Unable to load sensor 1-Wire sensor temperature named '1-Wire sensor measuring temperature' at address '28-01204fba80d8': Unable to load sensor 1wire 1-Wire sensor measuring temperature at address 28-01204fba80d8: Invalid path..
2025-01-06 07:09:06,982 - WARNING - hardware.sensor - Unable to load sensor 1-Wire sensor temperature named 'Yuzu Warm Temp' at address '28-01204fc0d569': Unable to load sensor 1wire Yuzu Warm Temp at address 28-01204fc0d569: Invalid path., retrying in 0.5 seconds...
2025-01-06 07:09:07,485 - WARNING - hardware.sensor - Unable to load sensor 1-Wire sensor temperature named 'Yuzu Warm Temp' at address '28-01204fc0d569': Unable to load sensor 1wire Yuzu Warm Temp at address 28-01204fc0d569: Invalid path., retrying in 0.5 seconds...
2025-01-06 07:09:07,988 - ERROR - terrariumSensor - Error loading 1wire temperature named 'Yuzu Warm Temp' at address '28-01204fc0d569' with error: Unable to load sensor 1-Wire sensor temperature named 'Yuzu Warm Temp' at address '28-01204fc0d569': Unable to load sensor 1wire Yuzu Warm Temp at address 28-01204fc0d569: Invalid path..
2025-01-06 07:09:07,989 - INFO - terrariumEngine - Scanning for new sensors ...
2025-01-06 07:09:17,193 - INFO - terrariumEngine - Loaded 0 sensors in 11.31 seconds.
2025-01-06 07:09:17,194 - INFO - terrariumEngine - Loading existing relays from database.
2025-01-06 07:09:21,635 - INFO - terrariumRelay - Loaded relay Kasa Smart relay named '1 Yuzu Heat Panel' at address '192.168.1.251,1' value 0.00 in 4.42 seconds.
2025-01-06 07:09:24,412 - INFO - terrariumRelay - Loaded relay Kasa Smart relay named '2 String Lights Tank' at address '192.168.1.251,2' value 100.00 in 2.78 seconds.
2025-01-06 07:09:27,269 - INFO - terrariumRelay - Loaded relay Kasa Smart relay named '3 Waterfall' at address '192.168.1.251,3' value 100.00 in 2.86 seconds.
2025-01-06 07:09:29,904 - INFO - terrariumRelay - Loaded relay Kasa Smart relay named '4 Growlights Bottom Tank' at address '192.168.1.251,4' value 0.00 in 2.63 seconds.
2025-01-06 07:09:32,713 - INFO - terrariumRelay - Loaded relay Kasa Smart relay named '5 Snake Sun' at address '192.168.1.251,5' value 100.00 in 2.81 seconds.
2025-01-06 07:09:32,770 - WARNING - hardware.relay - Error changing relay Kasa Smart relay named '6 Cooling Fan' at address '192.168.1.251,6' to state 0.0. Error: Unable to query the device 192.168.1.251:9999: [Errno 104] Connection reset by peer, retrying in 0.5 seconds...
2025-01-06 07:09:35,923 - INFO - terrariumRelay - Loaded relay Kasa Smart relay named '6 Cooling Fan' at address '192.168.1.251,6' value 0.00 in 3.21 seconds.
2025-01-06 07:09:35,925 - INFO - terrariumEngine - Scanning for new relays ...
No GEMBIRD SiS-PM found. Check USB connections, please!
Traceback (most recent call last):
File "/opt/venv/lib/python3.11/site-packages/kasa/xortransport.py", line 101, in close
await writer.wait_closed()
File "/usr/local/lib/python3.11/asyncio/streams.py", line 364, in wait_closed
await self._protocol._get_close_waiter(self)
File "/usr/local/lib/python3.11/asyncio/selector_events.py", line 999, in _read_ready__data_received
data = self._sock.recv(self.max_size)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/venv/lib/python3.11/site-packages/gevent/_socketcommon.py", line 660, in recv
return self._sock.recv(*args)
^^^^^^^^^^^^^^^^^^^^^^
ConnectionResetError: [Errno 104] Connection reset by peer
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/TerrariumPI/terrariumPI.py", line 23, in
terrariumEngine = terrariumEngine(version)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/TerrariumPI/terrariumEngine.py", line 159, in init
self.scan_new_relays()
File "/TerrariumPI/terrariumEngine.py", line 870, in scan_new_relays
for relay in terrariumRelay.scan_relays(callback=self.callback_relay):
File "/TerrariumPI/hardware/relay/init.py", line 284, in scan_relays
for relay in relay_device._scan_relays(callback, **kwargs):
File "/TerrariumPI/hardware/relay/kasa_relay.py", line 123, in _scan_relays
found_devices = __asyncio.run(__scan())
^^^^^^^^^^^^^^^^^^^^^^^
File "/TerrariumPI/terrariumUtils.py", line 65, in run
return data.result()
^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 456, in result
return self.__get_result()
^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
raise self._exception
File "/TerrariumPI/hardware/relay/kasa_relay.py", line 93, in __scan
await device.update()
File "/opt/venv/lib/python3.11/site-packages/kasa/smartstrip.py", line 113, in update
await super().update(update_children)
File "/opt/venv/lib/python3.11/site-packages/kasa/smartdevice.py", line 357, in update
await self._modular_update(req)
File "/opt/venv/lib/python3.11/site-packages/kasa/smartdevice.py", line 386, in _modular_update
responses = [
^
File "/opt/venv/lib/python3.11/site-packages/kasa/smartdevice.py", line 387, in
await self.protocol.query(request) for request in request_list if request
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/venv/lib/python3.11/site-packages/kasa/iotprotocol.py", line 43, in query
return await self._query(request, retry_count)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/venv/lib/python3.11/site-packages/kasa/iotprotocol.py", line 64, in _query
raise ex
File "/opt/venv/lib/python3.11/site-packages/kasa/iotprotocol.py", line 48, in _query
return await self._execute_query(request, retry)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/venv/lib/python3.11/site-packages/kasa/iotprotocol.py", line 86, in _execute_query
return await self._transport.send(request)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/venv/lib/python3.11/site-packages/kasa/xortransport.py", line 165, in send
raise RetryableException(
kasa.exceptions.RetryableException: Unable to query the device 192.168.1.251:9999: [Errno 104] Connection reset by peer
2025-01-06T06:38:00Z
Exception ignored in: <module 'threading' from '/usr/local/lib/python3.11/threading.py'>
Traceback (most recent call last):
File "/opt/venv/lib/python3.11/site-packages/gevent/monkey.py", line 868, in _shutdown
orig_shutdown()
File "/usr/local/lib/python3.11/threading.py", line 1590, in _shutdown
lock.acquire()
File "/opt/venv/lib/python3.11/site-packages/gevent/thread.py", line 112, in acquire
acquired = BoundedSemaphore.acquire(self, blocking, timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "src/gevent/_semaphore.py", line 184, in gevent._gevent_c_semaphore.Semaphore.acquire
File "src/gevent/_semaphore.py", line 253, in gevent._gevent_c_semaphore.Semaphore.acquire
File "src/gevent/_abstract_linkable.py", line 521, in gevent._gevent_c_abstract_linkable.AbstractLinkable._wait
File "src/gevent/_abstract_linkable.py", line 487, in gevent._gevent_c_abstract_linkable.AbstractLinkable._wait_core
File "src/gevent/_abstract_linkable.py", line 490, in gevent._gevent_c_abstract_linkable.AbstractLinkable._wait_core
File "src/gevent/_abstract_linkable.py", line 442, in gevent._gevent_c_abstract_linkable.AbstractLinkable._AbstractLinkable__wait_to_be_notified
File "src/gevent/_abstract_linkable.py", line 451, in gevent._gevent_c_abstract_linkable.AbstractLinkable._switch_to_hub
File "src/gevent/_greenlet_primitives.py", line 61, in gevent._gevent_c_greenlet_primitives.SwitchOutGreenletWithLoop.switch
File "src/gevent/_greenlet_primitives.py", line 65, in gevent._gevent_c_greenlet_primitives.SwitchOutGreenletWithLoop.switch
File "src/gevent/_gevent_c_greenlet_primitives.pxd", line 35, in gevent._gevent_c_greenlet_primitives._greenlet_switch