Task watchdog getting triggered when opening OTA partition #3775
Comments
Since you know exactly the method that causes the issue, why not just disable the watchdog (disableCore0WDT) while OTA begin runs? You really should be suspending all normal activity during an OTA in any case, so turning off WDT for 4-5 seconds should be reasonably safe.
I'm not a huge fan of ever disabling watchdogs, for the same reason I don't take my seatbelt off when I turn onto my own street just because nothing usually happens there. We are using BLE to send the update data to the device, so there is still some async activity. I have had the BLE stack give me guru meditations during development of this firmware.
Firmware update is not driving the car, it is replacing the oil filter. The engine should be off when you do that 😄. You should only need the WDT off when you apply the update (the end method). That is where it will be moving a big chunk of data to the disk and waiting for an ack.
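For readers weighing the two positions above, here is a minimal sketch of pausing the core 0 idle-task watchdog for only the duration of one call. `ScopedPause` and `beginOtaWithWdtPaused` are hypothetical names introduced for illustration. `disableCore0WDT()` and `enableCore0WDT()` are the Arduino core helpers mentioned earlier in the thread. Whether it is safe to pause the watchdog while BLE is active is not settled here.

```cpp
#include <functional>
#include <utility>

// Runs `pause` on construction and `resume` on destruction, so the watchdog
// is re-armed on every exit path from the enclosing scope.
class ScopedPause {
public:
  ScopedPause(std::function<void()> pause, std::function<void()> resume)
      : resume_(std::move(resume)) {
    pause();
  }
  ~ScopedPause() { resume_(); }
  ScopedPause(const ScopedPause&) = delete;
  ScopedPause& operator=(const ScopedPause&) = delete;

private:
  std::function<void()> resume_;
};

#ifdef ARDUINO_ARCH_ESP32
#include <Arduino.h>
#include "esp_ota_ops.h"

// Hypothetical helper: opens the next OTA partition while the IDLE0 task is
// removed from the task watchdog, then re-adds it when the guard goes out of
// scope. Other tasks subscribed to the watchdog are not affected.
esp_err_t beginOtaWithWdtPaused(esp_ota_handle_t* handle) {
  const esp_partition_t* part = esp_ota_get_next_update_partition(nullptr);
  if (part == nullptr) return ESP_ERR_NOT_FOUND;
  ScopedPause guard(disableCore0WDT, enableCore0WDT);
  return esp_ota_begin(part, OTA_SIZE_UNKNOWN, handle);
}
#endif
```

The guard keeps the unprotected window as short as possible, which partly addresses the seatbelt objection: the watchdog is off for one call, not for the whole update.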
Well, semantics of WDT aside, is there a documented way to rebuild an Arduino libs release exactly, i.e. 1.0.4? The arduino-lib-builder script seems to be unversioned, and it also pulls in the master branches of its dependencies.
If you look at the releases page, there is a tag and a commit point for each release. Check out that commit.
I understand, but I mean the https://github.com/espressif/esp32-arduino-lib-builder/ repository, which does not seem to have any releases. Its scripts also seem to check out unversioned sub-dependencies.
If you look in tools/config.sh there is a line for the branch (read https://github.com/espressif/arduino-esp32/blob/master/docs/lib_builder.md). You can also use a specific commit by feeding it with an IDF_COMMIT variable:
[STALE_SET] This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.
[STALE_DEL] This stale issue has been automatically closed. Thank you for your contributions.
Were there any solutions or easier workarounds for this than rebuilding the whole SDK? I'm experiencing the same WDT issue with the OTA library when calling
|
Oh it was so obvious. :) Thanks, this works out well.
Just a note for anybody else who comes across this issue: the problem does not occur on all devices. Some unknown difference means that certain WROOM32 devices need a longer watchdog timeout, while plenty of other devices manage just fine with the defaults.
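On the question of getting a longer timeout without rebuilding the SDK, one possibility is to reconfigure the task watchdog at runtime. The sketch below is an assumption, not something confirmed in this thread. It relies on the ESP-IDF 3.x signature `esp_task_wdt_init(uint32_t timeout_seconds, bool panic)` and on a repeated call updating the timeout of an already initialized watchdog. `wdtSecondsFor` and `raiseTaskWdtTimeout` are hypothetical helpers, and the 4 second slack is an illustrative value.

```cpp
#include <cstdint>

// Rounds a measured duration up to whole seconds and adds fixed slack.
// Illustrative helper: a 5200 ms call with 4 s of slack gives 10 s.
constexpr uint32_t wdtSecondsFor(uint32_t measuredMs, uint32_t slackSeconds) {
  return (measuredMs + 999u) / 1000u + slackSeconds;
}

#ifdef ARDUINO_ARCH_ESP32
#include "esp_task_wdt.h"

// Hypothetical helper: asks the task watchdog to use a longer timeout.
// Assumes the ESP-IDF 3.x API; newer ESP-IDF versions use a different
// initialization interface, so check the headers of the core in use.
bool raiseTaskWdtTimeout(uint32_t measuredMs) {
  return esp_task_wdt_init(wdtSecondsFor(measuredMs, 4), true) == ESP_OK;
}
#endif
```

Calling something like `raiseTaskWdtTimeout(5200)` once in `setup()`, before any OTA work starts, would keep the watchdog armed throughout while giving slower modules room to finish.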
Hardware:
Board: ESP32 Dev Module
Core Installation/update date: 1.0.4
IDE name: Arduino IDE
Flash Frequency: 80MHz
PSRAM enabled: no
Upload Speed: 921600
Computer OS: Linux
Description:
We are noticing an issue on some modules (so far 2/4) whereby calling esp_ota_begin() can stall the ipc0 task for too long, and cause a watchdog reset.
In case it matters, we are compiling with CPU freq 80MHz, and using the minimal SPIFFS w OTA partition scheme.
Initially, we thought that the Arduino LoopTask was causing the watchdog timeout, so we moved the calls to esp_ota_get_next_update_partition() and esp_ota_begin() into their own thread to let the LoopTask run freely. This does not solve it: the task occupying CPU 0 when the watchdog fires is actually ipc0, as shown:
E (13210) task_wdt: Task watchdog got triggered. The following tasks did not re:
E (13241) task_wdt: - IDLE0 (CPU 0)
E (13374) task_wdt: Tasks currently running:
E (13374) task_wdt: CPU 0: ipc0
E (13374) task_wdt: CPU 1: IDLE1
E (13374) task_wdt: Aborting.
abort() was called at PC 0x400dbbb7 on core 0
Putting a JTAG on and inspecting the ipc0 task, its callstack is as follows:
[Switching to thread 9 (Thread 1073446484)]
#0 0x40083df2 in spi_flash_op_block_func (arg=0x0) at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/spi_flash/cache_utils.c:82
82 in /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/spi_flash/cache_utils.c
(gdb) where
#0 0x40083df2 in spi_flash_op_block_func (arg=0x0) at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/spi_flash/cache_utils.c:82
#1 0x40082c6a in ipc_task (arg=0x0) at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/esp32/ipc.c:62
#2 0x40091270 in vPortTaskWrapper (pxCode=0x40082c08 <ipc_task>, pvParameters=0x0) at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/freertos/port.c:143
Working boards still seem to take over 4 seconds to return from esp_ota_begin(). To test the theory that some boards have slower flash than others, we built our own SDK and raised the task watchdog timeout from 5s to 10s, which works around the issue. The misbehaving boards actually take about 5.2 seconds to return from esp_ota_begin(). The problem also seems to get worse when BLE is initialized (as in the minimal example below).
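Timings like these can be collected per board with a small measurement wrapper. This is a minimal sketch, not the code used for the numbers above. `elapsedMs`, `nearTimeout` and `timeOtaBegin` are hypothetical names, and the margin passed to `nearTimeout` is whatever safety gap the caller considers acceptable.

```cpp
#include <cstdint>

// Returns how many milliseconds one call of `op` took, using `nowMs` as the
// clock. Unsigned subtraction keeps the result correct across counter wrap.
template <typename Clock, typename Op>
uint32_t elapsedMs(Clock nowMs, Op op) {
  const uint32_t start = static_cast<uint32_t>(nowMs());
  op();
  return static_cast<uint32_t>(nowMs()) - start;
}

// True when a measured duration leaves less than `marginMs` before the
// watchdog timeout would expire.
inline bool nearTimeout(uint32_t tookMs, uint32_t timeoutMs, uint32_t marginMs) {
  return tookMs + marginMs >= timeoutMs;
}

#ifdef ARDUINO_ARCH_ESP32
#include <Arduino.h>
#include "esp_ota_ops.h"

// Hypothetical diagnostic: times esp_ota_begin() and reports the result.
// The handle is returned to the caller, who remains responsible for it.
// If the watchdog fires during the call, nothing is printed.
esp_err_t timeOtaBegin(esp_ota_handle_t* handle) {
  const esp_partition_t* part = esp_ota_get_next_update_partition(nullptr);
  if (part == nullptr) return ESP_ERR_NOT_FOUND;
  esp_err_t err = ESP_FAIL;
  const uint32_t took = elapsedMs(millis, [&] {
    err = esp_ota_begin(part, OTA_SIZE_UNKNOWN, handle);
  });
  Serial.printf("esp_ota_begin: err=%d, %lu ms\n", (int)err, (unsigned long)took);
  return err;
}
#endif
```

With a 5000 ms timeout and a 1000 ms margin, both the "over 4 seconds" boards and the 5.2 second boards would be flagged by `nearTimeout`, which fits the observation that even the working boards have very little headroom.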
I'd really prefer not to have to build our own SDK from a bunch of master branches. Is there any way to get the released Arduino SDK to raise the watchdog timeout limit? If not, is there any way to check out EXACTLY the sources used to create a numbered release? I only know about the tools in https://github.com/espressif/esp32-arduino-lib-builder which don't seem to be versioned.
Sketch:
Debug Messages: