OSError: [Errno 116] ETIMEDOUT from socket.recv_into #209

Closed
istvanzk opened this issue Apr 12, 2025 · 32 comments

Comments

@istvanzk

istvanzk commented Apr 12, 2025

Hi,

When testing POST requests from a XIAO ESP32S3 (Sense) board, I get this exception:

Traceback (most recent call last):
  File "adafruit_requests.py", line 658, in request
OSError: [Errno 116] ETIMEDOUT

i.e. the exception is thrown by

socket.recv_into(result)

I managed to trace the problem further back, to where the request is sent on line 643:

self._send_request(socket, host, method, path, headers, data, json, files)

where the actual sending of the bytes seems to fail, possibly due to some HW/SW buffering reasons. Hence the response received is wrong.

My solution was to add a small delay before line 449:

sent = socket.send(data[total_sent:])

so it becomes:

time.sleep(0.02)
sent = socket.send(data[total_sent:])

With this change I no longer get the ETIMEDOUT error.
Alternatively, I also tried inserting a delay before/after the self._send_request(...) call, but that did not solve the timeout problem.
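
For reference, the workaround described above can be sketched as a standalone send loop. This is a minimal illustration and not the actual adafruit_requests source: `send_all` and its `delay` parameter are hypothetical names, and the socket is only assumed to expose a `send()` that returns the number of bytes written.

```python
import time


def send_all(sock, data, delay=0.02):
    """Send all of `data`, pausing `delay` seconds before each send() call.

    Hypothetical helper that mirrors the workaround, not the library code.
    """
    total_sent = 0
    while total_sent < len(data):
        time.sleep(delay)  # the small pause inserted before every send()
        sent = sock.send(data[total_sent:])
        if not sent:
            # Nothing was written; stop instead of looping forever.
            raise OSError("socket.send() wrote 0 bytes")
        total_sent += sent
    return total_sent
```

With `delay=0` the loop behaves like a plain "send until everything is written" loop, which makes it easy to compare runs with and without the pause.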

I also tested the same code on a XIAO ESP32C3, with the same result. I don't know whether this problem/solution is specific to XIAO or ESP32 boards, or more general.

@justmobilize
Collaborator

A few questions:

  1. What version of CircuitPython are you using?
  2. Does this happen no matter where you POST to?
  3. Can you share some example code that causes your issue?

I have many boards (not that one in particular) and have not seen that issue come up.

@istvanzk
Author

istvanzk commented Apr 12, 2025

Thank you Justin.

I use CircuitPython 9.2.6 (2025-03-23) and Adafruit Requests version 4.1.10.

I'm having the error when sending a POST request to Dropbox API host and route:
https://api.dropboxapi.com/2/users/get_current_account
and with the headers set according to the https://www.dropbox.com/developers/documentation/http/documentation#users-get_current_account info (authentication is correct because it works after the above described 'hack')

It might easily be a server thing, because when I send POST to:
https://api.dropboxapi.com/oauth2/token
the response is correct even without the extra delay in the code.

@justmobilize
Collaborator

That really doesn't make sense. For any one POST there are a bunch of calls to send, and I've never seen a case of them sending too fast.

I would recommend sharing code that others can test, and adding some debug statements to the _send method so you can see how many sends happen before it errors.
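
One way to count the sends without editing the library is to wrap the socket. The sketch below is only an illustration: `CountingSocket` is a hypothetical helper, not part of adafruit_requests, and it assumes the wrapped object has a `send()` that returns the number of bytes written.

```python
class CountingSocket:
    """Wrap a socket-like object and log every send() call.

    Hypothetical debugging aid; not part of adafruit_requests.
    """

    def __init__(self, sock):
        self._sock = sock
        self.send_calls = 0
        self.bytes_sent = 0

    def send(self, data):
        self.send_calls += 1
        sent = self._sock.send(data)
        self.bytes_sent += sent or 0
        print("send #%d: asked %d bytes, wrote %s"
              % (self.send_calls, len(data), sent))
        return sent

    def __getattr__(self, name):
        # Delegate everything else (recv_into, settimeout, close, ...) unchanged.
        return getattr(self._sock, name)
```

Note that the `print()` itself takes time, so the logging can change the timing of the sends it is measuring.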

@istvanzk
Author

Hi Justin.

I know, I had the same reaction :)

I tried to debug this by printing the data bytes just before the actual send():

hex_data = ' '.join(f'{b:02x}' for b in data[total_sent:])
print(f'Sending data: {hex_data}')
sent = socket.send(data[total_sent:])

Then things worked fine, and that is how I got to the 'solution' of replacing the print() with a delay...

I'll have my code in GitHub repo within a few days, and I'll post here the link.

@istvanzk
Author

Hi,

My code is available here: https://github.com/istvanzk/xiaocam_cpy
The dbx_test is the one which generated the OSError: [Errno 116] ETIMEDOUT on my XIAO board.
I'll continue testing, also on the other XIAO boards I have.

In order to use the Dropbox API implemented in this repo, you need to have a Dropbox account and set up an App in your Dropbox App Console. Then follow the OAuth flow, which can be run on a development machine (not the XIAO board), to obtain the access_token and refresh_token. These tokens need to be inserted in the settings.toml file, as the values for the corresponding DBX_* variables. Don't forget to set also all the other DBX_* variables.

@istvanzk
Author

istvanzk commented Apr 13, 2025

I have now tested the same code on a XIAO ESP32C3 board with CircuitPython 9.2.7, and I got the same error, with the same 'solution' of adding a small delay just before the socket.send().

@istvanzk
Author

istvanzk commented Apr 18, 2025

I have been testing this problem further, to try to understand it better. For now, I get this error only when interacting with the Dropbox API server from CircuitPython (their official Python SDK works fine). However ....

  • From the MicroPython issue #5759 it looks like a similar error was known and fixed already back in 2020 for UDP sockets, and is due to no data available (from the server), in combination with the wrong error returned by the extmod/modlwip.c implementation.
  • In the current CircuitPython implementation, I was not able to find the extmod/modlwip.c implementation, so I can't check whether the old MicroPython fix is applied or not.

@dhalbert
Contributor

  • In the current CircuitPython implementation, I was not able to find the extmod/modlwip.c implementation, so I can't check whether the old MicroPython fix is applied or not.

Our versions of lwip are in ports/raspberrypi/lib/lwip and the one inside ESP-IDF: ports/espressif/esp-idf/modules/components/lwip.

@istvanzk
Author

Thank you Dan for the pointers.
So far I have not been able to find the place in the code where the above MP fix could appear. Anyway, it is for UDP sockets, while adafruit_requests defaults to TCP.

I was now reading the psf/requests#3353 (comment)
I'm not sure this would solve my problem, but in the current socketpool.Socket implementation there is no support to set the SO_KEEPALIVE, TCP_KEEPIDLE, TCP_KEEPINTVL, TCP_KEEPCNT with setsockopt(). Or, perhaps, there is, but the int values for these are not exposed!?

@dhalbert
Contributor

Or, perhaps, there is, but the int values for these are not exposed!?

Yes, you could look those up in circuitpython/ports/espressif/esp-idf/components/lwip/lwip/src/include/lwip/sockets.h and just use the integer values. We have been adding them slowly, but there are many and they take up some (small) amount of space to add.
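
As a rough sketch of what "just use the integer values" could look like: the numbers below are what lwIP's sockets.h is believed to define, but they are assumptions to be checked against the header in your tree, and `enable_keepalive` is a hypothetical helper name. The only call relied on is `setsockopt(level, optname, value)` on the socket.

```python
# Values believed to match lwIP's sockets.h. Treat them as assumptions and
# verify against the header before relying on them.
SOL_SOCKET = 0xFFF
SO_KEEPALIVE = 0x0008
IPPROTO_TCP = 6
TCP_KEEPIDLE = 0x03
TCP_KEEPINTVL = 0x04
TCP_KEEPCNT = 0x05


def enable_keepalive(sock, idle=30, interval=5, count=3):
    """Turn on TCP keep-alive on `sock` using raw option numbers.

    idle/interval are in seconds; count is the number of unanswered probes
    before the connection is considered dead.
    """
    sock.setsockopt(SOL_SOCKET, SO_KEEPALIVE, 1)
    sock.setsockopt(IPPROTO_TCP, TCP_KEEPIDLE, idle)
    sock.setsockopt(IPPROTO_TCP, TCP_KEEPINTVL, interval)
    sock.setsockopt(IPPROTO_TCP, TCP_KEEPCNT, count)
```

These numbers are lwIP-specific; on a desktop OS the constants from the standard `socket` module should be used instead.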

@istvanzk
Author

Thank you Dan.

I have now tested setting the options for TCP keep-alive as recommended in the other psf/requests#3353 (comment). It did not solve my ETIMEDOUT issue, so I need to dig further. Perhaps I'll also raise the discussion on the Dropbox forum.

Status: Unfortunately, so far, the only solution which works is the delay inserted in the _send() method ....

@istvanzk
Author

The issue is now also reported on the Dropbox forum. Let's see ...

@justmobilize
Collaborator

I'll test your code on some other boards on Monday (was traveling, so couldn't before)

@dhalbert
Contributor

@istvanzk Do you have a Pico W or similar that could also be used to test this, to see if it's specific to Espressif? And if you have any other Espressif boards that are not ESP32-S3, that would also be interesting.

@istvanzk
Author

I don't have a Pico W. I have a Challenger+ RP2350 WiFi6/BLE5, i.e. with an ESP32-C6 co-processor. I'll try to test this next week.
So far, I have only tested one other XIAO board, the XIAO ESP32C3, with the same result and 'fix'.

@dhalbert
Contributor

We don't have support for WiFi on the Challenger board in CircuitPython. I suggested a Pico W because it uses a very different WiFi co-processor and underlying code, but would still use the adafruit_requests.

You could even try adafruit_requests on regular Python on your host computer. You can install it with pip3 install adafruit-circuitpython-requests, assuming you already have Python installed. There is an example for CPython ("regular Python") here: https://github.com/adafruit/Adafruit_CircuitPython_Requests/blob/main/examples/cpython/requests_cpython_simpletest.py

@istvanzk
Author

Thank you @dhalbert . OK. I'm going to modify my test code to run on my dev machine (Python 3.12).
Anyway, I think it is a good idea to have such support in my code as well, as I can see this support in most of the CircuitPython modules.

Btw, for using WiFi on my Challenger board in another project, I was planning to make use of the circuitpython-esp32at.

@istvanzk
Author

I have now done the test on host/dev machine (Python 3.12) using the original adafruit_requests and my custom Dropbox API implementation. It runs as expected, no errors, no timeouts.
This means the delay 'fix' I came to use when running on the XIAO board is needed due to some Dropbox API server behaviour. Perhaps something to do with the SSL. It's more and more strange, to me at least.

Thank you for all the help @dhalbert and @justmobilize !

When/if I find a solution to my problem I'll post it here, just in case others experience similar issues.

@istvanzk
Author

I'm closing this as it does not seem to be an issue with adafruit_requests.

@dhalbert
Contributor

It could be an issue with WiFi on the Espressif chips (maybe something is not-quite-done when something returns). That's why testing on Pico W would be interesting.

Do you have a simple test case you could include here? (I did not read all the posts, sorry if it's somewhere above.) I assume we need a Dropbox API key?

@istvanzk
Author

Hi @dhalbert

Yes, to test this, one needs a Dropbox account (even a free one would do) to be able to create an App on their servers. I have my code here: https://github.com/istvanzk/xiaocam_cpy , see also earlier post: #209 (comment)

In the meantime, on the Dropbox forum it was suggested that I try one of these:

  1. TCP Window Size Sensitivity
    Dropbox API servers may delay ACK packets on purpose (common cloud optimization). Your time.sleep(0.02) fix inadvertently gives the server time to respond. Try modifying socket.settimeout() in your requests library instead of hardcoded sleeps.

  2. MTU Fragmentation
    The ESP32S3's default 1500-byte MTU may conflict with Dropbox's TLS overhead. Test with:

import wifi
wifi.radio.mtu = 1400  # Add this before API calls

My results:

  1. Does not have any impact, and the timeout on the socket was already set (in the connection_manager). I tried several values, no difference w.r.t. this error. Also, similar POST, or GET, requests work fine on other cloud servers, but not the Dropbox one.
  2. The wifi.radio.mtu is not directly accessible in the current CircuitPython, for XIAO boards at least.

@justmobilize
Collaborator

If you set the timeout on the very first request, the socket will have it.

You can also create a separate request session:

requests = adafruit_requests.Session(pool, ssl_context)
requests.get(...)
requests_2 = adafruit_requests.Session(pool, ssl_context, session_id="other")
requests_2.get(..., timeout=60)

@istvanzk
Author

Hi @justmobilize
This was a spot-on tip, thank you!
I have used a separate session only for the Dropbox API, and things work as expected. I have to set the timeout value to 60 s for Dropbox, while for any other server a 10 s timeout is fine.

Now, I think my problem is finally solved :)

@istvanzk istvanzk reopened this Apr 22, 2025
@istvanzk
Author

The solution to such problems is to use a separate Session, as explained by @justmobilize in #209 (comment).
In my case I use a timeout value of 60 s for the Dropbox API, while for any other server a 10 s timeout is fine.
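
The per-host timeout values could be kept in one place with a small lookup. This is a minimal sketch only: `host_of`, `timeout_for`, and the table are hypothetical names, and the 60 s / 10 s values are simply the ones reported in this thread, not a confirmed fix.

```python
DEFAULT_TIMEOUT = 10  # seconds; reported as fine for other servers
HOST_TIMEOUTS = {
    "api.dropboxapi.com": 60,  # value reported for the Dropbox API
}


def host_of(url):
    """Extract the host from an http(s) URL using plain string operations."""
    rest = url.split("://", 1)[-1]   # drop the scheme, if any
    host = rest.split("/", 1)[0]     # drop the path
    return host.split(":", 1)[0].lower()  # drop the port


def timeout_for(url):
    """Return the timeout (in seconds) to pass along with a request to `url`."""
    return HOST_TIMEOUTS.get(host_of(url), DEFAULT_TIMEOUT)
```

The result would then be passed as the `timeout=` argument of the request call, so that every call to a given host uses the same value.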

@justmobilize
Collaborator

@istvanzk that's awesome. If you update your repo and let me know, I'll test a few other boards to see if the same thing is needed or if it's board specific

@istvanzk
Author

@justmobilize My repo https://github.com/istvanzk/xiaocam_cpy/tree/main is now updated, and I removed all references to the 'fix'. I'll also test it later on the other XIAO board I have (C3).

@justmobilize
Collaborator

So on an ESP32S2 with the test code, setting the timeout to something small like 0.1 failed and 0.3 passed. I could also use the same session without issues. Each host will get its own socket, so you don't really need a different session unless you need different timeouts for different calls.

@dhalbert
Contributor

The timeout increase for ACK seems very specific to Dropbox, but they said "Dropbox API servers may delay ACK packets on purpose (common cloud optimization)". Since one doesn't see the Dropbox problem on, say, regular Python on a host computer, should we change the default? I'm confused about what the default should be. See for example https://docs.python.org/3/library/socket.html#notes-on-socket-timeouts, where the default timeout is None, which means the socket is blocking and can apparently wait a long time.

@justmobilize
Collaborator

Yeah, in CPython you need to set it if you don't want it to block forever. But since things are usually threaded, or you are doing one thing at a time, it's not a big deal.
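
The CPython behaviour can be checked on a host computer with the standard `socket` module. A minimal sketch, assuming `socket.setdefaulttimeout()` has not been called earlier in the process (the variable names are just for illustration):

```python
import socket

# A freshly created socket has no timeout, i.e. it is in blocking mode.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
default_timeout = sock.gettimeout()

# After settimeout(), blocking operations give up after that many seconds.
sock.settimeout(10)
configured_timeout = sock.gettimeout()

sock.close()
print(default_timeout, configured_timeout)
```

No connection is made here; the snippet only inspects the timeout setting on the socket object.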

I personally don't think blocking in circuitpython makes as much sense. But maybe?

@istvanzk
Author

I have not tried using 0.1 or 0.3 values, but the case which kept failing for me was with a 10 s timeout and a session shared with other requests (to Adafruit IO). I had to both move the Dropbox calls to a separate session and set the timeout to 60 s (the default in the Dropbox Python API), while for the other calls the timeout is 10 s (it can probably be set lower).

@justmobilize
Collaborator

I'm just saying that since the domain is different, you don't need a different session.

Connection Manager grabs a new socket for each domain, so as long as all the calls to Dropbox send the same timeout value, you just need the one request session.

@istvanzk
Author

Unfortunately, my problem is back. I tried today and again I get the same error, even with the last setup which was working 2 days ago :(. It very much looks like a Dropbox server problem to me, either on purpose or a genuine error.... I don't know. Perhaps I'll give up using Dropbox for this purpose and look for another solution....
