Skip to content

Crash/Panic after SSL Handshake fails #211

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
bkgoodman opened this issue Feb 16, 2017 · 24 comments
Closed

Crash/Panic after SSL Handshake fails #211

bkgoodman opened this issue Feb 16, 2017 · 24 comments

Comments

@bkgoodman
Copy link

This only happens under a kind of odd edge-case - I specified my own Root CA, Client Cert and Key, and under "normal" conditions (when I access my server) this works fine.

In the case of the crash - this occurred when I inadvertently connected to the www.howsmyssl.com server when using my own certs. When I tried this only specifying my Root CA, I got the appropriate error that the certificate verify mismatched/failed, and everything aborted normally. But when I had my client cert and key in as well, I got this guru meditation.

Crash, I assume at the line:

DEBUG_PRINT("Protocol is %s \nCiphersuite is %s\n", mbedtls_ssl_get_version(&ssl_client->ssl_ctx), mbedtls_ssl_get_ciphersuite(&ssl_client->ssl_ctx));

So, in other words - I believe once the handshake failed (as it should have) - maybe it should just have aborted, rather than to continue - which might have (as a result of the handshake failure) gotten back bad pointers from the mbedtls_ssl_get_ciphersuite and/or mbedtls_ssl_get_version calls?

(I could try a fix - if I new how to build it? First time with Arduino here...)

Starting connection to server...
Free heap before TLS 106160
Seeding the random number generator
Loading CA cert
Loading CRT cert
Loading private key
Setting up the SSL/TLS structure...
Performing the SSL/TLS handshake...
mbedtls_ssl_handshake returned -0x2700
Guru Meditation Error of type LoadProhibited occurred on core  1. Exception was unhandled.
Register dump:
PC      : 0x400014fd  PS      : 0x00060430  A0      : 0x800d861c  A1      : 0x3ffde0f0  
A2      : 0x00000000  A3      : 0xfffffffc  A4      : 0x000000ff  A5      : 0x0000ff00  
A6      : 0x00ff0000  A7      : 0xff000000  A8      : 0x00000000  A9      : 0x3ffde0a0  
A10     : 0x00000000  A11     : 0x3f4042df  A12     : 0x3ffde334  A13     : 0x3ffc8ff0  
A14     : 0x00000000  A15     : 0x00000001  SAR     : 0x00000016  EXCCAUSE: 0x0000001c  
EXCVADDR: 0x00000000  LBEG    : 0x400014fd  LEND    : 0x4000150d  LCOUNT  : 0xffffffff  

Backtrace: 0x400014fd:0x3ffde0f0 0x400d861c:0x3ffde100 0x400d3008:0x3ffde410 0x400d1972:0x3ffde460 0x400d15aa:0x3ffde6b0 0x400d1608:0x3ffde6d0 0x400d13f3:0x3ffde700 0x4012934e:0x3ffde720

CPU halted.
0x400014fd: ?? ??:0
0x400d861c: _vfprintf_r at vfprintf.c:1529
0x400d3008: printf at printf.c:61
0x400d1972: start_ssl_client(sslclient_context*, unsigned int, unsigned int, unsigned char*, unsigned char*, unsigned char*) at ssl_client.cpp:230
0x400d15aa: WiFiClientSecure::connect(IPAddress, unsigned short, unsigned char*, unsigned char*, unsigned char*) at WiFiClientSecure.cpp:188
0x400d1608: WiFiClientSecure::connect(char const*, unsigned short, unsigned char*, unsigned char*, unsigned char*) at WiFiClientSecure.cpp:188
0x400d13f3: setup at BKGSecureWifi.ino:124
0x4012934e: loopTask(void*) at main.cpp:11 (discriminator 1)
@bkgoodman
Copy link
Author

I did verify that mbedtls_ssl_get_ciphersuite was returning NULL.

I believe the best way to handle this might be to (close the socket and) abort if mbedtls_ssl_handshake returns NULL, returning 0. (Higher level code should regognize this as a failure)

That said - I am not sure if a lot of the other break; error cases should do the same. (Don't want to submit a patch without knowing - and would defer judgement to someone with more knowledge)

@copercini
Copy link
Contributor

copercini commented Feb 16, 2017

oh, it's a really odd edge-case hahaha
but it should not crash in any case. Thanks for report =)

I believe the best way to handle this might be to (close the socket and) abort if mbedtls_ssl_handshake returns NULL, returning 0. (Higher level code should regognize this as a failure)

It's a good solution. But returning -1 or the MbedTLS error, because higher level code just interprets as error if (ret < 0).

I want to release some fixes and heap optimizations for this lib soon and include your fix to this too.

@bkgoodman
Copy link
Author

But the Adruino example does:

if (client.connect(server, 443))

So this would interpret "0" as an error, and nonzero as valid. This works if zero is not a valid "socket".

I was going to come up with a patch, but there seem to be some inconsistencies in how errors are returned. Some places return 0 for error, others return -1. The published client example codes (above) assumes zero is an error.

I don't want to break any "existing" applications - but I'm inclined to think a "clean break" at this point might be good.

If you think >=0 should be "valid" and <0 should mean error - and you're okay with "changing" the existing demo and API - I can submit a fix that cleans up some of this...

@copercini
Copy link
Contributor

copercini commented Feb 16, 2017

But it's called first here: https://github.com/espressif/arduino-esp32/blob/master/libraries/WiFiClientSecure/src/WiFiClientSecure.cpp#L96 just after that it returns to arduino.

@bkgoodman
Copy link
Author

It's only in one of the 4 cases of "connect". In the first two, no explicit "return" is given (will c++ return the last value on the stack by default)? Even in the one you site, it always returns "1" - even on error. In the last case (below that) it returns "0" on name lookup failure, otherwise the value from the "third" case (one you sited) - which always returns "1".

Then there's start_ssl_client which returns the socket number (initialized to -1) - but does not clean up the connection or set the socket back to -1 in the case of error.

Seems like all the connect functions should really return the socket on success, or <0 on error.

@copercini
Copy link
Contributor

@bkgoodman
Copy link
Author

Getting warmer :-) ... It runs a few iterations and then died - but this time, before it did - it got:

Setting up the SSL/TLS structure...
SSL - Memory allocation failed
MbedTLS error: -32512

(The following panic was in the same .queue.c location, trying to read after the connection failed to come up, etc. I still seem to not be getting the return code that the connect failed).

So clearly a case of something not getting freed correctly, even in the happy case.

I am just sitting in a loop doing (comments, printfs and cruft removed for legibility):

  WiFiClientSecure *client = new WiFiClientSecure;
  if (client->connect(server, 443, test_ca_cert, test_client_cert, test_client_key)>=0)
  {
    client->printf("GET https://host.domain.com/cgi-bin/time.cgi?i1=%d&i2=%di3=%d&i4=%d&reason=0x%x HTTP/1.0",
          digitalRead(36),digitalRead(37),0,
          2,REG(RTC_CNTL_RESET_STATE_REG)
          );
    client->println("Host: host.domain.com");
    client->println("Connection: close");
    client->println();
  }

  while (!client->available()){ }  

  while (client->available()) {
    char c = client->read();
    Serial.write(c);
  }

  client->stop();
  delete client;

@copercini
Copy link
Contributor

copercini commented Feb 24, 2017

just for check, please put
WiFiClientSecure *client = new WiFiClientSecure;
just as global and remove it from loop().
remove too: delete client;

If it works, I know the problem hahahaha

@me-no-dev
Copy link
Member

where do you see setup and loop?
Why do you new WiFiClientSecure if you gonna use the client only locally?
Try putting new line (\r\n) in the first printf. there isn't new line added so the server does not even get the host header. instead it gets invalid string.

@me-no-dev
Copy link
Member

me-no-dev commented Feb 24, 2017

void testClient(){
  WiFiClientSecure client;
  if(!client.connect(server, 443, test_ca_cert, test_client_cert, test_client_key)){
    return;//connect fail
  }
  client.printf("GET https://host.domain.com/cgi-bin/time.cgi?i1=%d&i2=%di3=%d&i4=%d&reason=0x%x HTTP/1.0\r\n",
          digitalRead(36),digitalRead(37),0,
          2,REG(RTC_CNTL_RESET_STATE_REG)
          );
  client.print("Host: host.domain.com\r\n");
  client.print("Connection: close\r\n\r\n");

  while (!client.available()){ }  

  while (client.available()) {
    char c = client.read();
    Serial.write(c);
  }
}

@bkgoodman
Copy link
Author

That worked for several iterations. - then I got the same error in queue.c:

Verifying peer X.509 certificate...
Certificate verified.
Free heap after TLS 21348
Connected to server!
UNKNOWN ERROR CODE (004E)
MbedTLS error: -78

Which appear to come from one of the client->printfs which come after the client->connect. (HTTP request, GET, etc).

Another time (after a reset and may loops) I did see:

Performing the SSL/TLS handshake...
BIGNUM - Memory allocation failed
MbedTLS error: -16

...and one other time I saw...

/Users/ficeto/Desktop/ESP32/ESP32/esp-idf-public/components/freertos/./queue.c:1445 (xQueueGenericReceive)- assert failed!
abort() was called at PC 0x400837e1

@me-no-dev
Copy link
Member

you are running the latest esp32-arduino right?

@bkgoodman
Copy link
Author

Not sure what you mean about the newlines. I have it in the first (and subsequent) printfs. Like I said - it works the first several times through the loop. My Apache log sees the request properly - no issues.

As for why I new/delete all the time - the idea is that this is going to sit in a 7/24 lights-out facility. It needs to survive WiFi outages - so the idea being that I should be completely tearing down and restarting the network connection every time through. What I'm REALLY trying to do is to get a good watchdog to come in and do a hard reset intermittently, but that is a whole other issue...

I am running the very latest:

[remote "origin"]
  url = https://github.com/espressif/arduino-esp32.git
  fetch = +refs/heads/*:refs/remotes/origin/*
[branch "master"]
  remote = origin
  merge = refs/heads/master

(With the exception of the WifiSecureClient/src files you wanted me to try in the comment above)

I just did a pull and while I haven't seen the queue.c error yet - I still see the UNKNOWN ERROR CODE 04E MbedTLS error: -78 (in the printf)?

I will keep trying, though...

@me-no-dev
Copy link
Member

in the first printf in your code there isn't a new line at the end! so you are sending actually GET [url] HTTP/1.1Host: [hostname]\r\n and I also wonder why you have https://hostname in your request? you actually need only the stuff after hostname (/cgi-bin/time.cgi?i1=%d&i2=%di3=%d&i4=%d&reason=0x%x HTTP/1.0\r\n)

@me-no-dev
Copy link
Member

there could be one more reason. Withe every client.print you are sending a new packet. Which might be an issue with TLS. Try sending this instead of all prints:

client.printf("GET /cgi-bin/time.cgi?i1=%d&i2=%di3=%d&i4=%d&reason=0x%x HTTP/1.0\r\nHost: %s\r\nConnection: close\r\n\r\n",
  digitalRead(36),digitalRead(37),0,
  2,REG(RTC_CNTL_RESET_STATE_REG),
  "host.domain.com"
  );

@copercini
Copy link
Contributor

copercini commented Feb 24, 2017

You are running out of heap, so these strange errors appears...

try call
client.stop();
in the end of the function

Every time calling

  WiFiClientSecure client;
  if(!client.connect(server, 443, test_ca_cert, test_client_cert, test_client_key))

it start a new ssl connection and you are not ending the previous one, running out of heap.

@me-no-dev
Copy link
Member

@copercini what are you talking about? at the end of the method the client destruct is called. If you are not closing the connection, the server will (Connection: close) and it's a bug. No need for the large fonts please. I have very good vision ;) If you are leaking memory, that is your implementation at fault. the code above is valid.

@copercini
Copy link
Contributor

@me-no-dev sorry about big font, It was not intentional just my poor github formatting code skills

@bkgoodman are you getting a message like: "SSL - The peer notified us that the connection is going to be closed" after the request?

@bkgoodman
Copy link
Author

@copercini - no message like that. In fact, the URL goes to a simple CGI script that dumps the environs - which the ESP then prints - all that is coming back fine.

This code (which prints the received page to serial) (and the odd GET line with the host name and newline issues) were just all taken verbatim from the WiFiClientSecure Arduino example code.

@bkgoodman
Copy link
Author

bkgoodman commented Feb 24, 2017

I did take my entire request - and send it in as a single printf - adding in the newlines, etc. After a few iterations, I got:

Performing the SSL/TLS handshake...
BIGNUM - Memory allocation failed
MbedTLS error: -16

(To answer your prior question) I"m using WiFiClientSecure because this device is going into an off-site place (a pump-house - monitoring equipment) - and will be using third-party/public wifi to communicate back to the monitoring system. (i.e. it's not "local", or a local network).

me-no-dev pushed a commit that referenced this issue Mar 10, 2017
* Add CA certificate in example

SHA1 fingerprint is broken now: more info: https://shattered.io

* Best error handling

When occur an error in WiFiClientSecure library just return the error message
and clean the context avoiding crash - fix for #211

Translate MbedTLS error codes in messages for best understanding

* Declarate certificates as const

mbedtls_pk_parse_key needs a const unsigned char * certificate. In old implementation the certificate was declarated as char * so first it converts to unsigned and after to const.

When we convert signed to unsigned it may result in a +1 larger output.

Fix issue #223
@copercini
Copy link
Contributor

When WiFiClientSecure receive a malformed or incomplete response it's don't close the connection, using about 45k of heap each connection.

After timeout, just the sockets is closed, but not the mbedtls stuff, causing a big memory leak. So after some bad iterations ESP will out of heap.

@copercini
Copy link
Contributor

Solved!

turmary pushed a commit to Seeed-Studio/Seeed_Arduino_atWiFiClientSecure that referenced this issue Jan 22, 2020
* Add CA certificate in example

SHA1 fingerprint is broken now: more info: https://shattered.io

* Best error handling

When occur an error in WiFiClientSecure library just return the error message
and clean the context avoiding crash - fix for espressif/arduino-esp32#211

Translate MbedTLS error codes in messages for best understanding

* Declarate certificates as const

mbedtls_pk_parse_key needs a const unsigned char * certificate. In old implementation the certificate was declarated as char * so first it converts to unsigned and after to const.

When we convert signed to unsigned it may result in a +1 larger output.

Fix issue espressif/arduino-esp32#223
@bjdshp
Copy link

bjdshp commented Mar 3, 2022

When WiFiClientSecure receive a malformed or incomplete response it's don't close the connection, using about 45k of heap each connection.

After timeout, just the sockets is closed, but not the mbedtls stuff, causing a big memory leak. So after some bad iterations ESP will out of heap.

I am running with a continuous connection sending data to Google Firestore every five minutes. I am getting this same error. You said this was solved (5 years ago). Mine appears to be running out of heap which looks like same description.

[E][ssl_client.cpp:33] handle_error(): SSL - The connection indicated an EOF
16:45:44.955 -> [E][ssl_client.cpp:35] handle_error(): MbedTLS message code: -29312

What is the specific solution?

@trullock
Copy link

Slightly off topic, but would be helpful to me;

REG(RTC_CNTL_RESET_STATE_REG),

Where can I find this REG macro? Doesn't seem to be imported by default. What include do I need?

Cheers

blue-2357 pushed a commit to blue-2357/arduino-esp32 that referenced this issue Jul 17, 2024
* Add CA certificate in example

SHA1 fingerprint is broken now: more info: https://shattered.io

* Best error handling

When occur an error in WiFiClientSecure library just return the error message
and clean the context avoiding crash - fix for espressif/arduino-esp32#211

Translate MbedTLS error codes in messages for best understanding

* Declarate certificates as const

mbedtls_pk_parse_key needs a const unsigned char * certificate. In old implementation the certificate was declarated as char * so first it converts to unsigned and after to const.

When we convert signed to unsigned it may result in a +1 larger output.

Fix issue espressif/arduino-esp32#223
dash0820 added a commit to dash0820/arduino-esp32-stripped that referenced this issue Mar 10, 2025
* Add CA certificate in example

SHA1 fingerprint is broken now: more info: https://shattered.io

* Best error handling

When occur an error in WiFiClientSecure library just return the error message
and clean the context avoiding crash - fix for espressif/arduino-esp32#211

Translate MbedTLS error codes in messages for best understanding

* Declarate certificates as const

mbedtls_pk_parse_key needs a const unsigned char * certificate. In old implementation the certificate was declarated as char * so first it converts to unsigned and after to const.

When we convert signed to unsigned it may result in a +1 larger output.

Fix issue espressif/arduino-esp32#223
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants