Skip to content

Possible performance improvement to HardwareSerial::Read #7474

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
1 task done
wolfbert opened this issue Nov 15, 2022 · 20 comments · Fixed by #7525
Closed
1 task done

Possible performance improvement to HardwareSerial::Read #7474

wolfbert opened this issue Nov 15, 2022 · 20 comments · Fixed by #7525
Assignees
Labels
Area: Peripherals API Relates to peripheral's APIs. Peripheral: UART Status: Solved Type: Feature request Feature request for Arduino ESP32

Comments

@wolfbert
Copy link

Related area

UART

Hardware specification

ESP32

Is your feature request related to a problem?

In trying to understand the inner workings of HardwareSerial and UART (i.e. code review), I noticed the following:

uart_read_bytes could in fact read the whole buffer in one go, saving 2 calls per byte.

Describe the solution you'd like

A possible solution would be to replace uartRead with a new function uartReadBytes with reversed logic. I.e. make reading a single byte the special case, instead of reading the buffer. Care would have to be taken to conditionally insert the "peek byte" at the start of the buffer. The HardwareSerial interface would remain unchanged, whether uartRead needs to be preserved is to be decided.

I could draft the code, but have no means to perform extensive quality checks and tests (other than running examples on my ESP32).

Separately, and a minor thing, it seems that the checks following uart_read_bytes in uartRead and uartPeek are unnecessary. From what I can see, uart_read_bytes will not clobber the buffer if no bytes are returned, and c has already been initialized.

If we absolutely, positively don't trust the IDF, the check would have to be if (len <= 0). uart_read_bytes can return a (negative) error value. In any case, the caller of uartRead has no means of knowing whether the returned value is valid.

Describe alternatives you've considered

No response

Additional context

No response

I have checked existing list of Feature requests and the Contribution Guide

  • I confirm I have checked existing list of Feature requests and Contribution Guide.
@wolfbert wolfbert added the Type: Feature request Feature request for Arduino ESP32 label Nov 15, 2022
@mrengineer7777
Copy link
Collaborator

Tagging @SuGlider. He has done great work on HardwareSerial and can likely help you.

@VojtechBartoska VojtechBartoska added the Status: Awaiting triage Issue is waiting for triage label Nov 16, 2022
@SuGlider
Copy link
Collaborator

@wolfbert - Thanks for the suggestion.

Indeed Arduino has only one HardwareSerial native function, which is HardwareSerial::read() and it will read a single byte from UART.
HardwareSerial::read(buffer) is an extension, but as you said, it could be improved as you suggested.

But ESP32 Arduino is based on IDF, and IDF is a FreeRTOS environment, therefore an issue may happen when we use Tasks to read the same UART.

There will be concurrency and a race condition may create a situation where a task thinks that there are X available bytes to read, then it is preempted, and then another task consumes the buffer and the scheduler returns to the previous task.

All sorts of issues of this nature have to be taken into consideration when writing the code for Arduino HardwareSerial and UART HAL level code.

@SuGlider
Copy link
Collaborator

By other hand, there is HardwareSerial::peek() that also reads a byte from IDF Ringbuffer.
Thus, it may have already consumed a byte from the buffer and it has to be included in a uart_read_bytes(buffer) version...

There are many little details.
At this time I'm sure that the current code works and passes a set of Use Cases and issues reported in the past.

@SuGlider
Copy link
Collaborator

SuGlider commented Nov 16, 2022

For most Arduino users, which is the main target of this project, improving the loop for an out of standard API function that reads a whole buffer won't really make a big difference.
As said, the Arduino standard API is just read() of a single byte. 99% of all sketches and examples just do that.

If the Arduino standard API needs to read more than a Byte, it will use Stream::readBytes(buffer) instead, which in turn, calls HardwareSerial::read() for each byte.

@SuGlider
Copy link
Collaborator

Above, I just wrote some out loud thinking about the suggestion made in this issue.

I could draft the code, but have no means to perform extensive quality checks and tests (other than running examples on my ESP32).

Please feel free to contribute with a PR that improves the performance of the project!
We will appreciate it a lot!

@wolfbert
Copy link
Author

Thanks for your thoughts. I agree that Serial is a core functionality of Arduino and should not be changed without very good reason. Also the benefits may not be worth it, as likely nobody would notice anyway.

I'm not so worried about the concurrency issues (though maybe I should be ...). That must be handled mostly within IDF (i.e. it must protect access to the hardware and ring buffer). The peek function should IMHO be implemented on IDF level, where code could actually look into the buffer without reading. But it can be handled, as it is now.

Actually, I don't see how this can be thread safe:

int HardwareSerial::read(void)
{
    if(available()) {
        return uartRead(_uart);
    }
    return -1;
}

size_t HardwareSerial::read(uint8_t *buffer, size_t size)
{
    size_t avail = available();
    if (size < avail) {
        avail = size;
    }
    size_t count = 0;
    while(count < avail) {
        *buffer++ = uartRead(_uart);
        count++;
    }
    return count;
}

If there occurs a task switch and the buffer is emptied in between, uartRead will return 0. It is also unnecessary, as uart_read_bytes does this check. Then again, having multiple tasks arbitrarily reading from the same serial stream is rather unlikely.

Right now, I'm undecided if it's worth it to pursue this.

@SuGlider
Copy link
Collaborator

Then again, having multiple tasks arbitrarily reading from the same serial stream is rather unlikely.

Yes, that is true. I also think it is unlikely, but every HAL interface in the core assumes it may be used this way.

Actually, I don't see how this can be thread safe:

The Arduino Layer isn't thread safe.
We made only the HAL layer thread safe.
I think that the idea was to make the Arduino code closer to the mainstream code, not sure.

@SuGlider
Copy link
Collaborator

SuGlider commented Nov 16, 2022

Right now, I'm undecided if it's worth it to pursue this.

Let me think a bit more about how to add the uart_read_bytes(buffer) to the HAL layer.
That one may be easy... I'll add a PR and then you can review it!

@wolfbert
Copy link
Author

Another thing I noticed is that the timed read (Stream::readBytes()) also reads character by character. It could be fixed of course, but, as noted before, is it worth it? In any case, I don't want to cause you overtime ;-)

As for HardwareSerial::read(buffer...)something along these lines should do it:

size_t HardwareSerial::read(buffer...)
    return uartReadBytes(buffer...)

size_t uartReadBytes(buffer...)
    if buffer length == 0 || uart == NULL: 
        return 0

    peek_length = 0
    mutex_lock
    if has_peek:
        has_peek = false
        add peek byte to start of buffer
        peek_length = 1
    bytesRead = uart_read_bytes(remaining buffer...)
    mutex_unlock

    return (bytesRead > 0) ? bytesRead + peek_length : peek_length

@wolfbert
Copy link
Author

wolfbert commented Nov 17, 2022

Went down the rabbit hole ... and did a simple performance test, comparing readBytes(), read(buffer), uart_read_bytes(single byte) and uart_read_bytes(buffer). Unless I'm much mistaken, and I'm still trying to fully understand the result, the ratio is roughly 246 : 150 : 85 : 2 µs for reading the 120 bytes hardware buffer. For whatever it's worth.

EDIT: timings are in µs, not ms

@mrengineer7777
Copy link
Collaborator

mrengineer7777 commented Nov 17, 2022

Replies to your comments

Separately, and a minor thing, it seems that the checks following uart_read_bytes in uartRead and uartPeek are unnecessary.

uart_read_bytes() can timeout and return 0 bytes.

From what I can see, uart_read_bytes will not clobber the buffer if no bytes are returned, and c has already been initialized.

True

If we absolutely, positively don't trust the IDF, the check would have to be if (len <= 0). uart_read_bytes can return a (negative) error value.

True. uartPeek() and uartRead() should handle the negative value. @SuGlider

In any case, the caller of uartRead has no means of knowing whether the returned value is valid.

True. Caller should only call uartRead() if there are bytes available.

Analysis
read(buf) //[ARD] HardwareSerial.cpp
uint8_t b = uartRead(); //[ARD] esp32-hal-uart.c
uart_read_bytes(port, buf, 1, timeout); //[IDF] uart.c

The middle layer "esp-hal-uart" appears to add two features for a read:
Mutex locking //Arduino config defaults to OFF, so no effect here
1 byte peek //Allows checking the next byte available.

Discussion
IDF currently does not implement a peek mechanism. uart_read_bytes() calls xRingbufferReceive() which does not have a peek mechanism, but uart_read_bytes() does save the resulting bytes (possibly more than requested) to an internal buffer. A peek function could be added for this internal buffer.

Currently to implement UART byte peek, esp-hal-uart.c has to read a byte and then save it until the next time uartRead() is called.
If IDF did implement a peek mechanism and locking mutex wasn't needed, then uart_read_bytes() could be called instead of uartRead().
@wolfbert If you don't care about any of that, you may get away with calling uart_read_bytes() yourself.

IMHO dropping byte peek would simplify esp32-hal-uart.c, but I don't know who depends on that.

@wolfbert
Copy link
Author

wolfbert commented Nov 18, 2022

True. Caller should only call uartRead() if there are bytes available.

Not necessarily. Serial/HardwareSerial::peek and ::read return an int (valid byte or negative error code). So do the buffered read functions. uartRead and uartPeek in between return uint8_t and therefore cannot communicate errors. If they returned int as well, error handling could be streamlined. @SuGlider

@wolfbert If you don't care about any of that, you may get away with calling uart_read_bytes() yourself.

I tried to mix Arduino and IDF functions for uart, and as long as peek isn't used, it seems to work. The peek function can't be dropped, because it's in Arduinos Serial as well. It could just be moved to IDF, which would greatly simplify things.

To sum up, we have a few potential areas for improvement:

  1. streamline error handling in uartRead/Peek
  2. use buffered reads instead of single byte reads
  3. implement this also in Stream::readBytes (has to be looked into though, as Stream is used beyond uart)
  4. move peek to IDF and get rid of the peek byte logic altogether

Now, these are more quality topics, so if the decision is to not go forward with it, I respect that (never change a running system ;-).

@wolfbert
Copy link
Author

Some more thinking done. I'd suggest that

  • item (1) above should be done
  • items (2) and (3) are nice to have; the performance gain, while significant, won't be noticeable on an ESP32
  • item (4) is not worth it.

@awawa-dev
Copy link

Maybe it's a niche problem and I'm not sure what it's caused by, but I just want to note that with the release of Arduino-esp32 2.0 and SerialPort refactoring, serial communication performance has dropped drastically. Currently, at 2,000,000 baud or faster (2Mb cp2104, 2Mb/4Mb CH9102x) there are problems with frames of approx. 900 bytes which ESP32 is not able to receive undamaged / complete (confirmed on various boards). In versions 1.x there was no problem even with 1800 byte frames.

@SuGlider
Copy link
Collaborator

SuGlider commented Nov 21, 2022

Maybe it's a niche problem and I'm not sure what it's caused by, but I just want to note that with the release of Arduino-esp32 2.0 and SerialPort refactoring, serial communication performance has dropped drastically. Currently, at 2,000,000 baud or faster (2Mb cp2104, 2Mb/4Mb CH9102x) there are problems with frames of approx. 900 bytes which ESP32 is not able to receive undamaged / complete (confirmed on various boards). In versions 1.x there was no problem even with 1800 byte frames.

@awawa-dev

Please open a new issue with a good sketch that demonstrates this issue.
I will personally analyse it and get a feedback to you.
You can tag me in the new issue.

@SuGlider SuGlider added Area: Peripherals API Relates to peripheral's APIs. Status: Needs investigation We need to do some research before taking next steps on this issue Peripheral: UART and removed Status: Awaiting triage Issue is waiting for triage labels Nov 21, 2022
@wolfbert
Copy link
Author

Please open a new issue with a good sketch that demonstrates this issue.

@SuGlider Not sure whether your previous comment was addressed to me. Anyway, it's not a bug, so providing code would be tricky. If you take a look at uartRead/Peek/Available, you'll see it quickly. I'm happy to talk you through.

@SuGlider
Copy link
Collaborator

Please open a new issue with a good sketch that demonstrates this issue.

@SuGlider Not sure whether your previous comment was addressed to me. Anyway, it's not a bug, so providing code would be tricky. If you take a look at uartRead/Peek/Available, you'll see it quickly. I'm happy to talk you through.

@wolfbert - The previous comment was about the issue reported above. I've edited it to make it clear.
I'll add a PR with a few improvements based on your observations in this issue. Thanks!

@awawa-dev
Copy link

@SuGlider if you can take a look #7505

@SuGlider
Copy link
Collaborator

SuGlider commented Nov 27, 2022

@awawa-dev @wolfbert

I have created the PR #7525 that addresses the suggested improvement of this issue.

To sum up, we have a few potential areas for improvement:

  1. streamline error handling in uartRead/Peek
  2. use buffered reads instead of single byte reads
  3. implement this also in Stream::readBytes (has to be looked into though, as Stream is used beyond uart)
  4. move peek to IDF and get rid of the peek byte logic altogether

I think that (1), (2) and (4) are addressed with the PR.

(3) can't be fixed, exactly because it is just an API, that is independent of HardwareSerial.

@SuGlider
Copy link
Collaborator

SuGlider commented Dec 4, 2022

implement this also in Stream::readBytes (has to be looked into though, as Stream is used beyond uart)

This has been also done in implement this also in e072302
Thus all points are now addressed.

@VojtechBartoska VojtechBartoska added Status: Solved and removed Status: Needs investigation We need to do some research before taking next steps on this issue labels Dec 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: Peripherals API Relates to peripheral's APIs. Peripheral: UART Status: Solved Type: Feature request Feature request for Arduino ESP32
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants