Skip to content

Fix stall on usb write: try to write only if usb is actually connected #7721

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

imwhocodes
Copy link


Description of Change

While developing on local esp32-s3 the program stalled if no pc was connected to the USB
USB was written by both normal Stream::write and the native log infrastructure

This fix the issue by checking if USB is enabled before trying to actually write

Related links

I'm aware of 1d595fd by @SuGlider but it is not clear to me as this could completely address the 'unconnected USB problem': changing tx_timeout_ms impact only if there are 2 threads writing to the USB and one of them is already stuck trying to write

Maybe I'm missing something,
Luca

@CLAassistant
Copy link

CLAassistant commented Jan 16, 2023

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ lucasssvaz
❌ Luca


Luca seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

@VojtechBartoska VojtechBartoska added the Status: Review needed Issue or PR is awaiting review label Jan 25, 2023
Copy link
Collaborator

@SuGlider SuGlider left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if the PR is necessary. Plese check my comments below.

@SuGlider
Copy link
Collaborator

I'm aware of 1d595fd by @SuGlider but it is not clear to me as this could completely address the 'unconnected USB problem': changing tx_timeout_ms impact only if there are 2 threads writing to the USB and one of them is already stuck trying to write

Thanks @imwhocodes for the PR.
PR #7583 fixes the issue you want to address with this PR.
Please let me know if you have an specific use case to demonstrate the issue.

I used this Testing Sketch:

I have an ESP32-S3 with 2 USB ports here. One for UART and another for USB CDC.
I have plugged both USB ports to may Win11 PC.
I Selected ESP32-S3 board with these options in the IDE Board Menu:

  • USB Mode: "Hardware CDC and JTAG" ==> this is the option associated to HWCDC.cpp code
  • USB CDC On Boot: "Enabled" ==> it will work with USB CDC (Serial = CDC)
    In this mode, Serial will be the HW USB CDC port and Serial0 will be the UART port.

The idea of this testing sketch is to write a message in UART and USB CDC every 1 second.

I used PuTTY as terminal to connect to the Win11 COM port from the S3 UART and used Arduino Serial Monitor to connect to the USB CDC Win11 COM port. In this way, Putty will display the loop UART messages while I can experiment with the USB CDC port, plugging, unplugging, closing the Serial Terminal etc.

void setup() {
  Serial.begin(115200);
  Serial0.begin(115200);

  while(!Serial0) delay(1);
  Serial0.print("\n=================\nPrinting on UART\n=================\n");
}

void loop() {
  static uint32_t now = millis();

  if (millis() > now + 1000) {
    Serial0.printf("[%d] Loop running...\n", millis());
    Serial.printf("[%d] Loop running USB CDC...\n", millis());
    now += 1000;
  }
}

Conclusion of the experiment:

1- If I start the S3 only with the Serial Port plugged, I get the loop working fine, every 1 second, even when it is trying to write to the CDC port as well. Therefore, it never stalls.

2- If I start the S3 with both ports plugged and then open the Arduino Serial Monitor (which will only work when Windows enumerates the USB CDC), I see both ports with messages. Because the messages are buffered into the Ring Buffer, I can see all of them from the beginning.

Now, if I unplug the USB from the S3, the serial monitor stops displaying new messages, which is OK.
But the UART continues in PuTTY, which means that the S3 has not stalled.

Now close the Serial Monitor Window. Plug the USB back again.
Open the Serial Monitor again and see the messages flowing in both the Serial Monitor and PuTTY.
No stall...
In the Serial Monitor you may see the Time Stamp a bit jumping, it is due to, again, buffering written data.

@SuGlider SuGlider self-assigned this Jan 28, 2023
@imwhocodes
Copy link
Author

Ok I did some testing and found the problem:
If you try to call void Stream::flush() on the USB-CDC Serial it get stuck

#include <Arduino.h>

void setup() {
  Serial.begin(115200);
  Serial0.begin(115200);

  while(!Serial0) delay(1);

  Serial0.print("\n=================\nPrinting on UART\n=================\n");
  Serial0.flush();

  Serial.print("\n=================\nPrinting on USB CDC\n=================\n");
  Serial.flush(); // <-- HERE PROGRAM STALL


}

void loop() {
  static uint32_t now = millis();

  if (millis() > now + 1000) {
    Serial0.printf("[%d] Loop running UART...\n", millis());
    Serial0.flush();

    Serial.printf("[%d] Loop running USB CDC...\n", millis());
    Serial.flush();

    now += 1000;
  }
}

I'm working on platformio, here you have the complete test project
On platformio.ini file you can change the platform_packages = parameter between release 2.0.6 and my branch-fix

Honestly I still think that the timeout is a stop-gap measure, proper way should be to check if usb is connected (indicated by the value of initial_empty) and bail-out of all operation if it is disconnected, as propose by my PR

Serial-USB-Fixes-Test.zip

Thanks,
Luca

@SuGlider
Copy link
Collaborator

SuGlider commented Feb 1, 2023

Maybe related to #7554

@SuGlider
Copy link
Collaborator

SuGlider commented Feb 1, 2023

@imwhocodes - What is the OS of the computer used for the flush() testing?

@imwhocodes
Copy link
Author

imwhocodes commented Feb 1, 2023

@imwhocodes - What is the OS of the computer used for the flush() testing?

MacOS and Ubuntu
But it is not correlated, it stall only if the CDC-USB is not connected to the PC and/or the console has not been opened

@imwhocodes imwhocodes mentioned this pull request Feb 1, 2023
1 task
@SuGlider
Copy link
Collaborator

SuGlider commented Feb 1, 2023

CLA assistant check Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

Luca seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

@imwhocodes - please review and sign the CLA in order to continue with the process. Thanks!

@SuGlider
Copy link
Collaborator

SuGlider commented Feb 5, 2023

@imwhocodes

I'm sorry, but it seems that CI won't let it pass because of commit e392b55
It was done by another user (the wrongly configured one) and because it has no CLA/valid Github user, it won't be able to pass CI and then we can't merge it after reviewing it.

Please close this PR and open a new one with a single commit from your valid GH user ID.

@SuGlider SuGlider added this to the 2.0.7 milestone Feb 5, 2023
@VojtechBartoska VojtechBartoska modified the milestones: 2.0.7, 3.0.0 Feb 13, 2023
@VojtechBartoska
Copy link
Contributor

This PR needs more investigation and testing, milestone changed to 3.0.0.

@me-no-dev
Copy link
Member

@SuGlider PTAL. Maybe good to mix in with the SOA monitor code I gave you a while back.

@VojtechBartoska
Copy link
Contributor

Hello @imwhocodes, can you please sign CLA?

@imwhocodes
Copy link
Author

Hello @imwhocodes, can you please sign CLA?

Commit is signed with "Luca" but my user-name is "imwhocodes" so it refuse my signing

@lucasssvaz
Copy link
Collaborator

Hello @imwhocodes, can you please sign CLA?

Commit is signed with "Luca" but my user-name is "imwhocodes" so it refuse my signing

Please recommit with the proper author configuration.

@VojtechBartoska VojtechBartoska modified the milestones: 3.0.0, 3.0.0-RC1 Jan 30, 2024
@VojtechBartoska VojtechBartoska removed the Status: Awaiting triage Issue is waiting for triage label Jan 30, 2024
Copy link
Contributor

Warnings
⚠️

Some issues found for the commit messages in this PR:

  • the commit message "fix stall on usb write: try to write only if usb is actually connected to a host":
    • summary looks empty
    • type/action looks empty

Please fix these commit messages - here are some basic tips:

  • follow Conventional Commits style
  • correct format of commit message should be: <type/action>(<scope/component>): <summary>, for example fix(esp32): Fixed startup timeout issue
  • allowed types are: change,ci,docs,feat,fix,refactor,remove,revert,test
  • sufficiently descriptive message summary should be between 20 to 72 characters and start with upper case letter
  • avoid Jira references in commit messages (unavailable/irrelevant for our customers)

TIP: Install pre-commit hooks and run this check when committing (uses the Conventional Precommit Linter).

👋 Hello imwhocodes, we appreciate your contribution to this project!


Click to see more instructions ...


This automated output is generated by the PR linter DangerJS, which checks if your Pull Request meets the project's requirements and helps you fix potential issues.

DangerJS is triggered with each push event to a Pull Request and modify the contents of this comment.

Please consider the following:
- Danger mainly focuses on the PR structure and formatting and can't understand the meaning behind your code or changes.
- Danger is not a substitute for human code reviews; it's still important to request a code review from your colleagues.
- Resolve all warnings (⚠️ ) before requesting a review from human reviewers - they will appreciate it.
- To manually retry these Danger checks, please navigate to the Actions tab and re-run last Danger workflow.

Review and merge process you can expect ...


We do welcome contributions in the form of bug reports, feature requests and pull requests.

1. An internal issue has been created for the PR, we assign it to the relevant engineer.
2. They review the PR and either approve it or ask you for changes or clarifications.
3. Once the GitHub PR is approved we do the final review, collect approvals from core owners and make sure all the automated tests are passing.
- At this point we may do some adjustments to the proposed change, or extend it by adding tests or documentation.
4. If the change is approved and passes the tests it is merged into the default branch.

Generated by 🚫 dangerJS against fee3e4b

@lucasssvaz lucasssvaz added the Resolution: Awaiting response Waiting for response of author label Jan 31, 2024
@Jason2866
Copy link
Collaborator

We (Tasmota) use this to detect if JTAG USB is connected:

#if SOC_USB_SERIAL_JTAG_SUPPORTED  // Not S2
  rtc_clk_bbpll_add_consumer();    // Maybe unneeded
  usb_serial_jtag_ll_ena_intr_mask(USB_SERIAL_JTAG_INTR_SOF);
  usb_serial_jtag_ll_clr_intsts_mask(USB_SERIAL_JTAG_INTR_SOF);
  // First check if USB cable is connected
  for (uint32_t i = 0; i < 1000; i++) {  // Allow the host to send at least one SOF packet, 1ms should be enough but let's be very conservative here - maybe unneeded
      is_connected_to_USB = ((usb_serial_jtag_ll_get_intraw_mask() & USB_SERIAL_JTAG_INTR_SOF) != 0);
      if (is_connected_to_USB) { break; }
      delay(1);
  }
  rtc_clk_bbpll_remove_consumer();
#else
  is_connected_to_USB = true;      // S2
#endif  // SOC_USB_SERIAL_JTAG_SUPPORTED

@SuGlider
Copy link
Collaborator

SuGlider commented Feb 7, 2024

This PR will be reviewed and maybe it will use a different approcha based on USB SOF event.

@SuGlider SuGlider modified the milestones: 3.0.0-RC1, 3.0.0 Feb 7, 2024
@me-no-dev me-no-dev added the Status: Pending CLA ⚠️ Contributor is required to sign the CLA label Feb 7, 2024
@VojtechBartoska VojtechBartoska modified the milestones: 3.0.0, 3.0.0-RC1 Feb 21, 2024
@SuGlider
Copy link
Collaborator

This PR will be closed in favor of #9275

@SuGlider SuGlider closed this Feb 22, 2024
Copy link
Collaborator

@SuGlider SuGlider left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR demonstrates an issue, but the solution was changed to use USB SOF signal from IDF 5.1 implementation. It will be solved within Arduino Core 3.0.0.

Thanks @imwhocodes !!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Resolution: Awaiting response Waiting for response of author Status: Pending CLA ⚠️ Contributor is required to sign the CLA Status: Review needed Issue or PR is awaiting review
Projects
Development

Successfully merging this pull request may close these issues.

7 participants