The offset in the processQueue was not removed correctly #927

@0daypwn

Description

The issue tracker is ONLY used for the go client (feature requests for RocketMQ need to follow the RIP process). Keep in mind, please check whether the same report already exists before you raise a new one.

Alternatively (especially if your communication is not a bug report), you can send mail to our mailing lists. We welcome any friendly suggestions, bug fixes, collaboration, and other improvements.

Please ensure that your bug report is clear and that it is complete. Otherwise, we may be unable to understand it or to reproduce it, either of which would prevent us from fixing the bug. We strongly recommend that the report (bug report or feature request) include some hints as to the following:

BUG REPORT

  1. Please describe the issue you observed:

    • What did you do (The steps to reproduce)?
      The producer sends messages very fast.
      The consumer consumes messages very fast.

    • What did you expect to see?

    • What did you see instead?
      Some process queues' cached offsets may not be removed correctly.
      The consumer offset then cannot be updated on the broker.
      When this happens many times, it may block consumption of the queue.

  2. Please tell us about your environment:

    • What is your OS?

    • What is your client version?
      v2.1.1

    • What is your RocketMQ version?

  3. Other information (e.g. detailed explanation, logs, related issues, suggestions on how to fix, etc):

processQueue put message order:

  1. (pq.putMessage) put messages to channel pq.msgCh
  2. (pq.putMessage) lock pq.mutex
  3. (pq.putMessage) put messages to map pq.msgCache
  4. (pq.putMessage) unlock pq.mutex

Consume message order:
1. (pq.getMessages) get messages from channel pq.msgCh
2. consumerInner
3. if consume success, do remove message
4. (pq.removeMessage) lock pq.mutex
5. (pq.removeMessage) remove messages from map pq.msgCache
6. (pq.removeMessage) unlock pq.mutex

In high-concurrency scenarios, the put and consume steps can interleave as follows:
1. (pq.putMessage) put messages to channel pq.msgCh
2. (pq.getMessages) get messages from channel pq.msgCh
3. consumerInner
4. if consume success, do remove message
5. (removeMessage) lock pq.mutex
6. (removeMessage) remove messages from map pq.msgCache. At this point the offset is not yet in the map, so nothing is removed.
7. (removeMessage) unlock pq.mutex
8. (pq.putMessage) lock pq.mutex
9. (pq.putMessage) put messages to map pq.msgCache. Nothing will ever delete this entry.
10. (pq.putMessage) unlock pq.mutex

Labels: enhancement (New feature or request)