
fix inference mode error for qwen2-vl #2668

Merged · 1 commit · Oct 28, 2024
Conversation

irexyc (Collaborator) commented Oct 28, 2024

Motivation

When using qwen2-vl, the following error occurs. The cause is not obvious at first glance, since the outermost scope is already wrapped in `torch.inference_mode()`.

Related PR: #2632

2024-10-25 23:09:36,288 - lmdeploy - WARNING - async_engine.py:505 - Since v0.6.0, lmdeploy add `do_sample` in GenerationConfig. It defaults to False, meaning greedy decoding. Please set `do_sample=True` if sampling  decoding is needed
2024-10-25 23:09:44,250 - lmdeploy - ERROR - request.py:21 - Engine loop failed with error: Inplace update to inference tensor outside InferenceMode is not allowed.You can make a clone to get a normal tensor before doing inplace update.See https://github.com/pytorch/rfcs/pull/17 for more details.
Traceback (most recent call last):
  File "/opt/py3/lib/python3.10/site-packages/lmdeploy/pytorch/engine/request.py", line 17, in _raise_exception_on_finish
    task.result()
  File "/opt/py3/lib/python3.10/site-packages/lmdeploy/pytorch/engine/engine.py", line 963, in async_loop
    await self._async_loop()
  File "/opt/py3/lib/python3.10/site-packages/lmdeploy/pytorch/engine/engine.py", line 957, in _async_loop
    await __step()
  File "/opt/py3/lib/python3.10/site-packages/lmdeploy/pytorch/engine/engine.py", line 945, in __step
    raise e
  File "/opt/py3/lib/python3.10/site-packages/lmdeploy/pytorch/engine/engine.py", line 939, in __step
    raise out
  File "/opt/py3/lib/python3.10/site-packages/lmdeploy/pytorch/engine/engine.py", line 873, in _async_loop_background
    await self._async_step_background(
  File "/opt/py3/lib/python3.10/site-packages/lmdeploy/pytorch/engine/engine.py", line 764, in _async_step_background
    next_token_ids = self.async_sampling_logits(
  File "/opt/py3/lib/python3.10/site-packages/lmdeploy/utils.py", line 231, in __func_warpper
    return func(*args, **kwargs)
  File "/opt/py3/lib/python3.10/site-packages/lmdeploy/pytorch/engine/engine.py", line 548, in async_sampling_logits
    logits = logits_processor(all_ids, guided_input_ids, split_logits)
  File "/opt/py3/lib/python3.10/site-packages/lmdeploy/pytorch/engine/logits_process.py", line 324, in __call__
    scores = _process_temperature_(scores, temperature)
  File "/opt/py3/lib/python3.10/site-packages/lmdeploy/pytorch/engine/logits_process.py", line 18, in _process_temperature_
    scores.div_(temperature[:, None])
RuntimeError: Inplace update to inference tensor outside InferenceMode is not allowed.You can make a clone to get a normal tensor before doing inplace update.See https://github.com/pytorch/rfcs/pull/17 for more details

@irexyc irexyc requested a review from lvhan028 October 28, 2024 06:40
@lvhan028 lvhan028 changed the title fix inference mode error for qwen2 fix inference mode error for qwen2-vl Oct 28, 2024
@lvhan028 lvhan028 merged commit a41a2a2 into InternLM:main Oct 28, 2024
4 of 5 checks passed
AllentDan pushed a commit to AllentDan/lmdeploy that referenced this pull request Nov 13, 2024