
Conversation

rgerganov
Collaborator

Update the README file to match the newly added functionality of exposing multiple devices from a single server.

@abc-nix

abc-nix commented Oct 6, 2025

Could you add an "Advanced" example mentioning the use of llama-server with --tensor-split and/or --override-tensor options connecting to different RPC servers? I think a lot of people expect --gpu-layers to magically distribute layers automatically to each device based on its available memory, and don't understand why it fails when "it should fit".

Sorry if this is not in the scope of the RPC documentation.

@tommarques56

Hi @rgerganov, I can run an automated high-severity-only LLM review on this PR and post a single focused inline comment. Reply with "approve" or add a comment saying "@tommarques56 approve" to proceed.

@abc-nix

abc-nix commented Oct 6, 2025

OT: tommarques56, dude, this is a documentation PR. What the heck are you doing with your bot?

No matter your PhD, you should talk with the repo owner first before doing this shit. The ggml organization has an email you can contact. Or open a discussion.

@rgerganov
Collaborator Author

> Could you add an "Advanced" example mentioning the use of llama-server with --tensor-split and/or --override-tensor options connecting to different RPC servers? I think a lot of people expect --gpu-layers to magically distribute layers automatically to each device based on its available memory, and don't understand why it fails when "it should fit".

Good point, I have added a note about how weights are distributed by default and how to change this with --tensor-split.
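
To illustrate the kind of setup being discussed, here is a minimal sketch of pointing llama-server at two RPC servers with an explicit split; the hostnames, ports and ratio below are placeholders, not values from this PR:

```sh
# Hypothetical addresses: an rpc-server instance is assumed to be listening on each.
# --rpc adds the remote machines as devices, -ngl offloads layers to them, and
# --tensor-split overrides the default weight distribution with a ~3:1 ratio.
llama-server -m model.gguf -ngl 99 \
    --rpc 192.168.1.10:50052,192.168.1.11:50052 \
    --tensor-split 3,1
```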

Co-authored-by: Diego Devesa <slarengh@gmail.com>
rgerganov merged commit c61ae20 into ggml-org:master on Oct 7, 2025
3 checks passed