Skip to content

[core] Support CLI commands to kill Ray resources like tasks/actors/nodes/placement groups #57152

@kevin85421

Description

@kevin85421

Description

Some issues occur only at large scale. In such cases, most computations have already been completed, and deadlines are sometimes stringent, leaving users with no opportunity to rerun the entire job. Therefore, we should provide CLIs to enable users to terminate specific Ray tasks or actor processes, triggering retries of failed tasks so that the job will not hang forever due to some corner cases.

This is useful when users need to babysit some workloads.

Use case

No response

Metadata

Metadata

Assignees

Labels

coreIssues that should be addressed in Ray CoreenhancementRequest for new feature and/or capabilitytriageNeeds triage (eg: priority, bug/not-bug, and owning component)

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions