Locate and read cgroup files for cgroup v2 #6432
In this PR, I see changes from #6422. All changes are in one commit. First, it won't rebase automatically; there will be conflicts. Second, the new changes cannot be distinguished from older ones. To avoid these issues, you need to add a new commit (with only new changes) on top of the commit in #6422.
Some feedback
With these changes, the heap size under cgroup v2 matches that under cgroup v1 for a 4 GB Docker container.
@pshipton This should fix ibmruntimes/ci.docker#124, which is linked to eclipse-openj9/openj9#14190. Also, can you confirm whether the tracepoint rules below still apply?
I forgot about the rules for adding tracepoints. But according to https://github.com/eclipse-openj9/openj9/blob/master/doc/diagnostics/AddingTracepoints.md, my modifications of old tracepoints are fine since they don't change the signature of the format specifiers.
The older tracepoint names can't be modified. The original names need to be preserved so the latest build can still process older tracepoint files. It's ok to change some text (without changing signatures), as long as the new text still makes sense in the context of processing older tracepoint files. If a tracepoint is no longer used, you can add the
I thought that the formatter looks at the ordering of the tracepoints and not the name, so the position of a tracepoint in the file determines which old and new tracepoints are matched. If this isn't the case, then maybe we should mention in the documentation that names can't be changed.
Maybe @keithc-ca knows for sure?
The doc at https://github.com/eclipse-openj9/openj9/blob/master/doc/diagnostics/AddingTracepoints.md
@EricYangIBM is correct: the name of the tracepoint is not relevant, just its position within the
First review pass. Please notify me once all the comments have been addressed; I will then do a second review pass.
jenkins build all
Downstream (OpenJ9) testing should be good based on the builds listed in #6432 (comment). @EricYangIBM Can you verify locally and confirm whether ibmruntimes/ci.docker#124 is fixed with the latest changes?
These changes seem to fix the Docker issue (-XX:+OriginalJDK8HeapSizeCompatibilityMode is the default, but the heap size is the same with or without it):
lgtm. @keithc-ca @0xdaryl, for final review and merge.
Is this ready to merge?
Yes, lgtm. Waiting for @keithc-ca and @0xdaryl to approve.
requiredSize = portLibrary->str_printf(portLibrary, NULL, 0, "/proc/%d/cgroup", pid);
Assert_PRT_true(requiredSize <= PATH_MAX);
portLibrary->str_printf(portLibrary, cgroupFilePath, sizeof(cgroupFilePath), "/proc/%d/cgroup", pid);
This could be improved (in a future pull request): there's no reason to call str_printf() twice if we are going to insist the result fit the buffer we already have. The assertion (that would go away) should use sizeof(cgroupFilePath) instead of PATH_MAX.
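A minimal sketch of the single-call approach suggested in this review comment. It assumes the standard snprintf() as a stand-in for the port library's str_printf(), and buildCgroupFilePath is a hypothetical helper name, not OMR code. The point is only that one formatted write plus a check against the real buffer size covers what the two calls and the PATH_MAX assertion did.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of OMR): format "/proc/<pid>/cgroup" into
 * buf with one snprintf() call.
 * Returns 0 on success, -1 if the path was truncated or formatting failed.
 */
static int
buildCgroupFilePath(char *buf, size_t bufSize, long pid)
{
	/* snprintf() returns the length the full string would need,
	 * excluding the terminating NUL, even when it truncates. */
	int written = snprintf(buf, bufSize, "/proc/%ld/cgroup", pid);

	if ((written < 0) || ((size_t)written >= bufSize)) {
		return -1;
	}
	return 0;
}
```

A caller would pass sizeof(cgroupFilePath) as bufSize, so the check is tied to the buffer actually in use rather than to PATH_MAX.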
@0xdaryl Please review and merge these changes.
In eclipse-omr#6432, the OMR port library started throwing an error if isRunningInContainer failed. isRunningInContainer can fail if /proc is mounted with the hidepid=2 setting on Linux (eclipse-omr#7021). This prevents a user from starting the JVM.

Before eclipse-omr#6432, no error was returned if isRunningInContainer failed, so a user was completely unaware of the failure. That behaviour can lead to performance issues if the process is running in a container, but no functional issues will be seen.

The new behaviour will not throw an error if isRunningInContainer fails, but will issue a warning message to highlight the potential performance impact.

Currently, isRunningInContainer is run from omrsysinfo_startup. Neither the trace engine nor NLS messages are enabled at this point. If there is an error in isRunningInContainer, no tracepoint or NLS message will work inside isRunningInContainer.

Invocation of isRunningInContainer is now delayed until first use. In OpenJ9, the first use still happens before the trace engine is initialized, but it happens after the NLS messages are enabled. A new NLS message has been added in eclipse-openj9/openj9#17560, which will show up as a warning when isRunningInContainer fails and highlight the potential performance impact.

The result of isRunningInContainer is cached and updated via an atomic operation to enforce data consistency. The caching helps to improve performance when isRunningInContainer is repeatedly invoked.

Four new states are introduced for PPG_isRunningInContainer to support the new changes:
- OMRPORT_RUNNING_IN_CONTAINER_UNINITIALIZED: evaluate the result of isRunningInContainer.
- OMRPORT_RUNNING_IN_CONTAINER_TRUE: inside a container.
- OMRPORT_RUNNING_IN_CONTAINER_FALSE: not in a container.
- OMRPORT_RUNNING_IN_CONTAINER_ERROR: an error was encountered while evaluating isRunningInContainer.

Related: eclipse-omr#7021

Signed-off-by: Babneet Singh <sbabneet@ca.ibm.com>
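The first-use evaluation and atomic caching described in this commit message could look roughly like the sketch below. It is an illustration under stated assumptions, not the OMR implementation: C11 atomics stand in for whatever atomic utilities OMR uses, the numeric values of the states are made up, and cachedState, probeCalls, probeContainer and getRunningInContainerState are hypothetical names.

```c
#include <stdatomic.h>

/* State names come from the commit message; the values are assumptions. */
enum {
	OMRPORT_RUNNING_IN_CONTAINER_UNINITIALIZED = 0,
	OMRPORT_RUNNING_IN_CONTAINER_TRUE = 1,
	OMRPORT_RUNNING_IN_CONTAINER_FALSE = 2,
	OMRPORT_RUNNING_IN_CONTAINER_ERROR = 3
};

/* Hypothetical stand-in for PPG_isRunningInContainer. */
static atomic_int cachedState = OMRPORT_RUNNING_IN_CONTAINER_UNINITIALIZED;

/* Counts probe invocations so the caching effect can be observed. */
static atomic_int probeCalls = 0;

/* Hypothetical probe; the real check inspects cgroup files under /proc. */
static int
probeContainer(void)
{
	atomic_fetch_add(&probeCalls, 1);
	return OMRPORT_RUNNING_IN_CONTAINER_FALSE;
}

/* Evaluate on first use, then serve the cached state on later calls. */
static int
getRunningInContainerState(void)
{
	int state = atomic_load(&cachedState);

	if (OMRPORT_RUNNING_IN_CONTAINER_UNINITIALIZED == state) {
		int expected = OMRPORT_RUNNING_IN_CONTAINER_UNINITIALIZED;
		int computed = probeContainer();

		if (atomic_compare_exchange_strong(&cachedState, &expected, computed)) {
			state = computed;
		} else {
			/* Another thread published its result first; use that one. */
			state = expected;
		}
	}
	return state;
}
```

With compare-and-swap, two threads racing on the first call may both run the probe, but only one result is published and every caller sees the same cached state afterwards. In this sketch an ERROR result is cached like any other, so a failed probe is not retried.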
- isCgroupV2Available: detects whether cgroup v2 is available on the system.
- readCgroupFile is split into populateCgroupEntryListV1 and populateCgroupEntryListV2, where the latter is added for cgroup v2 to fetch enabled subsystems from $MOUNT_POINT/cgroupName/cgroup.controllers.
- getCgroupMemoryLimit (and its helpers) for cgroup v2: read the correct controller files at MOUNT_POINT/cgroupName/ and account for the possible "max" values in these files.
- PPG_sysinfoControlFlags global: caches whether cgroup v1 or v2 is available, or whether the process is running in a container.
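To make the two cgroup v2 file formats mentioned above concrete, here is a small parsing sketch. It relies only on the documented kernel formats: cgroup.controllers is a space-separated list of controller names, and a limit file such as memory.max holds either a decimal byte count or the literal "max". The helper names hasController and parseMemoryMax are hypothetical and are not the functions added in this PR; reading the files from disk is left out.

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Returns 1 if name is a whole token in the contents of cgroup.controllers
 * (for example "cpuset cpu io memory pids\n"), otherwise 0. */
static int
hasController(const char *controllers, const char *name)
{
	size_t nameLen = strlen(name);
	const char *cursor = controllers;

	while ('\0' != *cursor) {
		size_t tokenLen = strcspn(cursor, " \n");
		if ((tokenLen == nameLen) && (0 == strncmp(cursor, name, nameLen))) {
			return 1;
		}
		cursor += tokenLen;
		cursor += strspn(cursor, " \n");
	}
	return 0;
}

/* Parses the contents of a limit file such as memory.max.
 * "max" means no limit and is reported as UINT64_MAX.
 * Returns 0 on success, -1 if the text is malformed. */
static int
parseMemoryMax(const char *text, uint64_t *limit)
{
	char *end = NULL;
	unsigned long long value = 0;

	if ((0 == strncmp(text, "max", 3)) && (('\0' == text[3]) || ('\n' == text[3]))) {
		*limit = UINT64_MAX;
		return 0;
	}
	/* Reject signs and leading whitespace that strtoull() would accept. */
	if (!isdigit((unsigned char)text[0])) {
		return -1;
	}
	errno = 0;
	value = strtoull(text, &end, 10);
	if ((0 != errno) || (('\0' != *end) && ('\n' != *end))) {
		return -1;
	}
	*limit = (uint64_t)value;
	return 0;
}
```

Mapping "max" to UINT64_MAX is one possible convention; a caller could just as well fall back to the physical memory size when no limit is set.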
Issue: #1281
Signed-off-by: Eric Yang <eric.yang@ibm.com>