[ https://issues.apache.org/jira/browse/YARN-10353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17160121#comment-17160121 ]
Jim Brennan commented on YARN-10353:
------------------------------------

Thanks for the comments [~ebadger]! I am definitely open to changing the wording to make things clearer.

The reason I did it this way was to match the style of the other CPU fields, and to keep it compact and parsable by scripts. I was thinking spaces delimit fields, and colons delimit label vs value. But it's definitely parsable either way, and the memory components use the "x of y" format, so I can live with either.

I agree that "2 of 10 vCores" is easier to read. I was just trying for a more compact format. Would {{Vcores: 2 of 10}} be better?

I used {{CPU-ms}} to make it clear what the units were (one complaint I had about the 2.8 format). It represents the total CPU time for the process tree since it started. Maybe {{Accumulated-CPU-ms}}?

Speaking of which, I've always thought the existing CPU fields in the log line are misnamed. {{CPU}} shows percent of CPU per processor (like what you would see in {{top}}), while {{CPU/core}} shows percent of CPU across all processors used by YARN. Naming is hard.

> Log vcores used and cumulative cpu in containers monitor
> --------------------------------------------------------
>
> Key: YARN-10353
> URL: https://issues.apache.org/jira/browse/YARN-10353
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: yarn
> Affects Versions: 3.4.0
> Reporter: Jim Brennan
> Assignee: Jim Brennan
> Priority: Minor
> Attachments: YARN-10353.001.patch
>
>
> We currently log the percentage/cpu and percentage/cpus-used-by-yarn in the
> Containers Monitor log. It would be useful to also log vcores used vs vcores
> assigned, and total accumulated CPU time.
> For example, currently we have an audit log that looks like this:
> {noformat}
> 2020-07-16 20:33:51,550 DEBUG [Container Monitor] ContainersMonitorImpl.audit
> (ContainersMonitorImpl.java:recordUsage(651)) - Resource usage of ProcessTree
> 809 for container-id container_1594931466123_0002_01_000007: 309.5 MB of 2 GB
> physical memory used; 2.8 GB of 4.2 GB virtual memory used CPU:143.0905
> CPU/core:35.772625
> {noformat}
> The proposal is to add two more fields to show vCores and Cumulative CPU ms:
> {noformat}
> 2020-07-16 20:33:51,550 DEBUG [Container Monitor] ContainersMonitorImpl.audit
> (ContainersMonitorImpl.java:recordUsage(651)) - Resource usage of ProcessTree
> 809 for container-id container_1594931466123_0002_01_000007: 309.5 MB of 2 GB
> physical memory used; 2.8 GB of 4.2 GB virtual memory used CPU:143.0905
> CPU/core:35.772625 vCores:2/1 CPU-ms:4180
> {noformat}
> This is a snippet of a log from one of our clusters running branch-2.8 with a
> similar change.
> {noformat}
> 2020-07-16 21:00:02,240 [Container Monitor] DEBUG
> ContainersMonitorImpl.audit: Memory usage of ProcessTree 5267 for
> container-id container_e04_1594079801456_1397450_01_001992: 1.6 GB of 2.5 GB
> physical memory used; 3.8 GB of 5.3 GB virtual memory used. CPU usage: 18 of
> 10 CPU vCores used. Cumulative CPU time: 157410
> 2020-07-16 21:00:02,269 [Container Monitor] DEBUG
> ContainersMonitorImpl.audit: Memory usage of ProcessTree 18801 for
> container-id container_e04_1594079801456_1390375_01_000019: 413.2 MB of 2.5
> GB physical memory used; 3.8 GB of 5.3 GB virtual memory used. CPU usage: 0
> of 10 CPU vCores used. Cumulative CPU time: 113830
> 2020-07-16 21:00:02,298 [Container Monitor] DEBUG
> ContainersMonitorImpl.audit: Memory usage of ProcessTree 5279 for
> container-id container_e04_1594079801456_1397450_01_001991: 2.2 GB of 2.5 GB
> physical memory used; 3.8 GB of 5.3 GB virtual memory used. CPU usage: 17 of
> 10 CPU vCores used. Cumulative CPU time: 128630
> 2020-07-16 21:00:02,339 [Container Monitor] DEBUG
> ContainersMonitorImpl.audit: Memory usage of ProcessTree 24189 for
> container-id container_e04_1594079801456_1390430_01_000415: 392.7 MB of 2.5
> GB physical memory used; 3.8 GB of 5.3 GB virtual memory used. CPU usage: 0
> of 10 CPU vCores used. Cumulative CPU time: 96060
> 2020-07-16 21:00:02,367 [Container Monitor] DEBUG
> ContainersMonitorImpl.audit: Memory usage of ProcessTree 6751 for
> container-id container_e04_1594079801456_1397923_01_003248: 1.3 GB of 3 GB
> physical memory used; 4.3 GB of 6.3 GB virtual memory used. CPU usage: 12 of
> 10 CPU vCores used. Cumulative CPU time: 116820
> 2020-07-16 21:00:02,396 [Container Monitor] DEBUG
> ContainersMonitorImpl.audit: Memory usage of ProcessTree 12138 for
> container-id container_e04_1594079801456_1397760_01_000044: 4.4 GB of 6 GB
> physical memory used; 6.9 GB of 12.6 GB virtual memory used. CPU usage: 15 of
> 10 CPU vCores used. Cumulative CPU time: 45900
> 2020-07-16 21:00:02,424 [Container Monitor] DEBUG
> ContainersMonitorImpl.audit: Memory usage of ProcessTree 101918 for
> container-id container_e04_1594079801456_1391130_01_002378: 2.4 GB of 4 GB
> physical memory used; 5.8 GB of 8.4 GB virtual memory used. CPU usage: 13 of
> 10 CPU vCores used. Cumulative CPU time: 2572390
> 2020-07-16 21:00:02,456 [Container Monitor] DEBUG
> ContainersMonitorImpl.audit: Memory usage of ProcessTree 26596 for
> container-id container_e04_1594079801456_1390446_01_000665: 418.6 MB of 2.5
> GB physical memory used; 3.8 GB of 5.3 GB virtual memory used. CPU usage: 0
> of 10 CPU vCores used. Cumulative CPU time: 101210
> {noformat}
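
To illustrate the "spaces delimit fields, colons delimit label vs value" convention from the comment, here is a minimal Python sketch of how a script might pull the CPU fields out of the proposed audit line. The function name {{parse_cpu_fields}} and the constant {{SAMPLE}} are hypothetical, the sample is the proposed log line from the description joined onto one line, and the sketch assumes the CPU fields always follow the literal text "virtual memory used". It is not part of the patch.

```python
# Proposed audit line from the issue description, unwrapped onto one line.
SAMPLE = (
    "2020-07-16 20:33:51,550 DEBUG [Container Monitor] ContainersMonitorImpl.audit "
    "(ContainersMonitorImpl.java:recordUsage(651)) - Resource usage of ProcessTree "
    "809 for container-id container_1594931466123_0002_01_000007: 309.5 MB of 2 GB "
    "physical memory used; 2.8 GB of 4.2 GB virtual memory used CPU:143.0905 "
    "CPU/core:35.772625 vCores:2/1 CPU-ms:4180"
)


def parse_cpu_fields(line):
    """Return the label:value fields that follow the memory section.

    Hypothetical helper: assumes the compact fields come after the
    literal marker "virtual memory used" and contain no spaces.
    """
    _, marker, tail = line.partition("virtual memory used")
    if not marker:
        return {}
    fields = {}
    for token in tail.split():           # spaces delimit fields
        label, colon, value = token.partition(":")
        if colon and label and value:    # colon delimits label vs value
            fields[label] = value
    return fields


if __name__ == "__main__":
    fields = parse_cpu_fields(SAMPLE)
    # Split the vCores field into the two numbers around the slash.
    first, _, second = fields["vCores"].partition("/")
    print(fields)
    print(first, second)
```

Values are kept as strings so the script does not have to guess at units. A prose format such as "CPU usage: 18 of 10 CPU vCores used." would need a different approach (for example, a regular expression), since its labels and values are separated by spaces.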