Sangjin Lee commented on YARN-2556:

I'm looking at this as part of a parallel effort for the timeline service v.2 
(YARN-3437). While adopting it for the timeline service v.2, I had some review 
comments for the latest patch here.

(1) defaults
The way the defaults work is counter-intuitive and surprising. It turns out 
you have to provide most of the parameters even if you're happy with the 
defaults. For example, you cannot omit "-s". The only way to take advantage 
of a default is to pass zero or a negative value.

IMO, it would be great if the defaults were assumed unless the user explicitly 
overrides a value. In other words, the parameters should be entirely optional.
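
Something along these lines is what I have in mind. This is only a rough 
sketch, not the patch's code: the class name PerfOptions, the "-m" flag, the 
reading of "-s" as a payload size in KB, and the default values are all 
assumptions made for illustration.

```java
/** Hypothetical option holder: every field starts at its default value. */
class PerfOptions {
  // Assumed defaults for illustration only; the patch's real values may differ.
  static final int DEFAULT_NUM_MAPS = 1;
  static final int DEFAULT_PAYLOAD_KB = 1;

  int numMaps = DEFAULT_NUM_MAPS;
  int payloadKb = DEFAULT_PAYLOAD_KB;

  /** Overrides a default only when the matching flag is actually present. */
  static PerfOptions parse(String[] args) {
    PerfOptions opts = new PerfOptions();
    for (int i = 0; i < args.length; i++) {
      switch (args[i]) {
        case "-m": // hypothetical flag: number of mappers
          opts.numMaps = Integer.parseInt(requireValue(args, ++i, "-m"));
          break;
        case "-s": // assumed meaning: payload size in KB
          opts.payloadKb = Integer.parseInt(requireValue(args, ++i, "-s"));
          break;
        default:
          throw new IllegalArgumentException("Unknown option: " + args[i]);
      }
    }
    return opts;
  }

  /** Returns the value following a flag, or fails if the flag is last. */
  private static String requireValue(String[] args, int i, String flag) {
    if (i >= args.length) {
      throw new IllegalArgumentException("Missing value for " + flag);
    }
    return args[i];
  }
}
```

With this shape, PerfOptions.parse(new String[] {}) yields all defaults, and 
passing only "-s 4" changes the payload size while leaving everything else 
alone. There is no need for a zero or negative sentinel.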

(2) printUsage
There is a line break missing in printUsage (line 51). Also, related to (1), 
it would be great if the usage text stated the default for each option.
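
As a rough illustration of a usage text that ends every line properly and 
states each default, consider the sketch below. The class name UsageText, the 
"<tool>" placeholder, the "-m" flag, the meaning given to "-s", and the default 
values are hypothetical and not taken from the patch.

```java
/** Hypothetical usage-text builder; flags and defaults are illustrative. */
class UsageText {
  // Assumed defaults, lifted into constants so the help text cannot drift.
  static final int DEFAULT_NUM_MAPS = 1;
  static final int DEFAULT_PAYLOAD_KB = 1;

  /** Builds the usage message, one option per line, each with its default. */
  static String usage() {
    String nl = System.lineSeparator();
    return "Usage: <tool> [-m <maps>] [-s <size>]" + nl
        + "  -m <maps>  number of mappers (default: " + DEFAULT_NUM_MAPS + ")" + nl
        + "  -s <size>  payload size in KB (default: " + DEFAULT_PAYLOAD_KB + ")" + nl;
  }
}
```

Building the text from the same constants that the parser uses keeps the 
documented defaults and the real ones in sync.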

(3) conf option
Just noting that the conf option is not really consumed by this app, because 
it's consumed by the GenericOptionsParser. So I'm not sure why the code that 
parses the conf option exists here.

(4) constants and strings
Nit: it would be good to lift all the constants and repeated strings into 
static final variables.

(5) System.currentTimeMillis()
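System.currentTimeMillis() might not be a good choice for the timer: it 
reports wall-clock time, which can jump if the system clock is adjusted. I 
would suggest using System.nanoTime(), which is meant for measuring elapsed 
time.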
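
A minimal sketch of the timing pattern I mean is below. The ElapsedTimer 
class and its timeMillis method are hypothetical helpers, not part of the 
patch. Only System.nanoTime() and TimeUnit are real JDK APIs.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper that measures elapsed time with System.nanoTime(). */
class ElapsedTimer {
  /** Runs the work and returns how long it took, in milliseconds. */
  static long timeMillis(Runnable work) {
    long start = System.nanoTime();
    work.run();
    // Only the difference between two nanoTime() readings is meaningful.
    long elapsedNanos = System.nanoTime() - start;
    return TimeUnit.NANOSECONDS.toMillis(elapsedNanos);
  }
}
```

The absolute value of nanoTime() has no relation to the time of day, so it 
should only ever be used as a difference within the same JVM.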

(6) printing out aggregate metrics
It prints out the transaction rate and the IO rate per mapper, but it would be 
good to report the total transaction rate and IO rate as well.
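
One possible way to compute the totals is sketched below. It assumes the 
mappers run concurrently, so the aggregate rate is taken as the summed work 
divided by the longest per-mapper elapsed time. That is my assumption, not 
something the patch defines. The AggregateMetrics and MapperResult names are 
hypothetical.

```java
import java.util.List;

/** Hypothetical aggregation of per-mapper results into total rates. */
class AggregateMetrics {
  /** Per-mapper measurements; the field names are illustrative. */
  static final class MapperResult {
    final long transactions;
    final long bytes;
    final long elapsedMillis;

    MapperResult(long transactions, long bytes, long elapsedMillis) {
      this.transactions = transactions;
      this.bytes = bytes;
      this.elapsedMillis = elapsedMillis;
    }
  }

  /** Total transactions per second across all mappers. */
  static double totalTransactionsPerSec(List<MapperResult> results) {
    long total = 0;
    for (MapperResult r : results) {
      total += r.transactions;
    }
    return perSecond(total, maxElapsedMillis(results));
  }

  /** Total bytes per second across all mappers. */
  static double totalBytesPerSec(List<MapperResult> results) {
    long total = 0;
    for (MapperResult r : results) {
      total += r.bytes;
    }
    return perSecond(total, maxElapsedMillis(results));
  }

  // Assumes concurrent mappers: the slowest one bounds the wall-clock span.
  private static long maxElapsedMillis(List<MapperResult> results) {
    long max = 0;
    for (MapperResult r : results) {
      max = Math.max(max, r.elapsedMillis);
    }
    return max;
  }

  private static double perSecond(long amount, long elapsedMillis) {
    return elapsedMillis == 0 ? 0.0 : amount * 1000.0 / elapsedMillis;
  }
}
```

If the mappers are not fully concurrent, dividing by the job's overall 
wall-clock time instead would be the more honest number.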

Do let me know if you have any questions on this...

> Tool to measure the performance of the timeline server
> ------------------------------------------------------
>                 Key: YARN-2556
>                 URL: https://issues.apache.org/jira/browse/YARN-2556
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Jonathan Eagles
>            Assignee: Chang Li
>         Attachments: YARN-2556-WIP.patch, YARN-2556-WIP.patch, 
> YARN-2556.1.patch, YARN-2556.2.patch, YARN-2556.patch, yarn2556.patch, 
> yarn2556.patch, yarn2556_wip.patch
> We need to be able to understand the capacity model for the timeline server 
> to give users the tools they need to deploy a timeline server with the 
> correct capacity.
> I propose we create a mapreduce job that can measure timeline server write 
> and read performance. Transactions per second, I/O for both read and write 
> would be a good start.
> This could be done as an example or test job that could be tied into gridmix.

This message was sent by Atlassian JIRA
