Sangjin Lee commented on YARN-4224:

I'm a little late to this thread, and have just started to catch up on the 
discussion, so I might be off-base. But let me ask some questions and also 
chime in on some points.

I actually thought that Li's original proposal 
(2nd approach in that comment) was quite good. Although it can be a little 
verbose, it's strongly resource-based and very consistent. Also, I liked the 
fact that some parts of the path may be omitted (e.g. users) if it can be 
inferred from other information (which we already support). With that *single* 
rule, it can describe just about any URL here. It can't be any clearer than 

As for the proposal for creating a single token by concatenating cluster, user, 
and the flow name, is the goal basically to avoid multiple levels in the URL 
path (to aid the UI implementation)? Then what about the flow run id? For 
example, let's consider querying for all apps in a flow run. With Li's original 
proposal, it would be

(I swapped cluster and user)

With the concatenation proposal, it would become


(As Varun pointed out, there is the sticky issue of escaping the concatenation 
character, but let's set that aside for the moment) We still have "/runs/123" 
appended after that UID. Is that going to be fine with the UI implementation? 
Or does that also need to be concatenated so that we have something like


I think this has a potential of making things a lot more complicated than is 
needed. I don’t think it’s easy or desirable to flatten everything into a 
single token at all times. Also, how about cases where some part of the 
information can be omitted (e.g. user, flow name, flow run, etc.)? Then how 
should we form the UID? Would we require the user/UI to always specify all the 
parts? That may not always be feasible.
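
To make the escaping and omitted-part concerns concrete, here is a minimal sketch of what composing and splitting such a UID might involve. The delimiter `!`, the escape character `*`, and the function names are all assumptions for illustration; nothing in this thread has settled on them.

```python
# Hypothetical sketch: DELIM and ESC are assumed characters, not a decided format.
DELIM = "!"
ESC = "*"


def encode_uid(*parts):
    """Join parts into a single token, escaping ESC and DELIM inside each part."""
    escaped = []
    for part in parts:
        # Escape the escape character first so it is not escaped twice.
        part = part.replace(ESC, ESC + ESC).replace(DELIM, ESC + DELIM)
        escaped.append(part)
    return DELIM.join(escaped)


def decode_uid(uid):
    """Split a token back into its parts, honoring the escape character."""
    parts, current, i = [], [], 0
    while i < len(uid):
        ch = uid[i]
        if ch == ESC and i + 1 < len(uid):
            # The next character is a literal, whatever it is.
            current.append(uid[i + 1])
            i += 2
        elif ch == DELIM:
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    parts.append("".join(current))
    return parts


# A flow name containing the delimiter must be escaped by whoever builds the UID.
uid = encode_uid("cluster1", "alice", "my!flow")
# An omitted user still needs an explicit empty slot to keep positions stable.
uid_no_user = encode_uid("cluster1", "", "flow")
```

Even in this small form, every client that builds a UID has to know the escaping rule and the fixed position of each part, which is the kind of complexity I am worried about.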

Also, I'm not sure if I like the idea of having an end point just to return the 
UID given the bits. It would make the workflow a lot more complicated (the 
client needs this handshake before it can start querying), and I'm not sure 
what we gain by hiding that part in the server. If we were to do this, I 
think we might as well make this a public piece of information so any user or 
UI can compose it quickly. That would make things a whole lot easier to 
implement on both sides.

How about the following proposal? Can we adopt Li's original proposal and also 
support the UID-based pattern to aid the UI? The UID can be considered more 
like a short-hand notation, but would *complement (not replace)* the basic 
REST-style pattern. But we should clearly spell out under which conditions such 
concatenated UIDs are supported in order to eliminate any ambiguity (e.g. only 
"cluster+user+flowname", or "entity_type+entity_id" too). It shouldn't be too 
difficult for the server to support both modes, and we would retain most of the 
simplicity that Li's first proposal has, yet still be able to facilitate the UI 
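
As a rough illustration of supporting both modes on the server side, the sketch below resolves either style of path to the same (cluster, user, flow) context. The path shapes (`/clusters/.../users/.../flows/...` and `/flowuid/...`), the `!` delimiter, and the function name are hypothetical and are not the URLs proposed in this thread; escaping of the delimiter is ignored for brevity.

```python
def parse_flow_context(path, default_user=None):
    """Resolve a hierarchical path or a UID-style path to (cluster, user, flow).

    Both path layouts here are assumed for illustration only.
    """
    segments = [s for s in path.strip("/").split("/") if s]

    # UID shorthand: a single concatenated token after a marker segment.
    if segments[:1] == ["flowuid"] and len(segments) >= 2:
        cluster, user, flow = segments[1].split("!")
        return cluster, user, flow

    # Hierarchical style: alternating resource-name / value pairs.
    pairs = dict(zip(segments[0::2], segments[1::2]))
    # The user may be omitted when it can be inferred from other information.
    return pairs["clusters"], pairs.get("users", default_user), pairs["flows"]
```

Both branches end up with the same tuple, so the rest of the reader would not need to care which style the client used.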

What do you think?

> Change the ATSv2 reader side REST interface to conform to current REST APIs' 
> in YARN
> ------------------------------------------------------------------------------------
>                 Key: YARN-4224
>                 URL: https://issues.apache.org/jira/browse/YARN-4224
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Varun Saxena
>            Assignee: Varun Saxena
>              Labels: yarn-2928-1st-milestone
>         Attachments: YARN-4224-YARN-2928.01.patch
