[
https://issues.apache.org/jira/browse/YARN-3539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14518163#comment-14518163
]
Zhijie Shen commented on YARN-3539:
-----------------------------------
Steve, thanks for consolidating the patch. Here are some of my comments and
thoughts.
bq. What is essential is that all the existing operations must not change, so
that shipping applications do not break.
Yeah, we can retain the v1 APIs (this is actually what we're doing now), but
the problem is around "do not break". Does it mean ATS v2 should be compatible
with the v1 APIs? In other words, do we support a user's old app using the v1
client to talk to a v2 server?
bq. it is critical to declare that ATSv1 is stable. Without that guarantee, it
is impossible for any application to commit to using the APIs.
bq. Spark depends on this for the SPARK-1537 feature, some ongoing work with
Accumulo depends on this, when Slider adds ATS support we'll depend on this
stability guarantee, etc, etc.
I fully understand the desirability of stable APIs. However, I can see that TEZ
and Hive/Pig on TEZ started integrating with the service even without our
declaring the APIs stable. Although the APIs were not declared stable, that
didn't mean we kept changing them from release to release. In reality, the
timeline API has stayed almost compatible since 2.4. Marking it as unstable
was more about reserving the right to change it to improve the service. So I'm
not sure this is good timing: as we foresee, in the near future we're going to
upgrade to ATS v2, which may significantly rework the APIs.
bq. One area that is not covered in the ATSv1 API is what constitutes a valid
entity type or domain?.
Do you mean the mandatory fields? For an entity, they're type, id and starttime
(which can be optional if the entity contains at least one event). For an
event, they are type and timestamp. For a domain, it's id.
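To make the rule above concrete, here is a minimal sketch of a check for the
mandatory fields. It is only an illustration: the dictionary keys ("type",
"id", "starttime", "events", "timestamp") follow the wording of this comment
and are assumptions, not necessarily the key names used in the REST JSON, and
the function names are hypothetical.

```python
def missing_entity_fields(entity):
    """Return the mandatory fields absent from an entity given as a dict.

    Key names are illustrative assumptions, not the server's JSON schema.
    """
    missing = [k for k in ("type", "id") if not entity.get(k)]
    events = entity.get("events") or []
    # starttime may be omitted when the entity carries at least one event
    if entity.get("starttime") is None and not events:
        missing.append("starttime")
    # every event needs its own type and timestamp
    for i, event in enumerate(events):
        for k in ("type", "timestamp"):
            if event.get(k) is None:
                missing.append(f"events[{i}].{k}")
    return missing


def missing_domain_fields(domain):
    """A domain only needs an id."""
    return [] if domain.get("id") else ["id"]
```

An entity without starttime passes this check only when it has an event, which
is the one conditional part of the rule.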
bq. There is also the fact that the /domain path was added under
/ws/v1/timeline/, so matches the path of entity types. Can you have an entity
type called "domain"? Was it previously possible?
We cannot. "timeline/domain" has blocked the entity type "domain" since the
domain feature was added. I think we should state this in the documentation
(perhaps we want to reserve more names for future use). Other than this, I
don't think we should impose any other constraint on naming identifiers.
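The shadowing can be sketched as follows. This is a simplified model of path
dispatch, not the server's actual routing code: the function name, the return
values and the exact-match (case-sensitive) comparison are assumptions, and
"MY_APP_TYPE" is a made-up entity type.

```python
TIMELINE_ROOT = "/ws/v1/timeline/"
# Only "domain" is known to be taken; more names might be reserved later.
RESERVED_SEGMENTS = {"domain"}


def route(path):
    """Classify a request path as the domain API or an entity-type lookup."""
    if not path.startswith(TIMELINE_ROOT):
        return None
    segment = path[len(TIMELINE_ROOT):].split("/", 1)[0]
    if not segment:
        return None
    if segment in RESERVED_SEGMENTS:
        # The reserved segment wins, so an entity type with this name
        # can never be reached through the entity path.
        return ("domain-api", segment)
    return ("entity-type", segment)


print(route("/ws/v1/timeline/domain/d1"))
print(route("/ws/v1/timeline/MY_APP_TYPE/e1"))
```

Any name added to the reserved set later would shadow existing entity types in
the same way, which is why it is worth documenting the set up front.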
bq. strictly defining what constitutes a valid entity type via a regular
expression, and declaring whether the types are case sensitive.
This is a good idea. We can define the char set and the pattern to prevent
users from defining arbitrary names, but I'm not sure it is easy to put into
practice. The question is whether we're going to break existing users who
have already defined names that won't match our future regex.
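One way to gauge that risk before committing to a pattern would be to run a
candidate regex over the entity types already in use. The sketch below is
purely illustrative: the pattern is a hypothetical candidate, not a proposal
from the patch, and the sample names are invented.

```python
import re

# Hypothetical candidate: a letter, then letters, digits, "_", "." or "-".
# It is case sensitive, so "Foo" and "foo" would be distinct types.
CANDIDATE_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_.-]*")


def nonconforming(names, pattern=CANDIDATE_PATTERN):
    """Return the names that the candidate pattern would reject."""
    return [name for name in names if pattern.fullmatch(name) is None]


# Invented sample of names that might already exist in a store.
existing = ["MY_TYPE", "my type", "9lives", "a.b-c"]
print(nonconforming(existing))
```

Any name returned here belongs to a user who would break if the pattern were
enforced on writes.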
Some comments about the patch:
1. For the bullet points of "Current Status and Future Plans", can we organize
them a bit better? For example, we could partition them into two groups: a)
current status and b) future plans. For bullet 4, it's not just history, but
all timeline data.
2. Can we move "Timeline Server REST API" section before "Generic Data REST
APIs"?
3. The application elements table seems to be wrongly formatted. I think
that's why site compilation fails.
4. The "Generic Data REST APIs" output examples need to be slightly updated.
Some more fields have been added or changed.
5. The "Timeline Server REST API" output examples are not genuine. Perhaps we
can run a simple MR example job, and get the up-to-date timeline entity and
application info to show as the examples.
One additional thing that is not covered by the documentation is entity
uniqueness. In v1, an entity is globally identified by <type, id>. This means
that if user1 has posted <type1, id1> in his application, user2 cannot post an
entity with the same identifier in his application, even if the two are
completely unrelated. Therefore, users are advised to come up with a unique
entity type for their framework to avoid namespace collisions.
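A toy model of that rule, assuming nothing about how the real store rejects
the second post (the class, its method and the "FRAMEWORK_B_" prefix are all
hypothetical):

```python
class TimelineStoreSketch:
    """In-memory stand-in where (type, id) is the global entity key."""

    def __init__(self):
        self._owners = {}  # (entity_type, entity_id) -> first poster

    def put(self, user, entity_type, entity_id):
        """Return True if the post is accepted, False on a collision."""
        key = (entity_type, entity_id)
        # The first poster claims the key; later posts must be the same user.
        owner = self._owners.setdefault(key, user)
        return owner == user


store = TimelineStoreSketch()
print(store.put("user1", "type1", "id1"))              # first post claims the key
print(store.put("user2", "type1", "id1"))              # same <type, id>, other user
print(store.put("user2", "FRAMEWORK_B_type1", "id1"))  # framework-specific type
```

Giving the type a framework-specific name moves the second user into a separate
key space, which is the workaround suggested above.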
> Compatibility doc to state that ATS v1 is a stable REST API
> -----------------------------------------------------------
>
> Key: YARN-3539
> URL: https://issues.apache.org/jira/browse/YARN-3539
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: documentation
> Affects Versions: 2.7.0
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Attachments: HADOOP-11826-001.patch, HADOOP-11826-002.patch,
> YARN-3539-003.patch, YARN-3539-004.patch
>
>
> The ATS v2 discussion and YARN-2423 have raised the question: "how stable are
> the ATSv1 APIs"?
> The existing compatibility document actually states that the History Server
> is [a stable REST
> API|http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Compatibility.html#REST_APIs],
> which effectively means that ATSv1 has already been declared as a stable API.
> Clarify this by patching the compatibility document appropriately
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)