Hi Junping,

Thanks for the good suggestion.
> However, my concern to release it in 3.0.0-alpha (even as an alpha
> feature) is we haven't provided any security support in ATS v2 yet.
> Enabling this feature without understanding the risk here could be a
> disaster to end users (even in a test cluster).

You're right. Can we document and clarify that this is still "alpha 1" and
that it has no security features yet? I also think ATS 1.5 does support
security, so it is suitable for production use - we should document that
officially as well.
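For anyone who wants to experiment safely, here is a minimal sketch of the
switches involved in turning the feature on. The property and class names
below are assumptions taken from the YARN-2928 branch documentation, so
please verify them against the merged code:

  // Minimal sketch: enabling Timeline Service v.2 on a test cluster.
  // ATSv2 stays off unless explicitly enabled, so an unmodified cluster
  // is unaffected by the merge. Property/class names are assumptions
  // from the branch docs; verify before use.
  import org.apache.hadoop.yarn.conf.YarnConfiguration;

  public class AtsV2EnableSketch {
    public static void main(String[] args) {
      YarnConfiguration conf = new YarnConfiguration();
      conf.setBoolean("yarn.timeline-service.enabled", true);
      // Selects the v.2 code path ("2.0f" in the branch docs).
      conf.set("yarn.timeline-service.version", "2.0f");
      // The per-node collectors (distributed writers) run as an NM
      // auxiliary service:
      conf.set("yarn.nodemanager.aux-services",
          "mapreduce_shuffle,timeline_collector");
      conf.set("yarn.nodemanager.aux-services.timeline_collector.class",
          "org.apache.hadoop.yarn.server.timelineservice.collector"
              + ".PerNodeTimelineCollectorsAuxService");
      System.out.println("ATSv2 enabled: "
          + conf.getBoolean("yarn.timeline-service.enabled", false));
    }
  }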
Thanks,
- Tsuyoshi

On Thu, Jun 23, 2016 at 5:45 PM, 俊平堵 <[email protected]> wrote:
> Big +1 on merging ATS-v2 to trunk. However, my concern to release it in
> 3.0.0-alpha (even as an alpha feature) is we haven't provided any
> security support in ATS v2 yet. Enabling this feature without
> understanding the risk here could be a disaster to end users (even in a
> test cluster).
>
> Kudos to everyone who contributed patches, including Sangjin, Li,
> Vrushali, Naga, Varun, Joep, and Zhijie.
>
> Thanks,
>
> Junping
>
> 2016-06-23 13:32 GMT-07:00 Sangjin Lee <[email protected]>:
>>
>> Thanks folks for the good discussion!
>>
>> I'm going to keep it open for a few more days as I'd love to get
>> feedback from more people. I am thinking of opening a voting thread
>> right after the Hadoop Summit next week if there are no objections.
>> Thanks!
>>
>> Regards,
>> Sangjin
>>
>> On Tue, Jun 21, 2016 at 9:51 PM, Li Lu <[email protected]> wrote:
>>
>> > I agree that having non-HBase impls may attract more potential users
>> > to ATS. Actually, I remember we do have some JIRAs for HDFS
>> > implementations. With regard to aggregation, yes, if there are more
>> > options for storage implementations we really need to find some ways
>> > to describe their implications for different kinds of aggregations.
>> >
>> > +1 for the idea of some group chats! The break after the ATS talk may
>> > be a good candidate?
>> >
>> > Li Lu
>> >
>> > On Jun 21, 2016, at 21:28, Karthik Kambatla <[email protected]> wrote:
>> >
>> > The reasons for my asking about alternate implementations: (1) ease
>> > of trying it out for YARN devs, and of iterating on bug fixes and
>> > improvements; and (2) ease of trying it out for app writers/users to
>> > figure out whether they should use the ATS. Again, personally, I
>> > don't see this as necessary for the merge itself, but more so for
>> > adoption.
>> >
>> > A test implementation would be enough for #1, and would partially
>> > address #2. A more substantial implementation would be nice, but I
>> > guess we need to look at the ROI to decide whether adding that is a
>> > good idea.
>> >
>> > On completeness, I agree. Further, for some backend implementations,
>> > it is possible that a particular aggregation/query is supported but
>> > too expensive to turn on. What are your thoughts on provisions for
>> > the admin to turn off some queries/aggregations?
>> >
>> > Orthogonal: is there interest here in catching up on ATS specifically
>> > on one of the days? Maybe during the breaks or after the sessions?
>> >
>> > On Tue, Jun 21, 2016 at 6:15 PM, Li Lu <[email protected]> wrote:
>> >
>> >> HDFS or other non-HBase implementations are very helpful. We didn't
>> >> focus on those implementations in the first milestone because we
>> >> wanted to have one working version as a starting point. We can
>> >> certainly add more implementations when the feature gets more
>> >> mature.
>> >>
>> >> That said, one of my concerns when building these storage
>> >> implementations is "completeness". We have added a lot of support
>> >> for data aggregation. As of today, part of the aggregation (flow run
>> >> aggregation) may be performed by HBase coprocessors. When
>> >> implementing comparable storage impls, it is worth noting that one
>> >> may want to provide equivalent mechanisms to perform those
>> >> aggregations (to make an implementation "complete enough", or
>> >> "interchangeable" with the existing HBase impl).
>> >>
>> >> Li Lu
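To make the "equivalent mechanisms" point above concrete, here is a rough,
illustrative sketch of the kind of flow-run rollup the HBase coprocessor
performs and that a non-HBase implementation would need to reproduce. This
is not the actual coprocessor code; the names and SUM-only semantics are
simplifications for illustration:

  import java.util.HashMap;
  import java.util.Map;

  // Illustration only: roll application-level metrics up to the
  // flow-run level. The real aggregation supports more than plain sums.
  public class FlowRunRollupSketch {
    static Map<String, Long> aggregate(Iterable<Map<String, Long>> apps) {
      Map<String, Long> flowRunMetrics = new HashMap<>();
      for (Map<String, Long> appMetrics : apps) {
        for (Map.Entry<String, Long> e : appMetrics.entrySet()) {
          // Sum each metric across all applications in the flow run.
          flowRunMetrics.merge(e.getKey(), e.getValue(), Long::sum);
        }
      }
      return flowRunMetrics;
    }
  }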
>> >> > On Jun 21, 2016, at 15:51, Sangjin Lee <[email protected]> wrote:
>> >> >
>> >> > Thanks Karthik and Tsuyoshi. Regarding alternate implementations,
>> >> > I'd like to get a better sense of what you're thinking of. Are you
>> >> > interested in strictly a test implementation (e.g., perfectly fine
>> >> > in a single-node setup) or a more substantial implementation (one
>> >> > that may not scale but needs to work in a more realistic setup)?
>> >> >
>> >> > Regards,
>> >> > Sangjin
>> >> >
>> >> > On Tue, Jun 21, 2016 at 2:51 PM, J. Rottinghuis
>> >> > <[email protected]> wrote:
>> >> >
>> >> >> Thanks Karthik and Tsuyoshi for bringing up good points.
>> >> >>
>> >> >> I've opened https://issues.apache.org/jira/browse/YARN-5281 to
>> >> >> track this discussion and capture all the merits and challenges
>> >> >> in a single place.
>> >> >>
>> >> >> Thanks,
>> >> >>
>> >> >> Joep
>> >> >>
>> >> >> On Tue, Jun 21, 2016 at 8:21 AM, Tsuyoshi Ozawa <[email protected]>
>> >> >> wrote:
>> >> >>
>> >> >>> Thanks Sangjin for starting the discussion.
>> >> >>>
>> >> >>>>> *First*, if the merge vote is approved, to which branch should
>> >> >>>>> this be merged and what would be the release version?
>> >> >>>
>> >> >>> As you mentioned, I think it's reasonable for us to target trunk
>> >> >>> and 3.0.0-alpha.
>> >> >>>
>> >> >>>>> Slightly unrelated to the merge, do we plan to support any
>> >> >>>>> other simpler backend for users to try out, in addition to
>> >> >>>>> HBase? LevelDB?
>> >> >>>> We could, however, potentially change the local-file-system-based
>> >> >>>> implementation into an HDFS-based implementation and offer it
>> >> >>>> as an alternative for non-production use,
>> >> >>>
>> >> >>> At Apache Big Data 2016 NA, some users also mentioned that they
>> >> >>> need an HDFS implementation. It is currently pending, but Varun
>> >> >>> and I have worked on supporting an HDFS backend (YARN-3874). As
>> >> >>> Karthik mentioned, it's useful for early users to try the v.2
>> >> >>> APIs even though it doesn't scale. IMHO, it's useful for small
>> >> >>> clusters (e.g., fewer than 10 machines). After merging the
>> >> >>> current implementation into trunk, I'm interested in resuming
>> >> >>> the YARN-3874 work (maybe Varun is interested as well).
>> >> >>>
>> >> >>> Regards,
>> >> >>> - Tsuyoshi
>> >> >>>
>> >> >>> On Tue, Jun 21, 2016 at 5:07 PM, Varun saxena
>> >> >>> <[email protected]> wrote:
>> >> >>>> Thanks Karthik for sharing your views.
>> >> >>>>
>> >> >>>> With regards to merging, it would help to have clear
>> >> >>>> documentation on how to set up and use ATS.
>> >> >>>> --> We do have documentation on this. You and others who are
>> >> >>>> interested can check out YARN-5174, which is the latest
>> >> >>>> documentation-related JIRA for ATSv2.
>> >> >>>>
>> >> >>>> Slightly unrelated to the merge, do we plan to support any
>> >> >>>> other simpler backend for users to try out, in addition to
>> >> >>>> HBase? LevelDB?
>> >> >>>> --> We do have a File System based implementation, but it is
>> >> >>>> strictly for test purposes (as we write data into a local
>> >> >>>> file), and it does not support all the features of Timeline
>> >> >>>> Service v.2.
>> >> >>>> Regarding LevelDB, Timeline Service v.2 has distributed
>> >> >>>> writers, and LevelDB writes data (log files or SSTable files)
>> >> >>>> to the local file system. This means there is no easy way to
>> >> >>>> build a LevelDB-based implementation, because we would not know
>> >> >>>> where to read the data from, especially while fetching
>> >> >>>> flow-level information.
>> >> >>>> We could, however, potentially change the local-file-system-based
>> >> >>>> implementation into an HDFS-based implementation and offer it
>> >> >>>> as an alternative for non-production use, if community feedback
>> >> >>>> shows a need for it. That, however, would have to be discussed
>> >> >>>> further with the team.
>> >> >>>>
>> >> >>>> Regards,
>> >> >>>> Varun Saxena.
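As an aside, swapping storage implementations of the kind Varun describes
comes down to the pluggable writer/reader classes. A minimal sketch
follows; the property and class names are taken from the YARN-2928 branch
and may change, so treat them as assumptions to verify:

  import org.apache.hadoop.yarn.conf.YarnConfiguration;

  // Sketch: selecting the timeline storage implementation.
  public class StorageSelectionSketch {
    public static void main(String[] args) {
      YarnConfiguration conf = new YarnConfiguration();
      // Default: HBase-backed storage.
      conf.set("yarn.timeline-service.writer.class",
          "org.apache.hadoop.yarn.server.timelineservice.storage"
              + ".HBaseTimelineWriterImpl");
      // Test-only alternative that writes to the local file system; an
      // HDFS-backed writer (YARN-3874) would plug in the same way.
      // conf.set("yarn.timeline-service.writer.class",
      //     "org.apache.hadoop.yarn.server.timelineservice.storage"
      //         + ".FileSystemTimelineWriterImpl");
    }
  }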
>> >> >>>>
>> >> >>>> -----Original Message-----
>> >> >>>> From: Karthik Kambatla [mailto:[email protected]]
>> >> >>>> Sent: 21 June 2016 10:29
>> >> >>>> To: Sangjin Lee
>> >> >>>> Cc: [email protected]
>> >> >>>> Subject: Re: [DISCUSS] merging YARN-2928 (Timeline Service v.2)
>> >> >>>> to trunk
>> >> >>>>
>> >> >>>> Firstly, thanks Sangjin and others for driving this major
>> >> >>>> feature.
>> >> >>>>
>> >> >>>> Merging to trunk and including it in 3.0.0-alpha1 seems
>> >> >>>> reasonable, as it will give early access to downstream users.
>> >> >>>>
>> >> >>>> With regards to merging, it would help to have clear
>> >> >>>> documentation on how to set up and use ATS.
>> >> >>>>
>> >> >>>> Slightly unrelated to the merge, do we plan to support any
>> >> >>>> other simpler backend for users to try out, in addition to
>> >> >>>> HBase? LevelDB? I understand this wouldn't scale, but would it
>> >> >>>> help with initial adoption and feedback from early users?
>> >> >>>>
>> >> >>>> On Mon, Jun 20, 2016 at 10:26 AM, Sangjin Lee <[email protected]>
>> >> >>>> wrote:
>> >> >>>>
>> >> >>>>> Hi all,
>> >> >>>>>
>> >> >>>>> I'd like to open a discussion on merging the Timeline Service
>> >> >>>>> v.2 feature to trunk (YARN-2928 and MAPREDUCE-6331) [1][2]. We
>> >> >>>>> have been developing the feature in a feature branch
>> >> >>>>> (YARN-2928 [3]) for a while, and we are reasonably confident
>> >> >>>>> that the state of the feature meets the criteria to be merged
>> >> >>>>> onto trunk. We'd love folks to get their hands on it and
>> >> >>>>> provide valuable feedback so that we can make it
>> >> >>>>> production-ready.
>> >> >>>>>
>> >> >>>>> In a nutshell, Timeline Service v.2 delivers significant
>> >> >>>>> scalability and usability improvements based on a new
>> >> >>>>> architecture. You can browse the requirements/design doc, the
>> >> >>>>> storage schema doc, the new entity/data model, the YARN
>> >> >>>>> documentation, and also discussions on subsequent milestones
>> >> >>>>> on YARN-2928 [1].
>> >> >>>>>
>> >> >>>>> What we would like to merge to trunk is termed "alpha 1"
>> >> >>>>> (milestone 1). The feature has a complete end-to-end
>> >> >>>>> read/write flow, and you should be able to start setting it up
>> >> >>>>> and testing it. At a high level, the following are the key
>> >> >>>>> features that have been implemented:
>> >> >>>>>
>> >> >>>>> - distributed writers (collectors) as NM aux services
>> >> >>>>> - HBase storage
>> >> >>>>> - new entity model that includes flows
>> >> >>>>> - setting the flow context via YARN app tags (sketched below)
>> >> >>>>> - real-time metrics aggregation to the application level and
>> >> >>>>> the flow level
>> >> >>>>> - rich REST API that supports filters, complex conditionals,
>> >> >>>>> limits, content selection, etc.
>> >> >>>>> - YARN generic events and system metrics
>> >> >>>>> - integration with Distributed Shell and MapReduce
>> >> >>>>>
>> >> >>>>> There are a total of 139 subtasks that were completed as part
>> >> >>>>> of this effort.
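To illustrate the flow-context and REST items above, here is a minimal
sketch of how a client might attach a flow context when submitting an
application. The TIMELINE_FLOW_*_TAG prefixes and the reader endpoint
shape are assumptions taken from the branch documentation; verify them
against the merged code:

  import java.util.HashSet;
  import java.util.Set;
  import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
  import org.apache.hadoop.yarn.util.Records;

  public class FlowContextSketch {
    public static void main(String[] args) {
      ApplicationSubmissionContext ctx =
          Records.newRecord(ApplicationSubmissionContext.class);
      // The collector derives the flow context from these app tags.
      Set<String> tags = new HashSet<>();
      tags.add("TIMELINE_FLOW_NAME_TAG:frequent_hive_query");
      tags.add("TIMELINE_FLOW_VERSION_TAG:1.0");
      tags.add("TIMELINE_FLOW_RUN_ID_TAG:1466540641000");
      ctx.setApplicationTags(tags);
      // A reader query against the v.2 REST API might then look like:
      //   GET http://<reader-host>:8188/ws/v2/timeline/users/<user>/flows
    }
  }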
>> >> >>>>> We paid close attention to ensuring that Timeline Service v.2
>> >> >>>>> does not impact existing functionality when disabled (which is
>> >> >>>>> the default).
>> >> >>>>>
>> >> >>>>> I'd like to call out a couple of things to discuss in
>> >> >>>>> particular.
>> >> >>>>>
>> >> >>>>> *First*, if the merge vote is approved, to which branch should
>> >> >>>>> this be merged and what would be the release version? My
>> >> >>>>> preference is that *it would be merged to branch "trunk" and
>> >> >>>>> be part of 3.0.0-alpha1* if approved. Since 3.0.0-alpha1 is
>> >> >>>>> actively in progress, I wanted to get your thoughts on this.
>> >> >>>>>
>> >> >>>>> *Second*, Timeline Service v.2 introduces a dependency on
>> >> >>>>> HBase from YARN. It is not a circular dependency (as HBase
>> >> >>>>> does not really depend on YARN). However, the version of
>> >> >>>>> Hadoop that HBase currently supports lags behind the Hadoop
>> >> >>>>> version that Timeline Service is based on, so there is a
>> >> >>>>> potential for subtle dependency conflicts. We made some
>> >> >>>>> efforts to isolate the issue (see [4] and [5]). The HBase
>> >> >>>>> folks have also been responsive in keeping up with trunk as
>> >> >>>>> much as they can. Nonetheless, this is something to keep in
>> >> >>>>> mind.
>> >> >>>>>
>> >> >>>>> I would love to get your thoughts on these and more before we
>> >> >>>>> open a real voting thread. Thanks!
>> >> >>>>>
>> >> >>>>> Regards,
>> >> >>>>> Sangjin
>> >> >>>>>
>> >> >>>>> [1] YARN-2928: https://issues.apache.org/jira/browse/YARN-2928
>> >> >>>>> [2] MAPREDUCE-6331:
>> >> >>>>> https://issues.apache.org/jira/browse/MAPREDUCE-6331
>> >> >>>>> [3] YARN-2928 commits:
>> >> >>>>> https://github.com/apache/hadoop/commits/YARN-2928
>> >> >>>>> [4] YARN-5045: https://issues.apache.org/jira/browse/YARN-5045
>> >> >>>>> [5] YARN-5071: https://issues.apache.org/jira/browse/YARN-5071

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
