[
https://issues.apache.org/jira/browse/YARN-7190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16166452#comment-16166452
]
Jason Lowe commented on YARN-7190:
----------------------------------
Yes, many tasks probably do not need YARN jars, but some do. Oozie launcher
tasks, for example, need the YARN client jars to talk to the RM and AHS.
Others may need them as well. So we can't just remove the YARN directory
completely from the classpath or tons of stuff will break.
This isn't a completely new problem. In general the dist layout does nothing
to help separate client classpaths from server classpaths. The fact that
there's only a single {{hadoop classpath}} command is very revealing. Ideally
there should be multiple options for obtaining the classpath, e.g.: {{hadoop
classpath client}} or {{hadoop classpath client hdfs}} or {{hadoop classpath
client hdfs yarn}} etc. so users can better control what they get without
having to take _everything_. It would also be nice if the dist layout made it
easier for users to pick up the jars they need for their use-case, since
cherry-picking jars manually is error-prone. Sometimes we split the
functionality of one jar into multiple jars, and then picking up just one of
those jars is no longer enough.
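To illustrate the kind of selection I mean, here is a rough sketch that
post-filters an existing classpath string by component. It is only an
approximation of the idea: the {{hadoop classpath client ...}} options above
do not exist today, the function name filter_classpath is made up for this
example, and it assumes entries follow the share/hadoop/<component> layout
and are colon-separated.

```python
def filter_classpath(classpath, components, sep=":"):
    """Keep only classpath entries that live under share/hadoop/<component>.

    Hypothetical helper: assumes the share/hadoop/<component> dist layout.
    Entries outside that layout (e.g. the conf directory) are dropped.
    """
    wanted = set(components)
    kept = []
    for entry in classpath.split(sep):
        # Split the entry into path segments, ignoring empty ones.
        parts = [p for p in entry.split("/") if p]
        # Look for the consecutive segments share/hadoop/<wanted component>.
        for i in range(len(parts) - 2):
            if (parts[i] == "share" and parts[i + 1] == "hadoop"
                    and parts[i + 2] in wanted):
                kept.append(entry)
                break
    return sep.join(kept)


if __name__ == "__main__":
    # Example input; the /opt/hadoop prefix is an assumption for illustration.
    full = ":".join([
        "/opt/hadoop/etc/hadoop",
        "/opt/hadoop/share/hadoop/common/lib/*",
        "/opt/hadoop/share/hadoop/hdfs/*",
        "/opt/hadoop/share/hadoop/yarn/lib/*",
        "/opt/hadoop/share/hadoop/yarn/*",
    ])
    print(filter_classpath(full, ["common", "hdfs"]))
```

Note that this only narrows things by directory. It cannot tell client jars
from server jars within a directory, which is exactly the distinction the
dist layout is missing.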
Some of these issues are mitigated when using Maven, since we do a better job
of splitting our Maven artifacts into pieces that can be selected based on the
use-case. Or the user can simply pull in the hadoop-client Maven dependency to
get just the client stuff. However, that approach doesn't apply to the default
cluster setup where clients are picking up code based on {{hadoop classpath}}
when running as clients and {{yarn.application.classpath}} when running as
tasks. Those cases are pulling in whatever is in the
share/hadoop/\{common,hdfs,yarn\} directories, and there's no distinction for
client vs. server there.
> Ensure only NM classpath in 2.x gets TSv2 related hbase jars, not the user
> classpath
> ------------------------------------------------------------------------------------
>
> Key: YARN-7190
> URL: https://issues.apache.org/jira/browse/YARN-7190
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineclient, timelinereader, timelineserver
> Reporter: Vrushali C
>
> [~jlowe] had a good observation about the user classpath getting extra jars
> in hadoop 2.x brought in with TSv2. If users start picking up Hadoop 2.x's
> version of HBase jars instead of the ones they shipped with their job, it
> could be a problem.
> So when TSv2 is to be used in 2.x, the hbase related jars should come into
> only the NM classpath not the user classpath.
> Here is a list of some jars
> {code}
> commons-csv-1.0.jar
> commons-el-1.0.jar
> commons-httpclient-3.1.jar
> disruptor-3.3.0.jar
> findbugs-annotations-1.3.9-1.jar
> hbase-annotations-1.2.6.jar
> hbase-client-1.2.6.jar
> hbase-common-1.2.6.jar
> hbase-hadoop2-compat-1.2.6.jar
> hbase-hadoop-compat-1.2.6.jar
> hbase-prefix-tree-1.2.6.jar
> hbase-procedure-1.2.6.jar
> hbase-protocol-1.2.6.jar
> hbase-server-1.2.6.jar
> htrace-core-3.1.0-incubating.jar
> jamon-runtime-2.4.1.jar
> jasper-compiler-5.5.23.jar
> jasper-runtime-5.5.23.jar
> jcodings-1.0.8.jar
> joni-2.1.2.jar
> jsp-2.1-6.1.14.jar
> jsp-api-2.1-6.1.14.jar
> jsr311-api-1.1.1.jar
> metrics-core-2.2.0.jar
> servlet-api-2.5-6.1.14.jar
> {code}