[ 
https://issues.apache.org/jira/browse/YARN-7190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16166452#comment-16166452
 ] 

Jason Lowe commented on YARN-7190:
----------------------------------

Yes, many tasks probably do not need YARN jars, but some do.  Oozie launcher 
tasks, for example, need the YARN client jars to talk to the RM and AHS.  
Others may need it as well.  So we can't just remove the YARN directory 
completely from the classpath or tons of stuff will break.

This isn't a completely new problem.  In general the dist layout does nothing 
to help separate client classpaths from server classpaths.  The fact that 
there's only a single {{hadoop classpath}} command is very revealing.  Ideally 
there should be multiple options for obtaining the classpath, e.g.: {{hadoop 
classpath client}} or {{hadoop classpath client hdfs}} or {{hadoop classpath 
client hdfs yarn}} etc. so the user can better control what they need without 
having to get _everything_.  It would also be nice if the dist layout made it 
easier for users to pick up just the jars they need for their use-case, since 
cherry-picking jars manually is error-prone.  Sometimes we split the 
functionality of one jar into multiple jars, and then picking up just one of 
those jars is no longer enough.
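
As a rough illustration of the kind of scoping I mean, here is a minimal Python sketch.  It is not an existing Hadoop command or API: {{scoped_classpath}} and {{KNOWN_COMPONENTS}} are made-up names, and it assumes each component's jars sit under share/hadoop/<component> and share/hadoop/<component>/lib in the dist.

```python
import os

# Assumption: component jars live under <hadoop_home>/share/hadoop/<component>
# and its lib/ subdirectory. The names below are hypothetical, for illustration.
KNOWN_COMPONENTS = ("common", "hdfs", "yarn")


def scoped_classpath(hadoop_home, components):
    """Build a classpath string covering only the requested components."""
    entries = []
    for comp in components:
        if comp not in KNOWN_COMPONENTS:
            raise ValueError("unknown component: %s" % comp)
        base = os.path.join(hadoop_home, "share", "hadoop", comp)
        # Java expands a trailing "*" to every jar in that directory.
        entries.append(os.path.join(base, "*"))
        entries.append(os.path.join(base, "lib", "*"))
    return os.pathsep.join(entries)


if __name__ == "__main__":
    print(scoped_classpath("/opt/hadoop", ["common", "hdfs"]))
```

Note that this only scopes by component.  It still can't separate client jars from server jars, because the directories themselves don't make that distinction.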

Some of these issues are mitigated when using Maven, since we do a better job 
of splitting our Maven artifacts into pieces that can be selected based on the 
use-case.  Or the user can pull in the hadoop-client Maven dependency to 
get just the client stuff.  However that approach doesn't apply to the default 
cluster setup where clients are picking up code based on {{hadoop classpath}} 
when running as clients and {{yarn.application.classpath}} when running as 
tasks.  Those cases are pulling in whatever is in the 
share/hadoop/\{common,hdfs,yarn\} directories, and there's no distinction for 
client vs. server there.
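
To make the risk for this issue concrete, here is a small Python sketch of how one could spot a job-shipped jar that clashes with a different version already sitting in the cluster directories.  Nothing like this exists in Hadoop as far as this comment is concerned: the function names are made up, and it assumes jars are named <artifact>-<version>.jar with the version starting at the first "-<digit>", which holds for the jars listed in the description.

```python
import re

# Assumption: file names look like <artifact>-<version>.jar, where the version
# begins at the first "-" that is followed by a digit.
_JAR_RE = re.compile(r"^(?P<artifact>.+?)-(?P<version>\d.*)\.jar$")


def split_jar_name(filename):
    """Return (artifact, version), or (filename, None) if no version is found."""
    match = _JAR_RE.match(filename)
    if match is None:
        return filename, None
    return match.group("artifact"), match.group("version")


def version_conflicts(cluster_jars, user_jars):
    """Map artifact -> (cluster_version, user_version) where the two differ."""
    cluster = {}
    for name in cluster_jars:
        artifact, version = split_jar_name(name)
        if version is not None:
            cluster[artifact] = version
    conflicts = {}
    for name in user_jars:
        artifact, version = split_jar_name(name)
        if version is None or artifact not in cluster:
            continue
        if cluster[artifact] != version:
            conflicts[artifact] = (cluster[artifact], version)
    return conflicts


if __name__ == "__main__":
    cluster = ["hbase-client-1.2.6.jar", "joni-2.1.2.jar"]
    user = ["hbase-client-1.1.2.jar", "joni-2.1.2.jar"]
    print(version_conflicts(cluster, user))
```

Which copy actually wins at runtime depends on classpath ordering, so a check like this only flags the candidates for trouble.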



> Ensure only NM classpath in 2.x gets TSv2 related hbase jars, not the user 
> classpath
> ------------------------------------------------------------------------------------
>
>                 Key: YARN-7190
>                 URL: https://issues.apache.org/jira/browse/YARN-7190
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineclient, timelinereader, timelineserver
>            Reporter: Vrushali C
>
> [~jlowe] had a good observation about the user classpath getting extra jars 
> in hadoop 2.x brought in with TSv2.  If users start picking up Hadoop 2.x's 
> version of HBase jars instead of the ones they shipped with their job, it 
> could be a problem.
> So when TSv2 is to be used in 2.x, the hbase related jars should come into 
> only the NM classpath not the user classpath.
> Here is a list of some jars
> {code}
> commons-csv-1.0.jar
> commons-el-1.0.jar
> commons-httpclient-3.1.jar
> disruptor-3.3.0.jar
> findbugs-annotations-1.3.9-1.jar
> hbase-annotations-1.2.6.jar
> hbase-client-1.2.6.jar
> hbase-common-1.2.6.jar
> hbase-hadoop2-compat-1.2.6.jar
> hbase-hadoop-compat-1.2.6.jar
> hbase-prefix-tree-1.2.6.jar
> hbase-procedure-1.2.6.jar
> hbase-protocol-1.2.6.jar
> hbase-server-1.2.6.jar
> htrace-core-3.1.0-incubating.jar
> jamon-runtime-2.4.1.jar
> jasper-compiler-5.5.23.jar
> jasper-runtime-5.5.23.jar
> jcodings-1.0.8.jar
> joni-2.1.2.jar
> jsp-2.1-6.1.14.jar
> jsp-api-2.1-6.1.14.jar
> jsr311-api-1.1.1.jar
> metrics-core-2.2.0.jar
> servlet-api-2.5-6.1.14.jar
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

