Re: How to debug hadoop(or YARN) locally?

Allen Zhang Wed, 23 Dec 2015 22:50:30 -0800

Great!  Absolutely a better approach. Thanks







At 2015-12-23 01:49:08, "Chris Nauroth" <[email protected]> wrote:
>If you want the capability to run live pseudo-distributed and deploy code
>changes without doing a full distro tarball build, then you can control
>the classpath by setting a few more environment variables in
>hadoop-env.sh.  Here is an example of what I'm doing in one of my dev
>environments.
>
>export HADOOP_USER_CLASSPATH_FIRST=1
>HADOOP_REPO=~/git/hadoop
>export HADOOP_CLASSPATH=$HADOOP_REPO/hadoop-common-project/hadoop-common/target/classes:$HADOOP_REPO/hadoop-hdfs-project/hadoop-hdfs-client/target/classes:$HADOOP_REPO/hadoop-hdfs-project/hadoop-hdfs/target/classes
>
>
>Setting HADOOP_CLASSPATH adds additional paths to the classpath before the
>shell launches the JVM.  In my case, I have the source checked out to
>~/git/hadoop, and I point to the target/classes sub-directories for the
>individual sub-modules that I want to override and test.  Then, I can make
>code changes, run "mvn compile" in the sub-module directory, and restart
>the daemons.
>
>By default, the HADOOP_CLASSPATH entries are added at the end of the
>standard classpath.  Setting HADOOP_USER_CLASSPATH_FIRST=1 changes that
>behavior so that the custom entries are first.  This way, my built code
>changes override the classes that were bundled in the tarball distro.
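
As a rough sketch of the same idea, the classpath could be assembled from a list of module directories instead of being typed as one long string. The helper name build_classpath is hypothetical (it is not part of Hadoop), the module list simply repeats the three modules from the example above, and the sketch assumes hadoop-env.sh is sourced by a POSIX-compatible shell.

```shell
# Assumption: the source tree lives here unless HADOOP_REPO is already set.
HADOOP_REPO="${HADOOP_REPO:-$HOME/git/hadoop}"

# Hypothetical helper: join "<repo>/<module>/target/classes" entries with ':'.
# The body runs in a subshell so its variables do not leak into hadoop-env.sh.
build_classpath() (
  result=""
  for module in "$@"; do
    result="${result:+$result:}$HADOOP_REPO/$module/target/classes"
  done
  printf '%s\n' "$result"
)

# Put the custom entries ahead of the classes bundled in the tarball distro.
export HADOOP_USER_CLASSPATH_FIRST=1
HADOOP_CLASSPATH="$(build_classpath \
  hadoop-common-project/hadoop-common \
  hadoop-hdfs-project/hadoop-hdfs-client \
  hadoop-hdfs-project/hadoop-hdfs)"
export HADOOP_CLASSPATH
```

Adding another sub-module to override then only means adding one more line to the list, followed by "mvn compile" in that sub-module and a daemon restart as described above.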
>
>--Chris Nauroth
>
>
>
>
>On 12/21/15, 7:29 PM, "Allen Zhang" <[email protected]> wrote:
>
>>
>>oh, so cool. awesome. Thanks
>>
>>
>>
>>
>>
>>
>>
>>At 2015-12-22 11:01:55, "Jeff Zhang" <[email protected]> wrote:
>>>If you want to change the YARN internal code, you can use MiniYARNCluster
>>>for testing.
>>>
>>>https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/MiniYARNCluster.java
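
One possible way to step through a test that uses MiniYARNCluster is Maven Surefire's debug switch, which makes the forked test JVM wait for a debugger on port 5005. The sketch below is only an illustration: debug_test is a hypothetical helper, and the module path and test class passed to it are placeholders to be replaced with a real test from the source tree.

```shell
# Hypothetical helper: run one test class in a module with Surefire's debug
# switch. The MVN variable only exists so the command can be swapped out;
# it defaults to the real mvn.
debug_test() (
  cd "$1" || exit 1
  "${MVN:-mvn}" test -Dtest="$2" -Dmaven.surefire.debug
)

# Example call (placeholders, not run here):
#   debug_test path/to/module SomeTestClass
# Then, from another terminal or an IDE, attach to the waiting JVM:
#   jdb -attach localhost:5005
```

Because the test JVM stays suspended until a debugger attaches, breakpoints set in the YARN code are in place before the mini cluster starts.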
>>>
>>>On Tue, Dec 22, 2015 at 10:00 AM, Allen Zhang <[email protected]> wrote:
>>>
>>>>
>>>>
>>>> So, does that mean that if I change or add some code, I have to
>>>> re-tarball the whole project using "mvn clean package -Pdist
>>>> -DskipTests -Dtar", and then deploy it somewhere to remote debug?  If
>>>> yes, I think that is quite inconvenient.  If no, can you explain more
>>>> about this approach?
>>>>
>>>>
>>>> Thanks,
>>>> Allen
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> At 2015-12-22 08:55:01, "Jeff Zhang" <[email protected]> wrote:
>>>> >+1 for Chris, remote debug will help you.
>>>> >
>>>> >On Tue, Dec 22, 2015 at 1:54 AM, Chris Nauroth <[email protected]> wrote:
>>>> >
>>>> >> If you're running the Hadoop daemons in pseudo-distributed mode (all the
>>>> >> daemons running as separate processes, but on a single dev host), then
>>>> >> another option is to launch the daemon's JVM with the JDWP arguments and
>>>> >> attach a "remote" debugger.  This can be either the jdb CLI debugger that
>>>> >> ships with the JDK or a fancier IDE like Eclipse or IntelliJ.
>>>> >>
>>>> >> Each daemon's JVM arguments are controlled with an environment variable
>>>> >> suffixed with "_OPTS" defined in files named *-env.sh.  For example, in
>>>> >> hadoop-env.sh, you could set something like this to enable remote
>>>> >> debugging for the NameNode process:
>>>> >>
>>>> >> export HADOOP_NAMENODE_OPTS="-agentlib:jdwp=transport=dt_socket,server=y,address=8000,suspend=n $HADOOP_NAMENODE_OPTS"
>>>> >>
>>>> >>
>>>> >> Then, you can run "jdb -attach localhost:8000" to attach the debugger, or
>>>> >> do the equivalent in your IDE of choice.
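
Since the original question was about YARN, the same pattern could presumably be applied to the YARN daemons. The sketch below assumes the Hadoop 2.x variable names YARN_RESOURCEMANAGER_OPTS and YARN_NODEMANAGER_OPTS in yarn-env.sh; jdwp_opts is a hypothetical helper, and ports 8001 and 8002 are arbitrary choices that only need to be free and distinct per daemon.

```shell
# Hypothetical helper: print the JDWP agent argument for a port.
# Second argument is the suspend flag: n (default) or y.
jdwp_opts() (
  port="$1"
  suspend="${2:-n}"
  printf '%s\n' "-agentlib:jdwp=transport=dt_socket,server=y,address=${port},suspend=${suspend}"
)

# Assumed variable names for yarn-env.sh in Hadoop 2.x; one port per daemon.
export YARN_RESOURCEMANAGER_OPTS="$(jdwp_opts 8001) ${YARN_RESOURCEMANAGER_OPTS:-}"
export YARN_NODEMANAGER_OPTS="$(jdwp_opts 8002) ${YARN_NODEMANAGER_OPTS:-}"
```

After restarting the daemons, "jdb -attach localhost:8001" would attach to the ResourceManager. Passing y as the second argument makes the JVM wait for a debugger before it starts, which helps when the code of interest runs during daemon startup.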
>>>> >>
>>>> >> --Chris Nauroth
>>>> >>
>>>> >>
>>>> >>
>>>> >>
>>>> >> On 12/21/15, 7:25 AM, "Daniel Templeton" <[email protected]> wrote:
>>>> >>
>>>> >> >Your best bet is to find a test that includes all the bits you want and
>>>> >> >execute that test in debug mode.  (You can also change an existing test
>>>> >> >to include what you want, but in most cases it is easier to start with
>>>> >> >an existing test than to start from scratch.)
>>>> >> >
>>>> >> >Daniel
>>>> >> >
>>>> >> >On 12/20/15 6:01 PM, Allen Zhang wrote:
>>>> >> >> Hi all,
>>>> >> >>
>>>> >> >> I am reading the hadoop-2.6.0 source code, mainly focusing on Hadoop YARN.
>>>> >> >> However, I have some problems reading and debugging the source code.
>>>> >> >> Can I debug it locally (I mean on my laptop, with the source code I've
>>>> >> >> downloaded, not debugging remotely)?  I need to trace its execution
>>>> >> >> flow step by step, and then I want to add a new feature or enhancement.
>>>> >> >>
>>>> >> >>
>>>> >> >> So can anyone give some good suggestions, or share your method or any
>>>> >> >> wiki page?  I'd really appreciate it!
>>>> >> >>
>>>> >> >>
>>>> >> >> Thanks,
>>>> >> >> Allen
>>>> >> >
>>>> >> >
>>>> >>
>>>> >>
>>>> >
>>>> >
>>>> >--
>>>> >Best Regards
>>>> >
>>>> >Jeff Zhang
>>>>
>>>
>>>
>>>
>>>-- 
>>>Best Regards
>>>
>>>Jeff Zhang
>
