Author: vinodkv
Date: Wed Mar 26 02:52:58 2014
New Revision: 1581656
URL: http://svn.apache.org/r1581656
Log:
YARN-1452. Added documentation about the configuration and usage of generic
application history and the timeline data service. Contributed by Zhijie Shen.
Added:
hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/TimelineServer.apt.vm
Modified:
hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/index.apt.vm
Modified: hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
URL:
http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt?rev=1581656&r1=1581655&r2=1581656&view=diff
==============================================================================
--- hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt (original)
+++ hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt Wed Mar 26 02:52:58 2014
@@ -326,6 +326,9 @@ Release 2.4.0 - UNRELEASED
YARN-1850. Introduced the ability to optionally disable sending out
timeline-
events in the TimelineClient. (Zhijie Shen via vinodkv)
+ YARN-1452. Added documentation about the configuration and usage of generic
+ application history and the timeline data service. (Zhijie Shen via
vinodkv)
+
OPTIMIZATIONS
YARN-1771. Reduce the number of NameNode operations during localization of
Modified:
hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
URL:
http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml?rev=1581656&r1=1581655&r2=1581656&view=diff
==============================================================================
---
hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
(original)
+++
hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
Wed Mar 26 02:52:58 2014
@@ -1105,7 +1105,7 @@
<description>This is default address for the timeline server to start the
RPC server.</description>
<name>yarn.timeline-service.address</name>
- <value>0.0.0.0:10200</value>
+ <value>${yarn.timeline-service.hostname}:10200</value>
</property>
<property>
Added:
hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/TimelineServer.apt.vm
URL:
http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/TimelineServer.apt.vm?rev=1581656&view=auto
==============================================================================
---
hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/TimelineServer.apt.vm
(added)
+++
hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/TimelineServer.apt.vm
Wed Mar 26 02:52:58 2014
@@ -0,0 +1,225 @@
+~~ Licensed under the Apache License, Version 2.0 (the "License");
+~~ you may not use this file except in compliance with the License.
+~~ You may obtain a copy of the License at
+~~
+~~ http://www.apache.org/licenses/LICENSE-2.0
+~~
+~~ Unless required by applicable law or agreed to in writing, software
+~~ distributed under the License is distributed on an "AS IS" BASIS,
+~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+~~ See the License for the specific language governing permissions and
+~~ limitations under the License. See accompanying LICENSE file.
+
+ ---
+ YARN Timeline Server
+ ---
+ ---
+ ${maven.build.timestamp}
+
+YARN Timeline Server
+
+ \[ {{{./index.html}Go Back}} \]
+
+%{toc|section=1|fromDepth=0|toDepth=3}
+
+* Overview
+
+ Storage and retrieval of applications' current as well as historic
+ information in a generic fashion is solved in YARN through the Timeline
+ Server (previously also called Generic Application History Server). This
+ serves two responsibilities:
+
+ ** Generic information about completed applications
+
+ Generic information includes application level data like queue-name, user
+ information etc in the ApplicationSubmissionContext, list of
+ application-attempts that ran for an application, information about each
+ application-attempt, list of containers run under each application-attempt,
+ and information about each container. Generic data is stored by
+ ResourceManager to a history-store (default implementation on a
file-system)
+ and used by the web-UI to display information about completed applications.
+
+ ** Per-framework information of running and completed applications
+
+ Per-framework information is completely specific to an application or
+ framework. For example, Hadoop MapReduce framework can include pieces of
+ information like number of map tasks, reduce tasks, counters etc.
+ Application developers can publish the specific information to the Timeline
+ server via TimelineClient from within a client, the ApplicationMaster
+ and/or the application's containers. This information is then queryable via
+ REST APIs for rendering by application/framework specific UIs.
+
+* Current Status
+
+ Timeline sever is a work in progress. The basic storage and retrieval of
+ information, both generic and framework specific, are in place. Timeline
+ server doesn't work in secure mode yet. The generic information and the
+ per-framework information are today collected and presented separately and
+ thus are not integrated well together. Finally, the per-framework information
+ is only available via RESTful APIs, using JSON type content - ability to
+ install framework specific UIs in YARN isn't supported yet.
+
+* Basic Configuration
+
+ Users need to configure the Timeline server before starting it. The simplest
+ configuration you should add in <<<yarn-site.xml>>> is to set the hostname of
+ the Timeline server:
+
++---+
+<property>
+ <description>The hostname of the Timeline service web
application.</description>
+ <name>yarn.timeline-service.hostname</name>
+ <value>0.0.0.0</value>
+</property>
++---+
+
+* Advanced Configuration
+
+ In addition to the hostname, admins can also configure whether the service is
+ enabled or not, the ports of the RPC and the web interfaces, and the number
+ of RPC handler threads.
+
++---+
+
+<property>
+ <description>Address for the Timeline server to start the RPC
server.</description>
+ <name>yarn.timeline-service.address</name>
+ <value>${yarn.timeline-service.hostname}:10200</value>
+</property>
+
+<property>
+ <description>The http address of the Timeline service web
application.</description>
+ <name>yarn.timeline-service.webapp.address</name>
+ <value>${yarn.timeline-service.hostname}:8188</value>
+</property>
+
+<property>
+ <description>The https address of the Timeline service web
application.</description>
+ <name>yarn.timeline-service.webapp.https.address</name>
+ <value>${yarn.timeline-service.hostname}:8190</value>
+</property>
+
+<property>
+ <description>Handler thread count to serve the client RPC
requests.</description>
+ <name>yarn.timeline-service.handler-thread-count</name>
+ <value>10</value>
+</property>
++---+
+
+* Generic-data related Configuration
+
+ Users can specify whether the generic data collection is enabled or not, and
+ also choose the storage-implementation class for the generic data. There are
+ more configurations related to generic data collection, and users can refer
+ to <<<yarn-default.xml>>> for all of them.
+
++---+
+<property>
+ <description>Indicate to ResourceManager as well as clients whether
+ history-service is enabled or not. If enabled, ResourceManager starts
+ recording historical data that Timelien service can consume. Similarly,
+ clients can redirect to the history service when applications
+ finish if this is enabled.</description>
+ <name>yarn.timeline-service.generic-application-history.enabled</name>
+ <value>false</value>
+</property>
+
+<property>
+ <description>Store class name for history store, defaulting to file system
+ store</description>
+ <name>yarn.timeline-service.generic-application-history.store-class</name>
+
<value>org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore</value>
+</property>
++---+
+
+* Per-framework-date related Configuration
+
+ Users can specify whether per-framework data service is enabled or not,
+ choose the store implementation for the per-framework data, and tune the
+ retention of the per-framework data. There are more configurations related to
+ per-framework data service, and users can refer to <<<yarn-default.xml>>> for
+ all of them.
+
++---+
+<property>
+ <description>Indicate to clients whether Timeline service is enabled or not.
+ If enabled, the TimelineClient library used by end-users will post entities
+ and events to the Timeline server.</description>
+ <name>yarn.timeline-service.enabled</name>
+ <value>true</value>
+</property>
+
+<property>
+ <description>Store class name for timeline store.</description>
+ <name>yarn.timeline-service.store-class</name>
+
<value>org.apache.hadoop.yarn.server.applicationhistoryservice.timeline.LeveldbTimelineStore</value>
+</property>
+
+<property>
+ <description>Enable age off of timeline store data.</description>
+ <name>yarn.timeline-service.ttl-enable</name>
+ <value>true</value>
+</property>
+
+<property>
+ <description>Time to live for timeline store data in
milliseconds.</description>
+ <name>yarn.timeline-service.ttl-ms</name>
+ <value>604800000</value>
+</property>
++---+
+
+* Running Timeline server
+
+ Assuming all the aforementioned configurations are set properly, admins can
+ start the Timeline server/history service with the following command:
+
++---+
+ $ yarn historyserver
++---+
+
+ Or users can start the Timeline server / history service as a daemon:
+
++---+
+ $ yarn-daemon.sh start historyserver
++---+
+
+* Accessing generic-data via command-line
+
+ Users can access applications' generic historic data via the command line as
+ below. Note that the same commands are usable to obtain the corresponding
+ information about running applications.
+
++---+
+ $ yarn application -status <Application ID>
+ $ yarn applicationattempt -list <Application ID>
+ $ yarn applicationattempt -status <Application Attempt ID>
+ $ yarn container -list <Application Attempt ID>
+ $ yarn container -status <Container ID>
++---+
+
+* Publishing of per-framework data by applications
+
+ Developers can define what information they want to record for their
+ applications by composing <<<TimelineEntity>>> and <<<TimelineEvent>>>
+ objects, and put the entities and events to the Timeline server via
+ <<<TimelineClient>>>. Below is an example:
+
++---+
+ // Create and start the Timeline client
+ TimelineClient client = TimelineClient.createTimelineClient();
+ client.init(conf);
+ client.start();
+
+ TimelineEntity entity = null;
+ // Compose the entity
+ try {
+ TimelinePutResponse response = client.putEntities(entity);
+ } catch (IOException e) {
+ // Handle the exception
+ } catch (YarnException e) {
+ // Handle the exception
+ }
+
+ // Stop the Timeline client
+ client.stop();
++---+
Modified:
hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/index.apt.vm
URL:
http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/index.apt.vm?rev=1581656&r1=1581655&r2=1581656&view=diff
==============================================================================
---
hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/index.apt.vm
(original)
+++
hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/index.apt.vm
Wed Mar 26 02:52:58 2014
@@ -43,7 +43,7 @@ MapReduce NextGen aka YARN aka MRv2
* {{{./YARN.html}NextGen MapReduce}}
- * {{{./WritingYarnApplications.html}Writing Yarn Applications}}
+ * {{{./WritingYarnApplications.html}Writing YARN Applications}}
* {{{./CapacityScheduler.html}Capacity Scheduler}}
@@ -51,6 +51,8 @@ MapReduce NextGen aka YARN aka MRv2
* {{{./WebApplicationProxy.html}Web Application Proxy}}
+ * {{{./TimelineServer.html}YARN Timeline Server}}
+
* {{{../../hadoop-project-dist/hadoop-common/CLIMiniCluster.html}CLI
MiniCluster}}
*
{{{../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduce_Compatibility_Hadoop1_Hadoop2.html}Backward
Compatibility between Apache Hadoop 1.x and 2.x for MapReduce}}