[
https://issues.apache.org/jira/browse/YARN-9016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16791334#comment-16791334
]
Vrushali C commented on YARN-9016:
----------------------------------
Two of the tests are failing due to bind exceptions:
{code}
[ERROR] testTimelineServiceEventPublishingV1V2Enabled(org.apache.hadoop.yarn.server.resourcemanager.metrics.TestCombinedSystemMetricsPublisher) Time elapsed: 0.088 s <<< ERROR!
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.net.BindException: Problem binding to [0.0.0.0:10200] java.net.BindException: Address already in use; For more details see: http://wiki.apache.org/hadoop/BindException
    at org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.getServer(RpcServerFactoryPBImpl.java:139)
    at org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC.getServer(HadoopYarnProtoRPC.java:66)
    at org.apache.hadoop.yarn.ipc.YarnRPC.getServer(YarnRPC.java:55)
    at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryClientService.serviceStart(ApplicationHistoryClientService.java:94)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
    at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
    at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.serviceStart(ApplicationHistoryServer.java:120)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
    at org.apache.hadoop.yarn.server.resourcemanager.metrics.TestCombinedSystemMetricsPublisher.testSetup(TestCombinedSystemMetricsPublisher.java:123)
    at org.apache.hadoop.yarn.server.resourcemanager.metrics.TestCombinedSystemMetricsPublisher.runTest(TestCombinedSystemMetricsPublisher.java:242)
    at org.apache.hadoop.yarn.server.resourcemanager.metrics.TestCombinedSystemMetricsPublisher.testTimelineServiceEventPublishingV1V2Enabled(TestCombinedSystemMetricsPublisher.java:252)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
    at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.BindException: Problem binding to [0.0.0.0:10200] java.net.BindException: Address already in use; For more details see: http://wiki.apache.org/hadoop/BindException
{code}
The
org.apache.hadoop.yarn.server.nodemanager.amrmproxy.TestFederationInterceptor
test is failing due to:
{code}
Caused by: org.apache.hadoop.yarn.exceptions.YarnException: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: amrmToken from UAM SC-1 should be null here
    at org.apache.hadoop.yarn.server.nodemanager.amrmproxy.FederationInterceptor.allocate(FederationInterceptor.java:719)
    at org.apache.hadoop.yarn.server.nodemanager.amrmproxy.TestFederationInterceptor.getContainersAndAssert(TestFederationInterceptor.java:219)
    at org.apache.hadoop.yarn.server.nodemanager.amrmproxy.TestFederationInterceptor.access$400(TestFederationInterceptor.java:89)
    at org.apache.hadoop.yarn.server.nodemanager.amrmproxy.TestFederationInterceptor$2.run(TestFederationInterceptor.java:407)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1891)
    ... 28 more
Caused by: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: amrmToken from UAM SC-1 should be null here
    at org.apache.hadoop.yarn.server.nodemanager.amrmproxy.FederationInterceptor.mergeAllocateResponse(FederationInterceptor.java:1418)
    at org.apache.hadoop.yarn.server.nodemanager.amrmproxy.FederationInterceptor.mergeAllocateResponses(FederationInterceptor.java:1348)
    at org.apache.hadoop.yarn.server.nodemanager.amrmproxy.FederationInterceptor.allocate(FederationInterceptor.java:695)
    ... 34 more
{code}
All of these seem to be unrelated to the patch.
So +1 for patch v004.
I will give it a day or so in case anyone else wants to review and then commit
it.
> DocumentStore as a backend for ATSv2
> ------------------------------------
>
> Key: YARN-9016
> URL: https://issues.apache.org/jira/browse/YARN-9016
> Project: Hadoop YARN
> Issue Type: New Feature
> Components: ATSv2
> Reporter: Sushil Ks
> Assignee: Sushil Ks
> Priority: Major
> Attachments: YARN-9016.001.patch, YARN-9016.002.patch,
> YARN-9016.003.patch, YARN-9016.004.patch
>
>
> h1. Document Store for ATSv2
> The Document Store for ATSv2 is a framework for plugging in
> any document store vendor as a backend for ATSv2, e.g. Azure CosmosDB,
> MongoDB, ElasticSearch, etc.
> * Supports multiple document store vendors such as CosmosDB, ElasticSearch,
> and MongoDB; adding one only requires new configuration properties and
> Document Store reader and writer clients.
> * Currently has support for CosmosDB.
> * All writes are asynchronous and buffered. The latest document is flushed to
> the store either when the document buffer gets full or periodically at every
> flush interval, in the background, without adding latency to the running
> jobs.
> * All the REST APIs of the Timeline Reader Server are supported.
> h4. *How to enable?*
> Add the following properties to *yarn-site.xml*:
> {code:java}
> <!-- config required for ATSv2 to use DocumentStore -->
> <property>
>   <name>yarn.timeline-service.writer.class</name>
>   <value>org.apache.hadoop.yarn.server.timelineservice.storage.documentstore.DocumentStoreTimelineWriterImpl</value>
> </property>
> <property>
>   <name>yarn.timeline-service.reader.class</name>
>   <value>org.apache.hadoop.yarn.server.timelineservice.storage.documentstore.DocumentStoreTimelineReaderImpl</value>
> </property>
> <property>
>   <name>yarn.timeline-service.document-store.db-name</name>
>   <value>YOUR_DATABASE_NAME</value> <!-- default is timeline_service -->
> </property>{code}
> h3. *Creating DB and Collections for storing documents*
> The following config needs to be set inside
> *yarn-site.xml* for creating the database and collections for storing
> documents.
> {code:java}
> <!-- Using schema creator class for DocumentStore-->
> <property>
>   <name>yarn.timeline-service.schema-creator.class</name>
>   <value>org.apache.hadoop.yarn.server.timelineservice.documentstore.DocumentStoreCollectionCreator</value>
> </property>{code}
> Then run the schema creator tool to create the necessary
> collections:
> {code:java}
> bin/hadoop org.apache.hadoop.yarn.server.timelineservice.storage.TimelineSchemaCreator{code}
> h3. *Azure CosmosDB*
> To use Azure CosmosDB as a DocumentStore for ATSv2, the following additional
> properties are required in *yarn-site.xml*:
> {code:java}
> <!-- config required for using Azure CosmosDB as a DocumentStore for ATSv2 -->
> <property>
> <name>yarn.timeline-service.document-store-type</name>
> <value>COSMOS_DB</value>
> </property>
> <property>
> <name>yarn.timeline-service.document-store.cosmos-db.endpoint</name>
> <value>https://YOUR_AZURE_COSMOS_DB_URL:443/</value>
> </property>
> <property>
> <name>yarn.timeline-service.document-store.cosmos-db.masterkey</name>
> <value>YOUR_AZURE_COSMOS_DB_MASTER_KEY_CREDENTIAL</value>
> </property>
> {code}
>
> *Testing locally*
> To test Azure CosmosDB as a DocumentStore
> locally, install the emulator from
> [here|https://docs.microsoft.com/en-us/azure/cosmos-db/local-emulator] and
> start it. Set the endpoint and master key in *yarn-site.xml* as
> mentioned above and run any example job, such as DistributedShell. Afterwards
> you can query the documents in the data explorer UI of the local Azure
> CosmosDB emulator, or launch the *TimelineReader* locally and fetch/query the
> data through its REST APIs.
>
>
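For readers new to the write path summarized in the quoted description ("all writes are async and buffered"), the pattern can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from the patch: `BufferedDocumentWriter`, `maxBufferedDocs`, `flushIntervalMs`, and the `Consumer` callback standing in for a vendor client are all hypothetical names, and keeping only the newest version of each document id is an assumed reading of "latest document".

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical sketch of a buffered, asynchronous document writer. */
final class BufferedDocumentWriter implements AutoCloseable {
  // Latest version of each document, keyed by document id (assumption).
  private final Map<String, String> buffer = new LinkedHashMap<>();
  private final int maxBufferedDocs;
  // Stand-in for a vendor-specific client, e.g. one that talks to CosmosDB.
  private final Consumer<Map<String, String>> store;
  private final ScheduledExecutorService flusher =
      Executors.newSingleThreadScheduledExecutor(r -> {
        Thread t = new Thread(r, "doc-flusher");
        t.setDaemon(true);
        return t;
      });

  BufferedDocumentWriter(int maxBufferedDocs, long flushIntervalMs,
      Consumer<Map<String, String>> store) {
    this.maxBufferedDocs = maxBufferedDocs;
    this.store = store;
    // Periodic flush at every flush interval, on the background thread.
    flusher.scheduleWithFixedDelay(this::flush, flushIntervalMs,
        flushIntervalMs, TimeUnit.MILLISECONDS);
  }

  /** Buffers the document and returns without waiting on the store. */
  void write(String id, String json) {
    boolean full;
    synchronized (buffer) {
      buffer.put(id, json); // a newer version replaces the older one
      full = buffer.size() >= maxBufferedDocs;
    }
    if (full) {
      flusher.execute(this::flush); // flush early when the buffer is full
    }
  }

  /** Runs on the flusher thread: hands the buffered batch to the store. */
  private void flush() {
    Map<String, String> batch;
    synchronized (buffer) {
      if (buffer.isEmpty()) {
        return;
      }
      batch = new LinkedHashMap<>(buffer);
      buffer.clear();
    }
    try {
      store.accept(batch);
    } catch (RuntimeException e) {
      // Swallow so that one failed batch does not cancel the periodic task.
      System.err.println("flush failed: " + e);
    }
  }

  /** Flushes whatever is still buffered, then stops the background thread. */
  @Override
  public void close() {
    flusher.execute(this::flush);
    flusher.shutdown();
    try {
      flusher.awaitTermination(5, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }
}
```

With `maxBufferedDocs` set to 2 and a long flush interval, writing `app_1` twice and then `app_2` hands the store two documents in total, with `app_1` at its second value; the caller of `write` never blocks on the store itself.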