[
https://issues.apache.org/jira/browse/YARN-4879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15220475#comment-15220475
]
Vinod Kumar Vavilapalli commented on YARN-4879:
-----------------------------------------------
Thanks for the doc, [~subru] and [~asuresh]! +1 overall for a unique identifier.
h4. Comments on your doc
- I'd rather call it "an enhancement to identify requests explicitly" instead
of "simple (delta) allocate protocol". We used to use the phrase "delta
protocol" in a slightly different context - see YARN-110.
- bq. The RM will attempt to allocate containers in decreasing sequence number
order,
Why are we putting priority semantics onto the ID? We should just follow the
existing priority ordering.
- bq. In our proposal, we could potentially have requests for each container
at worst case.
This costs network and memory overhead as well as scheduler CPU time. Until we
move to global scheduling completely, we should be cautious about this. Of
course, by inverting the ResourceRequest and still keying by ResourceName in
the API, we limit the total number of entries to the order of the
cluster size.
I already suggested on YARN-1547 that we also have an upper limit on the total
number of requests - see
[here|https://issues.apache.org/jira/browse/YARN-1547?focusedCommentId=15218681&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15218681].
But I strongly suggest that we have additional limits on the total number of
IDs that can be used - this will fit our narrative at YARN-4902 too.
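To make the two limits concrete, here is a minimal sketch of an admission check that caps both the total number of requests and the number of distinct IDs per application. The class {{RequestLimiter}}, its method names and the limit values are hypothetical and only illustrate the idea; none of this is existing YARN code.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch, not a YARN class: enforces two independent caps
// for a single application.
final class RequestLimiter {
    private final int maxRequests;   // assumed cap on total requests
    private final int maxIds;        // assumed cap on distinct IDs
    private final Set<Long> ids = new HashSet<>();
    private int totalRequests;

    RequestLimiter(int maxRequests, int maxIds) {
        this.maxRequests = maxRequests;
        this.maxIds = maxIds;
    }

    // Returns true and records the request if neither cap would be exceeded.
    boolean tryAdmit(long requestId) {
        if (totalRequests >= maxRequests) {
            return false;            // too many requests overall
        }
        if (!ids.contains(requestId) && ids.size() >= maxIds) {
            return false;            // a new ID would exceed the ID cap
        }
        ids.add(requestId);
        totalRequests++;
        return true;
    }
}
```

With {{new RequestLimiter(3, 2)}}, a third distinct ID is rejected even though the total request count is still under its cap, which is the extra protection the ID limit adds.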
h4. Comments from YARN-4902
Copying here, with some edits, a few comments that we posted in the document
for YARN-4902 and that I think were not laid out in the doc explicitly. We were
calling it Allocation-ID there; I guess I now like Request-ID better. If some
or all of them make sense, you can add them to your doc.
- *Scope*: This ID is a unique identifier for different ResourceRequests from
the *same application* - essentially IDs can conflict across applications.
- *Generation*: The application should simply generate a unique identifier
within the application. Alternatively, the client libraries can generate it
if the application so desires.
- *Non-binding nature*: Applications can continue to completely ignore the
returned Allocation-ID in the response and use the allocation for any of their
outstanding requests.
- *Responses*: The scheduler may return multiple responses corresponding to
the same Allocation-ID, as and when the scheduler returns allocations.
- *Deeper details on updates*: Similar to the current API, an update of only
selected fields against a previously existing Allocation-ID will only update
the object (as opposed to replacing it). For example, say a ResourceRequest
first gets created with Allocation-ID "76589" and with _"host: *"_. A future
ResourceRequest with the same Allocation-ID but with contents _"rack05: 10"_
will only append the rack information to the existing object. This is how one
can replace parts of an object, and it is similar to how the existing
per-record-deltas based protocol works.
- *Deletes*: Similarly, if one wishes to replace an entire ResourceRequest
corresponding to a specific Allocation-ID, they can simply cancel the
corresponding ResourceRequest and submit a new one afresh.
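
The update and delete semantics above can be sketched roughly as follows. This is only an illustration under assumptions: {{RequestStore}} and its methods are hypothetical names, and each request is modelled as a plain map from resource name to a string value so that the "host: *" and "rack05: 10" example can be reproduced literally. It is not the actual ResourceRequest API.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not the YARN API: requests keyed by Allocation-ID,
// each modelled as a map of resource name -> value.
final class RequestStore {
    private final Map<Long, Map<String, String>> requests = new LinkedHashMap<>();

    // Update: merge the given fields into the existing object for this ID.
    // Fields not mentioned in the update are left untouched.
    void update(long allocationId, Map<String, String> fields) {
        requests.computeIfAbsent(allocationId, id -> new LinkedHashMap<>())
                .putAll(fields);
    }

    // Delete: cancel the whole request; a later update with the same ID
    // starts again from an empty object.
    void cancel(long allocationId) {
        requests.remove(allocationId);
    }

    // Read-only view of the current contents for an ID (empty if unknown).
    Map<String, String> get(long allocationId) {
        Map<String, String> fields = requests.get(allocationId);
        return fields == null
                ? Collections.<String, String>emptyMap()
                : Collections.unmodifiableMap(fields);
    }
}
```

Calling {{update(76589L, ...)}} first with "host" and then with "rack05" leaves both entries in the object, whereas replacing the whole request takes a {{cancel(76589L)}} followed by a fresh {{update}}.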
h4. Other responses
bq. If a node local allocation is made for node N1, we can immediately lookup
the entries for rack and ANY by using the ID key and decrement them instead of
linearly scanning the rack/ANY entries.
+1, ID is really the logical grouping key.
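
As a rough illustration of the ID acting as the grouping key, the sketch below keeps the node, rack and ANY entries of a request together under its ID, so a node-local allocation can decrement all three with direct lookups. {{GroupedRequests}}, its method names and the use of "*" for ANY are assumptions made for this example, not the scheduler's actual data structures.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not scheduler code: outstanding container counts
// grouped by request ID, then by resource name (node, rack, or "*" for ANY).
final class GroupedRequests {
    private static final String ANY = "*";
    private final Map<Long, Map<String, Integer>> byId = new HashMap<>();

    // Register `count` containers wanted on `node`, which lives in `rack`.
    void add(long requestId, String node, String rack, int count) {
        Map<String, Integer> entries =
                byId.computeIfAbsent(requestId, id -> new HashMap<>());
        entries.merge(node, count, Integer::sum);
        entries.merge(rack, count, Integer::sum);
        entries.merge(ANY, count, Integer::sum);
    }

    // A node-local allocation decrements the node, rack and ANY entries of
    // the same ID via key lookups instead of scanning all rack/ANY entries.
    void onNodeLocalAllocation(long requestId, String node, String rack) {
        Map<String, Integer> entries = byId.get(requestId);
        if (entries == null) {
            return;
        }
        for (String name : new String[] {node, rack, ANY}) {
            // Returning null from the remapping function removes the entry.
            entries.computeIfPresent(name, (k, v) -> v > 1 ? v - 1 : null);
        }
        if (entries.isEmpty()) {
            byId.remove(requestId);
        }
    }

    int outstanding(long requestId, String name) {
        Map<String, Integer> entries = byId.get(requestId);
        return entries == null ? 0 : entries.getOrDefault(name, 0);
    }
}
```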
bq. While making these changes, would it be possible to address YARN-314 too?
I'm okay if we can address both in one shot, but I'd caution against risking
this effort by blowing up its scope.
> Proposal for a simple (delta) allocate protocol
> -----------------------------------------------
>
> Key: YARN-4879
> URL: https://issues.apache.org/jira/browse/YARN-4879
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: applications, resourcemanager
> Reporter: Subru Krishnan
> Assignee: Subru Krishnan
> Attachments: SimpleAllocateProtocolProposal-v1.pdf
>
>
> For legacy reasons, the current allocate protocol expects expanded requests
> which represent the cumulative request for any change in resource
> constraints. This is not only very difficult to comprehend but makes it
> impossible for the scheduler to associate container allocations to the
> original requests. This problem is amplified by the fact that the expansion
> is managed by the AMRMClient which makes it cumbersome for non-Java clients
> as they all have to replicate the non-trivial logic. In this JIRA, we are
> proposing a delta allocate protocol where the AM will need to only specify
> changes in resource constraints.