[
https://issues.apache.org/jira/browse/YARN-8680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16597623#comment-16597623
]
Jason Lowe commented on YARN-8680:
----------------------------------
Thanks for the patch!
It would be cleaner to have a LocalResourceTrackerState constructor that takes
the values as arguments and then mark the internal LocalResourceTrackerState
fields as final. Then there is no need to have setter methods for
LocalResourceTrackerState.
Nit: "iter .hasNext()" should be "iter.hasNext()"
I'm not thrilled how the code just knows that LOCALIZATION_FILECACHE_SUFFIX is
the last key prefix to be consumed so it can seek to the next user. It's
fragile for future maintenance. What if another per-user localization key is
added that occurs lexicographically after LOCALIZATION_FILECACHE_SUFFIX? I
think it would be cleaner to break this loop up into multiple parts. For
example a first loop can seek to the start of appcache entries (i..e.: using
LOCALIZATION_APPCACHE_SUFFIX) then initialize all of the appTrackerState
entries, seeking between each application after it sets up the state object
iterators. Then there can be a second part that seeks to the filecache area
(i.e.: using LOCALIZATION_FILECACHE_SUFFIX) and sets up the iterators for the
filecache state. Then finally the last thing is to seek past all the entries
for a user. Then it's clear how someone can later add a new chunk of data to
load besides appcache and filecache entries safely.
Speaking of seeking past entries, that should be moved into a private utility
method (e.g.: seekPastPrefix or whatever) for a few reasons. First, it's a bit
tricky how that works, and it would be good to add some generous comments on
why "zzz" matters. Second, we shouldn't always assume "zzz" is sufficient to
seek past the end of everything. Third, to be extra careful the seek method
should seek to that key and then peek at the next entry. If the next entry
starts with the prefix we're trying to seek beyond, it should continue to move
and peek until the next entry does not start with the key. Then even if "zzz"
ceases to be lexicographically after any key underneath the specified keyPrefix
it continues to work (albeit with a performance penalty on the order of the
number of entries we need to peek-iterate past).
"zzz" should be a static final String constant with a useful name indicating
what it represents, like BEYOND_ENTRIES_SUFFIX or whatever. Warning: I'm
terrible at names.
We may want to consider removing the LocalResourceTrackerState#isEmpty method.
It encourages code to ignore the underlying iterators and therefore never close
them. I believe leveldb iterators retain a snapshot of the underlying table
files, so it could be quite expensive to leak these iterators without closing
them. For example, ResourceLocalizationService will currently leak these
iterators if isEmpty returns true. It may be better to force the consumers to
deal with the iterators directly so they remember to close them properly when
done. We need to make sure these leveldb iterators don't leak.
> YARN NM: Implement Iterable Abstraction for LocalResourceTrackerstate
> ---------------------------------------------------------------------
>
> Key: YARN-8680
> URL: https://issues.apache.org/jira/browse/YARN-8680
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: yarn
> Reporter: Pradeep Ambati
> Assignee: Pradeep Ambati
> Priority: Critical
> Attachments: YARN-8680.00.patch, YARN-8680.01.patch
>
>
> Similar to YARN-8242, implement iterable abstraction for
> LocalResourceTrackerState to load completed and in progress resources when
> needed rather than loading them all at a time for a respective state.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]