[ https://issues.apache.org/jira/browse/YARN-5683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16690339#comment-16690339 ]
Hadoop QA commented on YARN-5683: --------------------------------- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 6s{color} | {color:red} YARN-5683 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | YARN-5683 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12832871/YARN-5683-3.patch | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/22588/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > Support specifying storage type for per-application local dirs > -------------------------------------------------------------- > > Key: YARN-5683 > URL: https://issues.apache.org/jira/browse/YARN-5683 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager > Affects Versions: 3.0.0-alpha2 > Reporter: Tao Yang > Assignee: Tao Yang > Priority: Major > Labels: oct16-hard > Attachments: YARN-5683-1.patch, YARN-5683-2.patch, YARN-5683-3.patch, > flow_diagram_for_MapReduce-2.png, flow_diagram_for_MapReduce.png > > > h3. Introduction > * Some applications of various frameworks (Flink, Spark and MapReduce etc) > using local storage (checkpoint, shuffle etc) might require high IO > performance. It's useful to allocate local directories to high performance > storage media for these applications on heterogeneous clusters. > * YARN does not distinguish different storage types and hence applications > cannot selectively use storage media with different performance > characteristics. Adding awareness of storage media can allow YARN to make > better decisions about the placement of local directories. > h3. Approach > * NodeManager will distinguish storage types for local directories. > ** yarn.nodemanager.local-dirs and yarn.nodemanager.log-dirs configuration > should allow the cluster administrator to optionally specify the storage type > for each local directories. Example: > [SSD]/disk1/nm-local-dir,/disk2/nm-local-dir,/disk3/nm-local-dir (equals to > [SSD]/disk1/nm-local-dir,[DISK]/disk2/nm-local-dir,[DISK]/disk3/nm-local-dir) > ** StorageType defines DISK/SSD storage types and takes DISK as the default > storage type. > ** StorageLocation separates storage type and directory path, used by > LocalDirAllocator to aware the types of local dirs, the default storage type > is DISK. > ** getLocalPathForWrite method of LocalDirAllcator will prefer to choose the > local directory of the specified storage type, and will fallback to not care > storage type if the requirement can not be satisfied. > ** Support for container related local/log directories by ContainerLaunch. > All application frameworks can set the environment variables > (LOCAL_STORAGE_TYPE and LOG_STORAGE_TYPE) to specified the desired storage > type of local/log directories, and choose to not launch container if fallback > through these environment variables (ENSURE_LOCAL_STORAGE_TYPE and > ENSURE_LOG_STORAGE_TYPE). > * Allow specified storage type for various frameworks (Take MapReduce as an > example) > ** Add new configurations should allow application administrator to > optionally specify the storage type of local/log directories and fallback > strategy (MapReduce configurations: mapreduce.job.local-storage-type, > mapreduce.job.log-storage-type, mapreduce.job.ensure-local-storage-type and > mapreduce.job.ensure-log-storage-type). > ** Support for container work directories. Set the environment variables > includes LOCAL_STORAGE_TYPE and LOG_STORAGE_TYPE according to configurations > above for ContainerLaunchContext and ApplicationSubmissionContext. (MapReduce > should update YARNRunner and TaskAttemptImpl) > ** Add storage type prefix for request path to support for other local > directories of frameworks (such as shuffle directories for MapReduce). > (MapReduce should update YarnOutputFiles, MROutputFiles and YarnChild to > support for output/work directories) > ** Flow diagram for MapReduce framework > !flow_diagram_for_MapReduce-2.png! > h3. Further Discussion > * Scheduling : The requirement of storage type for local/log directories may > not be satisfied for a part of nodes on heterogeneous clusters. To achieve > global optimum, scheduler should aware and manage disk resources. > ** Approach-1: Based on node attributes (YARN-3409), Scheduler can allocate > containers which have SSD requirement on nodes with attribute:ssd=true. > ** Approach-2: Based on extended resource model (YARN-3926), it's easy to > support scheduling through extending resource models like vdisk and vssd > using this feature, but hard to measure for applications and isolate for > non-CFQ based disks. > * Fallback strategy still needs to be concerned. Certain applications might > not work well when the requirement of storage type is not satisfied. When > none of desired storage type disk are available, should container launching > be failed? let AM handle? We have implemented a fallback strategy that fail > to launch container when none of desired storage type disk are available. Is > there some better methods? > This feature has been used for half a year to meet the needs of some > applications on Alibaba search clusters. > Please feel free to give your suggestions and opinions. -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org