[ 
https://issues.apache.org/jira/browse/YARN-5683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15565030#comment-15565030
 ] 

Tao Yang commented on YARN-5683:
--------------------------------

Added YARN-5683-2.patch and updated the description.
Updates:
(1) Fixed a problem where the specified storage medium was not chosen after 
local directories turned good again.
(2) Added support for a fallback strategy that fails the container launch if 
no disk of the desired storage type is available.
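
To illustrate update (2), here is a minimal sketch of the selection logic, not the code from the patch. The class and method names (StorageTypeFallbackSketch, LocalDir, choose) and the boolean "ensure" flag are hypothetical; the flag stands in for the ENSURE_LOCAL_STORAGE_TYPE / ENSURE_LOG_STORAGE_TYPE settings described in the issue below. For simplicity the fallback takes the first good directory, whereas a real allocator would also weigh free space.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch; none of these names are taken from the YARN-5683 patch.
final class StorageTypeFallbackSketch {
    enum StorageType { DISK, SSD }

    // A healthy ("good") local directory tagged with its storage type.
    static final class LocalDir {
        final StorageType type;
        final String path;

        LocalDir(StorageType type, String path) {
            this.type = type;
            this.path = path;
        }
    }

    /**
     * Returns a directory of the desired type if one exists. Otherwise it
     * either falls back to any good directory (ensure == false) or returns
     * empty so that the caller can fail the container launch (ensure == true).
     */
    static Optional<LocalDir> choose(List<LocalDir> goodDirs,
                                     StorageType desired,
                                     boolean ensure) {
        for (LocalDir dir : goodDirs) {
            if (dir.type == desired) {
                return Optional.of(dir);
            }
        }
        if (ensure) {
            return Optional.empty();
        }
        // Fallback: ignore the storage type and take the first good directory.
        return goodDirs.isEmpty() ? Optional.empty() : Optional.of(goodDirs.get(0));
    }

    public static void main(String[] args) {
        List<LocalDir> onlyDisk = Arrays.asList(
            new LocalDir(StorageType.DISK, "/disk2/nm-local-dir"));

        // Without "ensure", an SSD request falls back to the DISK directory.
        System.out.println(choose(onlyDisk, StorageType.SSD, false)
            .map(d -> d.path).orElse("container launch fails"));

        // With "ensure", the same request yields no directory at all.
        System.out.println(choose(onlyDisk, StorageType.SSD, true)
            .map(d -> d.path).orElse("container launch fails"));
    }
}
```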

> Support specifying storage type for per-application local dirs
> --------------------------------------------------------------
>
>                 Key: YARN-5683
>                 URL: https://issues.apache.org/jira/browse/YARN-5683
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: nodemanager
>    Affects Versions: 3.0.0-alpha2
>            Reporter: Tao Yang
>            Assignee: Tao Yang
>         Attachments: YARN-5683-1.patch, YARN-5683-2.patch, 
> flow_diagram_for_MapReduce-2.png, flow_diagram_for_MapReduce.png
>
>
> h3.  Introduction
> * Some applications of various frameworks (Flink, Spark, MapReduce, etc.) 
> that use local storage (checkpoints, shuffle data, etc.) might require high 
> I/O performance. It is useful to allocate local directories on 
> high-performance storage media for these applications on heterogeneous 
> clusters.
> * YARN does not distinguish between storage types, so applications cannot 
> selectively use storage media with different performance characteristics. 
> Making YARN aware of storage media would allow it to make better decisions 
> about the placement of local directories.
> h3.  Approach
> * NodeManager will distinguish storage types for local directories.
> ** The yarn.nodemanager.local-dirs and yarn.nodemanager.log-dirs 
> configurations should allow the cluster administrator to optionally specify 
> the storage type for each local directory. Example: 
> [SSD]/disk1/nm-local-dir,/disk2/nm-local-dir,/disk3/nm-local-dir 
> (equivalent to 
> [SSD]/disk1/nm-local-dir,[DISK]/disk2/nm-local-dir,[DISK]/disk3/nm-local-dir)
> ** StorageType defines the DISK and SSD storage types, with DISK as the 
> default. 
> ** StorageLocation separates the storage type from the directory path and is 
> used by LocalDirAllocator to learn the types of local dirs; the default 
> storage type is DISK.
> ** The getLocalPathForWrite method of LocalDirAllocator will prefer a local 
> directory of the specified storage type, and will fall back to ignoring the 
> storage type if the requirement cannot be satisfied.
> ** Support container-related local/log directories in ContainerLaunch. 
> Application frameworks can set the environment variables LOCAL_STORAGE_TYPE 
> and LOG_STORAGE_TYPE to specify the desired storage type of local/log 
> directories, and can use the environment variables 
> ENSURE_LOCAL_STORAGE_TYPE and ENSURE_LOG_STORAGE_TYPE to choose not to 
> launch the container when a fallback would otherwise occur.
> * Allow various frameworks to specify the storage type (taking MapReduce as 
> an example)
> ** Add new configurations that allow the application administrator to 
> optionally specify the storage type of local/log directories and the 
> fallback strategy (MapReduce configurations: 
> mapreduce.job.local-storage-type, mapreduce.job.log-storage-type, 
> mapreduce.job.ensure-local-storage-type and 
> mapreduce.job.ensure-log-storage-type).
> ** Support container work directories. Set environment variables, including 
> LOCAL_STORAGE_TYPE and LOG_STORAGE_TYPE, according to the configurations 
> above for ContainerLaunchContext and ApplicationSubmissionContext. (MapReduce 
> should update YARNRunner and TaskAttemptImpl.)
> ** Add a storage type prefix to the request path to support other local 
> directories of frameworks (such as shuffle directories for MapReduce). 
> (MapReduce should update YarnOutputFiles, MROutputFiles and YarnChild to 
> support output/work directories.)
> ** Flow diagram for MapReduce framework
> !flow_diagram_for_MapReduce-2.png!
> h3.  Further Discussion
> * The storage type requirement for local/log directories may not be 
> satisfied on heterogeneous clusters. To achieve a global optimum, the 
> scheduler should be aware of and manage disk resources. 
> [YARN-2139|https://issues.apache.org/jira/browse/YARN-2139] is close to that 
> but does not seem to support multiple storage types; maybe we should go 
> further and make it aware of the storage type of disk resources?
> * Node labels or node constraints 
> ([YARN-3409|https://issues.apache.org/jira/browse/YARN-3409]) can also 
> increase the chance of satisfying the specified storage type requirement.
> * The fallback strategy still needs consideration. Certain applications might 
> not work well when the storage type requirement is not satisfied. When no 
> disk of the desired storage type is available, should the container launch 
> fail, or should the AM handle it? We have implemented a fallback strategy 
> that fails the container launch when no disk of the desired storage type is 
> available. Are there better approaches? 
> This feature has been used for half a year to meet the needs of some 
> applications on Alibaba search clusters.
> Please feel free to give your suggestions and opinions.
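
The "[SSD]/disk1/nm-local-dir" syntax from the Approach section above can be illustrated with a minimal parsing sketch. This is not the StorageLocation class from the patch: the class name StorageLocationSketch and the methods parse and parseAll are hypothetical, and the sketch assumes only what the description states, namely an optional bracketed type prefix with DISK as the default.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch of splitting "[TYPE]/path" entries; not the patch code.
final class StorageLocationSketch {
    enum StorageType { DISK, SSD }

    final StorageType type;
    final String path;

    StorageLocationSketch(StorageType type, String path) {
        this.type = type;
        this.path = path;
    }

    // Parses one entry such as "[SSD]/disk1/nm-local-dir" or "/disk2/nm-local-dir".
    static StorageLocationSketch parse(String entry) {
        String s = entry.trim();
        if (s.startsWith("[")) {
            int close = s.indexOf(']');
            if (close < 0) {
                throw new IllegalArgumentException("Missing ']' in: " + entry);
            }
            // valueOf throws IllegalArgumentException for an unknown type name.
            StorageType type = StorageType.valueOf(
                s.substring(1, close).trim().toUpperCase(Locale.ROOT));
            return new StorageLocationSketch(type, s.substring(close + 1).trim());
        }
        // No prefix: DISK is the default storage type.
        return new StorageLocationSketch(StorageType.DISK, s);
    }

    // Parses a comma-separated value such as that of yarn.nodemanager.local-dirs.
    static List<StorageLocationSketch> parseAll(String value) {
        List<StorageLocationSketch> result = new ArrayList<>();
        for (String entry : value.split(",")) {
            if (!entry.trim().isEmpty()) {
                result.add(parse(entry));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String value = "[SSD]/disk1/nm-local-dir,/disk2/nm-local-dir,/disk3/nm-local-dir";
        for (StorageLocationSketch loc : parseAll(value)) {
            System.out.println(loc.type + " " + loc.path);
        }
    }
}
```

Keeping the type and the path as separate fields means that an allocator can filter by type without re-parsing the configured string on every request.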



