[
https://issues.apache.org/jira/browse/YARN-5396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16078630#comment-16078630
]
Zhiyuan Yang commented on YARN-5396:
------------------------------------
[~elgoiri] Thanks for your interest! Please refer to the Spark broadcast
variable implementation and this
[paper|https://pdfs.semanticscholar.org/7b0e/6a3dc18babb19daddb63890e763795943485.pdf].
> YARN large file broadcast service
> ---------------------------------
>
> Key: YARN-5396
> URL: https://issues.apache.org/jira/browse/YARN-5396
> Project: Hadoop YARN
> Issue Type: New Feature
> Reporter: Zhiyuan Yang
> Assignee: Zhiyuan Yang
> Attachments: slides-prototype.pdf, YARN-broadcast-prototype.patch,
> YARNFileTransferService-prototype.pdf
>
>
> In Hadoop and related software, there is demand for broadcasting large
> files. For example, a YARN application may localize large jar files on each
> node; Hive may distribute large tables in fragment-replicate joins; Docker
> integration may broadcast large container images. The current solution,
> based on local resources, is to put the files on HDFS and let each node
> download them from HDFS, which is inefficient and does not scale. We
> therefore want to build a better file transfer service in YARN so that all
> applications can use it to broadcast large files efficiently.