[ 
https://issues.apache.org/jira/browse/YARN-7129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16756585#comment-16756585
 ] 

Eric Yang commented on YARN-7129:
---------------------------------

[~billie.rinaldi] Thank you for the review.

{quote}* Service catalog might be a better name, since the catalog handles YARN 
services and not arbitrary applications. In that case I’d suggest moving the 
module to 
hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-catalog.{quote}

hadoop-yarn-services is the default implementation of the YARN Application 
Master.  I view the application catalog as an independent application that 
depends on hadoop-yarn-services.  Hence, I think it is more modular to keep the 
application catalog as a peer of hadoop-yarn-services, which avoids overloading 
the hadoop-yarn-services project.

{quote}* NOTICE says “binary distribution of this product bundles binaries of 
___” but it should be “source distribution bundles ___.” Also, this information 
should go in LICENSE instead of NOTICE for non-ASLv2 permissive licenses (see 
http://www.apache.org/dev/licensing-howto.html#permissive-deps). It would be 
helpful to include the path(s) to the dependencies, so it’s easier to tell when 
included source code has proper handling in LICENSE and NOTICE and when it 
doesn’t. I have not checked whether all the external code added in this patch 
has been included properly in L&N yet.{quote}

Most of the nodejs projects are pulled in only to run unit tests.  As with maven 
plugins, no NOTICE or LICENSE entries are included for build tools.  jQuery and 
AngularJS are not part of the source distribution, but they are integrated into 
the binary web application.  This is why the NOTICE file says that the binary 
distribution of this product bundles binaries of jQuery and AngularJS.

{quote}* There are files named bootstrap-hwx.js and bootstrap-hwx.min.js.{quote}

Will remove hwx from filename in the next patch.

{quote}* 8080 is a difficult port. If you ran the catalog on an 
Ambari-installed cluster, it would fail if it came up on the Ambari server 
host. Does the app catalog need to have net=host? If not, the port wouldn’t be 
a problem.{quote}

The app catalog does not require net=host.  In Hadoop 3.3, the 
YARN_CONTAINER_RUNTIME_DOCKER_PORTS_MAPPING flag can be used to expose the app 
catalog on an ephemeral port.

{quote}* I don’t think we should set net=host for the examples, because that 
setting is not recommended for most containers. Also, for the services using 
8080, they could run into conflicts when net=host.{quote}

Will change to use the YARN_CONTAINER_RUNTIME_DOCKER_PORTS_MAPPING flag in the 
next patch.

{quote}* I think it would be better to have fewer examples in samples.xml and 
make sure that they all work (the pcjs image doesn’t exist on dockerhub, and we 
might not want to seem to be “endorsing” all the example images that are 
currently included). Maybe just include httpd and Jenkins? Registry would also 
be a reasonable candidate, but it didn’t work when I tried it with insecure 
limit users mode (it failed to make a dir when pushing an image to it; maybe 
would work as a privileged container with image name 
library/registry:latest).{quote}

I will keep Jenkins, httpd, and Docker Registry as example applications.  A 
private cloud needs a private Docker Registry, and this appears to be the 
easiest way to deploy one.

{quote}* entrypoint.sh needs modifications to make insecure mode possible 
(maybe checking for empty KEYTAB variable would be sufficient, since -e returns 
true for an empty variable).{quote}

Will include this change in the next patch.
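
A minimal sketch of the check suggested above, not the actual entrypoint.sh 
code.  KEYTAB is the variable named in the review; the auth_mode function name 
and the "secure"/"insecure" labels are hypothetical and only for illustration.

```shell
#!/bin/sh
# Hypothetical helper for entrypoint.sh: decide whether a Kerberos login
# should be attempted. Only the KEYTAB variable comes from the review;
# everything else here is an illustrative assumption.
auth_mode() {
  # -n rejects an unset or empty KEYTAB before -e is consulted.
  # Unquoted, `[ -e $KEYTAB ]` with an empty KEYTAB collapses to `[ -e ]`,
  # which only tests that the string "-e" is non-empty and so succeeds.
  if [ -n "${KEYTAB:-}" ] && [ -e "${KEYTAB}" ]; then
    echo "secure"
  else
    echo "insecure"
  fi
}

echo "auth mode: $(auth_mode)"
```

With this shape, an empty or missing KEYTAB falls through to the insecure path 
instead of attempting a login with a keytab that does not exist.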

{quote}* Need better defaults for memory requests. When I tested, the catalog 
and Jenkins containers were killed due to OOM. 2048 for the catalog and 1024 
for Jenkins worked for me.{quote}

Will include this change in the next patch.

{quote}* I had to set JAVA_HOME in the catalog service json because the 
container and host didn’t have the same JAVA_HOME.
* It would be good to include the service json needed to launch the catalog. 
We’d need to make the catalog Docker image version a variable for maven to fill 
in during the build process. Maybe the catalog could be one of the service 
examples, like sleeper and httpd.{quote}

Will include the service json in the example jar file for easy deployment.

{quote}* Downloading from archive.apache.org is discouraged. Is there anything 
else we can do instead for the solr download?{quote}

The Solr download is problematic because Apache mirrors only keep the most 
recent releases.  I am open to suggestions other than downloading from 
archive.apache.org.  I don't know of another good way to get a fixed version of 
the Solr tarball at this time.

{quote}* Applications launched by the catalog are run as the root user 
(presumably because the catalog is a privileged container); at least that’s 
what is happening in insecure mode. The catalog should have user auth and 
launch the apps as the end user. I see there are already tickets open to 
address this issue.{quote}

A privileged container is not required.  Sorry, it was a mistake in my service 
json.  This will be corrected in the example.

{quote}* We need to work out persistent storage for some of these services 
(including the catalog), or users will get a bad surprise when services or 
individual containers are restarted.{quote}

Solr data can be stored on HDFS by passing proper environment variables.  I 
will open a JIRA to enhance this.

{quote}* Restart doesn’t seem to work. Looks like it is issuing a start, so 
maybe should be named start instead. Restart implies stopping + starting.{quote}

Will correct this in the next patch.

{quote}* Could we use AppAdminClient/ApiServiceClient instead of copying the 
getApiClient methods from ApiServiceClient to make the REST calls?{quote}

Good point, this will be updated in the next patch.

{quote}* After deploying a service, it drops to the bottom of the list on the 
main catalog page. Is that intentional?{quote}

Without a search term, recommended applications are displayed in random order.  
I will add a subtask to improve the ranking order based on the number of 
deployments.

{quote}* If you click on a UI link too early, it brings up a URL such as 
jenkins.${service_name}.${user}.${domain}:8080. It would be better to disallow 
clicking on the link until the variables are populated, if possible.{quote}

Will include the fix in the next patch.

> Application Catalog for YARN applications
> -----------------------------------------
>
>                 Key: YARN-7129
>                 URL: https://issues.apache.org/jira/browse/YARN-7129
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: applications
>            Reporter: Eric Yang
>            Assignee: Eric Yang
>            Priority: Major
>         Attachments: YARN Appstore.pdf, YARN-7129.001.patch, 
> YARN-7129.002.patch, YARN-7129.003.patch, YARN-7129.004.patch, 
> YARN-7129.005.patch, YARN-7129.006.patch, YARN-7129.007.patch, 
> YARN-7129.008.patch, YARN-7129.009.patch, YARN-7129.010.patch, 
> YARN-7129.011.patch, YARN-7129.012.patch, YARN-7129.013.patch, 
> YARN-7129.014.patch, YARN-7129.015.patch, YARN-7129.016.patch, 
> YARN-7129.017.patch, YARN-7129.018.patch
>
>
> YARN native services provides a web services API to improve the usability of 
> application deployment on Hadoop using a collection of docker images.  It would 
> be nice to have an application catalog system which provides an editorial and 
> search interface for YARN applications.  This improves the usability of YARN 
> for managing the life cycle of applications.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

