[ https://issues.apache.org/jira/browse/YARN-8879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16652036#comment-16652036 ]
Wangda Tan commented on YARN-8879:
----------------------------------

+1, thanks [~sunilg], please go ahead and get it committed.

> Kerberos principal is needed when submitting a submarine job
> ------------------------------------------------------------
>
>                 Key: YARN-8879
>                 URL: https://issues.apache.org/jira/browse/YARN-8879
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Zac Zhou
>            Assignee: Zac Zhou
>            Priority: Critical
>         Attachments: YARN-8879.001.patch, YARN-8879.002.patch
>
>
> When I submitted a submarine job like this:
> {code:java}
> ./yarn jar /home/hadoop/hadoop-current/share/hadoop/yarn/hadoop-yarn-submarine-3.2.0-SNAPSHOT.jar job run \
>  --env DOCKER_JAVA_HOME=/opt/java \
>  --env DOCKER_HADOOP_HDFS_HOME=/hadoop-3.1.0 --name distributed-tf-gpu \
>  --env YARN_CONTAINER_RUNTIME_DOCKER_CONTAINER_NETWORK=calico-network \
>  --worker_docker_image 10.120.196.232:5000/gpu-cuda9.0-tf1.8.0-with-models-7 \
>  --input_path hdfs://mldev/tmp/cifar-10-data \
>  --checkpoint_path hdfs://mldev/user/hadoop/tf-distributed-checkpoint \
>  --num_ps 1 \
>  --ps_resources memory=4G,vcores=2,gpu=0 \
>  --ps_launch_cmd "python /test/cifar10_estimator/cifar10_main.py --data-dir=hdfs://mldev/tmp/cifar-10-data --job-dir=hdfs://mldev/tmp/cifar-10-jobdir --num-gpus=0" \
>  --ps_docker_image 10.120.196.232:5000/dockerfile-cpu-tf1.8.0-with-models \
>  --worker_resources memory=4G,vcores=2,gpu=1 --verbose \
>  --num_workers 2 \
>  --worker_launch_cmd "python /test/cifar10_estimator/cifar10_main.py --data-dir=hdfs://mldev/tmp/cifar-10-data --job-dir=hdfs://mldev/tmp/cifar-10-jobdir --train-steps=500 --eval-batch-size=16 --train-batch-size=16 --sync --num-gpus=1"
> {code}
>
> I got the following error:
> {code:java}
> Exception in thread "main" java.lang.IllegalArgumentException: Kerberos principal or keytab is missing.
>  at org.apache.hadoop.yarn.service.utils.ServiceApiUtil.validateKerberosPrincipal(ServiceApiUtil.java:255)
>  at org.apache.hadoop.yarn.service.utils.ServiceApiUtil.validateAndResolveService(ServiceApiUtil.java:134)
>  at org.apache.hadoop.yarn.service.client.ServiceClient.actionCreate(ServiceClient.java:467)
>  at org.apache.hadoop.yarn.submarine.runtimes.yarnservice.YarnServiceJobSubmitter.submitJob(YarnServiceJobSubmitter.java:542)
>  at org.apache.hadoop.yarn.submarine.client.cli.RunJobCli.run(RunJobCli.java:231)
>  at org.apache.hadoop.yarn.submarine.client.cli.Cli.main(Cli.java:94)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498)
>  at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
>  at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
> {code}