[ https://issues.apache.org/jira/browse/WHIRR-168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12993982#comment-12993982 ]

Tibor Kiss commented on WHIRR-168:
----------------------------------

In the case of Hadoop, instead of a single hadoop-site.xml we need separate 
core-site.xml, mapred-site.xml and hdfs-site.xml files. As I mentioned in my 
last comment, the hadoop command-line console complains that hadoop-site.xml 
is deprecated.

That is why we should reuse the current HadoopConfigurationBuilder, making 
only small changes so that it can also return the Configuration objects 
themselves:

{code}
--- a/services/hadoop/src/main/java/org/apache/whirr/service/hadoop/HadoopConfigurationBuilder.java
+++ b/services/hadoop/src/main/java/org/apache/whirr/service/hadoop/HadoopConfigurationBuilder.java
@@ -51,23 +51,44 @@ public class HadoopConfigurationBuilder {
 
   public static Statement buildCommon(String path, ClusterSpec clusterSpec,
       Cluster cluster) throws ConfigurationException, IOException {
-    Configuration config = buildCommonConfiguration(clusterSpec, cluster,
-        new PropertiesConfiguration(WHIRR_HADOOP_DEFAULT_PROPERTIES));
-    return HadoopConfigurationConverter.asCreateFileStatement(path, config);
+    return HadoopConfigurationConverter.asCreateFileStatement(path, 
+        buildCommonConfiguration(clusterSpec, cluster));
   }
   
   public static Statement buildHdfs(String path, ClusterSpec clusterSpec,
       Cluster cluster) throws ConfigurationException, IOException {
-    Configuration config = buildHdfsConfiguration(clusterSpec, cluster,
-        new PropertiesConfiguration(WHIRR_HADOOP_DEFAULT_PROPERTIES));
-    return HadoopConfigurationConverter.asCreateFileStatement(path, config);
+    return HadoopConfigurationConverter.asCreateFileStatement(path, 
+        buildHdfsConfiguration(clusterSpec, cluster));
   }
   
   public static Statement buildMapReduce(String path, ClusterSpec clusterSpec,
       Cluster cluster) throws ConfigurationException, IOException {
-    Configuration config = buildMapReduceConfiguration(clusterSpec, cluster,
+    return HadoopConfigurationConverter.asCreateFileStatement(path, 
+        buildMapReduceConfiguration(clusterSpec, cluster));
+  }
+  
+  public static Configuration buildCommonConfiguration(ClusterSpec clusterSpec,
+      Cluster cluster) throws ConfigurationException, IOException {
+    return buildCommonConfiguration(clusterSpec, cluster,
+        new PropertiesConfiguration(WHIRR_HADOOP_DEFAULT_PROPERTIES));
+  }
+
+  public static Configuration buildHdfsConfiguration(ClusterSpec clusterSpec,
+      Cluster cluster) throws ConfigurationException, IOException {
+    return buildHdfsConfiguration(clusterSpec, cluster,
+        new PropertiesConfiguration(WHIRR_HADOOP_DEFAULT_PROPERTIES));
+  }
+  
+  public static Configuration buildMapReduceConfiguration(ClusterSpec clusterSpec,
+      Cluster cluster) throws ConfigurationException, IOException {
+    return buildMapReduceConfiguration(clusterSpec, cluster,
+        new PropertiesConfiguration(WHIRR_HADOOP_DEFAULT_PROPERTIES));
+  }
+  
+  public static Configuration buildClientConfiguration(ClusterSpec clusterSpec,
+      Cluster cluster) throws ConfigurationException, IOException {
+    return buildClientConfiguration(clusterSpec, cluster,
         new PropertiesConfiguration(WHIRR_HADOOP_DEFAULT_PROPERTIES));
-    return HadoopConfigurationConverter.asCreateFileStatement(path, config);
   }
   
   @VisibleForTesting
@@ -102,4 +123,9 @@ public class HadoopConfigurationBuilder {
     return config;
   }
 
+  @VisibleForTesting
+  static Configuration buildClientConfiguration(ClusterSpec clusterSpec,
+      Cluster cluster, Configuration defaults) throws ConfigurationException {
+    return build(clusterSpec, cluster, defaults, "hadoop-client");
+  }
 }
{code}

Then, in HadoopNameNodeClusterActionHandler#afterConfigure, we can access them:

{code}
Configuration coreSiteConf = buildCommonConfiguration(clusterSpec, cluster);
Configuration hdfsSiteConf = buildHdfsConfiguration(clusterSpec, cluster);
Configuration mapredSiteConf = buildMapReduceConfiguration(clusterSpec, cluster);
Configuration clientSiteConf = buildClientConfiguration(clusterSpec, cluster);
{code}

Then clientSiteConf has to be combined with coreSiteConf into a composite 
configuration. clientSiteConf is similar to the 
HBaseClusterActionHandler#getConfiguration() composition, but in our case 
clientSiteConf also has to be composed with coreSiteConf, so we end up with a 
core-site.xml that contains everything we have on the cluster instances plus 
whatever we only need on the client side.
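In the actual patch this composition would presumably be done with Commons Configuration's CompositeConfiguration (where the configuration added first takes precedence on lookup). The following self-contained sketch instead uses plain java.util.Properties as a stand-in, purely to illustrate the intended merge semantics; the class name and the compose helper are hypothetical.

{code}
import java.util.Properties;

// Stand-in sketch: client-side values override the cluster-wide
// core-site values, and all other cluster values pass through.
public class ClientConfComposition {

    // Hypothetical merge helper: keys from 'client' win over keys from 'core'.
    static Properties compose(Properties core, Properties client) {
        Properties merged = new Properties();
        merged.putAll(core);    // cluster-wide core-site values first
        merged.putAll(client);  // client-only overrides second
        return merged;
    }

    public static void main(String[] args) {
        Properties coreSiteConf = new Properties();
        coreSiteConf.setProperty("fs.default.name", "hdfs://namenode:8020");
        coreSiteConf.setProperty("hadoop.tmp.dir", "/data/tmp");

        Properties clientSiteConf = new Properties();
        clientSiteConf.setProperty("hadoop.socks.server", "localhost:6666");

        Properties merged = compose(coreSiteConf, clientSiteConf);
        // The client-side core-site.xml ends up with the cluster values
        // plus the client-only additions.
        System.out.println(merged.getProperty("fs.default.name"));
        System.out.println(merged.getProperty("hadoop.socks.server"));
    }
}
{code}

With CompositeConfiguration the equivalent would be to addConfiguration(clientSiteConf) before addConfiguration(coreSiteConf), since the first-added configuration has the highest lookup priority.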

The open question is that hdfs-site.xml and mapred-site.xml will be the same 
as on the cluster instances, so with this change the client side will probably 
have every property the cluster instances have, plus some more. Is this a good 
approach?
Currently the client side has only a few values, so this approach will 
slightly increase the number of properties. Is that a problem?

> Add a new optional c parameter for being able to configure the port of socks 
> connection.
> ----------------------------------------------------------------------------------------
>
>                 Key: WHIRR-168
>                 URL: https://issues.apache.org/jira/browse/WHIRR-168
>             Project: Whirr
>          Issue Type: New Feature
>          Components: core, service/hadoop
>         Environment: ec2
>            Reporter: Tibor Kiss
>            Assignee: Tibor Kiss
>            Priority: Minor
>         Attachments: local-socks-proxy-address.patch
>
>
> We have a generated .whirr/<hadoop-cluster-name>/hadoop-proxy.sh which 
> contains a hard coded port value, the 6666.
> In order to be able to start multiple clusters from the same console I needed 
> a simple mechanism to be able to parametrize this port number.
> Therefore I made a patch which adds the possibility to set this 
> 'whirr.local-socks-proxy-address' to something like
> whirr.local-socks-proxy-address=localhost:6666
> Instead of configuring the port, we are able to configure the address which 
> contains the port.
> (also for the sourcecode, it looks much better to not have such a hardcoded 
> value.)
> In order to run multiple clusters you only need to override this parameter 
> knowing that the default value is localhost:6666
