Author: phunt Date: Tue Dec 9 16:11:48 2008 New Revision: 724936 URL: http://svn.apache.org/viewvc?rev=724936&view=rev Log: ZOOKEEPER-161. Content needed: "Designing a ZooKeeper Deployment"
Modified: hadoop/zookeeper/trunk/CHANGES.txt hadoop/zookeeper/trunk/docs/zookeeperAdmin.html hadoop/zookeeper/trunk/docs/zookeeperAdmin.pdf hadoop/zookeeper/trunk/src/docs/src/documentation/content/xdocs/zookeeperAdmin.xml Modified: hadoop/zookeeper/trunk/CHANGES.txt URL: http://svn.apache.org/viewvc/hadoop/zookeeper/trunk/CHANGES.txt?rev=724936&r1=724935&r2=724936&view=diff ============================================================================== --- hadoop/zookeeper/trunk/CHANGES.txt (original) +++ hadoop/zookeeper/trunk/CHANGES.txt Tue Dec 9 16:11:48 2008 @@ -5,58 +5,62 @@ Backward compatibile changes: BUGFIXES: - ZOOKEEPER-211 Not all Mock tests are working (ben via phunt) + ZOOKEEPER-211. Not all Mock tests are working (ben via phunt) - ZOOKEEPER-223. change default level in root logger to INFO. - (pat via mahadev) + ZOOKEEPER-223. change default level in root logger to INFO. + (pat via mahadev) - ZOOKEEPER-212. fix the snapshot to be asynchronous. (mahadev and ben) + ZOOKEEPER-212. fix the snapshot to be asynchronous. (mahadev and ben) - ZOOKEEPER-213. fix programmer guide C api docs to be in sync with latest - zookeeper.h (pat via mahadev) + ZOOKEEPER-213. fix programmer guide C api docs to be in sync with latest + zookeeper.h (pat via mahadev) - ZOOKEEPER-219. fix events.poll timeout in watcher test to be longer. - (pat via mahadev) + ZOOKEEPER-219. fix events.poll timeout in watcher test to be longer. + (pat via mahadev) - ZOOKEEPER-217. Fix errors in config to be thrown as Exceptions. (mahadev) + ZOOKEEPER-217. Fix errors in config to be thrown as Exceptions. (mahadev) - ZOOKEEPER-228. fix apache header missing in DBTest. (mahadev) + ZOOKEEPER-228. fix apache header missing in DBTest. (mahadev) - ZOOKEEPER-218. fix the error in the barrier example code. (pat via mahadev) + ZOOKEEPER-218. fix the error in the barrier example code. (pat via mahadev) - ZOOKEEPER-206. documentation tab should contain the version number and - other small site changes. 
(pat via mahadev) + ZOOKEEPER-206. documentation tab should contain the version number and + other small site changes. (pat via mahadev) - ZOOKEEPER-226. fix exists calls that fail on server if node has null data. - (mahadev) + ZOOKEEPER-226. fix exists calls that fail on server if node has null data. + (mahadev) - ZOOKEEPER-204. SetWatches needs to be the first message after auth messages -to the server (ben via mahadev) + ZOOKEEPER-204. SetWatches needs to be the first message after auth + messages to the server (ben via mahadev) - ZOOKEEPER-208. Zookeeper C client uses API that are not thread safe, -causing crashes when multiple instances are active. (austin shoemaker, chris -daroch and ben reed via mahadev) + ZOOKEEPER-208. Zookeeper C client uses API that are not thread safe, + causing crashes when multiple instances are active. + (austin shoemaker, chris daroch and ben reed via mahadev) - ZOOKEEPER-227. gcc warning from recordio.h (chris darroch via mahadev) + ZOOKEEPER-227. gcc warning from recordio.h (chris darroch via mahadev) - ZOOKEEPER-232. fix apache licence header in TestableZookeeper (mahadev) + ZOOKEEPER-232. fix apache licence header in TestableZookeeper (mahadev) - ZOOKEEPER-249. QuorumPeer.getClientPort() always returns -1. (nitay -joffe via mahadev) + ZOOKEEPER-249. QuorumPeer.getClientPort() always returns -1. + (nitay joffe via mahadev) - ZOOKEEPER-248. QuorumPeer should use Map interface instead of -HashMap implementation. (nitay joffe via mahadev) + ZOOKEEPER-248. QuorumPeer should use Map interface instead of HashMap + implementation. (nitay joffe via mahadev) - ZOOKEEPER-241. Build of a distro fails after clean target is run. (patrick -hunt via mahadev) + ZOOKEEPER-241. Build of a distro fails after clean target is run. + (patrick hunt via mahadev) IMPROVEMENTS: - ZOOKEEPER-64. Log system env information when initializing server and -client (pat via mahadev) + ZOOKEEPER-161. 
Content needed: "Designing a ZooKeeper Deployment" + (breed via phunt) + + ZOOKEEPER-64. Log system env information when initializing server and + client (pat via mahadev) + + ZOOKEEPER-243. add SEQUENCE flag documentation to the programming guide. + (patrick hunt via mahadev) - ZOOKEEPER-243. add SEQUENCE flag documentation to the programming guide. -(patrick hunt via mahadev) Release 3.0.0 - 2008-10-21 Modified: hadoop/zookeeper/trunk/docs/zookeeperAdmin.html URL: http://svn.apache.org/viewvc/hadoop/zookeeper/trunk/docs/zookeeperAdmin.html?rev=724936&r1=724935&r2=724936&view=diff ============================================================================== --- hadoop/zookeeper/trunk/docs/zookeeperAdmin.html (original) +++ hadoop/zookeeper/trunk/docs/zookeeperAdmin.html Tue Dec 9 16:11:48 2008 @@ -201,6 +201,14 @@ <ul class="minitoc"> <li> <a href="#sc_designing">Designing a ZooKeeper Deployment</a> +<ul class="minitoc"> +<li> +<a href="#sc_CrossMachineRequirements">Cross Machine Requirements</a> +</li> +<li> +<a href="#Single+Machine+Requirements">Single Machine Requirements</a> +</li> +</ul> </li> <li> <a href="#sc_provisioning">Provisioning</a> @@ -621,20 +629,103 @@ </ul> <a name="N10160"></a><a name="sc_designing"></a> <h3 class="h4">Designing a ZooKeeper Deployment</h3> -<p></p> -<a name="N10169"></a><a name="sc_provisioning"></a> +<p>The reliablity of ZooKeeper rests on two basic assumptions.</p> +<ol> + +<li> +<p> Only a minority of servers in a deployment + will fail. <em>Failure</em> in this context + means a machine crash, or some error in the network that + partitions a server off from the majority.</p> + +</li> + +<li> +<p> Deployed machines operate correctly. 
To + operate correctly means to execute code correctly, to have + clocks that work properly, and to have storage and network + components that perform consistently.</p> + +</li> + +</ol> +<p>The sections below contain considerations for ZooKeeper + administrators to maximize the probability for these assumptions + to hold true. Some of these are cross-machines considerations, + and others are things you should consider for each and every + machine in your deployment.</p> +<a name="N1017C"></a><a name="sc_CrossMachineRequirements"></a> +<h4>Cross Machine Requirements</h4> +<p>For the ZooKeeper service to be active, there must be a + majority of non-failing machines that can communicate with + each other. To create a deployment that can tolerate the + failure of F machines, you should count on deploying 2xF+1 + machines. Thus, a deployment that consists of three machines + can handle one failure, and a deployment of five machines can + handle two failures. Note that a deployment of six machines + can only handle two failures since three machines is not a + majority. For this reason, ZooKeeper deployments are usually + made up of an odd number of machines.</p> +<p>To achieve the highest probability of tolerating a failure + you should try to make machine failures independent. For + example, if most of the machines share the same switch, + failure of that switch could cause a correlated failure and + bring down the service. The same holds true of shared power + circuits, cooling systems, etc.</p> +<a name="N10189"></a><a name="Single+Machine+Requirements"></a> +<h4>Single Machine Requirements</h4> +<p>If ZooKeeper has to contend with other applications for + access to resourses like storage media, CPU, network, or + memory, its performance will suffer markedly. ZooKeeper has + strong durability guarantees, which means it uses storage + media to log changes before the operation responsible for the + change is allowed to complete. 
You should be aware of this + dependency then, and take great care if you want to ensure + that ZooKeeper operations aren’t held up by your media. Here + are some things you can do to minimize that sort of + degradation: + </p> +<ul> + +<li> + +<p>ZooKeeper's transaction log must be on a dedicated + device. (A dedicated partition is not enough.) ZooKeeper + writes the log sequentially, without seeking Sharing your + log device with other processes can cause seeks and + contention, which in turn can cause multi-second + delays.</p> + +</li> + + +<li> + +<p>Do not put ZooKeeper in a situation that can cause a + swap. In order for ZooKeeper to function with any sort of + timeliness, it simply cannot be allowed to swap. + Therefore, make certain that the maximum heap size given + to ZooKeeper is not bigger than the amount of real memory + available to ZooKeeper. For more on this, see + <a href="#sc_commonProblems">Things to Avoid</a> + below. </p> + +</li> + +</ul> +<a name="N101A7"></a><a name="sc_provisioning"></a> <h3 class="h4">Provisioning</h3> <p></p> -<a name="N10172"></a><a name="sc_strengthsAndLimitations"></a> +<a name="N101B0"></a><a name="sc_strengthsAndLimitations"></a> <h3 class="h4">Things to Consider: ZooKeeper Strengths and Limitations</h3> <p></p> -<a name="N1017B"></a><a name="sc_administering"></a> +<a name="N101B9"></a><a name="sc_administering"></a> <h3 class="h4">Administering</h3> <p></p> -<a name="N10184"></a><a name="sc_monitoring"></a> +<a name="N101C2"></a><a name="sc_monitoring"></a> <h3 class="h4">Monitoring</h3> <p></p> -<a name="N1018D"></a><a name="sc_logging"></a> +<a name="N101CB"></a><a name="sc_logging"></a> <h3 class="h4">Logging</h3> <p>ZooKeeper uses <strong>log4j</strong> version 1.2 as its logging infrastructure. 
The ZooKeeper default <span class="codefrag filename">log4j.properties</span> @@ -644,10 +735,10 @@ <p>For more information, see <a href="http://logging.apache.org/log4j/1.2/manual.html#defaultInit">Log4j Default Initialization Procedure</a> of the log4j manual.</p> -<a name="N101AD"></a><a name="sc_troubleshooting"></a> +<a name="N101EB"></a><a name="sc_troubleshooting"></a> <h3 class="h4">Troubleshooting</h3> <p></p> -<a name="N101B6"></a><a name="sc_configuration"></a> +<a name="N101F4"></a><a name="sc_configuration"></a> <h3 class="h4">Configuration Parameters</h3> <p>ZooKeeper's behavior is governed by the ZooKeeper configuration file. This file is designed so that the exact same file can be used by @@ -655,7 +746,7 @@ layouts are the same. If servers use different configuration files, care must be taken to ensure that the list of servers in all of the different configuration files match.</p> -<a name="N101BF"></a><a name="sc_minimumConfiguration"></a> +<a name="N101FD"></a><a name="sc_minimumConfiguration"></a> <h4>Minimum Configuration</h4> <p>Here are the minimum configuration keywords that must be defined in the configuration file:</p> @@ -702,7 +793,7 @@ </dd> </dl> -<a name="N101E6"></a><a name="sc_advancedConfiguration"></a> +<a name="N10224"></a><a name="sc_advancedConfiguration"></a> <h4>Advanced Configuration</h4> <p>The configuration settings in the section are optional. You can use them to further fine tune the behaviour of your ZooKeeper servers. 
@@ -793,7 +884,7 @@ </dd> </dl> -<a name="N10246"></a><a name="sc_clusterOptions"></a> +<a name="N10284"></a><a name="sc_clusterOptions"></a> <h4>Cluster Options</h4> <p>The options in this section are designed for use with an ensemble of servers -- that is, when deploying clusters of servers.</p> @@ -883,7 +974,7 @@ </dl> <p></p> -<a name="N102A3"></a><a name="Unsafe+Options"></a> +<a name="N102E1"></a><a name="Unsafe+Options"></a> <h4>Unsafe Options</h4> <p>The following options can be useful, but be careful when you use them. The risk of each is explained along with the explanation of what @@ -928,7 +1019,7 @@ </dd> </dl> -<a name="N102D5"></a><a name="sc_zkCommands"></a> +<a name="N10313"></a><a name="sc_zkCommands"></a> <h3 class="h4">ZooKeeper Commands: The Four Letter Words</h3> <p>ZooKeeper responds to a small set of commands. Each command is composed of four letters. You issue the commands to ZooKeeper via telnet @@ -993,7 +1084,7 @@ <pre class="code">$ echo ruok | nc 127.0.0.1 5111 imok </pre> -<a name="N10315"></a><a name="sc_dataFileManagement"></a> +<a name="N10353"></a><a name="sc_dataFileManagement"></a> <h3 class="h4">Data File Management</h3> <p>ZooKeeper stores its data in a data directory and its transaction log in a transaction log directory. By default these two directories are @@ -1001,7 +1092,7 @@ transaction log files in a separate directory than the data files. Throughput increases and latency decreases when transaction logs reside on a dedicated log devices.</p> -<a name="N1031E"></a><a name="The+Data+Directory"></a> +<a name="N1035C"></a><a name="The+Data+Directory"></a> <h4>The Data Directory</h4> <p>This directory has two files in it:</p> <ul> @@ -1047,14 +1138,14 @@ idempotent nature of its updates. 
By replaying the transaction log against fuzzy snapshots ZooKeeper gets the state of the system at the end of the log.</p> -<a name="N1035A"></a><a name="The+Log+Directory"></a> +<a name="N10398"></a><a name="The+Log+Directory"></a> <h4>The Log Directory</h4> <p>The Log Directory contains the ZooKeeper transaction logs. Before any update takes place, ZooKeeper ensures that the transaction that represents the update is written to non-volatile storage. A new log file is started each time a snapshot is begun. The log file's suffix is the first zxid written to that log.</p> -<a name="N10364"></a><a name="File+Management"></a> +<a name="N103A2"></a><a name="File+Management"></a> <h4>File Management</h4> <p>The format of snapshot and log files does not change between standalone ZooKeeper servers and different configurations of @@ -1071,7 +1162,7 @@ needs the latest complete fuzzy snapshot and the log files from the start of that snapshot. The PurgeTxnLog utility implements a simple retention policy that administrators can use.</p> -<a name="N10375"></a><a name="sc_commonProblems"></a> +<a name="N103B3"></a><a name="sc_commonProblems"></a> <h3 class="h4">Things to Avoid</h3> <p>Here are some common problems you can avoid by configuring ZooKeeper correctly:</p> @@ -1125,7 +1216,7 @@ </dd> </dl> -<a name="N10399"></a><a name="sc_bestPractices"></a> +<a name="N103D7"></a><a name="sc_bestPractices"></a> <h3 class="h4">Best Practices</h3> <p>For best results, take note of the following list of good Zookeeper practices. <em>[tbd...]</em> Modified: hadoop/zookeeper/trunk/docs/zookeeperAdmin.pdf URL: http://svn.apache.org/viewvc/hadoop/zookeeper/trunk/docs/zookeeperAdmin.pdf?rev=724936&r1=724935&r2=724936&view=diff ============================================================================== Binary files - no diff available. 
Modified: hadoop/zookeeper/trunk/src/docs/src/documentation/content/xdocs/zookeeperAdmin.xml URL: http://svn.apache.org/viewvc/hadoop/zookeeper/trunk/src/docs/src/documentation/content/xdocs/zookeeperAdmin.xml?rev=724936&r1=724935&r2=724936&view=diff ============================================================================== --- hadoop/zookeeper/trunk/src/docs/src/documentation/content/xdocs/zookeeperAdmin.xml (original) +++ hadoop/zookeeper/trunk/src/docs/src/documentation/content/xdocs/zookeeperAdmin.xml Tue Dec 9 16:11:48 2008 @@ -282,7 +282,85 @@ <section id="sc_designing"> <title>Designing a ZooKeeper Deployment</title> - <para></para> + <para>The reliablity of ZooKeeper rests on two basic assumptions.</para> + <orderedlist> + <listitem><para> Only a minority of servers in a deployment + will fail. <emphasis>Failure</emphasis> in this context + means a machine crash, or some error in the network that + partitions a server off from the majority.</para> + </listitem> + <listitem><para> Deployed machines operate correctly. To + operate correctly means to execute code correctly, to have + clocks that work properly, and to have storage and network + components that perform consistently.</para> + </listitem> + </orderedlist> + + <para>The sections below contain considerations for ZooKeeper + administrators to maximize the probability for these assumptions + to hold true. Some of these are cross-machines considerations, + and others are things you should consider for each and every + machine in your deployment.</para> + + <section id="sc_CrossMachineRequirements"> + <title>Cross Machine Requirements</title> + + <para>For the ZooKeeper service to be active, there must be a + majority of non-failing machines that can communicate with + each other. To create a deployment that can tolerate the + failure of F machines, you should count on deploying 2xF+1 + machines. 
Thus, a deployment that consists of three machines + can handle one failure, and a deployment of five machines can + handle two failures. Note that a deployment of six machines + can only handle two failures since three machines is not a + majority. For this reason, ZooKeeper deployments are usually + made up of an odd number of machines.</para> + + <para>To achieve the highest probability of tolerating a failure + you should try to make machine failures independent. For + example, if most of the machines share the same switch, + failure of that switch could cause a correlated failure and + bring down the service. The same holds true of shared power + circuits, cooling systems, etc.</para> + </section> + + <section> + <title>Single Machine Requirements</title> + + <para>If ZooKeeper has to contend with other applications for + access to resourses like storage media, CPU, network, or + memory, its performance will suffer markedly. ZooKeeper has + strong durability guarantees, which means it uses storage + media to log changes before the operation responsible for the + change is allowed to complete. You should be aware of this + dependency then, and take great care if you want to ensure + that ZooKeeper operations aren’t held up by your media. Here + are some things you can do to minimize that sort of + degradation: + </para> + + <itemizedlist> + <listitem> + <para>ZooKeeper's transaction log must be on a dedicated + device. (A dedicated partition is not enough.) ZooKeeper + writes the log sequentially, without seeking Sharing your + log device with other processes can cause seeks and + contention, which in turn can cause multi-second + delays.</para> + </listitem> + + <listitem> + <para>Do not put ZooKeeper in a situation that can cause a + swap. In order for ZooKeeper to function with any sort of + timeliness, it simply cannot be allowed to swap. 
+ Therefore, make certain that the maximum heap size given + to ZooKeeper is not bigger than the amount of real memory + available to ZooKeeper. For more on this, see + <xref linkend="sc_commonProblems"/> + below. </para> + </listitem> + </itemizedlist> + </section> </section> <section id="sc_provisioning">
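Note on the sizing rule in the "Cross Machine Requirements" text added by this revision: the 2xF+1 figure follows from requiring a strict majority of servers to stay up. The sketch below is only an illustration of that arithmetic. It is not part of the commit or of ZooKeeper, and the function names (`majority`, `tolerated_failures`, `servers_needed`) are hypothetical, chosen for this example.

```python
def majority(ensemble_size: int) -> int:
    """Smallest number of servers that is a strict majority of the ensemble."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Number of servers that can fail while a strict majority is still running."""
    return ensemble_size - majority(ensemble_size)


def servers_needed(failures: int) -> int:
    """Ensemble size required to tolerate the given number of failures (2xF+1)."""
    return 2 * failures + 1


if __name__ == "__main__":
    # Ensemble sizes mentioned in the documentation text: three, five, and six.
    for size in (3, 5, 6):
        print(size, "servers: majority", majority(size),
              "tolerated failures", tolerated_failures(size))
```

Under these definitions a six-server ensemble tolerates the same number of failures as a five-server one, which matches the documentation's reason for preferring an odd number of machines.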