It looks like the handling of the 'setting osimage' action hung in
anaconda.pm->mkinstall. Could you run the following steps to simulate a
nextdestiny request from a node and see how long it takes to finish?

   Select one node and run 'nodeset <node> shell; rpower <node> reset' to
   boot it into the genesis shell.
   Run 'chdef <node> currchain=osimage' to set the node's next state to
   'osimage'.
   Run 'time ssh <node> /bin/nextdestiny <xcatmaster>:3001' to see how long
   it takes to finish a nextdestiny request for that node.
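
The last two steps above could be wrapped in a small shell helper, roughly
as sketched below. This is only a sketch under some assumptions: the node is
already sitting in the genesis shell (the first step is done by hand), and
`run`, `time_nextdestiny`, and `DRY_RUN` are hypothetical names introduced
here for illustration, not part of xCAT. By default the helper only prints
the commands it would run.

```shell
#!/bin/sh
# Sketch only: run, time_nextdestiny, and DRY_RUN are hypothetical names,
# not part of xCAT. Assumes the node is already in the genesis shell.

# Print the command when DRY_RUN is 1 (the default); otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Usage: time_nextdestiny <node> <xcatmaster>
time_nextdestiny() {
    node=$1
    master=$2
    # Make 'osimage' the next state in the node's chain.
    run chdef "$node" currchain=osimage || return 1
    start=$(date +%s)
    # Issue the nextdestiny request from the node to the management node.
    run ssh "$node" /bin/nextdestiny "$master:3001"
    status=$?
    end=$(date +%s)
    echo "nextdestiny for $node took $((end - start))s (exit $status)"
    return "$status"
}
```

Calling `DRY_RUN=0 time_nextdestiny <node> <xcatmaster>` would run the
commands for real. Running it against several nodes in parallel might help
reproduce the load seen with 70 nodes, though that is untested here.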


Thanks
Best Regards
----------------------------------------------------------------------
 Wang Xiaopeng (王晓朋)
 IBM China System Technology Laboratory
 Tel: 86-10-82453455
 Email: w...@cn.ibm.com
 Address: 28,ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road,
Haidian District Beijing P.R.China 100193



From:   Russell Jones <russell-l...@jonesmail.me>
To:     xcat-user@lists.sourceforge.net,
Date:   2014/01/22 00:47
Subject:        Re: [xcat-user] Bad performance with 70 nodes requesting
            nextdestiny



For contrast, I just tested currstate=runimage and chain=shell. The
difference was night and day: the highest load average the management node
ever reached was 4.

There's something about having chain/currchain=osimage that is causing the
management node to be overloaded.


On 1/21/2014 10:24 AM, Russell Jones wrote:
      Sorry, I forgot to also mention that it takes a while for all of them
      to get past the Genesis DHCP message "Acquired IPv4 address...." before
      they actually run the script. When they do get past it and start
      running the script, they all seem to do so at about the same time.



      On 1/20/2014 8:56 PM, Xiao Peng Wang wrote:


            How long does it take to finish a 'nodeset <node> osimage'
            command?

            In the syslog, did you see all the nodes hang in nextdestiny
            for a while (10 minutes?) before entering the deployment
            process? Or did the nodes enter the deployment process one by
            one at a fixed interval (and if so, what was the interval)?


            Thanks
            Best Regards
            


            From: Russell Jones <russell-l...@jonesmail.me>
            To: xcat-user@lists.sourceforge.net,
            Date: 2014/01/21 10:36
            Subject: Re: [xcat-user] Bad performance with 70 nodes
            requesting nextdestiny





            There are just 2 tasks, a runimage and an osimage.

            chain.currchain=osimage, and
            chain.currstate=runimage=http://master/install/script.tgz.

            The image is available to be downloaded. A nodeset on a single
            node completed immediately.



            On 1/20/2014 7:57 PM, Xiao Peng Wang wrote:

                  How many tasks are in your chain list?
                  What are the values of chain.currchain and chain.currstate
                  for the nodes when you see a lot of processes hanging in
                  the 'nextdestiny' request?

                  One thing I can think of is that 'nextdestiny' will try to
                  check the HTTP download path of 'runimage'. Could you
                  check that the path is accessible from the xCAT MN? And
                  could you run 'nodeset <node> runimage=xxx' to see how long
                  it takes to finish?

                  Thanks
                  Best Regards
                  


                  From: Russell Jones <russell-l...@jonesmail.me>
                  To: xcat-user@lists.sourceforge.net,
                  Date: 2014/01/18 06:46
                  Subject: [xcat-user] Bad performance with 70 nodes
                  requesting nextdestiny


                  Hi all,

                  My management node is experiencing a load spike of over 60
                  when I boot 70 nodes at once and request a runimage on them.
                  Running "ps aux" shows high iowait and what appears to be a
                  perl process for nextdestiny running for each node. We had
                  this same issue under xCAT 2.3, and were under the hopeful
                  impression that performance had improved with xCAT 2.8 and
                  its nextdestiny behavior.

                  After about 10 minutes it seemed to have chewed through the
                  requests, and the nodes continued their respective chained
                  commands. Is there a way of improving how the management
                  node handles nextdestiny requests from tens of nodes at
                  once?


                  Thanks!

                  
------------------------------------------------------------------------------

                  CenturyLink Cloud: The Leader in Enterprise Cloud
                  Services.
                  Learn Why More Businesses Are Choosing CenturyLink Cloud
                  For
                  Critical Workloads, Development Environments & Everything
                  In Between.
                  Get a Quote or Start a Free Trial Today.
                  
http://pubads.g.doubleclick.net/gampad/clk?id=119420431&iu=/4140/ostg.clktrk

                  _______________________________________________
                  xCAT-user mailing list
                  xCAT-user@lists.sourceforge.net
                  https://lists.sourceforge.net/lists/listinfo/xcat-user




                  