On 12/21/2012 10:45 AM, Gilad Berman wrote:
> Hello All,
> 
> I would like your advice on how we can improve boot performance. For
> example, when we boot 84 nodes at once (Statelite), some of them (~2)
> usually do not boot (it looks like they are not getting a DHCP request,
> but I'm not completely sure) and we have to reboot them.
> When booting 168 from the same node, the number goes up to ~5-7 nodes.
> While this might be acceptable, we would still like to improve it.
> 
> I know the first thing that comes to mind is the network. We have 4-port
> LACP bonding; however, only two ports are actually active on send for
> some reason: we see two links fully loaded and two links that do not
> send anything. On receive, the bond behaves as expected.
> 
> I know the LACP issue has nothing to do with xCAT, but I'm sure people
> on the list have optimized boot performance before.

An LACP bond chooses the outgoing link for each frame by hashing fields
from the frame's headers. I would guess that the default hash algorithm
is not appropriate for your traffic pattern, so it is not balancing well
across the links. It's typical to get only a 70/30 split even on a
"well-tuned" aggregated link; a true 50/50 split is very rare.
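
To illustrate how a hash can leave links idle, here is a minimal sketch
in Python. It assumes a simplified layer-2 style hash (XOR of the last
byte of the source and destination MAC, modulo the number of links); the
hash your bond or switch actually uses may differ. The MAC addresses and
the pick_link/link_usage helpers are hypothetical, made up for the
example, not taken from your cluster.

```python
from collections import Counter


def pick_link(src_mac: str, dst_mac: str, n_links: int) -> int:
    """Simplified layer-2 style hash (an assumption, not a vendor's exact
    formula): XOR the last byte of each MAC, then take it modulo n_links."""
    src_last = int(src_mac.split(":")[-1], 16)
    dst_last = int(dst_mac.split(":")[-1], 16)
    return (src_last ^ dst_last) % n_links


def link_usage(src_mac: str, dst_macs: list, n_links: int) -> list:
    """Count how many destinations land on each link of the bond."""
    counts = Counter(pick_link(src_mac, d, n_links) for d in dst_macs)
    return [counts.get(i, 0) for i in range(n_links)]


# Hypothetical addresses, for illustration only.
SERVER_MAC = "00:1a:64:00:00:10"

# 84 nodes whose last MAC byte steps by 2 (e.g. the first port of
# dual-port NICs with consecutively assigned addresses).
even_nodes = [f"00:1a:64:00:01:{(2 * i) % 256:02x}" for i in range(84)]

# 84 nodes whose last MAC byte is strictly consecutive.
seq_nodes = [f"00:1a:64:00:02:{i:02x}" for i in range(84)]

print(link_usage(SERVER_MAC, even_nodes, 4))  # [42, 0, 42, 0]
print(link_usage(SERVER_MAC, seq_nodes, 4))   # [21, 21, 21, 21]
```

With the stride-2 addresses the hash only ever produces two of the four
possible values, so two links carry everything and two stay idle on
send, which resembles what you describe. If something like this is
happening, a hash that also mixes in IP addresses or ports may spread
the traffic better.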

> 
> Any advice is appreciated, and happy Christmas :)
> 
> Regards,
> 
> Gilad Berman
> HPC Architect
> IBM System & Technology Group. Israel
> 
> E-mail: [email protected]
> Tel:    972-3-9188262
> Mobile: 972-52-2554262
> 
> The information contained in this email is being provided by IBM as a
> matter of courtesy and provided "AS-IS" without any direct and implied
> warranty; IBM assumes no liability. It is your responsibility to ensure
> that any resulting customer proposal has been correctly designed to meet
> your clients' requirements and to have an active review process which
> ensures an appropriate level of solution assurance is performed for all
> proposals. IBM does not take responsibility for the solution or solution
> assurance.
> 
> 
> ------------------------------------------------------------------------------
> LogMeIn Rescue: Anywhere, Anytime Remote support for IT. Free Trial
> Remotely access PCs and mobile devices and provide instant support
> Improve your efficiency, and focus on delivering more value-add services
> Discover what IT Professionals Know. Rescue delivers
> http://p.sf.net/sfu/logmein_12329d2d
> 
> 
> 
> _______________________________________________
> xCAT-user mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/xcat-user
> 


-- 
Daniel M. Weeks
Systems Administrator
Computational Center for Nanotechnology Innovations
Rensselaer Polytechnic Institute
Troy, NY 12180
518-276-4458
