Why not run the management and data networks for the new cluster in different
VLANs?

That way you can leave DHCP running on both systems, and each will only
respond to requests from its own cluster.
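
For example, each management node would only declare the subnet for its own
VLAN, so its dhcpd never answers for the other cluster. A minimal sketch, with
placeholder interfaces and subnets rather than your real layout:

    # old cluster mgmt node, dhcpd listening on its VLAN interface (e.g. eth0.10)
    subnet 10.10.0.0 netmask 255.255.0.0 {
        range 10.10.1.1 10.10.1.254;   # or no range at all, static hosts only
    }

    # new cluster mgmt node, dhcpd listening on its VLAN interface (e.g. eth0.20)
    subnet 10.20.0.0 netmask 255.255.0.0 {
        range 10.20.1.1 10.20.1.254;
    }

DHCP broadcasts don't cross VLANs without a relay, so the two servers never
see each other's clients in the first place.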

Nick

On Sat., 8 Dec. 2018, 8:22 am Rich Sudlow <r...@nd.edu> wrote:

> We do something slightly different: we don't use a DHCP pool, and only allow
> DHCP to answer requests from MACs that are known.
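>
> One way to spell that out in a plain dhcpd.conf (MACs and addresses below are
> made-up examples; with xCAT, makedhcp adds the per-host entries for you):
>
>     subnet 10.1.0.0 netmask 255.255.0.0 {
>         deny unknown-clients;   # no dynamic pool, unknown MACs get no answer
>     }
>
>     host node001 {
>         hardware ethernet 00:25:90:aa:bb:cc;
>         fixed-address 10.1.1.1;
>     }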
>
>
> On 12/7/18 1:30 PM, David Johnson wrote:
> > Yes, only one can have a dynamic range.  In my case neither of them does,
> > since I manually paste the MAC addresses into the mac table.
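> >
> > In xCAT terms that is roughly (node name and MAC are placeholders):
> >
> >     chdef node001 mac=00:25:90:aa:bb:cc   # ends up in the mac table
> >     makedhcp node001                      # push the static entry to dhcpd
> >
> > (or tabedit mac to paste into the table directly).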
> >
> > My issue with deleting first is that when I deleted 16 MAC addresses and
> > then got sidetracked and went home, those nodes later lost their lease and
> > ended up getting evicted from GPFS.  Not an issue if the nodes had been
> > otherwise idle, but they had been marked to drain and jobs were still
> > running on them.
> >
> >   — ddj
> >
> >> On Dec 7, 2018, at 1:25 PM, Kevin Keane <kke...@sandiego.edu> wrote:
> >>
> >> Thank you, Dave. That is an interesting alternative approach; I might
> >> actually consider that.
> >>
> >> So you are saying that the old and new DHCP servers can run in parallel?
> >> I assume that they just can't both have dynamic ranges?
> >>
> >> I'm not sure I understand what the problem is with deleting a node first,
> >> and then adding it on the new system. Even if the lease expires, wouldn't
> >> it just reacquire the new one once you create it?
> >>
> >> _______________________________________________________________________
> >> Kevin Keane | Systems Architect | University of San Diego ITS |
> >> kke...@sandiego.edu
> >> Maher Hall, 192 | 5998 Alcalá Park | San Diego, CA 92110-2492 | 619.260.6859
> >>
> >> On Thu, Dec 6, 2018 at 4:43 PM <david_john...@brown.edu> wrote:
> >>
> >>     We’ve kept parallel clusters on the same network for nearly a year now
> >>     while transitioning to RH7 from CentOS 6.
> >>     Initially we copied the hosts, nodelist and MAC tables into the new
> >>     xCAT database.  We carefully controlled the use of makedhcp so that
> >>     nodes moving to the new cluster were first added to the new DHCP server
> >>     and then deleted from the old. (I didn’t want a repeat of what happened
> >>     when I left some deleted from the old but not added to the new cluster
> >>     and they lost their lease. The postscript hardeths also helped.) Then
> >>     make new images, use nodeset to point to them, reboot and test.
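> >>
> >>     A rough sketch of that sequence, with placeholder node and osimage names:
> >>
> >>         # on the old management node: dump the tables we want to carry over
> >>         tabdump hosts > hosts.csv
> >>         tabdump nodelist > nodelist.csv
> >>         tabdump mac > mac.csv
> >>
> >>         # on the new management node: load them, then move a node across
> >>         tabrestore hosts.csv; tabrestore nodelist.csv; tabrestore mac.csv
> >>         makedhcp node001          # add to the new dhcp server first
> >>         nodeset node001 osimage=rhels7-x86_64-install-compute
> >>
> >>         # on the old management node, only after the node is in the new dhcp
> >>         makedhcp -d node001       # now it is safe to drop the old entry
> >>
> >>         # back on the new management node
> >>         rpower node001 boot       # reinstall and test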
> >>
> >>     The drawback is having to make parallel changes on both management
> >>     servers all the time, but we needed both clusters to access GPFS, so
> >>     it was a necessary evil.
> >>
> >>       -- ddj
> >>     Dave Johnson
> >>
> >>     On Dec 6, 2018, at 4:23 PM, Kevin Keane <kke...@sandiego.edu> wrote:
> >>
> >>>     I'm in the middle of upgrading our existing HPC (from RHEL 6 to RHEL 7).
> >>>     I'm doing most of my testing on a separate "sandbox" test bed, but now
> >>>     I'm close to going live. I'm trying to figure out how to do this with
> >>>     minimal disruption.
> >>>
> >>>     My question: how can I install the new management node and keep it
> >>>     *almost* completely operational, without interfering with the existing
> >>>     cluster? Is it enough to disable DHCP, or do I need to do anything else?
> >>>
> >>>     How do I prevent DHCP from accidentally getting enabled before I'm
> >>>     ready? Is makedhcp responsible for that?
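> >>>
> >>>     (What I have in mind on the new node until cutover is basically just
> >>>     this, assuming the stock dhcpd service on RHEL 7:
> >>>
> >>>         systemctl stop dhcpd
> >>>         systemctl disable dhcpd
> >>>         systemctl mask dhcpd   # so nothing re-enables it behind my back
> >>>
> >>>     ...or is there more to it than that?)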
> >>>
> >>>     Step-by-step, here is what I plan to do:
> >>>
> >>>     - Set up the new management node, but keep it inactive.
> >>>     - Test
> >>>     - Bring down all compute nodes.
> >>>     - Via IPMI, reset all the compute nodes' BMC controllers to DHCP (rough
> >>>       ipmitool sketch after this list)
> >>>     - Other migration steps (home directories, modifications on the storage
> >>>       node, etc.)
> >>>     - De-activate the old management node (but keep it running)
> >>>     - Activate the new management node.
> >>>     - Discover and boot compute nodes
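> >>>
> >>>     For the BMC step, roughly this per node (BMC address, credentials and
> >>>     the LAN channel number are placeholders; the channel varies by hardware):
> >>>
> >>>         ipmitool -I lanplus -H <bmc-address> -U <user> -P <password> \
> >>>             lan set 1 ipsrc dhcp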
> >>>
> >>>     Is there anything glaringly obvious that I overlooked?
> >>>
> >>>     Thanks!
> >>>
>
>
> --
> Rich Sudlow
> University of Notre Dame
> Center for Research Computing - Union Station
> 506 W. South St
> South Bend, In 46601
>
> (574) 631-7258 (office)
> (574) 807-1046 (cell)
>
>
_______________________________________________
xCAT-user mailing list
xCAT-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xcat-user
