On Tue, Nov 24, 2009 at 1:11 PM, Eric Van Hensbergen <[email protected]> wrote:
> Those are implementation specifics that the user/admin can be largely
> unaware of. It would be quite trivial to assume the same environment
> as .ssh (system-level password authentication and/or key-files on a
> shared file system).
>
> The unfortunate side of that is it requires shared distributed file
> system or shared auth mechanisms be present which mean you require
> something more than the drone systems we currently deploy with xcpu2
> which are much easier to manage.

We don't necessarily use a shared distributed file system for things
like system keys. Since they don't change often, we may put them into
a RAM root image and perhaps update them with a tree'd remote copy.

I want to clarify what I said before, since I combined authentication
and account authorization. In addition to something like ssh key
authentication, resource managers like torque use PAM to determine
which accounts are active on a node at a given time.

Now, I'm not particularly fond of any of the existing resource
managers, so I would be content if a scheduler (Moab in our case)
talked directly to xcpu2.

We also need tight integration with MPI implementations. Currently we
have a situation where the resource manager has to establish
connections to all of the nodes in an allocation, then MPI has to do
the same sort of wireup. I understand that it is non-trivial to get
Open MPI to utilize xcpu.

--
Andrew Shewmaker
