I know Jarrod has made several changes to the IPMI code since the 2.7.5
build.  For now, you could try grabbing the latest versions of these two files
from the 2.7 branch to see if they fix your problem:

cd /opt/xcat/lib/perl/xCAT
mv IPMI.pm IPMI.pm.orig
wget http://svn.code.sf.net/p/xcat/code/xcat-core/branches/2.7/xCAT-server/lib/perl/xCAT/IPMI.pm

cd /opt/xcat/lib/perl/xCAT_plugin
mv ipmi.pm ipmi.pm.orig
wget http://svn.code.sf.net/p/xcat/code/xcat-core/branches/2.7/xCAT-server/lib/xcat/plugins/ipmi.pm
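
The same two steps can be wrapped in a small shell function so that a failed
download never leaves a half-written module in place.  This is only a sketch:
fetch_with_backup is a made-up name, not an xCAT command, and it copies the
original to .orig instead of moving it, so the live file stays put until the
new one has arrived.  Whether xcatd must be restarted afterwards is an
assumption on my part, so that step is left as a comment.

```shell
# Hypothetical helper (not part of xCAT): keep a .orig backup of one file,
# download its replacement, and only swap it in if the download succeeded.
fetch_with_backup() {
    target="$1/$2"
    url=$3
    # Keep the first backup; a second run must not overwrite it.
    if [ -e "$target" ] && [ ! -e "$target.orig" ]; then
        cp -p "$target" "$target.orig" || return 1
    fi
    # Download to a temporary name so a failure leaves the live file alone.
    if ! wget -q -O "$target.new" "$url"; then
        rm -f "$target.new"
        echo "download failed, $target left unchanged" >&2
        return 1
    fi
    mv "$target.new" "$target"
}

# Usage with the paths and URLs from this message:
#   BASE=http://svn.code.sf.net/p/xcat/code/xcat-core/branches/2.7/xCAT-server/lib
#   fetch_with_backup /opt/xcat/lib/perl/xCAT        IPMI.pm "$BASE/perl/xCAT/IPMI.pm"
#   fetch_with_backup /opt/xcat/lib/perl/xCAT_plugin ipmi.pm "$BASE/xcat/plugins/ipmi.pm"
# Assumption: xcatd may need a restart before it loads the new modules.
```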

If you try this, let me know whether it fixes your problem.  Hopefully Jarrod
can look at the error tomorrow.

Bruce Potter
STSM, Linux & AIX Cluster Development, IBM, Poughkeepsie, NY
Email: [email protected]    Phone:  external: 845-433-7073, internal: TL 293-7073




From:   Stuart Barkley <[email protected]>
To:     xCAT Users Mailing list <[email protected]>,
Date:   11/03/2012 07:14 PM
Subject:        [xcat-user] Error: 1 code on opening RMCP+ session (was: xCAT
            2.7.5 is released)



On Mon, 29 Oct 2012 at 14:39 -0000, Lissa Valletta wrote:

> xCAT 2.7.5 release is now available on the download page.

I have installed 2.7.5 on two of our IBM clusters and am seeing a new
problem with the IPMI support.

I'm getting a significant number of new RMCP+ errors and eventually
timeouts.  I haven't done a lot of testing yet, but with 120 to 260
nodes the errors occur on nearly every request.  With a smaller number
of nodes (~20) I don't see these errors.
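
One way to probe this scale dependence would be to send the same request in
batches of about 20 nodes and see whether the errors go away.  A rough sketch
follows.  batch_nodes is a hypothetical helper, not an xCAT command, and the
usage comment assumes that nodels prints one node name per line for a group.

```shell
# Hypothetical helper: read node names (one per line) on stdin and print
# them as comma-separated groups of at most N names (default 20).
batch_nodes() {
    size=${1:-20}
    awk -v n="$size" '
        NF { line = (c % n ? line "," : "") $1   # start or extend a batch
             c++
             if (c % n == 0) { print line; line = "" } }
        END { if (line != "") print line }'      # last, partial batch
}

# Usage (assumes nodels lists one node per line):
#   nodels rack1a | batch_nodes 20 | while read -r batch; do
#       rvitals "$batch" led </dev/null
#   done
```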

I note that there are significant changes between 2.7.4 and 2.7.5 in
2.7/xCAT-server/lib/perl/xCAT/IPMI.pm regarding timeouts and IPMI
login state transitions.  I didn't study the changes or the file
revision history closely.

On one column of a dx360 M2 iDataPlex cluster:

    # date; rvitals rack1a led | xcoll
    Sat Nov  3 18:28:55 EDT 2012
    mc036: Error: 1 code on opening RMCP+ session
    mc025: Error: 1 code on opening RMCP+ session
    mc033: Error: 1 code on opening RMCP+ session
    mc025: Error: 1 code on opening RMCP+ session
    mc036: Error: 1 code on opening RMCP+ session
    mc033: Error: 1 code on opening RMCP+ session
    mc025: Error: 1 code on opening RMCP+ session
    mc036: Error: 1 code on opening RMCP+ session
    mc033: Error: 1 code on opening RMCP+ session
    mc033: Error: 1 code on opening RMCP+ session
    mc025: Error: 1 code on opening RMCP+ session
    mc036: Error: 1 code on opening RMCP+ session
    mc025: Error: 1 code on opening RMCP+ session
    mc033: Error: 1 code on opening RMCP+ session
    mc036: Error: 1 code on opening RMCP+ session
    mc025: Error: 1 code on opening RMCP+ session
    mc036: Error: 1 code on opening RMCP+ session
    mc033: Error: 1 code on opening RMCP+ session
    mc033: Error: 1 code on opening RMCP+ session
    mc025: Error: 1 code on opening RMCP+ session
    mc025: Error: timeout
    mc033: Error: timeout
    ====================================

    mc024,mc003,mc018,mc002,mc001,mc016,mc036,mc026,mc017,mc039,mc006,mc008,mc032,mc038,mc007,mc037,mc029,mc009,mc012,mc042,mc034,mc023,mc030,mc035,mc020,mc013,mc041,mc005,mc015,mc027,mc010,mc014,mc040,mc011,mc028,mc021,mc031,mc019,mc022,mc004

    ====================================
    No active error LEDs detected

    #

On 110 x3650 M2 servers:

    # date; rvitals bc-compute led | xcoll
    Sat Nov  3 18:31:55 EDT 2012
    bc050: Error: 1 code on opening RMCP+ session
    bc036: Error: 1 code on opening RMCP+ session
    bc050: Error: 1 code on opening RMCP+ session
    bc036: Error: 1 code on opening RMCP+ session
    bc042: Error: 1 code on opening RMCP+ session
    bc042: Error: 1 code on opening RMCP+ session
    bc050: Error: 1 code on opening RMCP+ session
    bc036: Error: 1 code on opening RMCP+ session
    bc042: Error: 1 code on opening RMCP+ session
    bc050: Error: 1 code on opening RMCP+ session
    bc036: Error: 1 code on opening RMCP+ session
    bc042: Error: 1 code on opening RMCP+ session
    bc050: Error: 1 code on opening RMCP+ session
    bc036: Error: 1 code on opening RMCP+ session
    bc042: Error: 1 code on opening RMCP+ session
    bc050: Error: 1 code on opening RMCP+ session
    bc036: Error: 1 code on opening RMCP+ session
    bc042: Error: 1 code on opening RMCP+ session
    bc050: Error: 1 code on opening RMCP+ session
    bc036: Error: 1 code on opening RMCP+ session
    bc042: Error: 1 code on opening RMCP+ session
    bc050: Error: timeout
    bc036: Error: timeout
    bc042: Error: timeout
    ====================================

    T06,S11,S10,T10,bc023,bc033,bc021,bc027,bc044,bc022,bc017,bc054,bc057,bc061,bc039,bc063,bc059,bc029,bc052,bc047,bc019,bc048,bc041,bc018,bc026,bc040,bc056,bc053,bc035,bc051,bc032,bc049,bc028,bc058,bc062,bc043,bc030,bc045,bc025,bc024,bc038,bc034,bc046,bc031,bc060,bc055,bc020,bc037

    ====================================
    No active error LEDs detected

    #

Each time, the problems are reported on different nodes.
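
To compare runs, the error lines can be tallied per node and per message.
This is a sketch only.  count_ipmi_errors is a made-up name, and it assumes
the "node: Error: message" line format shown in the output above.

```shell
# Hypothetical helper: count "node: Error: message" lines on stdin and
# print "count<TAB>node<TAB>message", sorted by node and then message.
count_ipmi_errors() {
    awk -F': ' '
        { sub(/^[ \t]+/, "") }                  # tolerate indented input
        $2 == "Error" { n[$1 "\t" $3]++ }       # key: node + message
        END { for (k in n) print n[k] "\t" k }' | LC_ALL=C sort -k2
}

# Usage; 2>&1 is there in case the error lines are written to stderr,
# which the output above suggests but does not confirm:
#   rvitals rack1a led 2>&1 | count_ipmi_errors
```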

Stuart Barkley
--
I've never been lost; I was once bewildered for three days, but never lost!
                                        --  Daniel Boone

------------------------------------------------------------------------------

LogMeIn Central: Instant, anywhere, Remote PC access and management.
Stay in control, update software, and manage PCs from one command center
Diagnose problems and improve visibility into emerging IT issues
Automate, monitor and manage. Do more in less time with Central
http://p.sf.net/sfu/logmein12331_d2d
_______________________________________________
xCAT-user mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/xcat-user
