Ron,

I believe the last patch fixed the file-not-found error. I will look
into the fcall size error.

Thanks for pointing these out. I will fix the stderr issue today.

-- 
Hugh Greenberg <[EMAIL PROTECTED]>


On Thu, 2008-11-06 at 12:17 -0800, ron minnich wrote:
> We're having lots of trouble with xget, so I thought I would open this up
> to the list.
> 
> Here's one example:
> <<<-- (0x805f4a8) Tversion tag 65535 msize 32792 version '9P2000.u'
> -->>> (0x805f4a8) Rversion tag 65535 msize 32792 version '9P2000'
> <<<-- (0x805f4a8) Tattach tag 0 fid 0 afid -1 uname root aname
> -->>> (0x805f4a8) Rattach tag 0 qid  (0000000000000001 0 'd')
> <<<-- (0x805f4a8) Twalk tag 0 fid 0 newfid 1 nwname 1 'log'
> -->>> (0x805f4a8) Rwalk tag 0 nwqid 1 (0000000000000006 d '')
> <<<-- (0x805f4a8) Topen tag 0 fid 1 mode 2
> -->>> (0x805f4a8) Ropen tag 0 (0000000000000006 d '') iounit 0
> listen on 48617
> <<<-- (0x805f4a8) Twrite tag 0 fid 1 offset 0 count 16 data 6c697374 656e206f 6
> 
> Error: invalid fcall size greater than msize: 5
> Fatal error: Could not find file: vnfs/caos-nsa-node-1.0rc1-1.stateless.i386/vng
> ERROR: Could not xget vnfs/caos-nsa-node-1.0rc1-1.stateless.i386/vnfs.img vnfs/1
> \rERROR(430): There was an error downloading required VNFS components
> 
> That fcall size error is ugly.
> 
> we also see lots of segvs.
> 
> we now have a 100% failure case.
> 
> We have booted a node to failure in perceus. xget will now always
> fail. Here you go.
> <<<-- (0x805f4a8) Tversion tag 65535 msize 32792 version '9P2000.u'
> -->>> (0x805f4a8) Rversion tag 65535 msize 32792 version '9P2000'
> <<<-- (0x805f4a8) Tattach tag 0 fid 0 afid -1 uname root aname
> -->>> (0x805f4a8) Rattach tag 0 qid  (0000000000000001 0 'd')
> <<<-- (0x805f4a8) Twalk tag 0 fid 0 newfid 1 nwname 1 'log'
> -->>> (0x805f4a8) Rwalk tag 0 nwqid 1 (0000000000000006 21 '')
> <<<-- (0x805f4a8) Topen tag 0 fid 1 mode 2
> -->>> (0x805f4a8) Ropen tag 0 (0000000000000006 21 '') iounit 0
> listen on 54183
> <<<-- (0x805f4a8) Twrite tag 0 fid 1 offset 0 count 16 data 6c697374 656e206f 6
> 
> Error: invalid fcall size greater than msize: 5
> Fatal error: Could not find file: vnfs/caos-nsa-node-1.0rc1-1.stateless.i386/vng
> 
> This is when we're still trying to open/write the log!
> 
> on client:
> write(3, "\'\0\0\0v\0\0\1\0\0\0\0\0\0\0\0\0\0\0\20\0\0\0listen on 54183\n", 39)
> poll([{fd=3, events=POLLIN|POLLOUT, revents=POLLOUT}, {fd=5, events=POLLIN|POLLO
> poll([{fd=3, events=POLLIN, revents=POLLIN}, {fd=5, events=POLLIN|POLLOUT}], 2,
> read(3, "Client 10.1.1.73!43082: listen on 54183\n", 32792) = 40
> shutdown(3, 2 /* send and receive */)   = 0
> close(3)                                = 0
> write(2, "Error: invalid fcall size greater than msize: 5\n", 48) = 48
> stat64(".", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
> write(2, "Fatal error: Could not find file: vnfs/caos-nsa-node-1.0rc1-1.stateles
> exit_group(-1)                          = ?
> 
> now we're tracing server:
> poll([{fd=4, events=POLLIN|POLLOUT}, {fd=0, events=POLLIN}, {fd=1, events=POLLIN}, {fd=5, events=POLLIN}, {fd=6, events=POLLIN}, {fd=7, events=POLLIN}, {fd=8, events=POLLIN}, {fd=9, events=POLLIN}, {fd=2, events=POLLIN, revents=POLLIN}], 9, 300000) = 1
> read(2, "\'\0\0\0v\0\0\1\0\0\0\0\0\0\0\0\0\0\0\20\0\0\0listen on 50621\n", 32792) = 39
> write(2, "Client 10.1.1.73!41808: listen on 50621\n", 40) = 40
> time(NULL)                              = 1225977306
> write(2, "\v\0\0\0w\0\0\20\0\0\0", 11)  = 11
> poll([{fd=4, events=POLLIN|POLLOUT}, {fd=0, events=POLLIN}, {fd=1, events=POLLIN}, {fd=5, events=POLLIN}, {fd=6, events=POLLIN}, {fd=7, events=POLLIN}, {fd=8, events=POLLIN}, {fd=9, events=POLLIN}, {fd=2, events=POLLIN|POLLOUT, revents=POLLIN|POLLOUT}], 9, 300000) = 1
> read(2, "", 32792)                      = 0
> close(2)                                = 0
> 
> Note that we are doing I/O to the client on fd 2. Note also that we
> appear to be dumping log data of some sort on fd 2. This log data is
> clearly not 9P. I assume there is an fprintf(stderr, ...) in there somewhere.
> 
> This trashes the connection with the client.
> 
> I can fix unless somebody has a better idea.
> 
> ron
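
The fd 2 diagnosis above can be checked against the bytes in the
straces. Every 9P message starts with size[4] type[1] tag[2], all
little-endian. The following is a minimal Python sketch, not xget's
actual parser: parse_header, decode_twrite, and decode_rwrite are
hypothetical helper names, and the byte strings are copied from the
traces. It shows that the Twrite and Rwrite frames are well formed,
while the "Client ..." log line that arrives on the same connection
cannot be a valid frame.

```python
import struct

MSIZE = 32792              # msize agreed in the Tversion/Rversion exchange
TWRITE, RWRITE = 118, 119  # 9P2000 message type codes


def parse_header(buf, msize=MSIZE):
    """Return (size, type, tag, body) from the size[4] type[1] tag[2] header."""
    size, mtype, tag = struct.unpack_from("<IBH", buf, 0)
    if size > msize:
        raise ValueError("fcall size %d greater than msize %d" % (size, msize))
    return size, mtype, tag, buf[7:size]


def decode_twrite(buf):
    """Decode a Twrite body: fid[4] offset[8] count[4] data[count]."""
    size, mtype, tag, body = parse_header(buf)
    if mtype != TWRITE:
        raise ValueError("not a Twrite: type %d" % mtype)
    fid, offset, count = struct.unpack_from("<IQI", body, 0)
    return fid, offset, body[16:16 + count]


def decode_rwrite(buf):
    """Decode an Rwrite body: count[4]."""
    size, mtype, tag, body = parse_header(buf)
    if mtype != RWRITE:
        raise ValueError("not an Rwrite: type %d" % mtype)
    (count,) = struct.unpack_from("<I", body, 0)
    return count


# Byte strings as shown by strace in the message above.
twrite = b"\'\0\0\0v\0\0\1\0\0\0\0\0\0\0\0\0\0\0\20\0\0\0listen on 54183\n"
rwrite = b"\v\0\0\0w\0\0\20\0\0\0"
log_line = b"Client 10.1.1.73!43082: listen on 54183\n"
```

Calling parse_header(log_line) raises ValueError, because the first
four bytes, "Clie", read as a little-endian size far above msize. The
real client reported a different number (5), so its check evidently
differs in detail; the sketch only illustrates why text written to the
9P socket desynchronizes the stream.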
