https://bugzilla.wikimedia.org/show_bug.cgi?id=17255

--- Comment #10 from Michael Dale <[email protected]> 2011-03-29 17:16:58 UTC ---
So... the current consensus based on bug 25676 is transport-level chunk
support. This points to the support being written into core and handled
somewhat above the 'upload' API entry points.

The flow would look like the following, based on
http://code.google.com/apis/documents/docs/3.0/developers_guide_protocol.html#ResumableUploadInitiate
and http://code.google.com/p/gears/wiki/ResumableHttpRequestsProposal

Your initial POST sets all the upload parameters:
filename
comment
text
token
stash 
etc.

In addition to Content-Length ( for the parameters ), we set the
"X-Upload-Content-Type" and "X-Upload-Content-Length" headers, which give the
target file type and upload size, but we /DO NOT/ include any portion of the
file in this initial request. These special X-Upload-Content-* headers
indicate to the server that this is a resumable / chunked upload request. (
Ideally we don't want to explicitly tag it with a MediaWiki-specific API
parameter. ) We may also need a way to initially communicate to the client
that the server supports resumable uploads.
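A minimal sketch of what that initiating request could look like. This is an assumption-laden illustration, not an existing MediaWiki API: the parameter names are the ones listed above, and the X-Upload-* header names come from the Google resumable-upload protocol linked earlier.

```python
from urllib.parse import urlencode

def build_init_request(params, file_type, file_size):
    """Return (headers, body) for the initial POST: upload parameters in
    the body, file metadata in headers, and no file bytes at all."""
    body = urlencode(params)
    headers = {
        "Content-Type": "application/x-www-form-urlencoded",
        "Content-Length": str(len(body)),
        # These two headers mark the request as a resumable / chunked upload:
        "X-Upload-Content-Type": file_type,
        "X-Upload-Content-Length": str(file_size),
    }
    return headers, body

# Hypothetical example values:
headers, body = build_init_request(
    {"filename": "Example.ogv", "comment": "test", "token": "abc+\\"},
    "video/ogg", 100)
```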

The server then checks the requested size, validates all the initial upload
parameters ( token, valid file name, etc. ), then responds with a unique URL
that only the current session can upload to.

HTTP/1.1 200 OK
Location: <upload_uri>

NOTE: We are slightly abusing the resume protocol: normally you send a
request to upload the entire file, but because small chunks are friendlier
to Wikimedia's back-end systems we want clients to send things in smaller
parts.

The client then starts to send the file in 1 MB chunks. The chunks are
specified via the Content-Range header, i.e. something like:

Content-Length: 11
Content-Range: bytes 0-10/100
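Generating those sequential chunk headers on the client side could be sketched like this (Content-Range end offsets are inclusive; the 11-byte chunk size is only to match the example above, real clients would use 1 MB):

```python
def iter_chunks(data, chunk_size=1 << 20):
    """Yield (headers, chunk) pairs for each sequential piece of `data`."""
    total = len(data)
    for start in range(0, total, chunk_size):
        chunk = data[start:start + chunk_size]
        end = start + len(chunk) - 1   # Content-Range end is inclusive
        yield ({"Content-Length": str(len(chunk)),
                "Content-Range": "bytes %d-%d/%d" % (start, end, total)},
               chunk)

chunks = list(iter_chunks(b"x" * 100, chunk_size=11))
```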

The server receives the Content-Range POSTs and checks that the chunk is
authenticated via the session and unique URL; the chunk's byte range is
checked, and only valid, unseen, sequential byte ranges are appended to the
file.

If there are no errors the server responds with a header specifying the next
chunk:
HTTP/1.1 308 Resume Incomplete
Content-Length: 0
Range: 11-20
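A server-side sketch of the append-only-sequential rule described above. This is a toy stand-in for whatever core would actually do; as a simplification, the 308 Range header here names the whole remaining byte range rather than a single next chunk:

```python
class ChunkedUpload:
    """Hypothetical per-session upload state on the server."""
    def __init__(self, total):
        self.total = total
        self.data = bytearray()

    def append(self, start, chunk):
        # Only the next unseen sequential range may be appended; anything
        # out of order or overlapping just re-states what is still missing.
        if start != len(self.data):
            return 308, {"Range": "bytes %d-%d" % (len(self.data), self.total - 1)}
        self.data += chunk
        if len(self.data) == self.total:
            return 200, {}   # final chunk: normal API response code runs
        # 308 Resume Incomplete: tell the client where to continue
        return 308, {"Range": "bytes %d-%d" % (len(self.data), self.total - 1)}

up = ChunkedUpload(100)
code1, h1 = up.append(0, b"x" * 11)
code2, h2 = up.append(11, b"x" * 89)
```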

The client then responds to the Resume Incomplete and sends the next chunk to
the server. If the POST breaks or is incomplete, the client can query the
server for where it left off with:

PUT <upload_uri> HTTP/1.1
Host: docs.google.com
Content-Length: 0
Content-Range: bytes */100
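The status query and the handling of the server's answer could be sketched as follows. The `bytes */<total>` form is from the resumable-upload protocol linked above; the Range parser accepts both the bare `11-20` form shown earlier and a `bytes `-prefixed form, which is an assumption on my part:

```python
import re

def build_status_request(total):
    """Headers for an empty PUT asking the server where it left off."""
    return {"Content-Length": "0", "Content-Range": "bytes */%d" % total}

def parse_resume_range(range_header):
    """Return the next byte offset to send, from a 308 Range header."""
    m = re.match(r"(?:bytes[ =])?(\d+)-(\d+)$", range_header)
    if not m:
        raise ValueError("unparseable Range header: %r" % range_header)
    return int(m.group(1))
```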

The client should only do this every 30 seconds for 5 minutes and then give
up. The server should also "give up" after 30 minutes and reject any chunks a
client attempts to append to an old file. Likewise, partially uploaded files
should be purged every so often, possibly with the same purge system used for
stashed files?
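The server-side give-up rule might reduce to something as simple as this. The session bookkeeping (a dict of upload URL to last-activity timestamp) is purely hypothetical; the real purge would presumably hang off the stash's existing cleanup:

```python
import time

STALE_AFTER = 30 * 60   # server gives up after 30 minutes of inactivity

def purge_stale(sessions, now=None):
    """Drop partial uploads whose last activity is older than STALE_AFTER.

    `sessions` maps upload_uri -> last-activity timestamp (seconds)."""
    now = time.time() if now is None else now
    stale = [uri for uri, last in sessions.items() if now - last > STALE_AFTER]
    for uri in stale:
        del sessions[uri]
    return stale

# Hypothetical example: one session idle for ~32 minutes, one recent.
sessions = {"/upload/a": 0, "/upload/b": 2000}
purged = purge_stale(sessions, now=1900)
```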

Finally, if all is well, when the final chunk is sent the normal API response
code is run, where the file is validated and stashed or added to the system.


If this sounds reasonable, all that's left to do is implementation ;)
