Interesting.

You might want to have a look at Microsoft's Seadragon technology:
http://www.dailymotion.com/video/x2738e_sea-dragon-and-photosynth-demo_tech
(skip to 1:20 if you don't want to watch the whole video.)

Now, getting back to your proposal: a JavaScript interface similar to
the ones at the Internet Archive or Google Books, which downloads only
the few scans that need to be shown to the user, would be fairly easy
to write using the API. We could even do it for text, as long as the
text is rendered as well-separated physical pages.
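The core bookkeeping for such a viewer is deciding which page scans fall inside (or just outside) the viewport, so only those are fetched. A minimal sketch, assuming every rendered scan has the same pixel height; the function name and parameters are illustrative, not part of any existing API:

```javascript
// Given the viewport's scroll offset and height, return the 1-based
// page numbers whose scans should currently be in the DOM.
// pageHeight is the rendered height of one scan in pixels;
// margin pre-fetches that many extra pages above and below the view.
function visiblePages(scrollTop, viewportHeight, pageHeight, pageCount, margin = 1) {
  const first = Math.max(1, Math.floor(scrollTop / pageHeight) + 1 - margin);
  const last = Math.min(pageCount,
                        Math.ceil((scrollTop + viewportHeight) / pageHeight) + margin);
  const pages = [];
  for (let p = first; p <= last; p++) pages.push(p);
  return pages;
}
```

On each scroll event the interface would compare this list with what is already loaded and request only the missing scans.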

However, it would be more complicated to apply the same principle
if the text is to be rendered without page separations, preserving
its logical structure. We would need either to pre-parse the whole
document and develop an API that lets us download small pieces of it,
or to parse the current page together with the previous and next
pages. I am not sure it is really worth the effort; the bandwidth
saving would be less significant than for scans.
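If we did go the "current page plus neighbours" route, the per-page chunks would still have to be stitched together so the page breaks disappear. A toy sketch of that stitching, assuming plain proofread text per page (real page text is HTML and would need more careful handling; `joinPages` is a hypothetical helper, not an existing function):

```javascript
// Join the proofread text of consecutive physical pages into one
// continuous chunk, rejoining words that were hyphenated across a
// page break. Each array element is the text of one page.
function joinPages(pages) {
  return pages.reduce((acc, page) => {
    if (acc.endsWith('-')) {
      // A word was split across the page break: drop the hyphen
      // and glue the two halves back together.
      return acc.slice(0, -1) + page.trimStart();
    }
    return acc === '' ? page : acc + ' ' + page;
  }, '');
}
```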

Thomas




Lars Aronsson wrote:
> On 08/11/2010 09:46 PM, Aryeh Gregor wrote:
>   
>> This seems like a very weird way to do things.  Why is the book being
>> split up by page to begin with?  For optimal reading, you should put a
>> lot more than one book-page's worth of content on each web page.
>>     
>
> ThomasV will give the introduction to ProofreadPage and its
> purpose. I will take a step back. A book is typically 40-400 pages,
> because that is how much you can comfortably bind in one
> volume (one spine) and sell as a commercial product. A web 1.0
> (plain HTML + HTTP) page is typically a smaller chunk of
> information, say 1-100 kbytes. To match (either in Wikisource
> or Wikibooks) the idea of a book with web technology, the book
> needs to be split up, either according to physical book pages
> (Wikisource with the ProofreadPage extension) or chapters
> (Wikisource without ProofreadPage, or Wikibooks).
>
> In either case, the individual pages have a sequential relationship.
> If you print the pages, you can glue them together and the sequence
> makes sense, which is not the case with Wikipedia. Such pages have
> links to the previous and next page in sequence (which Wikipedia
> articles don't have).
>
> Wikipedia, Wikibooks and Wikisource mostly use web 1.0 technology.
> A very different approach to web browsing was taken when Google
> Maps was launched in 2005, the poster project for the "web 2.0".
> You arrive at the map site with a coordinate. From there, you can
> pan in any direction and new parts of the map (called "tiles") are
> downloaded by asynchronous JavaScript and XML (AJAX) calls as
> you go. Your browser will never hold the entire map. It doesn't
> matter how big the entire map is, just like it doesn't matter how
> big the entire Wikipedia website is. The unit of information to fetch
> is the "tile", just like the web 1.0 unit was the HTML page.
>
> If we applied this web 2.0 principle to Wikibooks and Wikisource,
> we wouldn't need to have pages with previous/next links. We could
> just have smooth, continuous scrolling in one long sequence. Readers
> could still arrive at a given coordinate (chapter or page), but
> continue from there in any direction.
>
> Examples of such user interfaces for books are Google Books and the
> Internet Archive online reader. You can link to page 14 like this:
> http://books.google.com/books?id=Z_ZLAAAAMAAJ&pg=PA14
> and then scroll up (to page 13) or down (to page 15). The whole
> book is never in your browser. New pages are AJAX loaded as they
> are needed. It's like Google Maps except that you can only pan in
> two directions (one dimension), not in the four cardinal directions.
> And the zoom is more primitive here. After you have scrolled to page
> 19, you need to use the "Link" tool to know the new URL to link to.
>
> At the Internet Archive, the user interface is similar, but the URL
> in your browser is updated as you scroll (for better or worse),
> http://www.archive.org/stream/devisesetembleme00lafeu#page/58/mode/1up
>
> If we only have scanned images of book pages, this is simple enough,
> because each scanned image is like a "tile" in Google Maps. But in
> Wikisource, we have also run OCR software to extract a text layer for
> each page, and we have proofread that text to make it searchable.
> I still have not learned JavaScript, but I guess you could make AJAX
> calls for a chunk of text and add that to the scrollable web page, just
> like you can add tiled images. Google has not done this, however. If
> you switch to "plain text" viewing mode,
> http://books.google.com/books?pg=PA14&id=Z_ZLAAAAMAAJ&output=text
> you get traditional web 1.0 "pages" with links to the previous and
> next web page. (Each of Google's text pages contains text from 5 book
> pages, e.g. pages 11-15, only to make things more confusing.)
>
> But the real challenge comes when you want to wiki-edit one such
> chunk of scrollable text. I think it could work similarly to our
> section editing of a long Wikipedia article. But to be really elegant,
> I should be able, when editing a section, to scroll up or down beyond
> the current section, in an eternal textarea.
>
> If we can solve this, "section editing 2.0" that goes outside of the box
> (or maybe we should skip directly to WYSIWYG editing), then we can
> have the beginning of a whole new Wikisource interface.
>
>
>   
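The tile principle Lars describes (keep only what is near the viewport and evict the rest, so the browser never holds the whole book) comes down to simple window bookkeeping, whether the "tile" is a scanned image or a chunk of text. A hedged sketch with hypothetical names:

```javascript
// Given the set of already-loaded chunk numbers, the chunk the reader
// is currently on, and a prefetch radius, report which chunks to
// fetch and which to evict so memory stays bounded.
function updateWindow(loaded, current, radius, total) {
  const keep = new Set();
  for (let c = Math.max(1, current - radius);
       c <= Math.min(total, current + radius); c++) {
    keep.add(c);
  }
  const toLoad = [...keep].filter((c) => !loaded.has(c));   // fetch these
  const toEvict = [...loaded].filter((c) => !keep.has(c));  // drop these
  return { toLoad, toEvict };
}
```

A viewer would call this on every scroll stop, issue AJAX requests for `toLoad`, and remove the `toEvict` nodes from the DOM.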


_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
