David Goodman wrote:
> Quite apart from the incredible range available from a research
> library, the great majority of Wikipedians, even experienced ones, do
> not use even those sources which are made available free from local
> public libraries to residents. Many seem not to even think about using
> anything free on the internet except that reachable through the
> Googles. if Google News reports a newspaper or magazine behind a pay
> wall, they do not even think of looking for it in other databases or
> web sites that they may have available.

David's issue here is something he describes as familiar generally to librarians. It does seem to me to be a hybrid of that one (leading the horse to the reference library water is not the same as having the horse drink), with another one. Tim Berners-Lee is apparently interested in the [[Deep Web]], which is to a first approximation what you can't Google for, but is out there. One clear cause is online databases, where if the webcrawler can't think up a good query, the potential web page answer won't get reported.

I was thinking about this more obliquely, because of my current interests: another couple of causes occur to me. There are texts online which are reference material, but need proof-reading (tell me about it) before the text is accurate enough for the search term to be there "in clear". And (as I found out just now) there are texts online that are downloads that are huge files. I've just looked at a PDF that is over 500 MB. Both these issues are obvious to me as a user of archive.org.

There is a route for information to migrate onto the Web as book -> scan -> post to archive.org, which is fruitful and gets it "out there". It happens that for reference information our model is more useful by a factor of at least 1000 (you can check the figures for archive.org downloads). So, the deeper Web needs "dredging" work before such things turn up on most people's first page of search engine hits.

I'd quite agree with David that simply using the "shallow Web" and moving information from one part of it to another is not the only thing research for WP should be about. It seems to me that during Wikipedia's second decade we'll need to become more thoughtful about what is involved. (In Wikisource terms, for example, it would be great to see development of that project as the "reference Commons", matching the function the Commons serves for media files. But that's a potentially divisive idea, since it is already a "free library" with its own mission.)

Charles

_______________________________________________
WikiEN-l mailing list
[email protected]
To unsubscribe from this mailing list, visit:
https://lists.wikimedia.org/mailman/listinfo/wikien-l
