On 8/17/2013 5:16 AM, Brendan Long wrote:
> On 05/01/2013 10:57 PM, Brett Zamir wrote:
>> I wanted to propose (if work has not already been done in this area)
>> creating an HTTP extension to allow querying for retrieval and
>> updating of portions of HTML (or XML) documents where the server is so
>> capable and enabled, obviating the need for a separate database (or,
>> more accurately, bringing the database to the web server layer).
> Can't you use JavaScript to do this already? Just put each part of the
> page in a separate HTML or XML file, then have JavaScript request the
> parts it needs and insert them into the DOM as needed.

Yes, one can, but:

1. It won't allow users to have their browser (or privileged add-on code) make such universal, cross-domain partial-document requests to any webpage they wish (at least, to any webpage on a server where a drop-in server module or script aware of this standard protocol has been deployed).

Imagine, for example, if all a government had to do to release its data online was to save a Word doc, Excel file, Access database, etc. as HTML and FTP it to a publicly-accessible directory on its server (and add a server module, aware of the HTML Query API, which intercepted queries sent to files in that public directory, handled the XPath/CSS selector processing, and sent back CORS headers with the modified response). Bam: there would now be a genuine, queryable database on the Web, available to the world.
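
To make that concrete, such an exchange might look something like the following (the "xpath" query parameter name, the selector syntax, and the response shape are purely illustrative assumptions on my part; the actual wire format would be whatever the standard defined, and the XPath would be percent-encoded in a real request):

GET /public/budget.html?xpath=//table[1]//tr[position()<=10] HTTP/1.1
Host: data.example.gov

HTTP/1.1 200 OK
Content-Type: text/html
Access-Control-Allow-Origin: *

<table>...only the ten matched rows, not the rest of the document...</table>

The Access-Control-Allow-Origin header is what would let any web page (not just privileged browser/add-on code) consume the result cross-domain.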

One could obtain subsets of such data stores without the document owner (in this case, the government) needing to jump through hoops to convert their documents/data into JSON/XML/etc., provide custom REST APIs, create a search interface, and so on. (This protocol would still let people store their data in a JSON database, etc. if they wished, but they could also just upload static HTML files.)

Consumers of this data (whether web developers or users of the browser/add-on concept mentioned above) would have no need for inefficient screen scraping that first grabs entire documents just to extract the useful parts. Nor would there be any need for server-side-only solutions (at least when coming from a privileged environment such as a browser/add-on, when the document owner has enabled CORS on their server, or when the site is one's own).

2. Such JavaScript solutions as you mention are custom: they require developers to learn different client-side (and server-side) libraries and different server APIs. With a standard HTML Query API, one would need to know nothing more than the URL of the data store (and the structure of the contents one was seeking) to get away with bare XMLHttpRequest (or $.ajax) calls that do what one wants against the data store; there would be no need to learn which specific query strings to add to meet the requirements of a custom server-side API. (In some cases it may admittedly be more convenient to have a succinct query syntax optimized for a specific document format, but it is nice to always have the generic query option.)
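
For instance, a bare XMLHttpRequest call against such a store might look like this rough sketch (the "xpath" parameter, URL, and element IDs are illustrative assumptions, not part of any existing standard; only XMLHttpRequest itself is standard):

var xhr = new XMLHttpRequest();
// Hypothetical query syntax: ask the server for just the rows we care about
var query = encodeURIComponent('//table[@id="budget"]//tr[position() <= 20]');
xhr.open('GET', 'http://data.example.gov/public/budget.html?xpath=' + query);
xhr.onload = function () {
  // The response would contain only the matched fragment, not the whole page
  document.getElementById('results').innerHTML = xhr.responseText;
};
xhr.send();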

3. Custom JavaScript requires sites to include such code in every file and to write scripts. Of course, SOME data necessitates customized access control, such as a website's user database (though even here, one could use HTTP Basic authentication, http://en.wikipedia.org/wiki/Basic_access_authentication , to avoid scripting).

But even with scripts determining access control, many sites could still benefit. A site could, say, create, upload, and manage a Word document saved as HTML, containing a table whose (WYSIWYG) columns were "user" and "password", and then, as per #2 above, use a single reusable server-side library implementing the standard to query this document. The site could, if it wished, later switch to importing the document into a real database while keeping the same HTML Query API library calls. And if a server-side script wanted to, say, let authenticated administrators query or alter the user table, the server-side code could output client-side JavaScript to them that conducted queries against the user table in the same familiar manner.
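
As a rough sketch of what that might look like, assuming a hypothetical reusable server-side helper queryDocument(file, xpath) implementing the standard (the function name, file name, and table structure are all illustrative):

// Hypothetical: queryDocument(file, xpath) returns an array of elements
// matched in the given HTML file. A real implementation would also need to
// escape the username before interpolating it into the XPath expression.
function findUserRow(username) {
  var rows = queryDocument('users.html',
    '//table[@id="users"]//tr[td[1] = "' + username + '"]');
  return rows.length ? rows[0] : null;
}

The same findUserRow call would keep working if users.html were later imported into a real database exposed behind the same query interface.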

4. If markup were added to HTML which coordinated intelligently with this query scheme, say for example to allow querying of documents with known paragraph numbering (there are more interesting and frequently needed use cases than this, involving tables and lists, as I'm planning to explain in my response to Ian, but I'll use a simpler example in this response)...

a. The document creator could create:

<article paragraphRange="">
  <p>This is par. 1</p>
  <p>This is par. 2</p>
  ...
  <p>This is par. 500</p>
</article>

b. an intermediary server plugin would detect the "paragraphRange" attribute and auto-strip all of the inner paragraphs before delivering the document to the user; unless, say, other markup were present on <article>, such as `showRange="1-20"`, in which case it would strip out only paragraphs 21-500, or `paragraphsPerPage="5"` (without showRange), in which case it would strip out pars. 6-500 (a sketch of this stripping logic follows the list below).

c. the browser, when it received the document, could then recognize the "paragraphRange" attribute and know that it should add its own search interface widget at this point in the document, which might contain:

   1. A generic browser-localized label, e.g. "Choose a range of paragraphs"
   2. Two numeric text boxes to allow the user to request a paragraph range from the server, e.g., 23-45
   3. A "get all paragraphs" button or link (as an alternative to the range) to obtain all paragraphs beneath the widget (or "get the remaining paragraphs" had the "showRange" attribute been used)
   4. If the "paragraphsPerPage" attribute were present, a link to "get the next 5 paragraphs"
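
To illustrate step (b), here is a rough sketch of the stripping logic such an intermediary plugin might apply, assuming it has parsed the document into a standard DOM (server-side this would require a DOM implementation; the attribute names are the hypothetical ones proposed above):

function stripParagraphs(article) {
  var showRange = article.getAttribute('showRange');       // e.g. "1-20"
  var perPage = article.getAttribute('paragraphsPerPage'); // e.g. "5"
  var start = 1, end = 0;                                  // default: strip all
  if (showRange) {
    var parts = showRange.split('-');
    start = parseInt(parts[0], 10);
    end = parseInt(parts[1], 10);
  } else if (perPage) {
    end = parseInt(perPage, 10); // first "page" only, e.g. pars. 1-5
  }
  // Copy the live collection into an array first (removing nodes would
  // otherwise shift it under us), then drop every paragraph whose 1-based
  // position falls outside [start, end].
  var paras = Array.prototype.slice.call(article.getElementsByTagName('p'));
  paras.forEach(function (p, i) {
    if (i + 1 < start || i + 1 > end) {
      p.parentNode.removeChild(p);
    }
  });
}

The plugin would run this just before serving the document, so the user's initial page load stays small while the full document remains reachable through subsequent range queries.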

Although custom scripts could do this, that would require the markup creator, including those on public content-creation sites such as wikis, blogs, and discussion forums, to include such a custom script as well.

Even if the WHATWG did not wish to adopt such specific markup conventions until implementation experience had been gained and demand for these widgets assessed, having an official HTML Query Language by which widget creators could pass information back and forth between client and server in a uniform manner would still facilitate the development process, as per #2 above.
