On 10/13/2014 10:05 AM, Anne van Kesteren wrote:
>> Not yet. I'm still seeing a large set of differences between what I am
>> producing and what is in urltestdata.txt and need to track down whether the
>> problems are in my implementation, the spec, or in the test results.
>> Once those three are in sync, I'll try to look at the bigger picture.
>
> Cool. Sounds great.

New test results:
http://intertwingly.net/stories/2014/10/13/urltest-results/
The fourth column ("Notes") indicates which properties differ between
what my software produces and what the testdata indicates should be the
expected results. These fall into three basic categories:
1) rows where the notes merely say "href" are cases where parse errors
are thrown and failure is returned. The expected results are an object
that returns the original href, but empty values for all other
properties. I don't see this behavior in the spec:
https://url.spec.whatwg.org/#url-parsing
2) rows that contain "href hostname" appear to be ones where the
expected results have not been updated to reflect the host to IDNA
mapping.
3) rows that contain "href protocol hostname pathname" need further
investigation. I suspect that these stem from my use of a library to
normalize the IDNA mapping, which "helpfully" cleans up other problems,
such as removing U+0000 characters from the input.
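
To make the "Notes" column and category 1 concrete, here is a minimal
sketch in Ruby. It is not taken from my implementation: the property
list, the helper names (notes, expected_on_failure), and the choice to
render a failed parse as empty strings for every property are all
assumptions made for illustration.

```ruby
# Properties compared for each test row. This list is an assumption: it
# covers the properties named in the categories above plus the other
# usual URL components.
PROPERTIES = %w[href protocol hostname port pathname search hash].freeze

# Names of the properties whose values differ, joined by spaces, which
# is the form the "Notes" column takes.
def notes(actual, expected)
  PROPERTIES.reject { |prop| actual[prop] == expected[prop] }.join(' ')
end

# The shape the test data appears to expect when parsing fails: the
# original input as href, and empty values for everything else.
def expected_on_failure(input)
  PROPERTIES.map { |prop| [prop, prop == 'href' ? input : ''] }.to_h
end

# Assumption: a parser that returns failure is rendered as empty strings
# for every property, including href.
failed = PROPERTIES.map { |prop| [prop, ''] }.to_h

puts notes(failed, expected_on_failure('http://exa mple.com/'))
```

Under those assumptions the only property that differs is href, which
is why the category 1 rows carry nothing but "href" in their notes.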
My implementation can be found here:
http://intertwingly.net/stories/2014/10/13/url_rb.html
Note the comments linking back to spec sections, and comments that
identify step numbers.
- Sam Ruby
P.S. I haven't updated to the latest test data yet, but from what I can
see the changes wouldn't materially affect the results, so I am
publishing now.
P.P.S. A preview of what is yet to come: ruby2js run against my
implementation produces:
http://intertwingly.net/stories/2014/10/13/url_js.html
This will need some additional work to get running; for example, lines
54, 65, 82, 85, and 267 call out to libraries that aren't available to
JavaScript. Lines 275 to 277 are debugging lines that will be removed
shortly.