On 10/12/2014 04:18 AM, Anne van Kesteren wrote:
On Sat, Oct 11, 2014 at 7:24 PM, Sam Ruby <ru...@intertwingly.net> wrote:
On 10/10/2014 08:19 PM, Sam Ruby wrote:
2) https://url.spec.whatwg.org/#concept-basic-url-parser
I'm interpreting "terminate this algorithm" and "return failure" to
mean the same thing, and I'm interpreting "parse error" as "set
parse error flag and continue".
I'm inclined to submit a pull request standardizing on "terminate this
algorithm" and "set parse error".
I'm not sure what you mean here. Returning failure is important.
However, in override mode returning failure is not important so the
algorithm is simply terminated.
Can you explain in JavaScript terms what the difference is between
return failure and terminate?
In any case, this difference wasn't clear to me, and mixed in with not
defining what should be done with parse errors, and returning failure
without setting parse errors (as mentioned below); all combined to make
it more difficult (for me at least) to determine what was desired.
And parse error error would be more
like flag a parse error, or append a parse error to a list of parse
errors. It depends a bit on whether the parser decides to halt on the
first one or not.
I don't see anything in the prose that indicates that halting on the
first parse error is an option.
b) Step 1.3.3 seems problematic. I interpret this prose to mean "if
any character in buffer is a "%" and the first two characters
after the pointer position in input aren't hex characters".
Specifically, it appears to be comparing a possibly
non-contiguous set of characters.
Ah yes. It needs to check the two code points after code point in
buffer. That seems like a bug.
I'll look into submitting a pull request after I complete this pass.
4) https://url.spec.whatwg.org/#file-host-state
Step 1.3.2 returns failure without setting parse_error. Is this
correct?
5) https://url.spec.whatwg.org/#host-state
Step 1.2.2 also returns failure without setting parse_error.
This is indeed inconsistent. I must at some point have thought that
returning failure without reporting a parse error was fine (as failure
was indicated) or the other way around. Reporting a parse error before
returning is probably best.
I'm inclined to submit a pull request changing these to to set parse error
before failing.
Thanks.
Will do.
6) https://url.spec.whatwg.org/#relative-path-state
If input contains a path but no query or fragment, the last part of
the path will be accumulated into buffer, but that buffer will never
be added to the path
Looks like I got confused by the prose in the spec. I've submitted a pull
request that makes this point clearer:
https://github.com/whatwg/url/pull/4
There were a number of places where things weren't clear to me; after I
complete my technical review and testing verification, I'll go back and
identify more. Meanwhile, here is an example:
https://github.com/whatwg/url/pull/5
Feel free to only modify url.src.html in PRs.
Ack.
Did you have a look at the open bugs by the way? There's a chance the
parsing algorithm will get rewritten at some point to be a bit more
functional and less state driven.
Not yet. I'm still seeing a large set of differences between what I am
producing and what is in urltestdata.txt and need to track down whether
the problems are in my implementation, the spec, or in the test results.
Once those three are in sync; I'll try to look at the bigger picture.
- Sam Ruby