On 07/13/2017 02:18 AM, Nicolas Vervelle wrote:


I think I've found some discrepancy between Linter reports. On frwiki, the
page "Discussion:Yasser Arafat" is reported in the list for self-closed-tag
[1], but when I run the text of the page through the transform API [2], I
only get errors for obsolete-tag and mixed-content and nothing for
self-closed-tag.

When I pasted the wikitext for the Discussion:Yasser_Arafat page in the wikitext box AND entered the page title in the title box on https://fr.wikipedia.org/api/rest_v1/#!/Transforms/post_transform_wikitext_to_lint_title_revision, I do see the following, among others:
...

{ "type": "self-closed-tag", "params": { "name": "span" }, "dsr": [ 183063, 183134, null, null ], "templateInfo": { "name": "Modèle:Censuré" } },

...

However, if I don't add the page title in the title box, I can reproduce your problem ... so, clearly this is something to do with a template depending on the page title.
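For scripted checks, the same request can be prepared outside the Swagger form. What follows is only a sketch under stated assumptions: the path /transform/wikitext/to/lint/{title} is inferred from the operation name in the URL above, the "wikitext" form field is assumed, and build_lint_request and post_lint are hypothetical helper names. Please check the API documentation before relying on any of it.

```python
import urllib.parse
import urllib.request

# Assumption: REST base URL for frwiki, taken from the Swagger page URL above.
REST_BASE = "https://fr.wikipedia.org/api/rest_v1"


def build_lint_request(wikitext, title=None):
    """Return (url, body) for a wikitext-to-lint transform request.

    Hypothetical helper. The path is inferred from the operation name
    post_transform_wikitext_to_lint_title_revision, and the 'wikitext'
    form field is an assumption.
    """
    url = REST_BASE + "/transform/wikitext/to/lint"
    if title is not None:
        # Titles contain ':' and may contain '/', so percent-encode everything.
        url += "/" + urllib.parse.quote(title, safe="")
    body = urllib.parse.urlencode({"wikitext": wikitext}).encode("utf-8")
    return url, body


def post_lint(wikitext, title=None):
    """Send the request and return the raw response text (needs network)."""
    url, body = build_lint_request(wikitext, title)
    request = urllib.request.Request(url, data=body, method="POST")
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Calling build_lint_request with and without a title makes the difference between the two runs explicit: only the first form tells the service which page the templates should be expanded for.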

I can reproduce this on the command line with the specific wikitext substring that the Linter interface shows you. The output below shows that the linter error depends on the page title being provided.

---
[subbu@earth parsoid] echo '{{Censuré|Tu remarqueras que je ne te retourne pas la question.<br />}}' | parse.js --page Discussion:Yasser_Arafat --prefix frwiki --lint > /dev/null
[info/lint/self-closed-tag][frwiki/Discussion:Yasser_Arafat] {"type":"self-closed-tag","params":{"name":"span"},"dsr":[0,71,null,null],"templateInfo":{"name":"Modèle:Censuré"}}
[info/lint/stripped-tag][frwiki/Discussion:Yasser_Arafat] {"type":"stripped-tag","params":{"name":"SPAN"},"dsr":[0,71,null,null],"templateInfo":{"name":"Modèle:Censuré"}}
[subbu@earth parsoid] echo '{{Censuré|Tu remarqueras que je ne te retourne pas la question.<br />}}' | parse.js --prefix frwiki --lint > /dev/null
[subbu@earth parsoid]
---

When I add the --dump tplsrc flag to Parsoid (you can get the same output from the expandtemplates action API endpoint), I see the following:

---
<span class="censure" style="background-color:#EEF;color:#EEF;" title="Tu remarqueras que je ne te retourne pas la question.<br />"><span style="visibility:hidden">Tu remarqueras que je ne te retourne pas la question.<br /></span></span>
---
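If you want to fetch that expansion yourself through the expandtemplates endpoint mentioned above, something along these lines should work. It is a minimal sketch: the helper names and the User-Agent string are made up for illustration, and the important part is that the title parameter is passed along with the text.

```python
import json
import urllib.parse
import urllib.request

# Action API entry point for frwiki.
API_URL = "https://fr.wikipedia.org/w/api.php"


def expandtemplates_url(text, title):
    """Build an expandtemplates query URL (hypothetical helper)."""
    params = {
        "action": "expandtemplates",
        "format": "json",
        "prop": "wikitext",
        "title": title,  # the page the text is expanded in the context of
        "text": text,
    }
    return API_URL + "?" + urllib.parse.urlencode(params)


def expand(text, title):
    """Fetch the expanded wikitext (needs network access)."""
    request = urllib.request.Request(
        expandtemplates_url(text, title),
        # Placeholder User-Agent; replace with something identifying you.
        headers={"User-Agent": "lint-check-example/0.1"},
    )
    with urllib.request.urlopen(request) as response:
        data = json.load(response)
    return data["expandtemplates"]["wikitext"]
```

For the case in this thread, the call would be expand('{{Censuré|Tu remarqueras que je ne te retourne pas la question.<br />}}', 'Discussion:Yasser_Arafat').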

So, it looks like Parsoid's tokenizer is tripping on the /> that is present in the span title attribute and falsely assumes it is a self-closing tag.
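To illustrate the kind of mistake I suspect, here is a toy comparison. This is not Parsoid's tokenizer code; both functions are hypothetical and only show how a check that ignores attribute quoting gets fooled by a "/>" inside an attribute value, while a quote-aware scan does not.

```python
def naive_is_self_closing(tag_text):
    """Naive check: self-closing if '/>' appears anywhere in the tag text."""
    return "/>" in tag_text


def quote_aware_is_self_closing(tag_text):
    """Skip quoted attribute values, then report whether the '>' that
    actually ends the start tag is immediately preceded by '/'."""
    quote = None  # the quote character we are currently inside, if any
    for i, ch in enumerate(tag_text):
        if quote:
            if ch == quote:
                quote = None  # closing quote of the attribute value
        elif ch in "\"'":
            quote = ch  # opening quote of an attribute value
        elif ch == ">":
            return i > 0 and tag_text[i - 1] == "/"
    return False  # no unquoted '>' found


# Opening span tag from the expanded template output above.
tag = ('<span class="censure" style="background-color:#EEF;color:#EEF;" '
       'title="Tu remarqueras que je ne te retourne pas la question.<br />">')

print(naive_is_self_closing(tag))        # fooled by the '/>' in the title
print(quote_aware_is_self_closing(tag))  # sees an ordinary start tag
```

The toy scanner only handles quoted attribute values, which is all this example needs; a real HTML tokenizer has to handle many more cases.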

In any case, in conclusion:

(1) Please provide the page title when you use the API.
(2) There is a Parsoid bug in the detection of self-closing tags where the presence of a "/>" in an HTML attribute triggers a false positive. This has been reported previously ... so I suppose it is not as uncommon as I thought. We'll take a look at that.

Subbu.
_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
