https://bugzilla.wikimedia.org/show_bug.cgi?id=32753

DrTrigon <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[email protected]

--- Comment #6 from DrTrigon <[email protected]> 2011-12-18 17:08:36 UTC ---
I can confirm this bug on de:wiki (1.18wmf).

(In reply to comment #3)
> Ok I can confirm your results there.
> 
> The first two sections (first one is 'regular', second is in the templated
> text):
> 
>             {
>                 "toclevel": 1,
>                 "level": "2",
>                 "line": "Gr\u00fc\u00df Gott und Herzlich Willkommen auf meiner Benutzer-Diskussionsseite",
>                 "number": "1",
>                 "index": "1",
>                 "fromtitle": "Benutzer_Diskussion:Label5",
>                 "byteoffset": 3417,
>                 "anchor": "Gr.C3.BC.C3.9F_Gott_und_Herzlich_Willkommen_auf_meiner_Benutzer-Diskussionsseite"
>             },
>             {
>                 "toclevel": 2,
>                 "level": "3",
>                 "line": "Meine WP-W\u00fcnsche f\u00fcr 2011",
>                 "number": "1.1",
>                 "index": "",
>                 "fromtitle": false,
>                 "byteoffset": 7897,
>                 "anchor": "Meine_WP-W.C3.BCnsche_f.C3.BCr_2011"
>             },
> 
> Since this second one comes from within a template, the current parser can't
> really assign it a byte position within the article text. I'm not too familiar
> with how this output is generated so will have to take a peek to say more.
> Ideally it at least shouldn't mess up the later sections, but I'm not sure how
> a "byteoffset" helps if you don't have a "bytelength"... possibly this is just
> a bad data structure that's not really suitable for how sections are handled.
> :(

The point is that the byteoffset field should be "" for such sections, so that
clients such as DrTrigonBot can recognize them correctly. Look at [1]; there
you get e.g.

 index="T-7" byteoffset=""

for all template entries, except the level 3 headings, where you get e.g.

 index="" byteoffset="137405"

which confuses my bot a little bit! My workaround is to catch the empty index
string, but since this is considered a bug, I cannot rely on the index string
always being empty...
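
A minimal sketch of that workaround in Python, assuming the section entries
have already been decoded into dicts with the "index" and "byteoffset" keys
shown above. The function names comes_from_template and editable_sections are
hypothetical and are not taken from DrTrigonBot; the sample values are the
ones quoted in this report.

```python
def comes_from_template(section):
    """Guess whether a section entry originates from transcluded text."""
    index = str(section.get("index", ""))
    byteoffset = section.get("byteoffset")
    # Expected shape for template sections: index like "T-7", empty byteoffset.
    if index.startswith("T-") or byteoffset in ("", None):
        return True
    # Shape reported in this bug: empty index together with a numeric byteoffset.
    return index == ""


def editable_sections(sections):
    """Keep only the sections that appear to live in the page text itself."""
    return [s for s in sections if not comes_from_template(s)]


# Sample entries modelled on the values quoted in this report.
sample = [
    {"index": "1", "byteoffset": 3417},     # regular section
    {"index": "", "byteoffset": 7897},      # templated, buggy shape
    {"index": "T-7", "byteoffset": ""},     # templated, expected shape
]
regular = editable_sections(sample)
```

Checking both the "T-" prefix and the empty index means the check should keep
working if the API is later changed to report the level 3 headings in the same
way as the other template entries.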

[1]
http://de.wikipedia.org/w/api.php?action=parse&page=Wikipedia:L%C3%B6schkandidaten/12.%20Dezember%202009&prop=sections

Greetings
