Hi Adam,

> It seems that the issue is the apostrophe after "L", in the wikidata
> query it is "´" and the wikipedia link above uses "'".
>

I see, a lot of the issues were caused by my string mishandling.
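
For anyone hitting the same problem, here is a minimal sketch of how the two characters differ. It assumes the title in my query contained U+00B4 (acute accent) where the Wikipedia title has U+0027 (apostrophe); the helper name describe is only for illustration.

```python
import unicodedata
from urllib.parse import quote

ACUTE = "\u00b4"       # the character in the Wikidata query
APOSTROPHE = "\u0027"  # the character in the Wikipedia link

title_in_query = "Playas de L" + ACUTE + "Atalaya y Focarón"
title_on_wiki = "Playas de L" + APOSTROPHE + "Atalaya y Focarón"


def describe(ch):
    """Return the code point and Unicode name of a single character."""
    return "U+{:04X} {}".format(ord(ch), unicodedata.name(ch))


# The two titles look alike on screen but are different strings.
print(title_in_query == title_on_wiki)
print(describe(ACUTE))
print(describe(APOSTROPHE))

# Percent-encoded as UTF-8, the difference shows up as %C2%B4 versus %27.
print(quote(title_in_query))
print(quote(title_on_wiki))
```

Printing the code points of a suspicious title like this is a quick way to see whether a script has swapped one character for another before the request is sent.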

> One approach you might consider is to download the entire log
> history, then process it locally to filter by page ID.
>

I am still unclear on how to know for certain that an article was
deleted. It seems like the only way to tell is through the log comments. For
example, this call:
https://en.wikipedia.org/w/api.php?action=query&list=logevents&leaction=delete/delete&letitle=Zayn%20Malik
shows the comment "[[Wikipedia:Articles for deletion/Louis Tomlinson]]",
which I have noticed on other articles that were successfully
deleted, but the article "Zayn Malik" exists. The most recent event has
the comment
"[[WP:CSD#G6|G6]]: Deleted to make way for move", which would imply the
other deletions weren't successful, but the article still exists.
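
One way I could imagine combining the two signals is sketched below: treat "missing" from the query API as the current state, and use the deletion log only to explain why a page is missing. This is a rough sketch, not a confirmed method. The function names, the status labels and the User-Agent string are my own placeholders, and it assumes formatversion=2 responses, where query.pages is a list and a missing page carries "missing": true.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def api_get(lang, params):
    """Call the MediaWiki API of one Wikipedia language edition (needs network)."""
    query = dict(params, format="json", formatversion="2")
    url = "https://{0}.wikipedia.org/w/api.php?{1}".format(lang, urlencode(query))
    # Placeholder User-Agent; replace with a real contact string.
    req = Request(url, headers={"User-Agent": "deletion-check-sketch/0.1"})
    with urlopen(req) as resp:
        return json.load(resp)


def classify(page, delete_events):
    """Combine current page info with delete/delete log entries.

    A page that exists now counts as "exists" even if its title has
    earlier deletions in the log (as with "Zayn Malik").
    """
    if not page.get("missing", False):
        return "exists"
    if delete_events:
        return "deleted"
    # Missing with an empty log: never existed, or the title string is off.
    return "missing, no deletion logged"


def check_title(lang, title):
    """Fetch both pieces of information for one title and classify it."""
    page = api_get(lang, {"action": "query", "titles": title})["query"]["pages"][0]
    events = api_get(lang, {
        "action": "query",
        "list": "logevents",
        "leaction": "delete/delete",
        "letitle": title,
    })["query"]["logevents"]
    return classify(page, events)
```

Calling check_title("en", "Zayn Malik") would then be expected to report "exists" despite the old log entries, though I have not verified this against every edge case such as page moves and redirects.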

Thanks,

Doris

On Thu, Nov 4, 2021 at 3:20 AM Adam Wight <[email protected]> wrote:

> On 11/4/21 8:09 AM, D Z wrote:
>
> > Hi Adam,
> >
> > Thanks for your reply. The qitem api returns missing for this article but
> > the article exists:
> >
> > https://www.wikidata.org/w/api.php?action=wbgetentities&format=json&sites=eswiki&titles=Playas%20de%20L%C2%B4Atalaya%20y%20Focar%C3%B3n&normalize=1
> >
> > The Wikipedia page link
> > <https://es.wikipedia.org/wiki/Playas_de_L%27Atalaya_y_Focar%C3%B3n> is
> > here.
>
> It seems that the issue is the apostrophe after "L", in the wikidata
> query it is "´" and the wikipedia link above uses "'".  Maybe something
> in your query script is normalizing the fancy apostrophe to a simple
> one?  I would check for proper UTF-8 handling.
>
> > Would you know if there is a way to input article revision ID or pageid
> > instead of source title for the logevents API? The strings seem to be
> > problematic at times.
>
> This was prescient :-).  But I don't see any record of the article being
> deleted, so perhaps the API is correct in this case?
>
>
> https://pt.wikipedia.org/wiki/Special:Log?type=&user=&page=Rodrigo+Flores+Álvarez&wpdate=&tagfilter=
> <https://pt.wikipedia.org/wiki/Special:Log?type=&user=&page=Rodrigo+Flores+%C3%81lvarez&wpdate=&tagfilter=>
>
> Unfortunately, the API help page doesn't mention filtering the log by
> page ID.  One approach you might consider is to download the entire log
> history, then process it locally to filter by page ID.
>
> Help page:
> https://www.mediawiki.org/w/api.php?action=help&modules=query%2Blogevents
>
> Regards,
> Adam W.
> [[mw:User:Adamw]]
>
> > For example, the article 'Rodrigo Flores Álvarez' of
> > 'pt' Wikipedia gives me trouble (I got this article from the
> > cxtranslation list). This page seems to be missing
> > <https://pt.wikipedia.org/wiki/Rodrigo_Flores_%C3%81lvarez> and perhaps I
> > am not using the logevents API correctly, but it returns empty.
> >
> > {'batchcomplete': '', 'query': {'logevents': []}}
> >
> > ------------------------------
> > import requests
> >
> > # Query the deletion log of Portuguese Wikipedia for one title.
> > query_url = 'https://pt.wikipedia.org/w/api.php'
> > params = {
> >     'action': 'query',
> >     'list': 'logevents',
> >     'format': 'json',
> >     'leaction': 'delete/delete',
> >     'letitle': 'Rodrigo Flores Álvarez',
> > }
> > json_response = requests.get(url=query_url, params=params).json()
> >
> > Thanks again and cheers,
> >
> > Doris Zhou
> >
> > On Wed, Oct 27, 2021 at 9:51 AM Adam Wight <[email protected]> wrote:
> >
> >> The "logevents" API should return the same data as Special:Log. For
> >> example,
> >>
> >>
> >> https://en.wikipedia.org/w/api.php?action=query&list=logevents&letitle=Category:Recipients%20of%20the%20Order%20of%20the%20Tower%20and%20Sword
> >>
> >> This can be filtered further to just delete events, and so on.
> >>
> >> But if you only want to know whether an article exists or not, "missing"
> >> should be accurate.  Can you share some example URLs for which the page
> >> exists, but the API returns "missing"?
> >>
> >> Kind regards,
> >> Adam W.
> >>
> >> On 10/27/21 3:40 AM, D Z wrote:
> >>> Hello All,
> >>>
> >>> I am doing research investigating the role of machine translation in
> >>> Wikipedia articles. I am having trouble with how to know if an article has
> >>> been deleted from Wikipedia. Specifically, I am getting a list of articles
> >>> from the cxtranslation list and I would like to know which articles are no
> >>> longer on Wikipedia. I see that there is the deletion log form
> >>> <https://en.wikipedia.org/wiki/Special:Log/delete> but is there an API or
> >>> some way to access something like this form so I could check if a mass
> >>> amount of articles have been deleted?
> >>>
> >>> I have used the Media Wiki API <https://en.wikipedia.org/w/api.php> to get
> >>> articles and the API returns missing for some articles, but this does not
> >>> seem to be fully accurate for determining if an article has been deleted
> >>> because the API has returned 'missing' for articles that do exist.
> >>>
> >>> To summarize, my main question is: given an article language edition and
> >>> article title, or an article pageid, is there an API to check if the
> >>> article has been deleted?
> >>>
> >>> Any help would be greatly appreciated!
> >>>
> >>> Thanks,
> >>>
> >>> Doris Zhou
> >>> _______________________________________________
> >>> Wiki-research-l mailing list -- [email protected]
> >>> To unsubscribe send an email to [email protected]
> >>
>
