Hi Adam,

> It seems that the issue is the apostrophe after "L", in the wikidata
> query it is "´" and the wikipedia link above uses "'".

I see, a lot of the issues were caused by my string mishandling.

> One approach you might consider is to download the entire log
> history, then process it locally to filter by page ID.

I am still unclear on how to know for sure that an article was deleted. It
seems like the only way to tell is through the log comments. For example,
this call:

https://en.wikipedia.org/w/api.php?action=query&list=logevents&leaction=delete/delete&letitle=Zayn%20Malik

shows the comment "[[Wikipedia:Articles for deletion/Louis Tomlinson]]",
which I have noticed on other articles that were successfully deleted, but
the article "Zayn Malik" exists. The most recent event has the comment
"[[WP:CSD#G6|G6]]: Deleted to make way for move", which would imply that the
other deletions weren't successful, but the article still exists.

Thanks,
Doris

On Thu, Nov 4, 2021 at 3:20 AM Adam Wight <[email protected]> wrote:

> On 11/4/21 8:09 AM, D Z wrote:
>
> > Hi Adam,
> >
> > Thanks for your reply. The qitem api returns missing for this article but
> > the article exists:
> >
> > https://www.wikidata.org/w/api.php?action=wbgetentities&format=json&sites=eswiki&titles=Playas%20de%20L%C2%B4Atalaya%20y%20Focar%C3%B3n&normalize=1
> >
> > The Wikipedia page link
> > <https://es.wikipedia.org/wiki/Playas_de_L%27Atalaya_y_Focar%C3%B3n> is
> > here.
>
> It seems that the issue is the apostrophe after "L", in the wikidata
> query it is "´" and the wikipedia link above uses "'". Maybe something
> in your query script is normalizing the fancy apostrophe to a simple
> one? I would check for proper UTF-8 handling.
>
> > Would you know if there is a way to input article revision ID or pageid
> > instead of source title for the logevents API? The strings seem to be
> > problematic at times.
>
> This was prescient :-). But I don't see any record of the article being
> deleted, so perhaps the API is correct in this case?
>
> https://pt.wikipedia.org/wiki/Special:Log?type=&user=&page=Rodrigo+Flores+Álvarez&wpdate=&tagfilter=
> <https://pt.wikipedia.org/wiki/Special:Log?type=&user=&page=Rodrigo+Flores+%C3%81lvarez&wpdate=&tagfilter=>
>
> Unfortunately, the API help page doesn't mention filtering the log by
> page ID. One approach you might consider is to download the entire log
> history, then process it locally to filter by page ID.
>
> Help page:
> https://www.mediawiki.org/w/api.php?action=help&modules=query%2Blogevents
>
> Regards,
> Adam W.
> [[mw:User:Adamw]]
>
> > For example, the article 'Rodrigo Flores Álvarez' of
> > 'pt' Wikipedia gives me trouble (I got this article from the
> > cxtranslation list). This page seems to be missing
> > <https://pt.wikipedia.org/wiki/Rodrigo_Flores_%C3%81lvarez> and perhaps
> > I am not using the logevents API correctly, but it returns empty.
> >
> > {'batchcomplete': '', 'query': {'logevents': []}}
> >
> > ------------------------------
> > import requests
> >
> > endpoint = str('pt') + '.wikipedia.org/w/api.php'
> > query_url = "https://{0}".format(endpoint)
> > params = {}
> > params['action'] = 'query'
> > params['list'] = 'logevents'
> > params['format'] = 'json'
> > params['leaction'] = 'delete/delete'
> > params['letitle'] = 'Rodrigo Flores Álvarez'
> > json_response = requests.get(url=query_url, params=params).json()
> >
> > Thanks again and cheers,
> >
> > Doris Zhou
> >
> > On Wed, Oct 27, 2021 at 9:51 AM Adam Wight <[email protected]> wrote:
> >
> >> The "logevents" API should return the same data as Special:Log. For
> >> example,
> >>
> >> https://en.wikipedia.org/w/api.php?action=query&list=logevents&letitle=Category:Recipients%20of%20the%20Order%20of%20the%20Tower%20and%20Sword
> >>
> >> This can be filtered further to just delete events, and so on.
> >>
> >> But if you only want to know whether an article exists or not, "missing"
> >> should be accurate.
> >> Can you share some example URLs for which the page exists, but the API
> >> returns "missing"?
> >>
> >> Kind regards,
> >> Adam W.
> >>
> >> On 10/27/21 3:40 AM, D Z wrote:
> >>> Hello All,
> >>>
> >>> I am doing research investigating the role of machine translation in
> >>> Wikipedia articles. I am having trouble with how to know if an article
> >>> has been deleted from Wikipedia. Specifically, I am getting a list of
> >>> articles from the cxtranslation list and I would like to know which
> >>> articles are no longer on Wikipedia. I see that there is the deletion
> >>> log form <https://en.wikipedia.org/wiki/Special:Log/delete> but is
> >>> there an API or some way to access something like this form so I could
> >>> check if a mass amount of articles have been deleted?
> >>>
> >>> I have used the Media Wiki API <https://en.wikipedia.org/w/api.php> to
> >>> get articles and the API returns missing for some articles, but this
> >>> does not seem to be fully accurate for determining if an article has
> >>> been deleted because the API has returned 'missing' for articles that
> >>> do exist.
> >>>
> >>> To summarize, my main question is: given an article language edition
> >>> and article title, or an article pageid, is there an API to check if
> >>> the article has been deleted?
> >>>
> >>> Any help would be greatly appreciated!
> >>>
> >>> Thanks,
> >>>
> >>> Doris Zhou
> >>> _______________________________________________
> >>> Wiki-research-l mailing list -- [email protected]
> >>> To unsubscribe send an email to [email protected]
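
One possible way to combine the two signals discussed in this thread, as a rough sketch rather than a definitive answer: treat a title as deleted only when the page lookup reports it "missing" and the newest delete or restore entry in its log is a delete. A page that exists today but has older delete entries (the "Zayn Malik" case) is then simply reported as existing. The helper names (api_get, classify, deletion_status) and the User-Agent string are made up for illustration. The sketch assumes formatversion=2 responses and uses only the standard library rather than requests, so it runs without extra packages.

```python
import json
import urllib.parse
import urllib.request


def api_get(lang, params):
    """Call the MediaWiki action API of one Wikipedia language edition."""
    # urlencode() percent-encodes str values as UTF-8, so titles such as
    # 'Rodrigo Flores Álvarez' can be passed in unmodified.
    query = urllib.parse.urlencode({**params, "format": "json", "formatversion": "2"})
    url = "https://{0}.wikipedia.org/w/api.php?{1}".format(lang, query)
    # Placeholder User-Agent (an assumption); replace with your own contact details.
    request = urllib.request.Request(url, headers={"User-Agent": "deletion-check-sketch/0.1"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def classify(page, delete_events):
    """Pure decision step: combine the page lookup with its delete log.

    page          -- one entry of query.pages (formatversion=2)
    delete_events -- entries of query.logevents fetched with letype=delete
    """
    if not page.get("missing", False):
        # The page exists now, whatever happened to the title earlier.
        return "exists"
    # Only plain deletions and restorations are considered here.
    relevant = [e for e in delete_events if e.get("action") in ("delete", "restore")]
    if not relevant:
        return "missing, no deletion logged"
    # Timestamps are ISO 8601 strings, so the textual maximum is the newest entry.
    latest = max(relevant, key=lambda e: e["timestamp"])
    if latest["action"] == "delete":
        return "deleted"
    return "missing, last event was a restore"


def deletion_status(lang, title):
    """Network wrapper around classify() for one language edition and title."""
    page = api_get(lang, {"action": "query", "titles": title})["query"]["pages"][0]
    events = api_get(lang, {
        "action": "query",
        "list": "logevents",
        "letype": "delete",
        "letitle": title,
        "lelimit": "max",
    })["query"]["logevents"]
    return classify(page, events)
```

For example, deletion_status('pt', 'Rodrigo Flores Álvarez') would issue the two queries for the article mentioned above. A result of "missing, no deletion logged" may still mean the page was moved or that the title string differs from the one on the wiki, so those cases probably need a manual look.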
