JAllemandou added a comment.

  I continued my analysis today, looking at the top-100 parsed user-agents from 
both the queries-with-referer subset and the queries-without-referer subset, 
over the month of September.
  See https://phabricator.wikimedia.org/P12933
  
  - The queries-with-referer have a defined user-agent, meaning that the 
user-agent-parser we use to extract structured information from the user-agent 
line provides values for many of its fields. The top-100 user-agents cover 
more than 90% of the requests made with a referer.
  - The queries-without-referer have either an undefined or a `Spider` 
user-agent, meaning that the user-agent line is either not parseable or is 
parsed as a bot. I manually inspected the user-agent lines and confirm that 
most of them look like bots (particularly the ones making the most requests). 
The top-100 user-agents also cover more than 90% of the requests made 
without a referer.
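
  For readers who want to reproduce the coverage figures, here is a minimal 
sketch of the computation, not the query actually used for this analysis. It 
assumes the request logs have already been aggregated into 
(user_agent, has_referer, requests) rows; the function names and the sample 
rows are hypothetical and the counts are made up for illustration.

```python
from collections import Counter


def top_n_coverage(counts, n=100):
    """Share of all requests made by the n most frequent user-agents."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sum(count for _, count in Counter(counts).most_common(n))
    return top / total


def coverage_by_referer(rows, n=100):
    """Split rows on referer presence and compute top-n coverage for each.

    rows: iterable of (user_agent, has_referer, requests) tuples
          (a hypothetical pre-aggregated shape, not the real schema).
    Returns (coverage_with_referer, coverage_without_referer).
    """
    with_ref, without_ref = Counter(), Counter()
    for user_agent, has_referer, requests in rows:
        target = with_ref if has_referer else without_ref
        target[user_agent] += requests
    return top_n_coverage(with_ref, n), top_n_coverage(without_ref, n)


# Made-up sample data, for illustration only.
sample_rows = [
    ("Firefox", True, 60),
    ("Chrome", True, 30),
    ("Safari", True, 10),
    ("Spider", False, 80),
    ("-", False, 15),
    ("curl", False, 5),
]

with_cov, without_cov = coverage_by_referer(sample_rows, n=2)
print(f"with referer: {with_cov:.0%}, without referer: {without_cov:.0%}")
```

  With the real data, n would be 100 and the counts would come from the 
September aggregation.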
  
  This confirms that the requests providing a referer, despite being few in 
number, seem trustworthy. There is therefore nothing more to do for this task: 
the data is already available.

TASK DETAIL
  https://phabricator.wikimedia.org/T261841

To: Zbyszko, JAllemandou
Cc: CBogen, JAllemandou, Aklapper, Gehel, Alter-paule, Beast1978, Un1tY, 
Akuckartz, Hook696, darthmon_wmde, Kent7301, joker88john, CucyNoiD, Nandana, 
Namenlos314, Gaboe420, Giuliamocci, Cpaulf30, Lahi, Gq86, Af420, Bsandipan, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, Lewizho99, Maathavan, _jensen, rosalieper, Scott_WUaS, Jonas, 
Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331
_______________________________________________
Wikidata-bugs mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs