Ahh. What are pmcs?

On Wed, Oct 22, 2014 at 5:06 PM, Maximilian Klein <[email protected]> wrote:
> Out of interest, my regex was
>
> pmc\s*\=\s*(.*?)[\|\}]
>
> and then also
>
> pmid\s*\=\s*(.*?)[\|\}]
>
> with the ignorecase flag set.
>
> Make a great day,
> Max Klein ‽ http://notconfusing.com/
>
> On Wed, Oct 22, 2014 at 12:48 PM, Aaron Halfaker <[email protected]> wrote:
>
>> Hey folks,
>>
>> Somehow I missed this thread, but I've already addressed this request on
>> the Village Pump[1]. See:
>>
>> http://datasets.wikimedia.org/public-datasets/enwiki/etc/pmids.articles.20141008.tsv
>>
>> I extracted PMIDs with the following regex: /\bpmid *= *[0-9]+\b/i
>>
>> It includes page_id, page_namespace, page_title, rev_id (most recent), and
>> pmid as TAB-separated values.
>>
>> Let me know if you have questions or if you think the regex matching
>> strategy is insufficient. It's pretty quick to take another pass.
>>
>> 1. https://en.wikipedia.org/wiki/Wikipedia:Village_pump_(technical)#Extracting_PMIDs
>>
>> On Wed, Oct 22, 2014 at 1:27 PM, Maximilian Klein <[email protected]> wrote:
>>
>>> Jake,
>>> I have a script that already does this for DOIs; it was a one-line change
>>> to make. These files should answer what you were looking for.
>>>
>>> https://raw.githubusercontent.com/notconfusing/listiness/pmc/pmc_list.txt
>>>
>>> https://raw.githubusercontent.com/notconfusing/listiness/pmc/pmid_list.txt
>>>
>>> In the future you can tell them to use halfak's
>>> https://pythonhosted.org/mediawiki-utilities/
>>> This is the code I used to get those lists:
>>> https://github.com/notconfusing/listiness/commit/e140ce9202b9c1098dec40ca1da3ff135fd8c520
>>>
>>> Make a great day,
>>> Max Klein ‽ http://notconfusing.com/
>>>
>>> On Mon, Oct 20, 2014 at 9:20 PM, Andrew G. West <[email protected]> wrote:
>>>
>>>> Jake,
>>>>
>>>> Yes, it's a rather straightforward parse based on the citation format
>>>> which Jeremy described. Doc James and I already have this coded up for a
>>>> soon-to-be-published [[WP:MED]] readership/editorship paper.
>>>>
>>>> Searching for PMIDs in the entirety of the Wikipedia article base
>>>> would be a bit time-consuming -- but if one needs to pull down only
>>>> articles in WikiProject Medicine, for example, I am also able to help on
>>>> that front.
>>>>
>>>> Perhaps we'll take this offline, but if anyone else is interested in
>>>> the dirty details, feel free to contact one of us off-list. -AW
>>>>
>>>> --
>>>> Andrew G. West, PhD
>>>> http://www.andrew-g-west.com
>>>>
>>>> On 10/20/2014 11:57 PM, Jake Orlowitz wrote:
>>>>
>>>>> Hi folks,
>>>>>
>>>>> Relaying a question from a Stanford medical researcher:
>>>>>
>>>>> "Do you know if it is possible to extract PubMed ID (PMID) or PMCIDs
>>>>> from Wiki references? Furthermore, could you dump those IDs out into a
>>>>> list for analysis?"
>>>>>
>>>>> Best,
>>>>> Jake Orlowitz (Ocaasi)
>>>>>
>>>>> _______________________________________________
>>>>> Wiki-research-l mailing list
>>>>> [email protected]
>>>>> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
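
For anyone comparing the two regex strategies quoted in this thread, here is a minimal Python sketch that runs both against a small piece of wikitext. It is only an illustration, not the code either author actually ran: the sample string, the function name extract_ids, and the dictionary keys are made up for this example, and a capture group has been added to the digits-only pattern (the regex as posted has none) so the number can be pulled out.

```python
import re

# Patterns as quoted in the thread (ignorecase on): the value is captured
# lazily up to the next "|" or "}" of the citation template.
PMC_LOOSE_RE = re.compile(r"pmc\s*\=\s*(.*?)[\|\}]", re.IGNORECASE)
PMID_LOOSE_RE = re.compile(r"pmid\s*\=\s*(.*?)[\|\}]", re.IGNORECASE)

# Digits-only pattern from the thread, /\bpmid *= *[0-9]+\b/i.
# ASSUMPTION: the parentheses are added here to extract the number.
PMID_DIGITS_RE = re.compile(r"\bpmid *= *([0-9]+)\b", re.IGNORECASE)


def extract_ids(wikitext):
    """Return the raw values found by each pattern (hypothetical helper)."""
    return {
        # The lazy group can pick up trailing spaces, so strip them.
        "pmc": [v.strip() for v in PMC_LOOSE_RE.findall(wikitext)],
        "pmid_loose": [v.strip() for v in PMID_LOOSE_RE.findall(wikitext)],
        "pmid_digits": PMID_DIGITS_RE.findall(wikitext),
    }


# Made-up sample wikitext with one empty pmid parameter.
SAMPLE = (
    "{{cite journal |title=A |pmid = 12345 |pmc=67890}} "
    "{{cite journal |PMID=222 |title=B}} "
    "{{cite journal |pmid= |title=C}}"
)

if __name__ == "__main__":
    for name, values in extract_ids(SAMPLE).items():
        print(name, values)
```

On this sample the loose pattern also reports the empty pmid parameter as an empty string, while the digits-only pattern skips it, so the loose results would probably need filtering before being dumped into a list.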
