Ranking the whole MEDLINE database according to a large training set using text indexing.

Brian P Suomela and Miguel A Andrade.
BMC Bioinformatics 2005, 6:75.

Supplementary material.

Additional file 1. Stem Cell references
List in plain text format (stemcellpapers.txt) of 81,416 PubMed Identifiers (PMIDs) linked to abstracts in MEDLINE that have one or more MeSH terms belonging to the set of terms related to stem cells.

Additional file 2. Noun, adjective, and verb scores
A zip-compressed file (file2.zip) containing three lists in plain text format (sortednounscores.txt, sortedadjectivescores.txt, sortedverbscores.txt) of the computed scores for 2,256 nouns, 1,193 adjectives, and 748 verbs.

Additional file 3. Stem Cell MEDLINE reference scores
A zip-compressed file (file3.zip) containing lists in plain text format (stemcellpaperscores-adjectives.txt, stemcellpaperscores-nouns.txt, stemcellpaperscores-nounsadjectives.txt, stemcellpaperscores-verbs.txt) of the 81,416 PMIDs of stem cell references and their scores according to nouns, adjectives, verbs, and combined nouns/adjectives.

Additional file 4. Subset of MEDLINE reference scores
A zip-compressed file (file4.zip) containing lists in plain text format (paperscores-adjectives.txt, paperscores-nouns.txt, paperscores-nounsadjectives.txt, paperscores-verbs.txt) of 81,416 PMIDs of references randomly selected from MEDLINE and their scores according to nouns, adjectives, verbs, and combined nouns/adjectives.

Additional file 5. List of 6,923 scored abstracts
A zip-compressed file (file5.zip) containing a table in plain text format with tab-separated columns (paperscores-nouns-recent.txt) of 6,923 PMIDs of references not included in the training set, with their scores and a human evaluation of their relevance to the topic of stem cells.
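
For readers who want to load the tab-separated table from Additional file 5, the following is a minimal sketch in Python. The column order (PMID, score, human evaluation) is an assumption, since the listing above does not specify it; the names ScoredAbstract, parse_scored_abstracts, and rank_by_score are hypothetical helpers, and the sample rows are synthetic placeholders rather than values from the file.

```python
import csv
from typing import Iterable, List, NamedTuple


class ScoredAbstract(NamedTuple):
    pmid: str
    score: float
    evaluation: str


def parse_scored_abstracts(lines: Iterable[str]) -> List[ScoredAbstract]:
    """Parse tab-separated rows, ASSUMING the order: PMID, score, evaluation."""
    rows = []
    for fields in csv.reader(lines, delimiter="\t"):
        if len(fields) < 3:
            continue  # skip blank or incomplete lines
        try:
            score = float(fields[1])
        except ValueError:
            continue  # skip a header row or any non-numeric score
        rows.append(ScoredAbstract(fields[0].strip(), score, fields[2].strip()))
    return rows


def rank_by_score(rows: List[ScoredAbstract]) -> List[ScoredAbstract]:
    """Return the rows sorted from highest to lowest score."""
    return sorted(rows, key=lambda r: r.score, reverse=True)


if __name__ == "__main__":
    # Synthetic placeholder rows, not taken from paperscores-nouns-recent.txt.
    sample = ["1\t0.5\trelevant\n", "2\t1.5\tnot relevant\n"]
    for row in rank_by_score(parse_scored_abstracts(sample)):
        print(row.pmid, row.score, row.evaluation)
```

To read the real file, pass an open file handle instead of the sample list, for example `open("paperscores-nouns-recent.txt", newline="")`, and adjust the column indices if the actual layout differs.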