PageRank without hyperlinks: Reranking with PubMed related article networks for biomedical text retrieval
2008

Improving Biomedical Text Retrieval with PageRank

Publication Evidence: moderate

Author Information

Author(s): Jimmy Lin

Primary Institution: National Center for Biotechnology Information, National Library of Medicine

Hypothesis

Can related article networks be exploited for text retrieval in the same manner as hyperlink graphs on the Web?

Conclusion

The link structure of content-similarity networks can be exploited to improve the effectiveness of information retrieval systems.

Supporting Evidence

  • Incorporating PageRank scores yields significant improvements in ranked-retrieval metrics.
  • The study confirms that related document networks can enhance retrieval effectiveness.
  • Statistical tests showed significant improvements over baseline retrieval scores.

Takeaway

This study shows that links between similar articles can be used to rank relevant documents higher, much as hyperlinks are used to rank pages on the Web.

Methodology

Experiments used the TREC 2005 genomics track test collection. PageRank and HITS scores computed over the related article network were combined with the scores from a standard retrieval engine.
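
The combination step described above can be illustrated with a minimal Python sketch. It is not the study's implementation: the toy graph, the document IDs, the min-max normalization, the weighted-sum formula, and the `weight` parameter are all assumptions made for illustration, since this summary does not state how the scores were combined. HITS is omitted for brevity.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict {node: [out-neighbors]}."""
    nodes = set(graph) | {v for targets in graph.values() for v in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without out-links is spread uniformly.
        dangling = sum(rank[u] for u in nodes if not graph.get(u))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for u, targets in graph.items():
            if targets:
                share = damping * rank[u] / len(targets)
                for v in targets:
                    new_rank[v] += share
        rank = new_rank
    return rank


def min_max(scores):
    """Rescale scores to [0, 1] so the two signals are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {k: 0.0 for k in scores}
    return {k: (v - lo) / (hi - lo) for k, v in scores.items()}


def rerank(retrieval_scores, graph, weight=0.2):
    """Rerank hits by a weighted sum of retrieval score and PageRank.

    The weighted sum and the default weight are illustrative assumptions.
    """
    pr = pagerank(graph)
    pr_for_hits = {doc: pr.get(doc, 0.0) for doc in retrieval_scores}
    norm_ret = min_max(retrieval_scores)
    norm_pr = min_max(pr_for_hits)
    combined = {
        doc: (1.0 - weight) * norm_ret[doc] + weight * norm_pr[doc]
        for doc in retrieval_scores
    }
    return sorted(combined, key=combined.get, reverse=True)


# Hypothetical related-article network: an edge A -> B means
# "B is listed as related to A". IDs and scores are made up.
related = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
engine_scores = {"A": 2.0, "B": 1.9, "C": 1.0, "D": 2.1}

if __name__ == "__main__":
    print(rerank(engine_scores, related, weight=0.2))
```

Setting `weight=0.0` reproduces the baseline ranking from the retrieval engine alone, while larger values give more influence to a document's position in the related article network.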

Limitations

The study did not perform second-order expansions of related documents (that is, it did not add the related articles of related articles), which may have limited the density of the network.

Statistical Information

P-Value

0.01453

Statistical Significance

p<0.05

Digital Object Identifier (DOI)

10.1186/1471-2105-9-270
