TY - GEN
T1 - Keyphrase extraction-based query expansion in digital libraries
AU - Song, Min
AU - Song, Il Yeol
AU - Allen, Robert B.
AU - Obradovic, Zoran
PY - 2006
Y1 - 2006
N2 - In pseudo-relevance feedback, the two key factors affecting the retrieval performance most are the source from which expansion terms are generated and the method of ranking those expansion terms. In this paper, we present a novel unsupervised query expansion technique that utilizes keyphrases and POS phrase categorization. The keyphrases are extracted from the retrieved documents and weighted with an algorithm based on information gain and co-occurrence of phrases. The selected keyphrases are translated into Disjunctive Normal Form (DNF) based on the POS phrase categorization technique for better query reformulation. Furthermore, we study whether ontologies such as WordNet and MeSH improve the retrieval performance in conjunction with the keyphrases. We test our techniques on TREC 5, 6, and 7 as well as a MEDLINE collection. The experimental results show that the use of keyphrases with POS phrase categorization produces the best average precision.
AB - In pseudo-relevance feedback, the two key factors affecting the retrieval performance most are the source from which expansion terms are generated and the method of ranking those expansion terms. In this paper, we present a novel unsupervised query expansion technique that utilizes keyphrases and POS phrase categorization. The keyphrases are extracted from the retrieved documents and weighted with an algorithm based on information gain and co-occurrence of phrases. The selected keyphrases are translated into Disjunctive Normal Form (DNF) based on the POS phrase categorization technique for better query reformulation. Furthermore, we study whether ontologies such as WordNet and MeSH improve the retrieval performance in conjunction with the keyphrases. We test our techniques on TREC 5, 6, and 7 as well as a MEDLINE collection. The experimental results show that the use of keyphrases with POS phrase categorization produces the best average precision.
UR - http://www.scopus.com/inward/record.url?scp=34247260471&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=34247260471&partnerID=8YFLogxK
U2 - 10.1145/1141753.1141800
DO - 10.1145/1141753.1141800
M3 - Conference contribution
AN - SCOPUS:34247260471
SN - 1595933549
SN - 9781595933546
T3 - Proceedings of the ACM/IEEE Joint Conference on Digital Libraries
SP - 202
EP - 209
BT - 6th ACM/IEEE-CS Joint Conference on Digital Libraries 2006
T2 - 6th ACM/IEEE-CS Joint Conference on Digital Libraries 2006: Opening Information Horizons, JCDL '06
Y2 - 11 June 2006 through 15 June 2006
ER -