Abstract
This work proposes a semantic retrieval approach that addresses the semantic ambiguity of indexed terms and the uncertainty and imprecision inherent in the information retrieval process. The approach consists of three phases. In the first phase, the meaning of the query is discovered by formulating a set of candidate queries from its possible contexts. A score is calculated for each candidate based on its semantic tree and the inherent dispersion among its concepts; this score assesses the overall meaning of the candidate query. The phase ends by selecting the highest-scoring candidate as the best representative of the original query. In the second phase, a semantic index is built that exploits both the classic and the semantic characteristics of the document concepts and assigns each concept a weight estimating its relative importance. The third phase proposes a ranking model that uses the semantic similarities and relations between concepts to calculate query-document relevance. This ranking model is based on a query likelihood language model and a conceptual weighting model. The approach was evaluated by comparing its performance with related benchmarks in terms of standard IR performance metrics. It outperformed the compared baselines on the measured metrics. A statistical significance test was conducted to verify that the improvements are genuine and not the result of random variation between the compared systems. The test supported the hypothesis that the obtained improvements were significant.
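For readers unfamiliar with the query likelihood language model named above, the following is a minimal, generic sketch of how such a model can rank documents by the probability of generating the query concepts. It is not the ranking model proposed in this work: the Dirichlet smoothing, the parameter `mu`, the function name `query_log_likelihood`, and the toy documents are all assumptions made for illustration, and the conceptual weighting component is omitted.

```python
import math
from collections import Counter


def query_log_likelihood(query, doc, collection_tf, collection_len, mu=2000.0):
    """Return log P(query | doc) under a unigram model.

    Dirichlet smoothing (an assumption of this sketch) blends the document
    frequencies with the collection frequencies, controlled by mu.
    """
    doc_tf = Counter(doc)
    doc_len = len(doc)
    score = 0.0
    for concept in query:
        p_collection = collection_tf.get(concept, 0) / collection_len
        p = (doc_tf[concept] + mu * p_collection) / (doc_len + mu)
        if p == 0.0:
            # The concept never occurs in the collection at all.
            return float("-inf")
        score += math.log(p)
    return score


# Hypothetical toy collection: each document is a list of concepts.
docs = {
    "d1": ["bank", "loan", "interest", "bank"],
    "d2": ["river", "bank", "water", "fish"],
}
collection_tf = Counter(c for d in docs.values() for c in d)
collection_len = sum(collection_tf.values())

query = ["bank", "loan"]
ranking = sorted(
    docs,
    key=lambda name: query_log_likelihood(
        query, docs[name], collection_tf, collection_len, mu=10.0
    ),
    reverse=True,
)
print(ranking)
```

In this toy collection the document about financial banks ranks above the one about river banks, because it contains both query concepts. A concept-weighted variant could, for example, scale each log term by a per-concept weight taken from the semantic index, but that weighting scheme is not specified here.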