Title page for ETD etd-02032004-163252
Type of Document Dissertation
Author Mohamed, Khaled Abd El-Fatah
Author's Email Address khaledma1@hotmail.com
URN etd-02032004-163252
Title Merging Multiple Search Results Approach for Meta-Search Engines
Degree Doctor of Philosophy
Program Library and Information Science
School School of Information Sciences
Advisory Committee
Advisor Name Title
Christinger Tomer Committee Chair
Amy Knapp Committee Member
Donald King Committee Member
José-Marie Griffiths Committee Member
Keywords
  • Meta-Engines
  • resources combination
  • experimental method
  • rank aggregation
  • merging Algorithm
  • combining retrieved documents
  • www
  • world wide web
  • combination
  • Web Retrieving
  • Information Retrieval
  • Data Fusion
  • Meta-Search Engines
  • retrieval experiment
Date of Defense 2004-01-29
Availability unrestricted
Abstract
Meta-search engines are finding tools developed to enhance search performance by submitting user queries to multiple search engines and combining the search results into a unified ranked list. They use data fusion techniques, which require three major steps: database selection, results combination, and results merging.

This study tries to build a framework that can be used for merging the search results retrieved from any set of search engines. The framework is based on answering three major questions:

1. How can meta-search developers define the optimal rank order for the selected engines?

2. How can meta-search developers choose the best combination of search engines?

3. What is the optimal heuristic merging function for aggregating the rank order of documents retrieved from incomparable search engines?

The main data collection process consisted of running 40 general queries on three major search engines (Google, AltaVista, and Alltheweb). Real users took part in the relevance judgment process, rating results on a five-point relevance scale. The performance of the three search engines, their different combinations, and different merging algorithms was compared in order to rank the databases, choose the best combination, and define the optimal merging function.
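The comparison described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that relevance grades are coded 0 to 4, that unjudged documents count as 0, and that engines are scored by the mean grade of their top-k results averaged over queries; the abstract does not state which performance measure the study actually used. The names `mean_relevance_at_k`, `rank_engines`, `runs`, and `judgments` are hypothetical.

```python
def mean_relevance_at_k(results, judgments, k=10):
    """Average graded relevance of the top-k results for one query.

    Assumption: unjudged documents are treated as grade 0.
    """
    top = results[:k]
    if not top:
        return 0.0
    return sum(judgments.get(doc, 0) for doc in top) / len(top)


def rank_engines(runs, judgments, k=10):
    """Order engines by their mean top-k relevance across all queries.

    runs:      {engine: {query: [doc, ...]}}   ranked result lists
    judgments: {query: {doc: grade}}           user relevance grades (assumed 0-4)
    """
    averages = {}
    for engine, per_query in runs.items():
        scores = [
            mean_relevance_at_k(docs, judgments.get(query, {}), k)
            for query, docs in per_query.items()
        ]
        averages[engine] = sum(scores) / len(scores) if scores else 0.0
    # Best engine first; ties broken by name so the order is deterministic.
    order = sorted(averages, key=lambda e: (-averages[e], e))
    return order, averages


# Toy data with placeholder engine names, not results from the study.
runs = {
    "engine_a": {"q1": ["d1", "d2"]},
    "engine_b": {"q1": ["d3", "d1"]},
}
judgments = {"q1": {"d1": 4, "d2": 2, "d3": 1}}
order, averages = rank_engines(runs, judgments, k=2)
```

An ordering produced this way reflects measured performance on the judged queries rather than an engine's popularity or index size.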

The major findings of this study are: (1) ranking the databases in the merging process should depend on their overall performance, not their popularity or size; (2) larger databases tend to perform better than smaller databases; (3) the combination of search engines should depend on ranking the databases and choosing the appropriate combination function; (4) search engines tend to retrieve more overlapping relevant documents than overlapping irrelevant documents; and (5) merging functions that take overlapped documents into account tend to perform better than the interleave and rank similarity functions.
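To make the contrast in finding (5) concrete, the following is a minimal sketch of two merging strategies. The interleave function is a plain round-robin merge. The overlap-aware function is a Borda-style stand-in chosen for illustration, not the merging function developed in the dissertation, whose definition the abstract does not give. The function names and the toy result lists are hypothetical.

```python
from collections import defaultdict
from itertools import zip_longest


def interleave_merge(ranked_lists):
    """Round-robin merge: rank 1 from each engine, then rank 2, and so on.

    Duplicates are dropped; overlap between engines earns no extra credit.
    """
    merged, seen = [], set()
    for tier in zip_longest(*ranked_lists):
        for doc in tier:
            if doc is not None and doc not in seen:
                seen.add(doc)
                merged.append(doc)
    return merged


def overlap_aware_merge(ranked_lists):
    """Illustrative Borda-style merge that rewards overlapped documents.

    Each engine awards (list length - position) points to a document.
    Documents returned by more engines sort first, then by total points.
    """
    points = defaultdict(int)
    hits = defaultdict(int)
    for results in ranked_lists:
        for position, doc in enumerate(results):
            points[doc] += len(results) - position
            hits[doc] += 1
    # Ties broken by document id so the output is deterministic.
    return sorted(points, key=lambda d: (-hits[d], -points[d], d))


# Toy ranked lists from three hypothetical engines.
engine_a = ["d1", "d2", "d3"]
engine_b = ["d3", "d4", "d1"]
engine_c = ["d5", "d3", "d6"]

interleaved = interleave_merge([engine_a, engine_b, engine_c])
overlap_ranked = overlap_aware_merge([engine_a, engine_b, engine_c])
```

In this toy example, `d3` is returned by all three engines and moves to the top of the overlap-aware list, while interleaving leaves it behind `d1` simply because of list order.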

In addition to these findings, the study developed a set of requirements for a successful merging process. The procedure includes database selection, combination, and merging based on heuristic solutions.

Files
  Filename          Size        Approximate Download Time (Hours:Minutes:Seconds)
                                28.8 Modem   56K Modem   ISDN (64 Kb)   ISDN (128 Kb)   Higher-speed Access
  mohamed_732.pdf   669.88 Kb   00:03:06     00:01:35    00:01:23       00:00:41        00:00:03
If you have questions or comments please send mail to ETD-Feedback or view
the University of Pittsburgh Electronic Theses and Dissertations (ETD) Project page.