Abstract:
The Web has drastically changed the online availability of data and the amount of
electronically exchanged information. However, the volume and heterogeneity of the
information that is available online via Websites or databases make it difficult for a
user to visit each and every Website that is relevant to the information needed. Primary
search tools, i.e. search engines, subject directories and social network search engines,
are not enough to meet the requirements of the information seeker.
Traditional search engines are based on keyword or phrase search, without taking into
account the semantics of the word or phrase, and hence may not provide the desired
results to the user. Other traditional search tools suffer from low recall and precision.
These tools do not provide comprehensive coverage of the Web. To overcome these
problems, meta-search engines aim to offer topic-specific search using multiple
heterogeneous search engines.
In the human resource domain, traditional methods of job/employee search, i.e.
newspapers, magazines, advertising at job fairs, employment recruitment agencies and
registering with search firms, do not offer adequate search capabilities for the modern
employment market. In this dissertation, we propose a new configurable meta-search
engine in the human resource domain to provide an ideal platform for both the meta-search
provider and the job seeker. Our aim is to combine the respective benefits of vertical search engines,
meta-search engines and semantic search engines within a domain-specific context, in
which there is a well-understood domain ontology.
We are concerned with techniques to support two key aspects of meta-search engines:
i) meta-search engine creation by meta-search engine providers and ii) meta-search
engine usage by information seekers. One of the important challenges in accessing
heterogeneous and distributed data via a meta-search engine is schema/data matching
and integration. We describe an approach to schema and data integration for meta-
search engines. During the matching and integration process, we need to handle
syntactic, semantic and structural heterogeneity between multiple information sources.
In this dissertation, our main objective is to resolve semantic conflicts. Our approach is
a hybrid one, in that we use multiple matching criteria and multiple matchers. We
employ several element-level, structure-level and ontology-based techniques during
the integration process. A domain ontology serves as a global ontology and allows us
to resolve semantic heterogeneity. Our matching process handles different mapping
cardinalities (1:1, 1:n, n:1, m:n). The mappings derived are used to generate an
integrated meta-search query interface, to support query processing in the meta-search
engine, and to resolve semantic conflicts arising during result extraction from the
source search engines. Experiments conducted in the job search domain show that the
cumulative use of element-level, structure-level and ontology-based techniques
increases the correctness of matching during the automatic integration of source search
interfaces.
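The hybrid matching idea described above can be illustrated with a minimal sketch. It is
not the dissertation's implementation: the toy ontology, the field dictionaries, the
weights, the threshold and all function names are assumptions chosen for illustration.
It combines an element-level matcher (label similarity), a structure-level matcher
(same input widget type) and an ontology-based matcher (both labels map to the same
global concept) into a single weighted score.

```python
from difflib import SequenceMatcher

# Hypothetical toy ontology: global concept -> known synonyms (illustrative only).
ONTOLOGY = {
    "job_title": {"job title", "position", "role"},
    "location": {"location", "city", "region"},
    "salary": {"salary", "pay", "compensation"},
}

def normalize(label):
    """Lower-case a label and collapse underscores and whitespace."""
    return " ".join(label.lower().replace("_", " ").split())

def element_score(a, b):
    """Element-level matcher: string similarity of two field labels."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()

def structure_score(field_a, field_b):
    """Structure-level matcher: 1.0 if both fields use the same widget type."""
    return 1.0 if field_a["type"] == field_b["type"] else 0.0

def concept_of(label):
    """Return the global concept a label belongs to, or None if unknown."""
    name = normalize(label)
    for concept, synonyms in ONTOLOGY.items():
        if name in synonyms:
            return concept
    return None

def ontology_score(a, b):
    """Ontology-based matcher: 1.0 if both labels map to the same concept."""
    concept_a, concept_b = concept_of(a), concept_of(b)
    return 1.0 if concept_a is not None and concept_a == concept_b else 0.0

def combined_score(field_a, field_b, weights=(0.3, 0.2, 0.5)):
    """Weighted sum of the three matchers (weights are assumed, not tuned)."""
    w_el, w_st, w_on = weights
    return (w_el * element_score(field_a["label"], field_b["label"])
            + w_st * structure_score(field_a, field_b)
            + w_on * ontology_score(field_a["label"], field_b["label"]))

def match(source_a, source_b, threshold=0.6):
    """Return every (label_a, label_b, score) pair scoring at least threshold."""
    pairs = []
    for fa in source_a:
        for fb in source_b:
            score = combined_score(fa, fb)
            if score >= threshold:
                pairs.append((fa["label"], fb["label"], score))
    return pairs

# Two hypothetical job-search interfaces.
engine_a = [{"label": "Position", "type": "text"},
            {"label": "City", "type": "select"}]
engine_b = [{"label": "Job Title", "type": "text"},
            {"label": "Location", "type": "select"}]

for label_a, label_b, score in match(engine_a, engine_b):
    print(f"{label_a} <-> {label_b}: {score:.2f}")
```

Because match keeps every pair above the threshold rather than the single best
candidate per field, one field may be mapped to several fields of the other source,
which is one simple way to allow mappings beyond 1:1.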
The system supports the meta-search provider in the quick development of meta-search
engines and is able to understand and integrate schemas from different job search
engines semantically. The meta-search provider can easily integrate new search engines
into the meta-search engine. The system helps job seekers search for jobs without
visiting multiple search engines. Job seekers do not need to spend time combing
through large numbers of job results to find a relevant job. The system can
semantically understand the job results and rank them for the job seekers.
An important aspect of our meta-search engine in the human resource domain is that it
has been designed by applying semantic Web technologies to solve the problems of
meta-search developers and job seekers. We provide solutions for the automatic
integration of data, structures and processes in the human resource domain into a
meta-search engine, using our modelled domain ontology and multiple matchers. We have
used HR-XML and different classification schemes in the construction of the domain
ontology and the integrated interface for the meta-search engine. Our modelled domain
ontology and HR-XML are used in generating the integrated schema and integrated
interface, to capture the meaning of terms and to improve the quality of the search
interface and search results.
Flexible and reusable design patterns have been introduced for the creation process,
the usage process and the different components of the meta-search engine. The design
pattern for the creation process helps the meta-search provider, and the design pattern
for the usage process helps the job seeker. The design patterns for the different
components of the meta-search engine help new developers to speed up the development process.
Meta-search increases Web coverage for the job seeker by combining a specialized
search engine, multiple search engines and semantic search into one. We
hope that the new meta-search engine can help to reduce the unemployment rate of
a country.