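The laboratory's paper “Query Expansion with Enriched User Profiles for Personalized Search Utilizing Folksonomy Data” was recently accepted as a regular paper by IEEE Transactions on Knowledge and Data Engineering (IEEE TKDE), a top international journal. IEEE TKDE is one of the most influential journals in knowledge and data engineering. It covers the latest research and technology in knowledge discovery and data mining, databases and data modeling, parallel and distributed data management systems, data-intensive scalable computing architectures, mobile systems, search engines, and data engineering applications, and it is a Class A journal recommended by the China Computer Federation (CCF). This is the laboratory's first paper published in a CCF Class A journal. Brief information on the paper follows: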
=================================================================================================
Title: Query Expansion with Enriched User Profiles for Personalized Search Utilizing Folksonomy Data
Authors: Dong Zhou, Xuan Wu, Wenyu Zhao, Séamus Lawless and Jianxun Liu
Source publication: IEEE Transactions on Knowledge and Data Engineering (IEEE TKDE)
Abstract: Query expansion has been widely adopted in Web search as a way of tackling the ambiguity of queries. Personalized search utilizing folksonomy data has demonstrated an extreme vocabulary mismatch problem that requires even more effective query expansion methods. Co-occurrence statistics, tag-tag relationships and semantic matching approaches are among those favored by previous research. However, user profiles which only contain a user’s past annotation information may not be enough to support the selection of expansion terms, especially for users with limited previous activity with the system. We propose a novel model to construct enriched user profiles with the help of an external corpus for personalized query expansion. Our model integrates the current state-of-the-art text representation learning framework, known as word embeddings, with topic models in two groups of pseudo-aligned documents. Based on user profiles, we build two novel query expansion techniques. These two techniques are based on topical weights-enhanced word embeddings, and the topical relevance between the query and the terms inside a user profile respectively. The results of an in-depth experimental evaluation, performed on two real-world datasets using different external corpora, show that our approach outperforms traditional techniques, including existing non-personalized and personalized query expansion methods.
(Editor: Zhou Dong)