Mining top-k Popular Datasets via a Deep Generative Model

Uchenna Akujuobi; Ke Sun; Xiangliang Zhang

Conference proceeding

The Institute of Electrical and Electronics Engineers, Inc. (IEEE) Conference Proceedings, p.584

01/01/2018

Abstract

Artificial intelligence

Citation analysis

Data management

Data mining

Datasets

Domains

Machine learning

Conference Title: 2018 IEEE International Conference on Big Data (Big Data)
Conference Start Date: 2018, Dec. 10
Conference End Date: 2018, Dec. 13
Conference Location: Seattle, WA, USA

Finding popular datasets to work on is essential for data-driven research domains. In this paper, we focus on the problem of extracting top-k popular datasets that have been used in data mining, machine learning, and artificial intelligence fields. We solve this problem on an attributed citation network, which includes node content information (text of published papers) and paper citation relations. By formulating the problem as a semi-supervised multi-label classification one, we develop an efficient deep generative model for learning from both the document content and citation relations. The evaluation on a real-world dataset shows that our proposed model outperforms baseline methods. We then apply the model further to reveal the top-k frequently cited datasets in selected areas and report interesting findings.

Details

Title: Mining top-k Popular Datasets via a Deep Generative Model
Creators: Uchenna Akujuobi; Ke Sun; Xiangliang Zhang
Publication Details: The Institute of Electrical and Electronics Engineers, Inc. (IEEE) Conference Proceedings, p.584
Publisher: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Identifiers: 9943824908331
Academic Unit: King Abdullah University of Science & Technology
Language: English
Resource Type: Conference proceeding