Min Song
Yonsei University
Data mining, Information retrieval, Data science, Computer science, Text mining
Publications 234
#1 Jian Xu, H-Index: 6
#2 Sunkyu Kim, H-Index: 7
Last. Vetle I. Torvik, H-Index: 22
15 authors in total
PubMed is an essential resource for the medical domain, but useful concepts are either difficult to extract from PubMed or are ambiguous, which has significantly hindered knowledge discovery. To address this issue, we constructed a PubMed knowledge graph (PKG) by extracting bio-entities from 29 million PubMed abstracts, disambiguating author names, integrating funding data through the National Institutes of Health (NIH) ExPORTER, affiliation history and educational b...
#1 Ali Daud, H-Index: 1
#2 Min Song, H-Index: 18
Last. Anwar Ghani, H-Index: 3
7 authors in total
#1 Gyeong Taek Lee (Yonsei University)
#2 Chang Ouk Kim (Yonsei University), H-Index: 18
Last. Min Song (Yonsei University), H-Index: 18
3 authors in total
Sentiment analysis plays an important role in understanding individual opinions expressed on websites such as social media and product review sites. The common approaches to sentiment analysis use ...
#1 Min Song (Yonsei University), H-Index: 18
#2 Keun Young Kang (Yonsei University), H-Index: 4
Last. Xinyuan Zhang (WHU: Wuhan University)
4 authors in total
Acknowledgements have been examined as important elements in measuring the contributions to and intellectual debts of a scientific publication. Unlike previous studies, which were limited in scope and relied on manual examination, the present study aimed to classify acknowledgements automatically on a large scale. To this end, we first created a training dataset for acknowledgements classification by sampling the acknowledgements sections from the entire PubMed Centr...
#1 Yoo Kyung Jeong (Yonsei University), H-Index: 5
#2 Qing Xie (Yonsei University), H-Index: 1
Last. Min Song (Yonsei University), H-Index: 18
4 authors in total
The current study has two objectives. First, we explore the characteristics of biological entities, such as drugs, and their side effects using an author–entity pair bipartite network. Second, we use the constructed network to examine whether there are outstanding features of relations between drugs and side effects. We extracted drug and side effect names from 169,766 PubMed abstracts published between 2010 and 2014 and constructed author–entity pair bipartite networks after ambiguous a...
#1 Hyeonseo Lee (Yonsei University)
#2 Nakyeong Lee (Yonsei University)
Last. Min Song, H-Index: 18
4 authors in total
The rapid growth of digital data generation has led to the era of big data, which has become particularly valuable because approximately 70% of the data collected worldwide comes from social media. Thus, the investigation of online social network services is of paramount importance. In this paper, we use sentiment analysis, which detects attitudes and emotions toward social issues posted on social media, to understand the actual economic situation. To this end, two steps ...
#1 Go Eun Heo (Yonsei University), H-Index: 6
#2 Qing Xie (Yonsei University), H-Index: 1
Last. Jeong-Hoon Lee (POSTECH: Pohang University of Science and Technology), H-Index: 8
4 authors in total
Background: Extracting useful information from biomedical literature plays an important role in the development of modern medicine. In natural language processing, there have been rigorous attempts to find meaningful relationships between entities automatically by co-occurrence-based methods. It has been increasingly important to understand whether relationships exist, and if so how strong, between any two entities extracted from a large number of texts. One of the defining methods is to measure ...
1 citation
Systematic scientometric reviews, empowered by computational and visual analytic approaches, offer opportunities to improve the timeliness, accessibility, and reproducibility of studies of the literature of a field of research. On the other hand, effectively and adequately identifying the most representative body of scholarly publications as the basis of subsequent analyses remains a common bottleneck in the current practice. What can we do to reduce the risk of missing something potentially sig...
3 citations
Jul 17, 2019 in AAAI (National Conference on Artificial Intelligence)
#1 Reinald Kim Amplayo (Yonsei University), H-Index: 4
#2 Seung-won Hwang (Yonsei University), H-Index: 24
Last. Min Song (Yonsei University), H-Index: 18
3 authors in total
Word sense induction (WSI), or the task of automatically discovering multiple senses or meanings of a word, has three main challenges: domain adaptability, novel sense detection, and sense granularity flexibility. While current latent variable models are known to solve the first two challenges, they are not flexible to different word sense granularities, which differ greatly among words, from "aardvark" with one sense to "play" with over 50 senses. Current models either require hyperparameter tun...
#1 Xiaomin Liang, H-Index: 1
#2 Daifeng Li, H-Index: 8
Last. Yi Bu (IU: Indiana University Bloomington), H-Index: 1
6 authors in total
Advances in machine learning and deep learning methods, together with the increasing availability of large-scale pharmacological, genomic, and chemical datasets, have created opportunities for identifying potentially useful relationships within biochemical networks. Knowledge embedding models have been found to have value in detecting knowledge-based correlations among entities, but little effort has been made to apply them to networks of biochemical entities. This is because such networks tend ...
1 citation