Abstract
How to perform density-based clustering by exploiting textual attributes on social media has been insufficiently explored. In this paper, we aim to discover a social point-of-interest (POI) boundary, formed as a convex polygon. More specifically, we present a new approach and algorithm built upon our earlier work on social POI boundary estimation (SoBEst). The SoBEst approach takes into account both relevant and irrelevant records within a geographic area, where relevant records contain a POI name or its variations in their text field. Our study is motivated by the following empirical observation: for certain POIs, the fixed representative coordinate that SoBEst assumes for each POI may be far from the centroid of the estimated social POI boundary. Using SoBEst in such cases may result in unsatisfactory boundary estimation quality (BEQ), which is expressed as a function of the F-measure. To solve this problem, we formulate a joint optimization problem of simultaneously finding the radius of a circle and the POI's representative coordinate c, allowing c to be updated. Subsequently, we design an iterative SoBEst (I-SoBEst) algorithm, which enables us to achieve a higher BEQ for some POIs. The computational complexity of the proposed I-SoBEst algorithm is shown to scale linearly with the number of records. We demonstrate the superiority of our algorithm over competing clustering methods, including the original SoBEst.
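The idea of jointly searching for a circle radius and an updatable representative coordinate c can be illustrated with a minimal sketch. This is not the published I-SoBEst algorithm: the alternating update rule, the candidate-radius grid, the planar (x, y) coordinates, and the names `f_measure`, `best_radius`, and `iterate_center` are all assumptions made for illustration, and the convex-polygon step is omitted. Precision is taken here as the fraction of records inside the circle that are relevant, and recall as the fraction of all relevant records that fall inside it.

```python
import math

# Each record is (x, y, relevant), where relevant is True when the text
# field mentions the POI name. Planar coordinates are an assumption.

def f_measure(records, c, r):
    """F1 score of the circle with center c and radius r (illustrative BEQ proxy)."""
    inside = [rel for (x, y, rel) in records
              if math.hypot(x - c[0], y - c[1]) <= r]
    tp = sum(inside)                                  # relevant records inside
    total_rel = sum(rel for (_, _, rel) in records)   # all relevant records
    if tp == 0:
        return 0.0
    precision = tp / len(inside)
    recall = tp / total_rel
    return 2 * precision * recall / (precision + recall)

def best_radius(records, c, radii):
    """Pick the candidate radius with the highest F1 for a fixed center c."""
    return max(radii, key=lambda r: f_measure(records, c, r))

def iterate_center(records, c, radii, max_iter=20, tol=1e-9):
    """Alternate between choosing a radius and moving c (hypothetical update rule)."""
    r = best_radius(records, c, radii)
    for _ in range(max_iter):
        pts = [(x, y) for (x, y, rel) in records
               if rel and math.hypot(x - c[0], y - c[1]) <= r]
        if not pts:
            break
        # Move c to the centroid of the relevant records inside the circle.
        new_c = (sum(x for x, _ in pts) / len(pts),
                 sum(y for _, y in pts) / len(pts))
        moved = math.dist(new_c, c)
        c = new_c
        r = best_radius(records, c, radii)
        if moved < tol:
            break
    return c, r
```

A call such as `iterate_center(records, (8.0, 8.0), [1, 2, 3, 5])` starts from a fixed coordinate and lets it drift toward the relevant records. Each pass scans the records once per candidate radius, so the cost of this sketch grows linearly with the number of records for a fixed grid and iteration cap.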
Original language | English |
---|---|
Article number | 106710 |
Journal | Knowledge-Based Systems |
Volume | 213 |
DOIs | |
Publication status | Published - 2021 Feb 15 |
Bibliographical note
Funding Information: This research was supported by the Republic of Korea's MSIT (Ministry of Science and ICT), under the High-Potential Individuals Global Training Program (No. 2020-0-01463) supervised by the IITP (Institute of Information and Communications Technology Planning & Evaluation), by a grant of the Korea Health Technology R&D Project through the Korea Health Industry Development Institute (KHIDI), funded by the Ministry of Health & Welfare, Republic of Korea (HI20C0127), by the Technology Innovation Program (No. 10039010, Development of Lightweight Materials with Superb Mechanical Properties Based on AI) funded by the Ministry of Trade, Industry & Energy (MOTIE, Korea), and by the Yonsei University, Republic of Korea Research Fund of 2020 (2020-22-0101). This paper was presented in part at the ACM SIGMOD Workshop on Managing and Mining Enriched Geo-Spatial Data, San Francisco, CA, June 2016 [18].
Publisher Copyright:
© 2020 Elsevier B.V.
All Science Journal Classification (ASJC) codes
- Management Information Systems
- Software
- Information Systems and Management
- Artificial Intelligence