Instant code clone search

Mu Woong Lee, Jong Won Roh, Seung Won Hwang, Sunghun Kim

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

36 Citations (Scopus)

Abstract

In this paper, we propose a scalable instant code clone search engine for large-scale software repositories. While there are commercial code search engines available, they treat software as text and often fail to find semantically related code. Meanwhile, existing tools for semantic code clone searches take a "post-mortem" approach involving the detection of clones "after" the code development is completed, and hence, fail to return the results instantly. In clear contrast, we combine the strength of these two lines of existing research, by supporting instant code clone detection. To achieve this goal, we propose scalable indexing structures on vector abstractions of code. Our proposed algorithms allow developers to detect clones of a given code segment among the 1.7 million code segments from 492 open source projects in sub-second response times, without compromising the accuracy obtained by a state-of-the-art tool.

Original language: English
Title of host publication: Proceedings of the 18th ACM SIGSOFT International Symposium on Foundations of Software Engineering, FSE-18
Pages: 167-176
Number of pages: 10
Publication status: Published - 2010
Event: 18th ACM SIGSOFT International Symposium on the Foundations of Software Engineering, FSE-18 - Santa Fe, NM, United States
Duration: 2010 Nov 7 → 2010 Nov 11

Publication series

Name: Proceedings of the ACM SIGSOFT Symposium on the Foundations of Software Engineering

Other

Other: 18th ACM SIGSOFT International Symposium on the Foundations of Software Engineering, FSE-18
Country/Territory: United States
City: Santa Fe, NM
Period: 10/11/7 → 10/11/11

All Science Journal Classification (ASJC) codes

  • Software
