Abstract
In previous work, we described a technique for detecting design-level similar program structures (structural clones) formed from recurring configurations of similar code fragments (simple clones). In this paper, we analyze in detail how frequently structural clones occur in software systems and how structural clone analysis extends the benefits of analysis based on simple clones alone. Our case study of 11 open-source systems revealed that over 50% of simple clones are captured by structural clones that often correspond to meaningful design or application domain concepts. Because structural clones are larger than simple clones, programmers can perceive the similarities in a system more easily from the structural clone perspective than from the simple clone perspective alone. Using examples from the case study systems, we also discuss the contribution of structural clone detection to program understanding, design recovery, maintenance, and refactoring.