Automated training-set creation for software architecture traceability problem

Waleed Zogaan; Ibrahim Mujhid; Joanna C. S. Santos; Danielle Gonzalez; Mehdi Mirakhorli

doi:10.1007/s10664-016-9476-y

Journal article

Peer reviewed

Empirical software engineering : an international journal, Vol.22(3), pp.1028-1062

01/06/2017

DOI: https://doi.org/10.1007/s10664-016-9476-y

Abstract

Automated trace retrieval methods based on machine-learning algorithms can significantly reduce the cost and effort needed to create and maintain traceability links between requirements, architecture and source code. However, there is always an upfront cost to train such algorithms to detect relevant architectural information for each quality attribute in the code. In practice, training supervised or semi-supervised algorithms requires an expert to collect several files of architectural tactics that implement a quality requirement and use them to train a learning method. Establishing such a training set can take weeks to months to complete. Furthermore, the effectiveness of this approach is largely dependent upon the knowledge of the expert. In this paper, we present three baseline approaches for the creation of training data. These approaches are (i) Manual Expert-Based, (ii) Automated Web-Mining, which generates training sets by automatically mining tactics' APIs from technical programming websites, and lastly (iii) Automated Big-Data Analysis, which mines ultra-large-scale code repositories to generate training sets. We compare the trace-link creation accuracy achieved using each of these three baseline approaches and discuss the costs and benefits associated with them. Additionally, in a separate study, we investigate the impact of training set size on the accuracy of recovering trace links. The results indicate that automated techniques can create a reliable training set for the problem of tracing architectural tactics.

Subjects: Computer Science; Computer Science, Software Engineering; Science & Technology; Technology

Details

Title: Automated training-set creation for software architecture traceability problem
Creators:
Waleed Zogaan - Rochester Institute of Technology
Ibrahim Mujhid - Rochester Institute of Technology
Joanna C. S. Santos - Rochester Institute of Technology
Danielle Gonzalez - Rochester Institute of Technology
Mehdi Mirakhorli - Rochester Institute of Technology
Publication Details: Empirical software engineering : an international journal, Vol.22(3), pp.1028-1062
Publisher: Springer Nature
Number of pages: 35
Grant note: 1543176 / Division of Computing and Communication Foundations; National Science Foundation (NSF); NSF - Directorate for Computer & Information Science & Engineering (CISE)
Identifiers: 9917250108331
Academic Unit: Jazan University
Language: English
Resource Type: Journal article