Predicting Change Propagation in Software Systems
By Ahmed E. Hassan and Richard C. Holt
Summary
The paper addresses how a change to one entity of a software system propagates to other entities, and why better change propagation matters for keeping interdependent entities consistent. The authors take a heuristics-based approach to the challenges of change propagation. They note that source control systems record changes at the file level but do not track the changes required in the entities that depend on a changed entity. The authors try to fill this gap by working at the level of source code entities, capturing entity-level changes such as the addition or modification of a function and tracking changes in the dependencies between entities. They describe change propagation as an iterative process in which a developer consults an expert when unable to identify the next entity to change. The performance of the heuristics is measured with two metrics: recall and precision. An example scenario shows a heuristic in action: a developer working on a new bug fix builds up a change set by interacting with the heuristics and a human expert. Each heuristic is characterized by two aspects: the data source it draws on to suggest entities, and the pruning technique used to cut down the number of suggestions and so improve the precision of the heuristic. To validate the proposed heuristics, the researchers studied five systems developed for over 10 years, with 40 years of historical data. Based on an analysis of four heuristics, the authors suggest that historical information and code layout are better than code structure at helping to propagate changes. The researchers conclude that historical co-change records can help derive more heuristics, and they encourage further research into more sophisticated heuristics for better change propagation.
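To make the two metrics concrete, here is a minimal sketch in Python of how recall and precision could be computed for one predicted change set. The function name and the sample entity names are hypothetical, and returning 1 when a denominator is empty is only one reading of the convention questioned under Weaknesses, not necessarily the paper's exact definition.

```python
def recall_precision(predicted, occurred):
    """Return (recall, precision) for one change set.

    predicted: entities suggested by a heuristic.
    occurred:  entities that actually had to change.
    Assumption: an empty denominator yields 1.0.
    """
    predicted, occurred = set(predicted), set(occurred)
    hits = predicted & occurred  # correctly predicted entities

    recall = len(hits) / len(occurred) if occurred else 1.0
    precision = len(hits) / len(predicted) if predicted else 1.0
    return recall, precision


# Hypothetical example: three suggestions, two of them correct,
# while four entities actually changed.
recall, precision = recall_precision(
    predicted={"parse()", "lex()", "emit()"},
    occurred={"parse()", "lex()", "check()", "link()"},
)
print(recall, precision)
```

High recall means few entities that needed changing were missed; high precision means the developer wasted little time on wrong suggestions.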
Strengths
1. The paper is a good source for researchers interested in the empirical analysis of software change propagation. The question “how change propagates through software entities” is complicated, yet the authors approach it in a simple way, which is helpful.
2. The example scenario that describes how the heuristics for change propagation are measured is a strong point, as it successfully conveys the authors’ ideas.
3. The authors present the results of the experiment with a comprehensive set of data, including tables, graphs and definitions. They use five software products developed and maintained over a considerable amount of time, which strongly supports their findings.
Weaknesses
With due respect to the authors, here are several weaknesses that may be considered:
1. The researchers simplify their process by assuming that a developer will only query the heuristics about entities that have already been suggested, not about an arbitrary entity. This does not match real-world practice, where a developer may consider any entity that he or she thinks is a candidate for change propagation.
2. In the definitions of recall and precision, it is unclear why the values are set to 1 when there is no predicted entity. It would be nice if the mathematical background for this decision were provided.
3. None of the studied systems has a UI, and the results might be different for systems with one. This is a limitation of the experiment, given that UIs are common in software systems today.
Fine-Grained Analysis of Change Couplings
By Beat Fluri, Harald C. Gall, and Martin Pinzger
Summary
A change coupling involves two files that are committed at the same time, by the same author, and with the same commit description. In this paper, the authors focus on the structural changes in such commits and ask whether the majority of change couplings involve structural changes. They use the Eclipse IDE to retrieve useful information from release history data and to isolate the couplings specifically caused by structural changes. The process integrates Eclipse with the version control system, retrieves all modification reports, and stores them in a Release History Database (RHDB). A major part of the process consists of two concurrently running subprocesses: one clusters change coupling groups, and the other structurally compares two consecutively committed source files. To cluster changes, the authors take the relation analysis approach and store the clusters in matrices whose columns are the revision vectors. Special emphasis is put on detecting frequently committed change couples, which is done by observing the revision vectors. In parallel, to structurally compare two consecutively committed files, the authors use the Eclipse Compare plugin, which works in two steps: it first converts an entity into a synthesized format suitable for a differencing algorithm and then computes the differences, which are saved in the RHDB. The researchers merge these two types of information and then extract the change couples based on structural change with the help of a change coupling cluster browser. As a case study, they apply the filtering process to the Eclipse Compare plugin and find that more than half of the change couplings are not caused by structural changes. This finding prompts the authors to plan more case studies. They also plan to work on detecting patterns in structural change sets.
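The definition of a change coupling at the start of this summary can be sketched in a few lines of Python. This is an illustrative approximation, not the authors' Eclipse-based implementation: the record layout, the function names and the 60-second window used to decide what counts as "the same time" are all assumptions.

```python
from collections import Counter, defaultdict
from itertools import combinations

WINDOW_SECONDS = 60  # hypothetical tolerance for "committed at the same time"


def group_transactions(reports, window=WINDOW_SECONDS):
    """Group modification reports into commit transactions.

    reports: iterable of (file, author, message, timestamp_in_seconds).
    Reports share a transaction if author and message match and each
    timestamp lies within `window` seconds of the previous one.
    """
    by_key = defaultdict(list)
    for file, author, message, ts in reports:
        by_key[(author, message)].append((ts, file))

    transactions = []
    for entries in by_key.values():
        entries.sort()  # order by timestamp
        last_ts, first_file = entries[0]
        current = {first_file}
        for ts, file in entries[1:]:
            if ts - last_ts <= window:
                current.add(file)
            else:
                transactions.append(current)
                current = {file}
            last_ts = ts
        transactions.append(current)
    return transactions


def count_couplings(transactions):
    """Count how often each pair of files was committed together."""
    counts = Counter()
    for files in transactions:
        for pair in combinations(sorted(files), 2):
            counts[pair] += 1
    return counts


# Hypothetical modification reports.
reports = [
    ("A.java", "bob", "fix bug", 100),
    ("B.java", "bob", "fix bug", 110),
    ("A.java", "bob", "fix bug", 5000),
    ("B.java", "bob", "fix bug", 5020),
    ("C.java", "amy", "refactor", 120),
]
print(count_couplings(group_transactions(reports)))
```

Pairs with a high count correspond to the frequently committed change couples mentioned above. Deciding whether a given coupling stems from a structural change would additionally require the structural differences between revisions, which this sketch does not model.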