Delta debugging framework
Contents
Project Name
Delta debugging framework
Project Description
Delta debugging is an automated approach to debugging that isolates failures systematically. Given a failing test that can be mechanically verified (including a browser crash), delta debugging is a way of automatically isolating the change that introduced the failure. Having a framework in place to pull builds from CVS, bisect by date and change set (using bonsai data -- remember, CVS doesn't have changesets!), and report results would let computers make developers more productive.
Project Leader(s)
Name(s) of primary people working on the project. If you want to join a project as leader, discuss with other leaders first. Include links to personal pages within wiki.
Project Contributor(s)
Name(s) of people casually working on the project, or who have contributed significant help. Include links to personal pages within wiki. NOTE: only Project Leader(s) should add names here. You can’t add your own name to the Contributor list.
Project Details
Provides more depth than the Project Description. This is the place for technical discussions, project specs, or other details. If this gets very long, you might consider breaking this part into multiple pages and linking to them.
[WORK IN PROGRESS: This post will be updated once I gather all of my thoughts. Stay tuned...]
Based on the papers on delta debugging that I've sifted through (with From Automated Testing to Automated Debugging by Andreas Zeller being the most understandable to non-math wizards, and Yesterday, my program worked. Today, it does not. Why? also by Andreas Zeller probably being the most relevant to our project), here is an outline of my understanding of delta debugging and a summary of the concepts that must be taken into account while working on the delta debugging framework project.
Delta debugging is an algorithm that can automatically and systematically isolate/narrow down the failure-inducing circumstances that are necessary to produce the bug. Delta debugging can be applied to isolating various types of failure-inducing circumstances, including:
- program input
- user interaction (key presses, button presses, mouse clicks, etc.)
- program code modification (adding / updating / deleting variables, functions, classes, etc.)
How? From my understanding, given a known bug and a known set of circumstances that can reproduce the bug, we can continually execute tests that vary the circumstances until a minimal subset of circumstances that can reproduce the bug is left. That is, for each test, a circumstance is removed, and if the bug is still present, than that circumstance can theoretically be eliminated as the cause of the bug from the set of circumstances. In this project's case, the circumstance will be certain change(s) to the program code that caused a regression in the program.
[Note: The rest of this post contains mathematical concepts that I may not fully understand but am trying to put into terms understandable to non-math wizards. A dangerous combination.]
With regards to delta debugging algorithm, we must take into account three possible scenarios that may complicate it:
- Interference. The cause of the regression may be from a combination of changes. That is, each individual change does not cause the program to regress, but applying the changes together causes the regression. Thus, there must be a method to identify the set of changes that cause the regression.
- Inconsistency. That is, changes in source code may be dependent on other changes in the source code and without them the program cannot be compiled and tested successfully. Thus, there must be a method to identify and handle dependencies (such as a dependency graph).
- Granularity. A single change may consist of hundreds of lines of code, yet only a couple lines of the change may be responsible for the regression. Thus, There must be a method to break the change into smaller manageable chunks.
Before we continue, here are some important terms to understand.
- Configuration. A subset of all changes made to source code since the last known good version and the version with the regression.
- Test. A test case or function that can determine whether a configuration contains the failure-inducing change (Fails), doesn't contain the failure-inducing change (Passes), or produces indeterminate results (possibly because of interference or inconsistency).
- Subset. Given two sets (A and B), A is a subset of B if and only if all of the elements of set A are in set B.
- Superset. Given two sets (A and B), B is a superset of A if and only if B contains all of the elements of A.
- Union. Given two sets (A and B), the union of A and B contain all elements of A and B.
- Intersection. Given two sets (A and B), the intersection of A and B are the elements that are in both A and B.
There are a couple of properties that can decrease the number of tests required to find the minimal failure-inducing set of changes. While implementing these properties into the algorithm may be more difficult than testing all possible combinations, it should result in a more efficient algorithm in certain cases. Something I would like to aim for.
- Monotony. If a change causes a failure, any configuration that includes this change fails as well (makes sense to me). If a configuration does not cause a failure, then all subsets of the configuration do not cause a failure, thus they can be eliminated from the change set (the proof by contradiction shown in Yesterday, my program worked. Today, it does not. Why? makes some sense to me. However, if my understanding is correct, while a subset of the configuration does not cause the failure by itself, the concept of interference suggests that the subset combined with another configuration may cause the regression).
- Unambiguity. A configuration is unambiguous if, given two subsets of a configuration, a test on the first subset reports to cause the regression, a test on the second subset reports to cause the regression and the intersection of the first and second sets does not pass (reports to cause the regression or produces indeterminate results).
- Consistency. All subsets of a configuration produces a determinate result (pass or fail).
In the ideal situation, a configuration will be monotone, unambiguous, and consistent. In which case, a binary algorithm may suffice. The set of changes may be recursively split into two subsets and tested. The result may be that the first or second subset has the failure inducing change or both tests pass as a result of interference. In the case of interference, we must recursively test both halves with all changes in one half of the configuration remaining applied.
[WORK IN PROGRESS: This post will be updated once I gather all of my thoughts. Stay tuned...]
Project References
NewsForge: An Introduction to Delta Debugging Delta debugging simplifies debugging process in a program by automating the process and continually splitting the program to smaller chunks called deltas. This technique is useful in three circumstances:
- Error occurs due to user inputs (e.g. keypress, file I/O). Delta debugging is used to eliminate user actions irrelevant to the nature of the error and pinpoint the cause of the error.
- Error occurs due to recent changes to the code. In this situation, deltas are retrieved from the net differences from both codes.
- Multithreading environment. Delta debugging can track down the exact order of operations originating from multiple threads that caused the error.
We need to focus on one of these circumstances. Judging from the project description, we should work on the first case, while perhaps opening the door to future expansion.
Project News
Sept 26, 2006
- I read through your documentation here, and it is looking good. I also spoke to Shaver by phone this morning, and we chatted briefly about this project. He suggests that you start your work by looking for a suitable Crash Case, one that happens reliably. Then you need at what would be necessary in order to bisect the change set (e.g., bonsai data) in order to get closer to the change that introduced the bug. Shaver suggested that robc (Rob Campbell) might be a good person to help you brainstorm on this.