Delta debugging framework

Project Leader(s)

Richard Chu
Dean Woodside
Aditya Nanda Kuswanto

Project Contributor(s)


Project Details


This outline is based on the papers on delta debugging that I have sifted through. Of these, "From Automated Testing to Automated Debugging" by Andreas Zeller is the most understandable to non-math wizards, and "Yesterday, my program worked. Today, it does not. Why?", also by Andreas Zeller, is probably the most relevant to our project. What follows is an outline of my understanding of delta debugging, a summary of the concepts that must be taken into account while working on the delta debugging framework project, and an outline of the conceptual stages of developing the framework.

Delta debugging is an algorithm that can automatically and systematically isolate/narrow down the failure-inducing circumstances that are necessary to produce the bug. Delta debugging can be applied to isolating various types of failure-inducing circumstances, including:

program input
user interaction (key presses, button presses, mouse clicks, etc.)
program code modification (adding / updating / deleting variables, functions, classes, etc.)

How? From my understanding, given a known bug and a known set of circumstances that can reproduce the bug, we can continually execute tests that vary the circumstances until a minimal subset of circumstances that can reproduce the bug is left. That is, for each test, a circumstance is removed, and if the bug is still present, then that circumstance can theoretically be eliminated from the set of circumstances as a cause of the bug. In this project's case, the circumstances will be the change(s) to the program code that caused a regression in the program.
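The "remove one circumstance per test" idea above can be sketched in Python as follows. This is only an illustration of the paragraph, not the project's code: the function name reduce_one_at_a_time, the still_fails callback, and the change labels are all hypothetical.

```python
from typing import Callable, List, Sequence


def reduce_one_at_a_time(circumstances: Sequence[str],
                         still_fails: Callable[[List[str]], bool]) -> List[str]:
    """Drop circumstances one by one for as long as the bug keeps reproducing."""
    remaining = list(circumstances)
    i = 0
    while i < len(remaining):
        candidate = remaining[:i] + remaining[i + 1:]
        if still_fails(candidate):
            remaining = candidate  # circumstance i was not needed for the bug
        else:
            i += 1                 # circumstance i is needed; keep it and move on
    return remaining


# Hypothetical example: the bug only appears when both c2 and c5 are present.
def still_fails(subset: List[str]) -> bool:
    return "c2" in subset and "c5" in subset


changes = ["c1", "c2", "c3", "c4", "c5", "c6"]
minimal = reduce_one_at_a_time(changes, still_fails)
print(minimal)  # ['c2', 'c5']
```

Each call to still_fails stands for a full "apply, build, run the test" cycle, which is why the number of calls matters so much in practice.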

[Note: The rest of this post contains mathematical concepts that I may not fully understand but am trying to put into terms understandable to non-math wizards. A dangerous combination.]

Before we continue, here are some important terms to understand.

Configuration. A subset of all the changes made to the source code between the last known good version and the version with the regression.
Test. A test case or function that can determine whether a configuration contains the failure-inducing change (Fails), doesn't contain the failure-inducing change (Passes), or produces indeterminate results (possibly because of interference or inconsistency).
Subset. Given two sets (A and B), A is a subset of B if and only if all of the elements of set A are in set B.
Superset. Given two sets (A and B), B is a superset of A if and only if B contains all of the elements of A.
Union. Given two sets (A and B), the union of A and B contains every element that is in A, in B, or in both.
Intersection. Given two sets (A and B), the intersection of A and B contains the elements that are in both A and B.
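For readers who prefer code to definitions, the four set terms above map directly onto Python's built-in set type. The change names are made up for the example.

```python
# Two hypothetical configurations (sets of changes).
a = {"change1", "change2"}
b = {"change1", "change2", "change3"}

is_subset = a <= b        # every element of a is in b
is_superset = b >= a      # b contains every element of a
union = a | b             # elements in a, in b, or in both
intersection = a & b      # elements in both a and b

print(is_subset, is_superset)
print(sorted(union))
print(sorted(intersection))
```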

With regard to the delta debugging algorithm, we must take into account the following concepts, which may complicate it:

Interference. Each individual change may not cause the program to regress, but applying a combination of changes together causes the regression. Thus, there must be a method to identify the set of changes that cause the regression.
In the case of interference, we must recursively test each half of the configuration while all changes in the other half remain applied.

Inconsistency. Changes in the source code may be dependent on other changes in the source code and without them the program cannot be compiled and tested successfully. Thus, there must be a method to identify and handle dependencies (such as a dependency graph).
[Wild, far left field thought: How can we ensure that when applying a configuration, it is consistent? Well, in the compiler process, there is a lexical analysis step which breaks the source code into tokens, then there is a dependence analysis step which produces constraints/dependencies between the tokens. Theoretically, if we can harness parts of the compilation process, we could have a method of knowing the dependencies between the changes in the source code.]

Granularity. A single logical change may consist of hundreds of lines of code, yet only a couple of lines of the change may be responsible for the regression. Thus, there must be a method to break the change into smaller, manageable chunks.

Monotony. If a change causes a failure, any configuration that includes this change fails as well (makes sense to me). If a configuration does not cause a failure, then all subsets of the configuration do not cause a failure, thus they can be eliminated from the change set (makes some sense to me. However, if my understanding is correct, while a subset of the configuration does not cause the failure by itself, the concept of interference suggests that the subset combined with another configuration may cause the regression).

Unambiguity. A failure is caused by only one configuration (No interference). Thus, for efficiency, we do not have to search the other configuration for more failure-inducing changes. Whether or not a configuration is ambiguous or unambiguous, a failure-inducing change will be produced. However, for completeness with regard to finding all failure-inducing changes, it is preferred to search both configurations.
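To make the interference case concrete, here is a minimal Python sketch of the simplified divide-and-conquer algorithm as I understand it from "Yesterday, my program worked. Today, it does not. Why?". It assumes monotony, unambiguity, and consistency, so indeterminate results are not handled at all. The names dd, _dd2, and the toy test function are my own, and changes are assumed to be sortable values such as integers.

```python
from typing import Callable, FrozenSet, Iterable

PASS, FAIL = "pass", "fail"


def dd(changes: Iterable[int], test: Callable[[FrozenSet[int]], str]) -> FrozenSet[int]:
    """Return a set of failure-inducing changes; test(all changes) must be FAIL."""
    all_changes = frozenset(changes)
    if not all_changes:
        raise ValueError("need at least one change")
    return _dd2(all_changes, frozenset(), test)


def _dd2(c: FrozenSet[int], r: FrozenSet[int],
         test: Callable[[FrozenSet[int]], str]) -> FrozenSet[int]:
    # c: changes still under suspicion; r: changes that stay applied throughout.
    if len(c) == 1:
        return c
    ordered = sorted(c)
    mid = len(ordered) // 2
    c1, c2 = frozenset(ordered[:mid]), frozenset(ordered[mid:])
    if test(c1 | r) == FAIL:
        return _dd2(c1, r, test)       # the failure is in the first half
    if test(c2 | r) == FAIL:
        return _dd2(c2, r, test)       # the failure is in the second half
    # Interference: search each half while the other half stays applied.
    return _dd2(c1, c2 | r, test) | _dd2(c2, c1 | r, test)


# Toy test: the regression appears only when changes 3 and 6 are both applied.
def toy_test(config: FrozenSet[int]) -> str:
    return FAIL if {3, 6} <= config else PASS


result = dd(range(1, 9), toy_test)
print(sorted(result))  # [3, 6]
```

In the toy run neither half fails on its own, so the last line of _dd2 is what finds the interfering pair.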

Now that we are aware of the different concepts that we must take into account with regard to delta debugging, the next section will outline some facts and assumptions that are being made, and attempt to define the vision and process of the delta debugging framework.

Project Facts and Assumptions

Project Facts:

The source tree for the Mozilla project is HUGE, with many different source file types (C++, JS, XUL, etc.) in many different directories.
Failure-inducing change(s) are unlikely to be localized to a single directory and file. They may be spread across many different directories and source files.
The source files could be of the same type (C++), mixed type (C++, JS), same directory, different directory. It shouldn't matter. The framework should be source type and location agnostic.
The failure-inducing change(s) may not be localized to a single developer. The failure-inducing change(s) may have been caused by another developer's change(s) to a source file they were working on. That is, a single developer's source scope may not be encapsulated but interconnected with and dependent on other developers' source code.
The developer's current version of the source code contains the regression.
The developer has a test case that can be used to indicate whether the test passes, fails, or is indeterminate.
The developer will NOT know the date/version of the last known good version.
Bonsai is a tool that can produce a list of differences between versions of a source file. (Bonsai's functionality has not been examined closely yet, but it will have to be, as it may be a key component of the framework.)

OUTDATED

Possible Vision of the Delta Debugging Framework:

(subject to change based on stakeholder consultation/feedback, feasibility study)

Since the last time a developer executed a test case that passed, the developer modified some source files. The source files may be of the same type or mixed type, same directory or different directory. It shouldn't matter. The framework should be source type and location agnostic. Upon executing the test case again, the result is now a failure. The developer panics. It's only days before the deadline to submit bug patches before the source tree is supposed to be closed for release, and the bug is a blocker. The developer doesn't want to be shamed for delaying the release, and the source code is too complex to find the bug in time, so what should they do? Use the delta debugging framework, that's what! How, you may ask? Well, keep reading to find out. (* Scenario may vary.)
The delta debugging framework may require the developer to input one piece of information: the test case/function that used to pass but now fails. It will be used to determine whether the source files with progressive changes pass or fail the test.
Once the developer has provided this piece of information, the framework will use Bonsai to query the source tree and compile a list of all the changes made to the source files within a certain period of time.
(If there was a method of determining change dependencies so as to eliminate the possibility of inconsistencies, it would be done in this step. One possible way of reducing the possibility of inconsistencies is to logically group changes by location or check in time.)
This step would be where the delta debugging algorithm would come into play. The algorithm should basically:
1. Recursively, incrementally remove changes from the source code with the regression.
2. Recompile the source tree.
3. Execute the test case. There may be 3 outcomes:
  1. The test case passes. We know that the failure-inducing change(s) are in the change(s) that were removed.
  2. The test case fails. We know that the failure-inducing change(s) are not exclusively in the change(s) that were removed. I say not exclusively because of the concept of Interference (described above).
  3. The test case is indeterminate. There were some inconsistencies.
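Steps 2 and 3 above can be sketched as a single "trial" function that maps one build-and-test cycle onto the three outcomes. This is a hedged illustration only: Outcome, run_trial, and the build and test callbacks are hypothetical names, and treating a failed build as indeterminate is my reading of the Inconsistency concept rather than a settled design decision.

```python
from enum import Enum
from typing import Callable


class Outcome(Enum):
    PASS = "pass"                    # failure-inducing change(s) are among those removed
    FAIL = "fail"                    # failure-inducing change(s) are not exclusively among those removed
    INDETERMINATE = "indeterminate"  # inconsistent configuration


def run_trial(build: Callable[[], bool], test: Callable[[], bool]) -> Outcome:
    """Rebuild the tree, then run the test case, and classify the result."""
    if not build():
        # The tree would not build, so the test cannot say anything useful.
        return Outcome.INDETERMINATE
    return Outcome.PASS if test() else Outcome.FAIL


# Stand-in callbacks; in the framework these would invoke make and the test case.
print(run_trial(build=lambda: True, test=lambda: True))
print(run_trial(build=lambda: True, test=lambda: False))
print(run_trial(build=lambda: False, test=lambda: True))
```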

Project Flowchart

The flowchart represents the simplistic version of the delta debugging algorithm. It will theoretically find a failure-inducing change set, but not necessarily the minimal set or the full set of failure-inducing change(s). The algorithm is depicted as linearly recursive; however, it could be binary recursive. In the linear version, the theoretical maximum number of iterations (worst case scenario) is:

$$\sum_{r=1}^{n} \binom{n}{r} = \sum_{r=1}^{n} \frac{n!}{r!\,(n-r)!} = 2^{n} - 1$$

where n represents the total number of changes and r is the size of a subset of those n changes.

In other words, the summation of the combinations of changes without repetitions that can be made given that the size of the change set can vary from 1 to n.
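As a quick sanity check of that worst case, the summation can be evaluated directly. The function name worst_case_iterations is made up for this sketch; math.comb requires Python 3.8 or later.

```python
from math import comb


def worst_case_iterations(n: int) -> int:
    """Sum the number of change subsets of every size r from 1 to n."""
    return sum(comb(n, r) for r in range(1, n + 1))


for n in (1, 3, 10, 20):
    # The sum equals 2**n - 1: every non-empty subset of the n changes.
    print(n, worst_case_iterations(n), 2 ** n - 1)
```

With only 20 changes the worst case is already over a million build-and-test cycles, which is why the binary recursive variant is attractive.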

Updated Delta Debugging Flowchart.

Here are some thoughts regarding the flowchart:

The whole process revolves around a certain Test, which must be passed to complete the process. It is assumed that the source code passed this test before, but not anymore due to recent changes to the tree. The framework will locate these changes.
The Test is a versatile module and can be adapted to accept any tests the user may use.
When the initial test fails, the framework first attempts to locate which changeset causes this failure. This is done by "going back through time", retrieving the trees from previous revisions, and running each tree through the same test. The idea is to locate the latest revision where the test passes successfully.
Once this revision is identified, the framework will extract the diff, the difference between the two revisions.
The framework will then use this diff to break down the difference possibilities (e.g. directory, file, etc) and isolate the cause of the failure.
Once this is done, the framework will deliver the cause of the failure in a report for the user and the operation is finished.
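The "going back through time" step in the list above could look roughly like the following. It is a sketch under assumptions: revisions are given oldest to newest, the newest one is the failing tree, and passes is a hypothetical callback that checks out a revision, builds it, and runs the Test.

```python
from typing import Callable, Optional, Sequence


def find_last_passing_revision(revisions: Sequence[int],
                               passes: Callable[[int], bool]) -> Optional[int]:
    """Walk backwards from the failing newest revision to the latest one that passes."""
    # Skip the last entry: it is the current tree, which is known to fail.
    for index in range(len(revisions) - 2, -1, -1):
        if passes(revisions[index]):
            return revisions[index]
    return None  # no passing revision in the given range


# Hypothetical history: the regression was introduced in revision 103.
history = [100, 101, 102, 103, 104, 105]
last_good = find_last_passing_revision(history, lambda rev: rev <= 102)
print(last_good)  # 102
```

The diff that the framework then breaks down is the one between the returned revision and the failing tree.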

Project Source Repository

Assuming you have SVN, the project's source can be obtained via SVN using the following command:

svn checkout svn://cdot.senecac.on.ca/deltadbg

Project Task List

Priority Legend
High Priority	Medium Priority	Low Priority

Status Legend
Task completed	Task started but not complete	Task not started

Task	Description	Priority	Assigned to	Status

Change set / Change

Retrieval of Change / Change set

The Granularity concept. A single revision may consist of hundreds or thousands of lines of code that were changed, yet only a couple of lines of the change may be responsible for the regression. Thus, there must be a method to break the change into smaller, manageable chunks. The different types of chunks into which we may break up a changeset are: Revision, Directories, Files, Code Blocks, and Lines.

High

Currently can retrieve change sets of type Revision, Directory, and File. NOT going to complete retrieval of Code Block, Line of Code change set.

Requires more thorough test suite (ChangesetTest.pl needs more test cases)

Application of Change / Change set

OK. Change sets can be retrieved. Now what? You must be able to apply a change or change set or subset of a change set to the source tree. Your mission is to figure out how to do that.

High

Can apply a changeset (specified by array of indices passed in) for a Revision, Directory, and File Changeset. NOT going to complete application of Code Block or Line changeset.

Requires test cases (ChangesetTest.pl is outdated.)

Unapplication of Change / Change set

Changesets obviously must be able to be applied. But changesets must also be able to be unapplied. Your mission is to figure out how to do that.

High

Can unapply a changeset (specified by array of indices passed in) for a Revision, Directory, and File Changeset. NOT going to complete unapplication of Code Block or Line changeset.

UNTESTED.

GNU Make (http://www.gnu.org/software/make/)

Wrapper around the GNU make utility

Mozilla uses the GNU make utility to build their source tree. Your mission is to make a wrapper around the GNU make utility so that the make command can be programmatically called to build the source tree.

High

Wrapper created: makewrapper.pl. Can execute the make command with options specified by the user.

Requires more thorough test case (maketest.pl needs more test cases).

Subversion (SVN) Repository (http://subversion.tigris.org/, http://svnbook.red-bean.com/nightly/en/index.html)

Wrapper around the necessary SVN commands

For the automated debugging to work, the framework may need to automatically modify the working copy by reverting to a different revision or updating certain directories and files. It may also need to know the differences between revisions and changesets.

High

Wrapper created: svn.pl. Currently has subroutines for commit, update, diff, and checkout commands. May need to wrap other SVN commands.

Requires more thorough test case (svntest.pl needs more test cases).

Query SVN repository for differences between two revisions

Your mission is to find out the relevant commands that can return the differences between two revisions, the meta-data that is kept with each revision, how differences between two revisions are stored and formatted, and how this data can be parsed into a usable form for our project (wrapper?).

High

Done.

CVS/Mozilla Bonsai (http://www.mozilla.org/bonsai.html, CVS Book)
In my mind, Bonsai may be too bloated for our needs.

Wrapper around the necessary CVS commands

For the automated debugging to work, the framework may need to automatically modify the working copy by reverting to a different revision or updating certain directories and files. It may also need to know the differences between revisions and changesets.

Medium

Just starting out.

Query CVS repository for differences between two revisions

Your mission is to find out the relevant commands that can return the differences between two revisions, the meta-data that is kept with each revision, how differences between two revisions are stored and formatted, and how this data can be parsed into a usable form for our project (wrapper?).

Medium

Just starting out.

Test Case(s) (Tinderbox)

Creation / Extraction of Test Case(s)

We need test cases that can return whether or not the test passes or fails. Tinderbox has a couple of tests that are executed after the source is built. Extract those tests from the Tinderbox source code so that we can use them in this project. We also need a test case that can pass/fail consistently so that we can test the delta debugger.

High

Aditya Nanda Kuswanto

Work in progress. Found the tests! Now need to figure out how to run them and how they work.

Test Framework

We ideally need a way to allow users to specify the test(s) to be run easily without them having to modify the delta debugging module.

High

Work in progress.

Obtaining Test Repositories

Obtaining a test SVN Repository

We have an SVN repository that holds our delta debugging framework source files. We need another SVN repository that we could use to test our framework.

High

Done. The URL to the test SVN repository is: svn://cdot.senecac.on.ca/deltatest

Obtaining a test CVS Repository

When the CVS version of the framework is completed, it will be useful to have a test CVS repository that we could use to test our framework.

Medium

Work in progress. The CVS repository has been created. The web interface to the repository is: here. Apparently we just need to get some forwarding issues resolved.

Implementation of Delta Debugging Algorithm (Yesterday, my program worked. Today, it does not. Why?)

The Algorithm

The delta debugging algorithm. It drives the framework to retrieve change sets, apply changes, build the source tree, and run test case(s) to find the minimal set of failure-inducing changes. It is the point where all other parts of the framework meet and are made to work together. Ideally, it should be abstract enough for easy extensibility with little impact.

High

Work in progress.

Points of Confusion

Bonsai issue -- unresolved

When I get confused, I draw diagrams.

The Clear: Seemingly Straightforward

The RCS tree is straightforward. It will encapsulate the data and operations related to the revision control system. SVN wraps the operations of the SVN revision control system, CVS will wrap the operations of the CVS revision control system, etc.

The Build tree is straightforward. It wraps the build tool used to build the source tree.
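As an illustration of what "wrapping the build tool" can mean, here is a small Python sketch around GNU make. It is not the project's makewrapper.pl; make_command and run_make are hypothetical names. The -C (change directory) and -j (parallel jobs) options are standard GNU make options.

```python
import subprocess
from typing import List, Optional, Sequence


def make_command(targets: Sequence[str] = (),
                 jobs: Optional[int] = None,
                 directory: Optional[str] = None) -> List[str]:
    """Build the argument list for a make invocation without running it."""
    command = ["make"]
    if directory is not None:
        command += ["-C", directory]  # run make inside this directory
    if jobs is not None:
        command += ["-j", str(jobs)]  # number of parallel jobs
    command += list(targets)
    return command


def run_make(targets: Sequence[str] = (),
             jobs: Optional[int] = None,
             directory: Optional[str] = None) -> bool:
    """Run make and report whether the build succeeded (exit status 0)."""
    result = subprocess.run(make_command(targets, jobs, directory),
                            capture_output=True, text=True)
    return result.returncode == 0


print(make_command(targets=["all"], jobs=4, directory="objdir"))
```

Keeping command construction separate from execution makes the wrapper easy to test without actually building the tree.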

The Blurry: Current Points of Confusion

RCS's can remember the changes (deltas) that occurred in previous versions of a file, the history of changes that occur between revisions, etc.

A Changeset and its subclasses will encapsulate the idea of a set of changes. A set of changes could be broken down into various categories such as a specific revision, a list of directories, a list of files, a list of blocks of code, and finally a line of code.

A Change and its subclasses encapsulate the idea of a single change. A change can be a change made within a directory, change made within a file, change made to block of code, or a change to a line.

A ChangesetFactory is supposed to return a change set based on the type of change set requested. To get the requested change set, one needs to know the type of revision control system (SVN, CVS, other, etc.) and/or the data required to connect to it. So there obviously needs to be a link between RCS and ChangesetFactory/Changeset. The question is how? What is the proper/best way to link them together? One way is to pass in an RCS object to the ChangesetFactory, which would then pass that object to the appropriate Changeset subclass. I don't like that solution, but it's the simplest.

Also, the method to get a change set for SVN may be different from CVS. So there may be a Changeset hierarchy for SVN and another one for CVS. I don't like the idea of that at all. There must be another way.
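One possible arrangement, sketched in Python purely to make the question concrete: the factory receives an RCS object and asks it for raw data through a small common interface, so the Changeset side does not need a separate hierarchy per revision control system. Every name here (RCS.changed_files, FakeRCS, the "file" and "directory" kinds) is hypothetical, and this is not a claim about how the framework actually resolves the issue.

```python
import posixpath
from abc import ABC, abstractmethod
from typing import List


class RCS(ABC):
    """Common interface that an SVN or CVS wrapper would implement."""

    @abstractmethod
    def changed_files(self, old_rev: str, new_rev: str) -> List[str]:
        ...


class FakeRCS(RCS):
    """Stand-in for a real wrapper so the sketch runs without a repository."""

    def __init__(self, files: List[str]) -> None:
        self._files = files

    def changed_files(self, old_rev: str, new_rev: str) -> List[str]:
        return list(self._files)


class Changeset:
    def __init__(self, kind: str, changes: List[str]) -> None:
        self.kind = kind
        self.changes = changes


class ChangesetFactory:
    def __init__(self, rcs: RCS) -> None:
        self._rcs = rcs  # the link between RCS and Changeset lives here

    def create(self, kind: str, old_rev: str, new_rev: str) -> Changeset:
        files = self._rcs.changed_files(old_rev, new_rev)
        if kind == "file":
            return Changeset(kind, sorted(files))
        if kind == "directory":
            return Changeset(kind, sorted({posixpath.dirname(f) for f in files}))
        raise ValueError(f"unsupported change set type: {kind}")


factory = ChangesetFactory(FakeRCS(["a/x.cpp", "a/y.js", "b/z.xul"]))
print(factory.create("directory", "10", "20").changes)  # ['a', 'b']
```

The trade-off is that anything RCS-specific has to fit behind the shared interface, which may not hold once code block or line granularity is needed.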

The Blind: Future Points of Confusion

Applying a change in a changeset. Should the Changeset subclasses be able to do that? Are they the information expert? They know about the changes. Should they know how to apply them? How would we go about applying a subset of changes in a changeset? For example, there may have been changes in 10 different directories. How would we apply the changes from, say, 4 of the 10 directories and not the others?
Connecting all 3 hierarchies together. Need to be able to connect to SVN, need to be able to get and apply changes, need to be able to build the source tree.
The actual delta debugging algorithm.
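For the "4 of the 10 directories" question in the first point, one simple option is to address changes by index, as in this hypothetical Python sketch. The function apply_subset and its apply_change callback are made-up names; a real callback would update or patch the working copy for that one change.

```python
from typing import Callable, List, Sequence


def apply_subset(changes: Sequence[str],
                 indices: Sequence[int],
                 apply_change: Callable[[str], None]) -> List[str]:
    """Apply only the changes at the given indices and return what was applied."""
    applied = []
    for index in indices:
        change = changes[index]
        apply_change(change)  # e.g. update this directory to the newer revision
        applied.append(change)
    return applied


# Ten hypothetical changed directories; apply the changes from four of them.
directories = [f"dir{i}" for i in range(10)]
log: List[str] = []
applied = apply_subset(directories, [0, 3, 4, 8], log.append)
print(applied)  # ['dir0', 'dir3', 'dir4', 'dir8']
```

An index list is also a convenient representation of a configuration for the delta debugging algorithm itself, since subsets, unions, and complements of index sets are cheap to compute.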

But that's all for the future.

Project News