Difference between revisions of "Delta debugging framework"

From CDOT Wiki
Jump to: navigation, search
(Updated DD Flowchart)
m (Reverted edits by Saoagent (Talk) to last revision by Reed)
 
(65 intermediate revisions by 6 users not shown)
Line 7: Line 7:
 
([[#top|↑ top]])
 
([[#top|↑ top]])
  
[http://en.wikipedia.org/wiki/Delta_Debugging Delta debugging] is an automated approach to debugging that isolates failures systematically.  Given a failing test that can be mechanically verified (including a browser crash), [http://en.wikipedia.org/wiki/Delta_Debugging delta debugging] is a way of automatically isolating the change that introduced the failure.  Having a framework in place to pull builds from CVS, bisect by date and change set (using [http://en.wikipedia.org/wiki/Bonsai_CVS_code_management_system bonsai ] data -- remember, CVS doesn't have changesets!), and report results would let computers make developers more productive.
+
[http://en.wikipedia.org/wiki/Delta_Debugging Delta debugging] is an automated approach to debugging that isolates failures systematically.  Given a failing test that can be mechanically verified (including a browser crash), [http://en.wikipedia.org/wiki/Delta_Debugging delta debugging] is a way of automatically isolating the change that introduced the failure.   
 +
 
 +
For developers, the occurrence of a scenario similar to this happens all too often: a developer codes a piece of functionality that works. Then, over a period of time, multiple changes to the program source files are made and that piece of functionality stops working. The cause of the regression could be any of the changes that were made since the time the functionality was last known to work.
 +
 
 +
To isolate the cause of the regression, the developer begins the debugging process. Generally, debugging is a manual process where the developer must walk through the code while trying to keep track of variables and function calls. Sure, there are debuggers that can help you keep track of variables, the call stack, watch certain blocks of code, and execute the code step by step, however debugging is still mainly a manual process.
 +
 
 +
Written in perl, given
 +
# that the source code is located in an SVN repository (support for CVS in the future)
 +
# a test case that can automatically verify whether or not a piece of functionality of a program works or not
 +
# a way to automatically build the program from the source code (if needed)
 +
the delta debugging framework aims to automatically isolate the failure-inducing changes to the source code that caused a regression.
 +
 
 +
== Project License ==
 +
 
 +
Written in perl, given a test case that can automatically verify whether or
 +
<br />not a piece of functionality of a program works or not, the delta debugging
 +
<br />framework aims to automatically isolate the failure-inducing changes to the
 +
<br />source code that caused a regression.
 +
<br />
 +
<br />
 +
Copyright (C) 2006 Richard Chu, Aditya Nanda Kuswanto, Dean William Woodside
 +
<br />
 +
<br />
 +
This program is free software; you can redistribute it and/or modify it under
 +
<br />the terms of the GNU General Public License as published by the Free Software
 +
<br />Foundation; either version 2 of the License, or (at your option) any later
 +
<br />version.
 +
<br />
 +
<br />
 +
This program is distributed in the hope that it will be useful, but WITHOUT
 +
<br />ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
 +
<br />FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
 +
<br />
 +
<br />
 +
You should have received a copy of the GNU General Public License along with
 +
<br />this program; if not, write to the Free Software Foundation, Inc., 59 Temple
 +
<br />Place, Suite 330, Boston, MA 02111-1307 USA
 +
<br />
 +
<br />
 +
[http://www.opensource.org/licenses/gpl-license.php The GNU General Public License (GPL) Version 2, June 1991]
 +
<br />
 +
<br />Contact Information about the authors of the delta debugging framework can be found
 +
<br />on our individual profile pages.
  
 
== Project Leader(s) ==
 
== Project Leader(s) ==
Line 24: Line 66:
  
 
''Name(s) of people casually working on the project, or who have contributed significant help. Include links to personal pages within wiki. <br />NOTE: only Project Leader(s) should add names here.  You '''can’t''' add your own name to the Contributor list.''
 
''Name(s) of people casually working on the project, or who have contributed significant help. Include links to personal pages within wiki. <br />NOTE: only Project Leader(s) should add names here.  You '''can’t''' add your own name to the Contributor list.''
 +
 +
[[User:Reed|Reed Loden]] - Setup the CVS repository for us with the web front-end.  Provided direction in querying Bonsai (a means to extract the output in XML).<br>
 +
[[User:Elichak|Liz Chak]] - Documentation of the subroutines in the svn.pl and makewrapper.pl source files.
 +
 +
== Project Source Repository ==
 +
 +
([[#top|&uarr; top]])
 +
 +
Assuming you have [http://subversion.tigris.org/ SVN], the project's source can be obtained via SVN using the following command:
 +
 +
svn checkout svn://cdot.senecac.on.ca/deltadbg
 +
 +
The source can also be obtained at the following links:
 +
* [http://matrix.senecac.on.ca/~rchu2/ddf/ddf.zip Delta Debugging Framework.zip]
 +
* [http://matrix.senecac.on.ca/~rchu2/ddf/ddf.tar.bz2 Delta Debugging Framework.tar.bz2]
 +
 +
The test cases can be obtained via SVN using the following command
 +
 +
svn checkout svn://cdot.senecac.on.ca/deltatest
 +
 +
The source can also be obtained at the following links:
 +
* [http://matrix.senecac.on.ca/~rchu2/ddf/deltatest.zip DeltaTest.zip]
 +
* [http://matrix.senecac.on.ca/~rchu2/ddf/deltatest.tar.bz2 DeltaTest.tar.bz2]
  
 
== Project Details ==
 
== Project Details ==
Line 71: Line 136:
 
Now that we are aware of the different concepts that we must take into account with regards to delta debugging, the next section will outline some facts and assumptions that are being made, and attempt to define the vision and process of the delta debugging framework.
 
Now that we are aware of the different concepts that we must take into account with regards to delta debugging, the next section will outline some facts and assumptions that are being made, and attempt to define the vision and process of the delta debugging framework.
  
== Project Facts and Assumptions ==
+
 
 +
== Project Principles ==
  
 
([[#top|&uarr; top]])
 
([[#top|&uarr; top]])
 
'''Project Facts:'''
 
  
 
# The source tree for the Mozilla project is HUGE. With many different source file types (C++, JS, XUL, etc.) in many different directories.
 
# The source tree for the Mozilla project is HUGE. With many different source file types (C++, JS, XUL, etc.) in many different directories.
Line 84: Line 148:
 
# The developer has a test case that can be used indicate whether the test passes/fails/is indeterminate.
 
# The developer has a test case that can be used indicate whether the test passes/fails/is indeterminate.
 
# The developer will NOT know the date/version of the last known good version.
 
# The developer will NOT know the date/version of the last known good version.
# Bonsai is a tool that can produce a list of differences between versions of a source file. (Bonsai's functionality has not been examined closely yet but will have to as it may be a key component to the framework)
 
 
 
'''Possible Vision of the Delta Debugging Framework''':
 
 
(subject to change based on stakeholder consultation/feedback, feasibility study)
 
 
# Since the last time a developer executed a test case that passed, the developer modified some source files. The source files may be of the same type or mixed type, same directory or different directory. It shouldn't matter. The framework should be source type and location agnostic. Upon executing the test case again, the result is now a failure. The developer panics. It's only days before the deadline to submit bug patches before the source tree is supposed to be closed for release and the bug is a blocker. The developer doesn't want to be shamed for delaying the release, and the source code is too complex to find the bug in time, so what should they do? Use the delta debugging framework! that's what. How? you may ask. Well keep reading to find out. <small>* scenario may vary.</small>
 
# The delta debugging framework may require the developer to input one piece of information. The test case/function that used to pass but now fails. It will be used to determine whether the source files with progressive changes passes/fails the test.
 
# Once the developer has inputted this piece of information, it will use Bonsai to query the source tree and compile a list of all the changes to the source files since a certain amount of time.
 
# (If there was a method of determining change dependencies so as to eliminate the possibility of inconsistencies, it would be done in this step. One possible way of reducing the possibility of inconsistencies is to logically group changes by location or check in time.)
 
# This step would be where the delta debugging algorithm would come into play. The algorithm should basically:
 
## Recursively, incrementally remove changes from the source code with the regression.
 
## Recompile the source tree.
 
## Execute the test case. There may be 3 outcomes:
 
### The test case passes. We know that the failure-inducing change(s) are in the change(s) that were removed.
 
### The test case fails. We know that the failure-inducing change(s) are not exclusively in the change(s) that were removed. I say not exclusively because of the concept of Interference (described above).
 
### The test case is indeterminate. There were some inconsistencies.
 
 
  
 
== Project Flowchart ==
 
== Project Flowchart ==
Line 122: Line 167:
 
[[Image:Dd_flowchart1.PNG]]
 
[[Image:Dd_flowchart1.PNG]]
  
== Project Source Repository ==
+
Here are some thoughts regarding the flowchart:
 +
# The whole process revolves around a certain '''Test''', which must be passed to complete the process. It is assumed that the source code passed this test before, but not anymore due to recent changes to the tree. The framework will locate these changes.
 +
# The '''Test''' is a versatile module and can be adapted to accept any tests the user may use.
 +
# When the initial test fails, the framework first attempts to locate which changeset causes this failure. This is done by "going back through time", retrieving the trees from previous revisions, and running each tree through the same test. The idea is to locate the latest revision where the test passes successfully.
 +
# Once this revision is identified, the framework will extract the '''diff''', the difference between the two revisions.
 +
# The framework will then use this '''diff''' to break down the difference possibilities (e.g. directory, file, etc) and isolate the cause of the failure.
 +
# Once this is done, the framework will deliver the cause of the failure in a report for the user and the operation is finished.
 +
 
 +
 
 +
== Project Test Cases ==
 +
 
 +
The test cases used in this project are located in the [[Delta_debugging_testcases|Delta Debugging Testcases page]].
 +
 
 +
== Project Roadmap ==
 +
 
 +
([[#top|&uarr; top]])
 +
 
 +
[[Delta Debugging Framework Roadmap|Delta Debugging Framework Roadmap]]
 +
 
 +
The page outlines our current vision of the delta debugging framework and a roadmap of the work that needs to be completed to accomplish our vision. This roadmap may be subject to change and/or expanded as our vision expands through feedback and requests from others, our own ideas, etc.
 +
 
 +
== Partial Class Diagram ==
  
 
([[#top|&uarr; top]])
 
([[#top|&uarr; top]])
  
Assuming you have [http://subversion.tigris.org/ SVN], the project's source can be obtained via SVN using the following command:
+
Most of the classes in blue exist in the source repository. The classes in pale yellow are classes that won't be completed in the first release.
 +
 
  
svn checkout svn://cdot.senecac.on.ca/deltadbg
+
[[Image:Dd_partialclassdiagram2.PNG]]
  
== Project Task List ==
+
 
 +
== Project News ==
  
 
([[#top|&uarr; top]])
 
([[#top|&uarr; top]])
  
<table style="width: 100%;" class="standard-table" border="1" cellpadding="1" cellspacing="1">
+
''This is where your regular updates will go.  In these you should discuss the status or your work, your interactions with other members of the community (e.g., Seneca and Mozilla), problems you have encountered, etc. Put detailed technical information into the Project Details page (i.e., update it as you go), and save this section for news about participation in the project.''
    <tr>
+
 
        <th>Task</th>
+
=== Dec. 22, 2006 ===
        <th>Description</th>
+
 
        <th>Assigned to</th>
+
I haven't posted an update in a while. So what's been done?
        <th>Status</th>
+
 
    </tr>
+
I finally had some time to do a second round of testing & debugging of the delta debugging framework. And guess what? It ''seems'' to work now. The problem? Combination of logical errors when applying and unapplying changes in the framework and a bad test case. Go figure.
 +
 
 +
However, before I get ahead of myself and officially tag and release the delta debugging framework as version 0.1, I would like to test it out on another test program. Hopefully, this can be done this weekend.  And if all goes well, version 0.1 will be officially released before the end of the year.
  
  
    <tr>
 
        <td colspan="4"><strong>Change set / Change</strong>
 
        </td>
 
    </tr>
 
  
    <tr>
+
=== Dec. 13, 2006 ===
        <td>Retrieval of Change / Change set</td>
 
        <td>The Granularity concept. A single revision may consist of hundreds or thousands of lines of code that were changed, yet only a couple lines of the change may be responsible for the regression. Thus, There must be a method to break the change into smaller manageable chunks. The different types of chunks we may breaking up a changeset are: Revision, Directories, Files, Code Blocks, and Lines.</td>
 
        <td>[[User:RichardChu|Richard Chu]]</td>
 
        <td>Work in progress. Currently can retrieve change sets of type Revision, Directory, and File. Need to complete retrieval of Code Block, Line of Code change set? <br />Requires through test suite (ChangesetTest.pl needs more test cases)</td>
 
    </tr>
 
  
    <tr>
+
Created [[Delta_debugging_testcases|Delta Debugging Testcases]] page to discuss the nature of the test cases created to test the algorithm. Included in the page are 2 testcases created so far, the '''HelloWorld''' binary test and the '''Sudoku''' test. Both tests can be found in the '''deltatest''' svn repository. The repository can be checked out using this command:
        <td>Application of Change / Change set</td>
+
<pre> svn checkout svn://cdot.senecac.on.ca/deltatest </pre>
        <td>OK. Change sets can be retrieved. Now what? You must be able to apply a change or change set or subset of a change set to the source tree. Your mission is to figure out how to do that.</td>
 
        <td>[[User:RichardChu|Richard Chu]]</td>
 
        <td>Work in progress. Can apply a change (specified by index passed in) from a Revision, Directory, and File Changeset. Do we want to be able to pass in an array of indices and apply the changes associated with those indices? Requires some thought. <br />Requires through test suite (ChangesetTest.pl needs more test cases)</td>
 
    </tr>
 
  
 +
Exactly 12 days before Christmas, the delta debugging framework has been released under the [http://www.opensource.org/licenses/gpl-license.php GPL Version 2] License.
  
    <tr>
+
Unfortunately, we haven't had the time to test the delta debugger much since Dec. 09, 2006 because of exams, other school work; Planning to spend some time this weekend to test the delta debugger and figure out why it currently seems to not be able find the minimal set of failure inducing directories/files (whether its because of unreliable test case or logical error in the program).
        <td colspan="4"><strong>GNU Make</strong> ([http://www.gnu.org/software/make/ http://www.gnu.org/software/make/])
 
        </td>
 
    </tr>
 
  
    <tr>
+
A roadmap of our vision of the direction of the project will be heading in the future will be created and posted soon.
        <td>Wrapper around the GNU make utility</td>
 
        <td>Mozilla uses the GNU make utility to build their source tree. your mission is to make a wrapper around the GNU make utility so that the make command can be programmatically called to build the source tree.</td>
 
        <td>[[User:RichardChu|Richard Chu]]</td>
 
        <td>Work in progress. Initial wrapper created: ''makewrapper.pl''. Requires thorough test case (''maketest.pl'' needs more test cases).</td>
 
    </tr>
 
  
  
 +
=== Dec. 11, 2006 ===
  
    <tr>
+
Uploaded testcase for '''HelloWorld''' binary at '''deltatest svn'''. The test simulates the error that may occur when compilation fails due to syntax error. The exalted HelloWorld program is located on the HelloWorld directory, while the test definition is at HelloTestCase1.pm. The algorithm detects failed test and reverts the affected file to the version where the test passes.
        <td colspan="4"><strong>Subversion (SVN) Repository</strong> ([http://subversion.tigris.org/ http://subversion.tigris.org/], [http://svnbook.red-bean.com/nightly/en/index.html http://svnbook.red-bean.com/nightly/en/index.html])
 
        </td>
 
    </tr>
 
  
    <tr>
+
Note for the future: improve user feedback functions!
        <td>Wrapper around the necessary SVN commands</td>
 
        <td>For the automated debugging to work, we may need to automatically modify the working copy by reverting to a different revision or updating certain directories and files. It may also need to know the differences between revisions and changesets.</td>
 
        <td>[[User:RichardChu|Richard Chu]]</td>
 
        <td>Work in progress. Initial wrapper created: ''svn.pl''. Currently has subroutines for ''commit'', ''update'', ''diff'', and ''checkout'' commands. May need to wrap other SVN commands. Requires thorough test case (''svntest.pl'' needs more test cases).</td>
 
    </tr>
 
  
    <tr>
 
        <td>Query SVN repository for differences between two revisions</td>
 
        <td>Your mission is to find out the relevant commands that can return the differences between two revisions, the meta-data that is kept with each revision, how differences between two revisions are stored and formatted, and how this data can be parsed into a usable form for our project (wrapper?).</td>
 
        <td>[[User:RichardChu|Richard Chu]]</td>
 
        <td>Work in progress.</td>
 
    </tr>
 
  
 +
=== Dec. 10, 2006 ===
  
 +
Where is the CVS/Bonsai work heading?  Here is a breakdown of the past 3-4 weeks:
 +
* Initially was going for a straight wrapper around CVS ala the style Richard used for SVN.
 +
* Tried to find some functionality within Bonsai that could make it easier.
 +
* Talked to Reed Loden, he set up a repository for us to try with.  Thanks Reed!
 +
* Thought that there may be some additional (read: unpublished) tools that could be worked with.  Got in contact with some of the "Project Participants" listed on [http://www.mozilla.org/projects/bonsai/].  Was told the person in particular wasn't a contributor (just submitted a bug report).  They in turn pointed me to [irc://irc.mozilla.org/#mozwebtools #mozwebtools].
 +
* Lurked on [irc://irc.mozilla.org/#mozwebtools #mozwebtools] for a few weeks.  Talked to 'justdave' about Bonsai.  Reed Loden chimed up and informed me that Bonsai can output to XML using ?xml=1 on the query (score!  thanks again).
 +
* Researched some PERL parsing utilities.  Trying out XML::LibXML for DOM-style parsing.
 +
* Hopefully wrap something up by Wednesday.  Failing that, might just go with simple CVS wrapper of some sort.
  
    <tr>
 
        <td colspan="4"><strong>CVS/Mozilla Bonsai</strong> ([http://www.mozilla.org/bonsai.html http://www.mozilla.org/bonsai.html], [http://cvsbook.red-bean.com/OSDevWithCVS_3E.pdf CVS Book])<br />You can do these tasks by trying to interpret the Bonsai source code yourself, or preferably by finding a person who has intimate knowledge of the Bonsai source code and asking them.
 
        </td>
 
    </tr>
 
  
    <tr>
 
        <td>Query CVS via Bonsai for checkins</td>
 
        <td>You can use Bonsai to search for the checkins made within a certain time frame, within a certain directory, made by a certain developer, etc. Your mission is to find the relevant source files, functions, variables, etc. that drive this functionality. </td>
 
        <td>TBD.</td>
 
        <td>Not started.</td>
 
    </tr>
 
  
    <tr>
+
=== Dec. 09, 2006 ===
        <td>Results of querying CVS via Bonsai for checkins</td>
 
        <td>Bonsai obviously returns results from the query. The question is how? Your mission is to find the relevant source files, functions, variables, etc. that is used to return and store results. What you need to find out is what type of data is returned and how are the results formatted?</td>
 
        <td>TBD.</td>
 
        <td>Not started.</td>
 
    </tr>
 
  
    <tr bgcolor="#F0F0F0">
+
What has been done since last week?
        <td>Query CVS via Bonsai for version history</td>
+
* Got a test program and uploaded it to svn://cdot.senecac.on.ca/deltatest. The pristine working version is revision 4. The latest committed copy is revision 8. The regressive code was committed somewhere in between.  
        <td>For each file in the CVS repository, Bonsai has the ability to list a history of versions. From the first to the latest. Your mission is to find out the relevant source files, functions, variables, etc. that are used to obtain the version history for a certain source file.</td>
+
* Started testing the delta debugging framework.  
        <td>TBD.</td>
 
        <td>Not started.</td>
 
    </tr>
 
  
    <tr bgcolor="#F0F0F0">
+
The results of the testing?
        <td>Results of querying CVS via Bonsai for version history</td>
 
        <td>Your mission is to find out the relevant source files, functions, variables, etc. that are used to return and store the results of searching for a file's version history. What data is returned and how is the data stored/formatted?</td>
 
        <td>TBD.</td>
 
        <td>Not started.</td>
 
    </tr>
 
  
    <tr>
+
'''Finding the minimal revision set/last known good revision'''
        <td>Query CVS via Bonsai for differences between versions</td>
 
        <td>Using Bonsai, a user can see the differences between two different versions of a source file. Your mission is to find out the relevant source files, functions, variables, etc. that are used to find the differences between two different versions of a source file.</td>
 
        <td>TBD.</td>
 
        <td>Not started.</td>
 
    </tr>
 
  
    <tr>
+
Works. The delta debugger correctly reverts to a previous revision, builds the source code, and runs the test case. The test case returns the proper results on whether or not it passes or fails. The delta debugger correctly stops at revision 4 - the last known good version.
        <td>Results of querying CVS via Bonsai for differences between versions</td>
 
        <td>Using Bonsai, a user can see the differences between two different versions of a source file. How are the results returned? How is it formatted? Your mission is to find out the relevant source files, functions, variables, etc. that are used to return the results of the query. You are to find out how the data is returned and how it is formatted.</td>
 
        <td>TBD.</td>
 
        <td>Not started.</td>
 
    </tr>
 
  
    <tr>
+
'''Finding the minimal failure-inducing set of directories'''
        <td colspan="4"><strong>Test Case(s)</strong> ([http://www.mozilla.org/tinderbox.html Tindexbox])<br />
 
        </td>
 
    </tr>
 
  
    <tr>
+
Indeterminate. There is only 1 directory in the source repository so that directory should be returned as the minimal failure inducing set of directories. Does it return it? yes and no.
        <td>Creation / Extrapolation of Test Case(s)</td>
 
        <td>We need test cases that can return whether or not the test passes or fails. Tinderbox has a couple of tests that are executed after the source is built. Extrapolate those tests from the Tinderbox source code so that we can use them in this project. We also need a test case that can pass/fail consistently so that we can test the delta debugger.</td>
 
        <td>[[User:Ankuswan|Aditya Nanda Kuswanto]]</td>
 
        <td>Work in progress. Found the tests! Now need to figure out how to run them and how they work.</td>
 
    </tr>
 
  
 +
The delta debugger correctly applies all of the changes within that directory. And I think it correctly builds the source tree and runs the test case. However, the return code of the test case is not as expected. I expect the test case to report that the test fails, however, it reports that it passes. Thus, the delta debugger returns no directories as failure inducing.
  
 +
However, if I force the test case to return the expected result, then the delta debugger correctly returns the directory as the failure-inducing one.
  
    <tr>
+
I suspect (or at least hope) that the indeterminate results of finding the failure inducing set of directories is because of a possibly unreliable or inconsistent test case. However, I can not be sure until I rule out the test case as the problem.
        <td colspan="4"><strong>Implementation of Delta Debugging Algorithm</strong> ([http://www.infosun.fmi.uni-passau.de/st/papers/tr-99-01/ Yesterday, my program worked. Today, it does not. Why?])<br />
 
        </td>
 
    </tr>
 
    <tr>
 
        <td>The Algorithm</td>
 
        <td>The delta debugging algorithm. Drives the framework to retrieve change sets, apply changes, build source tree, run test case(s) to find the minimal set of failure inducing changes. The intersection of all other parts of the framework to make them work together. Ideally, should be abstract enough for easy extensibility with little impact.</td>
 
        <td>[[User:dwwoodsi|Dean Woodside]]</td>
 
        <td>Work in progress. Check the SVN repository from time to time.</td>
 
    </tr>
 
  
</table>
+
'''Finding the minimal failure-inducing set of files'''
  
== Project References ==
+
Indeterminate. There are multiple source files in the repository. Does it return the correct failure-inducing source file? I don't know. I have the same suspicions for this as for the directory changeset.
  
([[#top|&uarr; top]])
+
Based on the testing, it seems to be able to cycle through every combination of changes in the changeset, apply the combination of changes, build the source code, and run the test case. The test case just seems to not report the correct test results.
 +
 
 +
 
 +
 
 +
=== Dec. 03, 2006 ===
 +
 
 +
Committed some updated to the SVN repository.
 +
* The test framework. There are a couple of files to the framework: Test.pl, TestCase.pl, TestSuite.pl, TestResult.pl, TestRunner.pl. It is loosely based off of the design of the JUnit framework. Why such an elaborate design just for the need of users to define the test case that can determine whether or not a piece of functionality works or not to be run? For a few reasons that I may be adamant about:
 +
# To use the delta debugging framework, the user should not have to touch the DeltaDebugger.pl file to define the tests and how to run them. Using the testing framework, this can be done by subclassing the TestCase.pl class and overriding the run() subroutine.
 +
# For the delta debugger to work, it needs to know whether the test case passes or fails. Using the test framework, I hope to control the possible return codes of the tests to either pass or fail only.
 +
* testtest.pl that tests the functionality of the test framework.
 +
* updates to DeltaDebugger.pl to make use of the test framework.
 +
 
 +
 
 +
Crunch time. One week left. The high priority tasks that still need to be done:
 +
# Acquisition of a program we could use to test the delta debugging framework. See [[#How_to_Get_Involved|How To Get Involved]] for more info.
 +
# Test, debug the delta debugging framework.
 +
 
 +
 
 +
=== Nov. 26, 2006 ===
 +
 
 +
Committed some updates to the SVN repository.
 +
* Updated the delta debugging algorithm module. I didn't realize this yesterday but the algorithm to find the minimal set of failure inducing files (and code block and line of code changes if those changeset types ever gets dones) is the same (with minor modifications) as the algorithm that can find the minimal set of failure inducing directories. Thus I generalized that algorithm to remove the directory changeset specific code so that it will work with all other types of changesets.
 +
* Removed the debugging/test related code from the source files.
 +
CVS Repository Setup (thanks to [[user:reed|Reed Loden!]]): '''hera.senecac.on.ca/deltatest'''
 +
*[http://hera.senecac.on.ca:43080/viewvc.cgi/?root=deltatest ViewVC Web Repository Browser]
 +
*If you want commit access for whatever reason, email one of the project members
 +
 
 +
'''Milestone:'''
 +
* Even though the test framework is incomplete, I think we can go ahead and begin the initial testing of the delta debugger on a real regressive program as I think we are ready. Coincidentally, exactly 2 months after the first project news posting on Sept. 26, 2006.
  
[http://programming.newsforge.com/article.pl?sid=05/06/30/1549248&from=rss NewsForge: An Introduction to Delta Debugging]
+
=== Nov. 25, 2006 ===
Delta debugging simplifies debugging process in a program by automating the process and continually splitting the program to smaller chunks called deltas. This technique is useful in three circumstances:
 
*Error occurs due to user inputs (e.g. keypress, file I/O). Delta debugging is used to eliminate user actions irrelevant to the nature of the error and pinpoint the cause of the error.
 
*Error occurs due to recent changes to the code. In this situation, deltas are retrieved from the net differences from both codes.
 
*Multithreading environment. Delta debugging can track down the exact order of operations originating from multiple threads that caused the error.
 
  
== Points of Confusion ==
+
I haven't posted an update in a while so here goes. What's been done since then?
  
([[#top|&uarr; top]])
+
Committed some updates to the SVN repository.
 +
* Modified the Changeset hierarchy of classes. Added a getChange() subroutine that takes an index and retrieves the change from the changeset. Also modified the getChangeset() subroutine to optionally take an array of directories/files to limit the search scope to within the directories/files passed in. These changes are possibly dangerously untested.
 +
* Committed the DeltaDebugger.pl file. This file houses the actual delta debugging algorithm. It requires three user-defined pieces of information: a Build object, an RCS object, and the automated test cases. Currently, it can theoretically find the failure inducing revision, and the minimal failure inducing set of directories.
 +
* Committed a DeltaDebuggerTest.pl file. It just tests the correctivity of the theoretical.
  
[[Bonsai issue]] -- '''unresolved'''
 
  
 +
In the works:
 +
* Continue working on the delta debugging algorithm. Need to be able to find the minimal failure inducing set of files.
 +
* Test framework. Allow users to plug in test cases/suites without touching the DeltaDebugger.pl module.
  
When I get confused, I draw diagrams.
 
  
 +
The deadline for a version 0.01 release is looming. 1-2 weeks left to get this done. What needs to be done to accomplish this?
 +
* Finish everything that is in the works real soon.
 +
* Need a test program that we could use and upload to our test SVN repository to test the delta debugging framework. Ideally, the test program will meet the following requirements:
 +
# Has source files that span multiple directories wide and deep yet be small enough that the delta debugging can be done in a short amount of time so that all aspects of the delta debugger can be tested.
 +
# Has a regression. Or can easily be modified so that some functionality will stop working.
 +
# Has an automated test case that tests the regressive functionality.
 +
* Put theory into practice. So far the delta debugging algorithm has not been tested on a real program. The correctness of the algorithm has only been confirmed in theory. We need to test the algorithm in a production environment real soon.
  
[[Image:Dd_partialclassdiagram.PNG]]
 
  
 +
=== Nov. 19, 2006 ===
  
'''The Clear: Seemingly Straight Forward'''
+
The earlier crash case we had (see the update directly below) was a non-regressive bug--there was no former build that worked with it.
  
The RCS tree is straight forward. It will encapsulate the data and operations related to the revision control system. SVN wraps the operations of the SVN revision control system, CVS will wrap the operations of the CVS revision control system, etc.
+
Going to use [https://bugzilla.mozilla.org/show_bug.cgi?id=325377 Bug #325377] instead.  Having difficulty identifying when it was first introduced--the information in the bug report doesn't seem to be quite accurate.  Using the nightly builds as archived at [http://archive.mozilla.org/pub/mozilla/nightly/ http://archive.mozilla.org/pub/mozilla/nightly/] to narrow it down.
  
The Build tree is straight forward. It wraps the build tool used to build the source tree.
+
Fortunately this crash is easily automated and does not require user interaction.
  
'''The Blurry: Current Points of Confusion'''
 
  
RCS's can remember the changes (deltas) that occurred in previous versions of a file, the history of changes that occur between revisions, etc.
+
=== Nov. 18, 2006 ===
  
A Changeset and its subclasses will encapsulate the idea of a set of changes. A set of changes could be broken down into various categories such as a specific revision, a list of directories, a list of files, a list of blocks of code, and finally a line of code.
+
*<strike>Found a suitable crash case thanks to the people of [irc://irc.mozilla.org#qa #qa] (in particular, asqueella and Aleksej).  For full details on the bug, see [https://bugzilla.mozilla.org/show_bug.cgi?id=354300 Bug #354300].</strike>
  
A Change and its subclasses encapsulate the idea of a single change. A change can be a change made within a directory, change made within a file, change made to block of code, or a change to a line.
+
*Talked to Reed Loden on IRC. He will be setting up a CVS repository for us something this coming week (Tuesday at earliest).
  
A ChangesetFactory is supposed to return a change set based on the type of change set requested.  To get the requested change set, one needs to know the type of revision control system (SVN, CVS, other, etc.) and/or the data required to connect to it.  So there obviously need a link between RCS and ChangesetFactory/Changeset. The question is how? What is the proper/best way to link them together? One way is to pass in an RCS object to the ChangesetFactory which would then pass that object to the appropriate Changeset subclass. I don't like that solution but it's the simplest.
 
  
Also, the method to get a change set for SVN may be different from CVS. So there may be a Changeset hierarchy for SVN and another one for CVS. I don't like the idea of that at all. There must be another way.
+
=== Nov. 17, 2006 ===
  
'''The Blind: Future Points of Confusions'''
+
Committed some updates to the SVN repository.
 +
* Changed applyChanges subroutine to take array of indices instead of scalar of an index.
 +
* Added unapplyChanges subroutine to Changeset classes.
 +
* [http://www.cpan.org/modules/by-module/Math/Math-Combinatorics-0.08.readme Math::Combinatorics], shamelessly stolen from [http://www.cpan.org/modules/by-module/Math/Math-Combinatorics-0.08.tar.gz here]. This module is used in the Delta Debugging Algorithm module to help find the minimal failure-inducing changeset.
  
* Applying a change in a changeset. Should the Changeset subclasses be able to do that? Are they the information expert? They know about the changes. Should they know how to apply them? How would we go abouts applying a subset of changes in a changeset? For example, there may have been changes in 10 different directories, how would we apply the changes from say 4 of the 10 directories and not the others?
+
In the pipeline:
* Connecting all 3 hierarchies together. Need to be able to connect to SVN, need to be able to get and apply changes, need to be able to build the source tree.
+
* Delta Debugging Algorithm partially complete. Unthoroughly tested though can theoretically find the directories that contain the failure inducing changes.
* The actual delta debugging algorithm.  
+
* Test cases and samples we may be able to use to test the algorithm.
  
But that's all for the future.
+
Uploaded files into '''scen1''' directory, containing test module for '''binaryTest'''. The test is ready to be used in the algorithm. The directory contains:
 +
* '''binaryTest.pl''' - test to detect the existence of a file.
 +
* '''helloWorld.pl''' - enough said!
 +
* '''binaryTestCaller.pl''' - runs '''helloWorld.pl''', pipe the result to '''hello.log''', and have '''binaryTest.pl''' attempt to detect it.
 +
This is the working version of the code, labeled '''revision 12'''. Now I have to find a way to wreck it........
  
== Project News ==
 
  
([[#top|&uarr; top]])
+
=== Nov. 14, 2006 ===
  
''This is where your regular updates will go.  In these you should discuss the status or your work, your interactions with other members of the community (e.g., Seneca and Mozilla), problems you have encountered, etc. Put detailed technical information into the Project Details page (i.e., update it as you go), and save this section for news about participation in the project.''
+
The development of testing system for the framework is in the works. The first scenario revolves around a test called '''BinaryExist''', which has been shamelessly ripped from the Tinderbox script. All this test does is check whether a given file exists in the system. While this test can aspire for great things, right now it's doing simple thing, like checking whether its client Hello World program is doing what it's supposed to. Initial testing reveals that this test has potential. Will be uploaded to the SVN soon.
  
  
Line 337: Line 375:
  
 
Attempts to run the tests have so far been unsuccessful. If someone (hint hint) could figure out how to run these tests and how these tests work that would be great.
 
Attempts to run the tests have so far been unsuccessful. If someone (hint hint) could figure out how to run these tests and how these tests work that would be great.
 +
  
 
=== Oct. 31/Nov. 01, 2006 ===
 
=== Oct. 31/Nov. 01, 2006 ===
Line 473: Line 512:
  
 
I read through your documentation here, and it is looking good.  I also spoke to Shaver by phone this morning, and we chatted briefly about this project.  He suggests that you start your work by looking for a suitable '''Crash Case''', one that happens reliably.  Then you need at what would be necessary in order to bisect the change set (e.g., [http://www.mozilla.org/bonsai.html bonsai] data) in order to get closer to the change that introduced the bug.  Shaver suggested that robc (Rob Campbell) might be a good person to help you brainstorm on this.
 
I read through your documentation here, and it is looking good.  I also spoke to Shaver by phone this morning, and we chatted briefly about this project.  He suggests that you start your work by looking for a suitable '''Crash Case''', one that happens reliably.  Then you need at what would be necessary in order to bisect the change set (e.g., [http://www.mozilla.org/bonsai.html bonsai] data) in order to get closer to the change that introduced the bug.  Shaver suggested that robc (Rob Campbell) might be a good person to help you brainstorm on this.
 +
 +
== How to Get Involved ==
 +
 +
We need a test program that we could use and upload to our test SVN repository to test the delta debugging framework. Ideally, the test program will meet the following requirements:
 +
# Has source files that span multiple directories wide and deep yet be small enough that the delta debugging can be done in a short amount of time so that all aspects of the delta debugger can be tested.
 +
# Has a regression. Or can easily be modified so that some functionality will stop working.
 +
# Has an automated test case that tests the regressive functionality.
 +
If you don't have a program that meets the first requirement, we could also use test programs that have multiple source files. The key being that the program has more than one source file. Programs that are contained in only one source files are useless to us.
 +
 +
If you have a program that meets these requirements, and you want to contribute to this project, then holla.
 +
 +
 +
<hr />
 +
 +
 +
If you are looking for an easy way in which to contribute to this project, you can jump in by writing one or more tests for the test suite.  This does not require that you learn about the delta debugging inner-workings or structure.
 +
 +
Basic Advice:
 +
* You '''must''' be able to automate the test--no human intervention is allowed.
 +
* Possible test types include:
 +
*: '''Crashing'''
 +
*:: Can you crash the program with a minimal collection of circumstances (steps) that are easily reproducable? (In other words, can you write a script so that this happens in a controlled manner.)
 +
*: '''Performance-related'''
 +
*:: Is there a threshold for unacceptable consumption of time and/or space that is reason for concern?
 +
*: '''Program hanging'''
 +
*:: Does the program hang?  Will it occur in a certain functionality of the software that is possible to isolate (reproduce) through scripted means?
 +
*: '''Unexpected return codes'''
 +
*:: What is a normal return code for the program?  What is considered unexpected?  Script a series of actions and pass the return code up to the test framework.
 +
* Each test will fit into the test framework (which, at this point, still has to be designed).  The tests must follow a few rules (again, undecided at this point).
 +
 +
Please check back in a few days.  Expect some templates and samples up shortly to help get you going.  <u>The currently listed test types are subject to change.</u>
 +
 +
 +
==Future of the Project==
 +
Here are some of the ideas related to the continuation of this project.  Included are some personal ideas of the team members, tasks to reach the overall objective (a working, robust, Delta Debugging Framework for Mozilla), and additional features/functionality that would enhance the framework.  This is subject to change, and a project roadmap will be written in the near future.
 +
 +
===CVS Support via Bonsai===
 +
For the exploration into Bonsai and to see where it is/was heading, please view the [[delta debugging framework bonsai direction|Bonsai Direction]].  It is likely that a workable solution could be produced utilizing some of the details found in the link.  This functionality would be particularly useful to Mozilla as this [Bonsai] is the technology they currently use.
 +
 +
===Enhancement of the Algorithm===
 +
Richard's great algorithm can be further enhanced using a binary search-like approach that splits the revision from the current, all the way back to when the regression was first noticed (or, alternatively, when the crash case last known to have worked).  Currently it works in a sequential manner, testing all previous revisions in order.
 +
 +
:'''More Granularity'''
 +
:For this course, Richard's algorithm supported down to the file-level of change.  In the future, it could go as far as evaluating changes in lines of code.
 +
 +
===Fleshed Out Test Suite Design===
 +
The test suite test types should be further fleshed out and individual tests gathered (no participation from the class was possible due to time constraints; the test suite design wasn't fully explored and documented).  Test suites could be put together for each major Mozilla.org project (Firefox, Thunderbird, Sunbird, Bugzilla, etc.).
 +
 +
===More Crash Cases===
 +
More crash cases need to be found for the success in testing the project.
 +
 +
===Unit Tests===
 +
A debugging framework, more so than other projects, should have its code quality tested and scrutinized heavily.
 +
 +
===Code Review===
 +
Perhaps some manual audits could be performed by hand from outside contributors in the future.

Latest revision as of 13:16, 17 April 2013

Project Name

Delta debugging framework

Project Description

(↑ top)

Delta debugging is an automated approach to debugging that isolates failures systematically. Given a failing test that can be mechanically verified (including a browser crash), delta debugging is a way of automatically isolating the change that introduced the failure.

For developers, the occurrence of a scenario similar to this happens all too often: a developer codes a piece of functionality that works. Then, over a period of time, multiple changes to the program source files are made and that piece of functionality stops working. The cause of the regression could be any of the changes that were made since the time the functionality was last known to work.

To isolate the cause of the regression, the developer begins the debugging process. Generally, debugging is a manual process where the developer must walk through the code while trying to keep track of variables and function calls. Sure, there are debuggers that can help you keep track of variables, the call stack, watch certain blocks of code, and execute the code step by step, however debugging is still mainly a manual process.

Written in perl, given

  1. that the source code is located in an SVN repository (support for CVS in the future)
  2. a test case that can automatically verify whether or not a piece of functionality of a program works or not
  3. a way to automatically build the program from the source code (if needed)

the delta debugging framework aims to automatically isolate the failure-inducing changes to the source code that caused a regression.

Project License

Written in perl, given a test case that can automatically verify whether or
not a piece of functionality of a program works or not, the delta debugging
framework aims to automatically isolate the failure-inducing changes to the
source code that caused a regression.

Copyright (C) 2006 Richard Chu, Aditya Nanda Kuswanto, Dean William Woodside

This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
Foundation; either version 2 of the License, or (at your option) any later
version.

This program is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with
this program; if not, write to the Free Software Foundation, Inc., 59 Temple
Place, Suite 330, Boston, MA 02111-1307 USA

The GNU General Public License (GPL) Version 2, June 1991

Contact Information about the authors of the delta debugging framework can be found
on our individual profile pages.

Project Leader(s)

(↑ top)

Name(s) of primary people working on the project. If you want to join a project as leader, discuss with other leaders first. Include links to personal pages within wiki.

Richard Chu
Dean Woodside
Aditya Nanda Kuswanto

Project Contributor(s)

(↑ top)

Name(s) of people casually working on the project, or who have contributed significant help. Include links to personal pages within wiki.
NOTE: only Project Leader(s) should add names here. You can’t add your own name to the Contributor list.

Reed Loden - Setup the CVS repository for us with the web front-end. Provided direction in querying Bonsai (a means to extract the output in XML).
Liz Chak - Documentation of the subroutines in the svn.pl and makewrapper.pl source files.

Project Source Repository

(↑ top)

Assuming you have SVN, the project's source can be obtained via SVN using the following command:

svn checkout svn://cdot.senecac.on.ca/deltadbg

The source can also be obtained at the following links:

The test cases can be obtained via SVN using the following command

svn checkout svn://cdot.senecac.on.ca/deltatest

The source can also be obtained at the following links:

Project Details

(↑ top)

Provides more depth than the Project Description. This is the place for technical discussions, project specs, or other details. If this gets very long, you might consider breaking this part into multiple pages and linking to them.


Based on the papers on delta debugging that I've sifted through (with From Automated Testing to Automated Debugging by Andreas Zeller being the most understandable to non-math wizards, and Yesterday, my program worked. Today, it does not. Why? also by Andreas Zeller probably being the most relevant to our project), here is an outline of my understanding of delta debugging, a summary of the concepts that must be taken into account while working on the delta debugging framework project, and an outline of the conceptual stages of developing the delta debugging framework.


Delta debugging is an algorithm that can automatically and systematically isolate/narrow down the failure-inducing circumstances that are necessary to produce the bug. Delta debugging can be applied to isolating various types of failure-inducing circumstances, including:

  • program input
  • user interaction (key presses, button presses, mouse clicks, etc.)
  • program code modification (adding / updating / deleting variables, functions, classes, etc.)


How? From my understanding, given a known bug and a known set of circumstances that can reproduce the bug, we can continually execute tests that vary the circumstances until a minimal subset of circumstances that can reproduce the bug is left. That is, for each test, a circumstance is removed, and if the bug is still present, than that circumstance can theoretically be eliminated as the cause of the bug from the set of circumstances. In this project's case, the circumstance will be certain change(s) to the program code that caused a regression in the program.


[Note: The rest of this post contains mathematical concepts that I may not fully understand but am trying to put into terms understandable to non-math wizards. A dangerous combination.]


Before we continue, here are some important terms to understand.

  • Configuration. A subset of all changes made to source code since the last known good version and the version with the regression.
  • Test. A test case or function that can determine whether a configuration contains the failure-inducing change (Fails), doesn't contain the failure-inducing change (Passes), or produces indeterminate results (possibly because of interference or inconsistency).
  • Subset. Given two sets (A and B), A is a subset of B if and only if all of the elements of set A are in set B.
  • Superset. Given two sets (A and B), B is a superset of A if and only if B contains all of the elements of A.
  • Union. Given two sets (A and B), the union of A and B contain all elements of A and B.
  • Intersection. Given two sets (A and B), the intersection of A and B are the elements that are in both A and B.


With regards to the delta debugging algorithm, we must take into account these possible concepts that may complicate it:

  • Interference. Each individual change may not cause the program to regress, but applying a combination of changes together causes the regression. Thus, there must be a method to identify the set of changes that cause the regression.
    In the case of interference, we must recursively test both halves with all changes in one half of the configuration remaining applied.
  • Inconsistency. Changes in the source code may be dependent on other changes in the source code and without them the program cannot be compiled and tested successfully. Thus, there must be a method to identify and handle dependencies (such as a dependency graph).
    [Wild, far left field thought: How can we ensure that when applying a configuration, it is consistent? Well, in the compiler process, there is a lexical analysis step which breaks the source code into tokens, then there is a dependence analysis step which produces constraints/dependencies between the tokens. Theoretically, if we can harness parts of the compilation process, we could have a method of knowing the dependencies between the changes in the source code.]
  • Granularity. A single logical change may consist of hundreds of lines of code, yet only a couple lines of the change may be responsible for the regression. Thus, There must be a method to break the change into smaller manageable chunks.
  • Monotony. If a change causes a failure, any configuration that includes this change fails as well (makes sense to me). If a configuration does not cause a failure, then all subsets of the configuration do not cause a failure, thus they can be eliminated from the change set (makes some sense to me. However, if my understanding is correct, while a subset of the configuration does not cause the failure by itself, the concept of interference suggests that the subset combined with another configuration may cause the regression).
  • Unambiguity. A failure is caused by only one configuration (No interference). Thus, for efficiency, we do not have to search the other configuration for more failure-inducing changes. Whether or not a configuration is ambiguous or unambiguous, a failure-inducing change will be produced. However, for completeness with regard to finding all failure-inducing changes, it is preferred to search both configurations.


Now that we are aware of the different concepts that we must take into account with regards to delta debugging, the next section will outline some facts and assumptions that are being made, and attempt to define the vision and process of the delta debugging framework.


Project Principles

(↑ top)

  1. The source tree for the Mozilla project is HUGE. With many different source file types (C++, JS, XUL, etc.) in many different directories.
  2. Failure-inducing change(s) will unlikely be localized to a single directory and file. Failure-inducing change(s) may be spread across many different directories and source files.
  3. The source files could be of the same type (C++), mixed type (C++, JS), same directory, different directory. It shouldn't matter. The framework should be source type and location agnostic.
  4. The failure-inducing change(s) may not be localized to a single developer. The failure-inducing change(s) may have been caused by another developer's change(s) to a source file they were working on. That is, a single developer's source scope may not be encapsulated but interconnected and interdependent on other developers source code.
  5. The developer's current version of the source code contains the regression.
  6. The developer has a test case that can be used indicate whether the test passes/fails/is indeterminate.
  7. The developer will NOT know the date/version of the last known good version.

Project Flowchart

(↑ top)

The flowchart represents the simplistic version of the delta debugging algorithm. It will theoretically find a failure-inducing change set but not necessarily the minimal set or the full set of failure-inducing change(s). The algorithm is depicted as recursively linear however it could be binarily recursive. In the linear version, the theoretical maximum number of iterations (worst case scenario) is:

Dd maxiterations.PNG

where n represents the total number of changes and r is a subset of n.

In other words, the summation of the combinations of changes without repetitions that can be made given that the size of the change set can vary from 1 to n.


Updated Delta Debugging Flowchart.


Dd flowchart1.PNG

Here are some thoughts regarding the flowchart:

  1. The whole process revolves around a certain Test, which must be passed to complete the process. It is assumed that the source code passed this test before, but not anymore due to recent changes to the tree. The framework will locate these changes.
  2. The Test is a versatile module and can be adapted to accept any tests the user may use.
  3. When the initial test fails, the framework first attempts to locate which changeset causes this failure. This is done by "going back through time", retrieving the trees from previous revisions, and running each tree through the same test. The idea is to locate the latest revision where the test passes successfully.
  4. Once this revision is identified, the framework will extract the diff, the difference between the two revisions.
  5. The framework will then use this diff to break down the difference possibilities (e.g. directory, file, etc) and isolate the cause of the failure.
  6. Once this is done, the framework will deliver the cause of the failure in a report for the user and the operation is finished.


Project Test Cases

The test cases used in this project are located in the Delta Debugging Testcases page.

Project Roadmap

(↑ top)

Delta Debugging Framework Roadmap

The page outlines our current vision of the delta debugging framework and a roadmap of the work that needs to be completed to accomplish our vision. This roadmap may be subject to change and/or expanded as our vision expands through feedback and requests from others, our own ideas, etc.

Partial Class Diagram

(↑ top)

Most of the classes in blue exist in the source repository. The classes in pale yellow are classes that won't be completed in the first release.


Dd partialclassdiagram2.PNG


Project News

(↑ top)

This is where your regular updates will go. In these you should discuss the status or your work, your interactions with other members of the community (e.g., Seneca and Mozilla), problems you have encountered, etc. Put detailed technical information into the Project Details page (i.e., update it as you go), and save this section for news about participation in the project.

Dec. 22, 2006

I haven't posted an update in a while. So what's been done?

I finally had some time to do a second round of testing & debugging of the delta debugging framework. And guess what? It seems to work now. The problem? Combination of logical errors when applying and unapplying changes in the framework and a bad test case. Go figure.

However, before I get ahead of myself and officially tag and release the delta debugging framework as version 0.1, I would like to test it out on another test program. Hopefully, this can be done this weekend. And if all goes well, version 0.1 will be officially released before the end of the year.


Dec. 13, 2006

Created Delta Debugging Testcases page to discuss the nature of the test cases created to test the algorithm. Included in the page are 2 testcases created so far, the HelloWorld binary test and the Sudoku test. Both tests can be found in the deltatest svn repository. The repository can be checked out using this command:

 svn checkout svn://cdot.senecac.on.ca/deltatest 

Exactly 12 days before Christmas, the delta debugging framework has been released under the GPL Version 2 License.

Unfortunately, we haven't had the time to test the delta debugger much since Dec. 09, 2006 because of exams, other school work; Planning to spend some time this weekend to test the delta debugger and figure out why it currently seems to not be able find the minimal set of failure inducing directories/files (whether its because of unreliable test case or logical error in the program).

A roadmap of our vision of the direction of the project will be heading in the future will be created and posted soon.


Dec. 11, 2006

Uploaded testcase for HelloWorld binary at deltatest svn. The test simulates the error that may occur when compilation fails due to syntax error. The exalted HelloWorld program is located on the HelloWorld directory, while the test definition is at HelloTestCase1.pm. The algorithm detects failed test and reverts the affected file to the version where the test passes.

Note for the future: improve user feedback functions!


Dec. 10, 2006

Where is the CVS/Bonsai work heading? Here is a breakdown of the past 3-4 weeks:

  • Initially was going for a straight wrapper around CVS ala the style Richard used for SVN.
  • Tried to find some functionality within Bonsai that could make it easier.
  • Talked to Reed Loden, he set up a repository for us to try with. Thanks Reed!
  • Thought that there may be some additional (read: unpublished) tools that could be worked with. Got in contact with some of the "Project Participants" listed on [1]. Was told the person in particular wasn't a contributor (just submitted a bug report). They in turn pointed me to #mozwebtools.
  • Lurked on #mozwebtools for a few weeks. Talked to 'justdave' about Bonsai. Reed Loden chimed up and informed me that Bonsai can output to XML using ?xml=1 on the query (score! thanks again).
  • Researched some PERL parsing utilities. Trying out XML::LibXML for DOM-style parsing.
  • Hopefully wrap something up by Wednesday. Failing that, might just go with simple CVS wrapper of some sort.


Dec. 09, 2006

What has been done since last week?

  • Got a test program and uploaded it to svn://cdot.senecac.on.ca/deltatest. The pristine working version is revision 4. The latest committed copy is revision 8. The regressive code was committed somewhere in between.
  • Started testing the delta debugging framework.

The results of the testing?

Finding the minimal revision set/last known good revision

Works. The delta debugger correctly reverts to a previous revision, builds the source code, and runs the test case. The test case returns the proper results on whether or not it passes or fails. The delta debugger correctly stops at revision 4 - the last known good version.

Finding the minimal failure-inducing set of directories

Indeterminate. There is only 1 directory in the source repository so that directory should be returned as the minimal failure inducing set of directories. Does it return it? yes and no.

The delta debugger correctly applies all of the changes within that directory. And I think it correctly builds the source tree and runs the test case. However, the return code of the test case is not as expected. I expect the test case to report that the test fails, however, it reports that it passes. Thus, the delta debugger returns no directories as failure inducing.

However, if I force the test case to return the expected result, then the delta debugger correctly returns the directory as the failure-inducing one.

I suspect (or at least hope) that the indeterminate results of finding the failure inducing set of directories is because of a possibly unreliable or inconsistent test case. However, I can not be sure until I rule out the test case as the problem.

Finding the minimal failure-inducing set of files

Indeterminate. There are multiple source files in the repository. Does it return the correct failure-inducing source file? I don't know. I have the same suspicions for this as for the directory changeset.

Based on the testing, it seems to be able to cycle through every combination of changes in the changeset, apply the combination of changes, build the source code, and run the test case. The test case just seems to not report the correct test results.


Dec. 03, 2006

Committed some updated to the SVN repository.

  • The test framework. There are a couple of files to the framework: Test.pl, TestCase.pl, TestSuite.pl, TestResult.pl, TestRunner.pl. It is loosely based off of the design of the JUnit framework. Why such an elaborate design just for the need of users to define the test case that can determine whether or not a piece of functionality works or not to be run? For a few reasons that I may be adamant about:
  1. To use the delta debugging framework, the user should not have to touch the DeltaDebugger.pl file to define the tests and how to run them. Using the testing framework, this can be done by subclassing the TestCase.pl class and overriding the run() subroutine.
  2. For the delta debugger to work, it needs to know whether the test case passes or fails. Using the test framework, I hope to control the possible return codes of the tests to either pass or fail only.
  • testtest.pl that tests the functionality of the test framework.
  • updates to DeltaDebugger.pl to make use of the test framework.


Crunch time. One week left. The high priority tasks that still need to be done:

  1. Acquisition of a program we could use to test the delta debugging framework. See How To Get Involved for more info.
  2. Test, debug the delta debugging framework.


Nov. 26, 2006

Committed some updates to the SVN repository.

  • Updated the delta debugging algorithm module. I didn't realize this yesterday but the algorithm to find the minimal set of failure inducing files (and code block and line of code changes if those changeset types ever gets dones) is the same (with minor modifications) as the algorithm that can find the minimal set of failure inducing directories. Thus I generalized that algorithm to remove the directory changeset specific code so that it will work with all other types of changesets.
  • Removed the debugging/test related code from the source files.

CVS Repository Setup (thanks to Reed Loden!): hera.senecac.on.ca/deltatest

Milestone:

  • Even though the test framework is incomplete, I think we can go ahead and begin the initial testing of the delta debugger on a real regressive program as I think we are ready. Coincidentally, exactly 2 months after the first project news posting on Sept. 26, 2006.

Nov. 25, 2006

I haven't posted an update in a while so here goes. What's been done since then?

Committed some updates to the SVN repository.

  • Modified the Changeset hierarchy of classes. Added a getChange() subroutine that takes an index and retrieves the change from the changeset. Also modified the getChangeset() subroutine to optionally take an array of directories/files to limit the search scope to within the directories/files passed in. These changes are possibly dangerously untested.
  • Committed the DeltaDebugger.pl file. This file houses the actual delta debugging algorithm. It requires three user-defined pieces of information: a Build object, an RCS object, and the automated test cases. Currently, it can theoretically find the failure inducing revision, and the minimal failure inducing set of directories.
  • Committed a DeltaDebuggerTest.pl file. It just tests the correctivity of the theoretical.


In the works:

  • Continue working on the delta debugging algorithm. Need to be able to find the minimal failure inducing set of files.
  • Test framework. Allow users to plug in test cases/suites without touching the DeltaDebugger.pl module.


The deadline for a version 0.01 release is looming. 1-2 weeks left to get this done. What needs to be done to accomplish this?

  • Finish everything that is in the works real soon.
  • Need a test program that we could use and upload to our test SVN repository to test the delta debugging framework. Ideally, the test program will meet the following requirements:
  1. Has source files that span multiple directories wide and deep yet be small enough that the delta debugging can be done in a short amount of time so that all aspects of the delta debugger can be tested.
  2. Has a regression. Or can easily be modified so that some functionality will stop working.
  3. Has an automated test case that tests the regressive functionality.
  • Put theory into practice. So far the delta debugging algorithm has not been tested on a real program. The correctness of the algorithm has only been confirmed in theory. We need to test the algorithm in a production environment real soon.


Nov. 19, 2006

The earlier crash case we had (see the update directly below) was a non-regressive bug--there was no former build that worked with it.

Going to use Bug #325377 instead. Having difficulty identifying when it was first introduced--the information in the bug report doesn't seem to be quite accurate. Using the nightly builds as archived at http://archive.mozilla.org/pub/mozilla/nightly/ to narrow it down.

Fortunately this crash is easily automated and does not require user interaction.


Nov. 18, 2006

  • Found a suitable crash case thanks to the people of #qa (in particular, asqueella and Aleksej). For full details on the bug, see Bug #354300.
  • Talked to Reed Loden on IRC. He will be setting up a CVS repository for us something this coming week (Tuesday at earliest).


Nov. 17, 2006

Committed some updates to the SVN repository.

  • Changed applyChanges subroutine to take array of indices instead of scalar of an index.
  • Added unapplyChanges subroutine to Changeset classes.
  • Math::Combinatorics, shamelessly stolen from here. This module is used in the Delta Debugging Algorithm module to help find the minimal failure-inducing changeset.

In the pipeline:

  • Delta Debugging Algorithm partially complete. Unthoroughly tested though can theoretically find the directories that contain the failure inducing changes.
  • Test cases and samples we may be able to use to test the algorithm.

Uploaded files into scen1 directory, containing test module for binaryTest. The test is ready to be used in the algorithm. The directory contains:

  • binaryTest.pl - test to detect the existence of a file.
  • helloWorld.pl - enough said!
  • binaryTestCaller.pl - runs helloWorld.pl, pipe the result to hello.log, and have binaryTest.pl attempt to detect it.

This is the working version of the code, labeled revision 12. Now I have to find a way to wreck it........


Nov. 14, 2006

The development of testing system for the framework is in the works. The first scenario revolves around a test called BinaryExist, which has been shamelessly ripped from the Tinderbox script. All this test does is check whether a given file exists in the system. While this test can aspire for great things, right now it's doing simple thing, like checking whether its client Hello World program is doing what it's supposed to. Initial testing reveals that this test has potential. Will be uploaded to the SVN soon.


Nov. 05, 2006

I didn't know where else to put this so I'm putting this here. While searching around for the elusive Mozilla tests that are run in Tinderbox, I found this gem. All of the tests are apparently located in the mozilla/testing directory and can be checked out using this command while at the mozilla directory:

cvs update -d testing

The tests we would most likely be interested in are located in the tinderbox-standalone-tests subdirectory. Based on a quick scan of the perl files there, the test-mozilla.pl file in that directory seems to drive the tests located in the Tests subdirectory which seem to contain a lot of performance tests. The arguments that the test subroutines receive (such as build directory and binary name) seem to come from the Settings.pm file located in the Util directory.

Attempts to run the tests have so far been unsuccessful. If someone (hint hint) could figure out how to run these tests and how these tests work that would be great.


Oct. 31/Nov. 01, 2006

Committed some updates to the SVN repository.

  • Added file DirectoryChangeset.pl. This file gets a list of directories changed since the revision number passed in.
  • Added file DirectoryChange.pl. This file encapsulates the idea of a changed directory, sort of. Basically holds revision number and path of directory.

UPDATE:

  • I didn't feel tired so I added an applyChange() subroutine to the Changeset classes and ChangesetFactory class. This allows the user to apply a change (specified by the index passed in to the subroutine) in a changeset.

Up Next:

  • Based on how much time I predict I will have left to work on this, I don't think I will have enough time to do the Change/Changeset classes for Codeblock or Line. Therefore, it's time to skip ahead and work on the application of changes. Should the user be able to pass in an array of indices of changes to apply in a changeset? Or is just allowing the user to give one index good enough? We may find that out soon enough when we try to implement the delta debugging algorithm.


Oct. 26, 2006

Committed some updates to the SVN repository.

  • Fixed a small bug with the regular expression in FileChangeset.pl's getChangeset() subroutine.
  • Changed the data that is returned in the getChangeset() subroutines. Used to return the Change set. Now returns the number of changes in the change set. This should decrease memory requirements.
  • Merged some functions in FileChangeset.pl that was created based on an erroneous logical assumption.

Up Next:

  • Continue working on the Changeset/Change subclasses. Need to do Directory, Code block, and Line.


Oct. 24/25, 2006

Committed some updates to the SVN repository.

Updates:

  • Updated the package names for all files.
  • Updated the svn.pl file - fixed a small bug.
  • Updated makewrapper.pl - removed debug statements. I should probably rename the file to something better.

Additions:

  • Created ChangesetFactory.pl - Returns a change set based on the type of change set (revision, directory, file, code block, line).
  • Created RevisionChangeset.pl - Gets and returns all of the changes made in a specified revision.
  • Created FileChangeset.pl - Gets and returns a list of the paths to the files that were changed since a specified revision.
  • Created FileChange.pl - Encapsulates the idea of a Change to a file. Basically stores revision number and file path.
  • Created ChangesetTest.pl - Tests the subroutines in the classes and the interactions between the classes.

Up Next:

  • Continue working on the Changeset/Change subclasses. Need to do Directory, Code block, and Line.


Oct. 17, 2006

The quest to build a suitable "wrapper" for the build system is not doing well. So far I have only observed various build logs posted on Tinderbox for the SeaMonkey build. Here are some observations I come up with.

  • Successful builds - green or yellow on Tinderbox
seamonkey-bin exists, is nonzero, and executable.
seamonkey-bin exists, is nonzero, and executable.
seamonkey-bin binary exists, build successful.
  • Broken builds - red on Tinderbox
gmake[6]: *** [nsAttrValue.o] Error 1
...
gmake: *** [alldep] Error 2

Analyzing build logs may be the "lazy" solution to building a wrapper to detect build failures, but that's the best thing we've got so far. Searches for any of the strings listed above in LXR did not yield valuable result, which means the build system remains a mystery to us. Thoughts:

  • Tinderbox builds rely on report logs to capture build data. If we can figure out how to get tap into this process, we can better plan our prototype.
  • Tinderbox performs a build operation first, and then automatically starts a bunch of tests (e.g. MozillaAliveTest, regxpcom test). These tests do not exist within LXR, either, which means they reside outside the tree. If we can find the script that fires these tests, we can probably create our first prototype as a test and hook it into the build system.

So many questions... Perhaps I should start hanging out at #build.


Oct. 15, 2006

Updated the file svn.pl. Added the diff subroutine. It's a wrapper around the svn diff command. Updated the update subroutine. Added another argument and made them both optional. Subroutine can now accept a directory/file argument in addition to revision number.

Also, currently working on creating some subroutines that can break up a changeset based on the type requested. Changeset types include:

  • Revision. A revision may be a changeset. However, MANY changes may occur in each revision so it may be wise to break up the revision into smaller changesets.
  • Directory. All changes made within a directory may be considered a changeset. Again, MANY changes may occur within a directory, so it may be wise to break it up into smaller changesets.
  • File. All changes made within a file may be considered a changeset. Again, MANY changes may occur within a file, so it may be wise to break up the revision into smaller changesets.
  • Code block. Logical code blocks may be considered a changeset.
  • Line. A line in a file is the lowest change denominator.

The lower down the list you go, the higher the possibility of inconsistencies (inability to successfully compile source tree).


Oct. 10, 2006

Added two files to the SVN repository: makewrapper.pl and maketest.pl. I have no idea if the files will be useful or if I just wasted my time.

makewrapper.pl is a light wrapper around the GNU make utility used to build the source tree. This wrapper was created to be able to programmatically execute the make command with various options. One thing I haven't figured out yet is how to get the exit status code that GNU make returned (0 - successful; 2 - error) after it finishes executing so that I can return that exit code in the build subroutine.

maketest.pl just tests the subroutines in makewrapper.pl.

Starting to get a hang of Perl again after not using it for quite some time.


Oct. 09, 2006

Added two files to the SVN repository: svn.pl and svntest.pl. I have no idea if the files will be useful or if I just wasted my time.

svn.pl is a light wrapper around a couple svn commands used to manipulate the svn repository and user's working copy. This wrapper was created to be able to programmatically execute SVN commands. One thing I haven't figured out yet is how to get the exit status code that is returned after every SVN command is executed as I would like to return this exit code in the subroutines.

svntest.pl just tests the subroutines in svn.pl.

Of note, Perl's object-orientedness is quite different from the other object-oriented languages that I am used to such as C++ and Java.


Oct. 03, 2006

Discussed the project with dave humphrey. The gist of the discussion is:

<vipers101> dave: I hereby present you, richard!

<vipers101> richard: say something!

<dave> pffft...way to play like a team

<vipers101> I prefer to have my team share my griefs

......

<dave> let's assume you've got a program that used to work, but now it crashes. you'd like to figure out what changes to the code introduced the crash. you can programmatically test for the existence of the crash. so you want to have the computer go back in time and figure out what was the delta (e.g., changeset) where the crash started. you want to be able to isolate the problem down to a set of changes to a set of files. further if possible. let's say you have revisions 200 to 500, and you know the bug is caused by *something* in there. so you want to bisect your way through those changesets (in SVN each of those numbers is a changeset) until you can see where the bug was introduced

<vipers101> dave: you mean the bugs are not introduced by the recent work?

<dave> you don't know. you can't assume. maybe, maybe not. remember that working on a huge codebase means that there will be lots of changes going on around your work. so if you do a checkin, they take the weekend off, 50 other people could checkin before your next commit. what you need is a way to test for the problem (maybe a return code). then you need to be able to automatically pull and build the code, and run that test. wash, rinse, repeat

<richard> if i get this, i can't assume that the user will already have a test case/function that returns pass/fail?

<dave> right. you probably need to write something to figure this out. so a hung browser might mean you have a timer that watches for an event. or you might wait on a return code. a crash is probably easiest to test for. this is what I mean by writing a "wrapper".

<vipers101> richard: I think we just have a new task list


Sept. 26, 2006

I read through your documentation here, and it is looking good. I also spoke to Shaver by phone this morning, and we chatted briefly about this project. He suggests that you start your work by looking for a suitable Crash Case, one that happens reliably. Then you need at what would be necessary in order to bisect the change set (e.g., bonsai data) in order to get closer to the change that introduced the bug. Shaver suggested that robc (Rob Campbell) might be a good person to help you brainstorm on this.

How to Get Involved

We need a test program that we could use and upload to our test SVN repository to test the delta debugging framework. Ideally, the test program will meet the following requirements:

  1. Has source files that span multiple directories wide and deep yet be small enough that the delta debugging can be done in a short amount of time so that all aspects of the delta debugger can be tested.
  2. Has a regression. Or can easily be modified so that some functionality will stop working.
  3. Has an automated test case that tests the regressive functionality.

If you don't have a program that meets the first requirement, we could also use test programs that have multiple source files. The key being that the program has more than one source file. Programs that are contained in only one source files are useless to us.

If you have a program that meets these requirements, and you want to contribute to this project, then holla.




If you are looking for an easy way in which to contribute to this project, you can jump in by writing one or more tests for the test suite. This does not require that you learn about the delta debugging inner-workings or structure.

Basic Advice:

  • You must be able to automate the test--no human intervention is allowed.
  • Possible test types include:
    Crashing
    Can you crash the program with a minimal collection of circumstances (steps) that are easily reproducable? (In other words, can you write a script so that this happens in a controlled manner.)
    Performance-related
    Is there a threshold for unacceptable consumption of time and/or space that is reason for concern?
    Program hanging
    Does the program hang? Will it occur in a certain functionality of the software that is possible to isolate (reproduce) through scripted means?
    Unexpected return codes
    What is a normal return code for the program? What is considered unexpected? Script a series of actions and pass the return code up to the test framework.
  • Each test will fit into the test framework (which, at this point, still has to be designed). The tests must follow a few rules (again, undecided at this point).

Please check back in a few days. Expect some templates and samples up shortly to help get you going. The currently listed test types are subject to change.


Future of the Project

Here are some of the ideas related to the continuation of this project. Included are some personal ideas of the team members, tasks to reach the overall objective (a working, robust, Delta Debugging Framework for Mozilla), and additional features/functionality that would enhance the framework. This is subject to change, and a project roadmap will be written in the near future.

CVS Support via Bonsai

For the exploration into Bonsai and to see where it is/was heading, please view the Bonsai Direction. It is likely that a workable solution could be produced utilizing some of the details found in the link. This functionality would be particularly useful to Mozilla as this [Bonsai] is the technology they currently use.

Enhancement of the Algorithm

Richard's great algorithm can be further enhanced using a binary search-like approach that splits the revision from the current, all the way back to when the regression was first noticed (or, alternatively, when the crash case last known to have worked). Currently it works in a sequential manner, testing all previous revisions in order.

More Granularity
For this course, Richard's algorithm supported down to the file-level of change. In the future, it could go as far as evaluating changes in lines of code.

Fleshed Out Test Suite Design

The test suite test types should be further fleshed out and individual tests gathered (no participation from the class was possible due to time constraints; the test suite design wasn't fully explored and documented). Test suites could be put together for each major Mozilla.org project (Firefox, Thunderbird, Sunbird, Bugzilla, etc.).

More Crash Cases

More crash cases need to be found for the success in testing the project.

Unit Tests

A debugging framework, more so than other projects, should have its code quality tested and scrutinized heavily.

Code Review

Perhaps some manual audits could be performed by hand from outside contributors in the future.