Code Indexer

From CDOT Wiki
Revision as of 23:22, 8 November 2006 by John64 (talk | contribs) (Options)
Jump to: navigation, search

Project Name

Source Code Indexing Service Analysis

Project Description

Mozilla’s source code is enormous—millions of lines of C, C++, JavaScript, Perl, Python, Java, C#, etc. Developers currently use the lxr system to quickly search and browse it on-line: http://lxr.mozilla.org. Mozilla is planning a move from CVS to Subversion for revision control, and at the same time wants to evaluate other source indexing services. Two BSD students are working to setup, document, and test other potential services (e.g., fisheye, opengrok, mxr) on one of the Seneca-Mozilla servers (see below). In each case this requires configuration changes and some scripting to get the services to properly integrate with Mozilla’s other on-line tools. When the test services are installed and synched with the live source tree, Mozilla will point its developers to them and get feedback—the students will help collect and synthesize this feedback.

Project Leader(s)

John Ford (John64)

Project Contributor(s)

Options

  • Help with LXR/MXR/Bonsai development.
  • Make a sort-of branch respectful version of OpenGrok but this would be a very shoddy implementation that doesn't really do what it should
  • Setup one OpenGrok per active branch of the Mozilla Project, this would have no version history whatsoever, apart from file dates.
  • [Re]write major portions of how OpenGrok deals with history and changesets and the likes, this is my personal preference.
  • Try to fit Fisheye into the current development model, but it seems this might be more like finding a problem for a solution. This is a very very powerful tool, but it is not really like LXR or OpenGrok, it is more useful to analyze CVS/SVN histories more than search for functions, files, definitions and the likes.

Why I like OpenGrok

Apart from the fact that it does not support branches, this is in my opinion the perfect tool. It is fast, open souce and most importantly, it makes really easy to navigate, well thought out pages that just work. Because of the way OpenSolaris does file versions for their code, they don't use branches at all. OpenSolaris currently uses a linear method of file versioning, they don't use branches, they use versions as a sory of branch, basically the idea that Office 12 is the "2007 branch" and Office 11 is the "2003 branch". Mozilla doesn't do this, so it would be nessecary to implement this feature. Luckily, however, OpenGrok is very modularized and atomic in nature. If you go to the OpenGrok page, you can get a more complete explanation, but the basic jist of it is that there are many "Guru's", each with a task. The files are first read by the History Guru who looks at the file and decides what type of versioning the file uses. Once the versions have been analyzed, they are passed on to the file analyzer guru who then decides what type of file it is, and passes it on to a file type analyzer. The allows for portions of the code to be changed without changing the whole system, so if we wanted to be able to do special things with XUL/XPCOM as far as how to handle its symbols, we would write one module which is not dependent at all on any other file analyzer. The same way, if Mozilla switchs to SVN, we would just port the branching support to SVN. On the chances that Mozilla switches to something other than CVS or SVN, a HistoryGuru could be written for that type of versioning history.

In closing, I really like the OpenGrok project because it is very fast, very powerful and VERY modular!

Links

Pulling CVS

This code will pull the CVS for the branches specified in @branches

#!/usr/bin/perl
use strict;
use warnings;

# Pull CVS from the mozilla project server
my $src_root = "/var/mozilla";

# Where you want to you the HTML for the branch info
# This could be SSI'd, but that is a lot of work to set
# up with the ubuntu packages
my $branchlistpath = "${src_root}/branches.html";

# Where is your run.sh for opengrok? (or equivalent script to start the indexer)
my $opengroker = "/var/opengrok/run.sh";

# Where is your server?
my $cvsserver = ':pserver:anonymous@hera.senecac.on.ca:/cvsroot';

# Branches to be pulled
my @branches = (
  "HEAD",
  "MOZILLA_1_8_BRANCH", 
  "MOZILLA_1_8_0_BRANCH", 
  "AVIARY_1_0_1_20050124_BRANCH",
  "REFLOW_20061031_BRANCH" 
);

# Descriptions for each branch, don't delete old ones for the sake of deleting them
my %descriptions = (
  "HEAD" => "Trunk - development branch",
  "MOZILLA_1_8_BRANCH" => "Firefox 2.0 - development branch",
  "MOZILLA_1_8_0_BRANCH" => "Firefox 1.5 - maintainance branch",
  "MOZILLA_1_7_BRANCH" => "Firefox 1.0 - maintainance branch",
  "AVIARY_1_0_1_20050124_BRANCH" => "Suite - maintainance branch",
  "REFLOW_20061031_BRANCH" => "Reflow Refactoring"
);

# Open the file or
open BRANCHLIST, ">$branchlistpath" or die "Could not open file";

# Number of branches, twice, once for the line, once for the description
my $branchcount = ($#{branches} + 1) * 2;

# Print the begining of the select block  
print BRANCHLIST "<select class=\"q\" name=\"branch\" size=\"${branchcount}\" value=\"\" onClick=\"document.cookie = document.sbox.branch.selectedIndex;\"  />\n";

# Clear out what ever source was there
system ("rm -rf ${src_root}/*");

foreach (@branches){
# Download the makefile, then checkout from the makefile
  system("
    mkdir ${src_root}/$_;
    cd ${src_root}/$_;
    cvs -d ${cvsserver} co -r $_ mozilla/client.mk;
    make -f ${src_root}/${_}/mozilla/client.mk checkout MOZ_CO_PROJECT=all;
  ");
  # This is for an attempt at doing something, it is mostly useless
  print BRANCHLIST "\t<optgroup label=\"$descriptions{$_}\" />\n";
  print BRANCHLIST "\t<option> $_ </option>\n"; 
}

print BRANCHLIST "</select>\n";

system ("bash $opengroker");