Code Indexer

Project Name

Source Code Indexing Service Analysis

Project Description

Mozilla’s source code is enormous—millions of lines of C, C++, JavaScript, Perl, Python, Java, C#, etc. Developers currently use the lxr system to quickly search and browse it on-line: http://lxr.mozilla.org. Mozilla is planning a move from CVS to Subversion for revision control, and at the same time wants to evaluate other source indexing services. Two BSD students are working to setup, document, and test other potential services (e.g., fisheye, opengrok, mxr) on one of the Seneca-Mozilla servers (see below). In each case this requires configuration changes and some scripting to get the services to properly integrate with Mozilla’s other on-line tools. When the test services are installed and synched with the live source tree, Mozilla will point its developers to them and get feedback—the students will help collect and synthesize this feedback.

Project Leader(s)

John Ford (John64)

Project Contributor(s)

Status

I have too many assignments. I need to do work on them, but I should have more free time during the Christmas break.

Options

Help with LXR/MXR/Bonsai development.
Make a sort-of branch respectful version of OpenGrok but this would be a very shoddy implementation that doesn't really do what it should
Setup one OpenGrok per active branch of the Mozilla Project, this would have no version history whatsoever, apart from file dates.
[Re]write major portions of how OpenGrok deals with history and changesets and the likes, this is my personal preference.
Try to fit Fisheye into the current development model, but it seems this might be more like finding a problem for a solution. This is a very very powerful tool, but it is not really like LXR or OpenGrok, it is more useful to analyze CVS/SVN histories more than search for functions, files, definitions and the likes. Fisheye is also extremely slow. Within my lan, it takes a long time to do any queries, and over the internet it is impossible. I don't know why, especially since OpenGrok uses the same basic technologies. (10 minutes plus for one page load on a lan connection, in its defence, it is still indexing the code)

Why I like OpenGrok

Apart from the fact that it does not support branches, this is in my opinion the perfect tool. It is fast, open souce and most importantly, it makes really easy to navigate, well thought out pages that just work. Because of the way OpenSolaris does file versions for their code, they don't use branches at all. OpenSolaris currently uses a linear method of file versioning, they don't use branches, they use versions as a sory of branch, basically the idea that Office 12 is the "2007 branch" and Office 11 is the "2003 branch". Mozilla doesn't do this, so it would be nessecary to implement this feature. Luckily, however, OpenGrok is very modularized and atomic in nature. If you go to the OpenGrok page, you can get a more complete explanation, but the basic jist of it is that there are many "Guru's", each with a task. The files are first read by the History Guru who looks at the file and decides what type of versioning the file uses. Once the versions have been analyzed, they are passed on to the file analyzer guru who then decides what type of file it is, and passes it on to a file type analyzer. The allows for portions of the code to be changed without changing the whole system, so if we wanted to be able to do special things with XUL/XPCOM as far as how to handle its symbols, we would write one module which is not dependent at all on any other file analyzer. The same way, if Mozilla switchs to SVN, we would just port the branching support to SVN. On the chances that Mozilla switches to something other than CVS or SVN, a HistoryGuru could be written for that type of versioning history. The OpenGrok project is under the CDDL which derives from the MPL 1.1

In closing, I really like the OpenGrok project because it is very fast, very powerful and VERY modular!

Links

Official Blurb just in case I forget what I am doing :P
John's School Page
CVS2SVN Tool to convert CVS to SVN. Will be used to test SVN interop.
Devmo:CVS Checkout
Devmo: Rsyncing the CVS
Devmo: CVS Tags - to get the branches to checkout
Tomcat5 on Ubuntu
Tomcat Tips
CVS on non-gnu.org
CVS "Guide"
Subversion
Blog Entry on OpenGrok
Java5 on Ubuntu - "sudo update-alternatives --config java" and "apt-get remove --purge java-gcj-compat"
Mozilla Jargon
Misc
CDDL - explanation of the diferences between the MPL and CDDL

Pulling CVS

This code will pull the CVS for the branches specified in @branches, or it did at some point, your mileage may vary

#!/usr/bin/perl
use strict;
use warnings;

# Pull CVS from the mozilla project server
# Where you want the branch folders
my $src_root = "/var/mozilla";

# Where is your run.sh for opengrok? (or equivalent script to start the indexer)
my $opengroker = "/var/opengrok/run.sh";

# Where is your server?
my $cvsserver = ':pserver:anonymous@hera.senecac.on.ca:/cvsroot';

# Branches to be pulled
my @branches = (
  "HEAD",
  "MOZILLA_1_8_BRANCH", 
  "MOZILLA_1_8_0_BRANCH", 
  "AVIARY_1_0_1_20050124_BRANCH",
  "REFLOW_20061031_BRANCH" 
);

# Descriptions for each branch, don't delete old ones for the sake of deleting them
my %descriptions = (
  "HEAD" => "Trunk - development branch",
  "MOZILLA_1_8_BRANCH" => "Firefox 2.0 - development branch",
  "MOZILLA_1_8_0_BRANCH" => "Firefox 1.5 - maintainance branch",
  "MOZILLA_1_7_BRANCH" => "Firefox 1.0 - maintainance branch",
  "AVIARY_1_0_1_20050124_BRANCH" => "Suite - maintainance branch",
  "REFLOW_20061031_BRANCH" => "Reflow Refactoring"
);

# Open the file or
open BRANCHLIST, ">$branchlistpath" or die "Could not open file";

# Clear out what ever source was there
system ("rm -rf ${src_root}/*");

foreach (@branches){
# Download the makefile, then checkout from the makefile
  system("
    mkdir ${src_root}/$_;
    cd ${src_root}/$_;
    cvs -d ${cvsserver} co -r $_ mozilla/client.mk;
    make -f ${src_root}/${_}/mozilla/client.mk checkout MOZ_CO_PROJECT=all;
  ");
}

system ("bash $opengroker");

CDOT Wiki β