
User:Abhishekbh/FSOSS 11

623 bytes added, 17:03, 20 January 2012


OSCON or the O'Reilly Open Source Convention is an annual event organized by the well known O'Reilly Media. The company runs a few other conferences as well, but this is the only one that focuses on open source software and open data. It brings together experts, community leaders, and hackers to discuss community issues and new ideas.

I ended up watching six videos from OSCON 2011's [http://www.youtube.com/view_play_list?p=93FC98105B19725C YouTube playlist], these being:

* [http://www.youtube.com/watch?v=vKmQW_Nkfk8 Steve Yegge's What would you do with your own Google?]

In this particular talk, he stresses the importance of doing something “important” in your career. He puts this succinctly on a slide with “Always work on stuff you love!”. He laments that far too many talented programmers are busy working on software that solves unimportant problems, such as building platforms to view “cat pictures”. He suggests that important problems, like those pursued by the Human Genome Project, are instead going unheeded. He sees a folly in those programmers who spend their careers simply doing what they already know, instead of learning and solving new problems. He calls this “mercenary programming”. He says that instead of focusing on Farmville, cat picture platforms and general corporate programming, engineers should look to “social-minded and innovative problem-solving.”

His talk reminded me of a quote by Richard Hamming, the author of The Art of Doing Science and Engineering:

"In science, if you know what you are doing, you should not be doing it. In engineering, if you do not know what you are doing, you should not be doing it. Of course, you seldom, if ever, see either pure state."

Essentially Yegge is asking engineers to be scientists as well. As software finds higher integration in our daily lives, universities and colleges are turning out more engineers than scientists, primarily to serve an industrial demand. This scenario creates the problem that Yegge is describing: that a plenitude of programmers is busy solving trivial problems, and hence the potential of computing is not being met.

He admits that he does not fully practice what he preaches, but says he is taking concrete steps to change that. To show he is serious, he actually quits his job on stage, which is also the first his boss hears of it. He says that he had just recently signed up for a “cat picture project”, by which he likely means Google+, but soon found himself disillusioned with it enough to want to quit. He then announces that now, once a week, he sits down with his wife to “study” - to learn something new, or to read books he hasn't read before. He urges his audience to do the same: to dedicate at least a few hours every week to reeducating themselves and taking an interest in important problems. One of his slides talks about “starting a culture change”, which refers to self-education in “infrastructure and scaling”, and “math, data mining and bioinformatics” to solve these important problems.

He notes that while most organizations today are interested in building easy-to-sell software which has little depth, there are some (such as O'Reilly Books) which do put a stress on what he calls important problems. He suggests that since the current conference is about open source software (rather than something like an iPad conference), he believes that the people in attendance have an interest in working on the big problems, but they need to break out of the “code mercenary” lifestyle to embrace this fully.

According to him, if we had the source code for the human genome, then we could easily discern cures for medical conditions like cancer. So the “open source” he talks about is a metaphorical one, in which information is available freely and without restrictions. He says that if you look at all the open source code in the world as data, then it would come down to perhaps a couple of terabytes, which is apparently smaller than some of the larger data sets at Google. But the problems that require open solutions are many.

Though this talk was fairly high level, it helped me affirm two important points about my own view of the open source world:

* Most open source projects do provide a developer with an opportunity to work on “important” problems. The large majority of corporate programming happens to be on software with a short life-span which, other than making the rounds for a couple of years, won't contribute much to the world.

* 'Free as in freedom' is a fundamental concept of the Free and Open Source movement. As Yegge mentions, solving hard problems requires some cross-training in expertise, and hence software needs to come without a prescribed usage.

==David Eaves' “Saving Open Source Communities With Data”==

* making sure there is a low transaction cost in getting them started and contributing

He does admit, however, that GitHub has been a saviour of sorts for the open source world. He says that 'forking' used to be a bad word before GitHub because forked projects would often lead to split loyalties and dwindling interest. But with GitHub, since the process of forking is very simple and requires negligible investment, that risk goes away and experimentation becomes possible instead; a majority of forks can die, but progress will still be made. GitHub also lowers the transaction cost required to get a new developer's code into a repository, be it the central one or a fork. Further, since the process of forking also brings ownership with it, one no longer needs permission before beginning to experiment and contribute test code.

To the point of using community data as a diagnostic tool, he says that by monitoring individual statistics - such as number of commits, number of commits merged, last commit time, etc - a community can gauge its own health level at any given time. If too many of its developers have not made any commits in a long period of time, then that might be a symptom of an inefficiency that lies in the project standards. Eaves is a Mozilla Developer, and is involved with a project that adds a developer dashboard to Bugzilla, which would carry these statistics.
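As a rough illustration of the kind of check described above, here is a minimal Python sketch that flags contributors who have not committed in a long time. The contributor names, the dates, the 90-day threshold, and the function names are all made up for this example; they do not come from Eaves' talk or from the Bugzilla dashboard.

```python
from datetime import date, timedelta

# Hypothetical data: contributor -> date of their last commit.
last_commit = {
    "alice": date(2011, 7, 20),
    "bob": date(2011, 3, 2),
    "carol": date(2010, 11, 15),
}


def inactive_contributors(last_commit, today, max_idle_days=90):
    """Return sorted names of contributors idle for more than max_idle_days."""
    cutoff = today - timedelta(days=max_idle_days)
    return sorted(name for name, when in last_commit.items() if when < cutoff)


def inactive_share(last_commit, today, max_idle_days=90):
    """Return the fraction of contributors who are inactive (0.0 if none are known)."""
    if not last_commit:
        return 0.0
    idle = inactive_contributors(last_commit, today, max_idle_days)
    return len(idle) / len(last_commit)


if __name__ == "__main__":
    today = date(2011, 7, 28)
    print(inactive_contributors(last_commit, today))
    print(inactive_share(last_commit, today))
```

A high inactive share would not prove anything by itself, but it is the sort of symptom that could prompt a community to look at its own processes.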

I tried searching for a live version of this dashboard hosted online, but was only able to find a screen-shot of it, viewable [http://eaves.ca/wp-content/uploads/2011/04/main-board-everything1.png here]. Some aspects of it seem similar to the statistics that GitHub offers per repository, though they can be hard to understand for larger projects.

He also suggests that such data would allow a community to better understand itself from a developer's perspective. An example he gives concerns the unknown period of time a developer has to wait after submitting a patch before it gets reviewed. Since this process is not standardized, it often leads to frustration, which can turn into people quitting. According to Eaves, average wait times will become self-apparent, and if they begin to slide in individual cases, then moderators could be chastised for their delays. In this way, community data can be used for introspection to find efficiencies and standards.
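The wait-time idea can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the patch records, the field layout, and the rule that a patch is "slow" when it waits more than twice the average are all hypothetical and are not taken from the talk.

```python
from datetime import datetime
from statistics import mean

# Hypothetical patch records: (patch id, time submitted, time of first review).
patches = [
    ("p1", datetime(2011, 7, 1, 9), datetime(2011, 7, 2, 9)),
    ("p2", datetime(2011, 7, 3, 9), datetime(2011, 7, 5, 9)),
    ("p3", datetime(2011, 7, 4, 9), datetime(2011, 7, 14, 9)),
]


def wait_hours(submitted, reviewed):
    """Hours between a patch being submitted and first being reviewed."""
    return (reviewed - submitted).total_seconds() / 3600


def review_report(patches, slow_factor=2.0):
    """Return (average wait in hours, sorted ids of patches that waited
    more than slow_factor times the average)."""
    waits = {pid: wait_hours(s, r) for pid, s, r in patches}
    if not waits:
        return 0.0, []
    average = mean(list(waits.values()))
    slow = sorted(pid for pid, h in waits.items() if h > slow_factor * average)
    return average, slow


if __name__ == "__main__":
    average, slow = review_report(patches)
    print(f"average wait: {average:.1f} hours; slow patches: {slow}")
```

Once the average is visible to everyone, a patch that waits far longer than usual stands out on its own, which is the introspection Eaves is after.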

Finally, Eaves gives an example of how open data has helped a client of his in the open government model. The city of Los Angeles recently made restaurant inspection data open, and required it to be posted on every establishment's door (much like in Toronto). According to Eaves, this led to restaurants with poor records receiving fewer customers and those with better records receiving more. In other words, good restaurants were rewarded and poor restaurants punished. He also noted that there has been a decline in the number of patients who visit the emergency room with food poisoning, a fact that he says is likely grounded in this freeing of information.

Other than his role in open government, I particularly found Eaves' analysis of the challenges of open source communities interesting. Lately I have been really buying into the idea that an open source project needs not only the relevant licenses to be efficient, but also a community. For example, Google's Chromium Browser is an open source project released largely under a BSD license. However, it is still largely developed behind closed doors, with the code dumped externally afterwards under an open source license. This is in contrast to a project like Mozilla Firefox, which not only has an open source license, but also a large and thriving community working on it. The former has a relatively small community of developers (mostly Google employees), and is hence largely unknown to the public as an open source project. Eaves' analysis speaks to the mechanics of that community building and can perhaps be used to see why certain projects out there are more popular than others.

==Patrick Curran's “Who Needs Standards?”==

As I have been learning more about open source over the past year, I have found myself confused and conflicted by a few discoveries. One of these came a few months ago when I discovered a website selling Open Office for $20 per digital download. The website had a table comparing feature lists between Microsoft Office and Open Office (fairly similar), and then compared their costs: some $270 for the former, and their price of $20 for the latter. My first instinct upon visiting this site was that it was malicious in its intent, and I thought of reporting it to the Free Software Foundation. After a little more research, however, I learned that the website was not violating any rules of Open Office's LGPL and Apache licenses - they were distributing the software with the source code and original licenses included. After learning that it was not illegal to sell open source software, I realized that I hardly understood what it meant or stood for.

This was when I began to realize that the 'free' in Free and Open Source Software could mean either 'free as in freedom' or 'free as in gratis'. None of the videos I watched from OSCON 2011 spoke about this subject explicitly; however, I did see that difference in the meaning of the word 'free' in context here. It was a relief to know that I at least have that basic understanding now, and can see it being directly applied in industry.

Further, I came across three uses of the word 'open' in this conference: open source, open data and open standards. The term 'open' is often used in popular Internet culture without being quite pinned to any of those three, and this can make things quite confusing. However, the speakers in these talks helped me reinforce that separation of meaning with their contextual uses of them.

I have also taken a particular liking to Steve Yegge, whose blog posts I've been catching up on since. Again, while not explicitly preaching open source, he does incorporate it in his overall message. For me, it is gratifying to know that such leadership exists out there, and that open source is more than a movement for just sans cost software - it is one for quality software and an environment in which quality software can be produced.

I'll be sure to follow up on this conference in the future.

Abhishekbh
