User:Dvillase/FSOSS 2011

From CDOT Wiki

Latest revision as of 21:42, 4 November 2011


Introduction

I attended FSOSS 2011 on Saturday, October 29 and saw two presentations. The first was on running a build farm using ARM processors and the Fedora operating system, presented by Seneca students Jordan Cwang, Anthony Boccia, and Jon Chiappetta, who are currently running Seneca’s own ARM-based server farm. Their presentation focused on how to set up your own ARM-based server farm using open source software. The second was Multilingual Sites and Translation Management in Drupal, presented by Anas Tawileh, director of Systematics Consulting, a Toronto-based company that provides consulting on open source and information security. His presentation covered methods for creating sites that support multiple languages in Drupal, an open source content management system.

Running a Build Farm with Fedora and ARM

This presentation was given by Jordan Cwang, Anthony Boccia, and Jon Chiappetta, who are Seneca students. Unfortunately, that and the fact that they are currently running an ARM-based server farm at Seneca as part of an open source research project is all that I know about them.

Their presentation covered setting up and running a server farm designed to compile applications remotely. Unlike most server farms, this one uses ARM hardware, which is not only cheap but also draws less power than regular servers (ARM processors are essentially the same hardware used in mobile devices). It runs the open source Fedora OS and uses free, open source applications such as YUM, Koji, Styrene and Moji.

The general gist of their presentation is that ARM hardware has become powerful enough that it is, more than ever, feasible for anyone to set up a server farm without spending an exorbitant amount of money on buying the hardware, setting it up and keeping it running, provided they have the technical know-how of course.

Their presentation focused on how to implement this using open source Fedora to run the build farm, similar to what they did when creating Seneca’s own ARM-based server farm as part of an open source project.

The architecture they proposed uses YUM (Yellowdog Updater, Modified), an open source command-line package management utility for RPM-compatible Linux operating systems released under the GNU General Public License, to provide the environment for building the applications.

Koji would be the next tier in the system, managing the contents of the farm. The tier after that would use Styrene to handle error checking as well as tracking the build status.

I found this presentation fairly interesting, and though I got lost on some of the finer technical details I pretty much understood the system architecture that they were proposing. All of the work of managing the content, managing the builds themselves and providing the environment would be done by free, open source software. Granted, I most likely will not be doing anything like this anytime soon. As cheap as ARM hardware probably is, I definitely don’t see setting up something like this from scratch costing anything less than a few hundred to a couple of thousand dollars, not to mention that the sheer technical know-how required to set this up is currently way beyond my level of understanding. Still, something like this would be invaluable to small companies and organizations, to a small group of students and academics, or really to anyone hoping to work with their own server farm who needs a cost-effective means of doing so.

Multilingual Sites and Translation Management in Drupal

The next presentation I saw, by Anas Tawileh, dealt with translation management of multilingual websites in Drupal and was a bit of an eye opener. I know well enough that there are quite a few non-English websites out there and that not everyone in the world knows English. What was surprising was the ratio of people on the internet who can speak English to those who don’t speak it at all (only 1 out of 6 internet users worldwide can read, write and/or speak English). This obviously means that quite a few websites out there are written in their owner’s native language rather than in English and cater to the speakers of that language, which in turn means that the language barriers that exist in the real world also exist on the internet. What Drupal translation management hopes to do is tear down this barrier by providing a quick, easy and cost-effective means of translating websites.

Anas outlined a key hurdle in handling translations online. Currently there are only two main ways of creating a multilingual website. The first involves hiring someone to create multiple versions of the same website in different languages, which is what is done for corporate websites. The second involves using a translation tool to automatically translate the website. Each of these methods has its own advantages and disadvantages. The first method is more costly in time, effort and money, though it provides a highly accurate translation of the website. The second method, because it is automated, tends to be cheaper and faster, but its translation accuracy generally ranges from poor to just barely adequate. What Anas proposed was a system that combines the two methods of translation. An automated system would carry out the initial translation of a webpage while a human user would correct any mistakes made by the automated translator. Doing so would take advantage of the strengths of the two methods while mitigating their faults.
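To make the combined approach a little more concrete, here is a minimal sketch in Python of a machine-first, human-corrected workflow. None of this comes from the presentation or from Drupal itself: the `Translation` class, the `translate_page` and `review` functions, and the toy glossary translator are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Translation:
    source: str                      # original text
    machine_draft: str               # output of the automated translator
    human_fix: Optional[str] = None  # reviewer's correction, if any

    @property
    def final(self) -> str:
        # A human correction, when present, overrides the machine draft.
        return self.human_fix if self.human_fix is not None else self.machine_draft


def translate_page(paragraphs: List[str],
                   machine_translate: Callable[[str], str]) -> List[Translation]:
    """Step 1: produce a machine draft for every paragraph."""
    return [Translation(p, machine_translate(p)) for p in paragraphs]


def review(translations: List[Translation],
           corrections: Dict[int, str]) -> List[Translation]:
    """Step 2: apply human corrections, keyed by paragraph index."""
    for index, fixed in corrections.items():
        translations[index].human_fix = fixed
    return translations


def toy_translator(text: str) -> str:
    # Hypothetical stand-in for a real machine translation service:
    # it only knows two words and passes everything else through.
    glossary = {"hello": "bonjour", "world": "monde"}
    return " ".join(glossary.get(word, word) for word in text.lower().split())


drafts = translate_page(["Hello world", "Open source"], toy_translator)
reviewed = review(drafts, {1: "code source ouvert"})
final_text = [t.final for t in reviewed]
```

The point of the design is that the human only touches the paragraphs the machine got wrong, so most of the cost of a full manual translation is avoided while the final text still gets a human check.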

The second half of his presentation dealt with how Drupal handles the automated translation of pages, and I have to admit that most of what he talked about went over my head. The only thing I understood was that Drupal’s translator, in the process of translating a website, breaks the website down into two different components and assigns each one to a separate domain. The first one, L10N (short for localization), deals with the website’s localization information and mainly with (I think) how the content and characters are presented on screen. For instance, Arabic and Hebrew text is read from right to left rather than left to right as in English and most Indo-European languages. Taking care of that language-based layout of the text would be one of the responsibilities of the L10N domain. The second domain, I18n (short for internationalization), handles translating the actual content itself. He did mention that Drupal currently doesn’t support translating the strings in a website, though I wasn’t sure what he meant by this. If the I18n domain is already translating the website’s content, then wouldn’t it already be translating the strings as well?
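As a rough illustration of what language-based layout handling can look like, here is a small Python sketch that picks a text direction from a language code. It is not Drupal’s API and not anything shown in the talk; the `RTL_LANGUAGES` set (deliberately incomplete) and the `text_direction` and `wrap_paragraph` functions are assumptions of mine, relying only on the fact that languages such as Arabic, Hebrew, Persian and Urdu are written right to left.

```python
# A deliberately incomplete set of right-to-left languages (ISO 639-1 codes).
RTL_LANGUAGES = {"ar", "he", "fa", "ur"}


def text_direction(lang_code: str) -> str:
    """Return 'rtl' or 'ltr' for a language tag such as 'ar' or 'he-IL'."""
    # Only the primary subtag matters here, so 'he-IL' is treated as 'he'.
    primary = lang_code.lower().split("-")[0]
    return "rtl" if primary in RTL_LANGUAGES else "ltr"


def wrap_paragraph(lang_code: str, text: str) -> str:
    """Wrap text in an HTML paragraph carrying the lang and dir attributes."""
    return f'<p lang="{lang_code}" dir="{text_direction(lang_code)}">{text}</p>'


english = wrap_paragraph("en", "Hello")
arabic = wrap_paragraph("ar", "مرحبا")
```

The translated words themselves are untouched by this step; only the presentation changes, which is how I understood the split between the two domains.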

My impression of Anas’ view of open source is that it can be used to develop the tools necessary to make the internet truly global, and by truly global I don’t just mean that everyone has access to it but that everyone has access to the content and information on people’s websites from around the world. The language gap is one of the elements that still restricts people’s access to information on the web. By creating a tool that can quickly, cheaply and accurately translate the websites of people around the world, that barrier can be greatly weakened if not removed altogether.

General Impression

The presentations were quite well done and were overall easy to grasp, though they did get overly technical at some points and as a result what they were saying ended up going over my head. As to how my impression of open source has changed after going to FSOSS, it’s hard to say. I knew that there are quite a few open source projects out there, but I had often held the impression that the vast majority of them were small-scale, stand-alone projects that people implemented in their free time, either as an academic exercise or just for the sheer fun of coding an application, with only a small minority of large, complex projects. FSOSS was an eye opener in that I now realize that there are far more large, complex projects out there and that these projects are highly ambitious in what they hope to accomplish. Anas’ Drupal translator would lift a great deal of the restrictions on what content one can view on the internet by providing a means to read and understand sites written in languages that the reader is unfamiliar with, while Jordan, Anthony and Jon’s presentation on ARM-based server farms provides the know-how that would enable just about anyone to set up their own server farm without having to spend tens of thousands of dollars on both the actual hardware and on maintaining it.

Conclusion

I have to admit, I did end up attending the FSOSS event somewhat begrudgingly. I had gone into study week with far more than enough work to keep me occupied, and the prospect of spending a few hours attending an event rather than finishing up my work or, preferably, catching up on some much needed sleep wasn’t something that I was too crazy about. In retrospect, though, I am glad that I went. The event opened up a new level of understanding of what kinds of open source projects are out there that I can, hopefully someday, participate in once I get a bit more time on my hands. Here’s looking forward to FSOSS 2012. Hopefully I won’t be so swamped with work and can participate on all three days.