User:Mohamed Baig/FSOSS Report 13

MongoDB: Advantages of an Open Source NoSQL Database

This talk, presented by Kevin Cearns, was about MongoDB and the advantages of this open source NoSQL database.

MongoDB is a cross-platform, document-oriented database system. It forgoes the common relational, table-based structure in favour of JSON-like documents stored in a format called BSON. The database was released under a combination of the GNU AGPL and the Apache License.

MongoDB is classified as a NoSQL database. This means that the interaction between an application and the database does not require SQL syntax, and the database itself does not use the relational model that traditional databases do. Carlo Strozzi, who used the name NoSQL in 1998 for his relational database that did not expose an SQL interface, suggests that NoREL be used for the current non-relational databases instead, because they do not use the relational model at all. The motivations for using a non-relational database include simpler designs, horizontal scaling, and better control over availability. Kevin also talked about data availability: in MongoDB the data is stored as it would be presented to the user rather than in separate tables, as in a traditional relational database. These databases are growing in big data and real-time web applications because they provide lower latency and greater throughput.
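To make the point about storing data "as it would be presented" more concrete, here is a minimal sketch in plain Python. It does not use MongoDB at all. The names `users`, `orders`, `assemble_user`, and `user_document` are made up for illustration, and the data is hypothetical.

```python
import json

# Relational style: one logical record is split across two tables
# (hypothetical data, represented here as lists of rows).
users = [{"user_id": 1, "name": "Ada"}]
orders = [
    {"order_id": 10, "user_id": 1, "item": "keyboard"},
    {"order_id": 11, "user_id": 1, "item": "mouse"},
]


def assemble_user(user_id):
    """Join the two tables by hand to rebuild what the user will see."""
    user = next(u for u in users if u["user_id"] == user_id)
    items = [o["item"] for o in orders if o["user_id"] == user_id]
    return {"name": user["name"], "orders": items}


# Document style: the record is stored in the shape it is displayed in,
# so no join is needed before presenting it.
user_document = {"name": "Ada", "orders": ["keyboard", "mouse"]}

print(json.dumps(user_document))
```

Both approaches end up with the same data for display. The difference is that the relational version has to be reassembled on every read, while the document version is already in its final shape.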

Kevin began with MongoDB's background. It started in 2007 as part of a larger project that was intended to be a platform as a service, similar to Windows Azure. The company behind the project, 10gen (now MongoDB Inc.), recognized the importance and advantages of the database itself and decided to focus solely on it. In 2009 they open sourced the code for the database and got the community involved in the project. As mentioned before, it was open sourced under the AGPL license. The code itself is written in C++, and Kevin mentioned that the reason for its speed is the use of memory-mapped files. MongoDB is compatible with all major operating systems (Linux, Windows, and OS X), although Linux is the biggest install base for the database. As mentioned before, the data is stored in BSON format, and the database supports primary and secondary indexes.

The features of MongoDB include ad-hoc queries, indexing, replication, load balancing, aggregation, and capped collections.

  • Ad-hoc queries: MongoDB supports returning specific fields, range queries, and regular expression searches, and queries can also include user-defined JavaScript functions.
  • Indexing: MongoDB supports indexing of any field in the document. The _id field is automatically indexed. There is also support for secondary indices.
  • Replication: MongoDB uses master-slave replication, where the master handles both reads and writes but the slaves cannot accept writes. The slaves are mainly used for backing up data from the master and for serving reads.
  • Load Balancing: Referred to as sharding, this feature allows MongoDB to split up the data from a single database across different physical locations. New machines, acting as shards, can be added to a running database.
  • Aggregation: This function is similar to the GROUP BY clause in SQL queries.
  • Capped Collections: These are fixed-size collections that keep their data in insertion order. Once the maximum size is reached, the oldest entries are overwritten by new ones, much like a circular buffer.
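A few of the features above can be imitated in plain Python to show the ideas behind them. This is only a sketch and not MongoDB's API: the helpers `find` and `count_by` and the sample documents are hypothetical, and the capped collection is modelled on the assumption that it behaves like a circular buffer that discards its oldest entry when full.

```python
import re
from collections import defaultdict, deque

# Hypothetical documents; in MongoDB these would live in a collection.
docs = [
    {"_id": 1, "name": "alice", "age": 31, "city": "Toronto"},
    {"_id": 2, "name": "bob", "age": 24, "city": "Ottawa"},
    {"_id": 3, "name": "alan", "age": 45, "city": "Toronto"},
]


def find(collection, predicate, fields):
    """Ad-hoc query: filter with any predicate, return only the named fields."""
    return [{f: d[f] for f in fields} for d in collection if predicate(d)]


# A range query on one field combined with a regular expression on another.
matches = find(
    docs,
    lambda d: 30 <= d["age"] <= 50 and re.match(r"^al", d["name"]),
    fields=["name"],
)


def count_by(collection, key):
    """Aggregation: count documents per value of `key`, like GROUP BY."""
    counts = defaultdict(int)
    for d in collection:
        counts[d[key]] += 1
    return dict(counts)


per_city = count_by(docs, "city")

# Capped collection: a fixed-size buffer that drops the oldest entry when full.
capped = deque(maxlen=3)
for event in ["e1", "e2", "e3", "e4"]:
    capped.append(event)

print(matches, per_city, list(capped))
```

In the last step the fourth event pushes out the first one, so the buffer never grows past three entries.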

Kevin's main points about the database included its scalability, its ease of use, its stability, and its support by the developers and the community.

Kevin mentioned in his talk that MongoDB is easily scalable. He spoke of organizations like eBay, Craigslist, and CERN that use MongoDB and handle massive amounts of data on a daily basis. He also demonstrated how the database can be set up with different shards and share data between them. He spoke about MongoDB's ability to create a master-slave structure and use the slaves as data backup servers. He also demonstrated how the slave servers will choose a new master if the current master ever goes offline.
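The two ideas from the demonstration, splitting data across shards and promoting a new master when the old one fails, can be modelled with a toy example. This is a simplification and not how MongoDB actually distributes data or runs elections: the function `shard_for`, the class `ReplicaSet`, the node names, and the "lowest name wins" rule are all assumptions made for illustration.

```python
import zlib


def shard_for(key, shards):
    """Pick a shard by hashing the key (a simplification, not MongoDB's scheme)."""
    return shards[zlib.crc32(key.encode("utf-8")) % len(shards)]


class ReplicaSet:
    """Toy master-slave group: one master, the rest are slaves."""

    def __init__(self, members):
        self.online = list(members)
        self.master = self.online[0]

    def fail(self, member):
        """Take a member offline; the survivors promote a new master if needed."""
        self.online.remove(member)
        if member == self.master:
            # Hypothetical rule: the lowest name wins. Real elections are more involved.
            self.master = min(self.online) if self.online else None


shards = ["shard-0", "shard-1", "shard-2"]
print(shard_for("user:42", shards))

replicas = ReplicaSet(["node-a", "node-b", "node-c"])
replicas.fail("node-a")  # the current master goes offline
print(replicas.master)
```

The same key always lands on the same shard, and the group keeps a master as long as at least one member is still online.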

Kevin spoke about MongoDB's ease of use as well, and demonstrated it during the presentation. Everything from getting it installed to getting it up and running was easily done. He showed this by creating three replicas on his local machine in a matter of minutes. He also demonstrated data entry and showed that no explicit table creation is needed to enter data; the table is created automatically. Getting data out is just as easy with JavaScript functions, and since the data is presented in JSON format, displaying it is just a matter of parsing JSON. Since there are many libraries for parsing JSON, displaying the data is also easy.
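The "no table creation" behaviour and the JSON round trip can be sketched in a few lines of Python. This is a toy in-memory stand-in, not MongoDB or one of its drivers. The names `database` and `insert` and the sample records are hypothetical.

```python
import json
from collections import defaultdict

# A toy in-memory "database": a collection springs into existence on first insert.
database = defaultdict(list)


def insert(collection_name, document):
    """Store a document; the collection is created automatically if missing."""
    database[collection_name].append(document)


insert("people", {"name": "Ada", "languages": ["C++", "JavaScript"]})
# The second document has different fields, and no schema change is required.
insert("people", {"name": "Linus", "project": "Linux"})

# The data comes back as JSON text that any JSON library can parse.
as_json = json.dumps(database["people"])
parsed = json.loads(as_json)
for person in parsed:
    print(person["name"])
```

Nothing had to be declared before the first insert, and the two documents do not need to share the same fields.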

Kevin also mentioned the stability of the database. As mentioned before, the database is used by huge organizations (eBay, CERN) that process large amounts of data, and MongoDB is growing in the big data market. To achieve this it needs to be very stable, and it is. Kevin mentioned that he was once given a few weeks to finish a project for which he had to use MongoDB. He had never used it before and had to use it to store and process large datasets. He said he finished the project and had 99.99% uptime for it. He said this was achieved by using the master-slave model, with the slaves backing up the data and becoming active when necessary.

Kevin's final point was about the support that is available for MongoDB from the developers and the community. He talked about the documentation of the database itself, created by the contributors to the project, and how it is far superior to that of other open source projects that are this new. He also mentioned the various documentation and education projects started by the community. For instance, https://education.mongodb.com is a resource for learning to use MongoDB with various technologies. It provides a course for application developers and another for database administrators, as well as courses for Java developers and for using MongoDB with Node.js.

Kevin spoke about how he was introduced to MongoDB and how he came to use it. He said he was part of an operations company that was assigned a project from Walmart. He said that he was given a few weeks to build this production website while using MongoDB for the first time. He had to use MongoDB because the schema kept changing based on new data. Kevin said he was pleasantly surprised at how easy it was to get going with MongoDB. He said there wasn't a lot of setup involved with the database, and once he got it running it had an uptime of 99.99% and was very stable the whole time.

Powered by ARM

This talk, presented by Andrew Greene and Christopher Markieta, was about what the ARM architecture brings to the computing world, including its advantages and benefits.

ARM is a set of architectures for computer processors, developed in Britain and based on the RISC (Reduced Instruction Set Computing) approach to computing. In contrast to standard desktop processors, ARM processors require significantly less power and produce less heat. For this reason they are most popular in small portable devices such as smart phones and tablets. However, their increased performance and cheap production costs have also opened the server and supercomputer markets for ARM.

ARM was first developed in the 1980s by a British company called Acorn Computers. While determined to create their own architecture, they were inspired by Berkeley's RISC project. In 1983, the Acorn RISC Machine project was officially started by Steve Furber and Sophie Wilson. The first ARM-based chip, the ARM1, was produced in 1985 by Acorn's silicon partner, VLSI Technology. A year later the next version of the processor, the ARM2, was produced. Two years after the first production ARM processor, the first ARM-based computer, the Acorn Archimedes, was created, as mentioned in the talk. Apple then joined Acorn to create newer versions of the ARM architecture. In 1990 Acorn spun off a separate ARM team as Advanced RISC Machines Ltd. Acorn also won the Queen's Award for Technology for the second time in 1992 for its work on ARM. The separate ARM team later became ARM Ltd.

Andrew started the talk by explaining what an ARM processor is, defining it as a RISC machine and describing some advantages of this approach. Giving some background, he mentioned that it started in Britain. He compared the two approaches to designing a processor, RISC and CISC. Where RISC focuses on fast computation and lower power usage, CISC (Complex Instruction Set Computing) concentrates on gaining speed by consuming more power and being able to perform more complex tasks. The CISC approach is mainly used in desktop processors made by Intel and AMD. Andrew then spoke about some advantages of the ARM architecture, which include smaller die size, faster development time, and smaller chip size. He then spoke of the MIPS (Million Instructions Per Second) per watt ratio, which is used to measure the performance of a chip, and said that the ARM architecture has the best ratio out there. He added that manufacturers use the MIPS per watt ratio to measure the cost of computing. Finally, he talked about the main selling point of the ARM architecture, low power consumption, which is the reason ARM is so popular in mobile computing.
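The MIPS-to-watt ratio is simple arithmetic: the number of instructions a chip executes per second (in millions) divided by the power it draws. The sketch below works through it in Python. The two chips and their figures are hypothetical numbers chosen only to illustrate the calculation; they are not measurements from the talk.

```python
def mips_per_watt(mips, watts):
    """Performance per unit of power: higher means more work per watt."""
    return mips / watts


# Hypothetical figures (MIPS, watts), chosen only to illustrate the arithmetic.
chips = {
    "low_power_chip": (2000, 2.0),
    "desktop_chip": (20000, 80.0),
}

ratios = {name: mips_per_watt(mips, watts) for name, (mips, watts) in chips.items()}
best = max(ratios, key=ratios.get)

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.0f} MIPS per watt")
```

With these made-up numbers the desktop chip is ten times faster in absolute terms, yet the low-power chip does four times as much work for every watt it consumes, which is the sense in which the ratio measures the cost of computing.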

Christopher Markieta started his part of the talk by mentioning the types of devices that use ARM processors, which include smart phones, cameras, GPS devices, tablets, etc. Next he spoke of ARM's upcoming processor designs, the Cortex A57 and A53. These new designs will be based on the ARMv8 architecture and will have both 32 and 64 bit capability. He mentioned that these will be the most efficient 64 bit platforms because of their energy and computing efficiency. Christopher then went on to mention a new configuration that ARM processors will support, the big.LITTLE configuration. This configuration will allow processors to switch dynamically between a lower power CPU for background tasks and a higher powered one for foreground tasks. Christopher compared the technology to the nVidia Optimus technology, which performs the same function. His final point was about the ARM company itself. He mentioned that ARM doesn't make the chips but designs them. ARM then licences the designs out to other companies such as Qualcomm, Samsung, Apple, and nVidia. He mentioned that the future of ARM processors is a bright one and that their next line of processor designs has already been licensed by the manufacturing companies.

Andrew's main points were the advantages of an ARM processor, and its future markets.

Andrew first spoke of the advantages of using an ARM based processor over any other architecture. He mentioned that an ARM processor is more efficient in terms of calculation and cheaper to produce than the other architectures. He spoke of the die that an ARM processor is made on and how, as the technology advances, the die size gets smaller and smaller and the chips get more powerful. He explained that the die is the wafer that the chips are installed on. Another advantage of ARM chips is that their development time is relatively shorter than that of any other architecture. Manufacturers can produce the chips faster and get their products out faster as a result. Andrew mentioned another advantage of the ARM architecture, which is smaller chip size. This advantage enables manufacturers to produce smaller devices. The last advantage Andrew mentioned was the low power consumption of the chips, which helps manufacturers produce longer lasting devices.

Andrew then spoke of the future markets for ARM processors. He mentioned that there is virtually no competition from other chip manufacturers and designers in the markets ARM is heading towards. He also mentioned that the main market for ARM processors in the future is the embedded systems market, although he said that ARM will see growth in all the markets it is used in, such as smart phones, routers, cameras, etc.

Christopher Markieta's main points included the future of the technology itself, and its use in the smart phone and tablet market.

Christopher first spoke of the future of the processors themselves. He spoke of the new versions of the processors coming next year that will have several advanced features, such as 64 bit support, hardware virtualization, and the big.LITTLE configuration. He mentioned that the first 64 bit processor has already been introduced by Apple in its iPhone 5s. However, this is not the latest technology from ARM, which will be introduced next year, in 2014. These processors will also have hardware virtualization support, which provides an abstraction layer that allows multiple operating systems to be installed. The last big feature that Christopher mentioned was the big.LITTLE configuration of the processors. This will reduce power consumption because certain tasks will automatically be assigned to optimized low power processors.
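The idea behind big.LITTLE, as described in the talk, can be sketched as a small scheduling rule: background work goes to a low power core and foreground work goes to a high powered one. The Python below is only a toy model. The `Task` class, the `pick_core` rule, and the energy costs are hypothetical and do not reflect how a real scheduler or real hardware decides.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    foreground: bool


def pick_core(task):
    """Toy rule: foreground work gets the big core, everything else the LITTLE one."""
    return "big" if task.foreground else "LITTLE"


# Hypothetical energy cost per task on each core type (arbitrary units).
ENERGY = {"big": 5, "LITTLE": 1}

tasks = [
    Task("video game", foreground=True),
    Task("mail sync", foreground=False),
    Task("music playback", foreground=False),
]

placement = {task.name: pick_core(task) for task in tasks}
energy_used = sum(ENERGY[core] for core in placement.values())
energy_if_all_big = ENERGY["big"] * len(tasks)

print(placement)
print(energy_used, "units versus", energy_if_all_big, "units on big cores only")
```

Under these assumed costs, moving the two background tasks to the LITTLE core cuts the total from 15 units to 7, which is the kind of saving the configuration is meant to deliver.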

Christopher then talked about the use of ARM in the smart phone and tablet markets and the effect it will have on the future. He mentioned that ARM doesn't manufacture the chips but designs them. The manufacturers then license the designs, which is how ARM makes most of its money, and create their own versions of the processors. He mentioned that as the processors get more and more powerful, smart phones and tablets are becoming content creation devices rather than content consumption devices. The faster and more efficient processors make it possible for the new devices to perform most of the tasks that are performed by desktop computers.

Andrew Greene works for CDOT and is working on a customized version of Fedora called piDora. He was optimistic about the future of ARM and impressed with the technology itself. He seemed excited to work with the technology and also surprised by how familiar with the technology everyone was.

Christopher Markieta is a recent employee of Seneca College and has been working on improving the network infrastructure and testing piDora as well. He was very enthusiastic about the processor technology that he was presenting on, and he seemed very interested in its future.

Analysis

Both of the presentations seemed to concentrate on speed and efficiency going into the future. MongoDB emphasizes the use of documents instead of the usual SQL approach for better speeds and easier development. ARM follows the same pattern, emphasizing smaller chips for faster and easier development, and greater efficiency for chips that perform faster.

MongoDB concentrates on performance and faster response times. Going forward, the future seems bright for the document-oriented database, not only in web applications but also in handling big data from supercomputers. Due to its non-relational nature, it is well suited for receiving and organising large amounts of data really quickly. Relational databases currently have a foothold in the large scale data market, but MongoDB seems to be quickly gaining popularity for its stability, durability, and ease of implementation.

The same can be said for ARM processors. ARM processors are already prominent in the mobile industry, and MongoDB, being well suited to this industry, is quickly catching up in that market. However, the market that they both have in common is the big data market, which is currently held by others. ARM is aiming for the server, supercomputer, and computer cluster markets, which generally involve large amounts of data, and that is where MongoDB is gaining ground.

These two technologies are bound to meet at that destination at some point if they have not met already. One is an open source new way of storing and accessing data that is friendly to mobile devices. The other is the preferred technology for running the biggest open source mobile operating system, Android, and a plethora of other open source software like piDora.

The speakers had similar attitudes of optimism towards the technology they presented and how it related to open source. MongoDB is completely open source and free to use and alter, and Kevin praised, with some surprise, the support that the developers had created. He was impressed by the developers for that, and by the community for creating an educational portal for learning this new technology. Although ARM processor designs are proprietary and must be licensed, they are the catalysts for the mobile world we live in today. Without such processors we most likely would not have gotten Android, a huge open source operating system. We also would not have been able to play around with the Raspberry Pi, for which piDora, another open source operating system, was made.

Conclusion

My views on open source have not really changed much. I have always liked the idea of free software and am somewhat of an advocate for open source itself. There are a few reasons for this: open source allows the product to be completely free, it allows the product to be copied and modified, and it allows others to scrutinize the code.

Free is not necessarily always better than paid, but in some cases it can be. In terms of open source, a lot of the free and open source alternatives to popular paid software are quickly becoming usable. For example, GIMP is an open source image editor that has most of the functionality of Photoshop, and Inkscape is another graphics editor, made especially for vector graphics. LibreOffice is a complete open source office suite. It may not be as well designed as Microsoft Office, but it is an excellent open source project that showcases how powerful open source can be.

A unique feature of open source is that it allows the user the freedom to modify the product they are using. This may not appeal to everyone using a particular piece of software, but as a developer it does appeal to me. If I want to change the way something works inside a particular project, I can just get the code and change that behaviour. This also has the advantage of allowing others to put their names on a big project. For example, I don't have to work for the Linux Foundation to contribute to the Linux kernel, I don't have to work for Mozilla to make some changes to Firefox, etc. In contrast, I would have to work for Microsoft or Apple to make any changes to their code.

Peer review is important not only for security purposes but also for performance purposes. Scrutiny from peers and other coders provides valuable information on the project. It is something only open source projects can do. Such scrutiny identifies loopholes or critical bugs that may have been missed in a closed source system. Others can also suggest a better way of performing a certain function; if that improves performance, then that code can be implemented.

These talks have reinforced my views on open source and the potential it brings. The talk on MongoDB has proved that documentation is not always neglected in the open source community, and the database's growth proves the need for excellent documentation. Cheap, low powered ARM chips have enabled me to have my own credit card sized computer to play with, the Raspberry Pi. Open source has always been a fascinating topic, to say the least. So many people from so many different places working together, without ever meeting, really is amazing.