MongoDB also has full support for '''primary and secondary indexing'''. Indexes support the efficient resolution of queries in MongoDB. Without indexes, MongoDB must scan every document in a collection to select those that match the query statement. Fundamentally, indexes in MongoDB are similar to indexes in other database systems. MongoDB defines indexes at the collection level and supports indexes on any field or sub-field of the documents in a collection. MongoDB can use indexes to return documents sorted by the index key directly from the index, without requiring an additional sort phase.
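To make the index idea concrete, here is a minimal sketch in Python using the pymongo driver; the database, collection, and field names are my own made-up examples, not from Kevin's demo.

<pre>
from pymongo import MongoClient, ASCENDING

client = MongoClient("mongodb://localhost:27017")  # assumes a local mongod
talks = client["fsoss_demo"]["talks"]              # hypothetical db/collection

# Without this index, a query on "speaker" would scan every document.
talks.create_index([("speaker", ASCENDING)])

# A sorted query on the indexed field can be answered straight from
# the index, with no separate in-memory sort phase.
for doc in talks.find({"speaker": "Kevin"}).sort("speaker", ASCENDING):
    print(doc)
</pre>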
'''Sharding''' is another feature that MongoDB supports. It is the process of storing data records across multiple machines and is MongoDB’s approach to meeting the demands of data growth. As the size of the data increases, a single machine may not be sufficient to store the data or provide acceptable read and write throughput. Sharding solves this problem with horizontal scaling: you add more machines to support data growth and the demands of read and write operations.
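In practice, sharding is enabled per database and then per collection through admin commands sent to a mongos query router. Roughly, and assuming a sharded cluster is already running (the host and names below are hypothetical):

<pre>
from pymongo import MongoClient

# Connect to a mongos query router, not a plain mongod.
client = MongoClient("mongodb://mongos-host:27017")

# Enable sharding for a database, then shard a collection on a key.
client.admin.command("enableSharding", "fsoss_demo")
client.admin.command(
    "shardCollection",
    "fsoss_demo.talks",
    key={"_id": "hashed"},  # a hashed shard key spreads writes evenly
)
</pre>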
Kevin also spoke about '''Replication'''. A replica set in MongoDB is a group of mongod processes that maintain the same data set. Replica sets provide redundancy and high availability, and are the basis for all production deployments, since the same data can live on multiple machines across multiple servers.
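From the application side, using a replica set mostly just means listing its members in the connection string; the driver finds the primary and fails over automatically. A minimal sketch, assuming a replica set named rs0 on three hypothetical hosts:

<pre>
from pymongo import MongoClient

# The driver discovers the current primary among the listed members
# and fails over to a new primary if the old one goes down.
client = MongoClient(
    "mongodb://db1.example.com:27017,db2.example.com:27017,"
    "db3.example.com:27017/?replicaSet=rs0"
)

# Ask the replica set which members are up and what role each plays.
status = client.admin.command("replSetGetStatus")
for member in status["members"]:
    print(member["name"], member["stateStr"])
</pre>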
He then went on to talk about installing MongoDB and how easy it is to set up, and he showed the installation process by demoing it to the audience. I already have MongoDB installed and set up on my local computer, so this part was a bit repetitive for me. He created a collection called FSOSS, demoed the basic Mongo commands, and then went into a lot of detail about Mongo and demoed a lot of its functionality, which I thought was pretty cool.
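I don't remember the exact commands from the demo, but the basics he showed map onto something like this in Python with pymongo (the field values are made up; only the FSOSS collection name is from his demo):

<pre>
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
fsoss = client["test"]["FSOSS"]  # collection from the demo; the db is assumed

# Insert a document -- the collection is created implicitly on first write.
fsoss.insert_one({"talk": "MongoDB", "speaker": "Kevin"})

# Query, update, and delete documents.
print(fsoss.find_one({"speaker": "Kevin"}))
fsoss.update_one({"talk": "MongoDB"}, {"$set": {"attended": True}})
fsoss.delete_many({"speaker": "nobody"})
</pre>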
=== OpenCL ===
The other talk that interested me was on OpenCL. I took the GPU programming course in my program last semester, where I learned about CUDA programming concepts in detail and got an introduction to OpenCL covering its basic concepts. So when I heard about the OpenCL talk at FSOSS, I just had to sit in. This talk was given by Adrien Guillon.
Adrien started off by talking about Big Data and how the power of GPU programming is very useful for computing over big data. He introduced the computational model, where the CPU and GPU sit in that model, and how they work together during computations performed by a computer. The talk then led into what GPU programming is and how it works as opposed to traditional CPU programming. He spoke about the benefits of GPU programming, how far it has come, where he thinks it is going, and where he wants it to go.
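That CPU/GPU split is easiest to see in code: the host (CPU) picks a device, prepares buffers, and compiles a kernel, while the device (GPU) runs that kernel over many data elements in parallel. Here is a minimal sketch with PyOpenCL; the element-wise squaring kernel is my own toy example, not something from the talk:

<pre>
import numpy as np
import pyopencl as cl

# Host (CPU) side: pick a device, create a command queue, prepare data.
ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)
a = np.arange(1024, dtype=np.float32)

mf = cl.mem_flags
a_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a)
out_buf = cl.Buffer(ctx, mf.WRITE_ONLY, a.nbytes)

# Device (GPU) side: the kernel body runs once per work-item, in parallel.
program = cl.Program(ctx, """
__kernel void square(__global const float *a, __global float *out) {
    int gid = get_global_id(0);
    out[gid] = a[gid] * a[gid];
}
""").build()

program.square(queue, a.shape, None, a_buf, out_buf)

# Copy the result back to the host and verify it.
result = np.empty_like(a)
cl.enqueue_copy(queue, result, out_buf)
assert np.allclose(result, a * a)
</pre>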
The talk did not cover OpenCL itself in much depth. He spoke about how OpenCL is used to build high-level abstractions, which are very useful for all kinds of purposes. He then went on to talk about the issues with OpenCL: it changes drastically with every release, each release is not very compatible with the previous one, and OpenCL does not have a stable release as yet. Adrien spoke about making open source GPU programming stable for the future, so that programmers will not have to change their code to support every new release.
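One practical consequence of that churn is that portable code has to ask the platform at runtime what it actually supports, rather than assuming a particular OpenCL version. For example, with PyOpenCL:

<pre>
import pyopencl as cl

# Report what each platform and device claims to support, since
# behaviour can differ between OpenCL releases.
for platform in cl.get_platforms():
    print(platform.name, "--", platform.version)
    for device in platform.get_devices():
        print("  ", device.name, "--", device.version)
</pre>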
== Comparison ==
Both speakers made it clear that open source is a huge and ever-growing community. Open source is so widespread that it covers every aspect of computer programming and development. Kevin, who spoke about MongoDB, talked about open source from a database perspective and how open source has given rise to many alternatives to SQL databases, like the NoSQL database MongoDB. Adrien, who spoke about open source development from a pure programming perspective, talked about how open source has taken the GPU programming world by storm. What linked the two speakers was that both of them talked about Big Data and how these two technologies (MongoDB and OpenCL) can prove very useful in performing huge computations for big data analysis.