Recent rise of NoSQL movement ( MongoDB, Cassandra, HBase, CouchDB and many more) can be attributed to 2 factors:
- Adoption of agile aka "write code first design later" methodology: In using RDBMS you have to design your tables, data structure, relations first, then only you can start coding. While using NoSQL you can start coding without worrying about tables. You can modify your objects at lesser cost of development.
- Impedance mismatch between Object Oriented programming and relational database : It is major pain point for developers to use Object Relation Mapping software (e.g. Hibernate). Meanwhile MongoDB, which is document oriented database, can fit directly in Object oriented paradigm.
Both MongoDB
and RDBMS databases have their uses. Some use case are common to both and
some are exclusive.
Consider an example of transaction based system and this is what MongoDB people says on their website:
" While most modern applications require a flexible, scalable system like MongoDB, there are use cases for which a relational database like MySQL would be better suited. Applications that require complex, multi-row transactions (e.g., a double-entry bookkeeping system) would be good examples. MongoDB is not a drop-in replacement for legacy applications built around the relational data model and SQL. "
(https://www.mongodb.com/compare/mongodb-mysql
)
Let us compare MongoDB and MySQL (any RDBMS will do, I have just chose MySQL
for sake of It.) on various features.
- Philosophy: MySQL is relation oriented while MongoDB is document oriented. If you manipulate a single type of object most of time then document oriented database will suit you more. Relation oriented is more suitable if you frequently manipulate more than one type of object at a time.
- Querying through user interface: MySQL use standard SQL for querying, while MongoDB use propriety JSON protocol. It provides JavaScript client for querying, which is very different from SQL. e.g. MySQL query "SELECT id, name FROM users WHERE age>50 LIMIT 10" will translate into "db.users.find({"age":{"$gt":50}}, {"id":1,"name":1}).limit(10)". SQL although have steep learning curve, but is versatile.
- Ease of Use: You have to give MongoDB full marks here. MongoDB is very easy to set up and get it running. Its Java client is also very simple. Since document structure can be easily mapped to Object structure, users of OO language such as Java, C#, C++ will find less mental friction while using Mongo client.
- Maturity: Since MongoDB is not even a decade old, so release stability is problem. There were major changes between Mongo 2 vs Mongo 3. Compare this to RDBMS which are more or less unchanged in structure from last 2 decades.
- Multi table query: MongoDB has no concept of joins, so it cannot be done in MongoDB but then again it is how it has been designed, it is not RDBMS. But you can change your document structure and embed the other document inside the first document. This sometimes leads to very massive document structure. Again MongoDB has maximum document size limit of 16MB, so you cannot keep going on fattening your document.
- Foreign Keys: MongoDB does not support foreign keys, if you need these type of constraint, you have to handle it in code, a complexity at the implementation time.
- SQL injection vulnerability: The name itself indicates that is applicable for SQL only, while MongoDB is susceptible to other kinds of attack but is unaffected by SQL injection.
- Lock granularity: MySQL provides very fine granularity of locking, while MongoDB provides only one level. Till MongoDB 2 it was only at collection (table) level. In MongoDB 3 it is on document (row) level. There is no other level of locking.
- User Privilege: MongoDB provide access on role based. It does not support query based privilege.
- Indexes: MongoDB support multiple type of indexes. But it puts limitation on size of index, number of indexes, and number of fields in compound indexes, some similarity with RDBMS. Without indexes MongoDB queries are very slow and once your table size grows more than 2-3 million documents (rows). If you have an indexed field and you are adding a document in which field size is greater than 1 KB, it will raise an error.
- Scaling: Horizontal scaling with MongoDB is more convenient in practice. This has been major selling point of MongoDB. But if you see top 10 most popular website, 3 of them (YouTube, Wikipedia and Twitter) uses MySQL, none have used MongoDB. It is not a matter of what to use, it is more of how to use. There appears to be a compelling reason why MongoDB is out in their design decisions.
- Error reporting: Error reporting while writing to MongoDB leaves lot to be desired. It does not return all the error, only last error. Also while replicating to slaves if some of replication has failed, it will just ignore it, so consistency is another issue.
- Performance degradation: If load on database is heavy and data is not properly indexed, then all databases RDBMS and NoSQL suffer performance degradation alike. But behavior of both are different. MongoDB currently puts limit on max number of connection as 20000. When this limit is reached, it behave very unwieldy. Sometimes only option is to restart.
- Data durability: Before MongoDB 3.2, MMAP was default database engine of MongoDB. MMAP uses memory mapped file for writing. Memory mapped file syncs to disk in regular interval. If journaling is not enabled and crash happens then that time interval data will be lost. Enabling Journal slows down writes dramatically.
In part 2 we will discuss factors which are responsible for rise in adoption of MongoDB.