MongoDB 4.4 new features

This release of MongoDB brings some path-breaking features. A few were badly needed, and most of the rest enhance existing functionality. There are many new features, but this article covers only those that matter most for performance and scalability. The highlights are as follows.

Refinable Shard Keys

In earlier versions, the choice of shard key was final: once a collection was sharded, its key was immutable. The Refinable Shard Keys feature lifts this restriction. With the refineCollectionShardKey command, you can refine a collection’s shard key by adding one or more suffix fields to the existing key. Refining the key allows a more uniform data distribution and can resolve situations where the existing key has led to jumbo chunks because of insufficient cardinality.

// Enable sharding on the database, if sharding is not already enabled
sh.enableSharding("speedtest")

// Shard the collection using the customer_id field as the initial shard key
db.adminCommand( { shardCollection: "speedtest.orders", key: { customer_id: 1 } } )

// Goal: refine the shard key to { customer_id: 1, ping_id: 1 }

// Create the index to support the new shard key, if it does not already exist
db.getSiblingDB("speedtest").orders.createIndex( { customer_id: 1, ping_id: 1 } )

// Run refineCollectionShardKey to add the ping_id field as a suffix
db.adminCommand( {
   refineCollectionShardKey: "speedtest.orders",
   key: { customer_id: 1, ping_id: 1 }
} )

Hedged Reads

Starting in 4.4, the mongos query router can hedge reads that use a non-primary read preference, and this support is enabled by default. With hedging, mongos routes a single read to multiple replica set members per queried shard and returns the result from the first respondent per shard, which reduces latency when one member is slow to respond. Reads with the nearest read preference are hedged by default; for other non-primary modes, you request hedging through the hedge option of the read preference. To turn off a mongos instance’s support for hedged reads, set the readHedgingMode parameter to off on that mongos. When hedged read support is off, mongos does not use hedged reads regardless of the hedge option specified for the read preference.
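As a rough sketch of both knobs in mongo shell style JavaScript: the helper names below are made up for illustration, and `db` is assumed to be a shell database handle connected to a mongos. The readHedgingMode parameter and the hedge option are the ones described above.

```javascript
// Hypothetical helper: switch hedged-read support on or off for the mongos
// that `db` is connected to. The accepted modes are "on" and "off".
function setReadHedgingMode(db, mode) {
  return db.adminCommand({ setParameter: 1, readHedgingMode: mode });
}

// Hypothetical helper: run a find with an explicit hedge option.
// In the 4.4 shell, cursor.readPref takes (mode, tagSet, hedgeOptions).
// The tag set [ {} ] contains one empty document, which matches any member.
function hedgedFind(db, collName, filter) {
  return db.getCollection(collName)
    .find(filter)
    .readPref("secondary", [ {} ], { enabled: true });
}
```

In a shell session this might be used as `hedgedFind(db.getSiblingDB("speedtest"), "orders", { customer_id: 42 })`, where the filter value is only a placeholder.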

Compound Hashed Shard Key

In earlier versions of MongoDB, if you were not sure a shard key would distribute chunks evenly, you could use a hashed shard key instead. Hashed sharding has some performance cost compared with ranged or zone sharding because of the hashing step. Previously, only a single non-array field could serve as a hashed shard key. Now you can create a compound shard key (multiple fields) in which one field is hashed.

Compound hashed sharding also supports shard keys with a hashed prefix, which resolves data distribution issues related to monotonically increasing fields.

sh.shardCollection(
  "test.speedtest",
  { "_id" : "hashed", "ipAddress" : 1}
)

Replica Set

  1. Resumable Initial Sync: A secondary performing an initial sync can attempt to resume the sync process if it is interrupted by a network error, collection drop, or collection rename.
  2. Minimum Oplog Retention Period: You can specify the minimum number of hours to preserve an oplog entry. The mongod removes an oplog entry only if both of the following hold:
  • The oplog has reached the maximum configured size, and
  • The oplog entry is older than the configured number of hours, based on the host system clock.
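The retention rule above can be sketched in a few lines of JavaScript. Both function names are hypothetical, `db` is assumed to be a shell handle connected to a replica set member, and the second function only restates the rule from this article; it is not the server's implementation. The replSetResizeOplog command and its minRetentionHours field are what 4.4 provides for changing the setting at runtime.

```javascript
// Hypothetical helper: set the minimum oplog retention period, in hours,
// on the replica set member that `db` is connected to.
function setOplogMinRetention(db, hours) {
  return db.adminCommand({ replSetResizeOplog: 1, minRetentionHours: hours });
}

// Illustration of the removal rule: an entry may be removed only when the
// oplog is at its maximum configured size AND the entry is older than the
// configured retention period.
function canRemoveOplogEntry(oplogIsFull, entryAgeHours, minRetentionHours) {
  return oplogIsFull && entryAgeHours > minRetentionHours;
}
```

For example, with a 24-hour retention period, a 30-hour-old entry in a full oplog is eligible for removal, while a 10-hour-old entry is kept even though the oplog is full.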

Aggregation

  1. Union All ($unionWith stage): Combines pipeline results from multiple collections into a single result set.
  2. Custom Aggregation Expressions: $accumulator and $function are new operators that let users define custom aggregation expressions in JavaScript.
  3. Several new aggregation operators have been added.
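To make the first two items concrete, here is a small sketch of what such pipelines might look like in mongo shell style JavaScript. The collection name orders_archive, the label field, and the runExamples helper are assumptions invented for this illustration; the customer_id and ping_id fields are borrowed from the sharding example earlier in the article.

```javascript
// $unionWith: append documents from a hypothetical "orders_archive"
// collection (filtered by its own sub-pipeline) to the current results.
const unionPipeline = [
  { $unionWith: {
      coll: "orders_archive",
      pipeline: [ { $match: { customer_id: 42 } } ]
  } }
];

// $function: compute a field with server-side JavaScript.
// `body` is the function, `args` are the expressions passed to it,
// and `lang` must be "js".
const labelPipeline = [
  { $addFields: {
      label: {
        $function: {
          body: function (customerId, pingId) { return customerId + ":" + pingId; },
          args: [ "$customer_id", "$ping_id" ],
          lang: "js"
        }
      }
  } }
];

// Hypothetical helper: run both stages against a shell collection handle.
function runExamples(coll) {
  return coll.aggregate(unionPipeline.concat(labelPipeline)).toArray();
}
```

Because $function executes JavaScript on the server, it is usually slower than the built-in operators and is best kept for logic they cannot express.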

Structured Logging

Previously, log entries were output as plain text. Now MongoDB outputs all log messages in structured JSON format as a series of key-value pairs, where each key indicates a log message field type, such as “severity”, and each corresponding value records the associated logging information for that field type, such as “informational”.

{"t":{"$date":"2020-05-18T20:18:13.533+00:00"},"s":"I",  "c":"NETWORK",  "id":23015,   "ctx":"listener","msg":"Listening on","attr":{"address":"127.0.0.1"}}
{"t":{"$date":"2020-05-18T20:18:13.533+00:00"},"s":"I",  "c":"NETWORK",  "id":23016,   "ctx":"listener","msg":"Waiting for connections","attr":{"port":27001,"ssl":"off"}}

Hidden Indexes

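You can now hide an index from the query planner and unhide it later. In previous versions, evaluating the impact of removing an index meant actually dropping it and recreating it if it turned out to be needed. Now you can hide the index, check how queries perform without it, and unhide it if the planner still needs it.

To support hidden indexes, MongoDB introduces:

  • The db.collection.hideIndex() method
  • The db.collection.unhideIndex() method
  • The hidden index option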
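A minimal sketch of that analysis workflow in mongo shell style JavaScript, using the db.collection.hideIndex() and db.collection.unhideIndex() shell methods that 4.4 adds. The helper name explainWithoutIndex is hypothetical, and `coll` is assumed to be a shell collection handle.

```javascript
// Hypothetical helper: hide an index, capture the query plan without it,
// and always unhide the index afterwards, even if explain throws.
// `index` may be an index name or its key pattern document.
function explainWithoutIndex(coll, index, filter) {
  coll.hideIndex(index);
  try {
    return coll.find(filter).explain("executionStats");
  } finally {
    coll.unhideIndex(index);
  }
}
```

Comparing the returned plan with the plan from a normal explain shows whether the query planner was relying on that index. A hidden index is still maintained on writes, so unhiding it makes it usable again immediately.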

A few removed features

  • mongoreplay has been removed from MongoDB packaging.
  • Removed commands: cloneCollection, planCacheListPlans, planCacheListQueryShapes.
  • Removed platforms: Amazon Linux 2013.03; RHEL / CentOS / Oracle 6 on the s390x architecture; Windows 7 / Server 2008 R2; Windows 8 / Server 2012; Windows 8.1 / Server 2012 R2; macOS 10.12.

Conclusion

There are many features in the 4.4 release, but I have highlighted those that were badly needed. In other words, this version of MongoDB can be called a paradigm shift. Over the years, MongoDB users around the world have suggested in the MongoDB developer forum ways to overcome many limitations. The MongoDB developers have finally listened and made MongoDB more flexible and performant.