How to find the document with maximum size in a MongoDB collection

The shell helper Object.bsonsize returns the size of a document in bytes. The following query finds the largest document, but use it only on a small collection, because it pulls every document to the client and can be slow:

db.getCollection('my_collection').find({}).map(doc => {
    return {_id: doc._id, size: Object.bsonsize(doc)};
}).reduce((a, b) => a.size > b.size ? a : b)

To do this faster, run the computation on the server with mapReduce:

db.getCollection('my_collection').mapReduce(
    function() {
        emit('size', {_id: this._id, size: Object.bsonsize(this)});
    },
    function(key, values) {
        return values.reduce((a, b) => a.size > b.size ? a : b);
    },
    {out: {inline: 1}}
)
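
On MongoDB 4.4 or later, the same result can probably be obtained with the $bsonSize aggregation operator instead of mapReduce (which is deprecated since MongoDB 5.0). This is only a sketch: it assumes the same my_collection name as above, and largestDocPipeline is an illustrative variable name, not something from the original query. The check for db lets the snippet load outside the shell too.

```javascript
// Aggregation pipeline: compute each document's BSON size on the server,
// then keep only the largest one. Assumes MongoDB 4.4+ for $bsonSize.
const largestDocPipeline = [
    // $$ROOT is the whole document; _id is kept by default in $project.
    {$project: {size: {$bsonSize: '$$ROOT'}}},
    // Largest size first.
    {$sort: {size: -1}},
    {$limit: 1}
];

// Run the query only inside the mongo shell, where the global `db` exists.
if (typeof db !== 'undefined') {
    printjson(db.getCollection('my_collection').aggregate(largestDocPipeline).toArray());
}
```

The result has the same shape as in the queries above: one document with _id and size.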

How to find the number of MongoDB connections

On the MongoDB side, the current connections can be listed with the db.currentOp() command. They can then be grouped by client IP, counted and sorted.

var ips = db.currentOp(true).inprog.filter(
        d => d.client
    ).map(
        d => d.client.split(':')[0]
    ).reduce(
        (ips, ip) => {
            if(!ips[ip]) {
                ips[ip] = 0;
            }
            ips[ip]++;
            return ips;
        }, {}
    );
Object.keys(ips).map(
        key => {
            return {"ip": key, "num": ips[key]};
        }
    ).sort(
        (a, b) => b.num - a.num
    );

The result looks like this:

[
    {
        "ip" : "11.22.33.444",
        "num" : 77.0
    },
    {
        "ip" : "11.22.33.445",
        "num" : 63.0
    },
    {
        "ip" : "11.22.33.344",
        "num" : 57.0
    }
]
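
The same grouping can likely be done on the server with the $currentOp aggregation stage (MongoDB 3.6+). This is a sketch under assumptions: it relies on the same client field as the script above, it has to run against the admin database, and connectionsByIpPipeline is an illustrative name. The counts should be comparable to the shell version, although the two are not guaranteed to list exactly the same operations.

```javascript
// Group current operations by client IP and count them on the server.
const connectionsByIpPipeline = [
    // Include idle connections and operations of all users.
    {$currentOp: {allUsers: true, idleConnections: true}},
    // Keep only entries that have a client address ("ip:port").
    {$match: {client: {$exists: true}}},
    // Take the part before ":" as the IP and count entries per IP.
    {$group: {
        _id: {$arrayElemAt: [{$split: ['$client', ':']}, 0]},
        num: {$sum: 1}
    }},
    // Busiest clients first.
    {$sort: {num: -1}},
    {$project: {_id: 0, ip: '$_id', num: 1}}
];

// Run the query only inside the mongo shell, where the global `db` exists.
if (typeof db !== 'undefined') {
    printjson(db.getSiblingDB('admin').aggregate(connectionsByIpPipeline).toArray());
}
```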

If several Docker containers run on the client host, the connections can then be traced with the netstat command inside each of them. Suppose the MongoDB replicas have IPs starting with 44.55... and 77.88...; the command to count all connections to the replicas is:

netstat -tn | grep -e '44\.55\.' -e '77\.88\.' | wc -l

MongoDB select fields after $lookup

When a $lookup stage joins a list of large documents, the error Total size of documents in ... matching ... exceeds maximum document size can occur.

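It’s possible to avoid this by placing an $unwind stage right after $lookup: the optimizer then coalesces the two stages, so the large joined array is never built inside a single document. The documentation explains this in more detail. The documents can then be regrouped with only the required fields.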

Order.objects.aggregate(
    {
        '$lookup': {
            'from': 'item',
            'localField': '_id',
            'foreignField': 'order_id',
            'as': 'items'
        }
    },
    {
        "$unwind": "$items"
    },
    {
        "$group": {
            "_id": "$_id",
            "date": {"$last": "$date"},
            "items": {
                "$push": {
                    "name": "$items.name",
                    "price": "$items.price"
                }
            }
        }
    }
)

How to find a change for a field with MongoDB aggregation

For example, suppose a collection device_status stores the successive states of devices. The task is to find the devices that switched from off to on at least once.

{ "device" : "device1", "state" : "on", "ts": ISODate("2018-06-07T17:05:29.340+0000") }
{ "device" : "device2", "state" : "off", "ts": ISODate("2018-06-08T17:05:29.340+0000") }
{ "device" : "device3", "state" : "on", "ts": ISODate("2018-06-09T17:05:29.340+0000") }
{ "device" : "device3", "state" : "shutdown", "ts": ISODate("2018-06-09T18:05:29.340+0000") }
{ "device" : "device2", "state" : "load", "ts": ISODate("2018-06-09T19:05:29.340+0000") }
{ "device" : "device2", "state" : "on", "ts": ISODate("2018-06-10T17:05:29.340+0000") }
{ "device" : "device3", "state" : "off", "ts": ISODate("2018-06-11T17:05:29.340+0000") }
{ "device" : "device1", "state" : "idle", "ts": ISODate("2018-06-11T18:05:29.340+0000") }
{ "device" : "device3", "state" : "on", "ts": ISODate("2018-06-12T17:05:29.340+0000") }
...

The first stage sorts the data by device and date (the ts field).
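
The remaining stages are not shown here, so the following is only a plain JavaScript sketch of the logic on the sample documents, not the aggregation itself. It assumes that only a direct step from off to on counts as a change (an off followed by another state and then on is ignored), and devicesSwitchedOn is a hypothetical helper name.

```javascript
// Sample documents from above; ISODate(...) is replaced by Date for plain Node.js.
const statuses = [
    {device: 'device1', state: 'on', ts: new Date('2018-06-07T17:05:29.340Z')},
    {device: 'device2', state: 'off', ts: new Date('2018-06-08T17:05:29.340Z')},
    {device: 'device3', state: 'on', ts: new Date('2018-06-09T17:05:29.340Z')},
    {device: 'device3', state: 'shutdown', ts: new Date('2018-06-09T18:05:29.340Z')},
    {device: 'device2', state: 'load', ts: new Date('2018-06-09T19:05:29.340Z')},
    {device: 'device2', state: 'on', ts: new Date('2018-06-10T17:05:29.340Z')},
    {device: 'device3', state: 'off', ts: new Date('2018-06-11T17:05:29.340Z')},
    {device: 'device1', state: 'idle', ts: new Date('2018-06-11T18:05:29.340Z')},
    {device: 'device3', state: 'on', ts: new Date('2018-06-12T17:05:29.340Z')}
];

// Hypothetical helper, not part of the original post.
function devicesSwitchedOn(docs) {
    // First stage from the text: sort by device, then by date.
    const sorted = [...docs].sort(
        (a, b) => a.device.localeCompare(b.device) || a.ts - b.ts
    );
    const found = new Set();
    for (let i = 1; i < sorted.length; i++) {
        const prev = sorted[i - 1];
        const cur = sorted[i];
        // Assumption: only consecutive records of the same device are compared.
        if (prev.device === cur.device && prev.state === 'off' && cur.state === 'on') {
            found.add(cur.device);
        }
    }
    return [...found];
}

const switchedOn = devicesSwitchedOn(statuses);
console.log(switchedOn); // [ 'device3' ]
```

With this assumption only device3 matches. device2 goes from off to load to on, so it would match only under a looser definition of the change.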

A bootstrap for a microservice based on Flask with MongoDB

Starting a new project is a common task in a microservices architecture, so it’s better to have a template for it. I published my version in the flask-mongoengine-bootstrap repository. The key points are:

  • very basic, only flask, flask-mongoengine and structlog in requirements
  • configuration through environment variables
  • configured logging in JSON format
  • marking log records with request_id
  • ability to run the development version (make dev) and the tests (make test) through Docker
  • a Makefile template
  • examples of model, api route and tests

If you use it, do not forget to change SECRET_KEY.

A simple CRUD app with Django and Mongoengine

There are several ways to use MongoDB in Django, including Django MongoDB, Mongoengine and MongoKit.

Django MongoDB uses django-nonrel, which is a fork based on Django 1.3. I don’t like this idea, because Django 1.5 is about to be released. Between Mongoengine and MongoKit, I prefer Mongoengine. There are several comparative articles on the two.

So I created a simple CRUD app using Mongoengine. A model definition in Mongoengine looks much like one in Django.