Wednesday, June 22, 2016

Replica Set Components - MongoDB

A replica set in MongoDB is a group of mongod processes that maintain the same data set. Replica sets provide redundancy and high availability, and are the basis for all production deployments. This section describes the components of a MongoDB replica set.

Replica Set Primary
 The primary is the only member in the replica set that receives write operations. MongoDB applies write operations on the primary and then records the operations on the primary’s oplog. Secondary members replicate this log and apply the operations to their data sets.
In the following three-member replica set, the primary accepts all write operations. Then the secondaries replicate the oplog to apply to their data sets.
Diagram of default routing of reads and writes to the primary.
All members of the replica set can accept read operations. However, by default, an application directs its read operations to the primary member. See the Read Preference post for details on changing the default read behaviour.
The replica set can have at most one primary. [1] If the current primary becomes unavailable, an election determines the new primary. See the Replica Set Elections post for more details.
In the following 3-member replica set, the primary becomes unavailable. This triggers an election which selects one of the remaining secondaries as the new primary.

Diagram of an election of a new primary. In a three member replica set with two secondaries, the primary becomes unreachable. The loss of a primary triggers an election where one of the secondaries becomes the new primary
[1] In some circumstances, two nodes in a replica set may transiently believe that they are the primary, but at most one of them will be able to complete writes with { w: "majority" } write concern. The node that can complete { w: "majority" } writes is the current primary, and the other node is a former primary that has not yet recognized its demotion, typically due to a network partition. When this occurs, clients that connect to the former primary may observe stale data despite having requested read preference primary, and new writes to the former primary will eventually roll back.

Replica Set Secondary Members
A secondary maintains a copy of the primary’s data set. To replicate data, a secondary applies operations from the primary’s oplog to its own data set in an asynchronous process. A replica set can have one or more secondaries.

The following three-member replica set has two secondary members. The secondaries replicate the primary’s oplog and apply the operations to their data sets.
Diagram of a 3 member replica set that consists of a primary and two secondaries.
Although clients cannot write data to secondaries, clients can read data from secondary members. See Read Preference for more information on how clients direct read operations to replica sets.
A secondary can become a primary. If the current primary becomes unavailable, the replica set holds an election to choose which of the secondaries becomes the new primary.
In the following three-member replica set, the primary becomes unavailable. This triggers an election where one of the remaining secondaries becomes the new primary.
Diagram of an election of a new primary. In a three member replica set with two secondaries, the primary becomes unreachable. The loss of a primary triggers an election where one of the secondaries becomes the new primary
You can configure a secondary member for a specific purpose. You can configure a secondary to:
  • Prevent it from becoming a primary in an election, which allows it to reside in a secondary data center or to serve as a cold standby. (Priority 0 Replica Set Members).
  • Prevent applications from reading from it, which allows it to run applications that require separation from normal traffic. (Hidden Replica Set Members).
  • Keep a running “historical” snapshot for use in recovery from certain errors, such as unintentionally deleted databases. (Delayed Replica Set Members).

Replica Set Arbiter

An arbiter does not have a copy of the data set and cannot become a primary. Replica sets may have arbiters to add a vote in elections for primary. Arbiters always have exactly 1 vote in an election, and thus allow replica sets to have an odd number of voting members without the overhead of a member that replicates data.
IMPORTANT
Do not run an arbiter on systems that also host the primary or the secondary members of the replica set.
Only add an arbiter to sets with even numbers of members. If you add an arbiter to a set with an odd number of members, the set may suffer from tied elections.
For example, in the following replica set, an arbiter allows the set to have an odd number of votes for elections:
Diagram of a four member replica set plus an arbiter for odd number of votes.
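The reason for keeping the number of votes odd can be checked with a little arithmetic. The sketch below is plain JavaScript, not a MongoDB API; the helper names majority and faultTolerance are made up for illustration, and it assumes every member holds exactly one vote.

```javascript
// Votes needed to elect a primary: a strict majority of all votes.
function majority(votes) {
  return Math.floor(votes / 2) + 1;
}

// How many voting members can be lost while a majority still remains.
function faultTolerance(votes) {
  return votes - majority(votes);
}

for (const votes of [3, 4, 5]) {
  console.log(
    votes + " votes: majority " + majority(votes) +
    ", tolerates " + faultTolerance(votes) + " failure(s)"
  );
}
```

Under these assumptions, four votes tolerate one failure, the same as three, while a fifth vote (for example from an arbiter) raises the tolerance to two.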

Hidden Replica Set Members

A hidden member maintains a copy of the primary’s data set but is invisible to client applications. Hidden members are good for workloads with different usage patterns from the other members in the replica set. Hidden members must always be priority 0 members and so cannot become primary. The db.isMaster() method does not display hidden members. Hidden members, however, may vote in elections.
In the following five-member replica set, all four secondary members have copies of the primary’s data set, but one of the secondary members is hidden.
Diagram of a 5 member replica set with a hidden priority 0 member.

Behaviour

Read Operations

Clients will not distribute reads with the appropriate read preference to hidden members. As a result, these members receive no traffic other than basic replication. Use hidden members for dedicated tasks such as reporting and backups. Delayed members should be hidden.

In a sharded cluster, mongos do not interact with hidden members.
Voting

Hidden members may vote in replica set elections. If you stop a voting hidden member, ensure that the set has an active majority or the primary will step down.
For the purposes of backups,
  • If using the MMAPv1 storage engine, you can avoid stopping a hidden member with the db.fsyncLock() and db.fsyncUnlock() operations to flush all writes and lock the mongod instance for the duration of the backup operation.
  • Changed in version 3.2: db.fsyncLock() can ensure that the data files do not change for MongoDB instances using either the MMAPv1 or the WiredTiger storage engines, thus providing consistency for the purposes of creating backups.
To configure a secondary member as hidden, set its members[n].priority value to 0 and set its members[n].hidden value to true in its member configuration:
{
  "_id" : <num>,
  "host" : <hostname:port>,
  "priority" : 0,
  "hidden" : true
}

Configuration Procedure

The following example hides the secondary member currently at the index 0 in the members array. To configure a hidden member, use the following sequence of operations in a mongo shell connected to the primary, specifying the member to configure by its array index in the members array:
cfg = rs.conf()
cfg.members[0].priority = 0
cfg.members[0].hidden = true
rs.reconfig(cfg)
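The same change can be tried outside the shell on a plain object. This is only a sketch: sampleConfig stands in for the document that rs.conf() returns, the host names are hypothetical, and hideMember is a made-up helper rather than a MongoDB function. It sets priority and hidden together because a hidden member must also be a priority 0 member.

```javascript
// Stand-in for the document returned by rs.conf(); hosts are hypothetical.
const sampleConfig = {
  _id: "rs0",
  members: [
    { _id: 0, host: "mongodb0.example.net:27017", priority: 1, hidden: false },
    { _id: 1, host: "mongodb1.example.net:27017", priority: 1, hidden: false },
  ],
};

// Return a copy of the config in which the member at `index` is hidden.
// The original config object is left untouched.
function hideMember(config, index) {
  if (index < 0 || index >= config.members.length) {
    throw new RangeError("no member at index " + index);
  }
  const members = config.members.map((m, i) =>
    i === index ? { ...m, priority: 0, hidden: true } : m
  );
  return { ...config, members };
}

const updated = hideMember(sampleConfig, 0);
console.log(updated.members[0]);
```

In a real deployment the updated document would still have to be passed to rs.reconfig() on the primary for the change to take effect.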
Replica Set Oplog
The oplog (operations log) is a special capped collection that keeps a rolling record of all operations that modify the data stored in your databases. MongoDB applies database operations on the primary and then records the operations on the primary’s oplog. The secondary members then copy and apply these operations in an asynchronous process. All replica set members contain a copy of the oplog, in the local.oplog.rs collection, which allows them to maintain the current state of the database.
To facilitate replication, all replica set members send heartbeats (pings) to all other members. Any member can import oplog entries from any other member.
Each operation in the oplog is idempotent: whether applied once or multiple times to the target data set, it produces the same result. The following replication operations rely on this property:
  • initial sync
  • post-rollback catch-up
  • sharding chunk migrations
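To see why idempotency matters, compare replaying an increment with replaying the value that the increment produced. The sketch is plain JavaScript with made-up helper names (applyIncrement, applySet); it does not reproduce the real oplog entry format.

```javascript
// Non-idempotent: every replay changes the result again.
function applyIncrement(doc, field, amount) {
  return { ...doc, [field]: (doc[field] || 0) + amount };
}

// Idempotent: replaying leaves the document in the same state.
function applySet(doc, field, value) {
  return { ...doc, [field]: value };
}

const original = { _id: 1, age: 55 };

const incOnce = applyIncrement(original, "age", 3);
const incTwice = applyIncrement(incOnce, "age", 3);
console.log(incOnce.age, incTwice.age); // 58 61

const setOnce = applySet(original, "age", 58);
const setTwice = applySet(setOnce, "age", 58);
console.log(setOnce.age, setTwice.age); // 58 58
```

A member that replays the second form after an interrupted sync ends up with the same document no matter how many times the entry is applied.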

Read Preference

Read preference describes how MongoDB clients route read operations to the members of a replica set.
Diagram of read operations to a replica set. The default read preference routes the read to the primary. A read preference of nearest routes the read to the nearest member.
By default, an application directs its read operations to the primary member in a replica set.

Read Preference Modes

IMPORTANT
All read preference modes except primary may return stale data because secondaries replicate operations from the primary with some delay. [1] Ensure that your application can tolerate stale data if you choose to use a non-primary mode.
MongoDB drivers support five read preference modes.
Read Preference Mode | Description
primary | Default mode. All operations read from the current replica set primary.
primaryPreferred | In most situations, operations read from the primary, but if it is unavailable, operations read from secondary members.
secondary | All operations read from the secondary members of the replica set.
secondaryPreferred | In most situations, operations read from secondary members, but if no secondary members are available, operations read from the primary.
nearest | Operations read from the member of the replica set with the least network latency, irrespective of the member’s type.
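The five modes can be sketched as a small selection function. This is a simplified illustration in plain JavaScript, not driver code: selectMember, the state and latencyMs fields, and the host names are assumptions, and real drivers also take tag sets and a latency window into account instead of always picking the single lowest-latency candidate.

```javascript
// Simplified member selection by read preference mode.
function selectMember(members, mode) {
  const primary = members.find((m) => m.state === "PRIMARY") || null;
  const secondaries = members.filter((m) => m.state === "SECONDARY");
  const byLatency = (a, b) => a.latencyMs - b.latencyMs;
  // Lowest-latency entry of a list, or null when the list is empty.
  const nearestOf = (list) => (list.length ? [...list].sort(byLatency)[0] : null);

  switch (mode) {
    case "primary":
      return primary;
    case "primaryPreferred":
      return primary || nearestOf(secondaries);
    case "secondary":
      return nearestOf(secondaries);
    case "secondaryPreferred":
      return nearestOf(secondaries) || primary;
    case "nearest":
      return nearestOf(primary ? [primary, ...secondaries] : secondaries);
    default:
      throw new Error("unknown read preference mode: " + mode);
  }
}

// Hypothetical three-member set.
const threeMembers = [
  { host: "a:27017", state: "PRIMARY", latencyMs: 20 },
  { host: "b:27017", state: "SECONDARY", latencyMs: 5 },
  { host: "c:27017", state: "SECONDARY", latencyMs: 12 },
];

console.log(selectMember(threeMembers, "primary").host);            // a:27017
console.log(selectMember(threeMembers, "secondaryPreferred").host); // b:27017
console.log(selectMember(threeMembers, "nearest").host);            // b:27017
```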

Tuesday, June 21, 2016

Replica Set Elections - MongoDB

Replica sets use elections to determine which set member will become primary. Elections occur after initiating a replica set, and also any time the primary becomes unavailable. The primary is the only member in the set that can accept write operations. If a primary becomes unavailable, elections allow the set to recover normal operations without manual intervention. Elections are part of the failover process.
In the following three-member replica set, the primary is unavailable. One of the remaining secondaries holds an election to elect itself as a new primary.
Diagram of an election of a new primary. In a three member replica set with two secondaries, the primary becomes unreachable. The loss of a primary triggers an election where one of the secondaries becomes the new primary
Elections are essential for independent operation of a replica set; however, elections take time to complete. While an election is in process, the replica set has no primary and cannot accept writes and all remaining members become read-only. MongoDB avoids elections unless necessary.
If a majority of the replica set is inaccessible or unavailable to the current primary, the primary will step down and become a secondary. The replica set cannot accept writes after this occurs, but remaining members can continue to serve read queries if such queries are configured to run on secondaries.

Factors and Conditions that Affect Elections

Replication Election Protocol

New in version 3.2: MongoDB introduces version 1 of the replication protocol (protocolVersion: 1) to reduce replica set failover time and accelerate the detection of multiple simultaneous primaries. New replica sets will, by default, use protocolVersion: 1. Previous versions of MongoDB use version 0 of the protocol.

Heartbeats

Replica set members send heartbeats (pings) to each other every two seconds. If a heartbeat does not return within 10 seconds, the other members mark the delinquent member as inaccessible.
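The two timings above can be written down as a small check. This is an illustrative sketch in plain JavaScript; the constant and function names are made up, and the real server logic is more involved.

```javascript
const HEARTBEAT_INTERVAL_MS = 2000; // members ping each other every two seconds
const HEARTBEAT_TIMEOUT_MS = 10000; // no reply within 10 seconds: inaccessible

// `lastReplyMs` and `nowMs` are timestamps in milliseconds.
function isInaccessible(lastReplyMs, nowMs) {
  return nowMs - lastReplyMs > HEARTBEAT_TIMEOUT_MS;
}

// Number of heartbeat intervals that fit into the timeout window.
const intervalsPerTimeout = HEARTBEAT_TIMEOUT_MS / HEARTBEAT_INTERVAL_MS;

console.log(intervalsPerTimeout);      // 5
console.log(isInaccessible(0, 4000));  // false
console.log(isInaccessible(0, 12000)); // true
```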

Member Priority

After a replica set has a stable primary, the election algorithm will make a “best-effort” attempt to have the secondary with the highest priority available call an election. Member priority affects both the timing and the outcome of elections; secondaries with higher priority call elections relatively sooner than secondaries with lower priority, and are also more likely to win. However, a lower priority instance can be elected as primary for brief periods, even if a higher priority secondary is available. Replica set members continue to call elections until the highest priority member available becomes primary.
Members with a priority value of 0 cannot become primary and do not seek election. 
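As a rough illustration of how priority shapes the field of candidates, the sketch below filters out priority 0 members and orders the rest from highest to lowest priority. It is plain JavaScript with a made-up helper name (electionCandidates) and hypothetical hosts; it only models the ordering, not the actual election algorithm, which also depends on timing and on how current each member's data is.

```javascript
// Members eligible to seek election, highest priority first.
// Priority 0 members are excluded because they cannot become primary.
function electionCandidates(members) {
  return members
    .filter((m) => m.priority > 0)
    .sort((a, b) => b.priority - a.priority);
}

const availableSecondaries = [
  { host: "a:27017", priority: 1 },
  { host: "b:27017", priority: 0 },
  { host: "c:27017", priority: 2 },
];

// c comes first, then a; b is excluded.
console.log(electionCandidates(availableSecondaries).map((m) => m.host));
```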

Loss of a Data Center

With a distributed replica set, the loss of a data center may affect the ability of the remaining members in other data center or data centers to elect a primary.
If possible, distribute the replica set members across data centers to maximize the likelihood that even with a loss of a data center, one of the remaining replica set members can become the new primary.

Network Partition

A network partition may segregate a primary into a partition with a minority of nodes. When the primary detects that it can only see a minority of nodes in the replica set, the primary steps down as primary and becomes a secondary. Independently, a member in the partition that can communicate with a majority of the nodes (including itself) holds an election to become the new primary.
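The majority rule described above can be sketched as a one-line check. The function name canElectPrimary is made up for illustration, and the sketch assumes that every member of the set is a voting member.

```javascript
// A partition can elect (or keep) a primary only if it contains
// a strict majority of the set's voting members.
function canElectPrimary(partitionSize, totalVotingMembers) {
  return partitionSize > totalVotingMembers / 2;
}

// Hypothetical five-member set split 2 / 3 by a network partition.
const votingMembers = 5;
console.log(canElectPrimary(2, votingMembers)); // false: a primary on this side steps down
console.log(canElectPrimary(3, votingMembers)); // true: this side can hold an election
```

With an even split, for example 2 / 2 in a four-member set, neither side has a majority and the set has no primary until the partition heals.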

Vetoes in Elections

Changed in version 3.2: protocolVersion: 1 obviates the need for vetoes. The following veto discussion applies to replica sets that use the older protocolVersion: 0.
For replica sets using protocolVersion: 0, all members of a replica set can veto an election, including non-voting members. A member will veto an election:
  • If the member seeking an election is not a member of the voter’s set.
  • If the current primary has more recent operations (i.e. a higher optime) than the member seeking election, from the perspective of another voting member.
  • If the current primary has the same or more recent operations (i.e. a higher or equal optime) than the member seeking election.
  • If a priority 0 member [1] is the most current member at the time of the election. In this case, another eligible member of the set will catch up to the state of the priority 0 member and then attempt to become primary.
  • If the member seeking an election has a lower priority than another member in the set that is also eligible for election.
[1] Hidden and delayed members imply a priority 0 configuration.

Non-Voting Members

Although non-voting members do not vote in elections, these members hold copies of the replica set’s data and can accept read operations from client applications.
Because a replica set can have up to 50 members, but only 7 voting members, non-voting members allow a replica set to have more than seven members.
For instance, the following nine-member replica set has seven voting members and two non-voting members.
Diagram of a 9 member replica set with the maximum of 7 voting members.
A non-voting member has a members[n].votes setting equal to 0 in its member configuration:
{
  "_id" : <num>,
  "host" : <hostname:port>,
  "votes" : 0
}
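The two limits can be checked against a configuration object before reconfiguring. The sketch below is plain JavaScript: checkMemberLimits and the host names are hypothetical, and a member without an explicit votes field is assumed to have one vote.

```javascript
const MAX_MEMBERS = 50;
const MAX_VOTING_MEMBERS = 7;

// Returns a list of problems found in a replica set configuration object.
function checkMemberLimits(config) {
  const problems = [];
  const voting = config.members.filter(
    (m) => (m.votes === undefined ? 1 : m.votes) > 0
  );
  if (config.members.length > MAX_MEMBERS) {
    problems.push("more than " + MAX_MEMBERS + " members");
  }
  if (voting.length > MAX_VOTING_MEMBERS) {
    problems.push("more than " + MAX_VOTING_MEMBERS + " voting members");
  }
  return problems;
}

// Hypothetical nine-member set: seven voting, two non-voting.
const nineMemberConfig = {
  members: Array.from({ length: 9 }, (_, i) => ({
    _id: i,
    host: "host" + i + ".example.net:27017",
    votes: i < 7 ? 1 : 0,
  })),
};

console.log(checkMemberLimits(nineMemberConfig)); // no problems reported
```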


Priority 0 Replica Set Members

A priority 0 member is a secondary that cannot become primary. Priority 0 members cannot trigger elections. Otherwise these members function as normal secondaries. A priority 0 member maintains a copy of the data set, accepts read operations, and votes in elections. Configure a member with priority 0 to prevent it from becoming primary, which is particularly useful in multi-data center deployments.
In the following three-member replica set, one data center hosts the primary and a secondary. A second data center hosts one priority 0 member that cannot become primary.
Diagram of a 3 member replica set distributed across two data centers. Replica set includes a priority 0 member.

Priority 0 Members as Standbys

A priority 0 member can function as a standby. In some replica sets, it might not be possible to add a new member in a reasonable amount of time. A standby member keeps a current copy of the data to be able to replace an unavailable member.
In many cases, you need not set standby to priority 0. However, in sets with varied hardware or geographic distribution, a priority 0 standby ensures that only qualified members become primary.
A priority 0 standby may also be valuable for some members of a set with different hardware or workload profiles. In these cases, deploy a member with priority 0 so it can’t become primary. Also consider using a hidden member for this purpose.
If your set already has seven voting members, also configure the member as non-voting.

Priority 0 Members and Failover

When configuring a priority 0 member, consider potential failover patterns, including all possible network partitions. Always ensure that your main data center contains both a quorum of voting members and members that are eligible to be primary.
Priority 0 Member Configuration
When updating the replica configuration object, access the replica set members in the members array with the array index. The array index begins with 0. Do not confuse this index value with the value of the members[n]._id field in each document in the members array.
NOTE
MongoDB does not permit the current primary to have a priority of 0. To prevent the current primary from again becoming a primary, you must first step down the current primary using rs.stepDown().

Procedure

This tutorial uses a sample replica set with 5 members.
WARNING
  • The rs.reconfig() shell method can force the current primary to step down, which causes an election. When the primary steps down, the mongod closes all client connections. While this typically takes 10-20 seconds, try to make these changes during scheduled maintenance periods.
  • To successfully reconfigure a replica set, a majority of the members must be accessible. If your replica set has an even number of members, add an arbiter to ensure that members can quickly obtain a majority of votes in an election for primary.
1. Retrieve the current replica set configuration.

The rs.conf() method returns a replica set configuration document that contains the current configuration for a replica set.
In a mongo shell connected to a primary, run the rs.conf() method and assign the result to a variable:
cfg = rs.conf()
The returned document contains a members field which contains an array of member configuration documents, one document for each member of the replica set.
2. Assign priority value of 0.

To prevent a secondary member from becoming a primary, update the secondary member’s members[n].priority to 0.
To assign a priority value to a member of the replica set, access the member configuration document using the array index. In this tutorial, the secondary member to change corresponds to the configuration document found at position 2 of the members array.
cfg.members[2].priority = 0
The configuration change does not take effect until you reconfigure the replica set.
3. Reconfigure the replica set.

Use the rs.reconfig() method to reconfigure the replica set with the updated replica set configuration document.
Pass the cfg variable to the rs.reconfig() method:
rs.reconfig(cfg)
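Because the array index and the _id of a member can differ, one way to avoid hard-coding the index is to look it up by host. The following is a sketch in plain JavaScript: memberIndexByHost is a made-up helper, fiveMemberConfig stands in for the document returned by rs.conf(), and the host names are hypothetical.

```javascript
// Find the position of a member in the members array by its host string.
// The array index is what cfg.members[n] expects; it is not the member's _id.
function memberIndexByHost(config, host) {
  const index = config.members.findIndex((m) => m.host === host);
  if (index === -1) {
    throw new Error("no member with host " + host);
  }
  return index;
}

// Stand-in for rs.conf() output; _id values deliberately differ from positions.
const fiveMemberConfig = {
  members: [
    { _id: 0, host: "h0.example.net:27017", priority: 1 },
    { _id: 1, host: "h1.example.net:27017", priority: 1 },
    { _id: 4, host: "h4.example.net:27017", priority: 1 },
    { _id: 5, host: "h5.example.net:27017", priority: 1 },
    { _id: 6, host: "h6.example.net:27017", priority: 1 },
  ],
};

const position = memberIndexByHost(fiveMemberConfig, "h4.example.net:27017");
fiveMemberConfig.members[position].priority = 0;

console.log(position, fiveMemberConfig.members[position]._id); // 2 4
```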

SQL to MongoDB Mapping Chart

In addition to the charts that follow, you might want to consider the Frequently Asked Questions section for a selection of common questions about MongoDB.

Terminology and Concepts

The following table presents the various SQL terminology and concepts and the corresponding MongoDB terminology and concepts.
SQL Terms/Concepts | MongoDB Terms/Concepts
database | database
table | collection
row | document or BSON document
column | field
index | index
table joins | embedded documents and linking
primary key (specify any unique column or column combination as primary key) | primary key (in MongoDB, the primary key is automatically set to the _id field)
aggregation (e.g. group by) | aggregation pipeline (see the SQL to Aggregation Mapping Chart below)

Executables

The following table presents some database executables and the corresponding MongoDB executables. This table is not meant to be exhaustive.
 | MongoDB | MySQL | Oracle | Informix | DB2
Database Server | mongod | mysqld | oracle | IDS | DB2 Server
Database Client | mongo | mysql | sqlplus | DB-Access | DB2 Client

Examples

The following table presents the various SQL statements and the corresponding MongoDB statements. The examples in the table assume the following conditions:
  • The SQL examples assume a table named users.
  • The MongoDB examples assume a collection named users that contains documents of the following prototype:
    {
    _id: ObjectId("509a8fb2f3f4948bd2f983a0"),
    user_id: "abc123",
    age: 55,
    status: 'A'
    }

Create and Alter

The following table presents the various SQL statements related to table-level actions and the corresponding MongoDB statements.
Each SQL schema statement below is followed by the corresponding MongoDB schema statement.
CREATE TABLE users (
    id MEDIUMINT NOT NULL
        AUTO_INCREMENT,
    user_id Varchar(30),
    age Number,
    status char(1),
    PRIMARY KEY (id)
)
Implicitly created on first insert() operation. The primary key _id is automatically added if the _id field is not specified.
db.users.insert( {
    user_id: "abc123",
    age: 55,
    status: "A"
 } )
However, you can also explicitly create a collection:
db.createCollection("users")
ALTER TABLE users
ADD join_date DATETIME
Collections do not describe or enforce the structure of their documents; i.e. there is no structural alteration at the collection level.
However, at the document level, update() operations can add fields to existing documents using the $set operator.
db.users.update(
    { },
    { $set: { join_date: new Date() } },
    { multi: true }
)
ALTER TABLE users
DROP COLUMN join_date
Collections do not describe or enforce the structure of their documents; i.e. there is no structural alteration at the collection level.
However, at the document level, update() operations can remove fields from documents using the $unset operator.
db.users.update(
    { },
    { $unset: { join_date: "" } },
    { multi: true }
)
CREATE INDEX idx_user_id_asc
ON users(user_id)
db.users.createIndex( { user_id: 1 } )
CREATE INDEX
       idx_user_id_asc_age_desc
ON users(user_id, age DESC)
db.users.createIndex( { user_id: 1, age: -1 } )
DROP TABLE users
db.users.drop()

Insert

The following table presents the various SQL statements related to inserting records into tables and the corresponding MongoDB statements.
Each SQL INSERT statement below is followed by the corresponding MongoDB insert() statement.
INSERT INTO users(user_id,
                  age,
                  status)
VALUES ("bcd001",
        45,
        "A")
db.users.insert(
   { user_id: "bcd001", age: 45, status: "A" }
)

Select

The following table presents the various SQL statements related to reading records from tables and the corresponding MongoDB statements.
NOTE
The find() method always includes the _id field in the returned documents unless specifically excluded through projection. Some of the SQL queries below may include an _id field to reflect this, even if the field is not included in the corresponding find() query.
Each SQL SELECT statement below is followed by the corresponding MongoDB find() statement.
SELECT *
FROM users
db.users.find()
SELECT id,
       user_id,
       status
FROM users
db.users.find(
    { },
    { user_id: 1, status: 1 }
)
SELECT user_id, status
FROM users
db.users.find(
    { },
    { user_id: 1, status: 1, _id: 0 }
)
SELECT *
FROM users
WHERE status = "A"
db.users.find(
    { status: "A" }
)
SELECT user_id, status
FROM users
WHERE status = "A"
db.users.find(
    { status: "A" },
    { user_id: 1, status: 1, _id: 0 }
)
SELECT *
FROM users
WHERE status != "A"
db.users.find(
    { status: { $ne: "A" } }
)
SELECT *
FROM users
WHERE status = "A"
AND age = 50
db.users.find(
    { status: "A",
      age: 50 }
)
SELECT *
FROM users
WHERE status = "A"
OR age = 50
db.users.find(
    { $or: [ { status: "A" } ,
             { age: 50 } ] }
)
SELECT *
FROM users
WHERE age > 25
db.users.find(
    { age: { $gt: 25 } }
)
SELECT *
FROM users
WHERE age < 25
db.users.find(
   { age: { $lt: 25 } }
)
SELECT *
FROM users
WHERE age > 25
AND   age <= 50
db.users.find(
   { age: { $gt: 25, $lte: 50 } }
)
SELECT *
FROM users
WHERE user_id like "%bc%"
db.users.find( { user_id: /bc/ } )
SELECT *
FROM users
WHERE user_id like "bc%"
db.users.find( { user_id: /^bc/ } )
SELECT *
FROM users
WHERE status = "A"
ORDER BY user_id ASC
db.users.find( { status: "A" } ).sort( { user_id: 1 } )
SELECT *
FROM users
WHERE status = "A"
ORDER BY user_id DESC
db.users.find( { status: "A" } ).sort( { user_id: -1 } )
SELECT COUNT(*)
FROM users
db.users.count()
or
db.users.find().count()
SELECT COUNT(user_id)
FROM users
db.users.count( { user_id: { $exists: true } } )
or
db.users.find( { user_id: { $exists: true } } ).count()
SELECT COUNT(*)
FROM users
WHERE age > 30
db.users.count( { age: { $gt: 30 } } )
or
db.users.find( { age: { $gt: 30 } } ).count()
SELECT DISTINCT(status)
FROM users
db.users.distinct( "status" )
SELECT *
FROM users
LIMIT 1
db.users.findOne()
or
db.users.find().limit(1)
SELECT *
FROM users
LIMIT 5
SKIP 10
db.users.find().limit(5).skip(10)
EXPLAIN SELECT *
FROM users
WHERE status = "A"
db.users.find( { status: "A" } ).explain()
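The LIKE rows above translate patterns to regular expressions by hand. For patterns that arrive at run time, a small converter is one option. The sketch below is plain JavaScript; likeToRegExp is a made-up helper, and it handles only the % and _ wildcards, ignoring ESCAPE clauses and collation rules.

```javascript
// Convert a SQL LIKE pattern into a JavaScript RegExp.
//   %  matches any run of characters (including none)
//   _  matches exactly one character
function likeToRegExp(pattern) {
  // Escape regular expression metacharacters so they match literally.
  const escaped = pattern.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const body = escaped.replace(/%/g, ".*").replace(/_/g, ".");
  return new RegExp("^" + body + "$", "s");
}

console.log(likeToRegExp("%bc%").test("abc123")); // true
console.log(likeToRegExp("bc%").test("abc123"));  // false
console.log(likeToRegExp("bc%").test("bcd001"));  // true
```

The generated expressions are fully anchored, so they accept the same strings as the shorter /bc/ and /^bc/ forms used in the rows above.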

Update Records

The following table presents the various SQL statements related to updating existing records in tables and the corresponding MongoDB statements.
Each SQL UPDATE statement below is followed by the corresponding MongoDB update() statement.
UPDATE users
SET status = "C"
WHERE age > 25
db.users.update(
   { age: { $gt: 25 } },
   { $set: { status: "C" } },
   { multi: true }
)
UPDATE users
SET age = age + 3
WHERE status = "A"
db.users.update(
   { status: "A" } ,
   { $inc: { age: 3 } },
   { multi: true }
)

Delete Records

The following table presents the various SQL statements related to deleting records from tables and the corresponding MongoDB statements.
Each SQL DELETE statement below is followed by the corresponding MongoDB remove() statement.
DELETE FROM users
WHERE status = "D"
db.users.remove( { status: "D" } )
DELETE FROM users
db.users.remove({})

SQL to Aggregation Mapping Chart

The aggregation pipeline allows MongoDB to provide native aggregation capabilities that correspond to many common data aggregation operations in SQL.
The following table provides an overview of common SQL aggregation terms, functions, and concepts and the corresponding MongoDB aggregation operators:
SQL Terms, Functions, and Concepts | MongoDB Aggregation Operators
WHERE | $match
GROUP BY | $group
HAVING | $match
SELECT | $project
ORDER BY | $sort
LIMIT | $limit
SUM() | $sum
COUNT() | $sum
join | $lookup (new in version 3.2)

Examples

The following table presents a quick reference of SQL aggregation statements and the corresponding MongoDB statements. The examples in the table assume the following conditions:
  • The SQL examples assume two tables, orders and order_lineitem, that join by the order_lineitem.order_id and the orders.id columns.
  • The MongoDB examples assume one collection orders that contains documents of the following prototype:
    {
      cust_id: "abc123",
      ord_date: ISODate("2012-11-02T17:04:11.102Z"),
      status: 'A',
      price: 50,
      items: [ { sku: "xxx", qty: 25, price: 1 },
               { sku: "yyy", qty: 25, price: 1 } ]
    }
    
Each SQL example below is followed by the corresponding MongoDB example and a description.
SELECT COUNT(*) AS count
FROM orders
db.orders.aggregate( [
   {
     $group: {
        _id: null,
        count: { $sum: 1 }
     }
   }
] )
Count all records from orders.
SELECT SUM(price) AS total
FROM orders
db.orders.aggregate( [
   {
     $group: {
        _id: null,
        total: { $sum: "$price" }
     }
   }
] )
Sum the price field from orders
SELECT cust_id,
       SUM(price) AS total
FROM orders
GROUP BY cust_id
db.orders.aggregate( [
   {
     $group: {
        _id: "$cust_id",
        total: { $sum: "$price" }
     }
   }
] )
For each unique cust_id, sum the price field.
SELECT cust_id,
       SUM(price) AS total
FROM orders
GROUP BY cust_id
ORDER BY total
db.orders.aggregate( [
   {
     $group: {
        _id: "$cust_id",
        total: { $sum: "$price" }
     }
   },
   { $sort: { total: 1 } }
] )
For each unique cust_id, sum the price field, results sorted by sum.
SELECT cust_id,
       ord_date,
       SUM(price) AS total
FROM orders
GROUP BY cust_id,
         ord_date
db.orders.aggregate( [
   {
     $group: {
        _id: {
           cust_id: "$cust_id",
           ord_date: {
               month: { $month: "$ord_date" },
               day: { $dayOfMonth: "$ord_date" },
               year: { $year: "$ord_date"}
           }
        },
        total: { $sum: "$price" }
     }
   }
] )
For each unique cust_id, ord_date grouping, sum the price field. Excludes the time portion of the date.
SELECT cust_id,
       count(*)
FROM orders
GROUP BY cust_id
HAVING count(*) > 1
db.orders.aggregate( [
   {
     $group: {
        _id: "$cust_id",
        count: { $sum: 1 }
     }
   },
   { $match: { count: { $gt: 1 } } }
] )
For cust_id with multiple records, return the cust_id and the corresponding record count.
SELECT cust_id,
       ord_date,
       SUM(price) AS total
FROM orders
GROUP BY cust_id,
         ord_date
HAVING total > 250
db.orders.aggregate( [
   {
     $group: {
        _id: {
           cust_id: "$cust_id",
           ord_date: {
               month: { $month: "$ord_date" },
               day: { $dayOfMonth: "$ord_date" },
               year: { $year: "$ord_date"}
           }
        },
        total: { $sum: "$price" }
     }
   },
   { $match: { total: { $gt: 250 } } }
] )
For each unique cust_id, ord_date grouping, sum the price field and return only where the sum is greater than 250. Excludes the time portion of the date.
SELECT cust_id,
       SUM(price) as total
FROM orders
WHERE status = 'A'
GROUP BY cust_id
db.orders.aggregate( [
   { $match: { status: 'A' } },
   {
     $group: {
        _id: "$cust_id",
        total: { $sum: "$price" }
     }
   }
] )
For each unique cust_id with status A, sum the price field.
SELECT cust_id,
       SUM(price) as total
FROM orders
WHERE status = 'A'
GROUP BY cust_id
HAVING total > 250
db.orders.aggregate( [
   { $match: { status: 'A' } },
   {
     $group: {
        _id: "$cust_id",
        total: { $sum: "$price" }
     }
   },
   { $match: { total: { $gt: 250 } } }
] )
For each unique cust_id with status A, sum the price field and return only where the sum is greater than 250.
SELECT cust_id,
       SUM(li.qty) as qty
FROM orders o,
     order_lineitem li
WHERE li.order_id = o.id
GROUP BY cust_id
db.orders.aggregate( [
   { $unwind: "$items" },
   {
     $group: {
        _id: "$cust_id",
        qty: { $sum: "$items.qty" }
     }
   }
] )
For each unique cust_id, sum the corresponding line item qty fields associated with the orders.
SELECT COUNT(*)
FROM (SELECT cust_id,
             ord_date
      FROM orders
      GROUP BY cust_id,
               ord_date)
      as DerivedTable
db.orders.aggregate( [
   {
     $group: {
        _id: {
           cust_id: "$cust_id",
           ord_date: {
               month: { $month: "$ord_date" },
               day: { $dayOfMonth: "$ord_date" },
               year: { $year: "$ord_date"}
           }
        }
     }
   },
   {
     $group: {
        _id: null,
        count: { $sum: 1 }
     }
   }
] )
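To make the $group and $sum pairing used throughout these examples more concrete, the following sketch emulates it in plain JavaScript on an in-memory array. It is for illustration only: groupSum and the sample orders are made up, and MongoDB evaluates the real pipeline on the server rather than like this.

```javascript
// Plain-JavaScript emulation of
//   { $group: { _id: "$cust_id", total: { $sum: "$price" } } }
function groupSum(docs, keyField, sumField) {
  const totals = new Map();
  for (const doc of docs) {
    const key = doc[keyField];
    totals.set(key, (totals.get(key) || 0) + doc[sumField]);
  }
  // One output document per distinct key, in first-seen order.
  return Array.from(totals, ([_id, total]) => ({ _id, total }));
}

// Hypothetical orders that follow the prototype used in the examples.
const sampleOrders = [
  { cust_id: "abc123", price: 50 },
  { cust_id: "abc123", price: 25 },
  { cust_id: "xyz789", price: 100 },
];

const grouped = groupSum(sampleOrders, "cust_id", "price");
console.log(grouped);

// Counterpart of HAVING total > 80, i.e. a $match stage after $group.
const filtered = grouped.filter((g) => g.total > 80);
console.log(filtered);
```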



MongoDB explain() Query Analyzer and Its Verbosity

First creating 1 million documents: > for(i=0; i<100; i++) { for(j=0; j<100; j++) {x = []; for(k=0; k<100; k++) { x.push({a:...