Wednesday, July 6, 2016

Setting Up a Replica Set for Testing - MongoDB

Step 1: Create the necessary data directories for each member by issuing a command similar to the following:
mkdir -p /srv/mongodb/rs0-0 /srv/mongodb/rs0-1 /srv/mongodb/rs0-2

This will create directories called “rs0-0”, “rs0-1”, and “rs0-2”, which will contain the instances’ database files. (The mkdir -p command above is for Linux; on Windows, which is what the rest of this walkthrough uses, create the equivalent directories, for example D:\MongoDB\rs0-0, D:\MongoDB\rs0-1, and D:\MongoDB\rs0-2.)

Step 2:
Start your mongod instances in their own shell windows by issuing the following commands:

First member:
mongod --storageEngine=mmapv1 --port 27017 --dbpath D:\MongoDB\rs0-0 --replSet rs0 --smallfiles --oplogSize 128

Second member:
mongod --storageEngine=mmapv1 --port 27018 --dbpath D:\MongoDB\rs0-1 --replSet rs0 --smallfiles --oplogSize 128

Third member:
mongod --storageEngine=mmapv1 --port 27019 --dbpath D:\MongoDB\rs0-2 --replSet rs0 --smallfiles --oplogSize 128

This starts each instance as a member of a replica set named rs0, each running on a distinct port, and specifies the path to your data directory with the --dbpath setting. If you are already using the suggested ports, select different ports.

The --smallfiles and --oplogSize settings reduce the disk space that each mongod instance uses; --smallfiles is an MMAPv1 option, which is why --storageEngine=mmapv1 is specified above. These settings are suitable for testing and development deployments because they keep the instances from consuming too much disk space on a single machine.


Step 3: Connect to one of your mongod instances through the mongo shell. You will need to indicate which instance by specifying its port number. For the sake of simplicity and clarity, you may want to choose the first one, as in the following command:

mongo --port 27017

Step 4: In the mongo shell, use rs.initiate() to initiate the replica set. You can create a replica set configuration object in the mongo shell environment, as in the following example:
rsconf = {
           _id: "rs0",
           members: [
                      {
                       _id: 0,
                       host: "<hostname>:27017"
                      }
                    ]
         }

If you initiate the replica set without a configuration object (a plain rs.initiate()), the first member is registered under the machine's hostname, and adding the other members as localhost references then fails with the following error:
rs0:PRIMARY> rs.add("localhost:27018")
{
        "ok" : 0,
        "errmsg" : "Either all host names in a replica set configuration must be localhost references, or none must be; found 1 out of 2",
        "code" : 103
}

To make sure all members use localhost references, we build a configuration object:

rs0:PRIMARY> rsconf = {
...            _id: "rs0",
...            members: [
...                       {
...                        _id: 0,
...                        host: "localhost:27017"
...                       }
...                     ]
...          }
{
        "_id" : "rs0",
        "members" : [
                {
                        "_id" : 0,
                        "host" : "localhost:27017"
                }
        ]
}

Normally you would replace <hostname> with your system’s hostname and pass the rsconf object to rs.initiate(). In this case, however, rs.initiate() has already been run without a configuration object, so calling it again fails:

rs0:PRIMARY> rs.initiate(rsconf)
{
        "info" : "try querying local.system.replset to see current configuration",
        "ok" : 0,
        "errmsg" : "already initialized",
        "code" : 23
}

This is because rs.initiate() has already been executed. To make the replica set adopt the new configuration object, use rs.reconfig() instead:

rs0:PRIMARY> rs.reconfig(rsconf)
{ "ok" : 1 }

For reference, when rs.initiate() was originally run without a configuration object, the following messages appeared in the instance log, showing the node's transition to PRIMARY:
2016-06-19T16:31:31.121+0530 I REPL     [ReplicationExecutor] New replica set config in use: { _id: "rs0", version: 1, protocolVersion: 1, members: [ { _id: 0, host: "himanshu-PC:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 } ], settings: { chainingAllowed: true, heartbeatIntervalMillis: 2000, heartbeatTimeoutSecs: 10, electionTimeoutMillis: 10000, getLastErrorModes: {}, getLastErrorDefaults: { w: 1, wtimeout: 0 } } }
2016-06-19T16:31:31.121+0530 I REPL     [ReplicationExecutor] This node is himanshu-PC:27017 in the config
2016-06-19T16:31:31.123+0530 I REPL     [ReplicationExecutor] transition to STARTUP2
2016-06-19T16:31:31.126+0530 I REPL     [conn2] Starting replication applier threads
2016-06-19T16:31:31.131+0530 I COMMAND  [conn2] command local.oplog.rs command: replSetInitiate { replSetInitiate: undefined } keyUpdates:0 writeConflicts:0 numYields:0 reslen:129 locks:{ Global: { acquireCount: { r: 5, w: 3, W: 2 }, acquireWaitCount: { W: 1 }, timeAcquiringMicros: { W: 2679 } }, MMAPV1Journal: { acquireCount: { w: 4 } }, Database: { acquireCount: { w: 2, W: 1 } }, Metadata: { acquireCount: { w: 1, W: 3 } }, oplog: { acquireCount: { W: 2 } } } protocol:op_command 252ms
2016-06-19T16:31:31.142+0530 I REPL     [ReplicationExecutor] transition to RECOVERING
2016-06-19T16:31:31.162+0530 I REPL     [ReplicationExecutor] transition to SECONDARY
2016-06-19T16:31:31.163+0530 I REPL     [ReplicationExecutor] conducting a dry run election to see if we could be elected
2016-06-19T16:31:31.164+0530 I REPL     [ReplicationExecutor] dry election run succeeded, running for election
2016-06-19T16:31:31.169+0530 I REPL     [ReplicationExecutor] election succeeded, assuming primary role in term 1
2016-06-19T16:31:31.170+0530 I REPL     [ReplicationExecutor] transition to PRIMARY
2016-06-19T16:31:32.167+0530 I REPL     [rsSync] transition to primary complete; database writes are now permitted

After changing the configuration with rs.reconfig(rsconf), the following messages appear in the primary's instance log:
2016-06-19T16:49:41.451+0530 I REPL     [conn3] replSetReconfig admin command received from client
2016-06-19T16:49:41.455+0530 I REPL     [conn3] replSetReconfig config object with 1 members parses ok
2016-06-19T16:49:41.490+0530 I REPL     [ReplicationExecutor] New replica set config in use: { _id: "rs0", version: 2, protocolVersion: 1, members: [ { _id: 0, host: "localhost:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 } ], settings: { chainingAllowed: true, heartbeatIntervalMillis: 2000, heartbeatTimeoutSecs: 10, electionTimeoutMillis: 10000, getLastErrorModes: {}, getLastErrorDefaults: { w: 1, wtimeout: 0 } } }
2016-06-19T16:49:41.491+0530 I REPL     [ReplicationExecutor] This node is localhost:27017 in the config


Until the secondary nodes are added to the set, the following message keeps appearing in their instance logs:
2016-06-19T16:50:06.595+0530 W REPL     [rsSync] did not receive a valid config yet

Step 5:
In the mongo shell connected to the primary, add the second and third mongod instances to the replica set using the rs.add() method. The generic form replaces <hostname> with your system’s hostname; because this set was configured with localhost references, the members are added as localhost here:
rs.add("<hostname>:27018")
rs.add("<hostname>:27019")

rs0:PRIMARY> rs.add("localhost:27018")
{ "ok" : 1 }
rs0:PRIMARY>
rs0:PRIMARY> rs.add("localhost:27019")
{ "ok" : 1 }
rs0:PRIMARY>
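At this point all three members should be up. A quick way to confirm each member's state from the shell, without reading the full rs.status() output shown later, is to map over its members array; a minimal sketch:

rs0:PRIMARY> rs.status().members.map(function(m) { return m.name + " : " + m.stateStr; })

This should list localhost:27017 as PRIMARY and the other two members as SECONDARY once their initial sync completes.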

After running rs.add("localhost:27018"), the following messages appeared in the instance log of the primary (port 27017):
2016-06-19T16:50:06.528+0530 I REPL     [conn3] replSetReconfig admin command received from client
2016-06-19T16:50:06.550+0530 I REPL     [conn3] replSetReconfig config object with 2 members parses ok
2016-06-19T16:50:06.557+0530 I ASIO     [NetworkInterfaceASIO-0] Successfully connected to localhost:27018
2016-06-19T16:50:06.561+0530 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:54229 #4 (2 connections now open)
2016-06-19T16:50:06.567+0530 I REPL     [ReplicationExecutor] New replica set config in use: { _id: "rs0", version: 3, protocolVersion: 1, members: [ { _id: 0, host: "localhost:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 1, host: "localhost:27018", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 } ], settings: { chainingAllowed: true, heartbeatIntervalMillis: 2000, heartbeatTimeoutSecs: 10, electionTimeoutMillis: 10000, getLastErrorModes: {}, getLastErrorDefaults: { w: 1, wtimeout: 0 } } }
2016-06-19T16:50:06.568+0530 I REPL     [ReplicationExecutor] This node is localhost:27017 in the config
2016-06-19T16:50:06.580+0530 I REPL     [ReplicationExecutor] Member localhost:27018 is now in state STARTUP
2016-06-19T16:50:06.582+0530 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:54230 #5 (3 connections now open)
2016-06-19T16:50:06.592+0530 I NETWORK  [conn5] end connection 127.0.0.1:54230 (2 connections now open)
2016-06-19T16:50:07.848+0530 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:54232 #6 (3 connections now open)
2016-06-19T16:50:07.921+0530 I NETWORK  [conn6] end connection 127.0.0.1:54232 (2 connections now open)
2016-06-19T16:50:08.580+0530 I REPL     [ReplicationExecutor] Member localhost:27018 is now in state SECONDARY

Similarly, when rs.add("localhost:27019") is executed, the following messages appear:
2016-06-19T16:50:10.872+0530 I REPL     [conn3] replSetReconfig admin command received from client
2016-06-19T16:50:10.882+0530 I REPL     [conn3] replSetReconfig config object with 3 members parses ok
2016-06-19T16:50:10.884+0530 I REPL     [ReplicationExecutor] New replica set config in use: { _id: "rs0", version: 4, protocolVersion: 1, members: [ { _id: 0, host: "localhost:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 1, host: "localhost:27018", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 2, host: "localhost:27019", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 } ], settings: { chainingAllowed: true, heartbeatIntervalMillis: 2000, heartbeatTimeoutSecs: 10, electionTimeoutMillis: 10000, getLastErrorModes: {}, getLastErrorDefaults: { w: 1, wtimeout: 0 } } }
2016-06-19T16:50:10.900+0530 I ASIO     [NetworkInterfaceASIO-0] Successfully connected to localhost:27019
2016-06-19T16:50:10.900+0530 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:54238 #8 (3 connections now open)
2016-06-19T16:50:10.902+0530 I REPL     [ReplicationExecutor] Member localhost:27019 is now in state STARTUP
2016-06-19T16:50:10.903+0530 I ASIO     [NetworkInterfaceASIO-0] Successfully connected to localhost:27019
2016-06-19T16:50:10.911+0530 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:54239 #9 (4 connections now open)
2016-06-19T16:50:10.947+0530 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:54241 #10 (5 connections now open)
2016-06-19T16:50:10.959+0530 I NETWORK  [conn10] end connection 127.0.0.1:54241 (4 connections now open)
2016-06-19T16:50:11.613+0530 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:54246 #11 (5 connections now open)
2016-06-19T16:50:11.616+0530 I NETWORK  [conn11] end connection 127.0.0.1:54246 (4 connections now open)
2016-06-19T16:50:11.619+0530 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:54247 #12 (5 connections now open)
2016-06-19T16:50:11.626+0530 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:54248 #13 (6 connections now open)
2016-06-19T16:50:12.902+0530 I REPL     [ReplicationExecutor] Member localhost:27019 is now in state STARTUP2
2016-06-19T16:50:16.902+0530 I REPL     [ReplicationExecutor] Member localhost:27019 is now in state SECONDARY

At the same time, the following messages appear in the instance log of the new secondary (port 27018):
2016-06-19T16:50:06.533+0530 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:54227 #1 (1 connection now open)
2016-06-19T16:50:06.550+0530 I NETWORK  [conn1] end connection 127.0.0.1:54227 (0 connections now open)
2016-06-19T16:50:06.552+0530 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:54228 #2 (1 connection now open)
2016-06-19T16:50:06.578+0530 I ASIO     [NetworkInterfaceASIO-0] Successfully connected to localhost:27017
2016-06-19T16:50:06.595+0530 I REPL     [replExecDBWorker-2] Starting replication applier threads
2016-06-19T16:50:06.595+0530 W REPL     [rsSync] did not receive a valid config yet
2016-06-19T16:50:06.596+0530 I REPL     [ReplicationExecutor] New replica set config in use: { _id: "rs0", version: 3, protocolVersion: 1, members: [ { _id: 0, host: "localhost:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 1, host: "localhost:27018", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 } ], settings: { chainingAllowed: true, heartbeatIntervalMillis: 2000, heartbeatTimeoutSecs: 10, electionTimeoutMillis: 10000, getLastErrorModes: {}, getLastErrorDefaults: { w: 1, wtimeout: 0 } } }
2016-06-19T16:50:06.596+0530 I REPL     [ReplicationExecutor] This node is localhost:27018 in the config
2016-06-19T16:50:06.596+0530 I REPL     [ReplicationExecutor] transition to STARTUP2
2016-06-19T16:50:06.597+0530 I REPL     [ReplicationExecutor] Member localhost:27017 is now in state PRIMARY
2016-06-19T16:50:07.596+0530 I REPL     [rsSync] ******
2016-06-19T16:50:07.596+0530 I REPL     [rsSync] creating replication oplog of size: 128MB...
2016-06-19T16:50:07.596+0530 I STORAGE  [FileAllocator] allocating new datafile D:\MongoDB\rs0-1\local.1, filling with zeroes...
2016-06-19T16:50:07.608+0530 I STORAGE  [FileAllocator] done allocating datafile D:\MongoDB\rs0-1\local.1, size: 256MB,  took 0.01 secs
2016-06-19T16:50:07.820+0530 I REPL     [rsSync] ******
2016-06-19T16:50:07.821+0530 I REPL     [rsSync] initial sync pending
2016-06-19T16:50:07.846+0530 I REPL     [ReplicationExecutor] syncing from: localhost:27017
2016-06-19T16:50:07.880+0530 I REPL     [rsSync] initial sync drop all databases
2016-06-19T16:50:07.881+0530 I STORAGE  [rsSync] dropAllDatabasesExceptLocal 1
2016-06-19T16:50:07.882+0530 I REPL     [rsSync] initial sync clone all databases
2016-06-19T16:50:07.892+0530 I REPL     [rsSync] initial sync data copy, starting syncup
2016-06-19T16:50:07.893+0530 I REPL     [rsSync] oplog sync 1 of 3
2016-06-19T16:50:07.895+0530 I REPL     [rsSync] oplog sync 2 of 3
2016-06-19T16:50:07.898+0530 I REPL     [rsSync] initial sync building indexes
2016-06-19T16:50:07.901+0530 I REPL     [rsSync] oplog sync 3 of 3
2016-06-19T16:50:07.907+0530 I REPL     [rsSync] initial sync finishing up
2016-06-19T16:50:07.907+0530 I REPL     [rsSync] set minValid=(term: 1, timestamp: Jun 19 16:50:06:1)
2016-06-19T16:50:07.920+0530 I REPL     [rsSync] initial sync done
2016-06-19T16:50:07.921+0530 I REPL     [ReplicationExecutor] transition to RECOVERING
2016-06-19T16:50:07.940+0530 I REPL     [ReplicationExecutor] transition to SECONDARY

The full replica set configuration can now be checked with rs.conf():

rs0:PRIMARY> rs.conf()
{
        "_id" : "rs0",
        "version" : 4,
        "protocolVersion" : NumberLong(1),
        "members" : [
                {
                        "_id" : 0,
                        "host" : "localhost:27017",
                        "arbiterOnly" : false,
                        "buildIndexes" : true,
                        "hidden" : false,
                        "priority" : 1,
                        "tags" : {

                        },
                        "slaveDelay" : NumberLong(0),
                        "votes" : 1
                },
                {
                        "_id" : 1,
                        "host" : "localhost:27018",
                        "arbiterOnly" : false,
                        "buildIndexes" : true,
                        "hidden" : false,
                        "priority" : 1,
                        "tags" : {

                        },
                        "slaveDelay" : NumberLong(0),
                        "votes" : 1
                },
                {
                        "_id" : 2,
                        "host" : "localhost:27019",
                        "arbiterOnly" : false,
                        "buildIndexes" : true,
                        "hidden" : false,
                        "priority" : 1,
                        "tags" : {

                        },
                        "slaveDelay" : NumberLong(0),
                        "votes" : 1
                }
        ],
        "settings" : {
                "chainingAllowed" : true,
                "heartbeatIntervalMillis" : 2000,
                "heartbeatTimeoutSecs" : 10,
                "electionTimeoutMillis" : 10000,
                "getLastErrorModes" : {

                },
                "getLastErrorDefaults" : {
                        "w" : 1,
                        "wtimeout" : 0
                }
        }
}
rs0:PRIMARY>

Now I can perform write operations on the primary, and every collection and document written there will be replicated to the secondary instances as well. Note, however, that NO WRITE OPERATIONS ARE ALLOWED ON A SECONDARY.

rs0:PRIMARY> show dbs
local  0.281GB

rs0:PRIMARY> use testing
switched to db testing
rs0:PRIMARY>
rs0:PRIMARY> db.coll1.insert({"name":"Himanshu Karki", "age":28, "weight":72})
WriteResult({ "nInserted" : 1 })
rs0:PRIMARY>

We have created a new database "testing" on the primary, added a collection "coll1", and inserted one document into it.

Connecting to one of the secondaries shows that reads are rejected by default and writes are always rejected:

rs0:SECONDARY> db.coll1.find()
Error: error: { "ok" : 0, "errmsg" : "not master and slaveOk=false", "code" : 13435 }
rs0:SECONDARY>
rs0:SECONDARY> use testing
switched to db testing
rs0:SECONDARY>
rs0:SECONDARY> db.coll1.insert({"name":"Rohit Karki", "age":34, "weight":62})
WriteResult({ "writeError" : { "code" : 10107, "errmsg" : "not master" } })

However, if we look at the dbpath directories supplied when starting these instances, we can see the replication at the file level: when the first document was inserted (creating the collection), the same directory structure and files appeared in the dbpath of each secondary, mirroring the primary. While this was happening, the following messages appeared in the instance log of a secondary (here the one using D:\MongoDB\rs0-2) as the database "testing" and the collection "coll1" were created:
2016-06-19T17:05:52.112+0530 I INDEX    [repl writer worker 1] allocating new ns file D:\MongoDB\rs0-2\testing.ns, filling with zeroes...
2016-06-19T17:05:52.625+0530 I STORAGE  [FileAllocator] allocating new datafile D:\MongoDB\rs0-2\testing.0, filling with zeroes...
2016-06-19T17:05:52.648+0530 I STORAGE  [FileAllocator] done allocating datafile D:\MongoDB\rs0-2\testing.0, size: 16MB,  took 0 secs

The data directories of the primary (rs0-0) and the first secondary (rs0-1) now contain the same set of files:

 Directory of D:\MongoDB\rs0-0

06/19/2016  05:05 PM    <DIR>          .
06/19/2016  05:05 PM    <DIR>          ..
06/19/2016  06:43 PM    <DIR>          diagnostic.data
06/19/2016  04:28 PM        16,777,216 local.0
06/19/2016  04:31 PM       268,435,456 local.1
06/19/2016  04:28 PM        16,777,216 local.ns
06/19/2016  04:28 PM                 5 mongod.lock
06/19/2016  04:28 PM                69 storage.bson
06/19/2016  05:05 PM        16,777,216 testing.0
06/19/2016  05:05 PM        16,777,216 testing.ns
06/19/2016  05:05 PM    <DIR>          _tmp
               7 File(s)    335,544,394 bytes
               4 Dir(s)  34,869,489,664 bytes free

 Directory of D:\MongoDB\rs0-1

06/19/2016  05:05 PM    <DIR>          .
06/19/2016  05:05 PM    <DIR>          ..
06/19/2016  06:43 PM    <DIR>          diagnostic.data
06/19/2016  04:28 PM        16,777,216 local.0
06/19/2016  04:50 PM       268,435,456 local.1
06/19/2016  04:28 PM        16,777,216 local.ns
06/19/2016  04:28 PM                 5 mongod.lock
06/19/2016  04:28 PM                69 storage.bson
06/19/2016  05:05 PM        16,777,216 testing.0
06/19/2016  05:05 PM        16,777,216 testing.ns
06/19/2016  05:05 PM    <DIR>          _tmp
               7 File(s)    335,544,394 bytes
               4 Dir(s)  34,869,489,664 bytes free
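Before experimenting with failover, it is also worth confirming that the secondaries are caught up. Run from the primary, the following helper reports, for each secondary, the last oplog entry it has synced and how far behind the primary it is (output omitted here):

rs0:PRIMARY> rs.printSlaveReplicationInfo()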

Next, let us see what happens when the primary is taken out of service.
   
Step 7: Perform maintenance on the primary last.
To perform maintenance on the primary after completing maintenance tasks on all secondaries, use rs.stepDown() in the mongo shell to step down the primary and allow one of the secondaries to be elected the new primary. Specify a 300 second waiting period to prevent the member from being elected primary again for five minutes:

rs.stepDown(300)

After the primary steps down, the replica set will elect a new primary.

On the current primary (here rs.stepDown() is called without an argument, so the default 60-second step-down period applies):
rs0:PRIMARY> rs.stepDown()

2016-06-19T23:41:37.219+0530 E QUERY    [thread1] Error: error doing query: failed: network error while attempting to run command 'replSetStepDown' on host '127.0.0.1:27017'  :
DB.prototype.runCommand@src/mongo/shell/db.js:132:1
DB.prototype.adminCommand@src/mongo/shell/db.js:149:12
rs.stepDown@src/mongo/shell/utils.js:1080:12
@(shell):1:1

2016-06-19T23:41:37.280+0530 I NETWORK  [thread1] trying reconnect to 127.0.0.1:27017 (127.0.0.1) failed
2016-06-19T23:41:37.343+0530 I NETWORK  [thread1] reconnect 127.0.0.1:27017 (127.0.0.1) ok
rs0:SECONDARY>

This shows that the node is immediately demoted to SECONDARY; the network error in the shell is expected, because stepping down closes existing client connections. Checking the instance log of the old primary (port 27017):
2016-06-19T23:41:37.016+0530 I COMMAND  [conn3] Attempting to step down in response to replSetStepDown command
2016-06-19T23:41:37.066+0530 I REPL     [ReplicationExecutor] transition to SECONDARY
2016-06-19T23:41:37.067+0530 I NETWORK  [conn8] end connection 127.0.0.1:54238 (6 connections now open)
2016-06-19T23:41:37.068+0530 I NETWORK  [conn4] end connection 127.0.0.1:54229 (6 connections now open)
2016-06-19T23:41:37.068+0530 I NETWORK  [conn12] end connection 127.0.0.1:54247 (6 connections now open)
2016-06-19T23:41:37.068+0530 I NETWORK  [conn14] end connection 127.0.0.1:54881 (6 connections now open)
2016-06-19T23:41:37.068+0530 I NETWORK  [conn9] end connection 127.0.0.1:54239 (6 connections now open)
2016-06-19T23:41:37.124+0530 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:58418 #15 (3 connections now open)
2016-06-19T23:41:37.190+0530 I NETWORK  [conn3] SocketException handling request, closing client connection: 9001 socket exception [SEND_ERROR] server [127.0.0.1:54169]
2016-06-19T23:41:37.331+0530 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:58419 #16 (3 connections now open)
2016-06-19T23:41:37.912+0530 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:58420 #17 (4 connections now open)
2016-06-19T23:41:40.702+0530 I NETWORK  [conn13] SocketException handling request, closing client connection: 9001 socket exception [SEND_ERROR] server [127.0.0.1:54248]
2016-06-19T23:41:47.854+0530 I REPL     [ReplicationExecutor] Member localhost:27018 is now in state PRIMARY
2016-06-19T23:41:48.276+0530 I REPL     [ReplicationExecutor] syncing from: localhost:27018
2016-06-19T23:41:48.282+0530 I REPL     [SyncSourceFeedback] setting syncSourceFeedback to localhost:27018
2016-06-19T23:41:48.323+0530 I ASIO     [NetworkInterfaceASIO-0] Successfully connected to localhost:27018

Messages in the instance log of the secondary on port 27018, which becomes the new primary:
2016-06-19T23:41:37.068+0530 I REPL     [ReplicationExecutor] could not find member to sync from
2016-06-19T23:41:37.077+0530 W REPL     [ReplicationExecutor] The liveness timeout does not match callback handle, so not resetting it.
2016-06-19T23:41:37.079+0530 I ASIO     [ReplicationExecutor] dropping unhealthy pooled connection to localhost:27017
2016-06-19T23:41:37.082+0530 I ASIO     [ReplicationExecutor] dropping unhealthy pooled connection to localhost:27017
2016-06-19T23:41:37.082+0530 I ASIO     [ReplicationExecutor] after drop, pool was empty, going to spawn some connections
2016-06-19T23:41:37.133+0530 I ASIO     [NetworkInterfaceASIO-0] Successfully connected to localhost:27017
2016-06-19T23:41:37.134+0530 I REPL     [ReplicationExecutor] Member localhost:27017 is now in state SECONDARY
2016-06-19T23:41:45.006+0530 I NETWORK  [conn10] end connection 127.0.0.1:54388 (4 connections now open)
2016-06-19T23:41:46.642+0530 I REPL     [ReplicationExecutor] Starting an election, since we've seen no PRIMARY in the past 10000ms
2016-06-19T23:41:46.649+0530 I REPL     [ReplicationExecutor] conducting a dry run election to see if we could be elected
2016-06-19T23:41:46.674+0530 I REPL     [ReplicationExecutor] dry election run succeeded, running for election
2016-06-19T23:41:46.682+0530 I REPL     [ReplicationExecutor] election succeeded, assuming primary role in term 2
2016-06-19T23:41:46.682+0530 I REPL     [ReplicationExecutor] transition to PRIMARY
2016-06-19T23:41:46.684+0530 W REPL     [ReplicationExecutor] The liveness timeout does not match callback handle, so not resetting it.
2016-06-19T23:41:47.080+0530 I REPL     [rsSync] transition to primary complete; database writes are now permitted
2016-06-19T23:41:48.278+0530 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:58422 #12 (5 connections now open)

Messages in the instance log of the secondary on port 27019, which remains a secondary with no change in its state:
2016-06-19T23:41:37.902+0530 I ASIO     [ReplicationExecutor] dropping unhealthy pooled connection to localhost:27017
2016-06-19T23:41:37.910+0530 I ASIO     [ReplicationExecutor] after drop, pool was empty, going to spawn some connections
2016-06-19T23:41:37.914+0530 I ASIO     [NetworkInterfaceASIO-0] Successfully connected to localhost:27017
2016-06-19T23:41:37.915+0530 I REPL     [ReplicationExecutor] Member localhost:27017 is now in state SECONDARY
2016-06-19T23:41:40.658+0530 I REPL     [ReplicationExecutor] could not find member to sync from
2016-06-19T23:41:40.662+0530 W REPL     [ReplicationExecutor] The liveness timeout does not match callback handle, so not resetting it.
2016-06-19T23:41:50.663+0530 I REPL     [ReplicationExecutor] Member localhost:27018 is now in state PRIMARY
2016-06-19T23:41:50.665+0530 I REPL     [ReplicationExecutor] syncing from: localhost:27018
2016-06-19T23:41:50.668+0530 I REPL     [SyncSourceFeedback] setting syncSourceFeedback to localhost:27018
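After the failover, a quick way to confirm which member is currently primary, without reading the logs, is to ask any connected shell; at this point it should report localhost:27018:

rs0:SECONDARY> db.isMaster().primary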

Step 8: Accidental primary shutdown.
I accidentally shut down the primary while trying to shut down a secondary. The following command was executed on the new primary (port 27018):

rs0:PRIMARY> use admin
switched to db admin
rs0:PRIMARY>
rs0:PRIMARY>
rs0:PRIMARY> db.shutdownServer()
server should be down...
2016-06-19T23:46:57.065+0530 I NETWORK  [thread1] trying reconnect to 127.0.0.1:27018 (127.0.0.1) failed
2016-06-19T23:46:57.079+0530 I NETWORK  [thread1] reconnect 127.0.0.1:27018 (127.0.0.1) ok
rs0:SECONDARY>
2016-06-19T23:47:08.997+0530 I NETWORK  [thread1] Socket recv() errno:10053 An established connection was aborted by the software in your host machine. 127.0.0.1:27018
2016-06-19T23:47:08.997+0530 I NETWORK  [thread1] SocketException: remote: (NONE):0 error: 9001 socket exception [RECV_ERROR] server [127.0.0.1:27018]
2016-06-19T23:47:08.999+0530 I NETWORK  [thread1] trying reconnect to 127.0.0.1:27018 (127.0.0.1) failed
2016-06-19T23:47:10.013+0530 W NETWORK  [thread1] Failed to connect to 127.0.0.1:27018, reason: errno:10061 No connection could be made because the target machine actively refused it.
2016-06-19T23:47:10.014+0530 I NETWORK  [thread1] reconnect 127.0.0.1:27018 (127.0.0.1) failed failed
>

Here is what happened on the remaining servers:
Server 3 (port 27019) automatically became the new primary within about ten seconds, once the 10000 ms election timeout elapsed. Messages in the instance log of server 3 (port 27019):

2016-06-19T23:46:57.064+0530 I ASIO     [ReplicationExecutor] dropping unhealthy pooled connection to localhost:27018
2016-06-19T23:46:57.064+0530 I ASIO     [ReplicationExecutor] after drop, pool was empty, going to spawn some connections
2016-06-19T23:46:57.090+0530 I ASIO     [NetworkInterfaceASIO-0] Successfully connected to localhost:27018
2016-06-19T23:46:57.091+0530 I REPL     [ReplicationExecutor] Member localhost:27018 is now in state SECONDARY
2016-06-19T23:46:59.139+0530 I NETWORK  [conn5] end connection 127.0.0.1:54242 (2 connections now open)
2016-06-19T23:47:02.091+0530 I ASIO     [ReplicationExecutor] dropping unhealthy pooled connection to localhost:27018
2016-06-19T23:47:02.092+0530 I ASIO     [ReplicationExecutor] after drop, pool was empty, going to spawn some connections
2016-06-19T23:47:03.093+0530 I REPL     [ReplicationExecutor] Error in heartbeat request to localhost:27018; HostUnreachable No connection could be made because the target machine actively refused it.
2016-06-19T23:47:04.095+0530 I REPL     [ReplicationExecutor] Error in heartbeat request to localhost:27018; HostUnreachable No connection could be made because the target machine actively refused it.
2016-06-19T23:47:05.096+0530 I REPL     [ReplicationExecutor] Error in heartbeat request to localhost:27018; HostUnreachable No connection could be made because the target machine actively refused it.
2016-06-19T23:47:06.463+0530 I REPL     [ReplicationExecutor] Starting an election, since we've seen no PRIMARY in the past 10000ms
2016-06-19T23:47:06.463+0530 I REPL     [ReplicationExecutor] conducting a dry run election to see if we could be elected
2016-06-19T23:47:06.464+0530 I REPL     [ReplicationExecutor] dry election run succeeded, running for election
2016-06-19T23:47:06.465+0530 I REPL     [ReplicationExecutor] election succeeded, assuming primary role in term 3
2016-06-19T23:47:06.474+0530 I REPL     [ReplicationExecutor] transition to PRIMARY
2016-06-19T23:47:07.065+0530 I REPL     [rsSync] transition to primary complete; database writes are now permitted

The new primary kept logging the following messages until server 2 (port 27018) was restarted:
2016-06-19T23:47:08.463+0530 I REPL     [ReplicationExecutor] Error in heartbeat request to localhost:27018; HostUnreachable No connection could be made because the target machine actively refused it.
2016-06-19T23:47:09.464+0530 I REPL     [ReplicationExecutor] Error in heartbeat request to localhost:27018; HostUnreachable No connection could be made because the target machine actively refused it.
2016-06-19T23:47:12.465+0530 I REPL     [ReplicationExecutor] Error in heartbeat request to localhost:27018; HostUnreachable No connection could be made because the target machine actively refused it.
2016-06-19T23:47:13.466+0530 I REPL     [ReplicationExecutor] Error in heartbeat request to localhost:27018; HostUnreachable No connection could be made because the target machine actively refused it.
2016-06-19T23:47:14.467+0530 I REPL     [ReplicationExecutor] Error in heartbeat request to localhost:27018; HostUnreachable No connection could be made because the target machine actively refused it.
 
When server 2 (port 27018) is started again, it rejoins the replica set as a secondary:
D:\MongoDB\bin>mongod --storageEngine=mmapv1 --port 27018 --dbpath D:\MongoDB\rs0-1 --replSet rs0 --smallfiles --oplogSize 128

2016-06-19T23:48:21.094+0530 I FTDC     [initandlisten] Initializing full-time diagnostic data capture with directory 'D:/MongoDB/rs0-1/diagnostic.data'
2016-06-19T23:48:21.101+0530 I NETWORK  [initandlisten] waiting for connections on port 27018
2016-06-19T23:48:21.126+0530 I REPL     [ReplicationExecutor] New replica set config in use: { _id: "rs0", version: 4, protocolVersion: 1, members: [ { _id: 0, host: "localhost:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 1, host: "localhost:27018", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 2, host: "localhost:27019", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 } ], settings: { chainingAllowed: true, heartbeatIntervalMillis: 2000, heartbeatTimeoutSecs: 10, electionTimeoutMillis: 10000, getLastErrorModes: {}, getLastErrorDefaults: { w: 1, wtimeout: 0 } } }
2016-06-19T23:48:21.130+0530 I REPL     [ReplicationExecutor] This node is localhost:27018 in the config
2016-06-19T23:48:21.131+0530 I REPL     [ReplicationExecutor] transition to STARTUP2
2016-06-19T23:48:21.144+0530 I REPL     [ReplicationExecutor] Starting replication applier threads
2016-06-19T23:48:21.150+0530 I REPL     [ReplicationExecutor] transition to RECOVERING
2016-06-19T23:48:21.164+0530 I REPL     [ReplicationExecutor] transition to SECONDARY
2016-06-19T23:48:21.172+0530 I ASIO     [NetworkInterfaceASIO-0] Successfully connected to localhost:27019
2016-06-19T23:48:21.172+0530 I ASIO     [NetworkInterfaceASIO-0] Successfully connected to localhost:27017
2016-06-19T23:48:21.174+0530 I REPL     [ReplicationExecutor] Member localhost:27019 is now in state PRIMARY
2016-06-19T23:48:21.178+0530 I REPL     [ReplicationExecutor] Member localhost:27017 is now in state SECONDARY

Step 10:
One by one, we press Ctrl+C on each server and observe what happens.

First, Ctrl+C on server 1 (port 27017):
2016-06-20T00:32:16.756+0530 I CONTROL  [thread1] Ctrl-C signal
2016-06-20T00:32:16.757+0530 I CONTROL  [consoleTerminate] got CTRL_C_EVENT, will terminate after current cmd ends
2016-06-20T00:32:16.758+0530 I FTDC     [consoleTerminate] Shutting down full-time diagnostic data capture
2016-06-20T00:32:16.761+0530 I REPL     [consoleTerminate] Stopping replication applier threads
2016-06-20T00:32:17.300+0530 I STORAGE  [conn19] got request after shutdown()
2016-06-20T00:32:18.466+0530 W EXECUTOR [rsBackgroundSync] killCursors command task failed: CallbackCanceled Callback canceled
2016-06-20T00:32:18.466+0530 I CONTROL  [consoleTerminate] now exiting
2016-06-20T00:32:18.466+0530 I NETWORK  [consoleTerminate] shutdown: going to close listening sockets...
2016-06-20T00:32:18.466+0530 I NETWORK  [consoleTerminate] closing listening socket: 396
2016-06-20T00:32:18.467+0530 I NETWORK  [consoleTerminate] shutdown: going to flush diaglog...
2016-06-20T00:32:18.467+0530 I NETWORK  [consoleTerminate] shutdown: going to close sockets...
2016-06-20T00:32:18.467+0530 I STORAGE  [consoleTerminate] shutdown: waiting for fs preallocator...
2016-06-20T00:32:18.467+0530 I NETWORK  [conn16] end connection 127.0.0.1:58419 (1 connection now open)
2016-06-20T00:32:18.468+0530 I STORAGE  [consoleTerminate] shutdown: closing all files...
2016-06-20T00:32:18.471+0530 I STORAGE  [consoleTerminate] closeAllFiles() finished
2016-06-20T00:32:18.471+0530 I STORAGE  [consoleTerminate] shutdown: removing fs lock...
2016-06-20T00:32:18.471+0530 I CONTROL  [consoleTerminate] dbexit:  rc: 12
2016-06-20T00:32:18.471+0530 I NETWORK  [conn17] end connection 127.0.0.1:58420 (0 connections now open)

On server 2 (port 27018), heartbeat errors for server 1 start appearing:
2016-06-20T00:32:17.301+0530 I REPL     [ReplicationExecutor] Error in heartbeat request to localhost:27017; HostUnreachable End of file
2016-06-20T00:32:18.467+0530 I REPL     [ReplicationExecutor] Error in heartbeat request to localhost:27017; HostUnreachable An existing connection was forcibly closed by the remote host.
2016-06-20T00:32:19.480+0530 I REPL     [ReplicationExecutor] Error in heartbeat request to localhost:27017; HostUnreachable No connection could be made because the target machine actively refused it.
2016-06-20T00:32:22.481+0530 I REPL     [ReplicationExecutor] Error in heartbeat request to localhost:27017; HostUnreachable No connection could be made because the target machine actively refused it.
2016-06-20T00:32:23.492+0530 I REPL     [ReplicationExecutor] Error in heartbeat request to localhost:27017; HostUnreachable No connection could be made because the target machine actively refused it.

Then, when Ctrl+C is pressed on server 2 as well:
2016-06-20T00:32:34.533+0530 I CONTROL  [consoleTerminate] got CTRL_C_EVENT, will terminate after current cmd ends
2016-06-20T00:32:34.533+0530 I FTDC     [consoleTerminate] Shutting down full-time diagnostic data capture
2016-06-20T00:32:34.536+0530 I REPL     [consoleTerminate] Stopping replication applier threads
2016-06-20T00:32:35.596+0530 I STORAGE  [conn2] got request after shutdown()
2016-06-20T00:32:37.616+0530 W EXECUTOR [rsBackgroundSync] killCursors command task failed: CallbackCanceled Callback canceled
2016-06-20T00:32:37.617+0530 I CONTROL  [consoleTerminate] now exiting
2016-06-20T00:32:37.617+0530 I NETWORK  [consoleTerminate] shutdown: going to close listening sockets...
2016-06-20T00:32:37.617+0530 I NETWORK  [consoleTerminate] closing listening socket: 396
2016-06-20T00:32:37.618+0530 I NETWORK  [consoleTerminate] shutdown: going to flush diaglog...
2016-06-20T00:32:37.618+0530 I NETWORK  [consoleTerminate] shutdown: going to close sockets...
2016-06-20T00:32:37.619+0530 I STORAGE  [consoleTerminate] shutdown: waiting for fs preallocator...
2016-06-20T00:32:37.619+0530 I STORAGE  [consoleTerminate] shutdown: closing all files...
2016-06-20T00:32:37.625+0530 I STORAGE  [consoleTerminate] closeAllFiles() finished
2016-06-20T00:32:37.625+0530 I STORAGE  [consoleTerminate] shutdown: removing fs lock...
2016-06-20T00:32:37.625+0530 I CONTROL  [consoleTerminate] dbexit:  rc: 12

On server 3 (port 27019), which is the current primary, heartbeat errors appear for both downed members:
2016-06-20T00:32:18.630+0530 I ASIO     [ReplicationExecutor] dropping unhealthy pooled connection to localhost:27017
2016-06-20T00:32:18.630+0530 I ASIO     [ReplicationExecutor] after drop, pool was empty, going to spawn some connections
2016-06-20T00:32:19.632+0530 I REPL     [ReplicationExecutor] Error in heartbeat request to localhost:27017; HostUnreachable No connection could be made because the target machine actively refused it.
2016-06-20T00:32:20.634+0530 I REPL     [ReplicationExecutor] Error in heartbeat request to localhost:27017; HostUnreachable No connection could be made because the target machine actively refused it.
2016-06-20T00:32:21.641+0530 I REPL     [ReplicationExecutor] Error in heartbeat request to localhost:27017; HostUnreachable No connection could be made because the target machine actively refused it.

2016-06-20T00:32:35.597+0530 I REPL     [ReplicationExecutor] Error in heartbeat request to localhost:27018; HostUnreachable End of file
2016-06-20T00:32:35.669+0530 I REPL     [ReplicationExecutor] Error in heartbeat request to localhost:27017; HostUnreachable No connection could be made because the target machine actively refused it.
2016-06-20T00:32:36.670+0530 I REPL     [ReplicationExecutor] Error in heartbeat request to localhost:27017; HostUnreachable No connection could be made because the target machine actively refused it.
2016-06-20T00:32:37.618+0530 I REPL     [ReplicationExecutor] Error in heartbeat request to localhost:27018; HostUnreachable An existing connection was forcibly closed by the remote host.

In addition, once server 3 can no longer see a majority of the set, it steps down from PRIMARY to SECONDARY:
2016-06-20T00:32:43.302+0530 I REPL     [ReplicationExecutor] can't see a majority of the set, relinquishing primary
2016-06-20T00:32:43.302+0530 I REPL     [ReplicationExecutor] Stepping down from primary in response to heartbeat
2016-06-20T00:32:43.303+0530 I REPL     [replExecDBWorker-2] transition to SECONDARY
2016-06-20T00:32:54.434+0530 I REPL     [ReplicationExecutor] Not starting an election, since we are not electable

Finally, Ctrl+C on the last remaining server (port 27019) as well:
2016-06-20T00:33:00.291+0530 I CONTROL  [thread1] Ctrl-C signal
2016-06-20T00:33:00.291+0530 I CONTROL  [consoleTerminate] got CTRL_C_EVENT, will terminate after current cmd ends
2016-06-20T00:33:00.292+0530 I FTDC     [consoleTerminate] Shutting down full-time diagnostic data capture
2016-06-20T00:33:00.302+0530 I REPL     [consoleTerminate] Stopping replication applier threads
2016-06-20T00:33:01.268+0530 I CONTROL  [consoleTerminate] now exiting
2016-06-20T00:33:01.268+0530 I NETWORK  [consoleTerminate] shutdown: going to close listening sockets...
2016-06-20T00:33:01.268+0530 I NETWORK  [consoleTerminate] closing listening socket: 396
2016-06-20T00:33:01.269+0530 I NETWORK  [consoleTerminate] shutdown: going to flush diaglog...
2016-06-20T00:33:01.270+0530 I NETWORK  [consoleTerminate] shutdown: going to close sockets...
2016-06-20T00:33:01.274+0530 I STORAGE  [consoleTerminate] shutdown: waiting for fs preallocator...
2016-06-20T00:33:01.274+0530 I STORAGE  [consoleTerminate] shutdown: closing all files...
2016-06-20T00:33:01.283+0530 I STORAGE  [consoleTerminate] closeAllFiles() finished
2016-06-20T00:33:01.283+0530 I STORAGE  [consoleTerminate] shutdown: removing fs lock...
2016-06-20T00:33:01.284+0530 I CONTROL  [consoleTerminate] dbexit:  rc: 12

Step 11: Adding an Arbiter:
Arbiters are mongod instances that are part of a replica set but do not hold data. Arbiters participate in elections in order to break ties. If a replica set has an even number of members, add an arbiter.

An arbiter does not store data, but until the arbiter’s mongod process is added to the replica set, the arbiter will act like any other mongod process and start up with a set of data files and with a full-sized journal.

To minimize the default creation of data, set the following in the arbiter’s configuration file:
* storage.journal.enabled to false
* For MMAPv1 storage engine, storage.mmapv1.smallFiles to true
WARNING
Never set storage.journal.enabled to false on a data-bearing node.

a). Start an instance for Arbiter on port 30000:

D:\MongoDB\bin>mongod --storageEngine=mmapv1 --nojournal --port 30000 --dbpath D:/mongodb/arb --replSet rs0 --smallfiles

b). Now connect to the PRIMARY instance (currently the server on port 27017):
D:\MongoDB\bin>mongo --port 27017

rs0:PRIMARY>

c). Now add the arbiter to the replica set:
rs0:PRIMARY> rs.addArb("localhost:30000")

The following messages appear in the logs of the existing instances:
2016-06-25T17:44:39.122+0530 I ASIO     [NetworkInterfaceASIO-0] Successfully connected to localhost:30000
2016-06-25T17:44:39.138+0530 I REPL     [ReplicationExecutor] Member localhost:30000 is now in state ARBITER
2016-06-25T17:44:39.139+0530 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:51041 #24 (3 connections now open)

d). The new configuration and member states can now be seen with rs.status():
rs0:PRIMARY> rs.status()
{
        "set" : "rs0",
        "date" : ISODate("2016-06-25T12:15:29.824Z"),
        "myState" : 1,
        "term" : NumberLong(5),
        "heartbeatIntervalMillis" : NumberLong(2000),
        "members" : [
                {
                        "_id" : 0,
                        "name" : "localhost:27017",
                        "health" : 1,
                        "state" : 1,
                        "stateStr" : "PRIMARY",
                        "uptime" : 4682,
                        "optime" : {
                                "ts" : Timestamp(1466856879, 1),
                                "t" : NumberLong(5)
                        },
                        "optimeDate" : ISODate("2016-06-25T12:14:39Z"),
                        "electionTime" : Timestamp(1466852516, 1),
                        "electionDate" : ISODate("2016-06-25T11:01:56Z"),
                        "configVersion" : 13,
                        "self" : true
                },
                {
                        "_id" : 1,
                        "name" : "localhost:27018",
                        "health" : 1,
                        "state" : 2,
                        "stateStr" : "SECONDARY",
                        "uptime" : 4422,
                        "optime" : {
                                "ts" : Timestamp(1466856879, 1),
                                "t" : NumberLong(5)
                        },
                        "optimeDate" : ISODate("2016-06-25T12:14:39Z"),
                        "lastHeartbeat" : ISODate("2016-06-25T12:15:29.021Z"),
                        "lastHeartbeatRecv" : ISODate("2016-06-25T12:15:29.119Z"),
                        "pingMs" : NumberLong(0),
                        "syncingTo" : "localhost:27017",
                        "configVersion" : 13
                },
                {
                        "_id" : 2,
                        "name" : "localhost:27019",
                        "health" : 1,
                        "state" : 2,
                        "stateStr" : "SECONDARY",
                        "uptime" : 4409,
                        "optime" : {
                                "ts" : Timestamp(1466856879, 1),
                                "t" : NumberLong(5)
                        },
                        "optimeDate" : ISODate("2016-06-25T12:14:39Z"),
                        "lastHeartbeat" : ISODate("2016-06-25T12:15:29.023Z"),
                        "lastHeartbeatRecv" : ISODate("2016-06-25T12:15:29.120Z"),
                        "pingMs" : NumberLong(0),
                        "syncingTo" : "localhost:27017",
                        "configVersion" : 13
                },
                {
                        "_id" : 3,
                        "name" : "localhost:30000",
                        "health" : 1,
                        "state" : 7,
                        "stateStr" : "ARBITER",
                        "uptime" : 50,
                        "lastHeartbeat" : ISODate("2016-06-25T12:15:29.024Z"),
                        "lastHeartbeatRecv" : ISODate("2016-06-25T12:15:29.138Z"),
                        "pingMs" : NumberLong(0),
                        "configVersion" : 13
                }
        ],
        "ok" : 1
}
rs0:PRIMARY>


Log messages from the arbiter instance (port 30000):
2016-06-25T17:44:06.834+0530 I NETWORK  [initandlisten] waiting for connections on port 30000
2016-06-25T17:44:39.004+0530 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:51027 #1 (1 connection now open)
2016-06-25T17:44:39.005+0530 I NETWORK  [conn1] end connection 127.0.0.1:51027 (0 connections now open)
2016-06-25T17:44:39.009+0530 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:51028 #2 (1 connection now open)
2016-06-25T17:44:39.050+0530 I ASIO     [NetworkInterfaceASIO-0] Successfully connected to localhost:27017
2016-06-25T17:44:39.069+0530 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:51035 #3 (2 connections now open)
2016-06-25T17:44:39.070+0530 I NETWORK  [conn3] end connection 127.0.0.1:51035 (1 connection now open)
2016-06-25T17:44:39.082+0530 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:51037 #4 (2 connections now open)
2016-06-25T17:44:39.084+0530 I NETWORK  [conn4] end connection 127.0.0.1:51037 (1 connection now open)
2016-06-25T17:44:39.121+0530 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:51039 #5 (2 connections now open)
2016-06-25T17:44:39.122+0530 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:51040 #6 (3 connections now open)
2016-06-25T17:44:39.125+0530 I REPL     [ReplicationExecutor] New replica set config in use: { _id: "rs0", version: 13, protocolVersion: 1, members: [ { _id: 0, host: "localhost:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 1, host: "localhost:27018", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 2, host: "localhost:27019", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 3, host: "localhost:30000", arbiterOnly: true, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 } ], settings: { chainingAllowed: true, heartbeatIntervalMillis: 2000, heartbeatTimeoutSecs: 10, electionTimeoutMillis: 10000, getLastErrorModes: {}, getLastErrorDefaults: { w: 1, wtimeout: 0 } } }
2016-06-25T17:44:39.125+0530 I REPL     [ReplicationExecutor] This node is localhost:30000 in the config
2016-06-25T17:44:39.136+0530 I REPL     [ReplicationExecutor] transition to ARBITER
2016-06-25T17:44:39.138+0530 I REPL     [ReplicationExecutor] Member localhost:27017 is now in state PRIMARY
2016-06-25T17:44:39.166+0530 I ASIO     [NetworkInterfaceASIO-0] Successfully connected to localhost:27018
2016-06-25T17:44:39.167+0530 I ASIO     [NetworkInterfaceASIO-0] Successfully connected to localhost:27019
2016-06-25T17:44:39.168+0530 I REPL     [ReplicationExecutor] Member localhost:27018 is now in state SECONDARY
2016-06-25T17:44:39.169+0530 I REPL     [ReplicationExecutor] Member localhost:27019 is now in state SECONDARY

e). To remove the arbiter (or any other member) from the replica set:
 rs.remove("localhost:30000")
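If you do remove a member, you can confirm it is gone by listing the hosts from the current configuration; a quick check from the primary:

rs0:PRIMARY> rs.conf().members.map(function(m) { return m.host; })

The removed host should no longer appear in the returned list.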

Step 12: Hidden Replica Set Member
A hidden member maintains a copy of the primary’s data set but is invisible to client applications. Hidden members are good for workloads with different usage patterns from the other members in the replica set. Hidden members must always be priority 0 members and so cannot become primary. The db.isMaster() method does not display hidden members. Hidden members, however, may vote in elections.

Clients will not distribute reads with the appropriate read preference to hidden members. As a result, these members receive no traffic other than basic replication. Use hidden members for dedicated tasks such as reporting and backups.

Steps for making a secondary hidden:
a). The current replica set configuration is shown above. Use the following commands to set a secondary's priority to 0 and mark it hidden:

rs0:PRIMARY> cfg = rs.conf()

cfg.members[1].priority = 0
cfg.members[1].hidden = true
rs.reconfig(cfg)

rs0:PRIMARY> cfg.members[1].priority = 0
0
rs0:PRIMARY> cfg.members[1].hidden = true
true
rs0:PRIMARY> rs.reconfig(cfg)
{ "ok" : 1 }
rs0:PRIMARY>

rs0:PRIMARY> rs.config()
{
        "_id" : "rs0",
        "version" : 14,
        "protocolVersion" : NumberLong(1),
        "members" : [
                {
                        "_id" : 0,
                        "host" : "localhost:27017",
                        "arbiterOnly" : false,
                        "buildIndexes" : true,
                        "hidden" : false,
                        "priority" : 1,
                        "tags" : {

                        },
                        "slaveDelay" : NumberLong(0),
                        "votes" : 1
                },
                {
                        "_id" : 1,
                        "host" : "localhost:27018",
                        "arbiterOnly" : false,
                        "buildIndexes" : true,
                        "hidden" : true,
                        "priority" : 0,
                        "tags" : {

                        },
                        "slaveDelay" : NumberLong(0),
                        "votes" : 1
                },
                {
                        "_id" : 2,
                        "host" : "localhost:27019",
                        "arbiterOnly" : false,
                        "buildIndexes" : true,
                        "hidden" : false,
                        "priority" : 1,
                        "tags" : {

                        },
                        "slaveDelay" : NumberLong(0),
                        "votes" : 1
                },
                {
                        "_id" : 3,
                        "host" : "localhost:30000",
                        "arbiterOnly" : true,
                        "buildIndexes" : true,
                        "hidden" : false,
                        "priority" : 1,
                        "tags" : {

                        },
                        "slaveDelay" : NumberLong(0),
                        "votes" : 1
                }
        ],
        "settings" : {
                "chainingAllowed" : true,
                "heartbeatIntervalMillis" : 2000,
                "heartbeatTimeoutSecs" : 10,
                "electionTimeoutMillis" : 10000,
                "getLastErrorModes" : {

                },
                "getLastErrorDefaults" : {
                        "w" : 1,
                        "wtimeout" : 0
                }
        }
}
rs0:PRIMARY>
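As mentioned above, db.isMaster() does not report hidden members. A quick way to see this from the primary is to compare the hosts array returned by db.isMaster(), which omits the hidden member, with the full member list in rs.conf(); a small sketch:

rs0:PRIMARY> db.isMaster().hosts
rs0:PRIMARY> rs.conf().members.map(function(m) { return m.host + (m.hidden ? " (hidden)" : ""); })

localhost:27018 should be missing from the first list but flagged as hidden in the second.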


Step 13: Write concern in a Replica Set:
From the perspective of a client application, whether a MongoDB instance is running as a single server (i.e. “standalone”) or as a replica set is transparent. However, replica sets offer additional configuration options for write concern.

For a replica set, the default write concern requests acknowledgement only from the primary. You can, however, override this default write concern, such as to confirm write operations on a specified number of the replica set members.

To override the default write concern, specify a write concern with each write operation. For example, the following method includes a write concern specifying that the insert returns only after the write propagates to the primary and at least one secondary, or after a 5 second timeout.

db.products.insert(
  { item: "envelopes", qty : 100, type: "Clasp" },
  { writeConcern: { w: 2, wtimeout: 5000 } }
)

You can include a timeout threshold for a write concern. This prevents write operations from blocking indefinitely if the write concern is unachievable. For example, if the write concern requires acknowledgement from 4 members of the replica set but only 3 data-bearing members are available, the operation would otherwise block until enough members became available.
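As an illustration of the timeout behavior (a hypothetical insert; this set has only three data-bearing members, so w: 4 can never be satisfied), the following returns after roughly 3 seconds with a write concern timeout error instead of blocking indefinitely. Note that the document may still have been written and replicated even though the acknowledgement failed:

db.products.insert(
  { item: "boxes", qty: 25, type: "Mailer" },
  { writeConcern: { w: 4, wtimeout: 3000 } }
)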

Modify Default Write Concern:
You can modify the default write concern for a replica set by setting the settings.getLastErrorDefaults setting in the replica set configuration. The following sequence of commands creates a configuration that waits for the write operation to complete on a majority of the voting members before returning:

cfg = rs.conf()
cfg.settings = {}
cfg.settings.getLastErrorDefaults = { w: "majority", wtimeout: 5000 }
rs.reconfig(cfg)

If you issue a write operation with a specific write concern, the write operation uses its own write concern instead of the default.
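To confirm that the new default took effect after the rs.reconfig() above, read it back from the configuration; a quick check from the primary:

rs0:PRIMARY> rs.conf().settings.getLastErrorDefaults

This should now show w: "majority" with a wtimeout of 5000.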

Step 14: 
Some other commands for checking the replica set status and configuration:

rs0:PRIMARY> db.isMaster()
{
        "hosts" : [
                "localhost:27017",
                "localhost:27018",
                "localhost:27019"
        ],
        "arbiters" : [
                "localhost:30000"
        ],
        "setName" : "rs0",
        "setVersion" : 13,
        "ismaster" : true,
        "secondary" : false,
        "primary" : "localhost:27017",
        "me" : "localhost:27017",
        "electionId" : ObjectId("576e64a40000000000000005"),
        "maxBsonObjectSize" : 16777216,
        "maxMessageSizeBytes" : 48000000,
        "maxWriteBatchSize" : 1000,
        "localTime" : ISODate("2016-06-25T12:32:13.354Z"),
        "maxWireVersion" : 4,
        "minWireVersion" : 0,
        "ok" : 1
}
rs0:PRIMARY>
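Another useful check is the oplog summary. Run on the primary, rs.printReplicationInfo() reports the configured oplog size (128 MB for these instances, as set with --oplogSize) and the time window of operations the oplog currently covers:

rs0:PRIMARY> rs.printReplicationInfo()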
