NSX-T Controller Installation on KVM

Overview

This post builds upon my previous post, “NSX-T Manager Installation on KVM”.  In this post we will be:

  • Installing 3x NSX Controllers
  • Joining all 3 controllers to the NSX Management Plane
  • Creating a control cluster from the first NSX Controller created
  • Joining the other 2 controllers to the NSX control cluster

By the end we will have the starting point for building an NSX fabric that Edge and compute nodes can use for overlay tunneling.

The network topology will look like the picture above once we have completed our work.  We used the 10.10.1.0/24 (VLAN 10) network as our out-of-band (OOB) management subnet and made the NSX Manager 10.10.1.200.  We will use the next three IP addresses after .200 for the controllers:

Controller 1: 10.10.1.201
Controller 2: 10.10.1.202
Controller 3: 10.10.1.203

We will be using the latest controller image available at the time of this writing, nsx-controller-2.0.0.0.0.6522091.qcow2.  You must have the appropriate rights to download this release from VMware’s website.

Deploy the primary controller in KVM

Source: Deploy NSX Controller

As with the NSX Manager, we need to inject the basic settings into each image using guestfish.  If you need to revisit what the guestinfo file looks like, please go back to the first blog post (linked above) to review the information needed.  For now, I will just show the commands that apply the settings to the qcow2 images.

I created three guestfish config files named guestinfo, guestinfo2, and guestinfo3, which are applied to Controller1, Controller2, and Controller3, respectively.  The commands to apply them are below:
guestfish --rw -i -a nsx-controller1.qcow2 upload guestinfo /config/guestinfo
guestfish --rw -i -a nsx-controller2.qcow2 upload guestinfo2 /config/guestinfo
guestfish --rw -i -a nsx-controller3.qcow2 upload guestinfo3 /config/guestinfo
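If you want to double-check that the settings landed where the controller expects them, guestfish can read the file back out of the image (a quick sanity check; swap in the image name you want to verify):

guestfish --ro -i -a nsx-controller1.qcow2 cat /config/guestinfo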

NOTE
I did not specify this in the last post: I took the original image named nsx-controller-2.0.0.0.0.6522091.qcow2 and copied a clean version into a new folder that holds the qcow2 disk images.  I like to keep things separated on my host machines, but this is not required.  I simply created three copies of the master qcow2 image and named them nsx-controller1.qcow2, nsx-controller2.qcow2, and nsx-controller3.qcow2.
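For reference, those copies were simple cp commands (the destination folder below matches the disk path used in the virt-install command later; adjust it for your own layout):

cp nsx-controller-2.0.0.0.0.6522091.qcow2 /vmstorage/NSX-T/controller/nsx-controller1.qcow2
cp nsx-controller-2.0.0.0.0.6522091.qcow2 /vmstorage/NSX-T/controller/nsx-controller2.qcow2
cp nsx-controller-2.0.0.0.0.6522091.qcow2 /vmstorage/NSX-T/controller/nsx-controller3.qcow2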

Now that the images have been prepped, let’s deploy the first controller:
virt-install --import \
--name nsx-controller1 \
--ram 16384 \
--vcpus 4 \
--network network=vsMgmt,model=virtio \
--disk path=/vmstorage/NSX-T/controller/nsx-controller1.qcow2,format=qcow2 \
--graphics vnc,listen=0.0.0.0 --noautoconsole

Warning
If you get permission errors while trying to start the VM, check that SELinux is disabled on the KVM host.  Disabling it requires a reboot.
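A quick way to check and, if necessary, disable SELinux on a RHEL/CentOS KVM host (only do this if it fits your security policy):

getenforce                                                                # returns Enforcing, Permissive, or Disabled
sudo sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
sudo reboot                                                               # the change takes effect after a reboot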

If everything worked as expected, the VM should be started.
virsh list

Output:
 Id    Name                 State
----------------------------------------
 1     nsx-manager1         running
 2     nsx-controller1      running

The --graphics vnc,listen=0.0.0.0 virt-install option allows us to connect to the console of the VM via VNC.  You can use any VNC client to connect to the port assigned to the VM, which you can find with the following command:
virsh vncdisplay <domain>

Example:
virsh vncdisplay nsx-controller1

Output:
:1

VNC ports, by default, start at 5900.  In the example above, the command returned :1, so the port used is 5900 + 1 = 5901.  If the KVM host is 10.10.1.10, then we VNC to 10.10.1.10:5901 to reach the console of nsx-controller1.
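For example, with a TigerVNC-style client installed on your workstation (assuming the KVM host really is at 10.10.1.10), the connection would look like:

vncviewer 10.10.1.10:1        # display :1, i.e. TCP port 5901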

Warning
If you have trouble connecting to the VM console, check to make sure the firewalld service is stopped on the KVM host.
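On most RHEL/CentOS KVM hosts, that check looks like this (stop the service only if that is acceptable in your environment; otherwise open the 590x VNC ports instead):

sudo systemctl status firewalld
sudo systemctl stop firewalld
sudo systemctl disable firewalld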

Join the controller to the management plane

In order to join the first controller to the management plane, we need the API certificate thumbprint of the NSX Manager.  SSH into the NSX Manager and run the command below:
get certificate api thumbprint

SSH into the controller and execute the following:
join management-plane <NSX-Manager-IP> username admin thumbprint <NSX-Manager-thumbprint>

The controller will prompt for the password of the admin account.  Enter the NSX Manager admin password; this is the same password you use to log into the NSX Manager GUI.

If all goes well, you should see a success message on the controller.

Example:
nsx-controller1> join management-plane 10.10.1.200 username admin thumbprint <NSX-Manager-thumbprint>
Password for API user:
Node successfully registered and controller restarted

We can confirm the controller is connected:
nsx-controller1> get managers
- 10.10.1.200 Connected

Initialize the control cluster (primary controller only; skip this step when setting up the other two controllers)

Now that the controller is joined to the management plane, we need to initialize the control cluster on the primary controller.

nsx-controller1> set control-cluster security-model shared-secret secret <password>
Security secret successfully set on the node.

This sets the shared secret that the other controllers will use when they join the control cluster.  It can be the same password you used before or a completely new one.  Save this password; you will need it later.

nsx-controller1> initialize control-cluster
Control cluster initialization successful.

We can confirm the status of the control cluster with:
get control-cluster status verbose

Example:
nsx-controller1> get control-cluster status verbose
NSX Controller Status:

uuid: 3d843907-05fe-4dbb-a5d3-e585a43b9190
is master: false
in majority: false
This node has not yet joined the cluster.

Cluster Management Server Status:

uuid                                   rpc address   rpc port   global id   vpn address   status
fc6e9632-2946-40f8-bf37-19c9447e0fb4   10.10.1.201   7777       1           169.254.1.1   connected

Zookeeper Ensemble Status:

Zookeeper Server IP: 169.254.1.1, reachable, ok
Zookeeper version: 3.5.1-alpha--1, built on 09/01/2017 12:29 GMT
Latency min/avg/max: 0/1/47
Received: 216
Sent: 232
Connections: 2
Outstanding: 0
Zxid: 0x10000001d
Mode: leader
Node count: 23
Connections: /169.254.1.1:35486[1](queued=0,recved=208,sent=225,sid=0x100000637390001,lop=GETD,est=1508440715972,to=40000,lcxid=0xcd,lzxid=0x10000001d,lresp=439090,llat=0,minlat=0,avglat=1,maxlat=19)
/169.254.1.1:35532[0](queued=0,recved=1,sent=0)

Next steps

Repeat the steps above to deploy the other two controllers.  Using the template above, you should be able to deploy them and join them to the management plane.  Skip the control-cluster initialization and continue below once the other two controllers are ready.
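As a convenience, here is what the virt-install command looks like for the second controller; only the name and disk path change from the first one, and the third controller follows the same pattern:

virt-install --import \
--name nsx-controller2 \
--ram 16384 \
--vcpus 4 \
--network network=vsMgmt,model=virtio \
--disk path=/vmstorage/NSX-T/controller/nsx-controller2.qcow2,format=qcow2 \
--graphics vnc,listen=0.0.0.0 --noautoconsole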

Join other controllers to control cluster

Set the controller’s shared-secret password so it can join the control cluster.  This must be the same password you set on the primary controller in the steps above.  See, I told you to remember that password 🙂

nsx-controller2> set control-cluster security-model shared-secret secret <password>
Security secret successfully set on the node.

Now get the second controller’s API thumbprint:

nsx-controller2> get control-cluster certificate thumbprint
...

We will need this thumbprint to join the controller to the control cluster.  Now go back to your SSH session on Controller1 and tell it to add Controller2 to the control cluster:

nsx-controller1> join control-cluster 10.10.1.202 thumbprint <nsx-controller2-thumbprint>
Node 10.10.1.202 has successfully joined the control cluster. Please run 'activate control-cluster' command on the new node.

Now jump back into the SSH session of Controller2 and activate the control cluster.

nsx-controller2> activate control-cluster
Control cluster activation successful.

Repeat the above steps to join Controller3 to the control cluster.
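For completeness, the Controller3 sequence is the same set of commands with Controller3’s address and thumbprint substituted in (shown condensed here; run each command on the node indicated by the prompt):

nsx-controller3> set control-cluster security-model shared-secret secret <password>
nsx-controller3> get control-cluster certificate thumbprint
nsx-controller1> join control-cluster 10.10.1.203 thumbprint <nsx-controller3-thumbprint>
nsx-controller3> activate control-cluster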

Confirmation

nsx-controller1> get control-cluster status
uuid: 3d843907-05fe-4dbb-a5d3-e585a43b9190
is master: true
in majority: true
uuid                                   address       status
3d843907-05fe-4dbb-a5d3-e585a43b9190   10.10.1.201   active
1c790c2b-6455-4ceb-8c82-9955beca8b54   10.10.1.202   active
88b44c52-3104-4dd8-8280-8ea0106df324   10.10.1.203   active

As you can see, all three controllers (10.10.1.201, 10.10.1.202, and 10.10.1.203) are joined and active.  The controllers should also show a green state in the NSX Manager dashboard.
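If you prefer to confirm from the API side as well, the NSX Manager REST API can list the registered cluster nodes.  A quick check from the KVM host might look like the following (the /api/v1/cluster/nodes path is my assumption from the NSX-T API guide; verify it against the documentation for your release):

curl -k -u admin https://10.10.1.200/api/v1/cluster/nodes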

What’s next

Now that we have a working control cluster, we are ready to start building the fabric.  In the next post, we will create two NSX Edges, join them to the management plane, create transport zones, and work through the other tasks involved in building an NSX overlay fabric.

Thanks,

AJ