vDS is NOT a pre-requisite for NSX Guest Introspection

This seems to be a common misconception among both customers and third-party vendors. The vDS is also not licensed as part of NSX under the “NSX for Endpoint” license, so a normal standard vSwitch (VSS) is fully supported for deploying NSX Guest Introspection for use with third-party solutions, e.g. anti-virus.

Please refer to the NSX 6.x Installation Guide for how to use the “Specified on host” option when deploying Guest Introspection.

Extract from the NSX 6.x Installation Guide below…

NSX Troubleshooting Commands

Initial Troubleshooting – Start with the basics in troubleshooting – Transport Network and Control Plane

Identifying Controller Deployment Issues:

  • View the vCenter Tasks and Events logs during deployment.

Verify connectivity from NSX Manager to vCenter:

  • Run ping <ip>, show arp, show ip route or debug connection <IP-or-hostname> on the NSX Manager to verify network connectivity (see the sample session after this list).
  • On the NSX Manager, run show log follow, then connect to vCenter and watch the log for connection failures.
  • Verify the network configuration on the NSX Manager by running show running-config.
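
A minimal check sequence from the NSX Manager CLI might look like this (the vCenter IP is a placeholder):

ping 192.168.110.22
show ip route
show running-config
debug connection 192.168.110.22

If ping fails but the routing table looks right, check show arp for the gateway entry and review the DNS and gateway settings in show running-config.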

Identify EAM common issues:

  • Check vSphere ESXi Agent Manager for errors:
  1. vCenter home > vCenter Solutions Manager > vSphere ESX Agent Manager.
  2. Check status of Agencies prefixed with _VCNS_153.
  3. Log in as root to the ESXi host that is experiencing the issue and run tail /var/log/esxupdate.log.

NOTE: You can also access the Managed Object Browser by accessing address:
https://<VCIP or hostname>/eam/mob/

UWAs (vsfwd or netcpa) not functioning correctly? This manifests itself as the firewall showing a bad status, or as the control plane between the hypervisor(s) and the controllers being down.

  • To find a fault in the messaging infrastructure, check whether Messaging Infrastructure is reported as down for the host in NSX Manager System Events.
  • More than one ESXi host affected? Check the message bus service in the NSX Manager appliance web UI under the Summary tab.
  • If RabbitMQ is stopped, restart it.

Common Deployment Issues:

1. Connecting NSX to vCenter

  • DNS/NTP incorrectly configured on NSX Manager and vCenter.
  • User account without vCenter Administrator role used to connect NSX Manager to vCenter.
  • No network connectivity between NSX Manager and vCenter Server.
  • User logging into vCenter with an account with no role on NSX Manager.

2. Controller Deployment

  • Insufficient resources available to host the controllers.
  • NTP on ESXi hosts and NSX Manager not in sync.
  • IP connectivity issues between NSX Manager and the NSX Controllers.

3. Host Preparation

  • EAM fails to deploy VIBs because of mis-configured DNS on ESXi hosts.
  • EAM fails to deploy VIBs because of firewall blocking required ports between ESXi, NSX Manager and vCenter.
  • A previous VIB of an older version is already installed, requiring user intervention to reboot the hosts.

4. VXLAN

  • Incorrect teaming method selected for VXLAN.
  • VTEPs do not have connectivity to each other.
  • DHCP selected to assign VTEP IPs but unavailable.
  • If a vmknic has a bad IP, configure manually via vCenter.
  • MTU on the Transport Network not set to 1600 bytes or greater (see the worked example below).
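
As a quick sanity check on the MTU requirement: VXLAN encapsulation adds roughly 50 bytes (outer Ethernet 14 + outer IP 20 + UDP 8 + VXLAN header 8), so a standard 1500-byte inner frame needs at least ~1550 bytes on the transport network, and 1600 leaves headroom. You can test this end to end from an ESXi host (the VTEP IP is a placeholder):

ping ++netstack=vxlan -d -s 1572 <dest VTEP IP>

Here 1572 bytes of ICMP payload + 8 bytes of ICMP header + 20 bytes of IP header = 1600 bytes on the wire, and -d sets the don't-fragment bit, so the ping only succeeds if the path genuinely carries 1600-byte frames.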

CLI

NSX Controller CLI VXLAN Commands:

  • show control-cluster logical-switches vni <vni>
  • show control-cluster logical-switches connection-table <vni>
  • show control-cluster logical-switches vtep-table <vni>
  • show control-cluster logical-switches mac-table <vni>
  • show control-cluster logical-switches arp-table <vni>
  • show control-cluster logical-switches vni-stats <vni>
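
For example, to check which VTEPs have joined a given logical switch (the VNI value 5001 is illustrative):

show control-cluster logical-switches vtep-table 5001

Every prepared host carrying that VNI should appear with its VTEP IP; a host missing from the table points at a control-plane (netcpa) problem on that host rather than a data-plane one.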

NSX Controller CLI cluster status and health:

  • show control-cluster status
  • show control-cluster startup-nodes
  • show control-cluster roles
  • show control-cluster connections
  • show control-cluster core stats
  • show network <arg>
  • show log cloudnet/cloudnet_java-vnet-controller.<start-time-stamp>.log
  • sync control-cluster <arg>
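
As a rough guide to healthy output (paraphrased, not verbatim): show control-cluster status should report that the node has joined the cluster (“Join complete”) and is connected to the cluster majority; anything else warrants checking node connectivity and show control-cluster startup-nodes before digging deeper.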

VXLAN namespace for esxcli:

  • esxcli network vswitch dvs vmware vxlan list
  • esxcli network vswitch dvs vmware vxlan network list --vds-name=<vds>
  • esxcli network vswitch dvs vmware vxlan network mac list --vds-name=<vds> --vxlan-id=<vni>
  • esxcli network vswitch dvs vmware vxlan network arp list --vds-name=<vds> --vxlan-id=<vni>
  • esxcli network vswitch dvs vmware vxlan network port list --vds-name=<vds> --vxlan-id=<vni>
  • esxcli network vswitch dvs vmware vxlan network stats list --vds-name=<vds> --vxlan-id=<vni>
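
For example, to list the MAC addresses a host has learned for one logical switch (the VDS name Compute_VDS and VNI 5001 are placeholders):

esxcli network vswitch dvs vmware vxlan network mac list --vds-name=Compute_VDS --vxlan-id=5001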

Troubleshooting Components – Understand the component interactions to narrow the problem focus

Controller Issues:

No connectivity for new VMs; increased BUM traffic (ARP cache misses).

General NSX Controller troubleshooting steps:

  • Verify Controller cluster status and roles.
  • Verify Controller node network connectivity.
  • Check Controller API service.
  • Validate VXLAN and Logical Router mapping table entries to ensure they are consistent.
  • Review source and destination netcpa logs and CLI to determine control plane connectivity issues between ESXi hosts and NSX Controller.

Example:

Verify VTEPs have sent network information to Controllers.

On Controller:

  • show control-cluster logical-switches vni <vni.no>
  • show control-cluster logical-switches vtep-table <vni.no>

User World Agent (UWA) issues:

  • Logical switching not functioning (netcpa).
  • Firewall rules not being updated (vsfwd).

General UWA troubleshooting steps:

  • Start UWA if not running.

/etc/init.d/netcpad [status|start]

/etc/init.d/vShield-Stateful-Firewall [status|start]
  • tail netcpa logs: /var/log/netcpa.log
  • tail vsfwd logs: /var/log/vsfwd.log

Check if the UWAs are connected to NSX Manager and the Controllers.

esxcli network ip connection list | grep 5671 (message bus TCP connection to NSX Manager)

esxcli network ip connection list | grep 1234 (controller TCP connection)
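
An established session looks something like this (columns abbreviated, values illustrative):

tcp   0   0   192.168.110.51:23251   192.168.110.42:5671   ESTABLISHED   36245   newreno   vsfwd

No ESTABLISHED entry on port 5671 means the message bus between the host and NSX Manager is down; none on port 1234 means netcpa has no session to any controller.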

Check the configuration file /etc/vmware/netcpa/config-by-vsm.xml on the ESXi host, which holds the settings under UserVars/Rmq* (in particular UserVars/RmqIpAddress).

The UserVars currently needed for the message bus are:

a. RmqClientPeerName
b. RmqHostId
c. RmqClientResponseQueue
d. RmqClientExchange
e. RmqSslCertSha1ThumbprintBase64
f. RmqHostVer
g. RmqClientId
h. RmqClientToken
i. RmqClientRequestQueue
j. RmqVsmExchange
k. RmqPort
l. RmqVsmRequestQueue
m. RmqVHost
n. RmqPassword
o. RmqUsername
p. RmqIpAddress
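
Any of these can be read directly on the host through the standard advanced-settings namespace, for example (option name taken from the list above):

esxcli system settings advanced list -o /UserVars/RmqIpAddress

The value should match the NSX Manager IP; an empty or wrong value would explain a dead message bus.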

NVS Issues – Limited/intermittent connectivity for VMs on the same logical switch.

General VXLAN troubleshooting steps:

  • Check for incorrect MTU on Physical Network for VXLAN traffic.
  • Incorrect IP route configured during VXLAN configuration.
  • Physical network connectivity issues.

Verify connectivity between VTEPs:

  • Ping from the VXLAN dedicated TCP/IP stack: ping ++netstack=vxlan -I vmk1 <ip address>
  • View the routing table of the VXLAN dedicated TCP/IP stack: esxcli network ip route ipv4 list -N vxlan
  • Ping succeeds between VMs but TCP seems intermittent? The likely cause is an incorrect MTU on the physical network or on the VDS in vSphere; check the MTU configured on the VDS in use by VXLAN.

Verify VXLAN component:

  • Verify the VXLAN VIB is installed and at the correct version: esxcli software vib get --vibname esx-vxlan
  • Verify the VXLAN kernel module vdl2 is loaded: vmkload_mod -l | grep vdl2 (logs to /var/log/vmkernel.log, prefixed VXLAN)
  • Verify the control plane is up and active for a Logical Switch on the ESXi host: esxcli network vswitch dvs vmware vxlan network list --vds-name=<vds>
  • Verify VM information, i.e. that the ESXi host has learned the MAC addresses of remote VMs: esxcli network vswitch dvs vmware vxlan network mac list --vds-name=<vds> --vxlan-id=<vni>
  • List active ports and the VTEP they are mapped to: esxcli network vswitch dvs vmware vxlan network port list --vds-name=<vds> --vxlan-id=<vni>
  • Verify the host has locally cached ARP entries for remote VMs: esxcli network vswitch dvs vmware vxlan network arp list --vds-name=<vds> --vxlan-id=<vni>
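
Putting these together for a single logical switch (the VDS name Compute_VDS and VNI 5001 are placeholders):

esxcli software vib get --vibname esx-vxlan
vmkload_mod -l | grep vdl2
esxcli network vswitch dvs vmware vxlan network list --vds-name=Compute_VDS
esxcli network vswitch dvs vmware vxlan network mac list --vds-name=Compute_VDS --vxlan-id=5001

If the network list shows the control plane up but the MAC table is empty, the host has joined the VNI but has not learned any remote VMs, which shifts suspicion back to the controller tables shown earlier.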

Distributed Firewall issues – Flow Monitoring provides vNIC-level visibility of VM traffic flows.

  • Detailed Flow Data for both Allow and Block flows.
  • Global flow collection disabled by default – click the Enable button.
  • By default NSX excludes its own VMs: NSX Manager, Controllers and Edges.

Note: Add a VM to the Exclusion List to remove it from DFW enforcement. This lets you determine whether the problem is DFW-related: if the problem persists while the VM is excluded, it is not the DFW.

  • In the vSphere Web Client, go to Networking & Security and select the NSX Manager. From Manage > Exclusion List, click the + and select the VM.
  • Verify the DFW VIB (esx-vsip) is installed on the ESXi host: esxcli software vib list | grep vsip
  • Verify that the DFW kernel module is loaded on the ESXi host: vmkload_mod -l | grep vsip
  • Verify the vsfwd service daemon is running on the ESXi host: ps | grep vsfwd
  • To start/stop the vsfwd daemon on the ESXi host: /etc/init.d/vShield-Stateful-Firewall [stop|start|status|restart]

LOGS

NSX Manager Log (collected via Web UI)

  • Select Download Tech Support Log

ESXi Host Logs (collected via the vSphere Web Client)

  • From the vSphere Web Client, right-click the vCenter Server and select All vCenter Actions -> Export System Logs.
  • You can also generate ESXi host logs on the ESXi host CLI: vm-support.

NSX Controller Logs

  • From the vSphere Web Client, go to Networking & Security -> Installation -> NSX Controller Nodes.
  • Select the Controller and select Download Tech Support logs.

Issues and Corresponding Logs

Installation/upgrade related issues

  • NSX Manager Log.
  • vCenter Support Bundle: /var/log/vmware/vpx/EAM.log and /var/log/esxupdate.log.

VDR issues

  • NSX Manager log.
  • VDR log from the affected VDR.
  • VM support bundle: /var/log/netcpa.log and /var/log/vsfwd.log.
  • Controller logs.

Edge Services Gateway issues

  • NSX Manager log.
  • Edge log for the affected ESG.

NSX Manager issues

  • NSX Manager log.

VXLAN/Controller/Logical Switch

  • NSX Manager log.
  • vCenter Support bundle.
  • VM support bundle.

VXLAN data plane: /var/log/vmkernel.log.

VXLAN control plane: /var/log/netcpa.log.

Management plane: /var/log/vsfwd.log and /var/log/netcpa.log.

Distributed Firewall (DFW) issues

  • NSX Manager log.
  • VM support bundle: /var/log/vsfwd.log, /var/log/vmkernel.log.
  • VC support bundle.

Implementing a multi-tenant networking platform with NSX

So we have covered the typical challenges of a multi-tenant network and designed a solution to one of them; now it’s time to get down to the bones of it and do some configuration. Let’s implement it in the lab: I have set up an NSX ESG (Cust_1-ESG) and an NSX DLR control VM (Cust_1-DLR) with the below IP configuration:


vCNS to NSX Upgrades – T+1 Post-Upgrade Steps

T+1 Post-Upgrade Steps

After the upgrade, do the following:

  1. Delete the snapshot of the NSX Manager taken before the upgrade.
  2. Create a current backup of the NSX Manager after the upgrade.
  3. Check that VIBs have been installed on the hosts.

NSX installs these VIBs:

esxcli software vib get --vibname esx-vxlan

esxcli software vib get --vibname esx-vsip
  4. If Guest Introspection has been installed, also check that this VIB is present on the hosts:
esxcli software vib get --vibname epsec-mux
  5. Resynchronize the host message bus. VMware advises that all customers perform a resync after an upgrade. You can use the following API call to perform the resynchronization on each host; a curl example follows the header listing.
URL: https://<nsx-mgr-ip>/api/4.0/firewall/forceSync/<host-id>

HTTP Method: POST

Headers:

Authorization: base64-encoded value of username:password

Accept: application/xml

Content-Type: application/xml
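
As a minimal sketch, the same call with curl (manager IP, host ID and credentials are placeholders; curl’s -u option generates the base64 Authorization header for you):

curl -k -u 'admin:<password>' -X POST \
  -H 'Accept: application/xml' \
  -H 'Content-Type: application/xml' \
  'https://<nsx-mgr-ip>/api/4.0/firewall/forceSync/<host-id>'

The <host-id> is the host’s vCenter managed object ID (e.g. host-123), which you can look up in the vCenter MOB.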

vCNS to NSX Upgrades – vShield Endpoint to NSX Guest Introspection

vShield Endpoint to NSX Guest Introspection

  1. In the Installation tab, click Service Deployments. The Installation Status column says Upgrade Available.
  2. Select the Guest Introspection deployment that you want to upgrade. The Upgrade icon in the toolbar above the services table is enabled.
  3. Click the Upgrade icon and follow the UI prompts.

After Guest Introspection is upgraded, the installation status is Succeeded and the service status is Up. Guest Introspection service virtual machines are visible in the vCenter Server inventory.

For more information in this series please continue on to the next part

vCNS to NSX Upgrades – vShield Edges to NSX Edges

vShield Edge to NSX Edge Upgrade Steps

  1. In the vSphere Web Client, select Networking & Security > NSX Edges.
  2. For each NSX Edge instance, double-click the edge and check the following configuration settings before upgrading:
    1. Click Manage > VPN > L2 VPN and check whether L2 VPN is enabled. If it is, take note of the configuration details and then delete all L2 VPN configuration.
    2. Click Manage > Routing > Static Routes and check whether any static routes are missing a next hop. If they are, add the next hop before upgrading the NSX Edge.
  3. For each NSX Edge instance, select Upgrade Version from the Actions menu.

After the NSX Edge is upgraded successfully, the Status is Deployed and the Version column displays the new NSX version. If the upgrade fails with the error message “Failed to deploy edge appliance,” make sure that the host on which the NSX Edge appliance is deployed is connected and not in maintenance mode.

  4. If an Edge fails to upgrade and does not roll back to the old version, click the Redeploy NSX Edge icon and then retry the upgrade.

For more information in this series please continue on to the next part

vCNS to NSX Upgrades – Host Upgrades

Host Upgrades

  1. Place DRS into manual mode (do not disable DRS).
  2. Click Networking & Security and then click Installation.
  3. Click the Host Preparation tab.

All clusters in your infrastructure are displayed.

  4. For each cluster, click Update or Install in the Installation Status column. Each host in the cluster receives the new logical switch software.

The host upgrade initiates a host scan. The old VIBs are removed (though they are not completely deleted until after the reboot). New VIBs are installed on the altboot partition. To view the new VIBs on a host that has not yet rebooted, you can run esxcli software vib list --rebooting-image | grep esx.

  5. Monitor the installation until the Installation Status column displays a green check mark.
  6. After manually evacuating the hosts, select the cluster and click Resolve. The Resolve action attempts to complete the upgrade and reboot all hosts in the cluster. If a host reboot fails for any reason, the Resolve action halts. Check the hosts in the Hosts and Clusters view, make sure the hosts are powered on, connected and contain no running VMs, then retry the Resolve action.
  7. You may have to repeat the above process for each host.
  8. You can confirm connectivity by performing the following checks:
    1. Verify that VXLAN segments are functional. Make sure to set the packet size correctly and include the don’t fragment bit.
    2. Ping between two VMs that are on same virtual wire but on two different hosts (one host that has been upgraded and one host that has not)
      1. From a Windows VM: ping -l 1472 -f <dest VM>
      2. From a Linux VM: ping -s 1472 -M do <dest VM>
    3. Ping between two hosts’ VTEP interfaces.
      1. ping ++netstack=vxlan -d -s 1572 <dest VTEP IP>
    4. All virtual wires from your 5.5 infrastructure are renamed to NSX logical switches, and the VXLAN column for the cluster says Enabled

For more information in this series please continue on to the next part

vCNS to NSX Upgrades – vShield Manager Upgrade Steps

vShield Manager Upgrade Steps

1. Confirm the steps in 2.1.2 have been actioned.

2. Back up vCNS to FTP and shut down the VM.

3. Check Snapshot has been taken of vShield Manager.

4. Check Support Bundle has been taken.

5. Shutdown vShield Manager and check the appliance has 4 vCPUs and 16GB of memory.

6. Power on vShield Manager.

7. Upload the vShield-to-NSX upgrade bundle, apply it and reboot (the upgrade file is 2.4 GB!).

8. Check you can log in to NSX Manager A once the upgrade has completed.

9. You may need to restart the vCenter Web Client service in order to see the plugin in the vSphere Web Client.

10. Check SSO (single sign-on) in the NSX Manager configuration; you may need to re-register.

11. Configure Segment IDs and Multicast Address (recorded from vShield Manager).

12. Configure backup to FTP location – take backup of NSX Manager A.

13. Create Snapshot on NSX Manager A.

14. Shutdown NSX Manager A.

15. Deploy new NSX Manager B from OVF with same IP as A.

16. Restore FTP Backup from NSX Manager A.

17. Check vCenter Registration and NSX Manager Login.

18. Check NSX Manager for list of Edges, Logical Switches.

19. Once you are happy that connectivity is functioning correctly, continue with the upgrade.

For more information in this series please continue on to the next part

vCNS to NSX Upgrades – Pre Upgrade Steps

Pre-Upgrade Steps

One or more days before the upgrade, do the following:

  1. Verify that vCNS is at least version (see point 8 below).
  2. Check you are running one of the following recommended builds: vSphere 5.5U3 or vSphere 6.0U2.
  3. Verify that all required ports are open (please see Appendix A).
  4. Verify that your vSphere environment has sufficient resource for the NSX components.
  1. Verify that all your applicable vSphere clusters have sufficient resource to allow DRS to migrate running workloads during the host preparation stage (n+1).
  2. Verify that you can retrieve uplink port name information for vSphere Distributed Switches. See VMware KB 2129200 for further information. (Note: this is not applicable to NHS Manchester as we are expected to upgrade to NSX 6.2.4.)
  3. Ensure that forward and reverse DNS, NTP and the Lookup Service are working.
  4. If any vShield Endpoint partner services are deployed, verify compatibility before upgrading:
    • Consult the VMware Compatibility Guide for Networking and Security.
    • Consult the partner documentation for compatibility and upgrade details.
  5. If you have Data Security in your environment, uninstall it before upgrading vShield Manager.
  6. Check all running edges are on the same latest version as the vShield Manager i.e.
  7. Verify that the vShield Manager vNIC adaptor is VMXNET3. This should be the case if running vShield Manager version , however the e1000 vNIC may have been retained if you have previously upgraded the vShield Manager. To replace the vNIC, follow the steps in KB 2114813, which in part involve deploying a fresh vShield Manager and restoring the configuration. See Appendix C or http://kb.vmware.com/kb/2114813.
  8. Increase the vShield Manager memory to 16GB.

Pre-Upgrade Validation Steps

Immediately before you begin the upgrade, do the following to validate the existing installation.

  1. Identify administrative user IDs and passwords.
  2. Verify that forward and reverse name resolution is working for all components.
  3. Verify you can log in to all vSphere and vShield components.
  4. Note the current versions of vShield Manager, vCenter Server, ESXi and vShield.
  5. Check that multicast address ranges are valid (the recommended multicast address range starts at 239.0.1.0/24 and excludes 239.128.0.0/24).
  6. Verify that VXLAN segments are functional. Make sure to set the packet size correctly and include the don’t fragment bit.
    1. Ping between two VMs that are on same virtual wire but on two different hosts.
      1. From a Windows VM: ping -l 1472 -f <dest VM>
      2. From a Linux VM: ping -s 1472 -M do <dest VM>
    2. Ping between two hosts’ VTEP interfaces.
      1. ping ++netstack=vxlan -d -s 1572 <dest VTEP IP>
  7. Validate North-South connectivity by pinging out from a VM.
  8. Visually inspect the vShield environment to make sure all status indicators are green, normal, or deployed.
  9. Verify that syslog is configured.
  10. If possible, in the pre-upgrade environment, create some new components and test their functionality.
  11. Validate netcpad and vsfwd user-world agent (UWA) connections:
    1. On an ESXi host, run esxcli network vswitch dvs vmware vxlan network list --vds-name=<vds> and check the controller connection state.
    2. On vShield Manager, run the show tech-support save session command, and search for “5671” to ensure that all hosts are connected to vShield Manager.
  12. Check firewall functionality via Telnet or netcat to confirm the edge firewalls are working as expected (a minimal probe example follows).
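
A minimal reachability probe through an edge firewall might look like this (the IP and port are placeholders):

nc -zv 10.10.20.5 443

A “succeeded” result confirms the rule permits the flow; a timeout suggests the firewall is dropping it.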

Pre-Upgrade Steps

Immediately before you begin the upgrade, do the following:

  1. Verify that you have a current backup of the vShield Manager, vCenter and other vCloud Networking and Security components. See Appendix B for the necessary steps to accomplish this.
  2. Purge old logs from the vShield Manager with purge log manager and purge log system.
  3. Take a snapshot of the vShield Manager, including its virtual memory.
  4. Take a backup of the vDS
  5. Create a Tech Support Bundle.
  6. Record the Segment IDs and multicast address ranges in use.
  7. Increase the vShield Manager resources to 16GB of memory and 4 vCPUs.
  8. Ensure that forward and reverse domain name resolution is working, using the nslookup command.
  9. If VUM is in use in the environment, ensure that the bypassVumEnabled flag is set to true in vCenter. This setting configures EAM to install the VIBs directly to the ESXi hosts even when VUM is installed or unavailable.
  10. Download and stage the upgrade bundle, and validate it with md5sum (see the example after this list).
  11. Do not power down or delete any vCloud Networking and Security components or appliances before instructed to do so.
  12. VMware recommends performing the upgrade in a maintenance window as defined by your company.
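
For step 10, a quick example (the bundle file name is a placeholder for whatever build you download):

md5sum VMware-vShield-Manager-upgrade-bundle-to-NSX-<version>.tar.gz

Compare the printed digest against the checksum published on the download page before staging the bundle.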

For more information in this series please continue on to the next part