NSX Design & Deploy – Course Notes Part 2

Following on from the last post here are my remaining key points:

  • L2 Bridging – Single vDs only supported
  • DLR’s don’t place two DLR uplinks in the same Transport Zone
  • OSPF – Consider setting the OSPF timers to an aggressive setting if you have a requirement to do so  lowest setting is – Hello 1 Sec / Dead 3 Secs ( on Physical Router as well – set both sides i.e. the physical as well but note it might limit what you can set as an aggressive timer ) (10 Hello /40 Dead is the default protocol timeout)
  • Recommendation is to use anti-affinity rules to prevent deploying the DLR Control VM on the same ESXi host with an Active ESG – to prevent losing the DLR control VM and the Edge with the summary routes on (if you have set them – you should have! 😉 )
  • When applying DFW rules on the cluster object if you move VM’s between clusters then there is a few second outage on protection while the new rules are applied
  • Security policy weights – higher number takes priority e.g. 2000 higher than 1000

NSX Design & Deploy – Course Notes

So I have just finished a week in Munich on an internal NSX design & deploy course which had some great content. I have made a few notes and wanted to very briefly blog those, mainly for my benefit but also in the hope they may be useful or interesting to others as well.

In no particular order:

  • vSphere 5.5 – When upgrading the vDS from 5.1 to 5.5 you need to upgrade vDS to Advanced mode after you do the vDS upgrade. This enables features such as LACP and NIOC v3 but is a requirement for NSX to function correctly.
  • When considering Multicast mode with VTEPs in different subnets be aware that PIM is not included in the Cisco standard licence and requires advanced! PIM is a requirement when using multiple VTEPs in different subnets.
  • When using dynamic routing with a HA enabled DLR the total failover time could be up to 120 seconds
    • 15 sec dead time ( 6 sec min 9-12 recommended- 6 not recommended as its pinging every 3 seconds)
    • claim interfaces/ start services
    • ESXi update
    • route reconvergence
  • Consider setting a static route (summary route) with the correct weight which is lower than the dynamic route  so traffic isn’t disrupted during DLR failover
  • Different subnets for virtual and physical workloads lets you have 1 summary route for the virtual world. (for the above event – this is likely not possible in a brownfield deployment)
  • Note during a DLR failover it looses the routing adjacency – the routing table entry upstream is removed during this period – which looses the route – this could be avoided by increasing the OSPF timers upstream so they won’t lose the route.
  • If you fail to spread the transport zones across all the applicable hosts that are part of the same vDS -the port groups will exist but won’t be able to communicate (No traffic flow – if someone accidentally selects that port group)

Part two coming soon!