To deal with loops in switched networks, the Spanning Tree Protocol was developed. For a description of how STP works, see the Wikipedia page on the subject; in short, it disables certain ports on certain switches to break loops. So far so good.

When using VLANs, there are several alternatives:

  • Use a Common Spanning Tree: i.e. use the same topology for all VLANs.
  • Use Per-VLAN Spanning Tree (PVST): i.e. run a separate STP instance for each VLAN.
  • Use Multiple Spanning Tree (MST): this is the IEEE standard and a compromise between the two.

In MST, VLANs are mapped to instances. All VLANs mapped to the same instance share the same spanning tree. Using multiple instances gives some flexibility, without the CPU load of running a separate instance for each and every VLAN.
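As a rough illustration of that mapping, here is a minimal Python sketch. The VLAN IDs (10, 20, 30) are made up for the example. The only MSTP behaviour it relies on is that VLANs without an explicit mapping stay in instance 0, the IST.

```python
from collections import defaultdict

# Hypothetical VLAN IDs, purely for illustration.
VLAN_TO_INSTANCE = {10: 1, 20: 2, 30: 2}


def instance_for_vlan(vlan_id, mapping=VLAN_TO_INSTANCE):
    """Return the MST instance a VLAN belongs to (0 = IST when unmapped)."""
    return mapping.get(vlan_id, 0)


def vlans_per_instance(mapping):
    """Group VLANs by instance; each group shares one spanning tree."""
    grouped = defaultdict(set)
    for vlan_id, instance in mapping.items():
        grouped[instance].add(vlan_id)
    return dict(grouped)


print(instance_for_vlan(20))   # VLAN 20 is mapped explicitly
print(instance_for_vlan(99))   # VLAN 99 is unmapped, so it lands in the IST
print(vlans_per_instance(VLAN_TO_INSTANCE))
```

Three VLANs end up in two trees here, which is the whole trade-off: fewer trees to compute than PVST, more freedom than a single common tree.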

Now consider the following situation:

This diagram uses only 2 VLANs: a data VLAN, drawn in blue, and a management VLAN, in red. The data VLAN is used to connect the Left and Right switch together, along with the attached servers. The management VLAN is only used to manage the switches.

When implementing this on Cisco switches (I tried it on Catalyst 3750s), everything works as expected: the two servers can talk to each other and both switches are manageable.

When implementing this on HP ProCurve switches (I tried it on 5400s, 2610s and 2810s), this does not work: depending on the MAC addresses of the switches, either the servers cannot talk to each other or one of the two switches is disconnected from the management station…

This simple setup uses a single MST region with the following instance mapping:

  • VLAN management (red) to instance 1
  • VLAN data (blue) to instance 2

In my test lab, the Management switch had the lowest MAC address, so it became the root bridge. Switches Left and Right happily joined the party and decided that the blue trunk between them was redundant and should be put into a blocking state, thereby cutting the connection between the two servers:

RIGHT# show spanning-tree instance 2
  Port  Type      Cost      Priority Role       State      Bridge
  ----- --------- --------- -------- ---------- ---------- -------------
<...>
  20    100/1000T 200000    128      Root       Forwarding 001db3-xxxxxx
<...>
  Trk1  100/1000T 20000     128      Alternate  Blocking   001db3-yyyyyy
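The effect can be reproduced with a toy model. The sketch below is not how any switch implements MSTP: it just builds a shortest-path tree towards the root without looking at VLANs, and then checks whether a VLAN can still get from one switch to another over the links left forwarding. The VLAN IDs (10 for management, 20 for data) are made up, and I assume both management uplinks have cost 200000 (the output above only shows that for port 20 on Right). Bridge-ID tie-breaking and port roles are left out.

```python
import heapq

# (switch_a, switch_b, path_cost, allowed_vlans) -- hypothetical lab model.
LINKS = [
    ("Management", "Left", 200000, {10}),   # management uplink (cost assumed)
    ("Management", "Right", 200000, {10}),  # management uplink, port 20 on Right
    ("Left", "Right", 20000, {20}),         # Trk1, data VLAN only
]


def forwarding_links(links, root):
    """VLAN-agnostic shortest-path tree: keep each switch's link towards the root."""
    adj = {}
    for idx, (a, b, cost, _vlans) in enumerate(links):
        adj.setdefault(a, []).append((b, cost, idx))
        adj.setdefault(b, []).append((a, cost, idx))
    best = {root: 0}
    uplink = {}  # switch -> index of the link it uses to reach the root
    heap = [(0, root)]
    while heap:
        dist, node = heapq.heappop(heap)
        if dist > best.get(node, float("inf")):
            continue  # stale heap entry
        for nbr, cost, idx in adj.get(node, []):
            if dist + cost < best.get(nbr, float("inf")):
                best[nbr] = dist + cost
                uplink[nbr] = idx
                heapq.heappush(heap, (dist + cost, nbr))
    return {(links[i][0], links[i][1]) for i in uplink.values()}


def vlan_connected(links, forwarding, vlan, src, dst):
    """Can `vlan` travel from src to dst using only forwarding links that carry it?"""
    adj = {}
    for a, b, _cost, vlans in links:
        if (a, b) in forwarding and vlan in vlans:
            adj.setdefault(a, set()).add(b)
            adj.setdefault(b, set()).add(a)
    seen, todo = {src}, [src]
    while todo:
        node = todo.pop()
        for nbr in adj.get(node, ()):
            if nbr not in seen:
                seen.add(nbr)
                todo.append(nbr)
    return dst in seen


for root in ("Management", "Left"):
    fwd = forwarding_links(LINKS, root)
    print(root, "as root:",
          "data Left<->Right", vlan_connected(LINKS, fwd, 20, "Left", "Right"),
          "| mgmt Management<->Right", vlan_connected(LINKS, fwd, 10, "Management", "Right"))
```

With Management as root, the data VLAN can no longer get from Left to Right. With Left as root, the servers can talk but Right is cut off from the management VLAN. That matches both failure modes I saw, depending on which switch wins the root election.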

The problem seems to be that HP does not take the VLAN assignments into account when calculating the MST topology, which was the whole point of developing MST over STP. After four months of e-mails back and forth with HP Support, I finally got that confirmed:

> […]
> Am I correct if I summarize it as follows:
> MSTP does not take the VLAN-assignments into account when calculating its spanning-tree.
> Therefore, all VLANs should be allowed on redundant links.
> Does HP ProCurve consider this a bug? Or is this expected behaviour?
> […]

[…]
Your statement about MSTP is correct. I mean MSTP function is create a loop-free connection between switches. It is mainly a configuration of Ports.
[…]

Clearly, HP does not consider this a bug. Strangely enough, Cisco will work just fine in this scenario…

Update 2008-12-10

I re-checked this with ProCurve’s EMEA Technical consultant. Some quotes from his reply:

If Cisco works as you say, it must do a check for all VLANs within  the instance and that could make sense. On the practical side, you’re responsible for the VLAN configuration. If you understand well what is the primary link and what is/are backup links, and if you configure the right VLAN On it, that should work.
That is true as well with PVST.

In your example, we could say that STP does not make sense because there is no loop by VLAN Construction.
But the thing is that PVST will detect it but not MSTP (at least on ProCurve) which does not care about VLANs.

This confirms what I already said: ProCurve’s implementation assumes that all VLANs are allowed on all ports. If this assumption is not correct (as in the fairly simple example above), MST might converge “wrong”.
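Given that assumption, one way to spot trouble before it bites is to audit the configuration: for every inter-switch port, check whether it carries all VLANs of every instance. The sketch below is a hypothetical helper, not a ProCurve tool. The port names are the ones from the output of Right above; the VLAN IDs (10 for management, 20 for data) and the allowed-VLAN sets are my assumptions about that lab.

```python
def find_vlan_gaps(instance_vlans, port_vlans):
    """List (port, instance, missing_vlans) for ports that do not carry
    every VLAN of an instance the spanning tree may still select them for."""
    gaps = []
    for port, allowed in sorted(port_vlans.items()):
        for instance, vlans in sorted(instance_vlans.items()):
            missing = vlans - allowed
            if missing:
                gaps.append((port, instance, sorted(missing)))
    return gaps


# Assumed view of switch Right: port 20 is the management uplink, Trk1 the data trunk.
instance_vlans = {1: {10}, 2: {20}}
port_vlans = {"20": {10}, "Trk1": {20}}

for port, instance, missing in find_vlan_gaps(instance_vlans, port_vlans):
    print(f"port {port}: instance {instance} may forward here, but VLANs {missing} are not allowed")
```

Every line it prints is a port where the tree for an instance can end up forwarding without being able to carry that instance's traffic. On this lab both ports are flagged, which is exactly the situation above.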

Update 2009-02-21

For those interested: I just retried this on our lab-setup. The config is not completely the same as described above, but very similar. Here are the (gzipped) outputs of “show tech all” for the three switches. I also did a “show spanning-tree instance 1” to demonstrate the problem.

19 Comments

  1. tohkawa says:

    I’m facing the same problem today…

  2. Niobos says:

    Did you have contact with HP ProCurve Support about this? What was their reaction?
    If you want, I can give you the case-number that I had.

  3. tohkawa says:

    I found your article and I thought I couldn’t expect the support of HP about this…

    I enabled bpdu-filter on the management ports of both (left and right) switches, and managed to get it working…

  4. Niobos says:

    I worked around the problem by using multiple MSTP-regions. In my example I put “left” and “right” in one MSTP-region and “management” in another region. That way, I could keep a backup path even if one of the management-links failed.

  5. Mark says:

    Just a question, I’ve been testing this same stuff and run into some of the same things. I get the impression that the Cisco boxes and the HP boxes have different defaults as far as regions are concerned. So the HPs and the Ciscos end up in different regions (or the Ciscos don’t do regions by default, the HPs do), so basically your CST is used for the switching, explaining the behaviour you see. If you configure them as one region, does it work better?

  6. Niobos says:

    Mark,
    The test I describe in my post is full HP ProCurve, no Cisco devices are involved. I’m sure the Regions match (verified by show span).
    When I repeat the same lab with only Cisco devices, things work “as expected”.

    Just FYI:
    * HP defaults to MSTP, but with the region name equal to its MAC-address (i.e. all connections are edge)
    * Cisco defaults to per-VLAN spanning tree (PVST)

  7. jerome says:

    Saying it’s a bug is incorrect.
    It’s completely normal: MST doesn’t care about what you tag or untag. The topology applies only to instances, regardless of what you tag on your ports. Even a port with an untagged VLAN could be preferred over a port tagging all VLANs (if it has a lower path cost, i.e. more bandwidth; take care: LACP is not taken into account in the calculation!).
    It’s normal that Cisco works off the shelf because of the PVST+ or rapid-PVST default. If you use MST on your Cisco, you’ll have the same behavior too.
    Your way to go is to decrease the path cost and priority of the trunk (LACP) ports of Left and Right, if you ever read this 6 months later!

  8. Niobos says:

    Jerome,
    Thanks for your comment. However, I’d like to clarify some things:
    * The problem is not between tagged/untagged, but between allowed/forbidden. ProCurve’s implementation sets a port in forwarding even if the VLAN is not allowed at all (neither tagged nor untagged)
    * I know I can solve this issue by tweaking the priorities. My point is that the ProCurve implementation CONSIDERS the wrong ports; setting “forbid 1” in a VLAN should be enough to remove it from STP-consideration as well.
    * I did use MST on Cisco and had different, in my eyes better, behavior

  9. Nicola says:

    I have configured this using ProCurve switches and have no issue with the forwarding of traffic between the switches.

    Have you looked at your configuration of MSTP, for such a minor configuration of Spanning Tree you might find that a single Instance for Left and Right.
    assigning priority 0 to left and Priority 1 to right will ensure that the communication between the two continues.

    leaving either the management vlan to run in the Default IST often referred to as the CST, and configuring the Root of the CST to match the configuration of the created instance.

    You will need to configure the Config-name and config-revision also to ensure all switches communicate properly.

    Please can you post your configuration of Spanning Tree, and I will help you with your CONFIGURATION MISTAKES, me personally I would RTFM, or attend some training as HP is not CISCO.

  10. Niobos says:

    Nicola,

    I updated the post to include the “show tech all” output, which includes the configuration and pretty much every show-command available.

    If I understand you correctly, you propose to split the MST-region into two separate regions? In the situation explained above, this will indeed solve the issue. However, the original problem we had with a customer was much more complex; simply splitting the region was not possible.

    Feel free to review and correct my configuration, preferably without splitting the MST-region.

  11. Molina says:

    It seems that the root of the problem is that you should configure the STP root of each instance. In the topology above, you need to define the management switch as the root of instance 1, and choose one of the two switches, left or right, to be the root of instance 2. I can do it just by setting the priority of each instance.

  12. Niobos says:

    Molina,
    Thank you for the response, but your comment does not provide a solution, but a workaround. I can easily draw a more complex topology where “choose the root right” is not possible.

  13. Scott says:

    Arrrgh! I wish I had known/thought to look for this BEFORE I bought two Procurve switches. All my past experience was with Ciscos and I’d assumed the same behavior was configurable, even if not default.

    I have two trunked switches that I wish to terminate redundant data center drops into. I forbid the DC drop VLAN from the trunk, but…well, obviously you know what happens as your management/data config above exactly mirrors my issue.

    I got quite the surprise when I plugged it together and my trunk was blocked! Time to set the cost/priority to ensure it’s never blocked and to put the DC drop vlan into the trunk as it’s virtually there (in STP terms) anyway.

  14. Simon says:

    Looking at your posted configs there appears to be a glaring error. All switches are operating in the same MST instance with the same root priority! You are leaving MST to decide for itself what you intended. With MST you have to think about what you want your trees to look like and manually create them via different root priorities.

    For your example you would create an instance for the management vlan that has a priority of 0 for the management switch and >0 for the other two. Then create another instance for the data vlan and set left as 0 and right as 8192 (or vice versa) and management as default (32768). Job done!

  15. Niobos says:

    Simon,
    As I already replied to Molina above, your comment provides a workaround for this specific situation, not a solution. I can easily design a more complex example where there is no “right root” to choose.

  16. Reinder says:

    Hi Niobos,

    I’m relatively new to spanning tree, as my topologies until now were so straightforward that I could not appreciate the Spanning Tree Protocol that much (too much reading, an annoying habit of vendors using different solutions, etc.). From what I’ve found and tested until now,
    I have to agree with Simon and Molina…

    MSTP, in all configuration examples that I have seen and configs I tried, just asks for one or more MST instances (using only the standard default instance would defeat the whole purpose of using MST over RST) with designated primary (priority 0) roots and backup (priority 2) roots. The only problem with RST is that Cisco does not seem to use this standardized protocol properly unless it is used with either its proprietary PVST+ or MST, meaning that in mixed network environments MST is your only available choice.

    Maybe in your eyes the Cisco way of MST is better, and perhaps it really is; I can’t really judge that yet. Personally I would prefer a resilient switch hierarchy where I can predict with certainty what will happen if a switch goes down or a trunk (EtherChannel in Cisco language) fails.

    I’m sure that you can come up with a more difficult network layout, but the question is: where and when do you actually need a layout so difficult that the solution suggested by Simon and Molina would no longer work? It seems to me anyway that it’s a less error-prone solution than specifically denying certain VLANs on a link to make sure that the spanning tree follows what you would expect. I mean, what if an underqualified colleague – or worse, some visiting group of accountant blokes with their personal el-cheapo switches – accidentally decides to create a loop? That’s what Spanning Tree is all about in the first place.

  17. Niobos says:

    I’m not going to go into this again…

    The point I was trying to raise, obviously with the wrong example, is that HP’s MSTP implementation considers using ports which are not carrying the VLANs of the instance. In my opinion, Cisco’s implementation of MSTP does “the right thing” by ignoring ports that don’t carry the VLANs of the instance. I say “the right thing”, because this is the behavior I usually want and I can’t find a situation where HP’s default would be more appropriate.

    Yes, I know I can change the port priority, but that would mean that for every VLAN change I make, I also need to recalculate the MSTP topology by hand to verify that this particular change won’t cause a disconnection when STP loses a link somewhere.

  18. Jimbo says:

    Hi Niobos,

    Well it’s 2018, and I have the same grumble. Indeed the HP solution is dangerous as it designates a port that can’t actually carry the traffic for the VLANs which the instance is configured for. In my opinion very stupid.

    I think replacing the switches may be the way to go!

  19. Jimbo says:

    oh… and for what it’s worth I paired mine with an old 3Com in the lab, and the 3Com did what we wanted and took the VLANs into account.