Juniper: Constrained Shortest Path First (CSPF)

It's time to recap a few basics of MPLS, and in particular of CSPF. The Constrained Shortest Path First (CSPF) algorithm allow an ingress LSR to compute a Label Switched Path (LSP) out of a Traffic Engineering (TE) database, the latter includes various constraints or requirements on how a LSP must be signaled. As you may wonder, CSPF is widely use for traffic engineering purpose, but it's also a prerequisite for two protection mechanisms, namely Fast Reroute (FRR) and link/node protection. In fact, these two, uses the TE database to compute and later signal backup tunnels (or bypass LSPs). CSPF is therefore an important piece on the MPLS chessboard.

Do I need to enable CSPF to control where to send my traffic? The answer is no. You can manually configure an Explicit Route Objects (ERO) list, and let RSVP doing the job. In the presence of the ERO object, the RSVP Path messages will follow the path specified, thus the Resv messages carrying the labels in the opposite direction.

As the name implies, Constrained Shortest Path First (CSPF) is an extension of the well-known SPF algorithm. We still want to compute the shortest path, but before, we want to take into account various constraints, which could be one or many of the following:

  • LSP Priority
  • Minimum Bandwidth Guarantee
  • End-to-End Delay
  • Hops Limitation
  • Administrative Coloring
  • Explicit route (strict or loose)

All these information are usually conveyed by a link-state dynamic routing protocol like OSPF (RFC3630) or IS-IS (RFC5305). OSPF implementations use the area-local, Type-10 Opaque LSAs with new-TLVs and sub-TLVs definitions. ISIS also uses sub-CLVs to convey TE attributes, which for the most important are Sub-TLV 3 (Admin Group), Sub-TLV 6 (Tunnel Source), Sub-TLV 8 (Tunnel Destination), Sub-TLV 9 (Maximum Bandwidth), Sub-TLV 10 (Reservable Bandwidth) and finally Sub-TLV 11 (Unreserved Bandwidth). They are included in TLV 22 (Extended IS Reachability) and TLV 135 (Extended IP Reachability) which can also carry wide-metrics.

With both OSPF and ISIS you can limit the propagation scope of your traffic-engineering information, using respectively the OSPF areas and the ISIS levels. Upon reception of the above mentioned IGP's LSA, the MPLS Ingress LSRs store the TE information inside their respective database. They can now compute a traffic-engineered LSP according to the constraints you locally defined. To do so, they run the CSPF algorithm; the following compute traffic paths of candidate LSPs, one at a time, beginning with the highest-priority LSP (the one with the lowest setup priority value). Each LSP is evaluated using the following sequence:

  1. Prune links with insufficient reservable bandwidth.
  2. Prune all links that do not have an included color (if LSP admin-group include statement present).
  3. Prune all links that do have a excluded color (if LSP admin-group exclude statement present). If a link has no color, it's accepted.
  4. Calculate the shortest path to the LSP destination consistent with the ERO.
  5. If equal-cost paths exists, choose the path whose last-hop address equals the LSP's destination.
  6. If equal-cost paths remains, select the one with the least number of hops.
  7. If equal-cost paths still remain, applies CSPF load-balancing rules (least fill, most fill, or random).
  8. Returns the complete ERO for the choosen path.

At the end of the evaluation process (step 8), the CSPF algorithm return an ordered-list of links the path must traverse. This is similar to a strict ERO you can manually configure. This time, it's computed by the CSPF algorithm and it's directly returned to RSVP for signaling.

For the purpose of demonstrating how CSPF compute paths, I built a lab reproducing a fictitious European MPLS backbone. The latter includes 13 label switching routers (LSRs), several PEs and a few CEs. The backbone is made of a mix of FastEthernet, GigabitEthernet, Logical Tunnels and ATM links, therefore, the bandwidth capacities may varies significantly. The IGP used is ISIS, with only Level-2 adjacency. I didn't generated much data traffic over the LSPs, because what is important here is how the LSPs are signaled according to the TE information distributed over the network. In fact, the control-plane is not aware of the amount of traffic passing through the data-plane; there's not even any insurance the traffic won't exceed the signaled bandwidth guarantees. There's some mechanisms to specifically address those concerns, but they're outside the scope of this article. As they are important features, they may be however worth a second article.

MPLS CSPF Lab - Overview

To start our demonstration. we want to establish two Label Switched Path (LSPs) between our Provider Edge Routers in Paris and Frankfurt. Our LSPs will have a bandwidth guarantee of 80 Mbits each. As the above diagram show, there is three possible traffic paths for the LSPs between the Paris PE and the Frankfurt PE routers. In this lab, I intentionally configured (overridden) the core interfaces IGP costs so all the paths have a cumulative cost of 100 (merely disabling step 4 of the CSPF algorithm). They do have however different numbers of hops, Path B has 2 hops, Path A and C have 3. As we will see, the hop count is very important, because the CSPF algorithm is using it as a tie-breaker when equal-cost paths exists (step 6).

The first sequence of the CSPF algorithm is to prune all links that have insufficient available bandwidth. As we have already several LSPs established across our backbone and they transits the same core links as our candidate LSPs, they will compete with each other. In fact, these pre-established LSPs previously made bandwidth reservations across the core links, as the output below show:

root@lab-m7i-2> show rsvp interface logical-system Paris        
RSVP interface: 4 active
                  Active Subscr- Static      Available   Reserved    Highwater
Interface   State resv   iption  BW          BW          BW          mark
fe-0/0/0.602Up         4   100%  100Mbps     100Mbps     0bps        0bps       
fe-1/3/0.607Up         0   100%  100Mbps     100Mbps     0bps        0bps       
fe-1/3/1.606Up         5   100%  100Mbps     70Mbps      30Mbps      30Mbps     
lt-1/2/0.1605
            Up         4   100%  800Mbps     770Mbps     30Mbps      30Mbps  

As our first LSP between Paris and Frankfurt PE require a bandwidth guarantee of 80 Mbps, the link between Paris and London (fe-1/3/1.606) is pruned. In fact, as you can see on the output above, the available bandwidth is only of 70Mbps. Therefore, Path A cannot be used to signal our LSP. Path B and C do have however enough available bandwidth. In this scenario, the link between Paris and London has been pruned by the CSPF algorithm, thus directly excluding Path A from the computation. In a real network, the bandwidth could also have been insufficient between London and Amsterdam due to other reservations made between these two P routers. The common point here, is that CSPF can figure that out prior to signaling because of the TE database. RSVP can still reject the reservation if not enough resources are available. In fact, the Traffic Engineering (TE) database could hold outdated information.

In MPLS, all LSPs are unidirectional, as the bandwidth reservations. In the reverse direction, the ingress PE Frankfurt has different available bandwidths values:

root@lab-m7i-2> show rsvp interface logical-system Frankfurt 
RSVP interface: 3 active
                  Active Subscr- Static      Available   Reserved    Highwater
Interface   State resv   iption  BW          BW          BW          mark
fe-0/0/0.601Up         5   100%  100Mbps     80Mbps      20Mbps      20Mbps     
fe-1/3/0.611Up         6   100%  100Mbps     100Mbps     0bps        0bps       
lt-1/2/0.1610
            Up         8   100%  800Mbps     780Mbps     20Mbps      20Mbps    

This is an important, because in a network as big as this lab, bandwidth reservations between two PEs can be asymmetrical. The two unidirectional LSPs can take different paths, and can even have different bandwidth guarantees. You may want to force symmetric routing across your backbone, to do so, you can use strict ERO and at some extend administrative colors.

Let's configure our candidate LSPs on the Paris and Frankfurt PEs, and then we verify the results.

root@lab-m7i-2# show | compare 
[edit logical-systems Frankfurt protocols mpls]
      label-switched-path Frankfurt-To-Milan { ... }
+     label-switched-path Frankfurt-To-Paris {
+         to 10.0.255.3;
+         bandwidth 80m;
+     }
[edit logical-systems Paris protocols mpls]
+     label-switched-path Paris-To-Frankfurt {
+         to 10.0.255.7;
+         bandwidth 80m;
+     }

[edit]
root@lab-m7i-2# commit 
commit complete

[edit]
root@lab-m7i-2# run show rsvp session logical-system Paris ingress name Paris-To-Frankfurt extensive       
Ingress RSVP: 1 sessions

10.0.255.7
  From: 10.0.255.3, LSPstate: Up, ActiveRoute: 0
  LSPname: Paris-To-Frankfurt, LSPpath: Primary
  LSPtype: Static Configured
  Suggested label received: -, Suggested label sent: -
  Recovery label received: -, Recovery label sent: 300144
  Resv style: 1 FF, Label in: -, Label out: 300144
  Time left:    -, Since: Mon Aug 26 14:03:47 2013
  Tspec: rate 80Mbps size 80Mbps peak Infbps m 20 M 1500
  Port number: sender 1 receiver 50777 protocol 0
  PATH rcvfrom: localclient 
  Adspec: sent MTU 1500
  Path MTU: received 1500
  PATH sentto: 10.0.0.9 (fe-0/0/0.602) 6 pkts
  RESV rcvfrom: 10.0.0.9 (fe-0/0/0.602) 6 pkts
  Explct route: 10.0.0.9 10.0.0.6 
  Record route: <self> 10.0.0.9 10.0.0.6  
Total 1 displayed, Up 1, Down 0

[edit]
root@lab-m7i-2# run show rsvp session logical-system Frankfurt ingress name Frankfurt-To-Paris extensive   
Ingress RSVP: 9 sessions

10.0.255.3
  From: 10.0.255.7, LSPstate: Up, ActiveRoute: 0
  LSPname: Frankfurt-To-Paris, LSPpath: Primary
  LSPtype: Static Configured
  Suggested label received: -, Suggested label sent: -
  Recovery label received: -, Recovery label sent: 300160
  Resv style: 1 FF, Label in: -, Label out: 300160
  Time left:    -, Since: Mon Aug 26 14:03:48 2013
  Tspec: rate 80Mbps size 80Mbps peak Infbps m 20 M 1500
  Port number: sender 1 receiver 65379 protocol 0
  PATH rcvfrom: localclient 
  Adspec: sent MTU 1500
  Path MTU: received 1500
  PATH sentto: 10.0.0.5 (fe-0/0/0.601) 6 pkts
  RESV rcvfrom: 10.0.0.5 (fe-0/0/0.601) 6 pkts
  Explct route: 10.0.0.5 10.0.0.10 
  Record route: <self> 10.0.0.5 10.0.0.10  
Total 1 displayed, Up 1, Down 0

The output above show our ingress LSPs are all in the UP state. The record route lists holds the LSR's core interfaces IP addresses the RSVP's Path messages traversed. The lists here indicates the LSPs are established across Path B.

MPLS CSPF Lab - Bidirectional LSP Between PEs

Someone may ask why Path C has not be picked up. After all, all paths have the same IGP costs. This is because of the step 6 of the algorithm. When equal-cost paths exists, the paths with the least number of hops always win.

Now, let's see how we can make use of administrative coloring (step 2, 3), an inherent feature of MPLS Traffic Engineering. We want to mark all our backbone's core links with one or more of the following colors: Bronze, Silver or Gold. We assume here to each colors correspond to a specific SLA level. As some core links perform better than others (lower delay, better reliability), we want to direct the important traffic on these links.

MPLS CSPF Lab - Administrative Coloring

We start by defining our colors on each LSRs by creating what is called an administrative groups in Junos:

[edit logical-systems Paris protocols mpls]
root@lab-m7i-2# set protocols mpls admin-groups gold 1
root@lab-m7i-2# set protocols mpls admin-groups silver 2
root@lab-m7i-2# set protocols mpls admin-groups bronze 3

We then apply the admin-groups to our MPLS interfaces:

[edit logical-systems Paris]
root@lab-m7i-2# set protocols mpls interface fe-0/0/0.602 admin-group bronze 
root@lab-m7i-2# set protocols mpls interface fe-1/3/1.606 admin-group silver                           
root@lab-m7i-2# set protocols mpls interface lt-1/2/0.1605 admin-group [ gold silver ]  
root@lab-m7i-2# show | compare 
[edit logical-systems Paris]
+   admin-groups {
+       gold 1;
+       silver 2;
+       bronze 3;
+   }
[edit logical-systems Paris]
     interface all { ... }
+    interface fe-0/0/0.602 {
+        admin-group bronze;
+    }
+    interface fe-1/3/1.606 {
+        admin-group silver;
+    }
+    interface lt-1/2/0.1605 {
+        admin-group [ gold silver ];
+    }

We repeat the last two operations on all LSRs. And, we finally instruct our PE to traffic-engineer their ingress LSPs according to the color(s) or admin-group(s) they belong to, here 'Gold':

[edit logical-systems Frankfurt protocols mpls label-switched-path Frankfurt-To-Paris]
+      admin-group include-any gold;
[edit logical-systems Paris protocols mpls label-switched-path Paris-To-Frankfurt]
+      admin-group include-any gold;
[edit]
root@lab-m7i-2# commit 
commit complete
root@lab-m7i-2# top
[edit]
root@lab-m7i-2# run show rsvp session logical-system Paris ingress name Paris-To-Frankfurt extensive 
Ingress RSVP: 1 sessions

10.0.255.7
  From: 10.0.255.3, LSPstate: Up, ActiveRoute: 0
  LSPname: Paris-To-Frankfurt, LSPpath: Primary
  LSPtype: Static Configured
  Suggested label received: -, Suggested label sent: -
  Recovery label received: -, Recovery label sent: 301056
  Resv style: 1 FF, Label in: -, Label out: 301056
  Time left:    -, Since: Tue Aug 27 13:12:29 2013
  Tspec: rate 80Mbps size 80Mbps peak Infbps m 20 M 1500
  Port number: sender 2 receiver 8755 protocol 0
  PATH rcvfrom: localclient 
  Adspec: sent MTU 1500
  Path MTU: received 1500
  PATH sentto: 10.0.0.21 (lt-1/2/0.1605) 12 pkts
  RESV rcvfrom: 10.0.0.21 (lt-1/2/0.1605) 11 pkts
  Explct route: 10.0.0.21 10.0.0.1 10.0.0.6 
  Record route: <self> 10.0.0.21 10.0.0.1 10.0.0.6  
Total 1 displayed, Up 1, Down 0

[edit]
root@lab-m7i-2# run show rsvp session logical-system Frankfurt ingress name Frankfurt-To-Paris extensive 
Ingress RSVP: 9 sessions

10.0.255.3
  From: 10.0.255.7, LSPstate: Up, ActiveRoute: 0
  LSPname: Frankfurt-To-Paris, LSPpath: Primary
  LSPtype: Static Configured
  Suggested label received: -, Suggested label sent: -
  Recovery label received: -, Recovery label sent: 301600
  Resv style: 1 FF, Label in: -, Label out: 301600
  Time left:    -, Since: Tue Aug 27 13:13:26 2013
  Tspec: rate 80Mbps size 80Mbps peak Infbps m 20 M 1500
  Port number: sender 1 receiver 24554 protocol 0
  PATH rcvfrom: localclient 
  Adspec: sent MTU 1500
  Path MTU: received 1500
  PATH sentto: 10.0.0.5 (fe-0/0/0.601) 11 pkts
  RESV rcvfrom: 10.0.0.5 (fe-0/0/0.601) 10 pkts
  Explct route: 10.0.0.5 10.0.0.2 10.0.0.22 
  Record route: <self> 10.0.0.5 10.0.0.2 10.0.0.22  
Total 1 displayed, Up 1, Down 0

As we can see, the LSPs now take Path C. Path A and B being pruned by Step 2 of the CSPF algorithm.

Now, let's configure our candidate LSPs, so they belong to the 'Silver' color:

root@lab-m7i-2# show | compare 
[edit logical-systems Frankfurt protocols mpls label-switched-path Frankfurt-To-Paris]
-      admin-group include-any gold;
+      admin-group include-any silver;
[edit logical-systems Paris protocols mpls label-switched-path Paris-To-Frankfurt]
-      admin-group include-any gold;
+      admin-group include-any silver;

[edit]
root@lab-m7i-2# run show rsvp session logical-system Paris ingress name Paris-To-Frankfurt extensive       
Ingress RSVP: 1 sessions

10.0.255.7
  From: 10.0.255.3, LSPstate: Up, ActiveRoute: 0
  LSPname: Paris-To-Frankfurt, LSPpath: Primary
  LSPtype: Static Configured
  Suggested label received: -, Suggested label sent: -
  Recovery label received: -, Recovery label sent: 301024
  Resv style: 1 FF, Label in: -, Label out: 301024
  Time left:    -, Since: Tue Aug 27 14:06:55 2013
  Tspec: rate 80Mbps size 80Mbps peak Infbps m 20 M 1500
  Port number: sender 1 receiver 8756 protocol 0
  PATH rcvfrom: localclient 
  Adspec: sent MTU 1500
  Path MTU: received 1500
  PATH sentto: 10.0.0.26 (fe-1/3/1.606) 15 pkts
  RESV rcvfrom: 10.0.0.26 (fe-1/3/1.606) 15 pkts
  Explct route: 10.0.0.26 10.0.0.34 10.0.0.42 
  Record route: <self> 10.0.0.26 10.0.0.34 10.0.0.42  
Total 1 displayed, Up 1, Down 0

[edit]
root@lab-m7i-2# run show rsvp session logical-system Frankfurt ingress name Frankfurt-To-Paris extensive 
Ingress RSVP: 9 sessions

10.0.255.3
  From: 10.0.255.7, LSPstate: Up, ActiveRoute: 0
  LSPname: Frankfurt-To-Paris, LSPpath: Primary
  LSPtype: Static Configured
  Suggested label received: -, Suggested label sent: -
  Recovery label received: -, Recovery label sent: 301616
  Resv style: 1 FF, Label in: -, Label out: 301616
  Time left:    -, Since: Tue Aug 27 14:07:24 2013
  Tspec: rate 80Mbps size 80Mbps peak Infbps m 20 M 1500
  Port number: sender 1 receiver 24557 protocol 0
  PATH rcvfrom: localclient 
  Adspec: sent MTU 1500
  Path MTU: received 1500
  PATH sentto: 10.0.0.5 (fe-0/0/0.601) 15 pkts
  RESV rcvfrom: 10.0.0.5 (fe-0/0/0.601) 15 pkts
  Explct route: 10.0.0.5 10.0.0.2 10.0.0.22 
  Record route: <self> 10.0.0.5 10.0.0.2 10.0.0.22  
Total 1 displayed, Up 1, Down 0

We can see the Record Route lists changed, one LSP is now taking Path A and the other Path C. It confirm the fact we can easily have asymmetric routing among LSPs. To understand why Path A and C has been picked up, we must go to the penultimate steps of the CSPF algorithm, at step 7. The latter states when equal-cost paths exists, the path with the highest amount of available bandwidth must be picked (least-fill), the inverse (most-fill) or the path is chosen randomly (random). By default, Juniper Junos use the 'random' load-balancing method. To see what load-balancing method is in effect for a given LSP, use the show mpls lsp detail command:

root@lab-m7i-2> show mpls lsp logical-system Paris ingress detail 
Ingress LSP: 1 sessions

10.0.255.7
  From: 10.0.255.3, State: Up, ActiveRoute: 0, LSPname: Paris-To-Frankfurt
  ActivePath:  (primary)
  LSPtype: Static Configured
  LoadBalance: Random
  Encoding type: Packet, Switching type: Packet, GPID: IPv4
 *Primary                    State: Up
    Priorities: 7 0
    Bandwidth: 80Mbps
    SmartOptimizeTimer: 180
          Include Any: silver
    Computed ERO (S [L] denotes strict [loose] hops): (CSPF metric: 100)
 10.0.0.26 S 10.0.0.34 S 10.0.0.42 S 
    Received RRO (ProtectionFlag 1=Available 2=InUse 4=B/W 8=Node 10=SoftPreempt 20=Node-ID):
          10.0.0.26 10.0.0.34 10.0.0.42
Total 1 displayed, Up 1, Down 0

You can change the default behavior by using the least-fill, most-fill or random knobs in the MPLS label-switched-path configuration.

This conclude this article on the Constrained Shortest Path First (CSPF) algorithm. As previously stated, CSPF is an important piece of the MPLS toolstack. In fact, it allow us to control where our traffic must flow according to the administrative constraints we define and according to the state of the network, as MPLS-TE see it. CSPF is also a prerequisite for some protection mechanisms, because it allow them to achieve very fast re-convergence, even beating the SDH 50ms convergence time.

I hope you enjoyed this article. If you are working with MPLS and CSPF and you have additional information to share (I'm pretty sure I omitted a few things), feel free to leave a comment to this article.

About the author Nicolas Chabbey

Nicolas Chabbey is a Network Engineer certified with Cisco Systems and Juniper Networks. He has begun his career in 2003, and has designed, implemented and maintained networks for enterprises and service providers. When he is not behind a computer, he is riding his mountain bike across the Swiss alps.

Previous Home Next

Comments

blog comments powered by Disqus