Thursday, February 27, 2014

Software upgrade procedures on the routers from different vendors - Part 3

In Part 3 (see also Part 1 and Part 2) we look at the most complicated upgrade of the series: the Cisco CRS-1 (aka HFR, "huge f**king router") running IOS-XR. IOS-XR competes with Junos in flexibility and offers extended capabilities that are absent from standard IOS, but upgrading it takes a lot of time and many steps. Even adding a few SMUs (bug fixes) to the running software may require a reload of the box and interrupt service for about 30 minutes. ISSU is not an option at all, especially for major release upgrades.
So, let's cross our fingers and start the upgrade procedure:






IOS XR Upgrade Procedure on CRS-1 with Redundant RPs






The example is based on a CRS-1 upgrade from 4.1.2 to the 4.2.4 PX image.
First, let’s obtain the required files from CCO:
1.       PIE file, in our case the composite one with k9: CRS-iosxr-px-k9-4.2.4.tar
2.       SMU files, in our case the recommended set: 4.2.4_hfr-px_REC_SMUS_2013-06-10.tar

Next, untar the archives on a PC. All the extracted files will be copied to harddisk: on both RP0 and RP1.

Check the free space on both hard disks:

RP/0/RP0/CPU0:#dir harddisk:

RP/0/RP0/CPU0:#dir harddisk: location 0/RP1/CPU0



Before the copy operation, a new directory has to be created on both hard disks:

RP/0/RP0/CPU0:#admin mkdir harddisk:/iosxr424

RP/0/RP0/CPU0:#admin mkdir harddisk:/iosxr424 location 0/RP1/CPU0


Copy the files to the CRS via FTP:

RP/0/RP0/CPU0:#copy ftp://1.1.1.1/4.2.4.files harddisk:/iosxr424

RP/0/RP0/CPU0:#copy ftp://1.1.1.1/4.2.4.files harddisk:/iosxr424  location 0/RP1/CPU0 

Remark: 4.2.4.files stands for all the files extracted from the TAR archives.


Alternatively, if the files are already saved on RP0, we can copy them between the RPs using a wildcard in the filename:

RP/0/RP0/CPU0:#copy harddisk:/iosxr424/* location 0/RP0/CPU0  harddisk:/iosxr424/  location 0/RP1/CPU0



Perform the necessary system checks:

show platform

show install active

cfs check

admin show license

 

 

RP/0/RP0/CPU0:#sh hfr
Node          Type              PLIM               State           Config State
------------- ----------------- ------------------ --------------- ---------------
0/0/2         FP40(SPA)         10X1GE             OK              PWR,NSHUT,MON
0/0/4         FP40(SPA)         OC192RPR-XFP       OK              PWR,NSHUT,MON
0/4/CPU0      FP40              2-10GbE-FLX        IOS XR RUN      PWR,NSHUT,MON
0/4/0         FP40(SPA)         1x10GE             OK              PWR,NSHUT,MON
0/4/4         FP40(SPA)         OC192RPR-XFP       OK              PWR,NSHUT,MON
0/6/CPU0      FP40              4-10GbE            IOS XR RUN      PWR,NSHUT,MON
0/7/CPU0      FP40              4-10GbE            IOS XR RUN      PWR,NSHUT,MON
0/RP0/CPU0    RP(Active)        N/A                IOS XR RUN      PWR,NSHUT,MON
0/RP1/CPU0    RP(Standby)       N/A                IOS XR RUN      PWR,NSHUT,MON

 

and so on…
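The platform check above can also be automated. Below is a minimal sketch that parses a table in the format shown and reports any node that is not in a healthy state. The function names are hypothetical, and the set of healthy states ("OK" and "IOS XR RUN") is an assumption based only on the sample rows in this post.

```python
import re

# Healthy states are an assumption taken from the sample output in this post,
# not an exhaustive list of what the router can report.
HEALTHY_STATES = {"OK", "IOS XR RUN"}

# A few rows copied from the output above.
SAMPLE = """\
Node          Type              PLIM               State           Config State
------------- ----------------- ------------------ --------------- ---------------
0/0/2         FP40(SPA)         10X1GE             OK              PWR,NSHUT,MON
0/4/CPU0      FP40              2-10GbE-FLX        IOS XR RUN      PWR,NSHUT,MON
0/RP0/CPU0    RP(Active)        N/A                IOS XR RUN      PWR,NSHUT,MON
0/RP1/CPU0    RP(Standby)       N/A                IOS XR RUN      PWR,NSHUT,MON
"""


def parse_platform(text):
    """Return {node: state} for every data row of the table."""
    states = {}
    for line in text.splitlines():
        # Columns are separated by two or more spaces, so a state such as
        # "IOS XR RUN" (single spaces) stays in one field.
        fields = re.split(r"\s{2,}", line.strip())
        if len(fields) == 5 and re.match(r"\d+/", fields[0]):
            states[fields[0]] = fields[3]
    return states


def unhealthy_nodes(text):
    """Return {node: state} for nodes whose state is not considered healthy."""
    return {n: s for n, s in parse_platform(text).items() if s not in HEALTHY_STATES}


if __name__ == "__main__":
    print(unhealthy_nodes(SAMPLE) or "all nodes healthy")
```

Running the same check before and after the upgrade gives a quick way to spot a card that did not come back.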
Check the free space on the flash cards (disk0):

RP/0/RP0/CPU0:#show media

 

Media Information for 0/RP0/CPU0.

                   Image   Current  Part

  Mountpoint       FsType  FsType   Size   State         DrvrPid  Mirror  Flags

================================================================================

  /disk0:          FAT16   FAT32    3.4G Mounted         0036891  Enabled

  /disk0a:         FAT16   FAT16    0.5G Mounted         0036891         

  /disk1:          FAT16   (?)           Not Present                     

  /disk1a:         FAT16   (?)           Not Present                      

  /harddisk:       QNX4    QNX4    33.5G Mounted         0032791         

  /harddiska:      QNX4    QNX4    11.2G Mounted         0032791         

  /harddiskb:      FAT32   FAT32   11.2G Mounted         0032791         

  /lcdisk0:        FAT32   (?)           Not Present                     

  /lcdisk0a:       FAT32   (?)           Not Present                     





RP/0/RP0/CPU0:#show media location 0/RP1/CPU0

 

Media Information for 0/RP1/CPU0.

                   Image   Current  Part

  Mountpoint       FsType  FsType   Size   State         DrvrPid  Mirror  Flags

================================================================================

  /disk0:          FAT16   FAT32    3.4G Mounted         0036891  Enabled

  /disk0a:         FAT16   FAT16    0.5G Mounted         0036891         

  /disk1:          FAT16   (?)           Not Present                     

  /disk1a:         FAT16   (?)           Not Present                     

  /harddisk:       QNX4    QNX4    33.5G Mounted         0028696         

  /harddiska:      QNX4    QNX4    11.2G Mounted         0028696         

  /harddiskb:      FAT32   FAT32   11.2G Mounted         0028696         

  /lcdisk0:        FAT32   (?)           Not Present                      

  /lcdisk0a:       FAT32   (?)           Not Present                    



RP/0/RP0/CPU0#show filesystem  disk0:





   Model:                       UNIGEN FLASH                          

   Firmware:                       30/06/03

   BIOS Geometry:               16 Heads, 63 Sectors

   Drive Geometry:              16 Heads, 8150 Tracks, 63 Sectors

   Drive Capacity:              8215200 Cur Sctrs, 8215200 User Sctrs, Extd

   Address Mode:                LBA

   PIO mode:                    2

   Multimode Blocks/Transfer:   32



   Capacity:    8215201 Sectors, Total 4206182912 Bytes, (512 Bytes/sector)



RP/0/RP0/CPU0#show filesystem  disk0: location 0/RP1/CPU0





   Model:                       UNIGEN FLASH                          

   Firmware:                       30/06/03

   BIOS Geometry:               16 Heads, 63 Sectors

   Drive Geometry:              16 Heads, 8150 Tracks, 63 Sectors

   Drive Capacity:              8215200 Cur Sctrs, 8215200 User Sctrs, Extd

   Address Mode:                LBA

   PIO mode:                    2

   Multimode Blocks/Transfer:   32



   Capacity:    8215201 Sectors, Total 4206182912 Bytes, (512 Bytes/sector)
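As a quick sanity check of the Capacity line above, here is a small sketch that parses it and confirms that the sector count multiplied by the sector size gives the reported byte total. The function name is hypothetical; the input line is copied from the output above.

```python
import re

# Capacity line copied from the "show filesystem disk0:" output above.
CAPACITY_LINE = "Capacity:    8215201 Sectors, Total 4206182912 Bytes, (512 Bytes/sector)"


def parse_capacity(line):
    """Return (sectors, total_bytes, bytes_per_sector) from a Capacity line."""
    match = re.search(r"(\d+) Sectors, Total (\d+) Bytes, \((\d+) Bytes/sector\)", line)
    if match is None:
        raise ValueError("not a Capacity line: %r" % line)
    sectors, total_bytes, bytes_per_sector = (int(g) for g in match.groups())
    return sectors, total_bytes, bytes_per_sector


if __name__ == "__main__":
    sectors, total_bytes, bytes_per_sector = parse_capacity(CAPACITY_LINE)
    # The reported total should equal sectors * sector size.
    print(sectors * bytes_per_sector == total_bytes)
    # Card size in GiB (2**30 bytes), rounded to two decimals.
    print(round(total_bytes / 2**30, 2))
```

The result is roughly 3.9 GiB, which is in line with the 3.4G (disk0:) plus 0.5G (disk0a:) partitions reported by "show media".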



If there is not enough free space on disk0 (at least 1.5GB is needed), the inactive packages should be removed:

RP/0/RP0/CPU0:#admin show install inactive

If some inactive packages are found, remove them and commit:

RP/0/RP0/CPU0:# admin install remove inactive

RP/0/RP0/CPU0:# admin install commit
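The free-space rule above can be sketched as a small check. It assumes the "dir" output ends with a summary of the form "N bytes total (M bytes free)" and reads "1.5GB" as decimal gigabytes; both are assumptions, and the summary lines in the example are hypothetical values, not taken from this router.

```python
import re

# "at least 1.5GB" from the text, read here as decimal gigabytes (an assumption).
REQUIRED_FREE_BYTES = 1_500_000_000


def free_bytes(dir_summary):
    """Extract the free-byte count from a '... bytes total (... bytes free)' line."""
    match = re.search(r"\((\d+) bytes free\)", dir_summary)
    if match is None:
        raise ValueError("no free-space summary found")
    return int(match.group(1))


def needs_cleanup(dir_summary, required=REQUIRED_FREE_BYTES):
    """True when the free space is below the required amount."""
    return free_bytes(dir_summary) < required


if __name__ == "__main__":
    # Hypothetical summary lines for illustration only.
    print(needs_cleanup("3400000000 bytes total (900000000 bytes free)"))
    print(needs_cleanup("3400000000 bytes total (2100000000 bytes free)"))
```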



For the upgrade to proceed without interruptions, it is recommended to drain the traffic and shut down the routing protocols (BGP, OSPF, ISIS, LDP, etc.). Even better is to reload the whole box before the upgrade:

 

RP/0/RP0/CPU0:#admin reload location all


Also let the FPDs be upgraded automatically during the new image installation:

RP/0/RP0/CPU0:#admin

RP/0/RP0/CPU0(admin):#conf  t

RP/0/RP0/CPU0(admin):# fpd auto-upgrade

RP/0/RP0/CPU0(admin):# commit



Start the upgrade process with text scripts prepared in advance in Notepad:

RP/0/RP0/CPU0:# admin install add harddisk:/iosxr424/hfr-mpls-px.pie-4.2.4 harddisk:/iosxr424/hfr-services-px.pie-4.2.4 harddisk:/iosxr424/hfr-fpd-px.pie-4.2.4 harddisk:/iosxr424/hfr-mcast-px.pie-4.2.4 harddisk:/iosxr424/hfr-mini-px.pie-4.2.4 harddisk:/iosxr424/hfr-k9sec-px.pie-4.2.4 harddisk:/iosxr424/hfr-diags-px.pie-4.2.4 harddisk:/iosxr424/hfr-mgbl-px.pie-4.2.4 harddisk:/iosxr424/hfr-doc-px.pie-4.2.4 sync

 

Info:     The following packages are now available to be activated:

Info:     

Info:         disk0:hfr-mpls-px-4.2.4

Info:         disk0:hfr-services-px-4.2.4

Info:         disk0:hfr-fpd-px-4.2.4

Info:         disk0:hfr-mcast-px-4.2.4

Info:         disk0:hfr-mini-px-4.2.4

Info:         disk0:hfr-k9sec-px-4.2.4

Info:         disk0:hfr-diags-px-4.2.4

Info:         disk0:hfr-mgbl-px-4.2.4

Info:         disk0:hfr-doc-px-4.2.4

Info:     

Info:     The packages can be activated across the entire router.
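Typing a command line this long by hand invites typos, so the text script can be generated instead. The sketch below builds the "admin install add" line from a list of file names. The helper name is hypothetical, and the directory is a parameter; the example passes harddisk:/iosxr424, the directory created earlier in this post.

```python
# PIE file names as they appear in the "install add" command above.
PIE_FILES = [
    "hfr-mpls-px.pie-4.2.4",
    "hfr-services-px.pie-4.2.4",
    "hfr-fpd-px.pie-4.2.4",
    "hfr-mcast-px.pie-4.2.4",
    "hfr-mini-px.pie-4.2.4",
    "hfr-k9sec-px.pie-4.2.4",
    "hfr-diags-px.pie-4.2.4",
    "hfr-mgbl-px.pie-4.2.4",
    "hfr-doc-px.pie-4.2.4",
]


def build_install_add(directory, files, sync=True):
    """Build a single 'admin install add ...' command line (hypothetical helper)."""
    paths = [directory.rstrip("/") + "/" + name for name in files]
    words = ["admin", "install", "add"] + paths
    if sync:
        # "sync" is the trailing keyword used in the commands in this post.
        words.append("sync")
    return " ".join(words)


if __name__ == "__main__":
    print(build_install_add("harddisk:/iosxr424", PIE_FILES))
```

The printed line can be pasted into the router as is. The same helper works for the SMU files, given their file names.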





Let’s add the recommended SMUs as well:

RP/0/RP0/CPU0:# admin install add harddisk:/iosxr424/hfr-px-4.2.4.CSCue53201.pie harddisk:/iosxr424/hfr-px-4.2.4.CSCug09031.pie harddisk:/iosxr424/hfr-px-4.2.4.CSCug20386.pie harddisk:/iosxr424/hfr-px-4.2.4.CSCue55783.pie harddisk:/iosxr424/hfr-px-4.2.4.CSCue71114.pie harddisk:/iosxr424/hfr-px-4.2.4.CSCue04603.pie harddisk:/iosxr424/hfr-px-4.2.4.CSCue19011.pie harddisk:/iosxr424/hfr-px-4.2.4.CSCuc56287.pie harddisk:/iosxr424/hfr-px-4.2.4.CSCue21974.pie harddisk:/iosxr424/hfr-px-4.2.4.CSCud41972.pie sync

 

Info:     The following packages are now available to be activated:

Info:     

Info:         disk0:hfr-px-4.2.4.CSCue53201-1.0.0

Info:         disk0:hfr-px-4.2.4.CSCug09031-1.0.0

Info:         disk0:hfr-px-4.2.4.CSCug20386-1.0.0

Info:         disk0:hfr-px-4.2.4.CSCue55783-1.0.0

Info:         disk0:hfr-px-4.2.4.CSCue71114-1.0.0

Info:         disk0:hfr-px-4.2.4.CSCue04603-1.0.0

Info:         disk0:hfr-px-4.2.4.CSCue19011-1.0.0

Info:         disk0:hfr-px-4.2.4.CSCuc56287-1.0.0

Info:         disk0:hfr-px-4.2.4.CSCue21974-1.0.0

Info:         disk0:hfr-px-4.2.4.CSCud41972-1.0.0

Info:     

Info:     The packages can be activated across the entire router.
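The two outputs above show how file names turn into package names on disk0. The sketch below reproduces that mapping so the package list for a later step can be derived from the file list. The helper is hypothetical, and the "-1.0.0" suffix for SMUs is simply what the output above shows; it is an assumption that every SMU carries that version.

```python
def package_name(filename, device="disk0:", smu_version="1.0.0"):
    """Map a PIE/SMU file name to the package name reported by 'install add'."""
    if filename.endswith(".pie"):
        # SMU file: hfr-px-4.2.4.CSCue53201.pie -> disk0:hfr-px-4.2.4.CSCue53201-1.0.0
        # (the version suffix is an assumption based on the output in this post)
        return device + filename[: -len(".pie")] + "-" + smu_version
    if ".pie-" in filename:
        # Package file: hfr-mpls-px.pie-4.2.4 -> disk0:hfr-mpls-px-4.2.4
        return device + filename.replace(".pie-", "-", 1)
    raise ValueError("unrecognised PIE file name: " + filename)


if __name__ == "__main__":
    for name in ("hfr-mpls-px.pie-4.2.4", "hfr-px-4.2.4.CSCue53201.pie"):
        print(name, "->", package_name(name))
```

It is still worth comparing the result with the names printed by the router, since those are authoritative.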




Time to activate the new XR:

RP/0/RP0/CPU0:# admin install activate disk0:hfr-mpls-px-4.2.4 disk0:hfr-doc-px-4.2.4 disk0:hfr-services-px-4.2.4 disk0:hfr-fpd-px-4.2.4 disk0:hfr-mcast-px-4.2.4 disk0:hfr-mini-px-4.2.4 disk0:hfr-k9sec-px-4.2.4 disk0:hfr-diags-px-4.2.4 disk0:hfr-mgbl-px-4.2.4 sync test


The “test” keyword performs a dry run of the installation and checks for errors. Once it succeeds, remove “test” and perform the real activation:

Info:     This operation will reload the following nodes in parallel:

Info:         0/0/SP (MSC-DRP-SP) (Admin Resource)

Info:         0/4/SP (MSC-DRP-SP) (Admin Resource)

Info:         0/6/SP (MSC-DRP-SP) (Admin Resource)

Info:         0/7/SP (MSC-DRP-SP) (Admin Resource)

Info:         0/0/CPU0 (LC) (SDR: Owner)

Info:         0/4/CPU0 (LC) (SDR: Owner)

Info:         0/6/CPU0 (LC) (SDR: Owner)

Info:         0/7/CPU0 (LC) (SDR: Owner)

Info:         0/RP0/CPU0 (HRP) (SDR: Owner)

Info:         0/SM0/SP (140G-Fabric-SP-B) (Admin Resource)

Info:         0/SM1/SP (140G-Fabric-SP-B) (Admin Resource)

Info:         0/SM2/SP (140G-Fabric-SP-B) (Admin Resource)

Info:         0/SM3/SP (140G-Fabric-SP-B) (Admin Resource)

Proceed with this install operation (y/n)? [y]

Info:     Install Method: Parallel Reload

 

….

 

Info:     The changes made to software configurations will not be persistent

Info:     across system reloads. Use the command '(admin) install commit' to

Info:     make changes persistent.

Info:     Please verify that the system is consistent following the software

Info:     change using the following commands:

Info:         show system verify

Info:         install verify packages

Install operation 53 completed successfully at 00:00:00
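The dry-run-then-activate sequence above can be scripted in the same way as the add step. The sketch below builds the activation line with or without the trailing "test" keyword, and checks a captured log for the "completed successfully" line shown above. Both helper names are hypothetical, and the package list in the example is shortened for brevity.

```python
import re


def build_install_activate(packages, test=False):
    """Build 'admin install activate ... sync', optionally with 'test' appended."""
    words = ["admin", "install", "activate"] + list(packages) + ["sync"]
    if test:
        words.append("test")
    return " ".join(words)


def completed_operation(log_text):
    """Return the install operation number if the log reports success, else None."""
    match = re.search(r"Install operation (\d+) completed successfully", log_text)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    # Shortened package list; the real command in this post lists nine packages.
    packages = ["disk0:hfr-mini-px-4.2.4", "disk0:hfr-mpls-px-4.2.4"]
    print(build_install_activate(packages, test=True))  # dry run first
    print(build_install_activate(packages))             # then the real activation
    print(completed_operation("Install operation 53 completed successfully at 00:00:00"))
```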



The router reloads itself once the installation completes and boots the new image.

Run the important checks on the new system:

RP/0/RP0/CPU0:#show version

RP/0/RP0/CPU0:#show configuration failed startup

RP/0/RP0/CPU0:#admin show configuration failed startup

RP/0/RP0/CPU0:#show platform




If everything is fine, apply the final commit for the new image:

RP/0/RP0/CPU0:# admin install commit

Install operation 54 '(admin) install commit' started by user ‘root’ via CLI at

| 100% complete: The operation can no longer be aborted (ctrl-c for options)RP/0/RP0/CPU0: instdir[251]: %INSTALL-INSTMGR-4-ACTIVE_SOFTWARE_COMMITTED_INFO : The currently active software is now the same as the committed software.

Install operation 54 completed successfully at 00:00:00




Activating SMUs is exactly the same process as activating PIEs: just create a new text script and paste it into the box. If any of the SMUs requires a reload, the box will be rebooted at the end of the activation. To finalize, type “admin install commit” one more time for the SMUs.

Downtime during each reload is about 20-30 minutes. The “add” operation takes about 1 hour, and activation with a reboot takes another 30 minutes. The timings for SMU add and activation are almost the same.

Total outage expected is 4+ hours!
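To see where the time goes, here is a rough budget that only adds up the durations quoted above. It assumes the SMU steps take exactly as long as the PIE steps (the text says "almost the same") and treats the pre-upgrade reload as optional; the step names are labels chosen for this sketch.

```python
# Durations in minutes, taken from the text above.
# Assumption: SMU add/activation take the same time as the PIE steps.
STEPS = [
    ("PIE add", 60),
    ("PIE activate + reload", 30),
    ("SMU add", 60),
    ("SMU activate + reload", 30),
]

# Optional reload of the whole box before the upgrade: 20-30 minutes.
PRE_RELOAD_RANGE = (20, 30)


def total_minutes(steps):
    """Sum the minutes of all listed steps."""
    return sum(minutes for _, minutes in steps)


if __name__ == "__main__":
    base = total_minutes(STEPS)
    low, high = (base + extra for extra in PRE_RELOAD_RANGE)
    print("install steps only:", base, "min")
    print("with pre-upgrade reload:", low, "-", high, "min")
```

The itemized steps come to about 3 to 3.5 hours. The pre- and post-upgrade checks and the commits are not timed in this post, so they account for the rest of the 4+ hour window.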

TIP: Before starting the upgrade, physically remove RP1 from the router (OIR). This gives you a fast rollback path to the old XR version stored on RP1. Once RP0 is fully upgraded to the new XR and the system is back in service, insert RP1 back into the box and it will upgrade itself to the new image automatically: first it downloads the required files from RP0 via TFTP, then it adds and activates them. This takes an additional 4 hours for RP1, but the box can stay in production during that time and work without any interruption.

PS: Check that the licenses are in place after the upgrade: “admin show license”.



Small comparison table of upgrade levels:





Vendor   Model  Upgrade files  Total file size  Outage time  Difficulty
-------  -----  -------------  ---------------  -----------  -----------
Cisco    C7600  2              200MB            15-20 min    Easy
Juniper  MX960  1              420MB            2 min        Moderate
Cisco    CRS-1  10+            1GB+             4+ hours     Challenging