Our team recently upgraded our Virtual Connect modules to ensure compatibility with the new G8 blades. We have 29 C7000 enclosures in our primary data center and all of them are slated to get the new VC 3.60 firmware that was released earlier this month. We decided to start with six enclosures, since we had downtime scheduled for a Saturday afternoon. We initially kicked off the Onboard Administrator and iLO firmware upgrades with HP SUM, which is packaged in the latest SPP release. For some reason, HP SUM would not authenticate with our VC modules, so we were forced to use the Virtual Connect Support Utility (VCSU). We’ve used the VCSU in the past and had good results, so we weren’t too discouraged.
We began the VC upgrades after the OA and iLO upgrades were finished. We noticed within VCSU that the status stalled at around 40% for a very long time but eventually completed. We moved on and completed all six enclosures. The OAs reported the VC firmware as v3.60, just as we expected. Some of our staff then logged into a couple of our clustered nodes that reside in the upgraded enclosures and noticed there was zero network connectivity on half of the network teams. Further investigation showed that, once we logged into the VCM, the firmware had not actually been upgraded to v3.60, regardless of what the OA had reported. As we dug deeper, we found that some of the VCMs were unavailable and inaccessible.
We began troubleshooting to bring them back online. Reseating the modules did not help. Someone then tried simply powering the modules off and back on. This allowed us to regain access to the VCM. The VCM then correctly reported firmware version 3.60, and network connectivity was restored. A call to HP Support provided less than desired results. We were even told to just sit and wait for one of the modules to come back by itself. After many hours of battling, all of the modules were restored and reported firmware v3.60.
Later we contacted our HP solutions architect, and he was able to bring a VC engineer onto the call. We described our situation and indicated we were upgrading from versions 3.15 and 3.18 on VC 1/10Gb Ethernet modules and Flex-10 10Gb modules. The issue we saw was the same across the two versions we had and the two types of modules we upgraded. The HP VC engineer confirmed they have had reports from other customers with this issue. He confirmed that when upgrading to v3.60 from a version older than v3.51, the modules hang during the reboot phase while attempting to confirm the configuration with the other module. This is what caused the excessive hang at 40% that we experienced. Powering the module off and on clears the hang and restores its functionality. He also said that a customer advisory about this known issue would be published soon. In the meantime, he gave us three workarounds to prevent this issue from occurring:
- Upgrade from pre-v3.51 firmware to v3.51 and then perform an additional upgrade from v3.51 to v3.60 using SPP or VCSU
- Upgrade from pre-v3.51 firmware directly to v3.60 and perform a “Reset” of the modules using SPP or VCSU
- Upgrade from pre-v3.51 firmware directly to v3.60 using VCSU and order the upgrade to occur in parallel instead of the default staggered approach
The third workaround can only be performed using VCSU, as the SPP (HP SUM) approach does not allow the upgrade to occur in parallel. Be aware that a parallel upgrade will cause an outage of approximately 45 seconds as the modules reboot themselves.
If you are currently running a VC firmware older than v3.51 and plan to upgrade to v3.60, be aware of the issue above. Plan to use one of the three workarounds to reduce the issues you may run into.
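The decision described above comes down to a version comparison, which could be sketched in a few lines of Python when auditing a large number of enclosures. This is only an illustrative sketch: the function names `parse_version`, `needs_workaround`, and `plan_upgrade` are hypothetical, the version strings are assumed to look like `3.18` or `v3.60`, and the script does not call VCSU or HP SUM. It models the first workaround (stepping through v3.51).

```python
# Hypothetical helper for planning a VC firmware upgrade path.
# Assumes versions are written as "major.minor", e.g. "3.18" or "v3.60".

def parse_version(text):
    """Turn a string such as 'v3.18' into a comparable tuple like (3, 18)."""
    return tuple(int(part) for part in text.strip().lstrip("vV").split("."))


# Per the HP engineer: upgrades to v3.60 from anything older than v3.51 can hang.
FIRST_SAFE_SOURCE = "3.51"
AFFECTED_TARGET = "3.60"


def needs_workaround(current, target=AFFECTED_TARGET):
    """True when a direct upgrade from `current` to `target` hits the known issue."""
    return (
        parse_version(current) < parse_version(FIRST_SAFE_SOURCE)
        and parse_version(target) == parse_version(AFFECTED_TARGET)
    )


def plan_upgrade(current, target=AFFECTED_TARGET):
    """Return the firmware versions to install, in order (first workaround)."""
    if parse_version(current) >= parse_version(target):
        return []  # already at or beyond the target, nothing to do
    if needs_workaround(current, target):
        return [FIRST_SAFE_SOURCE, target]  # step through v3.51 first
    return [target]


if __name__ == "__main__":
    for version in ("3.15", "3.18", "3.51"):
        print(version, "->", plan_upgrade(version))
```

Stepping through v3.51 is used here simply because it is the easiest workaround to express as a list of versions; the reset and parallel-activation workarounds would instead change how the single upgrade to v3.60 is run.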
Update! – HP’s Advisory about this issue was released on June 29th, 2012 and can be accessed through this link.