So for folks wondering about the Garmin outage, here’s my best guess at what’s going on, based on two decades of enterprise IT experience. It’s pure speculation, but I’d put money on some of these predictions being right.
The report is that they got hit with ransomware. That most likely means someone got socially engineered and downloaded something to their computer that they shouldn’t have. If much of Garmin’s staff is working remotely, this could have been someone using a personal computer to do work, when they should have been given a dedicated, hardened corporate device. Many organizations were NOT ready for a massive shift to remote working and got caught with their pants down, and Garmin may have been lax with security in the interest of getting work done.
Lesson #1: Disaster recovery is not just a side hustle for your infrastructure manager. You need a top to bottom plan on how you’ll run your business when people can’t be in the building.
The latest scuttlebutt is that it also took down their phone system and email. If they’re running an internal email system, that means this thing was probably running rampant for far longer than it should have. It means it got into nooks and crannies of their storage systems that it never should have. The fact that it took down their phones means it probably got into some incredibly critical network systems as well. This is a huge breach of security, and it suggests the initial infection may have been someone with very high-level access to systems.
Lesson #2: Don’t let people log in to secure systems with day-to-day user accounts. Force people to use specialized, highly secured, network accounts to get access to sensitive systems. Yes, it’s a bit of a pain to deal with multiple layers of access, but it can prevent things like ransomware from spreading.
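To make the idea of tiered accounts concrete, here’s a minimal sketch in Python. Everything in it is an illustrative assumption, not anything Garmin actually runs: the `TIER0_SYSTEMS` set and the `adm-` account-name prefix are hypothetical stand-ins for whatever naming convention and system inventory an organization actually uses.

```python
# Hypothetical sketch of a tiered-access rule: sensitive ("tier 0") systems
# only accept logins from dedicated admin accounts, never day-to-day users.
# Names below (TIER0_SYSTEMS, the "adm-" prefix) are illustrative assumptions.

TIER0_SYSTEMS = {"backup-server", "domain-controller", "san-mgmt"}

def login_allowed(username: str, hostname: str) -> bool:
    """Allow tier-0 logins only from dedicated 'adm-' admin accounts."""
    if hostname in TIER0_SYSTEMS:
        return username.startswith("adm-")
    return True  # ordinary systems still accept ordinary accounts

# A day-to-day account is blocked from the backup server...
assert not login_allowed("jsmith", "backup-server")
# ...but that same person's separate admin account is allowed.
assert login_allowed("adm-jsmith", "backup-server")
# Ordinary systems are unaffected.
assert login_allowed("jsmith", "laptop-042")
```

The point is that the account that opens a phishing email should simply not be a credential that works on the backup server, so ransomware running as that user has nowhere critical to spread.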
Finally, the news is saying that they may be down until the 25th. That’s two days from now. If that’s the case, then they’re probably looking at doing some form of mass data restoration from archive backup. Dear god, I hope they’re not still depending on tape. If it takes 36 hours (which is where we’ll be by then) to restore your critical systems, your backup strategy has some serious flaws. In my organization we dealt with this type of thing often, and we would use vendor-specific snapshots to allow for rapid recovery.
Lesson #3: Your backup system needs to be able to deal with rapid recovery of massive systems. You can’t just archive stuff through Commvault and expect speedy recovery times.
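Some back-of-the-envelope math shows why streaming restores don’t scale. The numbers below are assumptions for illustration, not Garmin’s real figures: restore time from an archive grows linearly with data size, while a storage-array snapshot rollback is near-constant because it only remaps blocks.

```python
# Back-of-the-envelope restore-time math with assumed numbers (not Garmin's
# real figures): streaming data back from an archive scales linearly with size.

def restore_hours_from_archive(data_tb: float, throughput_mb_s: float) -> float:
    """Hours to stream data_tb terabytes back at a sustained MB/s throughput."""
    seconds = (data_tb * 1_000_000) / throughput_mb_s  # TB -> MB
    return seconds / 3600

# Assume ~50 TB of critical data and ~400 MB/s of aggregate restore throughput:
print(round(restore_hours_from_archive(50, 400), 1))  # roughly 34.7 hours
```

At those assumed numbers you’re already in the 36-hour range the outage implies, which is why array-level snapshots (or replicas you can fail over to) need to be part of the plan for anything you can’t afford to have down for days.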
So that’s my quick and dirty assessment. This is ugly, and there’s probably a lot more to the story that will come out in the next few days. There are a lot more lessons here than the three I mentioned, and I’m sure Garmin will be spending a lot of time improving their security after this event.
Unless this was some kind of state-sponsored, targeted attack, there’s a lot that Garmin could have done to prevent this. Let this be a lesson for other companies. Think ahead and don’t brush off the recommendations of your cybersecurity and infrastructure people. We know what we’re talking about.