
Easy Start

Today I learned not to rely on a third party backup vendor or support.

I remember a fella once reminding me that the chance a hard drive will fail is exactly 100% and that it's just a matter of time. I have yet to have a hard drive failure I wasn't anticipating, but I did just spend the last week recovering from a failed backup. It also reminds me of an ongoing joke I have about my pressure washer: it has a little tag on it that says "Easy Start" and it's literally the only machine I own that I have to fight to start with starter fluid.

It started with the discovery of an out-of-place file in a website install while doing some routine maintenance. No biggie there, because I've literally fixed hundreds of hacked websites. And let me point out that those were on referral, not sites I actively manage, because I batten down the hatches. The file seemed to be the typical base64¹, hex, and octal encoded spam nonsense until I started reverse engineering it and discovered more files.
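For the curious, the hunt basically looks like this: walk the web root, flag anything holding a long base64-looking blob, and try to decode it. Here's a minimal sketch, assuming a stock /var/www/html install and a 200-character threshold, both of which are just illustrative, not what I actually ran:

```python
# Walk a web root and flag PHP files containing long base64-looking blobs,
# then print a short decoded preview of each hit.
import base64
import re
from pathlib import Path

WEB_ROOT = Path("/var/www/html")            # assumed install location
BLOB = re.compile(r"[A-Za-z0-9+/=]{200,}")  # long runs that look like base64

for php_file in WEB_ROOT.rglob("*.php"):
    text = php_file.read_text(errors="ignore")
    for match in BLOB.finditer(text):
        try:
            decoded = base64.b64decode(match.group(0), validate=True)
        except Exception:
            continue  # not actually valid base64, move on
        print(f"{php_file}: suspicious blob decodes to {decoded[:80]!r}")
```

Hex- and octal-escaped payloads hide the same way, just with different character classes, and the decoded preview is usually enough to tell spam injection from something nastier.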

At first I'm thinking it's just some traversal technique with the ability to dump files, until I start noticing that the files are referencing one another and a command server. I run some checks on the command server and it turns up tied to a known state-sponsored hacking organization. I find the files indicating a compromised server and discover a web shell². Then the bomb 💣 when I found that the files have elevated root permissions, indicating a rootkit³. Although I know I'm outgunned, I still like a challenge, so I continued to investigate while trying to avoid being detected. It's got all the signs of an ELF⁴ attack, so I start scanning the Linux system files and voilà: modified files.
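The scanning part is nothing fancy either. If you kept a known-good manifest of hashes from before the compromise, you can diff the system directories against it; package tools like rpm -Va or debsums do roughly the same job against vendor checksums. A rough sketch, with the manifest path and directory list made up for illustration:

```python
# Compare system binaries against a baseline manifest of SHA-256 hashes
# recorded when the server was known to be clean.
import hashlib
import json
from pathlib import Path

BASELINE = Path("/root/baseline-hashes.json")   # assumed known-good manifest
WATCHED = ["/bin", "/sbin", "/usr/bin", "/usr/sbin", "/usr/lib"]

def sha256(path: Path) -> str:
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 16), b""):
            digest.update(chunk)
    return digest.hexdigest()

baseline = json.loads(BASELINE.read_text())
for directory in WATCHED:
    for path in Path(directory).rglob("*"):
        if not path.is_file() or path.is_symlink():
            continue
        expected = baseline.get(str(path))
        if expected is None:
            print(f"NEW FILE:  {path}")
        elif expected != sha256(path):
            print(f"MODIFIED:  {path}")
```

Keep in mind a decent rootkit can lie to userland tools, so checks like this are most trustworthy when run against the disk from a rescue environment.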

I notified the host data center's support team, which turned out to be the second point of failure. I provided all the details of the hack, including the fact that the root user account had been disabled, all unused ports blocked, and the remaining open ports obscured. It takes almost an entire day for the support request to make its way up to senior support and network operations. Once they finally concede that it is indeed a Linux-level attack, they offer to re-image the disk and restore from their backup service.
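If you're wondering what that hardening amounts to, it's mundane stuff you can even audit with a script: confirm root login over SSH is off and make sure nothing is listening on a port you didn't plan for. A small sketch, where the port allowlist and the relocated SSH port are assumptions for the example:

```python
# Audit two of the basics: root SSH login disabled, and no unexpected
# listening TCP ports. Runs on the server itself.
import re
import subprocess
from pathlib import Path

ALLOWED_PORTS = {80, 443, 2222}   # assumed: web traffic plus a relocated SSH port

config = Path("/etc/ssh/sshd_config").read_text()
root_login = re.search(r"^\s*PermitRootLogin\s+(\S+)", config, re.MULTILINE)
if not root_login or root_login.group(1).lower() != "no":
    print("WARNING: root SSH login is not explicitly disabled")

# `ss -tln` lists listening TCP sockets; the fourth column is local addr:port
listing = subprocess.run(["ss", "-tln"], capture_output=True, text=True, check=True)
for line in listing.stdout.splitlines()[1:]:
    fields = line.split()
    if len(fields) < 5:
        continue
    port = int(fields[3].rsplit(":", 1)[-1])
    if port not in ALLOWED_PORTS:
        print(f"WARNING: unexpected listening port {port}")
```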

I'm not going to point fingers here, but I will say this is a major vendor and my client is a commonly targeted organization. We've got a high-level support plan in place. At this point, I've only got a small bit offline and I'm monitoring the server, keeping it rolling. I schedule the re-image for off hours and send out notifications. The time comes and goes, it's getting late, and I go to bed with the confidence that everything will be restored by morning. WRONG. I wake up early to text messages about the system being down, log into the support portal, and there are zero messages, which is even more worrisome than a bad explanation.

The first support message comes in with a rather nondescript 'there was a restore failure and we're working on it'. Not good. I'm sure they're used to clients pointing out how important it is that their services stay online, so I try to kindly reassure the rest of the folks. This repeats for hours, and finally, after one rather laughable service request response, I know I've gotta move on to the emergency plan. Problem is, I don't have one. Never did, because I had trust in both the provider and the third party backup vendor.

I keep copies of everything I build or maintain on my local machines, but they're not always up to date because I rely on automated backups for those. Good news though: these particular local copies were only three months old and I could migrate them elsewhere. I set up a new server elsewhere and put on a fresh pot of coffee. By the next morning I had them manually migrated, and we squeaked in at just under 24 hours of downtime. I'm too old to be running on just a few hours of sleep, and I caffeine-hummed through the next 24 hours making fixes from the manual migration. I'm still working on reconfiguring the server.

Lessons learned: don't put your entire trust in a third party vendor for security, support, or backups. Double up to reduce the odds. I've joked with the better half on occasion that I'm a file hoarder. I've got backups of the backups of my backups. Disk space is so cheap now that it makes sense. I'm just gonna start doing that for production servers too: automate duplicating the entire disks offsite so that I'm never dependent on a provider as a single point of failure. It turns out the odds that a vendor will fail you are likely 100% as well; it's just a matter of time. And if the backup service is advertised as "Recover Faster", it's gonna take you a week and you might even lose your data.
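At the file level, the "backups of the backups" idea is nothing more than a dated push to a second provider that has nothing to do with the first; full disk images just swap rsync for dd or the host's snapshot tooling. A minimal sketch, with the source paths and the offsite host obviously being placeholders:

```python
# Push a dated copy of important paths to a second, independent provider.
import datetime
import subprocess

SOURCES = ["/var/www", "/etc", "/root/db-dumps"]     # assumed paths worth keeping
OFFSITE = "backup@offsite.example.com:/backups"      # assumed second vendor

stamp = datetime.date.today().isoformat()            # one folder per day
for src in SOURCES:
    # --relative recreates the full path under the dated folder;
    # -a preserves permissions and ownership, -z compresses over the wire
    subprocess.run(
        ["rsync", "-az", "--relative", src, f"{OFFSITE}/{stamp}/"],
        check=True,
    )
```

Drop something like that in cron on the server, and again on the machine holding the local copies, and a single vendor failing stops being a single point of failure.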




Footnotes

  1. Base64 - https://en.wikipedia.org/wiki/Base64

  2. Web shell - https://en.wikipedia.org/wiki/Web_shell

  3. Rootkit - https://en.wikipedia.org/wiki/Rootkit

  4. Linux ELF Malware: The New Front in the Battle for Cloud Security - https://linuxsecurity.com/features/linux-elf-malware-cloud-security