episode 158
CrowdStrike as the Exception which Proves the Rule
Today we are coming at you a day early for a timely bonus episode with Rob Collie diving deep into the digital disruption that shook the world: The Great CrowdStrike Hiccup of 2024. Picture this: a day when the antivirus behemoth stumbled, and the digital realm felt the shockwaves, akin to a #VALUE! error in your perfectly tuned DAX formula—but on a colossal, global scale.
Before you start placing the blame on IT, let’s take a step back to see the broader view. This incident with CrowdStrike, while jarring, underscores the exceptional resilience typically woven into our digital fabric. Consider it the outlier in your data set that highlights how well everything else aligns. Every day, countless professionals from coders to IT support keep our digital infrastructure running as smoothly as a finely tuned Power Query.
The key lesson from this event? Our digital networks are sturdier than they appear, much like a robust pivot table equipped with slicers. Despite the hiccup, the swift response from tech teams worldwide to rectify the issue was nothing short of remarkable. It’s a nod to the everyday heroes in tech who ensure that our systems run without a hitch. So, the next time you see that meme mocking “the other side,” remember, we’re all part of the same intricate data model of life.
And, don’t worry, we are still on track for a full-length release tomorrow. Be sure to tune in then when we welcome special guest Madison Brooks to share some behind-the-scenes insight on Power BI, Power Apps, and the manufacturing industry!
Episode Transcript
Rob (00:00): Hello, friends. I'm recording this bonus episode early on Friday afternoon while the CrowdStrike outage still rages around us. This is clearly going to be one of those "we were all there for it" moments. The vast majority of modern civilization is impacted by this to some degree, and it all happened basically in the same instant. How often do we experience things like that? Thankfully, not often. COVID-19 was certainly a bigger deal by orders of magnitude, but that wave didn't hit in a single day. Its lights-out moment played out over a period of weeks. On one hand, China was literally welding people's apartment doors shut from the outside while it was still business as usual in the USA. The more I think about it, I'm not sure we've ever had an event that impacted this many people worldwide all on the same day. Now, it might surprise you to hear that the lesson I want us to take away from today is a positive one, and not even positive in the "how we can avoid this sort of thing in the future" sense.
(01:06): Other people will take care of that. I mean legitimate feel-good, positive, and in a way that might subtly change our behaviors and thinking. So let's call it a secular non-denominational sermon informed by my career in data and software for sure, but very much in the human plane. First, let's get the obvious negatives out of the way. Let's be clear, people died today. People who would've ended today alive and relatively healthy if this hadn't happened, will have suffered a series of unfortunate events as a result of this with tragic consequences. It will be the result of a complicated and subtle set of interactions that don't make for compelling clickbait news, so we'll probably never hear their stories, but just the core mundane disruptions of the system at this scale will have tremendous follow-on consequences. And okay, let's get gross and talk about the revenues lost today.
(02:04): Folks, the Starbucks mobile ordering app was down for intermittent windows today, as were the terminals at many of the stores, and that first-world-problems, admittedly silly example bookends the spectrum, doesn't it? People dying on one end and we can't order our bougie lattes on the other. Basically, the rest of modern life sits between those markers, doesn't it? Oh, my gosh. It's like I wish we could put a record scratch sound effect in here because I had to stop recording there for a moment because my wife called. This is the sort of thing I couldn't even make up. My wife is at the vet and she called me because she needed help paying them because guess what? The vet's credit card system is down. It's just the gift that keeps giving. Again, small survivable problems, right? But it's just everywhere, so pervasive. And certainly our clients were definitely impacted today. There were failed scheduled refreshes, unresponsive virtual machines.
(03:05): One of our enterprise clients had literally every single desktop impacted. Imagine what that looks like, and oh my gosh, there are some people at CrowdStrike today who are living an absolute hell. In the end, there's going to be one to two engineers most directly responsible for this, and they know it. They already know it. And then there are their immediate supervisors who in hindsight are going to be judged to have not been sufficiently vigilant. Those people are going to need therapy and support. Like we discussed in the Forest Brazil episode, software engineers are, on average, some of the best people you'll meet. Their intentions, their benevolence, their desire to help, their relative selflessness, there aren't any Bond villains behind this. Hopefully it's relatively senior engineers there with great stock options and financial independence to cushion the blow so that maybe we don't have to feel too bad for them, but I mean, it is a very, very life-altering event for these people.
(04:06): But the negative side of today is not the thing I want to focus on. In fact, I want you to think about how often things like this don't happen. Did you ever expect to wake up today and learn that an antivirus software vendor was going to negatively impact the whole world? Nope. It is a random thing out of left field. And even if we just focus on that one vector, the one company, CrowdStrike, pushing a bad update around the world all at once, guess how many such updates we've survived to date without incident? Hundreds, if not thousands. And again, that's just one company who we'd never realized until today held our fates in its hands. There are many, many more organizations in similar positions of leverage, and at each company, each one of those organizations, there are in turn many people in critical roles.
(05:04): Every single day, there are literally millions of people in roles who can make the lights go out if they do the wrong things. And folks, the lights stay on basically all the time, and that's the core message I want us all to think about today. There are millions of ways every day that things can break for the entire world, and they largely don't. They don't because other people are doing their jobs, they're being responsible, and oftentimes doing so at great cost to themselves. It would be easier to be careless, to be detached, to not take their jobs seriously, to let their guards down, and they don't because otherwise days like today would be commonplace. On New Year's Eve, December 31st, 1999, and into the early morning hours of January 1st, 2000, I was at home visiting my family in Florida and I was sitting on my mom's back porch with some childhood friends telling them how I expected the next day to be chaotic.
(06:08): I'd seen enough of the software industry by that point, 3 years at Microsoft, three plus years, to know just how fragile things are and just how many things need to go right for software to work. And I just couldn't imagine at that moment, human society being collectively vigilant enough to have found all the potential lights-out problems when the calendar rolled over and the transition from two-digit to four-digit year storage got its first real-world test on a global scale. Now, young me was a little bit disappointed, to be honest, when things went well on January 1st. I was a bit too invested in being right, I think. But let's chalk that up to age-appropriate behavior for a 25-year-old. But hey, we can time travel back now and tell them, "Yeah, don't worry. 25 years from now, an antivirus company is going to show us all what things could have looked like."
(06:56): But I believed that chaos was a real possibility on January 1st, 2000 for good reason. Software is a hard problem. Bugs can be incredibly insidious, and they can fly under the radar for a long time until they pounce with spectacular consequences. Oftentimes, when you later on do the forensics on what caused them, the postmortem, you realize that no single person was even responsible. Heck, we found a bug one time in Microsoft Word when I was at Microsoft that was not really a bug in Word. It was exposing a bug, a flaw in an Intel processor that had never been discovered before. This problem was discovered for the first time ever when a particular sequence of computer instructions just happened to line up just right in Word's code. So every day, your personal and professional life is in the hands of millions of other people every single day, and things go well, so much so that we notice big time when they don't.
(07:57): Now, let's take that beyond the software world for a moment. You go to the grocery store and there's food there. You go to the pharmacy and they have your medicine. You get injured, and then the medical establishment puts you back together. The supply chains and the systems behind those everyday things we take for granted, both physical and digital, are immensely complicated and largely made of other people. So again, the lights stay on every day because basically everyone else does their job and does it well. I've been thinking about this a lot in recent years and how much bigger the overall pie gets when we're all cooperating. And when you stop to think about it all and reflect on it like I have been today, it's breathtaking. It's just miraculous how the overall system works and how this interdependency between us all raises all boats. 99% of our actions and opinions every day are aligned with this because otherwise this miracle couldn't possibly exist.
(09:02): But when we tune into the news or social media, no one is talking about that. It's all about conflict and outrage and how half of humanity, always the other half, is clearly trying to destroy us. No matter which "team" you're on in the game, that's 100% how it feels. The other half's trying to destroy us. And I don't want to diminish any of those feelings because there are plenty of ideas and opinions and behaviors which do warrant our criticism. All of this noise gets a lot clearer though for me when I drop any notion of team and instead focus on the world we live in 99% of the time. The world that I've been describing in this episode, the one where we're all pulling in the same direction with most of our energy, that's the lens I'm trying to use. What makes that world better? In the only game that truly matters, there's only ever been one team, like it or not, and we all play for it. And with that, we'll return to our regularly scheduled programming with our normal episode this week. Catch you then, and thank you always for listening.