For this article, I had the wonderful honor of co-authoring a piece with my colleague and new friend, Alexey Glazachev, a Senior Reliability Engineer who has spent his career working in the Russian Space Program. Here he shares some stories from the beginning of his career and lessons he has learned along the way. I hope you enjoy reading his story as much as I enjoyed editing it.
***
“Reliability is a pseudoscience, akin to astrology!” With these words, I was greeted by the Operations Manager for the Soyuz ILV Complex in Moscow when I asked him for his last 5 years of missile failure data. This was my first day on the job as a Reliability Engineer.
Although I was early in my career, this wasn’t my first experience with reliability or Integrated Launch Vehicles (ILV) – the high-velocity, expendable rockets designed for placing satellites into low Earth orbit. In fact, it seems that reliability and rockets were always part of my life.
I still remember my favorite toys as a toddler: plastic rockets and airplanes. I used to dream of going to the Baikonur Cosmodrome in southern Kazakhstan (the world’s first and largest operational space launch facility). Math and science always came easily to me. And when I needed to decide what I would do “when I grew up”, there was only one answer: I would work in Russia’s space program.
I graduated from the Moscow Aviation Institute (MAI) with my Master’s degree in Aeronautical and Astronautical Engineering where I studied rocket and space system operation. In my fourth year of studies at MAI, I took an elective course on reliability engineering. I remember a lot of math and graphs in that class, but how I might apply these concepts in my career was entirely unclear to me.
While still at university, I interned at the Moscow Design Bureau of Transport Machinery (which is now part of TsENKI-Roskosmos). DBTM is well known throughout Russia’s space program for its design and manufacturing of launch support equipment for the Sea Launch Space Complex. And once I graduated – to my great delight – Sea Launch offered me a position with the team of engineers working in Long Beach, California, assembling the rocket stages and docking the telecommunication satellites to the rockets in their Zenit-3SL ILV program.
The Zenit-3SL ILV is a rocket designed to deliver satellites into near-earth orbits. In California, the ILV and its payload were fully assembled, and then transferred by crane to the Odyssey, a semi-submersible, mobile launch platform. Once the transfer was complete, the Odyssey sailed to a site in the Pacific Ocean near the equator, where the launches took place. For the safety of the crew, everyone evacuated the platform and the ILV launch sequence was conducted remotely.
Transfer of ILV Zenit-3SL to launch platform. © Alexey Glazachev 2020
Those were amazing times! I had the privilege of working on 4 of the 36 Zenit-3SL launch missions. And it was on these missions where I first began to see the value of reliability engineering.
On the day prior to transferring the assembled ILV to the launch platform, the full team of engineers simulated the entire transfer sequence. And at each simulation, the cranes worked smoothly. But the next day, when working with a real rocket, the cranes failed, leaving the rocket suspended mid-air for very costly durations.
Following my fourth ILV launch mission, Sea Launch offered me the Reliability Engineering position back in Moscow, which I gratefully accepted. This is where I met the skeptical and cantankerous operations manager. Although I knew little about reliability, this was a welcome change for me – a chance to return home and dive into an entirely new discipline.
The whole field of reliability is based on probability theory. The reliability of any system – from kitchen toasters to rocket ships – is described by a random, probabilistic value ranging from 0 to 1. But because most engineers deal with specific, non-random numbers like pressure, mass and load, applications of probability remain difficult to grasp.
Nassim Taleb, bestselling author of “Fooled by Randomness”, contends that our brain is poorly adapted to work with chance and probability. To demonstrate his point, compare these two statements:
- Humans have 10 fingers.
- Humans have 10 fingers with a probability of 0.999.
Which one is clearer for you? Most people are more comfortable with statement #1, even though statement #2 is more accurate.
I soon learned that this difficulty with probability also extended to many engineers and managers within the Russian aerospace industry. For many of them, reliability is a senseless “game of nines” where they see little difference or meaning between .99 and .999.
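To make the difference between .99 and .999 more tangible, here is a minimal Python sketch. It assumes every mission succeeds or fails independently with the same reliability, which is a simplification. The function name `prob_at_least_one_failure` and the choice of 36 missions (borrowed from the number of Zenit-3SL launches mentioned earlier) are illustrative assumptions only, and the two reliability values are not measured data from any real program.

```python
def prob_at_least_one_failure(reliability: float, missions: int) -> float:
    """Chance of one or more failures across a series of missions.

    Assumes each mission is independent and has the same reliability
    (a simplifying assumption made for this illustration).
    """
    return 1.0 - reliability ** missions


if __name__ == "__main__":
    MISSIONS = 36  # illustrative campaign length, not a reliability dataset
    for r in (0.99, 0.999):
        p = prob_at_least_one_failure(r, MISSIONS)
        print(f"R = {r}: P(at least one failure in {MISSIONS} missions) = {p:.3f}")
```

Under these assumptions, the chance of at least one failure over the campaign is roughly 30% at .99 and about 3.5% at .999. One extra nine is far from meaningless.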
It turns out that reliability engineering not only saves money, but more importantly, it saves lives. Arguably, it’s the most important aspect of technology. No matter how fast and beautiful a plane is, you’re not stepping on board unless it’s reliable.
Intuitively, we all have basic ideas about reliability. We know that two kidneys are better than one. So it makes sense to us that two computers on a spaceship are better than one, and three are better than two. But engineers don’t use words like “better” or “worse”. We use the language of numbers. Three onboard computers are three times heavier, consume three times the electricity, and cost three times as much as one onboard computer.
The question for the engineering team is then, what is the optimal compromise between high reliability and low weight, energy consumption and cost? This is a tough puzzle to solve! Trying to answer that question is what led me to deepen my knowledge of reliability and quality engineering.
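The trade-off can be sketched with a few lines of Python. This is only an illustration under stated assumptions: the computers fail independently, the system works as long as at least one computer works, and mass grows in direct proportion to the number of units. Real spacecraft often use voting schemes and suffer common-cause failures, neither of which is modeled here. The unit reliability of 0.95, the unit mass of 10 kg, and the function name `redundant_reliability` are hypothetical values chosen for the example.

```python
def redundant_reliability(unit_reliability: float, units: int) -> float:
    """Reliability of `units` identical components in parallel.

    Assumes independent failures and that one working unit is enough
    (simple active redundancy, no voting logic).
    """
    return 1.0 - (1.0 - unit_reliability) ** units


if __name__ == "__main__":
    UNIT_RELIABILITY = 0.95  # hypothetical value for illustration
    UNIT_MASS_KG = 10.0      # hypothetical value for illustration

    for n in (1, 2, 3):
        r = redundant_reliability(UNIT_RELIABILITY, n)
        print(f"{n} computer(s): reliability = {r:.6f}, mass = {n * UNIT_MASS_KG:.0f} kg")
```

In this simplified model, each added computer buys a smaller gain in reliability than the one before it, while mass, power and cost keep rising by the same amount each time. That diminishing return is the heart of the puzzle.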
Today I manage a reliability engineering team for a company in Moscow, examining the safety, security and regulatory issues of unmanned aerial vehicles. And I love every minute of it! I also teach young (and sometimes, not so young) engineers about the great value of quality and reliability engineering, and other lessons I’ve learned in the Russian space program.
***
For a greater understanding of the field of reliability engineering and to learn its fundamental concepts, sign up for my online class titled “An Introduction to Reliability Engineering”.
Authors’ Biographies:
Alexey Glazachev is a Lead Reliability Engineer for Kronstadt, a company specializing in the development and production of high-tech UAVs. He can be reached through LinkedIn at www.linkedin.com/in/alexeyglazachev, or through his consulting website, https://areliability.com/.
Ray Harkins is the Quality and Technical Manager for Ohio Star Forge in Warren, Ohio. He can be reached through LinkedIn at www.linkedin.com/in/ray-harkins, or through his teaching website https://themanufacturingacademy.com