Modern IT is nothing without connectivity, obviously. A server without access to outside networks is very secure, but mostly useless. Ensuring fast and reliable network links for our instances is a top priority for Equinix, and the vast majority of newly created Equinix Metal servers will come up just fine, with a fully functional connection to the outside.
But every now and then, there’s a corner case where a newly created instance either cannot communicate with the outside world at all or can reach only its local network segment. Recently, we decided to take a close look at the corner cases that have come up in network troubleshooting to see if we could pinpoint some common causes behind them. One of the interesting findings was that it often takes only a small configuration mistake to create a big issue, up to and including complete non-reachability. Another conclusion was that because connectivity problems have so many different causes, debugging them often takes a long time, much longer than we like to see.
We then took a closer look at the "usual suspects": the common causes, and the reasons debugging takes so long. We found that most of the time between a case being opened in our support system and the problem being solved went to information gathering. This is due to the nature of Equinix Metal: because we don’t have direct access to your instances (by design), we have to keep asking you for information specific to your instance in order to see the bigger picture. Our customer success team needs that bigger picture to think through the possible causes of the issue at hand and start investigating.
A Network Troubleshooting Guide Series
We then asked ourselves: what can we do to speed this network troubleshooting process up? What information can we give you ahead of time to empower you to find and fix issues yourself, and if that doesn't work, how can we make your support experience as quick and as smooth as possible?
Part of the answer is a series of technical guides to debugging (virtual) networking infrastructure in our environment. The first guide in the series, titled Network Path Troubleshooting With Linux, takes a deep dive into common configuration problems with Linux instances and ways to identify them. We put a lot of effort into making the guide usable by as many people as possible. You don’t need years of experience designing and maintaining networks as a Linux administrator to understand and use it.
We hope it has enough information to enable you to identify most issues on your own. But even when it doesn’t, the guide will still come in handy. By performing the debugging steps listed in the document, you’ll be able to rule out a number of issues yourself before even speaking to us, and gather a lot of useful information for our customer success team in the process. That way, they’ll be able to help you much faster.
This first guide in the series mostly deals with networking and connectivity issues when connecting from Equinix Metal instances to hosts within the Equinix Metal network or on the internet. The upcoming second guide will highlight the client side of things and explain how to get a client's perspective on connectivity issues when using either Windows or macOS.
Besides these basic network troubleshooting guides, we’re working on guides that zero in specifically on things that go wrong most often. They will contain even more detail on what to look for in the context of each specific issue. So, please stay tuned. There’s a lot more to come!