On troubleshooting networks: Traceroute

by Pedro Pérez, September 16th, 2016

I’d like to start a series of posts talking about networking concepts. Nothing extremely complex and just for fun. I thought it would be a good idea to start by talking about traceroute, as it’s one of the best-known troubleshooting tools.

Hope you enjoy the reading.

Traceroute.

Anyone who works with networks has used traceroute at some point to troubleshoot a connectivity issue, but how much do we really know about how it works and about interpreting its results?

Most of you know that traceroute “uses ICMP”, but this is a very broad statement and not always 100% accurate. Why do I say that? Well, let’s start by analysing one of the most common traceroute implementations, tracert.exe from Windows.

tracert.exe


In its simplest usage, this implementation of traceroute takes just one parameter: the destination IP address we would like to trace our route to.

What does tracert.exe do under the hood when we hit enter? It immediately sends 3 ICMP packets with Type 8 Code 0, also known as Echo Request or more commonly Ping, to the destination IP address.

These 3 ICMP packets have a particularity though: Their TTL value is 1.
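If you are curious what one of those pings looks like on the wire, here is a minimal Python sketch that builds an Echo Request (Type 8, Code 0) by hand. The function names, the identifier and the payload are my own choices for illustration, not anything taken from tracert.exe. Note that the TTL is not part of the ICMP message: it lives in the IP header, so a real tool would set it on the socket (for example through the IP_TTL socket option) before sending.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """One's-complement checksum over 16-bit words, as used by ICMP."""
    if len(data) % 2:
        data += b"\x00"  # pad to an even number of bytes
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold any carry bits back into the lower 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(identifier: int, sequence: int, payload: bytes = b"") -> bytes:
    """Return the bytes of an ICMP Echo Request (Type 8, Code 0)."""
    icmp_type, icmp_code = 8, 0
    # The checksum is computed with the checksum field set to zero.
    header = struct.pack("!BBHHH", icmp_type, icmp_code, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    header = struct.pack("!BBHHH", icmp_type, icmp_code, checksum, identifier, sequence)
    return header + payload


# Three probes for the first hop, one per sequence number (values are illustrative).
probes = [build_echo_request(identifier=0x1234, sequence=n, payload=b"probe") for n in (1, 2, 3)]
```

A handy sanity check: running the checksum over a complete, valid message (checksum field included) gives 0.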

The TTL is a field in the IP header that gets decremented by 1 every time an IP packet is moved from one subnet to another. In other words, every time a router forwards a packet, it decrements the TTL of the packet.

Expanded IP header in Wireshark

So tracert.exe sent 3 pings to the destination IP address, with a TTL value of 1. Where do these packets go first? Well, given that the destination IP address is not another host in our network, our computer running tracert.exe will look up its own routing table to decide where to send this packet. In most scenarios our computer will simply have a default route, which points to what we call a default gateway (because it is the gateway designated for the default route).

The packet reaches the default gateway, which would be a network device with routing capabilities, probably a router or a firewall. This router will look into the packet headers to decide what to do with it. Normally I would mention now that the router will check its own routing table too, but there’s one thing the router will also check apart from the destination IP address: the TTL field.

As we have mentioned before, the TTL field has a value of 1 and a router will decrement it when forwarding the packet, so in our case the router would have to forward the packet with a TTL value of 0. Is that even possible? Well, a broken IP stack might want to do it, but the short answer is no. When forwarding a packet would decrease its TTL value to 0, the router drops the packet instead of forwarding it and sends a notification back to the original sender. Who is the original sender in our case? Our computer.
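The router's decision can be sketched as a tiny simulation. This is a toy model, not real router code: the Packet class, the forward function and the TTL of 64 on the notification are assumptions I made for the example. The addresses are the ones used in this article.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    src: str
    dst: str
    ttl: int


def forward(router_ip: str, packet: Packet):
    """Toy model of what a router does with the TTL of a packet it has to forward.

    Returns ("forward", packet_with_decremented_ttl) or
    ("time_exceeded", notification_addressed_to_the_original_sender).
    """
    if packet.ttl <= 1:
        # Decrementing would leave the TTL at 0: drop the packet and notify the sender.
        # The notification is a brand new packet sourced from the router itself.
        # Its TTL of 64 is an arbitrary value chosen for this sketch.
        return "time_exceeded", Packet(src=router_ip, dst=packet.src, ttl=64)
    return "forward", Packet(src=packet.src, dst=packet.dst, ttl=packet.ttl - 1)


probe = Packet(src="192.168.1.103", dst="8.8.8.8", ttl=1)
action, result = forward("192.168.1.254", probe)
```

With a TTL of 1 the probe never leaves the default gateway, and the notification comes back from 192.168.1.254 to 192.168.1.103.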

What kind of notification will the router send back to our computer? If only we had some sort of standardized Internet Control Message Protocol!

That’s right, the router will send back an ICMP packet of Type 11 (Time Exceeded) Code 0, also known as “TTL Expired (or TTL exceeded) in transit”. This ICMP will also carry the beginning of the original dropped packet as its payload (its IP header plus at least the first 8 bytes of its data), so your computer can identify the packet that has been dropped.
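To make the "original packet as payload" part concrete, here is a hedged Python sketch that parses the ICMP portion of a TTL Expired message and recovers the probe it refers to. The function names are mine, the sample message is synthetic (built by hand, not captured from a real network), and the sketch assumes the dropped packet was an ICMP Echo Request like the ones tracert.exe sends.

```python
import socket
import struct


def parse_time_exceeded(icmp: bytes):
    """Parse an ICMP Type 11 Code 0 message (bytes starting at the ICMP header).

    Returns a dict describing the embedded probe, or None if the message
    is not a TTL Expired or is too short to contain the embedded headers.
    """
    if len(icmp) < 8 + 20 + 8 or (icmp[0], icmp[1]) != (11, 0):
        return None
    inner = icmp[8:]                      # skip type, code, checksum and 4 unused bytes
    ihl = (inner[0] & 0x0F) * 4           # length of the embedded IP header in bytes
    if len(inner) < ihl + 8:
        return None
    original_dst = socket.inet_ntoa(inner[16:20])
    # Assumption: the embedded packet is an ICMP Echo Request (type, code, checksum, id, seq).
    p_type, p_code, _, identifier, sequence = struct.unpack("!BBHHH", inner[ihl:ihl + 8])
    return {
        "original_dst": original_dst,
        "probe_type": p_type,
        "identifier": identifier,
        "sequence": sequence,
    }


def sample_time_exceeded() -> bytes:
    """Hand-built example: a ping from 192.168.1.103 to 8.8.8.8 that expired in transit."""
    outer = struct.pack("!BBHI", 11, 0, 0, 0)  # checksums left at 0 for brevity
    inner_ip = struct.pack(
        "!BBHHHBBH4s4s", 0x45, 0, 28, 0, 0, 1, 1, 0,
        socket.inet_aton("192.168.1.103"), socket.inet_aton("8.8.8.8"),
    )
    inner_icmp = struct.pack("!BBHHH", 8, 0, 0, 0x1234, 7)
    return outer + inner_ip + inner_icmp


info = parse_time_exceeded(sample_time_exceeded())
```

The identifier and sequence number are what let the sender match a TTL Expired message to the exact probe it sent.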

Well, that’s all fine, but how does tracert.exe take advantage of this? Easy! ICMP packets are of course IP packets, so they have IP headers, right? What useful information could tracert.exe use from the IP headers? The source IP address. As the TTL Expired packet is a new packet originated on the router, the source IP address is the IP address of the router, so tracert.exe will read that IP address and identify who is the first hop:

As you can see in the screenshot above, our computer 192.168.1.103 sends 3 pings to 8.8.8.8 (with TTL==1, which can’t be seen in this screenshot, but you can see it in previous screenshots in this article) and gets 3 TTL Expired packets back from 192.168.1.254. That’s how tracert.exe identified the first hop and how it measured the latency between us and that hop (the time difference between sending the ping and receiving the TTL Expired packet). tracert.exe then repeats the process with TTL 2, 3 and so on, until the TTL is big enough to reach the destination without expiring in transit, in which case we will get a reply to our ping (Echo Reply ICMP).
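The whole loop fits in a few lines. The sketch below runs against a simulated path instead of a real network, so it needs no special privileges. The trace and make_simulated_probe names are mine, and 10.0.0.1 is a made-up intermediate hop. A real tool would also time each probe to report the latency.

```python
def make_simulated_probe(path):
    """Return a probe(ttl) function for a fake path: routers first, destination last."""
    def probe(ttl):
        if ttl < len(path):
            # The probe expires at the router sitting ttl hops away.
            return path[ttl - 1], "time_exceeded"
        # The TTL is big enough: the destination answers the ping.
        return path[-1], "echo_reply"
    return probe


def trace(probe, max_hops=30, probes_per_hop=3):
    """Send probes_per_hop probes per TTL, starting at 1, until the destination replies."""
    hops = []
    for ttl in range(1, max_hops + 1):
        replies = [probe(ttl) for _ in range(probes_per_hop)]
        hops.append((ttl, replies))
        if any(kind == "echo_reply" for _, kind in replies):
            break  # we reached the destination, no need to go further
    return hops


# 192.168.1.254 and 8.8.8.8 come from the article; 10.0.0.1 is hypothetical.
hops = trace(make_simulated_probe(["192.168.1.254", "10.0.0.1", "8.8.8.8"]))
for ttl, replies in hops:
    print(ttl, replies[0][0], replies[0][1])
```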

So yeah, we could say that tracert.exe “Uses ICMP”, because there’s no other protocol higher than layer 3 involved.

Request timed out at hop #2. What’s going on?

Request timed out.

It’s not rare to see this message in our tracert.exe output. What does it mean? In a nutshell, it means that we never got a TTL Expired ICMP packet or an Echo Reply ICMP back for the pings we sent. Possible reasons:


- The router is configured to not send those replies for security or economy reasons.
- Either our pings got lost before arriving at the router or the ICMP replies got lost on their way back to us.

In the above screenshot, I’m inclined to think that the router at hop #2 is configured to not send those ICMP packets back, just because the rest of the hops seem to be in good health and reporting back, so I would rule out heavy packet loss in the path (see? We have started getting something useful out of tracert.exe’s results).

Fun fact: After hop #9 we have crossed the Atlantic Ocean (+100ms).

There are other occasions where the Request timed out messages appear after a certain hop and we never get a new hop IP address. In those cases, like in the screenshot above just after hop #14, there is a device dropping the ICMP packets. On some occasions you’ll also be missing a bunch of results until the very last one: in that case, the intermediate device is dropping ICMPs that are not pings (echo request and echo reply).

In this case and with only this output, we can’t say for sure how many hops there are to the destination, and tracert.exe would keep going until hop #30 (by default).
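The two situations can be told apart mechanically. Here is a small sketch of that reasoning, assuming the trace has been reduced to one entry per hop: the responder's IP, or None when every probe for that hop timed out. The function name and the sample addresses (taken from the documentation ranges) are hypothetical.

```python
def classify_timeouts(hops):
    """Map each timed-out hop number to a rough interpretation.

    hops: list with one entry per hop, either an IP string or None for
    "Request timed out".
    """
    responding = [i for i, hop in enumerate(hops) if hop is not None]
    last_responding = responding[-1] if responding else -1
    notes = {}
    for i, hop in enumerate(hops):
        if hop is not None:
            continue
        if i < last_responding:
            # Later hops still answer, so the path works: this hop is just silent.
            notes[i + 1] = "silent hop (not replying, path looks fine)"
        else:
            # Nothing answers after this point: filtering, loss, or an unreachable target.
            notes[i + 1] = "no replies from here on (ICMP dropped or path broken)"
    return notes


# Hypothetical trace: hop 2 is silent, hops 5 and 6 never answer.
sample = ["192.0.2.1", None, "198.51.100.7", "203.0.113.9", None, None]
notes = classify_timeouts(sample)
```

This is only a heuristic over one run. It cannot tell a filtering device from a destination that is simply down.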

How do you go past this? You can’t. “But I’ve heard about tcptraceroute!” Yes, tcptraceroute is a bit different, but it’s not going to do the job in many scenarios. Let me explain why:

Tcptraceroute sends TCP SYN packets to the destination, starting with TTL == 1 and increasing its value every time after getting a result. Still, we have to receive TTL Expired ICMPs to be able to identify who dropped the packet, and we won’t be able to get them if a device is dropping ICMPs.

Why tcptraceroute then? Because if the final host doesn’t reply to ping, it might reply to your TCP SYN packet. Or a firewall in between might let TCP traffic inbound but not ICMP inbound (and permit anything outbound, including TTL Expired ICMPs).
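To illustrate the idea of a TCP SYN with a small TTL, here is a sketch using a plain Python socket. I’m not claiming this is how tcptraceroute is implemented: the function names are mine, and a plain TCP socket like this one cannot show you which router sent the TTL Expired ICMP (reading those typically needs a raw socket and elevated privileges). It only shows how the TTL of the outgoing SYN can be chosen through the IP_TTL socket option.

```python
import socket


def make_tcp_probe_socket(ttl: int, timeout: float = 2.0) -> socket.socket:
    """Create a TCP socket whose outgoing packets (including the SYN) carry the given TTL."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, ttl)
    sock.settimeout(timeout)
    return sock


def send_syn(destination: str, port: int, ttl: int) -> str:
    """Try to connect with a limited TTL and describe what happened."""
    sock = make_tcp_probe_socket(ttl)
    try:
        sock.connect((destination, port))
        return "connected: the destination answered our SYN"
    except socket.timeout:
        return "no answer: the SYN expired in transit or was dropped"
    except ConnectionRefusedError:
        return "refused: the destination was reached and sent a reset"
    except OSError as error:
        return f"error: {error}"
    finally:
        sock.close()
```

For example, send_syn("8.8.8.8", 443, ttl=1) would send a SYN that should expire at the first hop, so it would not be expected to report a connection.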

There’s one more detail I’d like to highlight from the screenshot above. In theory, latency should increase with each tracert.exe hop, even if the increase is less than 1ms and doesn’t show in the output. So how come it takes longer to get to hop #6 than to hop #7? ICMPs surely don’t travel back in time. With what we know now (each hop is measured independently, and the latency is how long it takes for us to get a TTL Expired ICMP back after we send our ping), having higher latency on a closer hop doesn’t sound that crazy. From the above screenshot I can’t be sure, but it’s probably one of these two:

  • ICMP Throttling: Some network administrators, especially at ISPs with busy networks, configure policies to reply to these ICMPs with a very low priority. Their routers might be very fast at forwarding your ICMPs to the next hop, but when they are the ones that have to reply, they will probably delay the response a bit.
  • Network fluctuations: A device in the path might have been more loaded when we tested hop #6 than when we tested hop #7.

The latency difference between #6 and #7 is not that big, but in some cases you might see up to 100ms of extra delay on closer hops. Don’t get fooled by this: there might be no issues at all in the path even with those weird results.
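A simple way to apply this rule of thumb: a latency spike only matters if the hops after it inherit it. The sketch below flags hops whose latency is clearly higher than that of some later hop, which suggests the router was just slow to reply. The function name, the 2ms tolerance and the sample latencies are all made up for illustration.

```python
def non_propagating_spikes(rtts_ms, tolerance_ms=2.0):
    """Return hop numbers whose latency is higher than a later hop's by more than tolerance_ms.

    rtts_ms: one latency per hop in milliseconds, or None for a timed-out hop.
    A spike that later hops do not inherit points at a slow ICMP reply from
    that router rather than at a real delay in the path.
    """
    flagged = []
    for i, rtt in enumerate(rtts_ms):
        later = [r for r in rtts_ms[i + 1:] if r is not None]
        if rtt is not None and later and rtt > min(later) + tolerance_ms:
            flagged.append(i + 1)
    return flagged


# Hypothetical latencies: hop 6 answers slowly, but hops 7 and 8 are unaffected.
sample_rtts = [1.0, 9.0, 10.0, 12.0, 11.0, 25.0, 14.0, 15.0]
spikes = non_propagating_spikes(sample_rtts)
```

If the extra latency showed up on every hop from #6 onwards instead, nothing would be flagged, and that is the pattern worth investigating.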

Fun fact: Most traceroute implementations in Linux use UDP by default.

So, again, traceroute doesn’t necessarily “Use ICMP”, but it surely relies on it :-)
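Whatever the probe looks like, the sender ends up interpreting ICMP replies. Here is a short summary of that in code. The function name is mine, and the Port Unreachable case (Type 3 Code 3) is the reply a UDP-based traceroute typically gets from the destination, which is standard ICMP behaviour rather than something covered above. A TCP probe that reaches the destination gets a TCP answer instead of an ICMP one.

```python
def interpret_icmp_reply(icmp_type: int, icmp_code: int) -> str:
    """Rough meaning of an ICMP reply received in response to a traceroute probe."""
    if (icmp_type, icmp_code) == (11, 0):
        return "intermediate hop: TTL expired in transit"
    if (icmp_type, icmp_code) == (0, 0):
        return "destination reached: Echo Reply to an ICMP probe"
    if (icmp_type, icmp_code) == (3, 3):
        return "destination reached: Port Unreachable for a UDP probe"
    return "something else: not handled in this sketch"


for reply in [(11, 0), (0, 0), (3, 3)]:
    print(reply, interpret_icmp_reply(*reply))
```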

If you think this article has been either useful or entertaining, please don’t forget to click on the tiny green heart below to recommend it to other readers. I would also be extremely grateful if you could share this with your friends and colleagues on your favourite social network. You could also take a look at Wormhole Network, a company that provides on-demand secure private networks as a service. It may be of interest to you.

Thank you!