Use the MIN_PORT to MAX_PORT port range to avoid EADDRNOTAVAIL errors caused
by sockets in FIN-WAIT-1 state. The issue is easy to reproduce with the
following loop (as root):

    src="$(ip route | awk '/default/ {print $9}')"
    while true; do
        echo "6000 send-probe ip-4 1.1.1.1 local-ip-4 $src port 443 protocol tcp" |
            ./mtr-packet
    done | head -n 10

    6000 reply ip-4 1.1.1.1 round-trip-time 11306
    6000 address-not-available
    6000 address-not-available
    [...]

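To illustrate the idea of keeping source ports inside a bounded range, here is
a minimal sketch, not the code from this change. The MIN_PORT and MAX_PORT
values and the helpers port_for_sequence() and bind_in_port_range() are
assumptions made for the example. It only covers picking a port and retrying
bind() on a busy port; a conflict with a socket in FIN-WAIT-1 may instead be
reported later, by connect(), which the sketch does not handle.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Assumed bounds for illustration; check the mtr source for the real values. */
#define MIN_PORT 33000
#define MAX_PORT 65535

/* Map any sequence number onto a port inside [MIN_PORT, MAX_PORT]. */
static int port_for_sequence(unsigned int sequence)
{
    unsigned int span = MAX_PORT - MIN_PORT + 1;

    return MIN_PORT + (int)(sequence % span);
}

/*
 * Bind sock to src_addr, trying up to max_attempts consecutive ports
 * from the range. Returns the bound port, or -1 with errno set by the
 * last bind() failure.
 */
static int bind_in_port_range(int sock, struct in_addr src_addr,
                              unsigned int first_sequence,
                              unsigned int max_attempts)
{
    struct sockaddr_in addr;
    unsigned int i;

    assert(max_attempts > 0);

    for (i = 0; i < max_attempts; i++) {
        int port = port_for_sequence(first_sequence + i);

        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_addr = src_addr;
        addr.sin_port = htons((unsigned short)port);

        if (bind(sock, (struct sockaddr *)&addr, sizeof(addr)) == 0)
            return port;

        /* Only a busy port is worth retrying with the next candidate. */
        if (errno != EADDRINUSE)
            break;
    }
    return -1;
}
```

A caller would create a TCP socket, pass the local address and the probe's
sequence number, and use the returned port for the outgoing probe.
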
Reported-by: Scott Pearson <scott@cloudflare.com>
Reproduced-by: Jarred Trainor <jarred@cloudflare.com>
Signed-off-by: Sami Kerola <kerolasa@iki.fi> && <kerolasa@cloudflare.com>
The prior implementation of mtr-packet on Cygwin would
not respond to Unix-style signals sent from other processes.
It was unkillable from the Cygwin shell, even with SIGKILL,
and exiting mtr would sometimes stall for several seconds because
it would ignore the SIGTERM sent from the main mtr process.
It would then wait for all outstanding probes to time out before
exiting.
Signals were ignored because they are implemented by the Cygwin
library at the user level (i.e. not provided by the OS kernel),
and mtr-packet often bypassed Cygwin's I/O functions by calling
Win32 APIs directly.
With this rework, the Cygwin implementation uses an ICMP service
thread to call the Win32 ICMP functions, but the main thread
uses a POSIX-style select() loop, similar to the Unix version of mtr.
I would have liked to avoid multithreading entirely, but here are
the constraints:

    a) mtr was originally a Unix program which used "raw sockets".
    b) In order to port mtr to Windows, Cygwin is used to get a
       Unix-like environment.
    c) You can't use a raw socket to receive an ICMP reply on Windows.
       However, Windows provides a separate API in the form of
       ICMP.DLL for sending and receiving ICMP messages.
    d) The ICMP API works asynchronously, and requires completion
       through an asynchronous procedure call ("APC").
    e) APCs are only delivered during blocking Win32 operations
       which are flagged as "alertable." This prevents apps from
       having APCs execute unexpectedly during an I/O operation.
    f) Cygwin's implementation of POSIX functions does all I/O
       through non-alertable I/O operations. This is reasonable
       because APCs don't exist in the POSIX API.
    g) Cygwin implements Unix-style signals at the application level,
       since the Windows kernel doesn't have them. We want our
       program to respond to SIGTERM and SIGKILL, at least.
    h) Cygwin's signal implementation will deliver signals during
       blocking I/O functions in the Cygwin library, but won't
       respond to signals if the signal is sent while the application
       is in a blocking Windows API call which Cygwin is not aware of.
    i) Since we want to both send/receive ICMP probes and also respond
       to Unix-style signals, we require two threads: one which
       uses Cygwin's POSIX-style blocking I/O and can respond to
       signals, and one which uses alertable waits using Win32
       blocking APIs.

The solution is to have the main thread using select() as the
blocking operation in its loop, and also to have an ICMP service
thread using WaitForSingleObjectEx() as its blocking operation.
The main thread will respond to signals. The ICMP service thread
will run the APCs completing ICMP.DLL requests.
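
The two-thread split can be sketched in portable terms. This is an analogue
under stated assumptions, not the Cygwin code from this change: a pthread
stands in for the ICMP service thread, a blocking read() stands in for the
alertable WaitForSingleObjectEx() wait, and pipes carry requests and
completions. The names struct completion, struct service_fds, service_thread()
and probe_once(), and the reported timing, are all made up for the example.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical result record; not a structure from mtr. */
struct completion {
    int token;
    int round_trip_us;
};

/* Pipe ends used by the service thread. */
struct service_fds {
    int request_read;
    int completion_write;
};

/*
 * Service thread: in the design described above this would block in an
 * alertable Win32 wait. Here a blocking read() stands in for that wait.
 */
static void *service_thread(void *arg)
{
    struct service_fds *fds = arg;
    int token;

    while (read(fds->request_read, &token, sizeof(token)) ==
           (ssize_t)sizeof(token)) {
        struct completion done = { token, 1000 };  /* made-up timing */

        if (write(fds->completion_write, &done, sizeof(done)) !=
            (ssize_t)sizeof(done))
            break;
    }
    return NULL;
}

/*
 * Main thread: hand one request to the service thread, then wait for the
 * completion in select(). Returns 0 on success, -1 on failure or timeout.
 */
static int probe_once(int token, struct completion *out)
{
    int request[2], completion[2];
    struct service_fds fds;
    pthread_t thread;
    int ready = -1, result = -1;

    assert(out != NULL);
    if (pipe(request) != 0)
        return -1;
    if (pipe(completion) != 0) {
        close(request[0]);
        close(request[1]);
        return -1;
    }
    fds.request_read = request[0];
    fds.completion_write = completion[1];

    if (pthread_create(&thread, NULL, service_thread, &fds) == 0) {
        if (write(request[1], &token, sizeof(token)) ==
            (ssize_t)sizeof(token)) {
            do {
                fd_set readable;
                struct timeval timeout = { 5, 0 };

                FD_ZERO(&readable);
                FD_SET(completion[0], &readable);
                /* A signal interrupts select() with EINTR, so handlers run. */
                ready = select(completion[0] + 1, &readable, NULL, NULL,
                               &timeout);
            } while (ready < 0 && errno == EINTR);

            if (ready > 0 && read(completion[0], out, sizeof(*out)) ==
                             (ssize_t)sizeof(*out))
                result = 0;
        }
        close(request[1]);      /* EOF makes the service thread exit */
        request[1] = -1;
        pthread_join(thread, NULL);
    }
    if (request[1] != -1)
        close(request[1]);
    close(request[0]);
    close(completion[0]);
    close(completion[1]);
    return result;
}
```

Because the main thread only ever blocks in select(), a signal sent to the
process can interrupt it there, while the service thread is free to block in
whatever call its API requires.
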
This change doesn't affect non-Windows versions of mtr, other than
moving the code from command_unix.c back into command.c,
since it can now be shared between Unix-like systems and Windows.
There is now a package for Python which invokes mtr-packet and sends
network probes asynchronously: https://pypi.org/project/mtrpacket/
This change adds a mention of it to the mtr-packet man page, which
should help people who want to use mtr-packet in monitoring scripts
find it.
Change 817f171d broke the macOS build.
Change 817f171d converted the error reporting in mtr-packet from
perror() to error(). That's fine, but error() is missing on macOS.
We already had portability/error.c for this reason, but that was
previously only linked into the mtr binary, not mtr-packet.
This change also links portability/error.c with mtr-packet when
error() is missing from the OS.
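
For readers unfamiliar with error(), the following is a minimal sketch of
what a fallback for it might look like. It is not the contents of
portability/error.c. The names advance(), vformat_error(), format_error() and
fallback_error() are hypothetical, and the program name is hard-coded as an
assumption. The behavior follows the glibc convention: print the program
name, the formatted message, the strerror() text when errnum is nonzero, and
exit when status is nonzero.

```c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Advance the write offset after an snprintf-style call, never past size - 1. */
static size_t advance(size_t used, size_t size, int written)
{
    if (written < 0)
        return used;
    if ((size_t)written >= size - used)
        return size - 1;
    return used + (size_t)written;
}

/* Build "progname: message[: strerror(errnum)]" into buf. */
static void vformat_error(char *buf, size_t size, const char *progname,
                          int errnum, const char *format, va_list args)
{
    size_t used;

    assert(buf != NULL && size > 0);
    buf[0] = '\0';

    used = advance(0, size, snprintf(buf, size, "%s: ", progname));
    used = advance(used, size,
                   vsnprintf(buf + used, size - used, format, args));
    if (errnum != 0)
        snprintf(buf + used, size - used, ": %s", strerror(errnum));
}

/* Variadic wrapper, split out so the formatting can be checked on its own. */
static void format_error(char *buf, size_t size, const char *progname,
                         int errnum, const char *format, ...)
{
    va_list args;

    va_start(args, format);
    vformat_error(buf, size, progname, errnum, format, args);
    va_end(args);
}

/* Hypothetical stand-in for error(status, errnum, format, ...). */
static void fallback_error(int status, int errnum, const char *format, ...)
{
    char message[512];
    va_list args;

    va_start(args, format);
    /* "mtr-packet" is an assumed program name for this example. */
    vformat_error(message, sizeof(message), "mtr-packet", errnum, format,
                  args);
    va_end(args);

    fflush(stdout);
    fprintf(stderr, "%s\n", message);
    if (status != 0)
        exit(status);
}
```

A call such as fallback_error(EXIT_FAILURE, errno, "bind failed") would print
the message with the errno text to stderr and then exit.
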