Obscure command line options, lots of data, output readable only by the initiated - what's not to like?
"strace" is a Linux tool that intercepts system calls from one or more processes to log their inputs and outputs. (There are Windows equivalents, eg. "ProcMon" and the ETW suite.)
This helps to cut the search space for unexplainable behaviour in half: carefully looking at what goes into the OS kernel and what comes out again might offer enough clues to decide whether the application or the kernel side (which includes all the hardware, most notably storage and networking!) is at "fault".
Advantages of "strace" are:
- it is installed (or trivially installable) on practically every Linux system;
- no changes to the application are needed - it can even attach to already-running processes;
- it shows the complete interaction with the kernel: syscall arguments, return values, errors, and (with "-T") timing.
Disadvantages are:
- the ptrace-based interception is expensive, so the traced process runs noticeably slower;
- the output gets large quickly and takes some practice to read;
- tracing needs appropriate privileges (the same user as the target process, or root).
My favourite call goes like this (see the strace man page for details):
strace -fttT -o <filename> -s <bytes> -p <pid>
and I vary the number of bytes depending on my needs -- from zero (when the data itself doesn't matter, and/or to reduce overhead) to lots-of-KB (to capture whole network communications, eg. if it's not easily possible to use "tcpdump").
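Spelled out, with the placeholders filled in (the file name, byte count, and PID below are made-up examples):
# -f  : also trace child processes and threads
# -tt : print absolute timestamps with microsecond resolution
# -T  : append the time spent inside each syscall (the <...> values)
# -o  : write the trace to a file instead of stderr
# -s  : print at most this many bytes per string argument
# -p  : attach to an already-running process
strace -fttT -o /tmp/app.trace -s 256 -p 12345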
The following is a heavily redacted example (timestamp and PID removed, lines broken and shortened) for looking at a DNS request done via "host brz.gv.at":
socket(AF_INET, SOCK_DGRAM, IPPROTO_IP) = 9 <0.000058>
connect(9, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("…")}, 16) = 0 <0.000045>
sendmmsg(9, [{msg_hdr={msg_name=NULL,
msg_namelen=0,
msg_iov=[{iov_base="Gf\1\0\0\1\0\0\0\0\0\0\3brz\2gv\2at\0\0\17\0\1", iov_len=27}],
msg_iovlen=1, msg_controllen=0, msg_flags=0},
msg_len=27}], 1, 0) = 1 <0.000087>
recvmsg(9, {msg_name={sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("…")},
msg_namelen=128 => 16,
msg_iov=[{iov_base="Gf\201\200\0\1\0\0\0\1\0\0\3v\2"..., iov_len=1310700}],
msg_iovlen=1, msg_controllen=0, msg_flags=0},
0) = 97 <0.000065>
The last line shows the result being received; if the "strace" command had specified "-s 97" or higher, the complete DNS answer would be visible.
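Eg. to redo just the DNS part with enough capture space (the output file name is a made-up placeholder):
strace -f -T -e trace=sendmmsg,recvmsg -s 512 -o /tmp/dns.trace host brz.gv.at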
Here are a few examples of its usefulness.
A few VMs are notably slower than some others - load balancer logfiles show a consistent mismatch.
By comparing the timing of a syscall with practically zero kernel-side work, eg. via
strace -e getuid --syscall-times=ns -- id
the same performance difference could be seen -- but as this syscall has no external dependencies (ie. it touches neither network nor storage), only the raw machine itself could be the culprit. Further investigation showed that after an update cycle the BIOS settings were wrong: CPU and RAM speed had been reset.
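A minimal sketch of such a comparison, assuming SSH access to the machines (the hostnames are placeholders):
for h in vm-fast vm-slow; do
    echo "== $h =="
    ssh "$h" 'strace -e getuid --syscall-times=ns -- id >/dev/null' 2>&1 | grep getuid
done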
When looking at network operations (like "send", "recv", "sendmsg", "recvmsg", "write" or "read" to/from a socket, etc.), associating requests with their answers makes the latencies of remote services visible. If individual "write"s take too long, network problems (full TCP buffers, packet loss, ICMP troubles, ...) might even be discerned.
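A quick way to surface the slowest calls is to sort by the <...> durations that "-T" appends (the trace file name is a placeholder):
grep -E '\b(send|recv|read|write)' /tmp/app.trace | sort -t'<' -k2 -g -r | head -n 20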
For storage operations it's not quite that easy: a filesystem, with its associated complexity (syncs, datasyncs, journaling, ...), typically causes write amplification; the backend storage (rotating disks, SSDs, Fibre Channel, iSCSI, NVMe, NFS, ...) might have its own caching (and acknowledging) strategy, possibly depending on the battery-backup charge (a broken or nearly empty battery might mean direct writes instead of write buffering, and therefore higher latency); and so on.
Still, comparing two similar systems might help to diagnose the issue.
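One way to get comparable numbers from two hosts is strace's summary mode ("-c") with a small, sync-heavy write workload; the dd parameters here are just an example:
strace -c -f -w -- dd if=/dev/zero of=/tmp/testfile bs=4k count=1000 oflag=dsync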
When the data is encrypted (eg. with TLS connections), you won't see the actual requests (SQL statements, HTTP queries, etc.). In that case the trace still provides useful metadata - connection setup, timing, and transfer sizes - but to get at the plaintext you'd have to capture it on another layer, eg. via application logging or behind a TLS-terminating proxy.
Tracing an application might show up a few easy things to fix, like the same file being opened again and again, retry loops hammering on a non-existent path (ENOENT), unbuffered one-byte reads or writes, or an excessive number of "stat" calls.
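Eg. a hypothetical one-liner to count failing accesses per path in a trace file:
grep 'ENOENT' /tmp/app.trace | grep -oE '"[^"]*"' | sort | uniq -c | sort -rn | head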
While the strace output looks daunting (because of the amount of data, the many details, and the missing big-picture overview), programmers should be in a good position to interpret it -- their programs are causing it, and it's "just" (ha!) a more detailed view of the individual program statements.
BTW, I'm well aware of "perf trace" -- but that one doesn't show contents (no file names, no actual bytes being sent, received, written, or read; and in an OpenShift container it didn't even give syscall names!), so while I'm all for decreased overhead, it doesn't give me the data I need (yet).
Last, but not least -- if you already have some idea about the problem and need more specific tracing/debugging, eBPF is your friend!
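For a taste, a well-known bpftrace one-liner (needs root and the bpftrace package) that prints which files every process on the system is opening:
bpftrace -e 'tracepoint:syscalls:sys_enter_openat { printf("%s %s\n", comm, str(args->filename)); }'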