ply
Light-Weight Dynamic Tracing for Linux

Approachable

The syntax is inspired by UNIX's "little languages" of yore, such as awk. Thus, people who have written an awk script should be able to effortlessly write ply scripts. Variables are declared and their types inferred automatically, making iterating on scripts fast and easy.

Light-Weight

Designed with embedded systems in mind. Written in C, all ply needs to run is libc and a modern kernel with Linux BPF support. No external kernel modules, no LLVM compiler, no Python interpreter. It is also very easy to add support for new architectures.

Efficient

All data gathering and aggregation is done in the kernel using Linux BPF programs that are JIT compiled to native instructions on most common architectures. This means that ply runs with very low overhead, allowing it to probe even the hottest code paths.

What Can You Do With it?


Counting Syscalls

#!/usr/bin/env ply

kprobe:SyS_*
{
    $syscalls[func].count()
}

This probe will be attached to all functions whose name starts with SyS_, i.e. all syscalls. On each syscall, the probe fires, indexes into the user-defined map $syscalls using the built-in variable func as the key, and bumps a counter.

ply will compile the script, attach it to the matching probes and start collecting data. On exit, ply will dump the values of all user-defined variables and maps.

wkz@wkz-box:~$ sudo ./syscall-count.ply
331 probes active
^Cde-activating probes

$syscalls:
sys_mprotect                   1
sys_readv                      1
sys_newlstat                   1
sys_access                     2
sys_bind                       3
sys_getsockname                3
sys_rt_sigaction               4
sys_ftruncate                  4
sys_unlink                     4
sys_pselect6                   4
sys_timerfd_settime            4
sys_dup                        5
sys_fdatasync                  5
sys_lseek                     17
sys_inotify_add_watch         21

[ ... output redacted for clarity ... ]

sys_select                 14624
sys_munmap                 14778
sys_mmap_pgoff             14887
sys_epoll_wait             14898
sys_writev                 19516
sys_write                  22644
sys_read                   28700
sys_poll                   53401
sys_futex                  78401
sys_ioctl                 146141
sys_recvmsg               181933
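
The counting step can be modeled outside the kernel. The following Python sketch only illustrates the semantics and is not how ply is implemented: attach and fire are hypothetical helper names, the symbol list and call trace are made up, and fnmatchcase is used as a stand-in for ply's own glob matching. It selects functions by glob, as kprobe:SyS_* does, and bumps a per-function counter, as $syscalls[func].count() does.

```python
from collections import Counter
from fnmatch import fnmatchcase

def attach(pattern, symbols):
    """Return the symbols that a glob such as 'SyS_*' would select."""
    return [s for s in symbols if fnmatchcase(s, pattern)]

def fire(counts, func):
    """Model of $syscalls[func].count(): bump the counter keyed by func."""
    counts[func] += 1

# Hypothetical symbol list and call trace, for illustration only.
symbols = ["SyS_read", "SyS_write", "SyS_futex", "vfs_read"]
probes = attach("SyS_*", symbols)

syscalls = Counter()
trace = ["SyS_read", "SyS_futex", "SyS_read", "vfs_read", "SyS_write", "SyS_read"]
for func in trace:
    if func in probes:  # only attached functions trigger the probe
        fire(syscalls, func)

# Dump in ascending order of count, like the output above.
for func, n in sorted(syscalls.items(), key=lambda kv: kv[1]):
    print(f"{func:<24}{n:>8}")
```

vfs_read appears in the trace but does not match the glob, so it never reaches the counter.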

Read Size Distribution

#!/usr/bin/env ply

kprobe:SyS_read
{
    $sizes.quantize(arg(2))
}

This example shows a very simple script that instruments the read(2) syscall and records the distribution of the size argument, i.e. argument 2 (zero indexed), into the user-defined variable $sizes.

wkz@wkz-box:~$ sudo ./read-dist.ply
1 probe active
^Cde-activating probes

$sizes:
[   0,    1]        2089
[   2,    4)         434
[   4,    8)        6334
[   8,   16)        9738
[  16,   32)        1645
[  32,   64)          16
[  64,  128)          24
[ 128,  256)          63
[ 256,  512)         102
[ 512,   1k)         200
[  1k,   2k)         433
[  2k,   4k)         750
[  4k,   8k)        1492
[  8k,  16k)        1157
[ 16k,  32k)        5703
[ 32k,  64k)          26
[ 64k, 128k)          48
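
The bucket boundaries in this output are powers of two. The following Python sketch models that bucketing; it is inferred from the printed ranges rather than taken from ply's source, and bucket_index and bucket_range are hypothetical names.

```python
from collections import Counter

def bucket_index(value):
    """Power-of-two bucket for a non-negative integer.

    Bucket 0 holds 0 and 1; bucket i (i >= 1) holds [2**i, 2**(i + 1)).
    """
    if value < 0:
        raise ValueError("this sketch only handles non-negative values")
    return max(value.bit_length() - 1, 0)

def bucket_range(index):
    """Return (low, high); high is exclusive except for bucket 0."""
    if index == 0:
        return (0, 1)
    return (1 << index, 1 << (index + 1))

# Hypothetical read sizes, for illustration only.
samples = [0, 1, 3, 8, 15, 4096, 20000]
sizes = Counter(bucket_index(n) for n in samples)

for index in sorted(sizes):
    print(bucket_range(index), sizes[index])
```

Under this model a 4096-byte read lands in the [4k, 8k) row, since the lower bound of each bucket is inclusive.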

Socket Buffer Tracking

#!/usr/bin/env ply

kprobe:__netif_receive_skb_core
{
	$rx[arg(0)] = nsecs;
}

kprobe:ip_rcv / $rx[arg(0)] /
{
	$t.quantize(nsecs - $rx[arg(0)]);
	$rx[arg(0)] = nil;
}

Record the distribution of the time it takes an skb to move from __netif_receive_skb_core to ip_rcv.

In the first probe, a timestamp is stored in $rx using the skb pointer as the key. The second probe is then allowed to run if that skb is a valid key in $rx.

Then the difference with the current timestamp is calculated and its distribution stored in $t. As a final step, the original timestamp is removed so that the map will not fill up with useless entries.
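
The two probes cooperate through the $rx map. A rough Python model of that pattern follows. It is a sketch of the logic only: on_receive and on_ip_rcv are hypothetical stand-ins for the two probes, and the pointer and timestamp values are made up.

```python
rx = {}      # models $rx: skb pointer -> timestamp in nanoseconds
deltas = []  # models the samples fed to $t.quantize()

def on_receive(skb, now_ns):
    """First probe: remember when this skb was first seen."""
    rx[skb] = now_ns

def on_ip_rcv(skb, now_ns):
    """Second probe: the body runs only if skb is a key in rx."""
    if skb not in rx:
        return  # predicate failed, probe body is skipped
    deltas.append(now_ns - rx[skb])
    del rx[skb]  # models `$rx[arg(0)] = nil`

# Made-up event sequence, for illustration only.
on_receive(0xFFFF8800, 1_000)
on_receive(0xFFFF8900, 1_200)
on_ip_rcv(0xFFFF8800, 2_500)  # tracked: delta of 1500 ns
on_ip_rcv(0xFFFF9999, 3_000)  # never seen by the first probe: ignored
on_ip_rcv(0xFFFF8900, 3_100)  # tracked: delta of 1900 ns

print(deltas, rx)
```

Because every matched entry is deleted, the map is empty again once all tracked skbs have reached the second probe, which is why $rx prints no entries in the output below.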

Note that the values in $t are in nanoseconds. For example, the median time in this case lies somewhere between 1 and 2 microseconds (1k-2k nanoseconds).

wkz@wkz-box:~$ sudo ./skb-track.ply
2 probes active
^Cde-activating probes

$rx:

$t:
[ 512,   1k)	    1041
[  1k,   2k)	    5972
[  2k,   4k)	    3439
[  4k,   8k)	     655
[  8k,  16k)	      39
[ 16k,  32k)	       2
[ 32k,  64k)	      10
[ 64k, 128k)	       1
[  1M,   2M)	       1
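
The median claim can be checked from the histogram itself. The Python sketch below copies the counts from the $t output above and walks the cumulative total until it reaches the halfway point. median_bucket is a hypothetical helper, and k and M are taken to mean 1024 and 1024 * 1024, which is what the power-of-two boundaries imply.

```python
K, M = 1024, 1024 * 1024

# (low, high, count) copied from the $t output; each range is [low, high).
histogram = [
    (512, 1 * K, 1041),
    (1 * K, 2 * K, 5972),
    (2 * K, 4 * K, 3439),
    (4 * K, 8 * K, 655),
    (8 * K, 16 * K, 39),
    (16 * K, 32 * K, 2),
    (32 * K, 64 * K, 10),
    (64 * K, 128 * K, 1),
    (1 * M, 2 * M, 1),
]

def median_bucket(buckets):
    """Return the (low, high) range of the bucket containing the median."""
    total = sum(count for _, _, count in buckets)
    seen = 0
    for low, high, count in buckets:
        seen += count
        if 2 * seen >= total:
            return low, high
    raise ValueError("empty histogram")

total = sum(count for _, _, count in histogram)
print(total, median_bucket(histogram))
```

Of the 11160 samples, 1041 fall below 1k and 7013 fall below 2k, so the halfway point lies in the [1k, 2k) bucket.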

What Can't You Do With it?


At the moment, any data that is not available through a built-in function in ply or in a CPU register is out of reach. That means no stack variables, no pointer dereferencing, and no type casting.

ply is intended to be a light-weight tool, useful for exploratory probing. It is a very young project and by no means a complete tracer.

For more explicit control of instrumentation, data processing, and presentation, have a look at the bcc project.

It uses an LLVM-based C back-end and a Python front-end. With it you can write kprobe tracepoints, tc filters and much more.

bcc on GitHub