Automatic collection of network measurement metadata.
This is a complete rewrite of SoMeta in Go. The earlier (Python) version of SoMeta
can be found at https://github.com/jsommers/metameasurement.
Current version is v1.4.1.
The easiest way to build and install is to run go install github.com/jsommers/someta@latest (or replace latest with a specific version tag). This command will download and compile all dependencies, and install the someta binary in your $GOPATH/bin directory.
You can, of course, also clone the source repository and use the Go toolchain to build, etc.
SoMeta can be configured using a YAML file. This method of configuration is new in v1.3, and subsumes (nearly) all command-line parameters indicated below. Configuring via a file allows (1) inclusion of URLs and details of associated data sets used in the measurement task(s), (2) inclusion of README-like details which aren't possible with the command-line, and (3) creation of a configuration template that can easily be reused.
An example configuration is provided in the repo, named someta_config_example.yaml. It should, in general, be self-explanatory (see the command-line arguments discussion below for details regarding config parameters for individual monitors).
You can also create a new YAML config using the -I option. Any command-line flags to change defaults or to configure monitors will be parsed and included in the generated config, which will be written to stdout. The -c (command) option is required. If any monitors are configured (through -M command-line arguments), those configurations will be included in the generated YAML; if no monitors are configured, default configurations for all monitors will be included instead. (New in v1.4.0.)
To start SoMeta with a configuration file, use the -y option:
someta -y someta_config_example.yaml
Another useful command-line parameter is -Y, which does some simple sanity checks on the configuration and exits prior to starting any monitoring. This option can be used with or without a YAML configuration.
Note that non-monitor command-line parameters are overridden by config file options. Note also that any monitor configurations provided on the command-line are added to monitors configured via the yaml file. It is recommended for your own sanity to just use the configuration file when and where possible and not to mix the two configuration styles.
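The precedence rule above can be sketched in a few lines of Go. This is an illustration only: the settings struct, its field names, and the merge function are hypothetical and are not taken from SoMeta's source, and the sketch assumes that an empty string means "not set in the file".

```go
package main

import "fmt"

// settings is a hypothetical, simplified view of SoMeta's options.
// The field names are made up for this sketch.
type settings struct {
	Command  string   // external measurement command (-c)
	FileBase string   // output file basename (-f)
	Monitors []string // monitor specs (-M)
}

// merge applies the rule described above: non-monitor values from the
// config file override the command line (assumed here: only when the
// file actually sets them), while command-line monitors are added to
// the monitors configured in the file.
func merge(cli, file settings) settings {
	out := cli
	if file.Command != "" {
		out.Command = file.Command
	}
	if file.FileBase != "" {
		out.FileBase = file.FileBase
	}
	out.Monitors = append(append([]string{}, file.Monitors...), cli.Monitors...)
	return out
}

func main() {
	cli := settings{Command: "sleep 5", FileBase: "cli", Monitors: []string{"cpu"}}
	file := settings{Command: "ping -c 3 8.8.8.8", Monitors: []string{"netstat:en0"}}
	fmt.Printf("%+v\n", merge(cli, file))
}
```

Because monitors accumulate from both sources, mixing the two styles can easily leave more monitors configured than intended, which is one more reason to stick to the configuration file.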
There are several possible command-line options. See below for a listing of all parameters (i.e., the output of someta -h). Some additional detail follows, specifically regarding monitors and options.
Usage of ./someta:
  -C int
        Set CPU affinity (default is not to set affinity) (default -1)
  -F duration
        Time period after which in-memory metadata will be flushed to file (default 10m0s)
  -M value
        Select monitors to include. Default=None. Valid monitors=cpu,io,mem,netstat,rtt
  -R duration
        Time period after which metadata output will rollover to a new file (default 1h0m0s)
  -Y    Check configuration but don't start metadata collection
  -c string
        Command line for external measurement program
  -d    Debug output (metadata is written to stdout)
  -f string
        Output file basename; current date/time is included as part of the filename (default "metadata")
  -l    Send logging messages to a file (by default, they go to stdout)
  -m duration
        Time interval on which to gather metadata from monitors (default 1s)
  -q    Quiet output
  -u duration
        Time interval on which to show periodic status while running (default 5s)
  -v    Verbose output
  -w duration
        Wait time before starting external tool, and wait time after external tool stops, during which metadata are collected (default 1s)
  -y string
        Name of YAML configuration file
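The -f option above says that the current date/time becomes part of the output filename. Judging from the file name used in the analysis example later in this document (fulltest_2018-05-03T18:07:11-04:00.json), the pattern appears to be the basename, an underscore, an RFC 3339 timestamp, and a .json suffix. The Go sketch below reproduces that pattern; outputName and exampleStart are hypothetical helpers written for illustration, not SoMeta's actual code.

```go
package main

import (
	"fmt"
	"time"
)

// outputName is a hypothetical helper, not SoMeta's own code. It assumes
// the pattern <basename>_<RFC 3339 timestamp>.json, inferred from file
// names such as fulltest_2018-05-03T18:07:11-04:00.json.
func outputName(basename string, start time.Time) string {
	return fmt.Sprintf("%s_%s.json", basename, start.Format(time.RFC3339))
}

// exampleStart returns a fixed start time in a UTC-4 zone for the demo.
func exampleStart() time.Time {
	edt := time.FixedZone("EDT", -4*60*60)
	return time.Date(2018, time.May, 3, 18, 7, 11, 0, edt)
}

func main() {
	// prints fulltest_2018-05-03T18:07:11-04:00.json
	fmt.Println(outputName("fulltest", exampleStart()))
}
```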
The -c option indicates the "external" measurement tool to start. By default, SoMeta starts sleep 5, which causes SoMeta simply to collect 5 seconds' worth of metadata from whatever monitors have been configured. You'll almost certainly need to quote the command line for the external tool, and some escaping may be required if the tool needs embedded quotes (see the example with scamper, below).
The -M option specifies a monitor to start. Standard available monitors include cpu, mem, io, netstat, and rtt (see the monitors/ directory).
To configure a monitor, parameters may be specified along with each monitor name, each separated by a colon (:) or a comma (,). Each parameter may be a single string or a key=value pair. The order of parameters doesn't matter.
Note that if you are using the rtt monitor with IPv6, you'll need to use comma separators, because a colon used as a parameter separator can't be distinguished from the colons within an IPv6 address.
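To make the separator issue concrete, here is a small Go sketch of how a monitor specification might be split into a name, key=value parameters, and bare string arguments. The monitorSpec type and parseMonitorSpec function are hypothetical and written only for illustration; SoMeta's real parser may work differently.

```go
package main

import (
	"fmt"
	"strings"
)

// monitorSpec is a hypothetical representation of one -M argument.
type monitorSpec struct {
	Name   string
	Params map[string]string // key=value parameters
	Args   []string          // bare string parameters, e.g. interface names
}

// parseMonitorSpec splits a spec on the given separator (":" or ",").
// Illustration only; this is not SoMeta's actual parsing code.
func parseMonitorSpec(spec, sep string) monitorSpec {
	fields := strings.Split(spec, sep)
	m := monitorSpec{Name: fields[0], Params: map[string]string{}}
	for _, f := range fields[1:] {
		if key, val, found := strings.Cut(f, "="); found {
			m.Params[key] = val
		} else if f != "" {
			m.Args = append(m.Args, f)
		}
	}
	return m
}

func main() {
	// Comma separators keep the IPv6 address in one piece.
	good := parseMonitorSpec("rtt,dest=2607:f8b0:4006:805::200e,maxttl=6", ",")
	fmt.Println(good.Params["dest"]) // prints 2607:f8b0:4006:805::200e

	// Colon separators cut the same address apart: only the first group
	// survives as dest, and the rest look like bare string arguments.
	bad := parseMonitorSpec("rtt:dest=2607:f8b0:4006:805::200e:maxttl=6", ":")
	fmt.Println(bad.Params["dest"], bad.Args)
}
```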
Here's an example that turns on all monitors (io, netstat, cpu, mem, rtt):
sudo ./someta -M=io,disk0 -M=netstat,en0 -M=cpu -M=mem -M=rtt,type=hoplimited,dest=149.43.80.25,maxttl=3,interface=en0 -R 1m -F 20s -f fulltest -m 1s -w 2s -v -c "sleep 150"
Again, type ./someta -h for a list of command-line options and their defaults.
Valid parameters for each standard monitor are:
- -M=cpu:interval=X: set the periodic sampling interval (default 1 sec)
- -M=io:interval=X: set the periodic sampling interval (default 1 sec)
- -M=mem:interval=X: set the periodic sampling interval (default 1 sec)
- -M=netstat:interval=X: set the periodic sampling interval.

  Note that the interval time value is parsed by Go's time.ParseDuration (https://golang.org/pkg/time/#ParseDuration), so any value must also include a unit, like interval=1s (1 second interval).

  Additional string arguments to the netstat monitor can specify interface names to monitor (all interfaces are included if none are specified). For example, to monitor en0's netstat counters every 5 seconds:

  -M=netstat:interval=5s:en0

- -M=rtt:interface=IfaceName:rate=R:dest=D:type=ProbeType:maxttl=MaxTTL:proto=Protocol:allhops:constflow

  Monitor RTT along a path to destination D out of interface IfaceName with probe rate R. Probe interval is gamma distributed. The default destination is 8.8.8.8 and the default probe rate is 1/sec.

  - ProbeType can be either ping or hoplimited (default is hoplimited).
  - MaxTTL is the maximum TTL for hop-limited probes (pointless for ping probes). Default is maxttl=1.
  - Protocol is icmp, tcp, or udp (for hop-limited probes). Default is icmp.
  - allhops: probe all hops up to maxttl (for hop-limited probes).
  - constflow: manipulate packet contents to force the first 4 bytes of the transport header to be constant (to make probes follow a constant path). This parameter only has an effect on icmp; data are appended to force the checksum to be a constant value. Note: udp/tcp probes always have constant first 4 bytes.

- -M=ss

  Monitor socket statistics using the ss tool (Linux only). Thanks to Ricky Mok (CAIDA) for contributing this module.
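Since interval values go through Go's time.ParseDuration, a bare number such as interval=5 is rejected for lacking a unit. The sketch below shows that behavior using the standard library call directly; the parseInterval wrapper and its check for positive values are assumptions added for this example and are not taken from SoMeta's source.

```go
package main

import (
	"fmt"
	"time"
)

// parseInterval wraps time.ParseDuration. The rejection of zero or
// negative values is an assumption made for this sketch.
func parseInterval(s string) (time.Duration, error) {
	d, err := time.ParseDuration(s)
	if err != nil {
		return 0, fmt.Errorf("bad interval %q: %w", s, err)
	}
	if d <= 0 {
		return 0, fmt.Errorf("interval %q must be positive", s)
	}
	return d, nil
}

func main() {
	// "5" has no unit, so time.ParseDuration reports an error for it.
	for _, s := range []string{"1s", "500ms", "1m30s", "5"} {
		d, err := parseInterval(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(s, "->", d)
	}
}
```

Valid units for time.ParseDuration are ns, us (or µs), ms, s, m, and h, and values can be combined, as in 1m30s.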
Here are some examples:
# Monitor only CPU performance while emitting 100 ICMP echo request (ping) probes to
# www.google.com.
$ sudo ./someta -M=cpu -c "ping -c 100 www.google.com"
# Monitor CPU performance and netstat counters (for all interfaces) for traceroute
$ sudo ./someta -M=cpu -M=netstat -c "traceroute www.google.com"
# Monitor IO and netstat counters for ping
# Set the metadata output file to start with "ping_google"
$ sudo ./someta -M=io -M=netstat -c "ping www.google.com" -f ping_google
# Monitor everything, including RTT for the first 3 hops of the network path toward
# 8.8.8.8. As the external tool, use scamper to emit ICMP echo requests, dumping
# its output to a warts file.
$ sudo ./someta -M=cpu -M=mem -M=io -M=netstat:eth0 -M=rtt:interface=eth0:type=hoplimited:maxttl=3:dest=8.8.8.8 -f ping_metadata -l -c "scamper -c \"ping -P icmp-echo -c 60 -s 64\" -o ping.warts -O warts -i 8.8.8.8"
# An example with using the RTT monitor w/IPv6 (with the dummy command `sleep`).
# Note that in my example below I used an IPv6 (6-in-4) tunnel interface.
$ sudo ./someta -c "sleep 5" -M=rtt,dest="2607:f8b0:4006:805::200e",type=hoplimited,interface=he-ipv6,maxttl=6 -v
Sending SIGHUP to SoMeta will cause it to re-read its YAML configuration file. This feature is in progress.
The analyzemeta.py script performs some simple analysis on SoMeta metadata, printing results to the console.
For more complex data analyses (or, if you prefer, metadata analyses), there is a Python module read_someta.py that provides a function read_someta for reading data in a SoMeta .json file into a dictionary of Pandas DataFrame objects. There is a separate DataFrame object for each monitor.
For example:
>>> from read_someta import read_someta
>>> d = read_someta('fulltest_2018-05-03T18:07:11-04:00.json')
>>> d.keys()
dict_keys(['someta', 'cpu', 'mem', 'rtt', 'io', 'netstat'])
>>> d['cpu']
cpu0_idle cpu1_idle cpu2_idle cpu3_idle
timestamp
2018-05-03 18:07:12.978601317-04:00 62.037037 87.735849 68.224299 89.719626
2018-05-03 18:07:13.979181597-04:00 70.000000 93.069307 71.000000 96.000000
2018-05-03 18:07:14.980990941-04:00 82.828283 97.979798 86.000000 98.000000
2018-05-03 18:07:15.980368940-04:00 74.000000 96.039604 79.000000 96.000000
2018-05-03 18:07:16.981288271-04:00 69.306931 89.000000 75.000000 91.089109
... ... ... ... ...
2018-05-03 18:08:08.981608769-04:00 80.808081 94.000000 83.838384 90.000000
2018-05-03 18:08:09.983457489-04:00 83.000000 94.000000 86.274510 89.000000
2018-05-03 18:08:10.981178466-04:00 87.000000 97.000000 93.000000 98.000000
2018-05-03 18:08:11.983964314-04:00 70.297030 92.079208 72.000000 91.000000
2018-05-03 18:08:12.981282530-04:00 90.909091 98.000000 95.959596 99.000000
[61 rows x 4 columns]
>>>
NB: plotting tools need some updating still from the earlier Python versions.
The plotmeta.py tool is designed to help plot various metrics collected through SoMeta monitors. To see what metrics may be plotted, you can run the following:
$ python3 plotmeta.py -l meta.json
where meta.json is a SoMeta metadata file. The output of plotmeta.py with the -l option shows the items that can be plotted. Items are organized into groups. You can either plot any number of individual items (-i option), or plot each metric for an entire group (-g option). If you want everything, use the -a option. In addition, the -t option can be used to change the type of output plot: use ecdf for an empirical CDF, or timeseries for a simple scatter plot over time (the default). See plotmeta.py -h for all options.
Here are some examples:
$ python3 plotmeta.py -t ecdf -i cpu:idle -i io:disk0_write_time meta.json
$ python3 plotmeta.py -t timeseries -g cpu meta.json
$ python3 plotmeta.py -a meta.json
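For readers unfamiliar with the ecdf plot type: an empirical CDF sorts the samples and assigns the i-th smallest (counting from 1) the cumulative fraction i/n. The Go sketch below shows that computation on the first five cpu0_idle values from the DataFrame printed earlier. It is an independent illustration, written in Go to match the rest of SoMeta, and is not derived from plotmeta.py.

```go
package main

import (
	"fmt"
	"sort"
)

// ecdf returns the samples in ascending order together with the step
// heights (i+1)/n of the empirical CDF. With tied values, the last of
// the ties carries the ECDF value for that sample.
func ecdf(samples []float64) (xs, ps []float64) {
	xs = append([]float64(nil), samples...) // copy so the input is untouched
	sort.Float64s(xs)
	ps = make([]float64, len(xs))
	for i := range xs {
		ps[i] = float64(i+1) / float64(len(xs))
	}
	return xs, ps
}

func main() {
	// First five cpu0_idle samples from the DataFrame shown earlier.
	idle := []float64{62.037037, 70.0, 82.828283, 74.0, 69.306931}
	xs, ps := ecdf(idle)
	for i := range xs {
		fmt.Printf("P(idle <= %.6f) = %.2f\n", xs[i], ps[i])
	}
}
```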
Changes from the earlier Python version of SoMeta:
- Because of Go's command-argument handling, flags to someta cannot be written like -Mcpu, but must rather be written as -M=cpu or -M cpu.
- CPU affinity is not yet implemented
- Metadata structure is changed to permit a less tightly-coupled architecture between the someta main and monitors
- The plotting tool hasn't been updated yet to handle these changes, though the basic analysis tool has been updated.
- There's even more rich data collected about the system when someta starts up
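The first point in the list above follows from how Go's standard flag package parses arguments, assuming that is what someta uses (the usage listing earlier has the format that package produces). The sketch below defines a repeatable -M flag through the flag.Value interface and shows that -M=cpu and -M cpu are accepted while -Mcpu is rejected as an undefined flag. The monitorList type and parseArgs function are hypothetical names for this example, not SoMeta's code.

```go
package main

import (
	"flag"
	"fmt"
	"io"
	"strings"
)

// monitorList collects every occurrence of -M. It satisfies flag.Value.
type monitorList []string

func (m *monitorList) String() string { return strings.Join(*m, " ") }

func (m *monitorList) Set(v string) error {
	*m = append(*m, v)
	return nil
}

// parseArgs parses args with a private FlagSet so the sketch does not
// depend on os.Args. Error output is discarded to keep the demo quiet.
func parseArgs(args []string) (monitorList, error) {
	var monitors monitorList
	fs := flag.NewFlagSet("someta-sketch", flag.ContinueOnError)
	fs.SetOutput(io.Discard)
	fs.Var(&monitors, "M", "monitor to include (may be repeated)")
	err := fs.Parse(args)
	return monitors, err
}

func main() {
	// Both accepted spellings, mixed in one command line.
	m, err := parseArgs([]string{"-M=cpu", "-M", "netstat:en0"})
	fmt.Println(m, err)

	// -Mcpu is read as a flag named "Mcpu", which is not defined.
	_, err = parseArgs([]string{"-Mcpu"})
	fmt.Println(err)
}
```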
v1.3
- Addition of yaml configuration method
- Some minor other code cleanup
- Documentation update
- Addition of commandlinetool monitor, which subsumes the contributed ss monitor
v1.4
- Addition of -I flag to create a new config and dump to stdout
I gratefully acknowledge support from the National Science Foundation. The materials here are based upon work supported by the NSF under grant 1814537 ("NeTS: Small: RUI: Automating Active Measurement Metadata Collection and Analysis").
Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author and do not necessarily reflect the views of the National Science Foundation.
Copyright 2018-21 SoMeta authors. All rights reserved.
The SoMeta software is distributed under terms of the GNU General Public License, version 3. See below for the standard GNU GPL v3 copying text.
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.