MAN page from CentOS Other silk-common-3.19.1-3.el8.x86_64.rpm


Section: SiLK Tool Suite (1)
Updated: 2021-01-04


NAME

rwfileinfo - Print information about a SiLK file


SYNOPSIS

  rwfileinfo [--fields=FIELDS] [--summary] [--no-titles]
        [--site-config-file=FILENAME]
        {--xargs | --xargs=FILENAME | FILE [FILE...]}

  rwfileinfo --help

  rwfileinfo --help-fields

  rwfileinfo --version


DESCRIPTION

rwfileinfo prints information about a binary SiLK file that can be determined by reading the file's header and by moving quickly over the data blocks in the file.

rwfileinfo requires one or more filename arguments to be given on the command line or the use of the --xargs switch. When the --xargs switch is provided, rwfileinfo reads the names of the files to process from the named text file or from the standard input if no file name argument is provided to the switch. The input to --xargs must contain one file name per line. rwfileinfo does not read a SiLK file's content from the standard input by default, but it does when either "-" or "stdin" is given as a filename argument.
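The one-name-per-line input that --xargs expects can be prepared programmatically. The following is a minimal Python sketch, not part of SiLK; the helper name build_xargs_invocation and the file names a.rw and b.rw are hypothetical.

```python
import shlex


def build_xargs_invocation(paths):
    """Return (argv, stdin_text) for running rwfileinfo with --xargs.

    The file names are meant to be fed on standard input, one per line,
    so a name that itself contains a newline cannot be represented.
    """
    for path in paths:
        if "\n" in path:
            raise ValueError(f"file name contains a newline: {path!r}")
    argv = ["rwfileinfo", "--xargs"]
    stdin_text = "".join(path + "\n" for path in paths)
    return argv, stdin_text


# Hypothetical file names, used only for illustration.
argv, stdin_text = build_xargs_invocation(["a.rw", "b.rw"])
print(shlex.join(argv))
print(stdin_text, end="")
```

On a host where rwfileinfo is installed, the pair could be handed to subprocess.run(argv, input=stdin_text, text=True).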

When the --summary switch is given, rwfileinfo first prints the information for each individual file and then prints the number of files processed, the sum of the individual file sizes, and the sum of the individual record counts.
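The three totals reported by --summary amount to a count and two sums. The Python sketch below models only that arithmetic and is not the tool's implementation; the FileInfo class, the file names, and the second file's values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    """Per-file values as rwfileinfo would report them (hypothetical container)."""
    name: str
    file_size: int
    count_records: int


def summarize(infos):
    """Compute the three totals described for the --summary switch."""
    return {
        "files": len(infos),
        "total_size": sum(info.file_size for info in infos),
        "total_records": sum(info.count_records for info in infos),
    }


# Illustrative values only; the file names are made up.
infos = [FileInfo("a.rw", 572, 7), FileInfo("b.rw", 936, 14)]
summary = summarize(infos)
print(summary)
```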

Field Descriptions

By default, rwfileinfo prints the following information for each file argument. Use the --fields switch to modify which pieces of information are printed.

(rwfileinfo prints each field in the order in which support for that field was added to SiLK. The field descriptions are presented here in a more logical order.)

file-size
    The size of the file on disk as reported by the operating system. rwfileinfo prints 0 for the file-size when reading from the standard input.

version
    Every binary file written by SiLK has a version number field. Since SiLK 1.0.0, the version number field has been used to indicate the general structure (or layout) of the file. The file structure adopted in SiLK 1.0.0 uses a version number of 16 and has a header section and a data section. The header section begins with 16 bytes that specify well-defined values, and those bytes are followed by one or more variably-sized header entries. The specifics of the data section depend on the content of the file.

header-length
    The header-length field shows the number of octets required by the header (i.e., the initial 16 bytes and the header entries). Since everything after the header is data, the header-length is the starting offset of the data section. The smallest header length is 24 bytes, but typically the header is padded to be an integer multiple of the record-length. The header-length that rwfileinfo prints for a file is determined dynamically by reading the file's header.

silk-version
    When a SiLK tool creates a binary file, the tool writes the current SiLK release number (such as 3.9.0) into the file's header as a way to help diagnose issues should a bug with a particular release of SiLK be discovered in the future.

byte-order
    Every SiLK file has a byte-order or endian field. SiLK uses the machine's native representation of integers when writing data, and this field shows what representation the file contains. "BigEndian" is network byte order and "littleEndian" is used by Intel chips. The rwswapbytes(1) tool changes a file's integer representation, and some tools have a --byte-order switch that allows the user to specify the integer representation of output files. The header-section of a file is always written in network byte order.

compression
    SiLK tools may use the zlib library, the LZO library, or the snappy library to compress the data section of a file. The compression field specifies which library (if any) was used to compress the data section. If a file is compressed with a library that was not included in an installation of SiLK, SiLK is unable to read the data section of the file. Many SiLK tools accept the --compression-method switch to choose a particular compression method. (The compression field does not indicate whether the entire file has been compressed with an external compression utility such as gzip(1).)

format
    Every binary file written by SiLK has two fields in the header that specify exactly what the file contains: the format and the record-version. In general, the format indicates the content type of the file and the record-version indicates the evolution of that content.

    The contents of a file whose format is "FT_IPSET", "FT_RWBAG", or "FT_PREFIXMAP" are fairly obvious (an IPset, a Bag, a prefix map).

    There are many different file formats for writing SiLK Flow records, but the SiLK analysis tools largely use a single Flow file format. That format is "FT_RWIPV6ROUTING" if SiLK has been compiled with IPv6 support, or "FT_RWGENERIC" otherwise. A file that uses the "FT_RWGENERIC" format is only capable of holding IPv4 addresses.

    The other SiLK Flow file formats are created by rwflowpack(8) as it writes flow records to the repository. These formats often omit fields and use reduced bit-sizes for fields to reduce the space required for an individual flow record.

    The record-version field indicates changes within the general type specified by the format field. For example, SiLK incremented the record-version of the formats that hold flow records when the resolution of record timestamps was changed from seconds to milliseconds.

record-version
    Together with the format field, specifies the contents of the file. See the discussion of format for details.

record-length
    Files created by SiLK 1.0.0 and later have a record length field. This field contains the length of an individual record, and this value is dependent on the format and record-version fields described above. Some files (such as those containing IPsets or prefix maps) do not write individual records to the output, and the record length is 1 for these files.

count-records
    The count-records field is generated dynamically by determining the length the data section would require if it were completely uncompressed and dividing it by the record-length. When the record-length is 1 (such as for IPset files), the count-records field does not provide much information beyond the length of the uncompressed data. For an uncompressed file, adding header-length to the product of count-records and record-length is equal to the file-size.
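The relationship among header-length, record-length, count-records, and file-size for an uncompressed file can be checked with a few lines of Python. This is a sketch of the arithmetic only; the function names are hypothetical, and the numbers are the ones printed in the first example of this page (header-length 208, record-length 52, count-records 7, file-size 572).

```python
def count_records(uncompressed_data_length, record_length):
    """Divide the uncompressed data length by the record length."""
    records, leftover = divmod(uncompressed_data_length, record_length)
    if leftover:
        raise ValueError("data length is not a whole number of records")
    return records


def uncompressed_file_size(header_length, record_length, records):
    """header-length plus count-records times record-length."""
    return header_length + records * record_length


header_length, record_length, file_size = 208, 52, 572

records = count_records(file_size - header_length, record_length)
print(records)
print(uncompressed_file_size(header_length, record_length, records))

# In this example the header is also padded to a multiple of the record length.
print(header_length % record_length == 0)
```

For a compressed file the same division applies to the uncompressed length of the data section, so the on-disk file-size can no longer be reconstructed this way.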

The fields given above are either present in the well-defined header or are computed by reading the file.
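The difference between the two byte-order values can be illustrated with Python's struct module. This sketch shows a single 32-bit integer, not the layout of any SiLK record; the helper swap32 is hypothetical and only mimics, for one value, the kind of conversion that rwswapbytes(1) performs on a file.

```python
import struct
import sys

value = 0x01020304

big = struct.pack(">I", value)      # network byte order ("BigEndian")
little = struct.pack("<I", value)   # "littleEndian"
print(big.hex())
print(little.hex())


def swap32(raw):
    """Reinterpret 4 big-endian bytes as the same integer in little-endian bytes."""
    (number,) = struct.unpack(">I", raw)
    return struct.pack("<I", number)


# The label a file written on this machine would carry, using the names from this page.
native_label = {"big": "BigEndian", "little": "littleEndian"}[sys.byteorder]
print(native_label)
```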

The following fields are generated by reading the header entries and determining if one or more header entries of the specified type are present. The field is not printed in the output when the header entry is not present in the file.

command-lines
    Many of the SiLK tools write a header entry to the output file that contains the command line invocation used to create that file, and some of the SiLK tools also copy the command line history from their input files to the output file. (The --invocation-strip switch on the tools can be used to prevent copying and recording of the invocation.) The command lines are stored in individual header entries and this field displays those entries with the most recent invocation at the end of the list.

    The command line history has a couple of issues:

    *   When multiple input files are used to create a single output, the entries are stored as a list, and this makes it difficult to know which set of command line entries is associated with which input file.

    *   When a SiLK tool creates multiple output files (e.g., when using both --pass and --fail to rwfilter(1)), the tool writes the same command line entry to each output file. Some context in addition to the command line history may be needed to know which branch of that tool a particular file represents.

annotations
    Most of the SiLK tools that create binary output files provide the --note-add and --note-file-add switches, which allow an arbitrary annotation to be added to the header of a file. Some tools also copy the annotations from the source files to the destination files. The annotations are stored in individual header entries and this field displays those entries.

ipset
    The IPset writing tools (rwset(1), rwsetbuild(1), rwsettool(1), rwaggbagtool(1), and rwbagtool(1)) support the following output formats for IPset data structures:

    Record Version 2
        May hold only IPv4 addresses and does not have an ipset header entry.

    Record Version 3
        May hold IPv4 or IPv6 addresses and is readable by SiLK 3.0 and later. It contains a header entry that describes the IPset data structure, and the entry specifies the number of nodes, the number of branches from each node, the number of leaves, the size of the nodes and leaves, and which node is the root of the tree.

    Record Version 4
        May hold IPv4 or IPv6 addresses and is readable by SiLK 3.7 and later. The file's header entry specifies whether the file contains IPv4 addresses or IPv6 addresses.

    Record Version 5
        May hold only IPv6 addresses and is readable by SiLK 3.14 and later. The header entry specifies that the file contains IPv6 data.

bag
    Since SiLK 3.0.0, the tools that write binary Bag files (rwbag(1), rwbagbuild(1), and rwbagtool(1)) have written a header entry that specifies the type and size of the key and of the counter in the file.

aggregate-bag
    The tools rwaggbag(1), rwaggbagbuild(1), and rwaggbagtool(1) write a header entry that contains the field types that comprise the key and the counter.

prefix-map
    When using rwpmapbuild(1) to create a prefix map file, a string that specifies a mapname may be provided. rwpmapbuild writes the mapname to a header entry in the prefix map file. The mapname is used to generate command line switches or field names when the --pmap-file switch is specified to several of the SiLK tools (see pmapfilter(3) for details). When displaying the mapname, rwfileinfo prefixes it with the string "v1:" which denotes a version number for the prefix-map header entry. (The version number is printed for completeness.)

packed-file-info
    When rwflowpack(8) creates a SiLK Flow file for the repository, all the records in the file have the same starting hour, the same sensor, and the same flowtype (class/type pair). rwflowpack writes a header entry to the file that contains these values, and this field displays those values. (To print the names for the sensor and flowtype, the silk.conf(5) file must be accessible.)

probe-name
    When flowcap(8) creates a SiLK flow file, it adds a header entry specifying the name of the probe from which the data was collected.


OPTIONS

Option names may be abbreviated if the abbreviation is unique or is an exact match for an option. A parameter to an option may be specified as --arg=param or --arg param, though the first form is required for options that take optional parameters.
--fields=FIELDS
    Specify what information to print for each file argument on the command line. FIELDS is a comma separated list of field-names, field-integers, and ranges of field-integers; a range is specified by separating the start and end of the range with a hyphen (-). Field-names are case-insensitive and may be shortened to a unique prefix. When the --fields option is not given, all fields are printed if the file contains the necessary information. The fields are always printed in the order they appear here regardless of the order they are specified in FIELDS.

    The possible field values are given next with a brief description of each. For a full description of each field, see ``Field Descriptions'' above.

    format,1
        The contents of the file as a name and the corresponding hexadecimal ID.

    version,2
        An integer describing the layout or structure of the file.

    byte-order,3
        Either "BigEndian" or "littleEndian" to indicate the representation used to store integers in the file (network or non-network byte order).

    compression,4
        The compression library (if any) used to compress the data-section of the file, specified as a name and its decimal ID.

    header-length,5
        The octet length of the file's header; alternatively the offset where data begins.

    record-length,6
        The octet length of a single record or the value 1 if the file's content is not record-based.

    count-records,7
        The number of records in the file, computed by dividing the uncompressed data length by the record-length.

    file-size,8
        The size of the file on disk as reported by the operating system.

    command-lines,9
        The command line invocation used to generate this file.

    record-version,10
        The version of the records contained in the file.

    silk-version,11
        The release of SiLK that wrote this file.

    packed-file-info,12
        For a repository Flow file generated by rwflowpack(8), this prints the timestamp of the starting hour, the flowtype, and the sensor of each flow record in the file.

    probe-name,13
        For a Flow file generated by flowcap(8), the name of the probe where the flow records were initially collected.

    annotations,14
        The notes (annotations) that users have added to the file's header.

    prefix-map,15
        For a prefix map file, the "mapname" that was set when the file was created by rwpmapbuild(1).

    ipset,16
        For an IPset file whose record-version is 3, a description of the tree data structure. For an IPset file whose record-version is 4, the type of IP addresses (IPv4 or IPv6).

    bag,17
        For a bag file, the type and size of the key and of the counter.

    aggregate-bag,18
        For an aggregate bag file, the field types that comprise the key and the counter.

--summary
    After the data for each individual file is printed, print a summary that shows the number of files processed, the sum of the individual file sizes, and the total number of records contained in those files.

--no-titles
    Suppress printing of the file name and field names. The output contains only the values, where each value is printed left-justified on a single line.

--site-config-file=FILENAME
    Read the SiLK site configuration from the named file FILENAME. When this switch is not provided, rwfileinfo searches for the site configuration file in the locations specified in the ``FILES'' section.

--xargs
--xargs=FILENAME
    Read the names of the input files from FILENAME or from the standard input if FILENAME is not provided. The input is expected to have one filename per line. rwfileinfo opens each named file in turn and prints its information as if the filenames had been listed on the command line. Since SiLK 3.15.0.

--help
    Print the available options and exit.

--help-fields
    Print a description of each field, its alias, and exit.

--version
    Print the version number and information about how SiLK was configured, then exit the application.
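The FIELDS syntax accepted by --fields (names, unique prefixes, integers, and hyphenated integer ranges, always emitted in a fixed order) can be modelled with a short Python sketch. This is not the parser rwfileinfo uses. The FIELD_NAMES list and the assumption that field-integers follow the order in which the fields are listed under --fields are illustrative; the output of rwfileinfo --help-fields is the authoritative list.

```python
# Assumed field order; position N in this list is treated as field-integer N.
FIELD_NAMES = [
    "format", "version", "byte-order", "compression", "header-length",
    "record-length", "count-records", "file-size", "command-lines",
    "record-version", "silk-version", "packed-file-info", "probe-name",
    "annotations", "prefix-map", "ipset", "bag", "aggregate-bag",
]


def resolve_name(token):
    """Map a case-insensitive name or unique prefix to its field-integer."""
    token = token.lower()
    if token in FIELD_NAMES:
        return FIELD_NAMES.index(token) + 1
    matches = [i + 1 for i, name in enumerate(FIELD_NAMES) if name.startswith(token)]
    if len(matches) != 1:
        raise ValueError(f"unknown or ambiguous field name: {token!r}")
    return matches[0]


def parse_fields(spec):
    """Return the selected field names in their fixed output order."""
    selected = set()
    for token in spec.split(","):
        token = token.strip()
        if not token:
            raise ValueError("empty entry in field list")
        parts = token.split("-", 1)
        if token.isdigit():
            low = high = int(token)
        elif len(parts) == 2 and all(part.isdigit() for part in parts):
            low, high = int(parts[0]), int(parts[1])
        else:
            low = high = resolve_name(token)
        if not 1 <= low <= high <= len(FIELD_NAMES):
            raise ValueError(f"field range out of bounds: {token!r}")
        selected.update(range(low, high + 1))
    return [FIELD_NAMES[i - 1] for i in sorted(selected)]


print(parse_fields("count-records,1-3"))
print(parse_fields("FILE"))
```

Note that the result is ordered by field position, not by the order given in the argument, which mirrors the behaviour described for --fields.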


EXAMPLES

In the following examples, the dollar sign ("$") represents the shell prompt. The text after the dollar sign represents the command line.

Get information about the file

 $ rwfileinfo
   format(id)          FT_RWGENERIC(0x16)
   version             16
   byte-order          littleEndian
   compression(id)     none(0)
   header-length       208
   record-length       52
   record-version      5
   silk-version        1.0.1
   count-records       7
   file-size           572
   command-lines
                    1  rwfilter --proto=6 ...
   annotations
                    1  This is some interesting TCP data

Return a single value which is the number of records in the file:

 $ rwfileinfo --no-titles --field=count-records
 7
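Output in the titled form shown in the first example can be post-processed by a script. The Python sketch below parses a copy of that example into a dictionary. It is an illustration under assumptions: the function name parse_rwfileinfo is hypothetical, the exact column spacing in SAMPLE is a reconstruction, and the leading file-name line that real output may contain is not handled.

```python
import re

# Field lines copied from the first example; the spacing is a reconstruction.
SAMPLE = """\
  format(id)          FT_RWGENERIC(0x16)
  version             16
  byte-order          littleEndian
  compression(id)     none(0)
  header-length       208
  record-length       52
  record-version      5
  silk-version        1.0.1
  count-records       7
  file-size           572
  command-lines
                   1  rwfilter --proto=6 ...
  annotations
                   1  This is some interesting TCP data
"""


def parse_rwfileinfo(text):
    """Map each field name to a string, or to a list for multi-entry fields."""
    info = {}
    current = None  # name of the multi-entry field being collected, if any
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        entry = re.match(r"(\d+)\s+(.*)", stripped)
        if entry and current is not None:
            info[current].append(entry.group(2))
            continue
        parts = stripped.split(None, 1)
        if len(parts) == 2:
            info[parts[0]] = parts[1]
            current = None
        else:
            current = parts[0]
            info[current] = []
    return info


info = parse_rwfileinfo(SAMPLE)
print(info["count-records"])
print(info["command-lines"])
```

When only one value is needed, the --no-titles form shown above avoids parsing altogether.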


ENVIRONMENT

SILK_CONFIG_FILE
    This environment variable is used as the value for the --site-config-file when that switch is not provided.

SILK_DATA_ROOTDIR
    This environment variable specifies the root directory of the data repository. As described in the ``FILES'' section, rwfileinfo may use this environment variable when searching for the SiLK site configuration file.

SILK_PATH
    This environment variable gives the root of the install tree. When searching for configuration files, rwfileinfo may use this environment variable. See the ``FILES'' section for details.
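The precedence between the --site-config-file switch and its environment variable can be sketched in Python. This models only the first step described above: the switch wins, otherwise the environment variable supplies the value. The variable name SILK_CONFIG_FILE is an assumption here, the function name and paths are hypothetical, and the further search through the locations in the ``FILES'' section is deliberately not modelled.

```python
import os


def site_config_from_switch_or_env(switch_value=None, environ=None):
    """Return the configured site file path, or None if neither source sets one.

    SILK_CONFIG_FILE is assumed to be the variable that stands in for
    --site-config-file when the switch is absent.
    """
    environ = os.environ if environ is None else environ
    if switch_value:
        return switch_value
    return environ.get("SILK_CONFIG_FILE") or None


# Hypothetical paths, used only for illustration.
print(site_config_from_switch_or_env("/tmp/site.conf", {"SILK_CONFIG_FILE": "/etc/silk.conf"}))
print(site_config_from_switch_or_env(None, {"SILK_CONFIG_FILE": "/etc/silk.conf"}))
print(site_config_from_switch_or_env(None, {}))
```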


FILES

Possible locations for the SiLK site configuration file which are checked when the --site-config-file switch is not provided.


SEE ALSO

rwfilter(1), rwaggbag(1), rwaggbagbuild(1), rwaggbagtool(1), rwbag(1), rwbagbuild(1), rwbagtool(1), rwpmapbuild(1), rwset(1), rwsetbuild(1), rwsettool(1), rwswapbytes(1), silk.conf(5), pmapfilter(3), flowcap(8), rwflowpack(8), silk(7), gzip(1)


