POKI_PUT_TOC_HERE

Parsing log-file output

This depends heavily on what’s in your log files. As an example, suppose you have log-file lines such as
POKI_CARDIFY(2015-10-08 08:29:09,445 INFO com.company.path.to.ClassName @ [sometext] various/sorts/of data {& punctuation} hits=1 status=0 time=2.378)HERE
I prefer to pre-filter with grep and/or sed to extract the structured text (here, the key=value pairs at the end of the line), then hand that to Miller. Example:
POKI_CARDIFY(grep 'various/sorts' *.log | sed 's/.*} //' | mlr --fs space --repifs --oxtab stats1 -a min,p10,p50,p90,max -f time -g status)HERE
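To make the pre-filtering step concrete, here is a rough Python analogue of what the sed and Miller stages do with the sample line. It is only a sketch and not part of Miller: the function names extract_fields and time_range_by_status are made up for illustration, and it computes just min and max rather than the percentiles.

```python
import re
from collections import defaultdict

# Sample log line copied from the text above.
LINE = (
    "2015-10-08 08:29:09,445 INFO com.company.path.to.ClassName @ [sometext] "
    "various/sorts/of data {& punctuation} hits=1 status=0 time=2.378"
)

def extract_fields(line):
    """Drop everything through the last '} ' (like sed 's/.*} //'),
    then split the remaining space-separated key=value pairs."""
    structured = re.sub(r".*\} ", "", line)
    return dict(pair.split("=", 1) for pair in structured.split())

def time_range_by_status(lines):
    """Group the 'time' field by 'status' and return (min, max) per group."""
    times = defaultdict(list)
    for line in lines:
        fields = extract_fields(line)
        times[fields["status"]].append(float(fields["time"]))
    return {status: (min(v), max(v)) for status, v in times.items()}

print(extract_fields(LINE))
print(time_range_by_status([LINE]))
```

The point is the same as in the shell pipeline: strip the free-form prefix first, so that what remains is regular enough to treat as records.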

Bulk rename of field names

POKI_RUN_COMMAND{{cat spaces.csv}}HERE
POKI_RUN_COMMAND{{mlr --csv --rs lf rename -r -g ' ,_' spaces.csv}}HERE
POKI_RUN_COMMAND{{mlr --csv --irs lf --opprint rename -r -g ' ,_' spaces.csv}}HERE

Filtering paragraphs of text

The idea is to use a record separator which is a pair of newlines. Then, if you want each paragraph to be a record with a single value, use a field separator which isn’t present in the input data (e.g. control-A, which is octal 001). Or, if you want each paragraph to have its lines as separate values, use newline as the field separator.
POKI_RUN_COMMAND{{cat paragraphs.txt}}HERE
POKI_RUN_COMMAND{{mlr --from paragraphs.txt --nidx --rs '\n\n' --fs '\001' filter '$1 =~ "the"'}}HERE
POKI_RUN_COMMAND{{mlr --from paragraphs.txt --nidx --rs '\n\n' --fs '\n' cut -f 1,3}}HERE
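The two separator choices can be sketched in Python as follows. This is an illustration of the idea rather than of Miller's implementation: the sample TEXT is invented (it is not the contents of paragraphs.txt), and the helper names are hypothetical.

```python
import re

# Invented sample input: three paragraphs separated by blank lines.
TEXT = (
    "the quick brown fox\njumped over\n\n"
    "a lazy dog\nslept\n\n"
    "over the hills\nand far away\n"
)

def paragraphs(text):
    """Split on pairs of newlines, so each paragraph is one record."""
    return [p for p in text.strip("\n").split("\n\n") if p]

def matching(text, pattern):
    """Whole paragraph as a single value: keep paragraphs matching a regex."""
    return [p for p in paragraphs(text) if re.search(pattern, p)]

def cut_lines(text, positions):
    """Lines as separate values: keep the lines at the given 1-based positions."""
    result = []
    for p in paragraphs(text):
        lines = p.split("\n")
        result.append([lines[i - 1] for i in positions if i <= len(lines)])
    return result

print(matching(TEXT, "the"))
print(cut_lines(TEXT, [1]))
```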

Doing arithmetic on fields with currency symbols

POKI_INCLUDE_ESCAPED(data/dollar-sign.txt)HERE

Program timing

This admittedly artificial example demonstrates using Miller’s time and stats functions to introspectively acquire some information about Miller’s own runtime. The delta function computes the difference between successive timestamps.
POKI_INCLUDE_ESCAPED(data/timing-example.txt)HERE

Using out-of-stream variables

One of Miller’s strengths is its compact notation: for example, given input of the form
POKI_RUN_COMMAND{{head -n 5 ../data/medium}}HERE
you can simply do
POKI_RUN_COMMAND{{mlr --oxtab stats1 -a sum -f x ../data/medium}}HERE
or
POKI_RUN_COMMAND{{mlr --opprint stats1 -a sum -f x -g b ../data/medium}}HERE
rather than the more tedious
POKI_INCLUDE_AND_RUN_ESCAPED(oosvar-example-sum.sh)HERE
or
POKI_INCLUDE_AND_RUN_ESCAPED(oosvar-example-sum-grouped.sh)HERE

The former (mlr stats1 et al.) is easier to type, less error-prone, and faster to run.

Nonetheless, out-of-stream variables (which I whimsically call oosvars), begin/end blocks, and emit statements give you the ability to implement logic, if you wish to do so, which isn’t present in other Miller verbs. (If you find yourself using the same out-of-stream-variable logic over and over, please file a request at https://github.com/johnkerl/miller/issues to get it implemented directly in C as a Miller verb of its own.)

The following examples compute some things using oosvars which are already computable using Miller verbs, by way of providing food for thought.

Mean with/without oosvars

POKI_RUN_COMMAND{{mlr stats1 -a mean -f x data/medium}}HERE
POKI_RUN_COMMAND{{mlr put -q '@x_sum += $x; @x_count += 1; end{@x_mean=@x_sum/@x_count; emit @x_mean}' data/medium}}HERE
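For readers more at home in a general-purpose language, the accumulate-then-emit pattern in the put -q expression above looks roughly like this in Python. It is a sketch for comparison only: the loop body plays the role of the per-record statements, and the return plays the role of the end block.

```python
def streaming_mean(values):
    """Accumulate a running sum and count, then compute the mean once at the end."""
    x_sum = 0.0    # plays the role of @x_sum
    x_count = 0    # plays the role of @x_count
    for x in values:
        # Per-record work: update the accumulators, output nothing yet.
        x_sum += x
        x_count += 1
    # End-of-stream work: derive the mean and hand it back.
    return x_sum / x_count

print(streaming_mean([1.0, 2.0, 3.0]))
```

Only the two accumulators are kept in memory, no matter how many records stream past.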

Variance and standard deviation with/without oosvars

POKI_RUN_COMMAND{{mlr --oxtab stats1 -a count,sum,mean,var,stddev -f x data/medium}}HERE
POKI_RUN_COMMAND{{cat variance.mlr}}HERE
POKI_RUN_COMMAND{{mlr --oxtab put -q -f variance.mlr data/medium}}HERE
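One common way to get variance in a single streaming pass is to accumulate the count, the sum, and the sum of squares, and combine them at the end. The Python sketch below shows that approach under two assumptions: that the sample (n-1) definition of variance is wanted, and that the input has at least two values. It is not a transcription of variance.mlr.

```python
import math

def streaming_var(values):
    """Return (mean, sample variance, standard deviation) from three accumulators."""
    n = 0
    total = 0.0
    total_sq = 0.0
    for x in values:
        n += 1
        total += x
        total_sq += x * x
    mean = total / n
    # Sum of squared deviations, rewritten in terms of the accumulators:
    # sum((x - mean)^2) = sum(x^2) - n * mean^2
    var = (total_sq - n * mean * mean) / (n - 1)
    return mean, var, math.sqrt(var)

print(streaming_var([1.0, 2.0, 3.0, 4.0]))
```

This form can lose precision when the mean is large relative to the spread of the data, which is worth keeping in mind before trusting it on real inputs.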

Min/max with/without oosvars

POKI_RUN_COMMAND{{mlr --oxtab stats1 -a min,max -f x data/medium}}HERE
POKI_RUN_COMMAND{{mlr --oxtab put -q '@min = min(@min, $x); @max = max(@max, $x); end{emitf @min, @max}' data/medium}}HERE

Delta with/without oosvars

POKI_RUN_COMMAND{{mlr --opprint step -a delta -f x data/small}}HERE
POKI_RUN_COMMAND{{mlr --opprint put '$x_delta = ispresent(@last) ? $x - @last : 0; @last = $x' data/small}}HERE

Keyed delta with/without oosvars

POKI_RUN_COMMAND{{mlr --opprint step -a delta -f x -g a data/small}}HERE
POKI_RUN_COMMAND{{mlr --opprint put '$x_delta = ispresent(@last[$a]) ? $x - @last[$a] : 0; @last[$a]=$x' data/small}}HERE
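The keyed version is the unkeyed one with the single remembered value replaced by a map from group key to remembered value. A rough Python analogue, using the field names a and x from the commands above and some invented sample records:

```python
def keyed_delta(records, field="x", key="a"):
    """Add <field>_delta to each record: the change in <field> since the
    previous record with the same <key>, or 0 for the first one seen."""
    last = {}   # plays the role of @last[$a]
    out = []
    for rec in records:
        k = rec[key]
        delta = rec[field] - last[k] if k in last else 0
        last[k] = rec[field]
        out.append({**rec, field + "_delta": delta})
    return out

# Invented sample records, not the contents of data/small.
RECORDS = [
    {"a": "pan", "x": 1},
    {"a": "eks", "x": 5},
    {"a": "pan", "x": 4},
]

for rec in keyed_delta(RECORDS):
    print(rec)
```

Unlike the mean and variance examples, nothing is deferred to the end of the stream: each record is emitted as soon as its delta is known.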

Exponentially weighted moving averages with/without oosvars

POKI_INCLUDE_AND_RUN_ESCAPED(verb-example-ewma.sh)HERE
POKI_INCLUDE_AND_RUN_ESCAPED(oosvar-example-ewma.sh)HERE
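For reference, the usual exponentially weighted moving average recurrence can be written in a few lines of Python. This sketch assumes the common convention that the first average equals the first value and that each later average is alpha times the new value plus (1 - alpha) times the previous average. The parameter name alpha is chosen here for illustration, and the scripts above may differ in their details.

```python
def ewma(values, alpha):
    """Return the running exponentially weighted moving average of values."""
    out = []
    prev = None  # the one piece of state carried from record to record
    for x in values:
        # Seed with the first value, then blend new values into the average.
        prev = x if prev is None else alpha * x + (1 - alpha) * prev
        out.append(prev)
    return out

print(ewma([1.0, 2.0, 3.0], 0.5))
```

As with the delta examples, the only state carried between records is the previous result, which is why an oosvar is enough to implement it.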