sfk ... +xex /from/ [/from2/]
a stream text filter using SFK Simple Expressions.
- takes text stream input from a previous command
or a single file.
- joins all lines into one large block that can be
searched as a whole.
- splits output again into lines for further use,
or passes output as binary to +xed
xed/xex is designed to post process small to medium sized
data streams or files. it is not suitable to edit large files
beyond 100 MB, as the whole content must fit into memory
multiple times. use "sfk xreplace" to process large files.
wildcards and SFK expressions
SFK Expressions are simple patterns containing literal text,
wildcards * and ? and character classes in square brackets [].
basically, the syntax provides extended wildcards but no
further logic and is not related to regular expressions.
search patterns are surrounded by a separator character which
can be anything not contained in the search text, like / or _
within a pattern /fromtext/totext/ the fromtext may contain:
*                    - 0 to 4000 characters in the same
                       text line or paragraph, i.e. all
                       bytes not being CR, LF or NULL.
                       4000 is just a default maximum
                       that can be changed by:
[0.100000 chars]     - 0 to 100000 characters in the same
                       text line or paragraph, i.e. the
                       same as * but with a larger range.
?                    - one character.
?????                - same as [5.5 chars] or [5 chars]
[bytes]              - 0 to 4000 bytes (with CR,LF,NULL)
                       i.e. it collects stream text
                       across lines, even in binary data
**                   - the same as [bytes].
[0.100 bytes]        - 0 to 100 bytes
[.100000 bytes]      - up to 100000 bytes
[1.* bytes]          - 1 to default maximum bytes
[2 chars]            - exactly 2 chars
[30 bytes]           - exactly 30 bytes
[byte of aeiou]      - one vowel (a OR A OR e OR ...),
                       case insensitive by default.
                       "aeiou" is a character list.
[byte of \\\x2f]     - a backslash \ or forward slash /
[bytes of \r\n \t]   - whitespace incl. line ends
[bytes of (\r\n \t)] - the same, () are optional
[bytes not \r\n\0]   - up to 4000 bytes as long as no
                       CR, LF or NULL byte appears
[chars]              - the same as [bytes not \r\n\0],
                       i.e. collect text in a line
[char not ( \t)]     - same as [byte not ( \r\n\0\t)],
                       everything not blanks and tabs
[char not )( \t]     - not brackets, blanks and tabs,
                       same as not (\(\) \t)
[chars of a-z0-9]    - means a-zA-Z0-9 as search is
                       case insensitive by default
[chars of \x61-\x7A] - search a-z but not A-Z, or use
                       option -case for case search
[eol]                - end of line by characters:
                       CRLF or LF or CR
[white]     = chars of (\t )     - 0 or more whitespaces
[xwhite]    = bytes of (\t \r\n) - same but across lines
[1 white]   = byte of (\t )      - 1 whitespace
[digit]     = byte of (0-9)      - 1 digit
[digits]    = bytes of (0-9)     - 0 or more digits
[hexdigit]  = byte of (0-9a-f)   - 1 hexadecimal digit
[hexdigits] = bytes of (0-9a-f)  - 0 or more hex digits
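for readers who know regular expressions, the token list above can be compared with rough regex counterparts. the following is only a sketch in Python and not how sfk works internally: SFK expressions are not regular expressions, the helper name sfk_to_regex is hypothetical, only a handful of tokens are covered, backslash escapes like \q or \* are not handled, and the lazy quantifiers only approximate the way a wildcard stops at the next literal.

```python
import re

# Splits a fromtext into tokens: **, *, runs of ?, [bracket tokens], literal text.
# Backslash escapes such as \q or \* are NOT handled in this sketch.
_TOKEN = re.compile(r"\*\*|\*|\?+|\[[^\]]*\]|[^*?\[]+")

# Approximate regex counterparts for a few SFK tokens (assumed, not official).
_CLASSES = {
    "**": r"[\s\S]{0,4000}?",            # any bytes, across lines
    "[bytes]": r"[\s\S]{0,4000}?",
    "*": r"[^\r\n\x00]{0,4000}?",        # characters within one line
    "[chars]": r"[^\r\n\x00]{0,4000}?",
    "[eol]": r"(?:\r\n|\n|\r)",
    "[white]": r"[\t ]*",
    "[xwhite]": r"[\t \r\n]*",
    "[digit]": r"[0-9]",
    "[digits]": r"[0-9]*",
}


def sfk_to_regex(fromtext):
    """Build a case insensitive regex with one capture group per token."""
    groups = []
    for tok in _TOKEN.findall(fromtext):
        if tok in _CLASSES:
            body = _CLASSES[tok]
        elif tok.startswith("?"):
            # a run of n question marks means exactly n characters
            body = r"[^\r\n\x00]{%d}" % len(tok)
        elif tok.startswith("["):
            raise ValueError("token not covered by this sketch: " + tok)
        else:
            body = re.escape(tok)
        groups.append("(" + body + ")")
    return re.compile("".join(groups), re.IGNORECASE)
```

with this sketch, sfk_to_regex("foo*bar") finds "HELLO" as group 2 in the text "xx fooHELLObar yy", but finds nothing in "foo\nbar", because * does not cross line ends while ** does.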
special keywords that do not count as tokens:
[skip] - at the start of a pattern: skip such text
completely, do not count it as a search hit.
[keep] - search also the following text but keep it
in the input data, without consuming it.
[ortext] - foo[ortext]bar searches word foo or bar.
[ortext] is allowed only between literals.
anchors that have no length of their own:
[start] - start of file
[end] - end of file
[lstart] - line start, i.e. start or CRLF or CR or LF
[lend] - logical line end, i.e. eol or end of file.
to replace line ends use [eol] instead.
how to search or replace special characters:
- to search or replace text containing the literal characters
* ? \ [ ] then these must be escaped like \* \? \\ \[ \]
- ( ) are escaped only within character lists, like \( \)
- to search or replace the forward slash '/' type \x2f or use
another char around from/to text, e.g. _fromtext_totext_
- parameters with blanks and non trivial characters need double
quotes "", see also "about Shell Command Characters" below.
expansion priorities: (highest first)
if two search parts are side by side, and the same input
character matches both, then these priorities apply:
5: start, end, lstart, lend
4: literal text, eol
3: whitelist classes: byte of, bytes of
2: blacklist classes: chars not, bytes not
1: plain wildcards: ?, *, **, byte, bytes, chars
this means in "/[bytes]foo/" the [bytes] stops collecting
characters as soon as "foo" is found, as "foo" is a literal.
on same or higher priority the right side stops the left side.
avoid overlapping character groups. for example, [chars][white]
cannot work, as space and tab are part of chars. to fix this
extend chars by relevant exclusions: [chars not ( \t)][white]
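the rule that a literal stops a plain wildcard can be pictured with a toy model. the Python sketch below is an assumption made for illustration only, it is not sfk's actual matching code, and the function name collect_until is hypothetical. it shows how a low priority part like [bytes] gives up collecting as soon as the higher priority literal to its right begins.

```python
def collect_until(data, start, stop_literal, max_len=4000):
    """Toy model of "/[bytes]foo/": collect text from data[start:] until
    stop_literal begins. Returns (collected, end) or None if the literal
    does not show up within max_len characters."""
    limit = min(len(data), start + max_len)
    pos = start
    # the wildcard has the lowest priority, so it is checked last:
    # first test whether the literal starts here, otherwise take one char.
    while pos < limit and not data.startswith(stop_literal, pos):
        pos += 1
    if not data.startswith(stop_literal, pos):
        return None
    return data[start:pos], pos
```

in this model collect_until("abc\nfoo rest foo", 0, "foo") collects "abc\n" and stops at the first "foo", not at the second one.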
the totext may contain:
[part 1]                  use first text part of the fromtext.
                          e.g. the fromtext /*foo[.100 chars]bar*/
                          contains five parts:
                          1 = *, 2 = foo, 3 = [.100 chars],
                          4 = bar, 5 = *
[part1]                   the same (blank is optional).
[parts 1,2,3]             use parts 1, 2 and 3.
[parts 1-10]              use parts 1 to 10.
[strip(part1,\0)]         use part 1 but remove zero bytes.
                          only zero bytes "\0" can be removed.
[file.name]               full input filename with path
[file.relname]            input filename without path
[file.path]               input file's path
[file.base]               relname without last .extension
[file.ext]                input filename extension
[all]                     use all parts from fromtext.
[setvar name]...[endvar]  set variable "name" with data
                          between setvar and endvar.
[getvar name]             fill in data from variable "name"
although anchors like lstart, lend count as a separate part
they need NOT be specified in the totext. this means that
/[lstart]foo[lend]/bar/ just changes the word "foo".
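part numbers can be hard to count by eye. the Python sketch below numbers the parts of a fromtext the way the examples in this text suggest: every wildcard, run of question marks, bracket token and literal is one part. this is an assumption derived from the examples, the helper name number_parts is hypothetical, and special keywords like [skip], [keep] and [ortext] as well as backslash escapes are not handled. the authoritative numbers are always those printed by sfk itself with -showparts.

```python
import re

# one part per: **, *, run of ?, [bracket token], or stretch of literal text
_PART = re.compile(r"\*\*|\*|\?+|\[[^\]]*\]|[^*?\[]+")


def number_parts(fromtext):
    """Return a list of (part number, token) pairs, counting from 1."""
    return list(enumerate(_PART.findall(fromtext), start=1))


if __name__ == "__main__":
    for number, token in number_parts("*foo[.100 chars]bar*"):
        print(number, token)
```

for "[lstart]foo[lend]" the sketch lists three parts with "foo" as part 2, which matches the statement that anchors count as separate parts.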
if replace loses line endings in output
- when using [eol] in most cases you should add [part...]
to the output pattern, to copy the actual found line
separators, or line endings may get lost.
supported slash patterns
\t = TAB
\r = CR
\n = LF
\x00 = one byte with code 00 hexadecimal
\0 = short form for \x00
\q = a double quote "
\\ = the backslash character \ itself
\[ = the bracket open character [
\] = the bracket close character ]
\* = the literal star character *
\? = the literal question mark ?
\- = to use literal "-" in a command
Within multi line -bylist files:
\ = slash+blank is changed to a single blank
Only within "char of" or "byte not" lists:
\( = to use literal character "("
\) = to use literal character ")"
SFK expression options
-showpart(s)  print /from/ part numbers, range statistics
              and expansion priority points per part.
              done automatically if a required /to/ text
              is not given with a command.
-showbest     if a /from/ pattern finds nothing, use this to
              see how many parts would match so far, and with
              up to how many bytes per part. anchors like [lstart]
              may show a non zero length when matching (CR)LF.
-showlist     with -bylist, show the internal joined list if
              commands are spread across multiple lines.
-showall      show all of the above.
-xmaxlen=n    set default maximum length for chars or bytes commands,
              e.g. -xmaxlen=10000 means /foo*bar/ matches with up to
              10000 characters between foo and bar. the default max
              length without this option is 4000 characters.
performance notes
- always use a string literal, or single byte or char, at the start
of your search expressions, like in /foo*bar/ starting with 'f'.
do not start with a wildcard like * as in /*foobar/
when searching huge input data, as your search will slow down by
a factor of 256. use /[lstart]*foobar/ instead.
- the system may cache output file(s), writing to disk in background
after sfk has finished. subsequent batch commands may execute slower.
options
-case          compare case sensitive, default is nocase.
               for further options see: sfk help nocase
-bylist x      read /from/to/ patterns from a file x,
               supporting multiple lines per pattern.
               for details type: sfk rep -full
-bylinelist x  read /from/to/ or just /from/ patterns
               from a file with one pattern per line.
               best for searching many phrases with
               simple or no output reformatting.
-i             process text stream from standard input
-tolines       force output as text lines. use this
               if you get unexpected hex data.
-nomark        do not highlight changes in output
-nocol         no colors at all to allow more memory
-write         if input filename is given, rewrite file
               with the changed data. produces an empty
               file if search patterns are not found.
-tofile f      write output to file f. do not use +tofile
               chaining as it splits data into text lines.
-rawterm       on output to terminal do not strip codes
               below 32. Null bytes are always stripped.
-dump[raw]     create hex dump [raw = w/o eol highlight]
-crlf, -lf     for file headers and default totext: force
               crlf or lf line endings instead of default
-justrc        print no output, just set return code.
-firsthit      use only first matching result.
chaining I/O support
extract ... +xed supports binary data transfer.
xed ... +xed supports binary data transfer.
In all other cases like xed ... +filter data is passed
as text lines without zero bytes and up to 4000 chars
per line. Binary transfer needs free memory of four times
the actual number of bytes passed.
unexpected hex data with xed chaining
if you use xed and get an unexpected hex output
like 746573746... it means a following command
cannot handle stream data. use option -tolines then.
unexpected line breaks with +tofile
- happen if lines are longer than 4096 chars.
  use -tofile instead.
- happen if data contains carriage return chars.
  add "/\r//" to remove them.
see also
sfk swap change single line character order
web access support
extracting the head section from a web page can be done like:
sfk xex http://192.168.1.100/ "_<head>**</head>_"
sfk xex http://.100/ "_<head>**</head>_"
sfk web .100 +xex "_<head>**</head>_"
archive file reading
xed may directly read archive file entries like
src.zip\\sub1.bz2\\sub2.tar.gz. for details and
limitations type "sfk help xe".
beware of Shell Command Characters.
to find or replace text patterns containing spaces or special
characters like <>|!&?* you must add quotes "" around parameters
or the shell environment will destroy your command. for example,
pattern /foo bar/other/ must be written like "/foo bar/other/".
within a .bat or .cmd file the percent % must be escaped like %%
even within quotes: sfk echo -spat "percent %% is a percent \x25"
unexpected repeat replace behaviour
depending on the input data and search/replace expressions,
it can happen that running the same replace multiple times
on the same stream produces further hits that didn't exist
in the first run. read the sfk replace extended help text
by "sfk replace -full" for details.
quoted multi line parameters are supported in scripts
using full trim. type "sfk script" for details.
return codes for batch files
0 = no matches, 1 = matches found, >1 = major error occurred.
see also "sfk help opt" on how to influence error processing.
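when sfk is called from a program instead of a batch file, the return codes above can be mapped to readable states. the following Python sketch assumes that an sfk binary is available on the PATH. the function names classify_return_code and run_xex are hypothetical, and treating every code other than 0 and 1 as an error is a simplification of the ">1" rule. passing the parameters as a list also avoids the shell quoting problems described under "beware of Shell Command Characters".

```python
import subprocess


def classify_return_code(rc):
    """Map an sfk return code to a short description:
    0 = no matches, 1 = matches found, anything else = error."""
    if rc == 0:
        return "no matches"
    if rc == 1:
        return "matches found"
    return "error"


def run_xex(args):
    """Run "sfk xex" with the given parameter list (assumes sfk on PATH).
    No shell is involved, so patterns need no extra quoting."""
    completed = subprocess.run(
        ["sfk", "xex", *args], capture_output=True, text=True
    )
    return classify_return_code(completed.returncode), completed.stdout
```

a call like run_xex(["in.txt", "-justrc", "_foo_"]) would then report whether in.txt contains "foo" without printing the hits.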
about example numbers with [brackets]
if you see [1] type "sfk cmd 1" for whole command in one line.
more in the SFK Book
the SFK Book contains a 60 page tutorial, including
detailed xed examples with input, script and output.
type "sfk book" for details.
examples
Note: also see "sfk xed" for further examples.
sfk xex in.txt "_foo*bar_[part2]\n_"
extract any text found within the same line between
foo and bar, using "_" as separator character instead
of "/". you may leave out the third "_" to get
an info text listing part numbers.
sfk xex in.txt "_\qfoo\q[.100 bytes]\qbar\q_[all]\n_"
extract any text starting with "foo" enclosed by double
quotes, then having up to 100 bytes (including CR or LF,
i.e. across multiple text lines), then ending with bar
enclosed by double quotes, and print all parts.
sfk xex in.cpp "/printf([bytes]);/[all]\n/"
+xed "/);[eol]/[all]/" "/[eol][.100 bytes of \x20]/ /"
collect all (multi line) printf statements from a text
and reformat them as one statement per line. notice
that "/);[eol]/[all]/" is a cover pattern, meaning
it does not change anything, but keeps line endings
after ");" from being changed by other patterns. [1]
sfk xex in.xml "_<row>[xwhite]<artist>*</[bytes]<album>*</
[bytes]<track>*</_[part4]\t[part8]\t[part12]\n_"
if in.xml contains simple xml data like:
<row><artist>foo</artist><album>bar</album>
<track>foobar</track></row>
then reformat this to tab separated csv data. [2]
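for comparison, the same reformatting can be sketched in Python with a regular expression. this is only a rough counterpart under assumptions: sfk does not use regular expressions, the function name rows_to_tsv is hypothetical, and the three capture groups stand in for parts 4, 8 and 12 of the sfk pattern.

```python
import re

# <row>[xwhite]<artist>*</[bytes]<album>*</[bytes]<track>*</
# the three groups correspond to sfk parts 4, 8 and 12.
_ROW = re.compile(
    r"<row>[\t \r\n]*"
    r"<artist>([^\r\n\x00]*?)</[\s\S]*?"
    r"<album>([^\r\n\x00]*?)</[\s\S]*?"
    r"<track>([^\r\n\x00]*?)</",
    re.IGNORECASE,
)


def rows_to_tsv(xml_text):
    """Return one tab separated line per <row> found in xml_text."""
    lines = []
    for match in _ROW.finditer(xml_text):
        lines.append("\t".join(match.groups()) + "\n")
    return "".join(lines)
```

applied to the two line sample shown above, rows_to_tsv returns a single line with foo, bar and foobar separated by tabs.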
sfk xex in.csv "_[lstart]*\t*\t*_<row>\n <artist>[part2]
</artist>\n <album>[part4]</album>\n <track>[part6]</track>\n
</row>\n_"
if in.csv contains tab separated data like:
artistname{TAB}albumname{TAB}trackname
then reformat this to xml data. [3]
dir | sfk xex -i -bylist dirtags.txt
reformat windows 'dir' command output like:
05.12.2013 19:17 <DIR> myproj
28.01.2010 22:08 197 readme.txt
using a bylist file dirtags.txt like:
/??.??.????[white]??:??[white]<dir>[white]*
/DIR : [part13]\n/.
/??.??.????[white]??:??[white][digit][* not ( )][white]*
/file: [part14]\n/.
producing output:
DIR : myproj
file: readme.txt
sfk xex in.hpp "/bool[xwhite]bCl*;/" "/int[xwhite]iCl*;/"
extract variable declarations like "bool bClDone;"
or "int iClCounter;" from source code, including
statements across multiple lines.
sfk xex in.html "_<head>**</head>_"
extract head section from an html file. notice that "_" is
used as the separator, as "/" is part of the text.
sfk xex in.txt "/[lstart][4 chars][15 chars][15 chars]*/
[part2]\t[part3]\t[part4]\n/"
+xed "/[white][char of (\t\r\n)]/[part2]/" +tabtocsv
extract from fixed column data like below: [4]
7936JAMES FOO ATLANTA 20140129
the first three columns as comma separated data like
7936,JAMES FOO,ATLANTA
sfk xex in.zip\\sub1.tar.bz2\\sub2.tar.gz\\Trace.hpp "/class*/"
XE: extract phrases starting with "class" from a
.tar.gz within a .tar.bz2 within a .zip file.
XD: demo reads first 1000 bytes from sub2.tar.gz
sfk xex in.txt "/rel: [02 digits].[02 digits].[04 digits]
/[setvar date][parts 2,3,4,5,6][endvar]/" +getvar
searches a phrase like "rel: 03.09.2016" within in.txt
and stores it as an sfk variable "date". the +getvar
prints all defined variables with their content. [7]
sfk xex in.xml "_<zone>**<id>*</id>_[part"
type this incomplete command to get part number infos.
then complete the command like:
sfk xex in.xml "_<zone>**<id>*</id>_[part4]\n_" +filt -line=3
+setvar zoneid +echo -var ".100/start.php?zone=#(zoneid)"
from an xml file like
<zone><id>3</id></zone><zone><id>1</id></zone>
<zone><id>8</id></zone><zone><id>2</id></zone>
get the 3rd id and create an http URL using echo.
add +tweb to execute the web request. [9]
sfk xex foo.h +setvar a +then xed bar.c
"/[lstart]#include \qfoo.h\q*[eol]/[getvar a]/"
replace a text line: #include "foo.h"
within file bar.c by the file content of foo.h
sfk -var setvar a="foo bar" +echo -pure "#(a)"
+xex -justrc "_foo_" +if "rc=1" tell "got foo"
check if variable a contains 'foo' by xex.
can be extended to check for multiple, flexible
expression patterns in parallel.
sfk -var setvar a="foo bar"
+if "#(contains(a,'foo')) = 1" tell "got foo"
check if variable a contains 'foo' directly.
fast but only one static text pattern.
for details, type: sfk help var
sfk xex in.xml "/[skip]<[chars not >]>/" /work/
search 'work' in text data of in.xml but
not in tag names like <workbook>
sfk echo aabbccdd +xed "/[2 chars][2 chars]
[2 chars][2 chars]/[parts 4,3,2,1]/"
produces ddccbbaa, i.e. it swaps 4 blocks of
2 chars each. (little endian conversion)
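the block swap of the last example can be reproduced outside sfk for checking results. the Python sketch below is an illustration under assumptions: the function name swap_blocks is hypothetical, and unlike the sfk pattern, which handles exactly four blocks per hit, it reverses all blocks of the whole input string.

```python
def swap_blocks(text, block_size=2):
    """Split text into blocks of block_size chars and reverse their order."""
    blocks = [text[i:i + block_size] for i in range(0, len(text), block_size)]
    return "".join(reversed(blocks))


if __name__ == "__main__":
    print(swap_blocks("aabbccdd"))  # prints ddccbbaa
```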



