Replace text in files using wildcards: sfk xreplace for Windows, Mac OS X and Linux

Replace text in files on the command line using wildcards and Simple Expressions with Swiss File Knife XE for Windows, Mac OS X and Linux.

sfk xreplace dirName "/searchtext/totext/"

replace in text and binary files using wildcards * and ?
as well as SFK Simple Expressions in brackets [].

Multiple search patterns are executed in the given sequence. Mind this
if they overlap, e.g. /foo/bar/ /foosys/thesys/ makes no sense (foo is
replaced by the first expression, so the 2nd one will fail to match).
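
The ordering issue can be shown outside of SFK. The following Python
sketch is only an analogy built on plain str.replace, not SFK's own
matching code. The helper name apply_in_sequence is made up for this
illustration, and unlike SFK's default it compares case-sensitively.

```python
def apply_in_sequence(text, patterns):
    """Apply (src, dst) pairs one after another, like a pattern list."""
    for src, dst in patterns:
        text = text.replace(src, dst)
    return text

text = "foo and foosys"

# Short pattern first: the "foo" inside "foosys" is consumed before
# the second pattern runs, so the foosys pattern never matches.
overlapping = apply_in_sequence(text, [("foo", "bar"), ("foosys", "thesys")])

# Longer pattern first: both replacements take effect.
reordered = apply_in_sequence(text, [("foosys", "thesys"), ("foo", "bar")])

print(overlapping)  # bar and barsys
print(reordered)    # bar and thesys
```

Listing the longer, more specific pattern first avoids the overlap.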

by default, replace functions run in SIMULATION mode,
   previewing hits without changing anything. add -yes to apply changes.
   Changing binaries may lead to unpredictable results, therefore keep
   backups of your files in any case.

subdirectories are included by default
   the sfk default for most commands is to process the given directories,
   as well as all subdirs within them. specify -nosub to disable this.

options
   -nosub        do not include files in subdirectories.
   -nobin[ary]   skip binary files.
   -case         case-sensitive text comparison. default is insensitive.
                 for details type: sfk help nocase
   -pat          starts a list of search or replace patterns of the form
                 xsrcxdstx where x is the separator char, src the source
                 to search for, and dst the destination to replace it with.
                 e.g. /foo/bar/ or _foo_bar_ both replace foo by bar.
                 -pat is not required if a single filename is given.
   -text         the same as -pat, starting a text pattern list.
   -bylist x.txt read search patterns from a file x.txt, supporting
                 multiple lines per pattern. (add -full for more.)
   -bylinelist x read /from/to/ or just /from/ patterns from a file x
                 with one pattern per line. (add -full for more.)
                 -by(line)list does not support sfk variables.
                 to use variables in patterns create an sfk script
                 with patterns as parameters. "sfk script" for more.
   -usetmp       allow creation of temporary files if output data is larger
                 than the memory limit (default: 300 MB). without -usetmp,
                 SFK uses the whole RAM, but stops with an error if it runs
                 out of memory.
   -memlimit=n   use up to n mbytes of RAM to store output data, and when
                 the limit is reached, use a temporary file. this option
                 implies -usetmp. to set this permanently by environment,
                 type in a batch file: set SFK_CONFIG=memlimit:n
   -tmpdir x     set directory x as temporary file directory. default is
                 to use the path specified by TEMP or TMP env variable.
   -showtmp      tell verbosely which temporary files are created.
                 SFK temporary filenames contain the process ID
                 to make sure multiple SFK running in parallel
                 do not use the same temporary file.
   -notmp        never create temporary files (default). if combined with
                 -memlimit, sfk stops with an error if memlimit is reached.
   -recsize      set input record size for processing (default=100k).
                 xreplace, xfind and xhexfind extend this automatically
                 based on the largest search patterns.
   -firsthit     process only first found pattern match per file.
   -quiet        do not show progress infos.
   -stat         show statistics like hits per pattern and no. of files.
   -perf         show performance statistics.
   -full         print full help text telling about -bylist pattern files,
                 special character case sensitivity and nested or repeated
                 replace behaviour.
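
The xsrcxdstx form described under -pat can be pictured with a small
Python sketch. This is a simplified illustration, not SFK's parser:
the function name split_pattern is hypothetical, and slash patterns
or escaped separators are deliberately ignored.

```python
def split_pattern(pattern):
    """Split 'xsrcxdstx' into (src, dst); dst is None for 'xsrcx'."""
    if len(pattern) < 3:
        raise ValueError("pattern too short")
    sep = pattern[0]                  # first char is the separator
    if pattern[-1] != sep:
        raise ValueError("pattern must end with its separator")
    fields = pattern[1:-1].split(sep)
    if len(fields) == 1:
        return fields[0], None        # search-only pattern
    if len(fields) == 2:
        return fields[0], fields[1]   # search and replace
    raise ValueError("separator occurs inside the text; choose another one")

print(split_pattern("/foo/bar/"))   # ('foo', 'bar')
print(split_pattern("_foo_bar_"))   # ('foo', 'bar')
print(split_pattern("/foo*bar/"))   # ('foo*bar', None)
```

Both separators give the same result, which is why any character
not contained in the text itself can be used.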

output options
   -dump         create hexdump of search hits or replaced text.
    -wide        with -dump: show 16 bytes per line.
    -lean        with -dump: show  8 bytes per line.
   -dumpfrom     always dump search hits but not replaced text.
   -dumpall      dump search text and replaced text.
   -nodump       do not create a hexdump, list only matching files.
   -astext       no hexdump, but print search hits as plain text.
                 use this only with plain text files, not binary.
   -showle       highlight CR/LF line endings in hex dump output
   -context=n    with hexdump: show additional n bytes of context.
   -reldist      with hexdump: tell relative distances to previous hits.
   -to dir\$file write output files to given path. for details about
                 output file masks, type "sfk help opt" or "sfk run".
   -tofile x     write output data to a single output filename x
                 (which is not interpreted as a mask but taken as is).
   -more[n]      pause output every 30 or n lines.

return codes for batch files
   0 = no matches, 1 = matches found, >1 = major error occurred.
   see also "sfk help opt" on how to influence error processing.
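
A calling script can branch on these return codes. The Python sketch
below assumes that an sfk executable is reachable through PATH; the
function names classify_return_code and run_preview are made up for
this example.

```python
import subprocess

def classify_return_code(rc):
    """Map a return code to the meaning listed above."""
    if rc == 0:
        return "no matches"
    if rc == 1:
        return "matches found"
    return "error"

def run_preview(directory, pattern):
    """Run xreplace without -yes (simulation mode) and classify the result.

    Assumption: 'sfk' is installed and on PATH. The arguments are passed
    as a list, so no shell is involved in interpreting the pattern.
    """
    result = subprocess.run(["sfk", "xreplace", directory, pattern])
    return classify_return_code(result.returncode)

print(classify_return_code(0))
print(classify_return_code(1))
print(classify_return_code(9))
```

For example, run_preview("mydir", "/foo/bar/") would preview the
replacement and report whether anything matched.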

temporary files
   with option -usetmp or -memlimit sfk may create temporary files
   in a folder specified by TEMP or TMP environment variable,
   or within /tmp under Linux, or in a folder given by -tmpdir
   or from an SFK_CONFIG=tmpdir:... setting. type "sfk help opt"
   for further infos.

about nested replacement patterns
   sfk replace myfile.dat /foo/bar/ /bar/goo/
   with SFK base, "foo" will be replaced by "bar" and then
   immediately "bar" is replaced again by "goo".
   with SFK Plus or XE, a replaced part of text is not replaced
   again in the same command, so "foo" stays replaced by "bar".
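
The difference between the two behaviours can be sketched in Python.
This is an analogy only, with made-up function names: chained_replace
mimics the behaviour described for SFK base, single_pass_replace the
one described for SFK Plus or XE. Neither reflects SFK's internals,
and both compare case-sensitively.

```python
import re

def chained_replace(text, pairs):
    """Each pattern runs over the output of the previous one."""
    for src, dst in pairs:
        text = text.replace(src, dst)
    return text

def single_pass_replace(text, pairs):
    """All patterns are matched in one scan; replaced text is not rescanned."""
    mapping = dict(pairs)
    alternation = "|".join(re.escape(src) for src in mapping)
    return re.sub(alternation, lambda m: mapping[m.group(0)], text)

pairs = [("foo", "bar"), ("bar", "goo")]
print(chained_replace("foo bar", pairs))      # goo goo
print(single_pass_replace("foo bar", pairs))  # bar goo
```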

unexpected repeat replace behaviour
   depending on the input data and search/replace expressions,
   it can happen that running the same replace multiple times
   on the same file produces further hits that didn't exist
   in the first run. add option -full to read more on this.

quoted multi line parameters are supported in scripts
   using full trim. type "sfk script" for details.

wildcards and SFK expressions
   SFK Expressions are simple patterns containing literal text,
   wildcards * and ? and character classes in square brackets [].
   basically, the syntax provides extended wildcards but no
   further logic and is not related to regular expressions.

   search patterns are surrounded by a separator character which
   can be anything not contained in the search text, like / or _

   within a pattern /fromtext/totext/ the fromtext may contain:

     *                       - 0 to 4000 characters in the same
                               text line or paragraph, i.e. all
                               bytes not being CR, LF or NULL.
                               4000 is just a default maximum
                               that can be changed by:
     [0.100000 chars]        - 0 to 100000 characters in the same
                               text line or paragraph, i.e. the
                               same as * but with a larger range.
     ?                       - one character.
     ?????                   - same as [5.5 chars] or [5 chars]
     [bytes]                 - 0 to 4000 bytes (with CR,LF,NULL)
                               i.e. it collects stream text
                               across lines, even in binary data
     **                      - the same as [bytes].
     [0.100 bytes]           - 0 to 100 bytes
     [.100000 bytes]         - up to 100000 bytes
     [1.* bytes]             - 1 to default maximum bytes
     [2 chars]               - exactly 2 chars
     [30 bytes]              - exactly 30 bytes
     [byte of aeiou]         - one vowel (a OR A OR e OR ...),
                               case insensitive by default.
                               "aeiou" is a character list.
     [byte of \\\x2f]        - a backslash \ or forw. slash /
     [bytes of \r\n \t]      - whitespace incl. line ends
     [bytes of (\r\n \t)]    - the same, () are optional
     [bytes not \r\n\0]      - up to 4000 bytes as long as no
                               CR, LF or NULL byte appears
     [chars]                 - the same as [bytes not \r\n\0],
                               i.e. collect text in a line
     [char not ( \t)]        - same as [byte not ( \r\n\0\t)],
                               everything not blanks and tabs
     [char not )( \t]        - not brackets, blanks and tabs,
                               same as not (\(\) \t)
     [chars of a-z0-9]       - means a-zA-Z0-9 as search is
                               case insensitive by default
     [chars of \x61-\x7A]    - search a-z but not A-Z, or use
                               option -case for case search
     [eol]                   - end of line by characters:
                               CRLF or LF or CR

     [white]     = chars of (\t )     - 0 or more whitespaces
     [xwhite]    = bytes of (\t \r\n) - same but across lines
     [1 white]   = byte  of (\t )     - 1 whitespace
     [digit]     = byte  of (0-9)     - 1 digit
     [digits]    = bytes of (0-9)     - 0 or more digits
     [hexdigit]  = byte  of (0-9a-f)  - 1 hexadecimal digit
     [hexdigits] = bytes of (0-9a-f)  - 0 or more hex digits

     special keywords that do not count as tokens:
     [skip]   - at the start of a pattern: skip such text
                completely, do not count it as a search hit.
     [keep]   - search also the following text but keep it
                in the input data, without consuming it.
     [ortext] - foo[ortext]bar searches word foo or bar.
                [ortext] is allowed only between literals.

     anchors that have no length of their own:
     [start]  - start of file
     [end]    - end of file
     [lstart] - line start, i.e. start or CRLF or CR or LF
     [lend]   - logical line end, i.e. eol or end of file.
                to replace line ends use [eol] instead.

     how to search or replace special characters:
     -  to search or replace text containing the literal characters
        * ? \ [ ] then these must be escaped like \* \? \\ \[ \]
     -  ( ) are escaped only within character lists, like \( \)
     -  to search or replace the forward slash '/' type \x2f or use
        another char around from/to text, e.g. _fromtext_totext_
     -  parameters with blanks and non trivial characters need double
        quotes "", see also "about Shell Command Characters" below.

     expansion priorities: (highest first)
     if two search parts are side by side, and the same input
     character matches both, then these priorities apply:

       5:  start, end, lstart, lend
       4:  literal text, eol
       3:  whitelist classes: byte of, bytes of
       2:  blacklist classes: chars not, bytes not
       1:  plain wildcards: ?, *, **, byte, bytes, chars

     this means in "/[bytes]foo/" the [bytes] stops collecting
     characters as soon as "foo" is found, as "foo" is a literal.
     on same or higher priority the right side stops the left side.

     avoid overlapping character groups. for example, [chars][white]
     cannot work, as space and tab are part of chars. to fix this
     extend chars by relevant exclusions: [chars not ( \t)][white]
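
The rule that a literal on the right stops a wildcard on the left,
as in "/[bytes]foo/", can be pictured with the Python sketch below.
It is an assumption-laden simplification: the function collect_until
is hypothetical, it handles only one wildcard followed by one
literal, and it compares case-sensitively.

```python
def collect_until(data, literal, maxlen=4000):
    """Collect bytes up to the first occurrence of `literal`.

    Returns (collected, literal), or None if the literal does not
    start within the first `maxlen` bytes.
    """
    # The literal may start at offset maxlen at the latest.
    idx = data.find(literal, 0, maxlen + len(literal))
    if idx < 0:
        return None
    return data[:idx], literal

print(collect_until(b"line1\r\nline2 foo tail foo", b"foo"))
# (b'line1\r\nline2 ', b'foo')
```

The collection ends at the first "foo", not the last one, and line
ends are collected as well, as with [bytes].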

   the totext may contain:

     [part 1]            use first text part of the fromtext.
                         e.g. the fromtext /*foo[.100 chars]bar*/
                         contains parts :   1 2         3    4 5
     [part1]             the same (blank is optional).
     [parts 1,2,3]       use parts 1, 2 and 3.
     [parts 1-10]        use parts 1 to 10.
     [strip(part1,\0)]   use part 1 but remove zero bytes.
                         only zero bytes "\0" can be removed.
     [file.name]         full input filename with path
     [file.relname]      input filename without path
     [file.path]         input file's path
     [file.base]         relname without last .extension
     [file.ext]          input filename extension
     [all]               use all parts from fromtext.

     [setvar name]...[endvar]   set variable "name" with data
                                between setvar and endvar.
     [getvar name]              fill in data from variable "name"

     although anchors like lstart, lend count as a separate part
     they need NOT be specified in the totext. this means that
     /[lstart]foo[lend]/bar/ just changes the word "foo".
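
Part numbering can be compared to capture groups in a regular
expression. SFK Expressions are not regular expressions, so the
Python sketch below is only a rough stand-in for the pattern
/foo*bar/[part1]goo[part3]/ and the name replace_with_parts is made
up. The lazy quantifier approximates a wildcard that is stopped by
the following literal.

```python
import re

# Rough regex stand-in for the fromtext /foo*bar/:
# part 1 = "foo", part 2 = the * wildcard, part 3 = "bar".
FROM = re.compile(r"(foo)([^\r\n\x00]{0,4000}?)(bar)", re.IGNORECASE)

def replace_with_parts(text):
    """Keep parts 1 and 3, replace part 2 by "goo"."""
    return FROM.sub(lambda m: m.group(1) + "goo" + m.group(3), text)

print(replace_with_parts("a fooXYZbar b"))  # a foogoobar b
```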

supported slash patterns
   \t    = TAB
   \r    = CR
   \n    = LF
   \x00  = one byte with code 00 hexadecimal
   \0    = short form for \x00
   \q    = a double quote "
   \\    = the backslash character \ itself
   \[    = the bracket open character [
   \]    = the bracket close character ]
   \*    = the literal star character *
   \?    = the literal question mark  ?
   \-    = to use literal "-" in a command
   Within multi line -bylist files:
   \     = slash+blank is changed to a single blank
   Only within "char of" or "byte not" lists:
   \(    = to use literal character "("
   \)    = to use literal character ")"

SFK expression options
   -showpart(s)  print /from/ part numbers, range statistics
                 and expansion priority points per part.
                 done automatically if a required /to/ text
                 is not given with a command.
   -showbest     if a /from/ pattern finds nothing, use this to
                 see how many parts would match so far, and with
                 up to how many bytes per part. anchors like [lstart]
                 may show a non zero length when matching (CR)LF.
   -showlist     with -bylist, show the internal joined list if
                 commands are spread across multiple lines.
   -showall      show all of the above.
   -xmaxlen=n    set default maximum length for chars or bytes commands,
                 e.g. -xmaxlen=10000 means /foo*bar/ matches with up to
                 10000 characters between foo and bar. the default max
                 length without this option is 4000 characters.

performance notes
 - always use a string literal, or single byte or char, at the start
   of your search expressions, like in /foo*bar/ starting with 'f'.
    Do not use a wildcard like * at the start, as in /*foobar/,
    when searching huge input data, as your search will slow down by
    a factor of 256. Use /[lstart]*foobar/ instead.
 - the system may cache output file(s), writing to disk in background
   after sfk has finished. subsequent batch commands may execute slower.

office file support
   sfk ofind        search in .xml text file contents of
                    office files like .docx .xlsx .ods .odt.
   sfk help office  for more infos and options

see also
   --- open source commands ---
   sfk xfind     search  wildcard text in   plain text files
   sfk ofind     search  in office files    .docx .xlsx .ods
   sfk xfindbin  search  wildcard text in   text/binary files
   sfk xhexfind  search  in text/binary with hex dump output
   sfk extract   extract wildcard data from text/binary files
   sfk filter    filter  and edit text with simple wildcards
   sfk find      search  fixed    text in   text        files
   sfk findbin   search  fixed    text in   text/binary files
   sfk hexfind   search  fixed    text in        binary files
   sfk replace   replace fixed    text in   text/binary files
   --- freeware commands ---
   sfk view      GUI tool to search text as you type
   --- xe commercial commands ---
   sfk replace   replace fixed    text with high performance
   sfk xreplace  replace wildcard text in   text/binary files
   sfk help xe   about SFK XE and xreplace with SFK Expressions.

beware of Shell Command Characters.
   to find or replace text patterns containing spaces or special
   characters like <>|!&?* you must add quotes "" around parameters
   or the shell environment will destroy your command. for example,
   pattern /foo bar/other/ must be written like "/foo bar/other/"
   within a .bat or .cmd file the percent % must be escaped like %%
   even within quotes: sfk echo -spat "percent %% is a percent \x25"
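
When a command line is assembled by a program, quoting can be left
to a library. The Python sketch below uses shlex.join, which quotes
for POSIX shells only. It uses single quotes rather than the double
quotes shown above, and it does not cover Windows .bat or .cmd
rules such as %%.

```python
import shlex

# The pattern contains a blank, so it must reach sfk as one parameter.
args = ["sfk", "xreplace", "mydir", "/foo bar/other/"]

# shlex.join quotes every argument that a POSIX shell would split.
command = shlex.join(args)
print(command)  # sfk xreplace mydir '/foo bar/other/'
```

Passing the list args directly to a process launcher, without going
through a shell, avoids the quoting problem altogether.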

web reference
   http://stahlworks.com/sfk-xrep

about example numbers with [brackets]
   if you see [1] type "sfk cmd 1" for whole command in one line.

bad examples with corrections
   if input text contains:
      bool bClFoo;
      bool bClBar   ;
   sfk xfind in.txt "/bool[xwhite]bCl*[xwhite];/"
      does NOT match "bool bClFoo;" because * eats the
      whole input line including ";" so no input is left
      for "[xwhite];" and the whole expression fails.
   sfk xfind in.txt "/bool[xwhite]bCl[* not ;][xwhite];/"
      matches both "bool bClFoo;" and "bool bClBar   ;".
      this means whenever your search fails to work write
      in detail which characters (not) to collect where.
   sfk xex in.txt "/[lstart]foo/[lstart]goo/"
      there is no need to write an anchor like [lstart]
      within totext as it contains no data. use instead:
         sfk xex in.txt "/[lstart]foo/goo/"
   sfk xex in.txt "/foo[lend]bar/goo[part2]bar/"
      anchors like [lend] must be at start or end of fromtext
      and cannot be referenced within totext. use instead:
         sfk xex in.txt "/foo[eol]bar/goo[part2]bar/"

working examples
   sfk xrep mydir "/foo*bar/"
      an incomplete command (missing "totext" part in pattern).
      sfk shows an info text telling about part numbers
      and runs a search for "foo*bar" in all files of mydir.
      nothing is changed so far.
   sfk xrep mydir "/foo*bar/[part1]goo[part3]/"
      same as above, but now the /fromtext/totext/ is complete.
      again sfk runs a search for "foo*bar", but now it displays
      the changed output text (totext), with everything between
      "foo" and "bar" being changed to "goo". add option
      -dumpfrom to display the original found text instead.
   sfk sel mydir .txt +xrep "/foo*bar/[part1]goo[part3]/"
      similar to above, replace in all .txt files of mydir.
   sfk xrep -text "/class* CFoo/[part1][part3]/" -dir mydir -file .hpp
      search only .hpp files within mydir, and replace for example
      "class IMPORT CFoo" by "class CFoo".
   sfk xrep -pat "/[byte not \n][end]/[part1]\n/"
    -dir mydir -file .cpp .hpp -dumpall
      find all .cpp or .hpp files in mydir whose last line is not
      ending with a linefeed, and add the linefeed. to check exactly
      what is changed dump both input and output text. [23]
   sfk xrep -dir mydir -file .hpp -enddir
    -text "/[byte not \n][end]/[part1]\n/" -dumpall
      same as above but with dir parameters first. [25]
   sfk xrep io.txt "/[lstart][20 chars]*/[part3]/"
      cut first 20 characters in every line of io.txt.
   sfk xrep io.txt "/[lstart][9 bytes]1001*/[part2]9009[part4]/"
      in fixed position text file data like:
         rec. 001:5318 aef3 2751 1001
         rec. 002:1001 aef5 275a 1001
         rec. 003:ef49 aef7 2763 1001
      replace "1001" where it appears in columns 10 to 13,
      in this example only the first "1001" in record 2.
   sfk xrep in.dat "/\xFF\xFE[1 byte]\x80\x81/\xFF\xFE\x00\x80\x81/"
      replace byte sequences (not ASCII text strings) in binary data.
      searches byte groups starting with values 0xFF 0xFE, then any
      single byte, then 0x80 0x81, and replaces the variable byte
      by always a binary 0x00 value.
   sfk xreplace in.txt "/foo*bar/other/"
      replace phrases starting with "foo" and ending with "bar"
      by word "other" in single file in.txt
   sfk xreplace -text "/foo*bar/===[part2]===/" -dir mydir -file .txt
      replace foo*bar in all .txt files of folder mydir
      with a new pattern containing the text between foo and bar
      surrounded by "===".
   sfk xrep -text "/\x66\x6f\x6f[0.100 bytes]\x62\x61\x72/---/"
    -dir mydir -file .dat
      replace binary data starting with bytes 0x66, 0x6f, 0x6f,
      ending with 0x62, 0x61, 0x72 and up to 100 bytes in between
      by "---" within all .dat files of folder mydir. [24]
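
To check the expected result of the fixed position example above
(replace "1001" only in columns 10 to 13), the same transformation
can be reproduced in Python. This is an independent sketch with a
regular expression, not an SFK Expression: ^ stands in for [lstart],
nine arbitrary characters for [9 bytes], and the trailing group for
the final *.

```python
import re

data = (
    "rec. 001:5318 aef3 2751 1001\n"
    "rec. 002:1001 aef5 275a 1001\n"
    "rec. 003:ef49 aef7 2763 1001\n"
)

# Group 1 corresponds to part 2 ([9 bytes]), group 2 to part 4 (*).
pattern = re.compile(r"^(.{9})1001(.*)$", re.MULTILINE)
result = pattern.sub(r"\g<1>9009\g<2>", data)
print(result)
```

Only record 2 changes, to "rec. 002:9009 aef5 275a 1001"; the "1001"
at the end of each line is left alone.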

sfk xreplace dirName "/searchtext/totext/"

replace in text and binary files using 
wildcards * and ? as well as SFK Simple
Expressions in brackets [].

Multiple search patterns are executed in 
the given sequence. Mind this if they
overlap, e.g. /foo/bar/ /foosys/thesys/
makes no sense (foo is replaced by the
first expression, so the 2nd one will fail
to match).

by default, replace functions run in 
SIMULATION mode,
   previewing hits without changing 
   anything. add -yes to apply changes.
   Changing binaries may lead to
   unpredictable results, therefore keep
   backups of your files in any case.

subdirectories are included by default
   the sfk default for most commands is to 
   process the given directories, as well
   as all subdirs within them. specify
   -nosub to disable this.

options
   -nosub        do not include files in 
                 subdirectories.
   -nobin[ary]   skip binary files.
   -case         case-sensitive text 
                 comparison. default is
                 insensitive. for details
                 type: sfk help nocase
   -pat          starts a list of search or 
                 replace patterns of the
                 form xsrcxdstx where x is
                 the separator char, src
                 the source to search for,
                 and dst the destination to
                 replace it with. e.g. /foo/
                 bar/ or _foo_bar_ both
                 replace foo by bar. -pat
                 is not required if a
                 single filename is given.
   -text         the same as -pat, starting 
                 a text pattern list.
   -bylist x.txt read search patterns from 
    a file x.txt, supporting
                 multiple lines per pattern.
                 (add -full for more.)
   -bylinelist x read /from/to/ or just 
    /from/ patterns from a file x
                 with one pattern per line. 
                 (add -full for more.)
                 -by(line)list does not
                 support sfk variables. to
                 use variables in patterns
                 create an sfk script with
                 patterns as parameters.
                 "sfk script" for more.
   -usetmp       allow creation of 
                 temporary files if output
                 data is larger than the
                 memory limit (default: 300
                 MB). without -usetmp, SFK
                 uses the whole RAM, but
                 stops with an error if it
                 runs out of memory.
   -memlimit=n   use up to n mbytes of RAM 
                 to store output data, and
                 when the limit is reached,
                 use a temporary file. this
                 option implies -usetmp. to
                 set this permanently by
                 environment, type in a
                 batch file: set
                 SFK_CONFIG=memlimit:n
   -tmpdir x     set directory x as 
                 temporary file directory.
                 default is to use the path
                 specified by TEMP or TMP
                 env variable.
   -showtmp      tell verbosely which 
                 temporary files are
                 created. SFK temporary
                 filenames contain the
                 process ID to make sure
                 multiple SFK running in
                 parallel do not use the
                 same temporary file.
   -notmp        never create temporary 
                 files (default). if
                 combined with -memlimit,
                 sfk stops with an error if
                 memlimit is reached.
   -recsize      set input record size for 
                 processing (default=100k).
                 xreplace, xfind and
                 xhexfind extend this
                 automatically based on the
                 largest search patterns.
   -firsthit     process only first found 
                 pattern match per file.
   -quiet        do not show progress infos.
   -stat         show statistics like hits 
                 per pattern and no. of
                 files.
   -perf         show performance 
                 statistics.
   -full         print full help text 
                 telling about -bylist
                 pattern files, special
                 character case sensitivity
                 and nested or repeated
                 replace behaviour.

output options
   -dump         create hexdump of search 
                 hits or replaced text.
    -wide        with -dump: show 16 bytes 
                       per line.
    -lean        with -dump: show  8 bytes 
                       per line.
   -dumpfrom     always dump search hits 
                 but not replaced text.
   -dumpall      dump search text and 
                 replaced text.
   -nodump       do not create a hexdump, 
                 list only matching files.
   -astext       no hexdump, but print 
                 search hits as plain text.
                 use this only with plain
                 text files, not binary.
   -showle       highlight CR/LF line 
                 endings in hex dump output
   -context=n    with hexdump: show 
                 additional n bytes of
                 context.
   -reldist      with hexdump: tell 
                 relative distances to
                 previous hits.
   -to dir\$file write output files to 
    given path. for details about
                 output file masks, type 
                 "sfk help opt" or "sfk
                 run".
   -tofile x     write output data to a 
                 single output filename x
                 (which is not interpreted
                 as a mask but taken as is).
                 
   -more[n]      pause output every 30 or n 
                 lines.

return codes for batch files
   0 = no matches, 1 = matches found, >1 
   = major error occurred. see also "sfk
   help opt" on how to influence error
   processing.

temporary files
   with option -usetmp or -memlimit sfk 
                may create temporary
                files
   in a folder specified by TEMP or TMP 
   environment variable, or within /tmp
   under Linux, or in a folder given by
   -tmpdir or from an SFK_CONFIG=tmpdir:...
   setting. type "sfk help opt" for further
   infos.

about nested replacement patterns
   sfk replace myfile.dat /foo/bar/ 
   /bar/goo/ with SFK base, "foo" will be
   replaced by "bar" and then immediately
   "bar" is replaced again by "goo". with
   SFK Plus or XE, a replaced part of text
   is not replaced again in the same
   command, so "foo" stays replaced by
   "bar".

unexpected repeat replace behaviour
   depending on the input data and 
   search/replace expressions, it can
   happen that running the same replace
   multiple times on the same file produces
   further hits that didn't exist in the
   first run. add option -full to read more
   on this.

quoted multi line parameters are supported 
in scripts
   using full trim. type "sfk script" for 
   details.

wildcards and SFK expressions
   SFK Expressions are simple patterns 
   containing literal text, wildcards * and
   ? and character classes in square
   brackets []. basically, the syntax
   provides extended wilcards but no
   further logic and is not related to
   regular expressions.

   search patterns are surrounded by a 
   separator character which can be
   anything not contained in the search
   text, like / or _

within a pattern /fromtext/totext/ the 
fromtext may contain:

  *
     0 to 4000 characters in the same text 
     line or paragraph, i.e. all bytes not
     being CR, LF or NULL. 4000 is just a
     default maximum that can be changed
     by:
  [0.100000 chars]
     0 to 100000 characters in the same 
     text line or paragraph, i.e. the same
     as * but with a larger range.
  ?
     one character. 
  ?????
     same as [5.5 chars] or [5 chars] 
  [bytes]
     0 to 4000 bytes (with CR,LF,NULL) i.e. 
     it collects stream text across lines,
     even in binary data
  **
     the same as [bytes]. 
  [0.100 bytes]
     0 to 100 bytes 
  [.100000 bytes]
     up to 100000 bytes 
  [1.* bytes]
     1 to default maximum bytes 
  [2 chars]
     exactly 2 chars 
  [30 bytes]
     exactly 30 bytes 
  [byte of aeiou]
     one vocal (a OR A OR e OR ...), case 
     insensitive by default. "aeiou" is a
     character list.
  [byte of \\\x2f]
     a backslash \ or forw. slash / 
  [bytes of \r\n \t]
     whitespace incl. line ends 
  [bytes of (\r\n \t)]
     the same, () are optional 
  [bytes not \r\n\0]
     up to 4000 bytes as long as no CR, LF 
     or NULL byte appears
  [chars]
     the same as [bytes not \r\n\0], i.e. 
   collect text in a line
  [char not ( \t)]
     same as [byte not ( \r\n\0\t)], 
   everything not blanks and tabs
  [char not )( \t]
     not brackets, blanks and tabs, same as 
     not (\(\) \t)
  [chars of a-z0-9]
     means a-zA-Z0-9 as search is case 
     insensitive by default
  [chars of \x61-\x7A]
     search a-z but not A-Z, or use option 
     -case for case search
  [eol]
     end of line by characters: CRLF or LF 
     or CR

  [white]
     chars of (\t )     - 0 or more 
                     whitespaces
  [xwhite]
     bytes of (\t \r\n) - same but across 
                          lines
  [1 white]
     byte  of (\t )     - 1 whitespace 
  [digit]
     byte  of (0-9)     - 1 digit 
  [digits]
     bytes of (0-9)     - 0 or more digits 
  [hexdigit]
     byte  of (0-9a-f)  - 1 hexadecimal 
                         digit
  [hexdigits]
     bytes of (0-9a-f) - 0 or more hex 
                        digits

  special keywords that do not count as 
tokens:
  [skip]
     at the start of a pattern: skip such 
     text completely, do not count it as a
     search hit.
  [keep]
     search also the following text but 
     keep it in the input data, without
     consuming it.
  [ortext]
     foo[ortext]bar searches word foo or 
     bar. [ortext] is allowed only between
     literals.

  anchors that have no length of their own:
  [start]
     start of file 
  [end]
     end of file 
  [lstart]
     line start, i.e. start or CRLF or CR 
     or LF
  [lend]
     logical line end, i.e. eol or end of 
     file. to replace line ends use [eol]
     instead.

  how to search or replace special characters:
  -  to search or replace text containing the literal characters
     * ? \ [ ] these must be escaped like \* \? \\ \[ \]
  -  ( ) are escaped only within character lists, like \( \)
  -  to search or replace the forward slash '/' type \x2f or use
     another char around from/to text, e.g. _fromtext_totext_
  -  parameters with blanks and non trivial characters need double
     quotes "", see also "beware of Shell Command Characters" below.
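
The escaping rules above can be applied mechanically before building a pattern. The following is a minimal Python sketch, not part of SFK; the helper name escape_sfk_literal is hypothetical. It assumes the text is used outside a character list (so parentheses are left alone) and that '/' is the delimiter around from/to text.

```python
def escape_sfk_literal(text: str) -> str:
    """Escape literal text for use inside an SFK /from/to/ pattern.

    Hypothetical helper: it only applies the rules listed in the help text.
    """
    out = []
    for ch in text:
        if ch in "*?\\[]":
            out.append("\\" + ch)      # \* \? \\ \[ \]
        elif ch == "/":
            out.append("\\x2f")        # the slash is the pattern delimiter
        else:
            out.append(ch)
    return "".join(out)


print(escape_sfk_literal("a[0]*b/c"))
```

The result still has to be quoted for the shell when it contains blanks or shell characters.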

  expansion priorities: (highest first)
  if two search parts are side by side, and the same input character
  matches both, then these priorities apply:

    5:  start, end, lstart, lend
    4:  literal text, eol
    3:  whitelist classes: byte of, bytes of
    2:  blacklist classes: chars not, bytes not
    1:  plain wildcards: ?, *, **, byte, bytes, chars

  this means in "/[bytes]foo/" the [bytes] will stop collecting
  characters as soon as "foo" is found, as "foo" is a literal.
  on same or higher priority the right side stops the left side.

  avoid overlapping character groups. for example, [chars][white]
  cannot work, as space and tab are part of chars. to fix this
  extend chars by relevant exclusions: [chars not ( \t)][white]
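
For readers who know regular expressions, both rules have a rough counterpart in Python's re module. This is only an analogy under the assumption that a non-greedy group behaves like a collector stopped by a following literal; Python's engine backtracks and is not how SFK matches.

```python
import re

text = "alpha foo beta foo"

# "/[bytes]foo/": the collector stops at the first literal "foo",
# which a non-greedy group imitates in Python's regex dialect.
m = re.match(r"(.*?)foo", text, flags=re.DOTALL)
print(repr(m.group(1)))                        # 'alpha '

# "[chars not ( \t)][white]": the first class excludes blank and tab,
# so the whitespace class that follows has something left to match.
m2 = re.match(r"([^ \t]*)([ \t]*)", "word   next")
print(repr(m2.group(1)), repr(m2.group(2)))    # 'word' '   '
```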

the totext may contain:

  [part 1]
     use first text part of the fromtext. e.g. the fromtext
     /*foo[.100 chars]bar*/ contains parts: 1 2 3 4 5
  [part1]
     the same (blank is optional).
  [parts 1,2,3]
     use parts 1, 2 and 3.
  [parts 1-10]
     use parts 1 to 10.
  [strip(part1,\0)]
     use part 1 but remove zero bytes. only zero bytes "\0" can be
     removed.
  [file.name]
     full input filename with path 
  [file.relname]
     input filename without path 
  [file.path]
     input file's path 
  [file.base]
     relname without last .extension 
  [file.ext]
     input filename extension 
  [all]
     use all parts from fromtext. 

  [setvar name]...[endvar]
     set variable "name" with data between setvar and endvar.
  [getvar name]
     fill in data from variable "name" 

  although anchors like lstart, lend count as a separate part they
  need NOT be specified in the totext. this means that
  /[lstart]foo[lend]/bar/ just changes the word "foo".
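
Part numbers work much like numbered capture groups. The Python sketch below imitates the pattern /foo*bar/ with three groups, assuming "foo" is part 1, the wildcard part 2 and "bar" part 3. It illustrates the numbering only and does not reproduce SFK's matching.

```python
import re

# /foo*bar/ has three parts: "foo" (1), "*" (2), "bar" (3).
pattern = re.compile(r"(foo)(.*?)(bar)")

line = "xx foo123bar yy"

# totext "[part1]goo[part3]" keeps parts 1 and 3 and replaces part 2.
print(pattern.sub(r"\1goo\3", line))      # xx foogoobar yy

# totext "===[part2]===" keeps only what the wildcard collected.
print(pattern.sub(r"===\2===", line))     # xx ===123=== yy
```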

supported slash patterns
   \t    = TAB
   \r    = CR
   \n    = LF
   \x00  = one byte with code 00 hexadecimal
   \0    = short form for \x00
   \q    = a double quote "
   \\    = the backslash character \ itself
   \[    = the bracket open character [
   \]    = the bracket close character ]
   \*    = the literal star character *
   \?    = the literal question mark  ?
   \-    = to use literal "-" in a command
   Within multi line -bylist files:
   \     = slash+blank is changed to a single blank
   Only within "char of" or "byte not" lists:
   \(    = to use literal character "("
   \)    = to use literal character ")"
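
To see which character each slash pattern stands for, the table can be written as a small decoder. This Python sketch is an illustration only: the function name decode_slash_patterns is hypothetical, the -bylist and character list cases are left out, and unknown escapes are assumed to stay as typed.

```python
import re

# Table taken from the list above; \xNN is handled separately.
_SIMPLE = {"t": "\t", "r": "\r", "n": "\n", "0": "\x00", "q": '"',
           "\\": "\\", "[": "[", "]": "]", "*": "*", "?": "?", "-": "-"}

# A backslash followed by either xNN (two hex digits) or any single character.
_TOKEN = re.compile(r"\\(x[0-9A-Fa-f]{2}|.)", re.DOTALL)


def decode_slash_patterns(text: str) -> str:
    """Return the characters that the slash patterns in text stand for."""
    def repl(m):
        body = m.group(1)
        if len(body) == 3:                      # \xNN
            return chr(int(body[1:], 16))
        return _SIMPLE.get(body, m.group(0))    # unknown escapes stay as typed
    return _TOKEN.sub(repl, text)


print(repr(decode_slash_patterns(r"a\tb\x41\q")))
```

In a real pattern \* and \? matter because the unescaped forms are wildcards; the decoder only shows the literal character that is meant.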

SFK expression options
   -showpart(s)  print /from/ part numbers, range statistics and
                 expansion priority points per part. done automatically
                 if a required /to/ text is not given with a command.
   -showbest     if a /from/ pattern finds nothing, use this to see how
                 many parts would match so far, and with up to how many
                 bytes per part. anchors like [lstart] may show a non
                 zero length when matching (CR)LF.
   -showlist     with -bylist, show the internal joined list if commands
                 are spread across multiple lines.
   -showall      show all of the above.
   -xmaxlen=n    set default maximum length for chars or bytes commands,
                 e.g. -xmaxlen=10000 means /foo*bar/ matches with up to
                 10000 characters between foo and bar. the default max
                 length without this option is 4000 characters.
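
The effect of the maximum length can be pictured with a bounded quantifier. The Python sketch below is an analogy only; the function name between and its max_len parameter are made up for illustration, and the default of 4000 is the documented default quoted above.

```python
import re


def between(text, max_len=4000):
    """Rough analogue of /foo*bar/ with the wildcard limited to max_len chars."""
    pattern = re.compile(r"foo(.{0,%d}?)bar" % max_len)
    m = pattern.search(text)
    return m.group(1) if m else None


short = "foo" + "x" * 10 + "bar"
long_ = "foo" + "x" * 5000 + "bar"

print(between(short) == "x" * 10)                # True
print(between(long_))                            # None, 5000 exceeds 4000
print(between(long_, max_len=10000) is not None) # True
```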

performance notes
 - always use a string literal, or single byte or char, at the start
   of your search expressions, like in /foo*bar/ starting with 'f'.
   Do not use a wildcard like * at the start like in /*foobar/ when
   searching huge input data, as your search will slow down by
   factor 256. Use /[lstart]*foobar/ instead.
 - the system may cache output file(s), writing to disk in background
   after sfk has finished. subsequent batch commands may execute
   slower.

office file support
   sfk ofind        search in .xml text file contents of office files
                    like .docx .xlsx .ods .odt.
   sfk help office  for more info and options

see also
   --- open source commands ---
   sfk xfind     search  wildcard text in plain text files
   sfk ofind     search  in office files  .docx .xlsx .ods
   sfk xfindbin  search  wildcard text in text/binary files
   sfk xhexfind  search  in text/binary with hex dump output
   sfk extract   extract wildcard data from text/binary files
   sfk filter    filter  and edit text with simple wildcards
   sfk find      search  fixed    text in text        files
   sfk findbin   search  fixed    text in text/binary files
   sfk hexfind   search  fixed    text in      binary files
   sfk replace   replace fixed    text in text/binary files
   --- freeware commands ---
   sfk view      GUI tool to search text as you type
   --- xe commercial commands ---
   sfk replace   replace fixed    text with high performance
   sfk xreplace  replace wildcard text in text/binary files
   sfk help xe   about SFK XE and xreplace with SFK Expressions.

beware of Shell Command Characters.
   to find or replace text patterns containing spaces or special
   characters like <>|!&?* you must add quotes "" around parameters
   or the shell environment will destroy your command. for example,
   pattern /foo bar/other/ must be written like "/foo bar/other/".
   within a .bat or .cmd file the percent % must be escaped like %%
   even within quotes: sfk echo -spat "percent %% is a percent \x25"
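
When sfk is started from a script instead of an interactive shell, passing the parameters as a list avoids shell quoting altogether. The Python sketch below only builds the argument list; the helper name xreplace_args is hypothetical, and appending -yes at the end of the command is an assumption about option placement.

```python
import subprocess


def xreplace_args(target, from_text, to_text, apply=False):
    """Build an argument list for sfk xreplace (simulation mode by default)."""
    args = ["sfk", "xreplace", target, "/%s/%s/" % (from_text, to_text)]
    if apply:
        args.append("-yes")    # assumption: -yes is accepted after the pattern
    return args


def run_xreplace(*a, **kw):
    """Run the command; requires an sfk binary on the PATH."""
    return subprocess.run(xreplace_args(*a, **kw), check=True)


print(xreplace_args("in.txt", "foo bar", "other"))
# ['sfk', 'xreplace', 'in.txt', '/foo bar/other/']
```

The pattern with the blank stays one parameter because no shell splits it; the from and to text still need SFK's own escaping for * ? \ [ ] and /.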

web reference
   http://stahlworks.com/sfk-xrep

about example numbers with [brackets]
   if you see [1] type "sfk cmd 1" for whole command in one line.

bad examples with corrections
   if input text contains:
      bool bClFoo;
      bool bClBar   ;
   sfk xfind in.txt "/bool[xwhite]bCl*[xwhite];/"
      does NOT match "bool bClFoo;" because * eats the whole input line
      including ";" so no input is left for "[xwhite];" and the whole
      expression fails.
   sfk xfind in.txt "/bool[xwhite]bCl[* not ;][xwhite];/"
      matches both "bool bClFoo;" and "bool bClBar ;". this means
      whenever your search fails to work, write in detail which
      characters (not) to collect where.
   sfk xex in.txt "/[lstart]foo/[lstart]goo/"
      there is no need to write an anchor like [lstart] within totext
      as it contains no data. use instead:
         sfk xex in.txt "/[lstart]foo/goo/"
   sfk xex in.txt "/foo[lend]bar/goo[part2]bar/"
      anchors like [lend] must be at start or end of fromtext and
      cannot be referenced within totext. use instead:
         sfk xex in.txt "/foo[eol]bar/goo[part2]bar/"


working examples
   sfk xrep mydir "/foo*bar/"
      an incomplete command (missing "totext" part in pattern). sfk
      shows an info text telling about part numbers and runs a search
      for "foo*bar" in all files of mydir. nothing is changed so far.
   sfk xrep mydir "/foo*bar/[part1]goo[part3]/"
      same as above, but now the /fromtext/totext/ is complete. again
      sfk runs a search for "foo*bar", but now it displays the changed
      output text (totext), with everything between "foo" and "bar"
      being changed to "goo". add option -dumpfrom to display the
      original found text instead.
   sfk sel mydir .txt +xrep "/foo*bar/[part1]goo[part3]/"
      similar to above, replace in all .txt files of mydir.
   sfk xrep -text "/class* CFoo/[part1][part3]/" -dir mydir -file .hpp
      search only .hpp files within mydir, and replace for example
      "class IMPORT CFoo" by "class CFoo".
   sfk xrep -pat "/[byte not \n][end]/[part1]\n/"
    -dir mydir -file .cpp .hpp -dumpall
      find all .cpp or .hpp files in mydir whose last line is not
      ending with a linefeed, and add the linefeed. to check exactly
      what is changed dump both input and output text. [23]
   sfk xrep -dir mydir -file .hpp -enddir
    -text "/[byte not \n][end]/[part1]\n/" -dumpall
      same as above but with dir parameters first. [25]
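
The intent of the two linefeed examples above can be written out in a few lines of Python. This is a sketch of the same rule, not of how SFK implements it; the function name ensure_final_linefeed is hypothetical.

```python
def ensure_final_linefeed(data: bytes) -> bytes:
    """Append LF if the data does not already end with one."""
    # "[byte not \n][end]" needs one byte that is not LF right before the
    # end of the data, so empty input and input ending in LF stay unchanged.
    if data and not data.endswith(b"\n"):
        return data + b"\n"
    return data


print(ensure_final_linefeed(b"int main() {}"))
```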
   sfk xrep io.txt "/[lstart][20 chars]*/[part3]/"
      cut first 20 characters in every line of io.txt.
   sfk xrep io.txt "/[lstart][9 bytes]1001*/[part2]9009[part4]/"
      in fixed position text file data like:
         rec. 001:5318 aef3 2751 1001
         rec. 002:1001 aef5 275a 1001
         rec. 003:ef49 aef7 2763 1001
      replace "1001" where it appears in columns 10 to 13, in this
      example only the first "1001" in record 2.
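
The fixed position example can be cross-checked with an ordinary regular expression. The Python sketch below is an analogy on the same three records, assuming ^ with MULTILINE corresponds to [lstart] and a group of nine characters to [9 bytes].

```python
import re

records = (
    "rec. 001:5318 aef3 2751 1001\n"
    "rec. 002:1001 aef5 275a 1001\n"
    "rec. 003:ef49 aef7 2763 1001\n"
)

# ^ plays the role of [lstart], (.{9}) of [9 bytes]. The rest of the line
# ([part4]) is left untouched because the match ends right after "1001".
result = re.sub(r"^(.{9})1001", r"\g<1>9009", records, flags=re.MULTILINE)
print(result)
```

Only the second record changes, to "rec. 002:9009 aef5 275a 1001"; the "1001" at the end of each line is not in columns 10 to 13 and stays.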
   sfk xrep in.dat "/\xFF\xFE[1 byte]\x80\x81/\xFF\xFE\x00\x80\x81/"
      replace byte sequences (not ASCII text strings) in binary data.
      searches byte groups starting with values 0xFF 0xFE, then any
      single byte, then 0x80 0x81, and replaces the variable byte by
      always a binary 0x00 value.
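
The same byte replacement can be tried on sample data with Python's re module on bytes. This is a sketch for checking expectations, not SFK's implementation; the sample bytes are made up.

```python
import re

# Made-up sample: one group with a variable byte (0x7A), and one group
# that lacks the variable byte and therefore must not match.
data = b"\x01\xFF\xFE\x7A\x80\x81\x02\xFF\xFE\x80\x81"

# "." with DOTALL stands for [1 byte]: exactly one byte of any value.
pattern = re.compile(rb"\xFF\xFE.\x80\x81", re.DOTALL)
patched = pattern.sub(b"\xFF\xFE\x00\x80\x81", data)

print(patched.hex())
```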
   sfk xreplace in.txt "/foo*bar/other/"
      replace phrases starting with "foo" and ending with "bar" by
      word "other" in single file in.txt
   sfk xreplace -text "/foo*bar/===[part2]===/" -dir mydir -file .txt
      replace foo*bar in all .txt files of folder mydir with a new
      pattern containing the text between foo and bar surrounded
      by "===".
   sfk xrep -text "/\x66\x6f\x6f[0.100 bytes]\x62\x61\x72/---/"
    -dir mydir -file .dat
      replace binary data starting with bytes 0x66, 0x6f, 0x6f, ending
      with 0x62, 0x61, 0x72 and up to 100 bytes in between by "---"
      within all .dat files of folder mydir. [24]