Free Open Source: Swiss File Knife
Characters and codepages with SFK for Windows:

SFK uses 8-bit character codes with a possible range of 255 different characters. See: sfk ascii

Character codes 32-126, or hexadecimal 0x20-0x7E, are 7-bit ASCII characters. Within SFK they are called "Low Codes", or LoCodes. As long as you use only a-z A-Z 0-9 !"#$%&_ etc. you use LoCodes, which work the same on every computer in the world, and you can ignore codepages.

But as soon as you want to use accent characters, umlauts, Cyrillic, Greek etc. you need HiCodes in the range 0x80-0xFF. These depend on the codepages of your Windows system, and you can only use characters of your own language, plus English.

Your Windows CMD.EXE command line uses two codepages:

1. Ansi codepage for data processing. Every text within SFK is encoded in this codepage. Most text editors, like Notepad, use this codepage by default.
2. Dos/OEM codepage for input and display. What you type on your keyboard is encoded in this codepage (850 in this example). The CMD.EXE terminal can only display HiCodes correctly in this codepage.

HiCode conversions step by step:

- When you run sfk and pass parameters, these are converted from OEM to Ansi and then given to sfk. So sfk gets only Ansi encoded parameters.
- Within SFK all data processing is done with Ansi, e.g.
     filter ... +xed ...
  will pass Ansi text.
- When printing text to the terminal, SFK converts it from Ansi to OEM for output. Otherwise all HiCodes would look wrong, as the terminal needs OEM.
- When writing text output to a file, like
     filter ... >out.txt
     filter ... +tofile out.txt
  it is written as Ansi, without any conversion. You can then open out.txt with Notepad or Depeche View, which expect Ansi text, and HiCodes will display correctly.

Beware of HiCodes within batch files:

- If you run SFK interactively like
     sfk filter in.txt -+myword
  and myword contains HiCodes, you type them all as OEM characters, and it works.
- If you create a batch file with Windows Notepad, and therein type
     sfk filter in.txt -+myword
  and myword contains HiCodes, you will find that filter no longer finds the word, because Notepad created an Ansi encoded text file, so the "myword" characters are Ansi encoded.

What happens?

- CMD.EXE still thinks "myword" is OEM, and incorrectly "converts" it to Ansi, which actually breaks all HiCode characters.
- sfk.exe then gets myword with a completely wrong encoding, and the search fails.

How to fix this:

- Write your .bat files with OEM encoding. This can be done with Notepad++:
  - Create a new file mytest.bat
  - Select: Encoding / Character Set / your area, then select your OEM codepage.
  - Now type sfk commands into the batch file, and save it.
- Side effect: if you create sfk scripts embedded in such a batch file, like
     sfk batch mytest2.bat
  searches therein will fail again if the file is OEM encoded, because by default "sfk script" wants to load Ansi text. To fix this, use option -dos like
     sfk script -dos ...

What is not possible?

SFK cannot process any text outside your Ansi codepage. For example, if a computer uses Western Europe codepage 1252, it is possible to search German umlauts and some French accent characters. But it is impossible to search and filter Cyrillic text (encoded in 1251), and it will even be impossible to type Cyrillic characters in the first place, as the keyboard has no such keys.

See also:

   sfk help nocase    about case insensitive search
   sfk help unicode   unicode to Ansi conversion
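
The batch file problem described above can be reproduced outside of Windows with a few lines of Python. This is only a rough sketch under stated assumptions: Python's cp1252 and cp850 codecs stand in for the Ansi and OEM codepages of a Western European system, and oem_to_ansi and fits_ansi are hypothetical helpers written for this illustration. They are not part of SFK or CMD.EXE, and the real Windows conversion may substitute unmappable characters differently.

```python
# Illustration only: Python codecs stand in for the Windows codepages.
# ANSI and OEM are assumed values for a Western European system.
ANSI = "cp1252"
OEM = "cp850"


def oem_to_ansi(raw: bytes) -> bytes:
    """Approximate the OEM-to-Ansi conversion applied to command line
    parameters before sfk receives them (hypothetical helper)."""
    # Characters without an Ansi equivalent become "?" here; the real
    # Windows conversion may pick a different substitute.
    return raw.decode(OEM).encode(ANSI, errors="replace")


def fits_ansi(text: str) -> bool:
    """Return True if every character of text exists in the Ansi codepage."""
    try:
        text.encode(ANSI)
        return True
    except UnicodeEncodeError:
        return False


word = "Größe"  # contains two HiCodes: ö and ß

# LoCodes have identical bytes in both codepages.
assert "size".encode(ANSI) == "size".encode(OEM)

# HiCodes do not.
ansi_bytes = word.encode(ANSI)
oem_bytes = word.encode(OEM)
assert ansi_bytes != oem_bytes

# Interactive use, or an OEM encoded batch file: the conversion
# yields exactly the Ansi bytes that sfk expects.
assert oem_to_ansi(oem_bytes) == ansi_bytes

# Batch file saved as Ansi: the bytes are already Ansi, but they are
# converted again as if they were OEM, so the HiCodes break.
broken = oem_to_ansi(ansi_bytes)
assert broken != ansi_bytes

# Text outside the Ansi codepage cannot be represented at all.
assert fits_ansi("Größe")
assert not fits_ansi("привет")

print("Ansi bytes      :", ansi_bytes)
print("OEM bytes       :", oem_bytes)
print("Ansi read as OEM:", broken)
```

The second conversion is the important part: the same function that repairs OEM input damages input that is already Ansi, which is why the encoding of the .bat file has to match what CMD.EXE assumes.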
SFK is a free open-source tool, running instantly without installation efforts. No DLLs, no registry changes - just get sfk.exe from the zip package and use it (binaries for Windows, Linux and Mac are included).