Tuesday, July 22, 2008

Hex edit on linux

There are several useful tools which can be used to view and edit a binary file in hex mode.
(1) xxd (make a hexdump or do the reverse)
xxd infile        //hex dump. By default, in every line 16 bytes are displayed.
xxd -b infile    //bitdump instead of hexdump
xxd -c 10 infile //in every line 10 bytes are displayed instead of default value 16.
xxd -g 4 infile  //every 4 bytes form a group and groups are separated by whitespace.
xxd -l 100 infile     //just output 100 bytes
xxd -u infile    //use upper case hex letters.
xxd -p infile    //hexdump is displayed in plain format, no line numbers.
xxd -i infile     // output in C include file style.
E.g. output looks like:
unsigned char __1[] = {
  0x74, 0x65, 0x73, 0x74, 0x0a
};
unsigned int __1_len = 5;

xxd -r -p infile //convert a hex dump back into binary. This requires the plain hex dump format (without offsets).
E.g. the content of infile should look like: 746573740a.
Note: additional whitespace and line breaks are allowed, so 74 65 73 740a is also valid.
xxd -r infile     //similar to the previous command, except that each line must also carry its offset.
E.g. the content of infile should look like: 0000000: 7465 7374 0a.
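
As a quick check of the dump and reverse options above, here is a minimal round-trip sketch. It assumes xxd is installed; the temporary directory and the file names in it are made up for the illustration.

```shell
# Round trip: bytes -> plain hex dump -> original bytes.
tmpdir=$(mktemp -d)
printf 'test\n' > "$tmpdir/infile"

# Plain dump: a continuous string of hex digits, no offsets.
hex=$(xxd -p "$tmpdir/infile")
echo "$hex"

# Reverse the plain dump and compare the result with the original file.
printf '%s\n' "$hex" | xxd -r -p > "$tmpdir/restored"
cmp -s "$tmpdir/infile" "$tmpdir/restored" && roundtrip=ok
echo "round trip: ${roundtrip:-failed}"

rm -rf "$tmpdir"
```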

xxd can easily be used in vim which is described here.

(2) od(dump files in octal and other formats)
Switches:
    -A  how offsets are printed
    -t  specify output format
        (d: signed decimal; u: unsigned decimal; o: octal; x: hex; a: named characters; c: ASCII characters or backslash escape sequences.
        Adding a z suffix to any type appends the printable characters to the end of each output line.)
    -v  Output all lines. Without it, od replaces runs of identical lines with a single *.
    -j   skip specified number of bytes first.
    -N  only output specified number of bytes.

Example:
dd bs=512 count=1 if=/dev/hda1 | od -Ax -tx1z -v
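
The example above needs read access to a disk device. The sketch below shows the same switches on data that any user can read; the input strings are arbitrary and only there for illustration.

```shell
# Skip 2 bytes (-j), dump the next 3 bytes (-N) as hex, without an offset column (-An).
window=$(printf 'abcdef' | od -An -tx1 -j2 -N3 | tr -d ' \n')
echo "$window"

# 64 zero bytes: without -v, od collapses the repeated lines into a single "*".
collapsed=$(head -c 64 /dev/zero | od -Ax -tx1 | wc -l)
full=$(head -c 64 /dev/zero | od -Ax -tx1 -v | wc -l)
echo "lines without -v: $((collapsed)), with -v: $((full))"
```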

(3) hexdump/hd
I have not used this command yet. I may update this part once I have some experience with hexdump.

File system and partition information in Linux/Unix

I would like to summarize some useful commands for obtaining information about partition tables and file systems.
Some basic information (model, capacity, driver version...) about a hard disk can be obtained from the files under these directories:
/proc/ide/hda, /proc/ide/hdc ... (for IDE devices)
/proc/scsi/  (for SCSI devices).
Useful posts:
How to add new hard disks, how to check partitions...?
File system related information
Partitions and Volumes

Partition table:
The partitions of a disk are numbered as follows: 1-4 are primary and extended partitions, 5-16 (15) are logical partitions.
(1) fdisk (from package util-linux)
Partition table manipulator for Linux.
fdisk -l device       //list partition table
fdisk -s partition    //get size of a partition. If the parameter is a device, get capacity of the device.

(2) cfdisk
Curses based disk partition table manipulator for Linux. More user-friendly.
cfdisk -Ps    //print partition table in sector format
cfdisk -Pr    //print partition table in raw format(chunk of hex numbers)
cfdisk -Pt    //print partition table in table format.

(3) sfdisk
List partitions:
sfdisk -s device/partition    //get size of a device or partition
sfdisk -l device                  //list partition table of a device
sfdisk -g device                 //show kernel's idea of geometry
sfdisk -G device                //show geometry guessed based on partition table
sfdisk -d device                //Dump the partitions of a device in a format useful as input to sfdisk.
sfdisk device -O file          //just before writing the new partition table, save the sectors that are going to be overwritten to file
sfdisk device -I file           //restore the old partition table that was saved with the -O option.
Check partitions:
sfdisk -V device                 //apply consistency check
It can also modify partition table.

(4) parted
An interactive partition manipulation program. Use its print command to get partition information.

File system:
use "man fs" to get linux file system type descriptions.
/etc/fstab        file system table
/etc/mtab        table of mounted file systems
(1) df (in coreutils)
Report file system disk space usage.
df -Th    //list fs information including file system type in human readable format
(2) fsck (check and repair a Linux file system)
fsck is simply a front-end for the various file system checkers (fsck.fstype) available under  Linux.
(3) mkfs (used to build a Linux file system on a device, usually a hard disk partition.)
It is a front-end to many file system specific builders (mkfs.fstype). Various back-end builders:
mkdosfs,   mke2fs,   mkfs.bfs,   mkfs.ext2,   mkfs.ext3,   mkfs.minix, mkfs.msdos, mkfs.vfat, mkfs.xfs, mkfs.xiafs.
(4) badblocks - search a device for bad blocks
(5) mount
(6) umount

Ext2/Ext3
(1) dumpe2fs (from package e2fsprogs, for ext2/ext3 file systems)
Prints the super block and blocks group information for the filesystem.
dumpe2fs -h /dev/your_device      //get superblock information
dumpe2fs /dev/your_device | grep -i superblock      //get backups of superblock.
(2) debugfs (debug a file system)
Users can open and close a file system, and link, unlink or create files in it.
(3) e2fsck (check a Linux ext2/ext3 file system)
This tool is mainly used to repair a file system.

Bash cheat sheet

Part of the content of this post comes from other web sites, mainly from the Bash manual. Because formal documents and introductory books are long and very detailed, I list only some key points and usage examples, so that later I can quickly recall a bash feature by reading this post instead of the whole manual.

[Bash switch]
First, get a list of installed shells: chsh -l or cat /etc/shells.
Then use chsh -s to change your default shell. It modifies the file /etc/passwd to reflect your new shell preference.

[Output]
[echo] Output the arguments. If -n is given, trailing newline is suppressed. If -e is given, interpretation of escape characters is turned on.
[Output redirection]
> output redirection
"Code can be added to a program to figure out when its output is being redirected. Then, the program can behave differently in those two cases—and that’s what ls is doing." This means what you see on screen may be different from content of the file to which you redirect the output. For command ls, use ls -C>file.
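command 1> file1 2> file2
1: stdout; 2: stderr; 1 is the default number. There must be no whitespace between the number and >, otherwise the number is passed to the command as an argument.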
command >& file
command &> file
command > file 2>&1

These three commands redirect stdout and stderr to the same file.
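
A minimal sketch of these redirections follows. The emit function is a made-up helper that writes one line to each stream, and the file names are arbitrary.

```shell
tmpdir=$(mktemp -d)

# Hypothetical helper that writes one line to stdout and one to stderr.
emit() { echo "to stdout"; echo "to stderr" >&2; }

emit 1> "$tmpdir/out" 2> "$tmpdir/err"    # each stream goes to its own file
emit > "$tmpdir/both" 2>&1                # both streams go to the same file

out_text=$(cat "$tmpdir/out")
err_text=$(cat "$tmpdir/err")
both_lines=$(wc -l < "$tmpdir/both")
echo "out: $out_text, err: $err_text, lines in both: $((both_lines))"

rm -rf "$tmpdir"
```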
The tee command writes its input to the file named by its parameter and also writes the same data to standard output.
>> append

[Input]
< input
<< EOF here-document.
"all lines of the here-document are subjected to parameter expansion, command substitution, and arithmetic expansion."
<< \EOF or << 'EOF' or <<E\OF
When we escape some or all of the characters of the EOF, bash knows not to do the expansions.
<<-EOF
The hyphen just after the << is enough to tell bash to ignore the leading tab characters so that you can indent here-document. This is for tab characters only and not arbitrary white space.
Use read to get input from the user.
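
The difference between a plain and a quoted delimiter is easiest to see side by side. This is a small sketch that assumes bash; the variable names are arbitrary.

```shell
name=world

# Unquoted delimiter: $name and $(( ... )) are expanded.
expanded=$(cat <<EOF
hello $name, 1 + 2 = $(( 1 + 2 ))
EOF
)

# Quoted delimiter: the text is taken literally.
literal=$(cat <<'EOF'
hello $name
EOF
)

# read splits one input line into variables; the last variable takes the remainder.
read -r first rest <<EOF
alpha beta gamma
EOF

echo "$expanded"
echo "$literal"
echo "first=$first rest=$rest"
```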

[Command execution]
get exit status of last executed command using $? (0-255).
(1) command1 & command2 & command3       //execute commands independently
(2) command1 && command2 && command3  //execute commands sequentially; each one runs only if the previous one succeeded
(3) command1 || command2                       //command2 runs only if command1 failed
(4) command_name&
If a command is terminated by the control operator ‘&’, the shell executes the command asynchronously in a subshell. The shell does not wait for the command to finish, and the return status is 0 (true)
(5) nohup - run a command immune to hangups, with output to a non-tty
(6) command1; command2;
Commands separated by a ‘;’ are executed sequentially; the shell waits for each command to terminate in turn. The return status is the exit status of the last command executed.
(7) $( cmd ) or `cmd`: execute the command and return the output.
(8) $(( arithmetic_op )): do arithmetic operation and return the output.
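
The operators in this list can be tried with commands that have a known exit status. A minimal sketch, assuming bash; true, false and sleep stand in for real commands.

```shell
true && and_result="ran"        # runs: the left side succeeded
false || or_result="ran"        # runs: the left side failed
false || status=$?              # $? still holds the exit status of false

sleep 0 &                       # start a background job
bg_pid=$!                       # process id of that job
wait "$bg_pid"                  # block until it has finished

text=$(printf '%s' "captured")  # command substitution
sum=$(( 2 + 3 * 4 ))            # arithmetic expansion

echo "and=$and_result or=$or_result status=$status text=$text sum=$sum"
```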

[Pipe]
Each command in a pipeline is executed in its own subshell.

[Group commands]
{ command1; command2; } > file            //use current shell environment
( command1; command2; ) >file               //use a subshell to execute grouped commands
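
Whether a group runs in the current shell or in a subshell matters as soon as it assigns variables. The sketch below assumes bash with default settings; counter and piped are arbitrary names.

```shell
counter=0
{ counter=1; }                 # braces: runs in the current shell
after_braces=$counter

( counter=2 )                  # parentheses: runs in a subshell
after_parens=$counter

unset piped
echo "value" | read -r piped   # read runs in a subshell of the pipeline
after_pipe=${piped:-unset}

echo "braces=$after_braces parens=$after_parens pipe=$after_pipe"
```

The assignment inside the parentheses and the one made by read in the pipeline are both lost when their subshell ends.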

[Variables]
Array variable: varname=(value1 value2); elements are separated by whitespace, not commas. Use ${varname[index]} to access an element.
All variables are strings, although some bash operators can treat the content as other types.
Assignment: VARNAME=VALUE    //Note: there must be no whitespace around the equal sign. Otherwise the bash interpreter
                                              //cannot distinguish between a variable name and a command name, so it will
                                              //treat VARNAME as a command name.
Get value: $VARNAME    //the dollar sign is a must-have.
Use full syntax(braces) to separate variable name from surrounding text:
E.g. ${VAR}notvarname
export can be used to export a variable to the environment so that it can be used by other scripts.
export VAR=value
Exported variables are passed by value: a change made to the variable in the called script does not affect the value in the calling script.
Use command set to see all defined variables in the environment
Use command env to see all exported variables in the environment.
${varname:-defaultvalue}: if variable varname is set and not empty, return its value; else return defaultvalue.
${varname:=defaultvalue}: similar to the previous one, except that defaultvalue is also assigned to varname if the variable is empty or not set. varname cannot be a positional parameter (1, 2, ...) or a special parameter such as *.
${varname=defaultvalue}: the assignment happens only when varname is not set.
${varname:?defaultvalue}: if varname is unset or empty, print defaultvalue as an error message; a non-interactive shell then exits.
Parameters passed to a script can be retrieved by accessing special variables: ${1},${2}...
Use variable ${#} to get number of parameters.
Command shift can be used to shift the parameters: The positional parameters from $N+1 ... are renamed to $1 ...  If N is not given, it is assumed to be 1.
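
A short sketch of arrays, default values and shift, assuming bash. The variable names and the after_shift helper are made up for the illustration.

```shell
fruits=(apple "blood orange" kiwi)   # elements are separated by whitespace
second=${fruits[1]}                  # indexes start at 0

unset maybe
fallback=${maybe:-default}           # maybe itself stays unset
: "${maybe:=assigned}"               # maybe is now set to "assigned"

# Hypothetical helper: drop the first parameter, then report what is left.
after_shift() { shift; echo "$1 $#"; }
shifted=$(after_shift one two three)

echo "second=$second fallback=$fallback maybe=$maybe shifted=$shifted"
```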

[Special variables]
Variable $* can be used to get all parameters. If a parameter may contain blanks, use "$@" instead.
E.g. command "param1" "second param"
value of $* or $@: param1 second param
value of "$@": "param1" "second param"
value of "$*": "param1 second param"

Always use "$@" if possible! And always use "${varname}" (including the double quotes) to access the value of a variable.
If a parameter contains blanks, use double quotes to retrieve the value.
ls "${varname}"   instead of   ls ${varname}.
${#}: Expands to the number of positional parameters in decimal.
${-}: (A hyphen) Expands to the current option flags as specified upon invocation, by the set builtin command, or those set by the shell itself (such as the -i option).
${$}: Expands to the process id of the shell. In a () subshell, it expands to the process id of the invoking shell, not the subshell.
${!}: Expands to the process id of the most recently executed background (asynchronous) command.
${0}: Expands to the name of the shell or shell script. This is set at shell initialization.
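
Counting the words that each form produces makes the difference between $*, "$@" and "$*" concrete. A minimal sketch, assuming bash and the default IFS; count_args and demo are made-up helpers.

```shell
# Hypothetical helper that prints how many arguments it received.
count_args() { echo "$#"; }

demo() {
  unquoted=$(count_args $*)        # split again on blanks
  quoted_at=$(count_args "$@")     # one word per parameter
  quoted_star=$(count_args "$*")   # everything joined into a single word
}
demo "param1" "second param"

echo "unquoted=$unquoted quoted_at=$quoted_at quoted_star=$quoted_star"
```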

[Misc]
(1) ${#var}: return length of the string
# character denotes start of a comment.
(2) Long comment: use "do nothing(:)" and here-document.
E.g.
:<<EOF
doc goes here
EOF

[Quotation]
Unquoted text and double quoted text is subject to shell expansion.
(1) In double quoted text, '\' can be used to escape special characters.  Backslashes preceding characters without a special meaning are left unmodified.
(2) Single quoted text is not subject to shell expansion, so escape sequences do not work inside it. The text 'use \' in single quoted text' is therefore incorrect: you cannot include a single quote inside single quoted text, even with a '\'. You can, however, escape a single quote outside of the surrounding single quotes.
(3) Words of the form $'string' are treated specially. The word expands to string, with backslash-escaped characters replaced as specified by the ANSI C standard. The expanded result is single-quoted, as if the dollar sign had not been present.
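
The three quoting styles can be compared with a few assignments. A small sketch, assuming bash (the $'...' form is not available in every shell); the strings are arbitrary.

```shell
name=bash
double="name: $name, literal: \$name"   # $name expands, \$ stays a dollar sign
single='name: $name'                     # nothing expands
apostrophe='it'\''s'                     # close the quotes, add an escaped quote, reopen
ansi=$'a\tb'                             # \t becomes a real tab

printf '%s\n' "$double" "$single" "$apostrophe" "$ansi"
```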

[Function/Command scope]
If you want to force the use of an external command instead of a built-in command, disable the built-in with enable -n. Alternatively, use command command_name, which ignores shell functions.
E.g. enable -n built-in-function
       command command_name
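
The effect of command is easiest to see with a function that shadows an existing name. A minimal sketch, assuming bash; the pwd function is defined only for the demonstration and removed again afterwards.

```shell
# Hypothetical function that shadows the pwd built-in.
pwd() { echo "shadowed"; }

via_function=$(pwd)            # calls the function
via_command=$(command pwd)     # skips the function and runs the built-in pwd

unset -f pwd                   # remove the function again
echo "function: $via_function, command: $via_command"
```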

[Control structures and more]
(1) until test-command; do statements; done
(2) while test-command; do statements; done
(3) for var in words; do statements; done
(4) for ((expr1; expr2; expr3 )); do statements; done
(5) if test-commands; then statements;
     elif test_commands2; then statements;
     else statements;
     fi
(6) case word in
     pattern) statements;;
     pattern2|pattern3) statements;;
     *) statements;;
     esac
(7) select var in words; do statements; done
The commands are executed after each selection until a break command is executed, at which point the select command completes.
(8) (( arithmetic_expression))
If the value of the expression is non-zero, return status is 0; otherwise return status is 1. This is equivalent to
let "expression".
(9)[[ ]]
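
Several of these structures combined in one short sketch, assuming bash. The classify function and the values are made up for the illustration.

```shell
total=0
for (( i = 1; i <= 4; i++ )); do
  total=$(( total + i ))
done

n=3
until (( n == 0 )); do
  n=$(( n - 1 ))
done

# Hypothetical helper that classifies a file name by its suffix.
classify() {
  case $1 in
    *.txt|*.md) echo "text";;
    *.sh)       echo "script";;
    *)          echo "other";;
  esac
}
kind=$(classify notes.md)

(( 0 )) || zero_status=$?       # the expression is 0, so the return status is 1
(( 5 )) && nonzero_status=$?    # the expression is non-zero, so the return status is 0

if [[ $kind == text && $total -eq 10 ]]; then
  verdict="as expected"
else
  verdict="unexpected"
fi
echo "total=$total n=$n kind=$kind statuses=$zero_status,$nonzero_status verdict=$verdict"
```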

[Shell expansions]
(1) Brace expansion (note: ${...} is not expanded)
a{b,c,d}e  => abe ace ade
(2) Tilde expansion
~: value of $HOME
~username: home directory of user username.
~+: $PWD
~-: $OLDPWD
~N: equivalent to 'dirs +N'
~-N: equivalent to 'dirs -N'
(3) Parameter expansion
(4) Command substitution
$(command) or `command`: executes command and replaces the command substitution with the standard output of the command, with any trailing newlines deleted.
(5) Arithmetic expansion
$(( expression )): allows the evaluation of an arithmetic expression and the substitution of the result.
(6) Process substitution
Actually I don't understand it much.
(7) Word splitting
"The shell scans the results of parameter expansion, command substitution, and arithmetic expansion that did not occur within double quotes for word splitting."
(8) Filename expansion
"After word splitting, unless the -f option has been set (see The Set Builtin), Bash scans each word for the characters ‘*’, ‘?’, and ‘[’. If one of these characters appears, then the word is regarded as a pattern, and replaced with an alphabetically sorted list of file names matching the pattern."

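Most of the expansions listed above can be observed directly. The following is a small sketch that assumes bash and the default IFS; count_words and the file names are made up for the illustration.

```shell
# Hypothetical helper that prints how many words it received.
count_words() { echo "$#"; }

braces=$(echo a{b,c,d}e)               # brace expansion
here=~+                                # tilde expansion: the value of $PWD
output=$(printf 'text\n\n')            # command substitution drops trailing newlines
product=$(( 6 * 7 ))                   # arithmetic expansion

list="one two three"
split=$(count_words $list)             # unquoted: split into separate words
whole=$(count_words "$list")           # quoted: stays a single word

tmpdir=$(mktemp -d)
touch "$tmpdir/b.txt" "$tmpdir/a.txt" "$tmpdir/c.log"
matches=$(cd "$tmpdir" && echo *.txt)  # filename expansion, sorted
rm -rf "$tmpdir"

echo "braces=$braces output=$output product=$product split=$split whole=$whole matches=$matches"
```
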
[Useful commands]
(*) Customize the shell prompt
Set the variable "PS1".
(*) How to find commands?
Try following commands:
[type]
    a bash built-in. Reports how a name would be interpreted: it searches aliases, keywords, functions, built-in commands and files in $PATH.
[which]
    displays the full path of the executables that would have been executed when this argument had been entered at the shell prompt(just searches files in $PATH).
[apropos]
    search the whatis database for strings(same as man -k)
[locate, slocate]
    reads  one or more databases prepared by updatedb and writes file names matching at least one of the PATTERNs to standard output. Location of the actual database varies from system to system. 
[whereis]
    locate the binary, source, and manual page files for a command. An entry is displayed only when the whole searched word is matched(not a substring of a long word).
[find]
    search for files in a directory hierarchy.
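
For type, the -t option prints just the kind of name that was found. A minimal sketch, assuming bash; the greet function exists only to show how a function is reported.

```shell
kind_cd=$(type -t cd)          # a built-in command
kind_if=$(type -t if)          # a shell keyword

# Hypothetical function, defined only to show how type reports it.
greet() { echo "hi"; }
kind_greet=$(type -t greet)

echo "cd: $kind_cd, if: $kind_if, greet: $kind_greet"
```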
(*) Info about a command
[man]:
    display manual of the command;
[help]:
    display information about bash built-in commands. If you use man to look up a bash built-in command, you get the large bash manual instead.
[info]:
    display Info doc of an arbitrary command.
(*) Get information about a file
[ls]
    List information about the FILEs (the current directory by default). You can either list the contents of a directory or get information about a file (a regular file, not a directory).
E.g.
    ls -a .*  //list all files whose names start with a dot? No! Bash expands the pattern .* first. The expansion can
               // include .. so the content of the parent directory is displayed as well, along with the content of every
               // hidden directory. (Newer bash versions with the globskipdots option enabled leave out . and ..)
               // You can use the echo command to see what the pattern expands to. E.g. echo .*
    ls -d .*  //the -d switch makes ls list the directory entries themselves instead of their contents.
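
The expansion can be inspected safely in a scratch directory. A minimal sketch; the temporary directory and the two file names are made up for the illustration.

```shell
tmpdir=$(mktemp -d)
touch "$tmpdir/.hidden" "$tmpdir/visible"

# Let the shell show what the pattern expands to before ls ever sees it.
expanded=$(cd "$tmpdir" && echo .*)

# -d lists the matching entries themselves, not the contents of directories.
listed=$(cd "$tmpdir" && ls -d .*)

echo "pattern expands to: $expanded"
echo "ls -d prints: $listed"
rm -rf "$tmpdir"
```

Whether . and .. show up in the expansion depends on the bash version and its options, but the non-hidden file never does.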
[stat]
    Display file or file system status.
[find]
    E.g. find /path/ -name file_name -printf '%m %u %t'
[file]
    determine file type. There are three sets of tests, performed in this order: filesystem tests, magic number tests, and language tests.