Wednesday, June 26, 2013

shell PS1 terminal escape sequence in linux

If you've seen a string like "\[\e[1;33m\]" in someone's PS1 (bash prompt) value, you may be curious about what it means. Or you may be using it without really knowing what it means (e.g. you just copied and pasted it from another bash config file). The format is weird, isn't it?

PS1 bash prompt

Run "man bash" and search for PROMPTING. You will see that:

  1. \[  begin a sequence of non-printing characters, which could be used to embed a terminal control sequence into the prompt
  2. \]  end a sequence of non-printing characters
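
As a minimal sketch of how the two markers are used, the following sets a prompt whose "user@host" part is bold yellow. The prompt layout (\u@\h:\w) is only an assumption for illustration; any layout works the same way.

```shell
# Bold yellow "user@host", then back to the default rendition before the path.
# Each escape sequence is wrapped in \[ ... \] so that bash does not count
# its bytes toward the visible width of the prompt.
PS1='\[\e[1;33m\]\u@\h\[\e[0m\]:\w\$ '
```

If the \[ and \] markers are left out, the colors still show up, but bash miscounts the prompt length and long command lines tend to wrap or redraw incorrectly.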

You may wonder what a "terminal control sequence" is. Generally, a terminal control sequence tells the terminal to change the data it holds or the way it presents that data (e.g. move the cursor, change the color, delete a line).

ECMA-48

Terminal control sequences are defined in the ECMA-48 spec [1]. Here I will focus on only one commonly used control sequence, called SGR (the details are on page 75 of the PDF file, or page 61 of the spec content). SGR controls many graphic aspects of the terminal display.

SGR stands for Select Graphic Rendition. You can think of it as just a function which takes some parameters. We need to represent/serialize it so that it can be transferred to and interpreted by a terminal. Its representation in the spec is "CSI Ps... 06/13". I break it down below, using "\e[1;33m" as the example:

Part: CSI
    Full name: Control Sequence Introducer
    ASCII code (hex): 0x9b in 8-bit mode; 0x1b 0x5b ("\e[") in 7-bit mode
    In the example "\e[1;33m": "\e[" is the CSI

Part: Ps…
    Full name: parameters (each control function can have one or more parameters)
    ASCII code (hex): SGR can have multiple parameters. Each parameter is an integer represented in decimal notation. Parameters are separated by 0x3b (';')
    In the example "\e[1;33m": "1;33" are two parameters. 1 means bold, and 33 means yellow foreground color

Part: 06/13
    Full name: control function SGR
    ASCII code (hex): 0x6d ('m')
    In the example "\e[1;33m": "m"
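
Since SGR behaves like a function of its parameters, the serialization can be sketched as a small bash helper. The name sgr is hypothetical (it is not a standard command); it simply places its arguments, joined with ';', between the 7-bit CSI and the final byte 'm'.

```shell
#!/usr/bin/env bash

# sgr: hypothetical helper (not a standard command).
# Serializes an SGR control function as CSI + parameters + 'm'.
sgr() {
    local IFS=';'             # "$*" joins the arguments with ';' (0x3b)
    printf '\033[%sm' "$*"    # \033[ is the 7-bit CSI (0x1b 0x5b)
}

# Print a word in bold yellow, then restore the default rendition (parameter 0).
printf '%s%s%s\n' "$(sgr 1 33)" "warning" "$(sgr 0)"

# Dump the raw bytes of the example sequence as hex:
# 1b 5b are the CSI, 31 3b 33 33 are "1;33", and 6d is 'm'.
sgr 1 33 | od -An -tx1
```

The hex dump is a quick way to check the breakdown by hand against the ASCII codes listed above.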

You may wonder why parameter 33 means yellow and parameter 1 means bold. You cannot feed arbitrary parameter values to SGR; the spec fixes the meaning of each one. The allowed parameter values, copied from [1], are shown below. The items marked in bold text are the commonly used parameters.

0  default rendition (implementation-defined), cancels the effect of any preceding occurrence of SGR in the data stream regardless of the setting of the GRAPHIC RENDITION COMBINATION MODE (GRCM)
1  bold or increased intensity
2  faint, decreased intensity or second colour
3 italicized
4 singly underlined
5  slowly blinking (less than 150 per minute)
6  rapidly blinking (150 per minute or more)
7 negative image
8 concealed characters
9  crossed-out (characters still legible but marked as to be deleted)
10  primary (default) font
11  first alternative font
12  second alternative font
13  third alternative font
14  fourth alternative font
15  fifth alternative font
16  sixth alternative font
17  seventh alternative font
18  eighth alternative font
19  ninth alternative font
20 Fraktur (Gothic)
21 doubly underlined
22  normal colour or normal intensity (neither bold nor faint)
23  not italicized, not fraktur
24  not underlined (neither singly nor doubly)
25  steady (not blinking)
26  (reserved for proportional spacing as specified in CCITT Recommendation T.61)
27 positive image
28 revealed characters 
29  not crossed out
30 black display
31 red display
32 green display
33 yellow display
34 blue display
35 magenta display
36 cyan display
37 white display
38  (reserved for future standardization; intended for setting character foreground colour as specified in ISO 8613-6 [CCITT Recommendation T.416])
39  default display colour (implementation-defined)
40 black background
41 red background
42 green background
43 yellow background
44 blue background
45 magenta background
46 cyan background
47 white background
48  (reserved for future standardization; intended for setting character background colour as specified in ISO 8613-6 [CCITT Recommendation T.416])
49  default background colour (implementation-defined)
50  (reserved for cancelling the effect of the rendering aspect established by parameter value 26)
51 framed
52 encircled
53 overlined
54  not framed, not encircled
55 not overlined
56  (reserved for future standardization)
57  (reserved for future standardization)
58  (reserved for future standardization)
59  (reserved for future standardization)
60  ideogram underline or right side line 
61  ideogram double underline or double line on the right side 
62  ideogram overline or left side line
63  ideogram double overline or double line on the left side
64  ideogram stress marking
65  cancels the effect of the rendition aspects established by parameter values 60 to 64
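
To see how your own terminal renders some of these parameters, a short loop is enough. This is only a sketch: show_colors is a hypothetical helper name, and the ranges 30 to 37 and 40 to 47 are the foreground and background colours from the list above.

```shell
#!/usr/bin/env bash

# show_colors: hypothetical helper that prints every parameter number
# from $1 to $2, each rendered with that same SGR parameter.
show_colors() {
    local first=$1 last=$2 n
    for ((n = first; n <= last; n++)); do
        # Select rendition n, print the number, then reset with parameter 0.
        printf '\033[%dm %d \033[0m' "$n" "$n"
    done
    printf '\n'
}

show_colors 30 37   # foreground colours
show_colors 40 47   # background colours
```

Some entries may be hard to read (for example, 30 on a black background), and terminals differ in which of the other parameters they support.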

More

PS1 is only one of the places where you can specify a terminal control sequence. Many other programs also accept them. For example, in my .inputrc (the config file for readline), I have entries like:

# Control sequences ending with '~' are reserved for private use in ECMA-48.
# These are the sequences my terminal sends when I press the Home/End keys,
# so I bind them to moving the cursor to the beginning/end of the line.
"\e[1~": beginning-of-line
"\e[4~": end-of-line

# ECMA-48 section 8.3.20, CUF (cursor right)
"\e[5C": forward-word
# ECMA-48 section 8.3.18, CUB (cursor left)
"\e[5D": backward-word

# The remaining sequences are not specified in the standard.
# They may be extensions or simply non-standard implementations.
"\e[1;5C": forward-word
"\e[1;5D": backward-word
"\e\e[C": forward-word
"\e\e[D": backward-word
"\e[1;\e[C": forward-word
"\e[1;\e[D": backward-word

I added comments to the config so you can easily find the references.
If you are not sure what key sequence is sent when you press the Home/End key on your keyboard, run the command "cat -v" and press the key. For me, cat shows "^[[1~" when I press the Home key. You can replace "^[" with "\e" to get the terminal control sequence.
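
The replacement step can also be scripted. The sketch below uses a hypothetical helper, caret_to_escape, and simulates the Home key by printing the bytes ESC [ 1 ~ (an assumption based on what my terminal sends; yours may differ).

```shell
#!/usr/bin/env bash

# caret_to_escape: hypothetical helper that rewrites the "^[" notation
# printed by cat -v for ESC into the "\e" notation used in PS1 and .inputrc.
caret_to_escape() {
    sed 's/\^\[/\\e/g'
}

# Simulate the key press, make the bytes visible with cat -v, then convert.
printf '%s\n' "$(printf '\033[1~' | cat -v | caret_to_escape)"
# prints: \e[1~
```

The result can be pasted between double quotes on the left-hand side of an .inputrc binding.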

References:

[1] http://www.ecma-international.org/publications/standards/Ecma-048.htm
