

Bash's elegant simplicity seems to get lost in its huge man page. In addition to the excellent solutions above, I thought I'd try to give you a cheat sheet on how bash parses and interprets statements. Then, using this roadmap, I'll parse the examples presented by the questioner to help you better understand why they don't work as intended.

Note: Shell script lines are used directly. Typed input lines are first history-expanded.

Each bash line is first tokenized, or in other words chopped into what are called tokens. (Tokenizing occurs before all other expansions, including brace, tilde, parameter, command, arithmetic, process, word splitting, and filename expansion.)

A token here means a portion of the input line separated (delimited) by one of these special meta-characters: space, tab, and newline (white space), plus |, &, ;, (, ), <, and >.

Bash uses many other special characters, but only these 10 produce the initial tokens.

However, because these meta-characters sometimes must also be used within a token, there needs to be a way to take away their special meaning. Escaping is done either by quoting a string of one or more characters (e.g. 'xx.', "xx."), or by prefixing an individual character with a backslash (e.g. \x). (It's a little more complicated than this because the quotes also need to be quoted, and because double quotes don't quote everything, but this simplification will do for now.)

Don't confuse bash quoting with the idea of quoting a string of text, like in other languages. What is in between quotes in bash are not strings, but rather sections of the input line that have meta-characters escaped so they don't delimit tokens.

Note, there is an important difference between ' and ", but that's for another day.

The remaining unescaped meta-characters then become token separators.

In the first example there are two tokens produced by a space delimiter: echo and xyz.

In the third example the semicolon is escaped, so there are 4 tokens produced by a space delimiter: echo, x;, echo, and y. The first token is then run as the command, and takes the next three tokens as input.

The important thing to remember is that bash first looks for escaping characters (', ", and \), and then looks for unescaped meta-character delimiters, in that order. If not escaped, these 10 special characters serve as token delimiters. Some of them also have additional meaning, but first and foremost, they are token delimiters.

grep needs these tokens: grep, string, filename. When the pattern is left unquoted, (, ), and | are unescaped meta-characters and so serve to split the input into these tokens: grep, (, then, |, there, ), and x.x. grep wants to see grep, then|there, and x.x. When the pattern is quoted, the line tokenizes into grep, (then|there), and x.x.

Pattern options

-F, --fixed-strings # list of fixed strings
-G, --basic-regexp # basic regular expression (default)
-E, --extended-regexp # extended regular expression
-P, --perl-regexp # perl compatible regular expression

Expressions

Basic Regular Expressions (BRE)

In BRE, these characters have a special meaning unless they are escaped with a backslash: ^ $ . * [ ] \

However, these characters do not have any special meaning unless they are escaped with a backslash: ? + { } | ( )
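
The tokenizing and escaping rules can be checked directly at a prompt. Here is a minimal sketch, assuming a helper called count_args (a made-up name, not a bash builtin) that simply reports how many tokens bash handed to it. The inputs are illustrative and are not the questioner's exact commands.

```shell
#!/usr/bin/env bash
# count_args is a hypothetical helper: it prints the number of tokens
# (arguments) it receives after bash has tokenized the line.
count_args() { echo "$#"; }

# Double quotes, single quotes, and a backslash each escape their part,
# and the adjacent parts join into the single token xyz.
one=$(count_args "x"'y'\z)

# Unescaped spaces delimit tokens: x, echo, y.
three=$(count_args x echo y)

# The escaped semicolon stays inside the token "x;" instead of
# terminating the command, so echo receives x;, echo, and y.
escaped=$(echo x\; echo y)

# Escaped meta-characters no longer split the line: one token, <|>.
meta=$(count_args "<"'|'\>)

echo "$one $three $escaped $meta"   # prints: 1 3 x; echo y 1
```

Removing the backslash before the semicolon turns the third line into two separate commands, which is the whole difference between an escaped and an unescaped meta-character.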

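To see how quoting and the pattern options interact, here is a small sketch. The file name x.x and the pattern come from the question. The file contents are invented for illustration, and the file is created in a temporary directory.

```shell
#!/usr/bin/env bash
# Scratch directory so nothing in the current directory is touched.
dir=$(mktemp -d)
file="$dir/x.x"

# Hypothetical contents for x.x.
printf '%s\n' 'now and then' 'over there' 'plain line' > "$file"

# Unquoted, bash itself tokenizes on ( ) and |, and the line is rejected
# as a syntax error before grep ever runs:
#   grep (then|there) x.x

# Quoted, grep receives the single token (then|there). With -E the ( ) and |
# act as regex operators, so lines containing "then" or "there" match.
ere=$(grep -E "(then|there)" "$file")

# In BRE (the default), the same operators need backslashes.
# Note: \| in BRE is an extension (supported by GNU grep), not POSIX.
bre=$(grep '\(then\|there\)' "$file")

# With -F the pattern is a fixed string, so only a literal "(then|there)"
# would match; nothing in this file does. "|| true" hides grep's exit status 1.
fixed=$(grep -F "(then|there)" "$file" || true)

printf 'ERE:\n%s\nBRE:\n%s\nFixed: [%s]\n' "$ere" "$bre" "$fixed"

rm -rf "$dir"
```

Single quotes are used for the BRE pattern so that bash passes the backslashes through to grep untouched.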