Telephone:
01892 531108
(within the U.K.)
+44-1892-531108
(outside the U.K.)
Mail Address:
Aquila Software
30 Fernhurst Crescent
Southborough
Tunbridge Wells
Kent, TN4 0TB
United Kingdom.
Examine32 Text Search manual
Regular Expressions
Regular expressions are derived from the UNIX utility GREP and
enable powerful text searches to be carried out using the special characters
^, $, ., *, +, ?, [ ], [^], [-], \, ( ) and |. These
characters have the following meanings:
^
A circumflex at the beginning of an expression matches
the start of a line. For instance ^while will find
all lines starting with while.
$
A dollar at the end of an expression matches the end of a
line. For instance tomorrow$ will find all lines
ending with tomorrow.
*
An asterisk after a character will match zero or
more occurrences of that character. For instance to*
will match t, to and too.
+
A plus sign after a character will match one or
more occurrences of that character. For instance to+
will match to and too.
?
A question mark after a
character will match zero or one occurrence of that character. For
instance to? will match t and to.
.
A period matches any
character. For instance p.n will match pan,
pen, pin and pun.
|
The vertical line character
matches either expression it separates. For example
pan|pen will
match pan and pen.
( )
Characters can be grouped within parentheses. This
allows certain expressions to act on more than one character. For
instance find(ing)?s will match
finds and findings.
[ ]
Characters in square brackets
will match any one of the enclosed characters. For instance p[aei]n
will match pan, pen and pin
but not pun.
[^]
A circumflex at the start of
an expression within brackets will match any character except one of
the enclosed characters. For instance p[^aei]n
will match pun but not pan, pen
or pin.
[-]
A hyphen within brackets
indicates a range of characters. For instance p[a-h]n
will match pan and pen but not pin
or pun.
\
A backslash before any of the
above special characters treats that character literally. For instance \.
will be treated as a period rather than as any character.
\w
Matches any word character. Word characters are the characters
a-z, A-Z, 0-9, _ and any other character recognised by your system,
such as é and ä. For instance resum\w will match resume and resumé.
\W
Matches any non-word character.
\s
Matches any white space character, including line endings. For
instance text\ssearch will match text search even if the two words
are on separate lines.
\S
Matches
any character that is not a white space character.
\d
Matches any digit character. For instance \d\d\d
will match 999 and 101.
\D
Matches any character that is not a digit character.
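The examples in the list above can be checked mechanically. The sketch below uses Python's re module, which is a different engine from the one inside Examine32; it is assumed here only that the two agree on these basic constructs. The names EXAMPLES, check_examples and lines_matching are illustrative and are not part of Examine32.

```python
import re

# (pattern, words that should match in full, words that should not).
# The "should not" words for entries where the manual gives none are
# extra illustrations, not taken from the manual.
EXAMPLES = [
    (r"to*", ["t", "to", "too"], []),
    (r"to+", ["to", "too"], ["t"]),
    (r"to?", ["t", "to"], ["too"]),
    (r"p.n", ["pan", "pen", "pin", "pun"], []),
    (r"pan|pen", ["pan", "pen"], ["pin"]),
    (r"find(ing)?s", ["finds", "findings"], ["finding"]),
    (r"p[aei]n", ["pan", "pen", "pin"], ["pun"]),
    (r"p[^aei]n", ["pun"], ["pan", "pen", "pin"]),
    (r"p[a-h]n", ["pan", "pen"], ["pin", "pun"]),
    (r"\.", ["."], ["a"]),
    # \u00e9 is the precomposed letter e with an acute accent.
    (r"resum\w", ["resume", "resum\u00e9"], []),
    (r"text\ssearch", ["text search", "text\nsearch"], []),
    (r"\d\d\d", ["999", "101"], ["9a9"]),
]


def check_examples(examples=EXAMPLES):
    """Return (pattern, word, expected) for every disagreement with Python's re."""
    failures = []
    for pattern, should_match, should_not_match in examples:
        for word in should_match:
            if re.fullmatch(pattern, word) is None:
                failures.append((pattern, word, True))
        for word in should_not_match:
            if re.fullmatch(pattern, word) is not None:
                failures.append((pattern, word, False))
    return failures


def lines_matching(pattern, text):
    """Return the lines of text that contain a match; ^ and $ anchor to each line."""
    return [line for line in text.splitlines() if re.search(pattern, line)]
```

An empty list from check_examples() means Python's engine agrees with every example listed. lines_matching covers the two anchors, which only make sense against whole lines: lines_matching(r"^while", text) keeps the lines that start with while.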
Within square brackets the special characters $,
., * and +
are treated literally, while ^ is only treated as a
special character if it immediately follows the opening [.
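The bracket rule can be illustrated in the same way. This is again a sketch in Python's re module, assumed to behave like Examine32 for these cases; the function names are hypothetical.

```python
import re


def bracket_literals():
    """Return the characters from "$.*+a" that the set [$.*+] matches."""
    # Inside [ ] each of $ . * + stands for itself, so the letter a is rejected.
    return [ch for ch in "$.*+a" if re.fullmatch(r"[$.*+]", ch)]


def caret_positions():
    """Return (negated, literal) showing the two roles of ^ inside brackets."""
    # [^ab] negates the set: it matches any character except a or b.
    negated = re.fullmatch(r"[^ab]", "c") is not None
    # [a^b] treats ^ literally because it does not immediately follow [.
    literal = re.fullmatch(r"[a^b]", "^") is not None
    return negated, literal
```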
Further Examples
colou?r will match color and colour.
p[a-k]+n will match pan, pen, pin and pain.
th.*y will match thy, they and theoretically.
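A short way to try these three expressions against a word list, sketched with Python's re module under the same assumption that it agrees with Examine32 here. WORDS and whole_word_matches are made-up names, and the list includes a few extra words that should not match.

```python
import re

WORDS = ["color", "colour", "colouur", "pan", "pen", "pin", "pain",
         "pun", "pn", "thy", "they", "theoretically", "the"]


def whole_word_matches(pattern, words=WORDS):
    """Return the words that pattern matches from first to last character."""
    compiled = re.compile(pattern)
    return [word for word in words if compiled.fullmatch(word)]
```

For instance, whole_word_matches(r"p[a-k]+n") keeps pan, pen, pin and pain but drops pun (u is outside a-k) and pn (+ needs at least one character).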