Ganesh_JavaSE7_Programming_1z0-804_study_guide.pdf

Chapter 7 String Processing

Regular Expressions

A regular expression defines a search pattern that can be used for operations such as string search and string manipulation. It is a sequence of predefined symbols, written in a predefined syntax, that helps you search or manipulate strings. A regular expression is specified as a string and is applied to another string from left to right.

You may wonder why you need a regex (short for REGular EXpression) when you can perform a search directly with a string method, as you did in the last section using indexOf(), for example. The answer is quite simple. You can use the indexOf() method (or any similar method) when you know the exact string to search for. However, when you know only the pattern of a string and not the specific string, you need a regex. Regex is a much more powerful tool than simple search methods for searching and manipulating strings. For example, say you want to find all e-mail addresses in a given string. You cannot achieve this using the indexOf() method since you don't know the exact e-mail addresses; however, you can use a regex to specify a pattern that will find all the e-mail addresses in the string.
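The contrast between an exact search and a pattern search can be sketched as follows. This is a minimal illustration, not a production-grade validator: the class name EmailSearch, the helper findEmails(), and the deliberately simplified e-mail pattern are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EmailSearch {
    // Simplified pattern (an assumption for illustration only): one or more
    // word characters, dots, or hyphens, then '@', then a dotted domain name.
    private static final Pattern EMAIL =
            Pattern.compile("[\\w.-]+@[\\w-]+(\\.[\\w-]+)+");

    // Collects every substring of text that matches the pattern.
    static List<String> findEmails(String text) {
        List<String> found = new ArrayList<>();
        Matcher matcher = EMAIL.matcher(text);
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "Contact alice@example.com or bob.smith@mail.example.org today.";

        // indexOf() works only when the exact string is known in advance.
        System.out.println(text.indexOf("alice@example.com")); // 8

        // A regex finds every address without knowing any of them beforehand.
        System.out.println(findEmails(text));
        // [alice@example.com, bob.smith@mail.example.org]
    }
}
```

Real e-mail addresses allow more forms than this pattern accepts, so treat it only as a demonstration of searching by pattern rather than by exact string.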

Understanding regex Symbols

We will now focus on understanding the syntax and semantics of symbols used to specify regular expressions. Table 7-3 shows commonly used symbols to specify regex.

Table 7-3.  Commonly Used Symbols to Specify regex

Symbol      Description
^expr       Matches expr at beginning of line.
expr$       Matches expr at end of line.
.           Matches any single character (except the newline character).
[xyz]       Matches either x, y, or z.
[p-z]       Specifies a range. Matches any character from p to z.
[p-z1-9]    Matches a single character that is either in the range p to z or a digit from 1 to 9 (remember, it won't match the two-character string p1).
[^p-z]      A '^' as the first character inside brackets negates the pattern; matches any character except the characters p to z.
xy          Matches x followed by y.
x | y       Matches either x or y.
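The symbols in Table 7-3 can be tried out with a short program such as the one below. It is a sketch under stated assumptions: the class name RegexSymbols and the helper found() are made up for this example. Note that in Java, unless Pattern.MULTILINE is set, ^ and $ anchor to the start and end of the whole input rather than of each line, and String.matches() requires the entire string to match.

```java
import java.util.regex.Pattern;

class RegexSymbols {
    // Returns true if regex matches somewhere inside input (a search,
    // unlike String.matches(), which must match the whole string).
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(found("^Java", "Java SE 7"));  // true: input starts with Java
        System.out.println(found("^SE", "Java SE 7"));    // false: SE is not at the start
        System.out.println(found("7$", "Java SE 7"));     // true: input ends with 7
        System.out.println(found("J.va", "Jova"));        // true: '.' matches any one character

        System.out.println("q".matches("[p-z]"));         // true: q is in the range p to z
        System.out.println("p1".matches("[p-z1-9]"));     // false: a class matches one character
        System.out.println("p1".matches("[p-z][1-9]"));   // true: two classes, two characters
        System.out.println("a".matches("[^p-z]"));        // true: a is outside p to z
        System.out.println("cat".matches("cat|dog"));     // true: alternation
    }
}
```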

 

 

You can use the symbols given in Table 7-3 to specify a regex. For example, you can write "[0-9]" to match any digit character or "[ \t\r\f\n]" to match common whitespace characters. You can also use certain predefined metasymbols to ease the regex specification. For instance, you can specify "\d" instead of "[0-9]" to match digits, or "\s" instead of "[ \t\r\f\n]" to match whitespace (in Java, "\s" additionally matches the vertical tab). Table 7-4 summarizes a list of commonly used metasymbols.
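As a brief sketch of the metasymbols just mentioned, the program below compares "\d" with "[0-9]" and uses "\s" to split text. The class name Metasymbols and the helpers maskDigits() and splitOnWhitespace() are hypothetical names chosen for this example. Keep in mind that inside a Java string literal the backslash itself must be escaped, so the regex \d is written as "\\d".

```java
import java.util.Arrays;

class Metasymbols {
    // Replaces every digit with '#'; "\\d" is the regex \d in a Java literal.
    static String maskDigits(String s) {
        return s.replaceAll("\\d", "#");
    }

    // Splits on runs of whitespace after trimming the ends.
    static String[] splitOnWhitespace(String s) {
        return s.trim().split("\\s+");
    }

    public static void main(String[] args) {
        // \d and [0-9] accept the same single digit.
        System.out.println("7".matches("\\d"));    // true
        System.out.println("7".matches("[0-9]"));  // true

        System.out.println(maskDigits("PIN 1234"));  // PIN ####

        System.out.println(Arrays.toString(splitOnWhitespace("one\ttwo  three\n")));
        // [one, two, three]
    }
}
```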

Table 7-4.  Commonly Used Metasymbols to Specify regex

Symbol      Description
\d          Matches digits (equivalent to [0-9]).
\D          Matches non-digits.
\w          Matches word characters.

(continued)
