Chapter 22: C# and Regular Expressions

Metacharacters Supported in Visual C# .NET

Visual C#.NET has a very complete and extensive regular expressions implementation, which exceeds in functionality many of the tools you saw in earlier chapters of this book.

Much of the regular expression support in Visual C# .NET can reasonably be termed standard. However, as with many Microsoft technologies, the standard syntax and techniques have been extended or modified in places.

The following table summarizes many of the metacharacters supported in Visual C# .NET.

Metacharacter	Description

\d	Matches a numeric digit.
\D	Matches any character except a numeric digit.
\w	Matches a word character. Equivalent to the character class [A-Za-z0-9_] when RegexOptions.ECMAScript is set; otherwise it also matches Unicode letters and digits.
\W	Matches any character that \w does not match. Equivalent to the character class [^A-Za-z0-9_] when RegexOptions.ECMAScript is set.
\b	Matches the position at the beginning of a sequence of \w characters or at the end of a sequence of \w characters. Colloquially, \b is referred to as a word-boundary metacharacter.
\B	Matches a position that is not a \b position.
\t	Matches a tab character.
\n	Matches a newline character.
\040	Matches an ASCII character expressed in octal notation. The metacharacter \040 matches a space character.
\x20	Matches an ASCII character expressed in hexadecimal notation with exactly two hexadecimal digits. The metacharacter \x20 matches a space character.
\u0020	Matches a Unicode character expressed in hexadecimal notation with exactly four hexadecimal digits. The metacharacter \u0020 matches a space character.
[...]	Matches any character specified in the character class.
[^...]	Matches any character but the characters specified in the character class.
\s	Matches a whitespace character.
\S	Matches any character that is not a whitespace character.
^	Matches the position before the first character in a string or, when the Multiline option is set, the position before the first character in each line.
$	Matches the position after the last character in a string or, when the Multiline option is set, the position after the last character in each line.

$number	In a replacement string, substitutes the character sequence matched by the last occurrence of group number number.
${name}	In a replacement string, substitutes the character sequence matched by the last occurrence of the group named name.
\A	Matches the position before the first character in a string. Its behavior is not affected by the setting of the Multiline option.
\Z	Matches the position after the last character in a string, or before a final newline at the end of the string. Its behavior is not affected by the setting of the Multiline option.
\G	Specifies that a match must begin where the previous match ended, so that matches are consecutive, without any intervening nonmatching characters.
?	A quantifier. Matches when there is zero or one occurrence of the preceding character or group.
*	A quantifier. Matches when there are zero or more occurrences of the preceding character or group.
+	A quantifier. Matches when there are one or more occurrences of the preceding character or group.
{n}	A quantifier. Matches when there are exactly n occurrences of the preceding character or group.
{n,m}	A quantifier. Matches when there are at least n occurrences and a maximum of m occurrences of the preceding character or group.
(substring)	Captures the contained substring.
(?<name>substring)	Captures the contained substring and assigns it a name.
(?:substring)	A non-capturing group.
(?=...)	A positive lookahead.
(?!...)	A negative lookahead.
(?<=...)	A positive lookbehind.
(?<!...)	A negative lookbehind.
\N where N is a number	A back reference to a numbered group.
\k<name>	A back reference to a named group (same meaning as \k'name').
\k'name'	A back reference to a named group (same meaning as \k<name>).
|	Alternation.
(?imnsx-imnsx)	An alternative technique to specify RegexOptions settings inline.
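
A few of the table entries can be checked with a short console sketch. The class name MetacharacterSketch and the helper CountMatches are illustrative and do not come from the book; the sketch assumes only Regex.Matches, Regex.IsMatch, and the RegexOptions.Multiline setting from System.Text.RegularExpressions.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MetacharacterSketch
{
    // Hypothetical helper: counts how many times a pattern matches the input.
    public static int CountMatches(string input, string pattern, RegexOptions options)
    {
        return Regex.Matches(input, pattern, options).Count;
    }

    public static void Main()
    {
        string text = "one\ntwo\nthree";

        // Without Multiline, ^ matches only at the start of the string.
        Console.WriteLine(CountMatches(text, @"^\w+", RegexOptions.None));       // 1
        // With Multiline, ^ also matches at the start of each line.
        Console.WriteLine(CountMatches(text, @"^\w+", RegexOptions.Multiline));  // 3
        // \A is not affected by Multiline.
        Console.WriteLine(CountMatches(text, @"\A\w+", RegexOptions.Multiline)); // 1

        // Three ways to write a space: octal, two-digit hex, four-digit Unicode.
        Console.WriteLine(Regex.IsMatch(" ", @"^\040$"));   // True
        Console.WriteLine(Regex.IsMatch(" ", @"^\x20$"));   // True
        Console.WriteLine(Regex.IsMatch(" ", @"^\u0020$")); // True

        // An inline option: (?i) switches on case-insensitive matching.
        Console.WriteLine(Regex.IsMatch("HELLO", @"(?i)hello")); // True
    }
}
```

The inline form (?i) in the last line has the same effect here as passing RegexOptions.IgnoreCase as an argument.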

Using Named Groups

One of the features supported in the .NET Framework but not supported in many other regular expression implementations is the notion of named groups.

The syntax is (?<nameOfGroup>pattern). Naming a group can make code easier to understand and maintain than using numbered groups. For example, examine the following replacement string:

${lastName}, ${firstName}

The purpose of this pattern in a replacement string is more easily understood than the purpose of the same replacement operation expressed as numbered, rather than named, groups:

${2}, ${1}
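
As a minimal sketch of that comparison, the two replacement strings can be run side by side. It assumes the pattern used later in this section, in which firstName is the first group and lastName the second, so the numbered equivalent of ${lastName}, ${firstName} is ${2}, ${1}. The class and method names are illustrative only.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ReplacementComparison
{
    // firstName is group 1 and lastName is group 2, counted left to right.
    private const string Pattern = @"^(?<firstName>\w+)\s+(?<lastName>\w+)$";

    // Replacement written with named groups.
    public static string SwapByName(string input)
    {
        return Regex.Replace(input, Pattern, "${lastName}, ${firstName}");
    }

    // The same replacement written with numbered groups.
    public static string SwapByNumber(string input)
    {
        return Regex.Replace(input, Pattern, "${2}, ${1}");
    }

    public static void Main()
    {
        Console.WriteLine(SwapByName("John Smith"));   // Smith, John
        Console.WriteLine(SwapByNumber("John Smith")); // Smith, John
    }
}
```

Both calls give the same result, but only the named form tells a reader which part of the input ends up where.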

The following example reverses first name and last name using named groups.

Try It Out

Using Named Groups

1. Create a new project in Visual Studio 2003 using the Console Application template, and name the project NamedGroupsDemo.

2. In the code editor, add the following line after any default using statements:

using System.Text.RegularExpressions;

3. Enter the following code between the curly braces of the Main() method:

Console.WriteLine(@"This will find a match for the regular expression '^(?<firstName>\w+)\s+(?<lastName>\w+)$'.");
Console.WriteLine("Enter a test string consisting of a first name then a last name.");
string inputString;
inputString = Console.ReadLine();
string outputString = Regex.Replace(inputString, @"^(?<firstName>\w+)\s+(?<lastName>\w+)$", "${lastName}, ${firstName}");
Console.WriteLine("You entered the string: '" + inputString + "'.");
Console.WriteLine("The replaced string is '" + outputString + "'.");
Console.ReadLine();

4. Save the code, and press F5 to run it.

5. At the command line, enter the test string John Smith, and inspect the displayed result, as shown in Figure 22-15.

Figure 22-15

How It Works

The content of the Main() method is explained here.

First, the pattern to be matched against is displayed, and the user is invited to enter a first name and last name. The pattern to be matched contains two named groups, represented respectively by (?<firstName>\w+) and (?<lastName>\w+):

Console.WriteLine(@"This will find a match for the regular expression '^(?<firstName>\w+)\s+(?<lastName>\w+)$'.");
Console.WriteLine("Enter a test string consisting of a first name then a last name.");

The inputString variable is declared; then the Console.ReadLine() method is used to capture the string entered by the user. That string value is assigned to the inputString variable:

string inputString;

inputString = Console.ReadLine();

The Regex class’s Replace() method is used statically, with three arguments. The first argument specifies the string in which replacement is to take place — in this case, the string specified by the inputString variable. The pattern to be used to match is specified by the second argument — in this case, the pattern ^(?<firstName>\w+)\s+(?<lastName>\w+)$. The third argument, which is formally a string value, uses the notation ${namedGroup} to represent each named group.

The ${firstName} group, not surprisingly, contains the sequence of word characters entered first, and the ${lastName} group contains the sequence of word characters entered second:

string outputString = Regex.Replace(inputString, @"^(?<firstName>\w+)\s+(?<lastName>\w+)$", "${lastName}, ${firstName}");

The user is shown the string that was entered and the string produced when the Replace() method was applied:

Console.WriteLine("You entered the string: '" + inputString + "'.");
Console.WriteLine("The replaced string is '" + outputString + "'.");

Console.ReadLine();
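
Named groups can also be read directly from a Match object rather than through a replacement string. The following is a sketch under that assumption, using the same pattern as the example above; the class NamedGroupAccess and the method Describe are hypothetical names. It also shows what happens when the input does not fit the pattern, for example when three names are entered: Regex.Replace() then returns the input unchanged.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NamedGroupAccess
{
    private const string Pattern = @"^(?<firstName>\w+)\s+(?<lastName>\w+)$";

    // Returns "last, first" when the input matches, otherwise "no match".
    public static string Describe(string input)
    {
        Match m = Regex.Match(input, Pattern);
        if (!m.Success)
        {
            return "no match";
        }
        // Groups can be indexed by the name given in (?<name>...).
        return m.Groups["lastName"].Value + ", " + m.Groups["firstName"].Value;
    }

    public static void Main()
    {
        Console.WriteLine(Describe("John Smith"));   // Smith, John
        Console.WriteLine(Describe("John Q Smith")); // no match

        // With no match, Replace() leaves the input as it was.
        Console.WriteLine(Regex.Replace("John Q Smith", Pattern, "${lastName}, ${firstName}")); // John Q Smith
    }
}
```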

Using Back References

Back references are supported in C# .NET. A typical use for back references is finding doubled words and removing them. The following example shows this.

Try It Out

Using Back References

1. Create a new project in Visual Studio 2003 using the Console Application template, and name the project BackReferenceDemo.

2. Add a using System.Text.RegularExpressions; statement.

3. In the code editor, add the following code between the paired braces of the Main() method:

Console.WriteLine("This example will find a doubled word.");
Console.WriteLine("Using a backreference and the Replace() method the doubled word will be removed.");
Console.WriteLine("Enter a test string containing a doubled word.");
string inputString;
inputString = Console.ReadLine();
string outputString = Regex.Replace(inputString, @"(\w+)\s+(\1)", "${1}");
Console.WriteLine("You entered the string: '" + inputString + "'.");
Console.WriteLine("The replaced string is '" + outputString + "'.");
Console.ReadLine();

4. Save the code, and press F5 to run it.

5. Enter the test string Paris in the the Spring (note the doubled the in the test string), press Return, and inspect the displayed information, as shown in Figure 22-16.

Figure 22-16

6. Press Return to close the application. In Visual Studio, press F5 to run the code again.

7. Enter the test string Hello Hello, press Return, and inspect the displayed information. Again, the doubled word is identified and replaced with a single occurrence of the same word.
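
The pattern used in the steps above, (\w+)\s+(\1), has no word boundaries, so it can also match a repeated fragment inside longer words. The sketch below compares it with a stricter variant that adds \b at both ends. The variant is a suggested refinement, not the book's code, and the class and method names are illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DoubledWords
{
    // The pattern from the steps above, with no word boundaries.
    public static string RemoveNaive(string input)
    {
        return Regex.Replace(input, @"(\w+)\s+(\1)", "${1}");
    }

    // Assumed refinement: \b requires the repeated text to be a whole word.
    public static string RemoveStrict(string input)
    {
        return Regex.Replace(input, @"\b(\w+)\s+\1\b", "${1}");
    }

    public static void Main()
    {
        // Both versions handle a genuinely doubled word.
        Console.WriteLine(RemoveNaive("Paris in the the Spring"));  // Paris in the Spring
        Console.WriteLine(RemoveStrict("Paris in the the Spring")); // Paris in the Spring

        // "this is" contains "is is" across the word break.
        Console.WriteLine(RemoveNaive("this is fine"));  // this fine
        Console.WriteLine(RemoveStrict("this is fine")); // this is fine
    }
}
```

Neither version treats Hello hello as a doubled word; passing RegexOptions.IgnoreCase would be one way to do so.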

How It Works

The Main() method code begins by displaying information to the user about the use of back references and invites the user to enter a string containing a doubled word:

Console.WriteLine("This example will find a doubled word.");
Console.WriteLine("Using a backreference and the Replace() method the doubled word will be removed.");
Console.WriteLine("Enter a test string containing a doubled word.");

The inputString variable is declared, and the string that the user entered is assigned to it:

string inputString;

inputString = Console.ReadLine();

The Regex class’s Replace() method is used statically and is applied to the inputString variable, and the result is assigned to the outputString variable.
