Chapter 7 String Processing

That’s perfect. Similarly, you can also search for all the phone numbers, since every phone number in the example string is exactly 10 digits long.

Now, let’s try something a little more difficult: searching for e-mail addresses. In an e-mail address, the first part is a word (which can be specified by "\w+"), followed by a "@", followed by another word, and suffixed by ".com" (for the sake of simplicity, let’s ignore other suffixes such as ".edu"). Hence, the regex for searching for e-mail addresses in the example string is "\w+@\w+\.com". Will this regex string work? Try the code in Listing 7-11.

Listing 7-11.  Regex5.java

import java.util.regex.Matcher;
import java.util.regex.Pattern;

// This program demonstrates how we can search email addresses
public class Regex5 {
    public static void main(String[] args) {
        String str = "Danny Doo, Flat no 502, Big Apartment, Wide Road, Near Huge Milestone, Hugo-city 56010, Ph: 9876543210, Email: danny@myworld.com. Maggi Myer, Post bag no 52, Big bank post office, Big bank city 56000, ph: 9876501234, Email: maggi07@myuniverse.com.";
        Pattern pattern = Pattern.compile("\\w+@\\w+\\.com");
        Matcher matcher = pattern.matcher(str);
        while (matcher.find()) {
            System.out.println(matcher.group());
        }
    }
}

It prints the following:

danny@myworld.com

maggi07@myuniverse.com

That worked!
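To see what the simplification costs, here is a minimal sketch that runs the same regex over a different, made-up sentence. The class name EmailSearchDemo, the helper findEmails(), and the sample addresses are assumptions introduced only for illustration; they are not part of the book's listings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class: shows what the simplified e-mail regex does and does not find.
public class EmailSearchDemo {

    // Same regex as in Listing 7-11.
    private static final Pattern EMAIL = Pattern.compile("\\w+@\\w+\\.com");

    // Collects every substring of text that matches the regex.
    static List<String> findEmails(String text) {
        List<String> found = new ArrayList<>();
        Matcher matcher = EMAIL.matcher(text);
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // Made-up sample text with a ".edu" address and a dotted user name.
        String text = "Write to danny@myworld.com or prof@college.edu, or john.doe@mail.com.";
        // prints [danny@myworld.com, doe@mail.com]
        System.out.println(findEmails(text));
    }
}
```

The ".edu" address is skipped, as expected. The dotted user name is only partially matched, because "\w" does not match a dot, so the match starts at "doe". A regex meant for real-world addresses would need to be considerably more permissive.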

Replacing Strings with regex

In the previous section, you tried searching and parsing strings with regex. You can also manipulate (modify) strings with regex. Let’s try replacing strings now.

In the string in your example, all phone numbers are currently represented as a string of 10 consecutive digits. Now you want to change this phone number format to the XXX-XXXXXXX format; in other words, you want to insert a dash (-) after the third digit of the phone number. Listing 7-12 shows how to achieve this.

Listing 7-12.  Regex6.java

import java.util.regex.Matcher;
import java.util.regex.Pattern;

// This program demonstrates how we can manipulate text
public class Regex6 {
    public static void main(String[] args) {
        String str = "Danny Doo, Flat no 502, Big Apartment, Wide Road, Near Huge Milestone, Hugo-city 56010, Ph: 9876543210, Email: danny@myworld.com. Maggi Myer, Post bag no 52, Big bank post office, Big bank city 56000, ph: 9876501234, Email: maggi07@myuniverse.com.";

        Pattern pattern = Pattern.compile("(\\D)(\\d{3})(\\d{7})(\\D)");
        Matcher matcher = pattern.matcher(str);
        String newStr = matcher.replaceAll("$1$2-$3$4");
        System.out.println(newStr);
    }
}

The output of this program is the following:

Danny Doo, Flat no 502, Big Apartment, Wide Road, Near Huge Milestone, Hugo-city 56010, Ph: 987-6543210, Email: danny@myworld.com. Maggi Myer, Post bag no 52, Big bank post office, Big bank city 56000, ph: 987-6501234, Email: maggi07@myuniverse.com.

Good, it worked as expected. But how did you achieve this? One very evident observation is that you used replaceAll() of the Matcher class. However, there is one more important concept that you need to understand to grasp the above example.

You can form groups within a regex by enclosing parts of it in parentheses. These groups can be used to apply quantifiers to a desired subset of the whole regex. These groups can also be used for back-references. In a replacement string, each group can be referred to as $n, where n is an integer; so, for example, the first group can be referred to as $1, the second group can be referred to as $2, and so on.

Here, you formed four groups, and while replacing, you inserted a dash between the second and third groups. That’s how your replacement works.
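To make the group numbering concrete, the following sketch prints what each group captures for a single phone number and then applies the same replacement. The class name GroupDemo, the helper formatPhones(), and the shortened sample string are assumptions made for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class: shows what each group of the Listing 7-12 regex captures.
public class GroupDemo {

    // Same four-group pattern as in Listing 7-12.
    private static final Pattern PHONE =
            Pattern.compile("(\\D)(\\d{3})(\\d{7})(\\D)");

    // Rebuilds every match from its groups, with a dash between groups 2 and 3.
    static String formatPhones(String text) {
        return PHONE.matcher(text).replaceAll("$1$2-$3$4");
    }

    public static void main(String[] args) {
        String text = "Ph: 9876543210, Email: danny@myworld.com.";
        Matcher matcher = PHONE.matcher(text);
        if (matcher.find()) {
            // group(0) is the whole match; groups 1..4 are numbered by their opening parentheses.
            for (int i = 0; i <= matcher.groupCount(); i++) {
                System.out.println("group " + i + " = [" + matcher.group(i) + "]");
            }
        }
        // prints Ph: 987-6543210, Email: danny@myworld.com.
        System.out.println(formatPhones(text));
    }
}
```

For this input, group 1 is the space before the number, group 2 is "987", group 3 is "6543210", and group 4 is the comma after the number. Because groups 1 and 4 each require one non-digit character, a phone number at the very start or very end of the string would not be matched by this pattern.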

Now, let’s do something different. Let’s implement a method to validate an IP address. Can you suggest a regex to match an IP address?

Did you say "\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b", where "\b" is used to match word boundaries and "\d{1,3}" specifies that you expect a number of one to three digits? Well, it’s a nice try, but it’s wrong!

A valid IP address consists of 4 numbers separated by dots, where each number can be between 0 and 255 (both inclusive). That means any number greater than 255 will result in an invalid IP address. However, in the above regex you can write any three-digit number (even greater than 255) and the regex will match. Hence, it is wrong.
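A quick way to convince yourself is to run the naive regex against a few inputs. This is only a sketch; the class name NaiveIpDemo and the helper matchesNaive() are hypothetical names chosen for this illustration.

```java
import java.util.regex.Pattern;

// Hypothetical demo class: shows that the naive regex accepts out-of-range numbers.
public class NaiveIpDemo {

    // The naive regex from the text: four groups of one to three digits, separated by dots.
    static final String NAIVE = "\\b\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\b";

    // Returns true if the whole string matches the naive regex.
    static boolean matchesNaive(String ipStr) {
        return Pattern.matches(NAIVE, ipStr);
    }

    public static void main(String[] args) {
        System.out.println(matchesNaive("255.245.188.123")); // prints true
        System.out.println(matchesNaive("255.245.188.273")); // prints true, although 273 > 255
        System.out.println(matchesNaive("999.999.999.999")); // prints true, although every number is out of range
    }
}
```

The regex only counts digits; it has no way of knowing that 273 or 999 is larger than 255.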

Listing 7-13 shows the correct implementation of the regex for an IP address.

Listing 7-13.  Regex7.java

import java.util.regex.Pattern;

// This program demonstrates how we can validate an IP address
public class Regex7 {
    void validateIP(String ipStr) {
        String regex = "\\b((25[0-5]|2[0-4]\\d|[01]?\\d\\d?)(\\.)){3}(25[0-5]|2[0-4]\\d|[01]?\\d\\d?)\\b";
        System.out.println(ipStr + " is valid? " + Pattern.matches(regex, ipStr));
    }

    public static void main(String[] args) {
        String ipStr1 = "255.245.188.123"; // valid IP address
        String ipStr2 = "255.245.188.273"; // invalid IP address - 273 is greater than 255
        Regex7 validator = new Regex7();
        validator.validateIP(ipStr1);
        validator.validateIP(ipStr2);
    }
}
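As a rough check of how the regex treats boundary values, the sketch below wraps the same regex in a boolean method and tries a few inputs. The class name IpRegexCheck, the method isValid(), and the chosen test strings are assumptions added for illustration only.

```java
import java.util.regex.Pattern;

// Hypothetical demo class: exercises the regex of Listing 7-13 on boundary values.
public class IpRegexCheck {

    // Same regex as in Listing 7-13: three "number plus dot" parts, then a final number.
    // Each number is 250-255, 200-249, or at most three digits with a leading digit of 0 or 1.
    static final String REGEX =
            "\\b((25[0-5]|2[0-4]\\d|[01]?\\d\\d?)(\\.)){3}(25[0-5]|2[0-4]\\d|[01]?\\d\\d?)\\b";

    // Returns true if the whole string matches the regex.
    static boolean isValid(String ipStr) {
        return Pattern.matches(REGEX, ipStr);
    }

    public static void main(String[] args) {
        System.out.println(isValid("0.0.0.0"));         // prints true
        System.out.println(isValid("255.255.255.255")); // prints true
        System.out.println(isValid("256.1.1.1"));       // prints false
        System.out.println(isValid("1.2.3"));           // prints false
        System.out.println(isValid("01.002.3.4"));      // prints true
    }
}
```

Note the last case: the regex accepts numbers written with leading zeros. Whether that is acceptable depends on how strictly you want to define a valid address.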
