Beginning Regular Expressions 2005.pdf

Regular Expressions Using findstr

At the command line, enter the following command:

findstr /n "://" Protocols.txt

It displays every line that contains the character sequence ://, which here means every line that names an Internet protocol: in this case, lines 1 and 2.

Multiple File Example

One of the most useful aspects of the findstr utility is that from the command line, you can search across several files at once. This can save time compared to, for example, opening each file in an editor or word processor.

This example looks at how findstr can be used to find occurrences of HTTP URLs across multiple files. There are three short test files. URL1.txt contains the following:

I found interesting information at http://www.w3.org/ on the XQuery specification.

URL2.txt contains the following:

I wanted to find information about Microsoft SQL Server 2005 and the site at

http://www.microsoft.com/sql/ was very useful.

And URL3.txt contains the following:

This document shouldn’t be detected because the protocol, http, is omitted. The site that I

visited was www.w3.org.

The problem definition can be stated as follows:

Match the character sequence http followed by a colon character, followed by two forward-slash characters.
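The problem definition can also be checked outside findstr. The following is a minimal Python sketch, not part of the original example: the helper name contains_http_url is hypothetical, and the sample strings are copied from the three test files with their line breaks collapsed. Python's re module uses a different regular expression dialect from findstr, but for a purely literal pattern such as http:// the two behave the same way.

```python
import re

# The problem definition translated literally: "http", a colon, two forward slashes.
# None of these characters is a regex metacharacter, so no escaping is needed.
URL_PATTERN = re.compile(r"http://")

# Sample text copied from the three test files (line breaks collapsed).
samples = {
    "URL1.txt": "I found interesting information at http://www.w3.org/ "
                "on the XQuery specification.",
    "URL2.txt": "I wanted to find information about Microsoft SQL Server 2005 "
                "and the site at http://www.microsoft.com/sql/ was very useful.",
    "URL3.txt": "This document shouldn't be detected because the protocol, http, "
                "is omitted. The site that I visited was www.w3.org.",
}


def contains_http_url(text):
    """Return True when the text contains the literal sequence http://."""
    return URL_PATTERN.search(text) is not None


# Map each file name to whether its text satisfies the problem definition.
results = {name: contains_http_url(text) for name, text in samples.items()}

for name, matched in results.items():
    print(f"{name}: {'match' if matched else 'no match'}")
```

URL3.txt contains both http and www.w3.org, but never the full sequence http://, so it is the only sample that fails to match.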

Try It Out

Finding URLs

1. Open a command window, and navigate to the directory that contains the files URL1.txt, URL2.txt, and URL3.txt.

2. At the command line, type the following command:

findstr /n "http://" URL*.txt

3. Inspect the results, as shown in Figure 13-19. Notice a limitation of findstr in the layout of results in Figure 13-19, where results from one file run on into results from another. This happens when the test text is not tidily line based but, instead, is paragraph based. Because findstr displays text in which a match is contained, rather than specifically the matched text, this imprecision can become a problem. When you see such results running into one another, the need for the /a switch, for which an example was shown earlier, becomes clearer.

Chapter 13

Figure 13-19
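To make the multiple-file search concrete, here is a rough Python sketch of what a command such as findstr /n "http://" URL*.txt does. It is an illustration under stated assumptions rather than a description of findstr's implementation: the function name search_files is hypothetical, the files are assumed to be UTF-8 text, and the filename:line_number:text layout only approximates the prefix that findstr prints.

```python
import re
from pathlib import Path
from typing import List


def search_files(pattern: str, file_glob: str, directory: str = ".") -> List[str]:
    """Return 'filename:line_number:text' for every line that matches pattern."""
    regex = re.compile(pattern)
    hits = []
    # Expand the wildcard (for example URL*.txt) and visit files in name order.
    for path in sorted(Path(directory).glob(file_glob)):
        with path.open(encoding="utf-8") as handle:
            for number, line in enumerate(handle, start=1):
                # Like findstr, report the whole line that contains a match,
                # not only the matched text.
                if regex.search(line):
                    hits.append(f"{path.name}:{number}:{line.rstrip()}")
    return hits


if __name__ == "__main__":
    for hit in search_files("http://", "URL*.txt"):
        print(hit)
```

Because each hit carries its own file name and line number, results from different files stay distinguishable even when the matched lines are long.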

A Filelist Example

The relatively simple examples in this chapter have used filenames where they can be expressed on the command line using a wildcard, such as in URL*.txt. However, sometimes you will want to search several files for which no such wildcard exists. The /f command-line switch allows this to be done.

The file Targets.txt contains a list of the files to be searched:

URL1.txt

URL2.txt

URL3.txt

The file Data.txt contains a very simple regular expression pattern:

http://

To put these together, you need to use the /g and /f findstr command-line switches. The argument to the /g switch is the name of the file that contains the pattern (or patterns) to be searched for. The argument to the /f switch is the name of a file that lists the files to be searched. In addition, the matching lines can be redirected to a results file.

Try It Out

The /g and /f Switches

1. Open a command window, and navigate to the directory that contains Data.txt, Targets.txt, URL1.txt, URL2.txt, and URL3.txt.

2. At the command prompt, enter the following command:

findstr /g:Data.txt /f:Targets.txt > Results.txt

3. Then, at the command prompt, enter the following command:

type Results.txt

4. Inspect the results. The same lines match as in the preceding example, although without line numbers, because the /n switch was not used. This time the output has been redirected to a file rather than, as in previous examples, simply being echoed to the screen.

How It Works

The argument /g:Data.txt specifies that the regular expression pattern is contained in the file Data.txt. The argument /f:Targets.txt specifies that the file Targets.txt contains the names of files to be searched. The results are redirected to the file Results.txt, as indicated by > Results.txt in the command line.
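The same flow can be sketched in Python to show how the three pieces fit together: one file supplies the patterns, another supplies the list of files, and the output goes to a third. This is a hedged illustration only. The names read_nonblank_lines and search_from_lists are hypothetical, UTF-8 text files are assumed, and Python's re dialect stands in for the more limited dialect that findstr supports.

```python
import re
from pathlib import Path
from typing import List


def read_nonblank_lines(path: Path) -> List[str]:
    """Return the non-empty lines of a text file, stripped of whitespace."""
    text = path.read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def search_from_lists(pattern_file: str, target_file: str,
                      output_file: str, directory: str = ".") -> List[str]:
    """Emulate findstr /g:pattern_file /f:target_file > output_file."""
    base = Path(directory)
    # Role of /g: every non-blank line of the pattern file is one pattern.
    patterns = [re.compile(p) for p in read_nonblank_lines(base / pattern_file)]
    hits = []
    # Role of /f: every non-blank line of the target file names a file to search.
    for name in read_nonblank_lines(base / target_file):
        with (base / name).open(encoding="utf-8") as handle:
            for line in handle:
                if any(p.search(line) for p in patterns):
                    hits.append(f"{name}:{line.rstrip()}")
    # Role of the > redirection: write the matching lines to the output file.
    (base / output_file).write_text("".join(h + "\n" for h in hits),
                                    encoding="utf-8")
    return hits


if __name__ == "__main__":
    search_from_lists("Data.txt", "Targets.txt", "Results.txt")
```

Keeping the patterns and the file list in separate text files means either can be changed without retyping the command.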

Exercises

The following exercises are intended to help reinforce some of the material presented in this chapter:

1. What findstr command would display lines that contain part numbers whose second character is any uppercase character except L, M, or N? Assume that a part number consists of three alphabetic characters followed by three numeric digits, and that the files to be checked can be expressed as filename*.extension.

2. Give two possible findstr commands that would display lines beginning with either the or The.
