Добавил:
Upload Опубликованный материал нарушает ваши авторские права? Сообщите нам.
Вуз: Предмет: Файл:
Kenneth A. Kousen - Making Java Groovy - 2014.pdf
Скачиваний:
50
Добавлен:
19.03.2016
Размер:
15.36 Mб
Скачать

 

Groovy Baseball

35

longitude double,

 

 

 

 

primary key(id)

 

 

 

 

);

 

 

 

 

'''

 

 

 

 

Geocoder geo = new Geocoder()

 

 

Insert results

 

 

 

 

stadiums.each { s ->

 

 

into DB

 

 

 

 

 

geo.fillInLatLng s

 

 

 

 

db.execute """

 

 

 

 

insert into stadium(name,city,state,team,latitude,longitude)

 

values(${s.name},${s.city},${s.state},${s.team},${s.latitude},${s.longit ude});

"""

}

assert db.rows('select * from stadium').size() == stadiums.size() Check db.eachRow('select latitude,longitude from stadium') { row -> results

assert row.latitude > 25 && row.latitude < 48 assert row.longitude > -123 && row.longitude < -71

}

This script collects all the latitude and longitude values for each MLB stadium, creates a database table to hold them, and populates the table. It only has to be run once, and the application can then use the table. In the process of reviewing the code I used a Stadium POGO, a list, a couple of collect methods with closures, an example that used the URLEncoder class from Java in a Groovy script, and database manipulation through the groovy.sql.Sql class.

The next step is to collect box score data from a site maintained by Major League Baseball, and generate XML information that can be sent to a view page.

2.3.2Parsing XML

Major League Baseball continuously updates the results of baseball games online. The information is kept in XML format in links descending from http://gd2.mlb.com/ components/game/mlb/.

On the site the games are arranged by date. Drilling down from the base URL requires links of the form "year_${year}/month_${month}/day_${day}/", where the year is four digits, and the month and day are two digits each. The games for that date are listed as individual links. For example, figure 2.8 shows links for each game played on May 5, 2007.8

The link for each individual game has the form

gid_${year}_${month}_${day}_${away}mlb_${home}mlb_${num}

The year, month, and day values are as expected. The values for away and home are three-letter lowercase abbreviations for each team, and the value of num represents the game number that day (1 for the first game, 2 for the second game of a double

8 By an astonishing coincidence, May 5 is my son’s birthday.

www.it-ebooks.info

36

CHAPTER 2 Groovy by example

Figure 2.8 Links to baseball games played on May 5, 2007

header). The links for each game contain a series of files, but the one I’m interested in is called boxscore.xml.

To retrieve the box score information I’ll create a class called GetGameData. This class will have attributes for the base URL and the team abbreviations, as shown. The next listing shows a portion of this class.

Listing 2.5 A portion of GetGameData, showing the attributes and initialization

class GetGameData {

def day

 

Used as strings to

 

 

 

 

 

def month

 

 

 

 

download box scores

 

 

def year

 

 

 

String base = 'http://gd2.mlb.com/components/game/mlb/'

 

Map of team

 

Map stadiumMap = [:]

 

 

 

abbreviations to

 

 

 

 

Stadium instances

Map abbrevs = [

 

 

ana:'Los Angeles (A)', ari:'Arizona',

atl:'Atlanta',

bal:'Baltimore',

bos:'Boston',

cha:'Chicago (A)',

chn:'Chicago (N)',

cin:'Cincinnati',

cle:'Cleveland',

col:'Colorado',

det:'Detroit',

flo:'Florida',

hou:'Houston',

kca:'Kansas City', lan:'Los Angeles (N)',

www.it-ebooks.info

 

Groovy Baseball

37

mil:'Milwaukee',

min:'Minnesota',

nya:'New York (A)',

nyn:'New York (N)',

oak:'Oakland',

phi:'Philadelphia',

pit:'Pittsburgh',

sdn:'San Diego',

sea:'Seattle',

sfn:'San Francisco',

sln:'St. Louis',

tba:'Tampa Bay',

tex:'Texas',

tor:'Toronto',

was:'Washington']

GetGameData() {

fillInStadiumMap()

}

 

 

Read stadium data

 

def fillInStadiumMap() {

 

 

from database

 

 

 

Sql db = Sql.newInstance(

 

 

'jdbc:h2:build/baseball',

 

 

'org.h2.Driver'

 

 

)

 

 

 

db.eachRow("select * from stadium") { row -> Stadium stadium = new Stadium(

name:row.name,

team:row.team,

latitude:row.latitude,

longitude:row.longitude

)

stadiumMap[stadium.team] = stadium

}

db.close()

}

The key-value pairs in the abbrevs map hold the three-letter abbreviations for each team and the city name, respectively.

The next step is to process the actual box scores. Here’s some sample data, taken at random. The random date I’ve chosen is October 28, 2007.9 The next listing shows the box score in XML form, truncated to show the typical elements without showing them all.

Listing 2.6 boxscore.xml: the box score from game 4 of the 2007 World Series

<boxscore game_id="2007/10/28/bosmlb-colmlb-1" game_pk="224026" home_sport_code="mlb" away_team_code="bos" home_team_code="col" away_id="111" home_id="115" away_fname="Boston Red Sox" home_fname="Colorado Rockies"

away_sname="Boston" home_sname="Colorado" date="October 28, 2007" away_wins="5" away_loss="0" home_wins="0" home_loss="5" status_ind="F"> <linescore away_team_runs="4" home_team_runs="3"

away_team_hits="9" home_team_hits="7" away_team_errors="0" home_team_errors="0">

<inning_line_score away="1" home="0" inning="1" /> <inning_line_score away="0" home="0" inning="2" />

...

<inning_line_score away="0" home="0" inning="9" /> </linescore>

9 Just happens to be the day the Red Sox won the World Series in 2007.

www.it-ebooks.info

38 CHAPTER 2 Groovy by example

<pitching team_flag="away" out="27" h="7" r="3" er="3" bb="3" so="7" hr="2" bf="37" era="2.60">

<pitcher id="452657" name="Lester" pos="P" out="17" bf="23"

er="0" r="0" h="3" so="3" hr="0" bb="3" w="2" l="0" era="0.00" note="(W, 2-0)" />

<pitcher id="434668" name="Delcarmen" pos="P" out="2" bf="4" er="1" r="1" h="2" so="1" hr="1" bb="0" w="0" l="0" era="9.00" note="(H, 2)" />

...

<pitcher id="449097" name="Papelbon" pos="P" out="5" bf="5" er="0" r="0" h="0" so="1" hr="0" bb="0" w="0" l="0" era="0.00" note="(S, 4)" />

</pitching>

<batting team_flag="home" ab="34" r="3" h="7" d="2" t="0" hr="2" rbi="3" bb="3" po="27" da="18" so="7" avg=".216" lob="12"> <batter id="430565" name="Matsui" pos="2B" bo="100" ab="4" po="3"

r="0" bb="0" a="5" t="0" sf="0" h="1" e="0" d="1" hbp="0" so="1" hr="0" rbi="0" lob="2" fldg="1.000" avg=".286" /> <batter id="466918" name="Corpas" pos="P" bo="101" ab="0" po="0" r="0" bb="0" a="1" t="0" sf="0" h="0" e="0" d="0" hbp="0"

so="0" hr="0" rbi="0" lob="0" fldg="1.000" avg=".000" />

...

</batting>

<pitching team_flag="home" out="27" h="9" r="4" er="4" bb="1" so="4" hr="2" bf="34" era="6.91">

<pitcher id="346871" name="Cook" pos="P" out="18" bf="23" er="3" r="3" h="6" so="2" hr="1" bb="0" w="0" l="2" era="4.50" note="(L, 0-2)" />

...

</pitching>

<batting team_flag="away" ab="33" r="4" h="9" d="2" t="0" hr="2" rbi="4" bb="1" po="27" da="8" so="4" avg=".322" lob="10"> <batter id="453056" name="Ellsbury" pos="CF-LF" bo="100" ab="4"

po="3" r="1" bb="0" a="0" t="0" sf="0" h="2" e="0" d="1" so="1" hr="0" rbi="0" lob="2" fldg="1.000" avg=".450" />

<batter id="456030" name="Pedroia" pos="2B" bo="200" ab="4" po="1" r="0" bb="0" a="4" t="0" sf="0" h="0" e="0" d="0" hbp="0" so="0" hr="0" rbi="0" lob="2" fldg="1.000" avg=".227" />

...

</batting>

...

</boxscore>

The root element is <boxscore>, which has several attributes. It has a child element called <linescore>, which shows the scoring in each inning. Then there are <pitching> and <batting> elements for the home team and away team, respectively.

This isn’t a terribly complex XML file, but if you had to process it using Java the code would quickly get involved. Using Groovy, as shown previously, all you have to do is walk the tree.

Parsing this data uses the same approach as parsing the geocoded data in the last section. Here I need to assemble the URL based on the month, day, and year and then parse the box score file:

www.it-ebooks.info

Groovy Baseball

39

def url = base + "year_${year}/month_${month}/day_${day}/"

def game = "gid_${year}_${month}_${day}_${away}mlb_${home}mlb_${num}/ boxscore.xml"

def boxscore = new XmlSlurper().parse("$url$game")

After parsing the file I can walk the tree to extract the team names and scores:

def awayName = boxscore.@away_fname

def awayScore = boxscore.linescore[0].@away_team_runs def homeName = boxscore.@home_fname

def homeScore = boxscore.linescore[0].@home_team_runs

The dots represent child elements, as before, and this time the @ symbols imply attributes.

PARSING XML Dots traverse from parent elements to children, and @ signs represent attribute values.

XML, regular expressions, and the Groovy Truth

To do some slightly more interesting processing, consider determining the winning and losing pitchers. The XML contains that information in a note attribute of the pitcher element. I can process that using a regular expression, assuming it exists at all:

def pitchers = boxscore.pitching.pitcher pitchers.each { p ->

if (p.@note && p.@note =~ /W|L|S/) { println " ${p.@name} ${p.@note}"

}

}

First I select all the pitcher elements for both teams. Then I want to examine the pitcher elements to find out who won and lost and if anyone was assigned a save. In the XML this information is kept in a note annotation in the pitcher element, which may or may not exist.

In the if statement, therefore, I check to see if a note attribute is present. Here I’m using the “Groovy Truth,” which means that non-null references evaluate to true. So do non-empty strings or collections, non-zero numbers, and, of course, the boolean literal true. If the note element is present, I then use the so-called “slashy” syntax to check to see if the note matches a regular expression: p.@note =~ /W|L|S/. If there’s a match I print out the values.

GENERATING GAME RESULTS

Before I show the complete method I need one more section. For the Groovy Baseball application I’m not interested in console output. Rather, I want to assemble the game results into a format that can be processed in the view layer by JavaScript. That means I need to return an object that can be converted into XML (or JSON).

Here’s a class called GameResult for that purpose:

www.it-ebooks.info

Use home stadium
Create response object
Extract data
Parsing
the XML data

40

CHAPTER 2 Groovy by example

class GameResult { String home String away String hScore String aScore Stadium stadium

String toString() { "$home $hScore, $away $aScore" }

}

CLOSURE RETURN VALUES The last expression in a closure is returned automatically.

This POGO is a simple wrapper for the home and away teams and the home and away scores, as well as for the stadium. The stadium is needed because it contains the latitude and longitude values I need for the Google Map. The following listing now shows the complete getGame method in the GetGameData class shown in listing 2.5.

Listing 2.7 The getGame method in GetGameData.groovy

def getGame(away, home, num) {

println "${abbrevs[away]} at ${abbrevs[home]} on ${month}/${day}/ ${year}"

def url = base + "year_${year}/month_${month}/day_${day}/"

def game = "gid_${year}_${month}_${day}_${away}mlb_${home}mlb_${num}/ boxscore.xml"

def boxscore = new XmlParser().parse("$url$game") def awayName = boxscore.@away_fname

def awayScore = boxscore.linescore[0].@away_team_runs def homeName = boxscore.@home_fname

def homeScore = boxscore.linescore[0].@home_team_runs

println "$awayName $awayScore, $homeName $homeScore (game $num)" GameResult result = new GameResult(home:homeName,

away:awayName,

hScore:homeScore,

aScore:awayScore, stadium:stadiumMap[home]

)

return result

}

The method uses an XmlSlurper to convert the XML box score into a DOM tree, extracts the needed information, and creates and returns an instance of the GameResult class.

There’s one other method in the GetGameData class, which is the one used to parse the web page listing the games for that day. This is necessary because due to rain-outs and other postponements there’s no way to know ahead of time which games will actually be played on a given day.

Parsing HTML is always a dicey proposition, especially because it may not be wellformed. There are third-partly libraries to do it,10 but the mechanism shown here

10 See, for example, the NekoHTML parser at http://nekohtml.sourceforge.net/.

www.it-ebooks.info

Using the Matcher class from Java
Extracted from the Matcher

Groovy Baseball

41

works. It also demonstrates regular-expression mapping in Groovy. The getGames method from GetGameData is shown in the next listing.

Listing 2.8 The getGames method from GetGameData

def getGames() {

def gameResults = []

println "Games for ${month}/${day}/${year}"

String url = base + "year_$year/month_$month/day_$day/" String gamePage = url.toURL().text

def pattern = /\"gid_${year}_${month}_${day}_(\w*)mlb_(\w*)mlb_(\d)/

Matcher m = gamePage =~ pattern if (m) {

m.count.times { line -> String away = m[line][1] String home = m[line][2] String num = m[line][3] try {

GameResult gr = this.getGame(away,home,num) gameResults << gr

} catch (Exception e) {

println "${abbrevs[away]} at ${abbrevs[home]} not started

yet"

}

}

}

return gameResults

}

The =~ method in Groovy returns an instance of java.util.regex.Matcher. The parentheses in the regular expression are groups, which let me extract the away team abbreviation, the home team abbreviation, and the game number from the URL. I use those to call the getGames method from listing 2.7 and put the results into a collection of GameResult instances.

TESTING

All that’s left is to test the complete GetGameData class. A JUnit test to do so is shown in the next listing.

Listing 2.9 GetGameDataTests.groovy: a JUnit 4 test case

class GetGameDataTest {

GetGameData ggd = new GetGameData(month:10,day:28,year:2007)

@Test

void testFillInStadiumMap() {

assert 0 == ggd.stadiumMap.size() ggd.fillInStadiumMap()

Before and after

def stadiums = ggd.stadiumMap.values()

populating

assert 30 == ggd.stadiumMap.size() stadiums.each { Stadium stadium ->

stadium map

assert stadium.latitude > 25 && stadium.latitude < 48

www.it-ebooks.info

Соседние файлы в предмете [НЕСОРТИРОВАННОЕ]