
- •Table of Contents
- •Foreword
- •Chapter 1. A Quick Walk Through
- •Workfile: The Basic EViews Document
- •Viewing an individual series
- •Looking at different samples
- •Generating a new series
- •Looking at a pair of series together
- •Estimating your first regression in EViews
- •Saving your work
- •Forecasting
- •What’s Ahead
- •Chapter 2. EViews—Meet Data
- •The Structure of Data and the Structure of a Workfile
- •Creating a New Workfile
- •Deconstructing the Workfile
- •Time to Type
- •Identity Noncrisis
- •Dated Series
- •The Import Business
- •Adding Data To An Existing Workfile—Or, Being Rectangular Doesn’t Mean Being Inflexible
- •Among the Missing
- •Quick Review
- •Appendix: Having A Good Time With Your Date
- •Chapter 3. Getting the Most from Least Squares
- •A First Regression
- •The Really Important Regression Results
- •The Pretty Important (But Not So Important As the Last Section’s) Regression Results
- •A Multiple Regression Is Simple Too
- •Hypothesis Testing
- •Representing
- •What’s Left After You’ve Gotten the Most Out of Least Squares
- •Quick Review
- •Chapter 4. Data—The Transformational Experience
- •Your Basic Elementary Algebra
- •Simple Sample Says
- •Data Types Plain and Fancy
- •Numbers and Letters
- •Can We Have A Date?
- •What Are Your Values?
- •Relative Exotica
- •Quick Review
- •Chapter 5. Picture This!
- •A Simple Soup-To-Nuts Graphing Example
- •A Graphic Description of the Creative Process
- •Picture One Series
- •Group Graphics
- •Let’s Look At This From Another Angle
- •To Summarize
- •Categorical Graphs
- •Togetherness of the Second Sort
- •Quick Review and Look Ahead
- •Chapter 6. Intimacy With Graphic Objects
- •To Freeze Or Not To Freeze Redux
- •A Touch of Text
- •Shady Areas and No-Worry Lines
- •Templates for Success
- •Point Me The Way
- •Your Data Another Sorta Way
- •Give A Graph A Fair Break
- •Options, Options, Options
- •Quick Review?
- •Chapter 7. Look At Your Data
- •Sorting Things Out
- •Describing Series—Just The Facts Please
- •Describing Series—Picturing the Distribution
- •Tests On Series
- •Describing Groups—Just the Facts—Putting It Together
- •Chapter 8. Forecasting
- •Just Push the Forecast Button
- •Theory of Forecasting
- •Dynamic Versus Static Forecasting
- •Sample Forecast Samples
- •Facing the Unknown
- •Forecast Evaluation
- •Forecasting Beneath the Surface
- •Quick Review—Forecasting
- •Chapter 9. Page After Page After Page
- •Pages Are Easy To Reach
- •Creating New Pages
- •Renaming, Deleting, and Saving Pages
- •Multi-Page Workfiles—The Most Basic Motivation
- •Multiple Frequencies—Multiple Pages
- •Links—The Live Connection
- •Unlinking
- •Have A Match?
- •Matching When The Identifiers Are Really Different
- •Contracted Data
- •Expanded Data
- •Having Contractions
- •Two Hints and A GotchYa
- •Quick Review
- •Chapter 10. Prelude to Panel and Pool
- •Pooled or Paneled Population
- •Nuances
- •So What Are the Benefits of Using Pools and Panels?
- •Quick (P)review
- •Chapter 11. Panel—What’s My Line?
- •What’s So Nifty About Panel Data?
- •Setting Up Panel Data
- •Panel Estimation
- •Pretty Panel Pictures
- •More Panel Estimation Techniques
- •One Dimensional Two-Dimensional Panels
- •Fixed Effects With and Without the Social Contrivance of Panel Structure
- •Quick Review—Panel
- •Chapter 12. Everyone Into the Pool
- •Getting Your Feet Wet
- •Playing in the Pool—Data
- •Getting Out of the Pool
- •More Pool Estimation
- •Getting Data In and Out of the Pool
- •Quick Review—Pools
- •Chapter 13. Serial Correlation—Friend or Foe?
- •Visual Checks
- •Testing for Serial Correlation
- •More General Patterns of Serial Correlation
- •Correcting for Serial Correlation
- •Forecasting
- •ARMA and ARIMA Models
- •Quick Review
- •Chapter 14. A Taste of Advanced Estimation
- •Weighted Least Squares
- •Heteroskedasticity
- •Nonlinear Least Squares
- •Generalized Method of Moments
- •Limited Dependent Variables
- •ARCH, etc.
- •Maximum Likelihood—Rolling Your Own
- •System Estimation
- •Vector Autoregressions—VAR
- •Quick Review?
- •Chapter 15. Super Models
- •Your First Homework—Bam, Taken Up A Notch!
- •Looking At Model Solutions
- •More Model Information
- •Your Second Homework
- •Simulating VARs
- •Rich Super Models
- •Quick Review
- •Chapter 16. Get With the Program
- •I Want To Do It Over and Over Again
- •You Want To Have An Argument
- •Program Variables
- •Loopy
- •Other Program Controls
- •A Rolling Example
- •Quick Review
- •Appendix: Sample Programs
- •Chapter 17. Odds and Ends
- •How Much Data Can EViews Handle?
- •How Long Does It Take To Compute An Estimate?
- •Freeze!
- •A Comment On Tables
- •Saving Tables and Almost Tables
- •Saving Graphs and Almost Graphs
- •Unsubtle Redirection
- •Objects and Commands
- •Workfile Backups
- •Updates—A Small Thing
- •Updates—A Big Thing
- •Ready To Take A Break?
- •Help!
- •Odd Ending
- •Chapter 18. Optional Ending
- •Required Options
- •Option-al Recommendations
- •More Detailed Options
- •Window Behavior
- •Font Options
- •Frequency Conversion
- •Alpha Truncation
- •Spreadsheet Defaults
- •Workfile Storage Defaults
- •Estimation Defaults
- •File Locations
- •Graphics Defaults
- •Quick Review
- •Index
- •Symbols

214—Chapter 7. Look At Your Data
Describing Groups—Just the Facts—Putting It Together
Many of the descriptive features for groups are the same as those for series—except done for each of the series in the group. For example, the descriptive statistics menu for a group offers two
choices: Common Sample, Individual Samples. The choices produce basic descriptive statistics for each series, with two different arrangements for choosing the sample.
If you want the same set of observations to go into computing the statistics for each series in the group, be sure to specify Common Sample. Using Individual Samples, we can see that there were quite a few applicants with valid LSAT scores but not valid GPAs (1707 versus 1638 observations). As a guess, this could reflect applications from undergraduate schools that don’t compute grade point averages.
The Tests of Equality… view for groups checks
whether the mean (or median, or variance) is the same for all the series in the group. (The tests assume that the series are statistically independent.) Here too, you can be sure the same observations are used from each series by picking
Common sample.
Hint: Use Tests For Descriptive Statistics/Simple Hypothesis Tests for each individual series to test for a specific value, say that the mean equals 3.14159. Use Tests For Descriptive Statistics/Equality Tests By Classification… to test that different subpopulations have the same mean for a series. Use Tests of Equality… for groups to see if the mean is equal for different series.

Describing Groups—Just the Facts—Putting It Together—215
Correlations
It’s easy to find the covariances or correlations of all the series in a group by choosing the Covariance Analysis... view. By default, EViews will display the covariance matrix for the common samples in the group, but you may instead compute correlations, use individual samples, or compute various other measures of the associations and related test statistics. For example, to compute using individual samples, you
should uncheck the Balanced sample (listwise deletion) box. This setting instructs EViews to compute the covariance or correlation for the first two series using the observations available for both series, then the correlation between series one and series three using the observations available for those two series (in other words, not worrying whether observations are missing for series two), etc.
Here, we see the correlation for the two series in the group. Surprisingly, LSAT and GPA are not all that highly correlated. A correlation of only 0.307 means that the information in LSAT is not redundant with the information in GPA, which is why law schools look at both test scores and grades.
People use correlations a lot more because correla-
tions are unit-free, while the units of covariances depend on the units of the underlying series.

216—Chapter 7. Look At Your Data
Hint: Should one use a common sample or not? EViews requires a choice in several of the procedures we’re looking at in this chapter.
The most common practice is to find a common sample to use for the entire analysis. That way you know that different answers from different parts of the analysis reflect real differences rather than different inputs. On the other hand, restricting the analysis to a common sample can mean ignoring a lot of data. So there’s no absolute right or wrong answer. It’s a judgment call.
Cross-Tabs
Cross-tabulation is a traditional method of looking at the relationship between categorical variables. In the simplest version, we build a table in which rows represent one variable and columns another, and then we count how many observations fall into each box. For fun, we’ve created categorical variables describing grades (4.0 or better, above average but not 4.0, below average) and test scores (above average, below average) with the following commands:
series gradecat = (gpa>=4)*1 + (gpa>3.365 and gpa<4)*2 + (gpa<=3.365)*3
series testcat = (lsat>158)*1 + (lsat<=158)*2
Since GRADECAT is arbitrarily coded as 1, 2, or 3 and TESTCAT is similarly, arbitrarily coded as 1 or 2, we added a value map to each series to make the tables easier to read. See What Are Your Values? in Chapter 4, “Data—The Transformational Experience.”

Describing Groups—Just the Facts—Putting It Together—217
Choosing the N-Way Tabulation… view and accepting the defaults in the Crosstabulation dialog produces the output shown to the right. Let’s begin with the table appearing at the bottom.
Table Facts
The first column of the table gives counts for high test-scoring applicants: 5 had top grades, 552 had high—but not “top”—grades, and 306 had low grades—for a total of 863 applicants with “high” (i.e., above average) test scores. Reading across the first row, 5 of the students with top grades had high LSATs, 5 were below average—so there was a total of 10 students with top grades.
Hint: The intersection of a row and column is called a cell. For example, there are 350 applicants in the High grade/Low test score cell.
The bottom row reports the totals for each column and the right-most column reports the totals for each row. The Total-Total, bottom right, is the number of observations used in the table.
Table Interpretation
In this applicant pool, there were 5 students with perfect grades and below average LSATs. That’s a true fact—but so what? We might be interested in getting counts, but usually what we’re trying to do is find out if one variable is related to another. For the data in hand, the obvious question is “Do high test scores and high grades go together?” To begin to answer this question, we return to the N-Way Tabulation… menu, this time checking
Table %, Row %, and Column % in the Crosstabulation dialog.

218—Chapter 7. Look At Your Data
Take a look at the Top grade/Low test score cell again. The first number in the cell tells us, as before, that five applicants had this combination of grade and LSAT. But now we have three additional numbers. The first new number, marked , is the Table %, 0.31% (=5/1638), which tells us the fraction of observations falling in this cell out of all the observations in the table. The Row %,
, tells us what fraction of top grades (the row) also have low test scores (5/10.) Analogously, the last element in the cell,
, is the Column % (5/775).
In the same way, table, row, and column percentage are given in the Total column at the right and Total row at the bottom. Looking at the right, we see that perfect grades came from 0.61% of all applicants; that 100% of applicants with perfect grades had perfect grades (telling us that everything in the row is in the row, which isn’t very surprising); and that 0.61% of this column had perfect grades (which we already knew from the table % in this cell.)
Hint: The row and column percentages given in the Total column and row are sometimes called marginals because they give the univariate empirical distribution for grade and test score respectively. So the row and column percentages correspond to the marginal probability distributions of the joint probability distribution described by the table as a whole.
Suppose that doing well on the LSAT and having good grades were independent. We’d expect that the percentage of students having both top grades and low test scores would be
roughly the overall percentage having top grades times the overall percentage having low test scores. For our data we’d expect 0.61% × 47.31% = 0.2886% to fall into this cell. In

Describing Groups—Just the Facts—Putting It Together—219
fact 0.31% do fall into this cell—the difference might easily be due to random variation. We could do the same calculation for all the cells and use this as a basis for a formal test of the hypothesis that grades and test scores are independent. That’s exactly what’s done in the lines marked “Test Statistics” which appear above the table. Formally, if the two series were independent, then the reported test statistics would be approximately x2 with the indicated degrees of freedom. The column marked “Prob” gives p-values. So despite what we found in the Top grade/Low test cell, the typical cell percentage is sufficiently different from the product of the marginal percentages that the hypothesis of independence is strongly rejected.
Statistics hint: It’s not uncommon to find many cells containing very few observations. As a rule of thumb, when cells are expected to have fewer than five observations, the use of x2 test statistics gets a little dicey. In such cases EViews prints a warning message, as it’s done here.
N-Way Tabulation with N>2
With two series in a group, the cells are laid out in a two-dimensional rectangle, with the categories for the first series going down and categories for the second series going across. With three series, EViews displays a three-dimensional hyper-rectangle. With four series, the display is a four-dimensional hyper-rectangle, etc.
Fortunately, EViews is very clever at detecting older equipment. If you are still using a display limited to two dimensions, rather than one of the newer Romulan units, EViews splits the hyper-rectangle into a series of two-dimensional slices.
As an example, open a group with GRADECAT, TESTCAT, and WASH to add the effect of in-state residency into the mix. Then choose N-Way Tabulation… (Reporting of test statistics is turned off simply because the output became very long.)

220—Chapter 7. Look At Your Data
The first table shows counts for grades and test scores for non-resi- dents. For example, there were four out-of-state residents with top grades and low test scores. The second table gives the same information for Washington residents. Together, these tables describe the joint distribution of grades and test scores conditional on residency.
The third table gives the joint distribution of grades and test scores, unconditionally with respect to residency. In other words, it’s the same two-way table we saw before.
You can see that with lots of categories, an N-Way tabulation can be really, really long. If our last series had been 50 states instead of just yes/no for Washington, we’d have gotten 50 conditional tables and one unconditional table. You can imagine how much output there would be with a fourth or fifth variable in the cross-tabulation.
Does the order in which the series
appear in the group matter? The answer is “no and yes.” Whatever order you specify, you get all the possible conditional cell counts. Since you get the same information regardless of the order specified, there’s a sense in which the order is irrelevant. (However, which unconditional tables are shown does depend on the order.) However, the series order does affect readability. It generally makes sense to put first the variables you’re most interested in comparing. These will be the ones that show up together on each table.
Another approach to improved readability is to arrange series for the easiest screen display. The two rules are:
•The second series should have sufficiently few categories such that the categories can go across the top of the table without forcing you to scroll horizontally. (If you’re going to print, think about the width of your paper instead of the width of the screen.)

Describing Groups—Just the Facts—Putting It Together—221
•The first series should have as many categories as possible. It’s easier to look at one long table rather than many short tables. This rule is sometimes limited by the desire to get a complete table to fit vertically on a screen or printed page.
Hint: Sometimes you can get a more useful table by printing in landscape rather than portrait mode.
Hint: Sometimes, when there are too many tables to manage visually, you should stop and think about whether there are also too many tables to help you learn anything.
True Story to End the Chapter
When the author was a college student, he worked as a research assistant for a professor from whom he learned a great deal about many things. One incident was particularly memorable. We had turned in a report, including relevant cross-tabulations, to the government agency which had paid for the research project. Shortly thereafter, a somewhat snippy letter came back pointing out that our analysis had covered a dozen (or so) variables and that the contract required cross-tabulations of all the variables, not just those that the research team thought mattered. And that we’d better comply. The letter was turned over to me.
I found my mentor the next day and pointed out that the sponsoring agency was asking for about one million pages of printout. He laughed. Told me to mail them the first thousand pages without comment, and said we’d never hear back. I did, and we didn’t.

222—Chapter 7. Look At Your Data