

Chapter 7. Look At Your Data
Data description precedes data analysis. Failure to carefully examine your data can lead to what experienced statisticians describe with the phrase “a boo boo.”
True story. I was involved in a project to analyze admissions data from the University of Washington law school. (An extract of the data, “UWLaw98.wf1”, can be found on the EViews website.) Some of my early results were really, really strange. After hours of frustration I did the sensible thing and went and asked my wife’s advice. She told me:
Look at your data!
So I quickly pulled up a histogram of the applicants’ grade point averages (GPA). Notice the one little data point all by its lonesome way off to the right? According to the summary table, the highest recorded GPA was 39. Since GPAs in American colleges are generally on a 4.0 scale, it’s a pretty good bet that a decimal point was omitted somewhere.
In this chapter we’ll walk through a number of techniques for looking at your data. Since the border between describing data and beginning an analysis can be fuzzy, some of the topics covered here are useful in data analysis as well. Our discussion is split into univariate (describing one variable at a time) and multivariate (describing several variables jointly). Maybe it’s easier to think of descriptive views of series and descriptive views of groups.
Hint: Two important data descriptive techniques are covered elsewhere. Graphing techniques are explored in Chapter 5, “Picture This!” And while one of the very best techniques for looking at your data is to open a spreadsheet view and then look at it, this doesn’t require any instructions—so past this reminder-sentence we won’t give any…except for one little trick in the next section.

Reminder Hint: Don’t forget that you can hover your cursor over points in a graph to display observation labels and values.
Sorting Things Out
As you know, you can open a spreadsheet view of a series or a group of series to get a visual display. For example, the spreadsheet view of GPA is shown to the right.
Observations appear in order.
Push the button to bring up the Sort Order dialog, which gives you the option of sorting by either observation number or the value of GPA. You can sort in either Ascending (low-to-high) or Descending (high-to-low) order.
By sorting according to GPA, we can instantly see where the problem value is located.

More generally, the Sort Order dialog for groups lets you sort using up to three series to order the observations.
Hint: Sorting changes the order in which the data is visually displayed. The actual order in the workfile remains unchanged, so analysis is not affected. To restore the appearance to its original order, sort using Observation Order and Ascending.
Describing Series—Just The Facts Please
Open a series and click the button. The dropdown menu shows the tools available for looking at the series. We begin with the basic descriptive statistics.
Stats Panel from Histogram and Stats
Histograms and basic statistics are generated through the Descriptive Statistics & Tests/Histogram and Stats menu item. The data used for computing descriptive statistics is, as always, restricted to the current sample. Let’s first eliminate reported grades that are almost certainly data errors.
smpl if gpa>1 and gpa<5

As you can see, Histogram and Stats produces a histogram on the left and a panel of descriptive statistics on the right. Let’s start with the latter, coming back to the picture part later.
The top of the statistics panel gives the sample in effect when the report was made and the number of observations. If you compare this report with the one at the beginning of the chapter, you might note that the smpl if command cut out two observations. Comparing the maximum and minimum between the two reports, we can deduce that one GPA of 39 and one GPA of .26 were eliminated.
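The same bookkeeping can be done from the command line. This is a minimal sketch, assuming the series is named GPA as in our workfile; the scalar names (N_ALL, N_KEPT, and so on) are made up for illustration.

```eviews
' Full sample: count the observations and find the extremes
smpl @all
scalar n_all = @obs(gpa)
scalar gpa_max = @max(gpa)
scalar gpa_min = @min(gpa)

' Restricted sample: count again and see how many observations dropped out
smpl if gpa>1 and gpa<5
scalar n_kept = @obs(gpa)
scalar n_dropped = n_all - n_kept

' Histogram and Stats for the restricted sample
gpa.hist
```

Double-click any of the scalars in the workfile window to see its value in the status line.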
Was it a good idea to eliminate these two observations? This question can’t be answered by statistical analysis—you need to apply subject area knowledge. In this case, we might have chosen instead to “correct” the data by changing 39 to 3.9 and .26 to 2.6. (Although, one is left with the nagging question of whether there might really have been an applicant with a 0.26 GPA.) When we eliminated two grade observations by changing the sample, we also cut out data for other series for these two individuals. Their state of residence or LSAT scores might still be of interest, for example. There’s no right or wrong about this “side effect.” You just want to be aware that it’s happening.
Hint: If you want to eliminate data errors for one series without affecting which observations are used for other series in an analysis, change the erroneous values to NA instead of cutting them out of the sample.
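One way to follow this hint is to build a cleaned copy of the series rather than overwriting the original. The sketch below assumes the same cutoffs used in the smpl if command above; the name GPA_CLEAN is hypothetical.

```eviews
' Work on every observation, not just the restricted sample
smpl @all

' Keep plausible GPAs; replace everything else with NA
' (the cutoffs 1 and 5 are the same assumption made earlier)
series gpa_clean = @recode(gpa>1 and gpa<5, gpa, na)
```

Because the sample stays at @all, the other series for those two applicants (LSAT scores, state of residence) remain available for analysis.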
The remainder of the statistics panel reports characteristics of the data sample: mean, median, and so on.

Export Hint: If you double-click on the statistics panel, the Text Labels dialog opens. This is the place to manipulate the text display. (See Chapter 5, “Picture This!”) You can also Edit/Copy the text in the statistics panel and then paste the text into your word processor.
The statistic at the bottom of the panel, the Jarque-Bera, tests the hypothesis that the sample is drawn from a normal distribution. The statistic marked “Probability” is the p-value associated with the Jarque-Bera. In this example, with a p-value of 0.000, the report is that it is extremely unlikely that the data follows a normal distribution.
Hint: There are relatively few places in econometrics where normality of the data is important. In particular, there is no requirement that the variables in a regression be normally distributed. I don’t know where this myth comes from.
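If you’re wondering what goes into the Jarque-Bera, it combines the skewness and kurtosis already shown in the statistics panel. Here is a rough sketch of computing it by hand, using the standard textbook formula; the scalar names are hypothetical, and small differences from the panel could arise if your version applies a degrees-of-freedom adjustment.

```eviews
' Jarque-Bera by hand: JB = (N/6) * (S^2 + (K-3)^2/4)
' N = number of observations, S = skewness, K = kurtosis
scalar jb_n = @obs(gpa)
scalar jb_s = @skew(gpa)
scalar jb_k = @kurt(gpa)
scalar jb = (jb_n/6)*(jb_s^2 + ((jb_k-3)^2)/4)

' p-value: upper tail of a chi-squared distribution with 2 degrees of freedom
scalar jb_pval = 1 - @cchisq(jb, 2)
```

A normal distribution has skewness of 0 and kurtosis of 3, so the statistic grows as the sample departs from either.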
One-Way
To look at the complete distribution of a series use One-Way Tabulation…, which lets you Tabulate Series. Initially, it’s best to uncheck both Group into bins if checkboxes. Eliminating binning ensures that we see a complete list of every value appearing in the series from low to high, as well as a count and cumulative count of the number of observations taken by each value.
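For command-line fans, the unbinned tabulation can be sketched as follows. The option names nov and noa are as I recall them from the EViews command reference for the freq view; treat them as an assumption and check the documentation for your version.

```eviews
' One-way tabulation of GPA with both binning rules switched off
' nov = don't bin based on the number of distinct values
' noa = don't bin based on the average count per value
gpa.freq(nov, noa)
```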

Tabulation of GPA provides lots of information. It also illustrates a common problem—too many categories.
This is why the Tabulate Series dialog provides binning control by default.
Binning Control
The Group into bins if field is a three-part control over grouping individual values into bins. Checking # of values tells EViews to create bins if there are more than the specified number of values, and checking Avg. count tells EViews to create bins if the average count in a category is less than specified. Max # of bins, not surprisingly, sets the maximum number of bins. Sometimes you need to play around with these options to get the tabulation that best fits your needs.
As an example, here’s a GPA tabulation that shows broad categories. It’s now easy to see that 15 percent of applicants had below a 3.0 average and 10 applicants, 0.61 percent of the applicant pool, did report GPAs above 4.0.
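The three parts of the binning control have command-line counterparts. The sketch below is illustrative only: the thresholds (20 values, 5 bins) are assumptions picked for the example, not the settings used for the tabulation described above, and the option names are as I recall them from the EViews command reference.

```eviews
' Binned one-way tabulation of GPA
' v=20 : create bins if there are more than 20 distinct values
' b=5  : use at most 5 bins
' noa  : ignore the average-count rule
gpa.freq(v=20, b=5, noa)
```

Expect to try a few settings before the categories line up the way you want.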

Stats Table
The menu Descriptive Statistics & Tests/Stats Table creates a table with pretty much the same information as is found in the statistics panel of Histogram and Stats. This table format has the advantage that it’s easier to copy-and-paste into your word processor or spreadsheet program.
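If you’d like to keep the table around as an object in the workfile, a command along these lines should do it. This is a sketch; the table name GPA_STATS is made up for illustration.

```eviews
' Show the descriptive statistics table for GPA
gpa.stats

' Freeze the same view into a named table object
freeze(gpa_stats) gpa.stats
```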
Stats By Classification
A common first step on the road from data description to data analysis is asking whether the basic series statistics differ for sub-groups of the population. Clicking Descriptive Statistics & Tests/Stats by Classification… brings up the Statistics By Classification dialog. You’ll see a field called Series/Group for classify smack in the upper center of the dialog. Enter one or more series (or groups) here, confirm the dialog, and you get summary statistics computed for all the distinct combinations of values of the classifying series.
Here’s a simple example. In our workfile, the variable WASH equals one for Washington State residents and zero for everyone else. Using WASH as the classifying variable gives the results shown to the right. About 63 percent of applications (1028 out of 1639) were from out of state, and the out-of-state applicants averaged a slightly higher GPA.
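The command-line version of this example is short. A minimal sketch, assuming the restricted sample from earlier in the chapter is still in effect:

```eviews
' Descriptive statistics for GPA, computed separately for each value of WASH
' (WASH = 1 for Washington State residents, 0 for everyone else)
gpa.statby wash
```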

If we wanted to see the effect of state and having a relatively high LSAT score (Law School Admission Test), we could fill out the Series/Group for classify field with both WASH and LSAT>160.
Now we get a table showing mean, standard deviation, and the number of observations for all four combinations of Washington resident/not resident and high/low LSAT. The list of statistics reported appears in the upper left-hand corner of the statistics table so that you’ll have a key handy for reading the results.
The left-hand side of the Statistics By Classification dialog has a series of checkboxes for selecting the statistics you’d like to see. The Output Layout field, on the right-hand side, provides some control over the appearance of the table and whether you want “margin” statistics: the “All” row and the “All” column.
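The two-way classification by residency and high/low LSAT can be sketched from the command line too. To stay on safe ground, this version first stores the LSAT>160 comparison in a dummy series; the name HI_LSAT is hypothetical.

```eviews
' 1 if the applicant scored above 160 on the LSAT, 0 otherwise
series hi_lsat = lsat>160

' Statistics for GPA across all four combinations of WASH and HI_LSAT
gpa.statby wash hi_lsat
```

The dialog lets you type the expression LSAT>160 directly; the extra series here just makes the classification explicit and reusable.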
Looking at statistics by classification makes sense when the classifying variable has a small set of distinct values. When the classifying variable takes on a large number of values, it’s sometimes better to clump together values into a small number of groups or “bins.” The Group into bins if field in the lower center of the dialog lets you instruct EViews to group different values of the classifying variable into a single bin. (See Binning Control, earlier in this chapter.)