Chapter 6. Working with Data

In the following discussion, we describe EViews’ powerful language for using numeric expressions and generating and manipulating the data in series and groups. We first describe the fundamental rules for working with mathematical expressions in EViews, and then describe how to use these expressions in working with series and group data.

More advanced tools for working with numeric data, and objects for working with different kinds of data are described in Chapter 7, “Working with Data (Advanced)”.

Numeric Expressions

One of the most powerful features of EViews is the ability to use and to process mathematical expressions. EViews contains an extensive library of built-in operators and functions that allow you to perform complicated mathematical operations on your data with just a few keystrokes. In addition to supporting standard mathematical and statistical operations, EViews provides a number of specialized functions for automatically handling the leads, lags and differences that are commonly found in time series data.

An EViews expression is a combination of numbers, series names, functions, and mathematical and relational operators. In practical terms, you will use expressions to describe all mathematical operations involving EViews objects.

As in other programs, you can use these expressions to calculate a new series from existing series, to describe a sample of observations, or to describe an equation for estimation or forecasting. However, EViews goes far beyond this simple use of expressions by allowing you to use expressions virtually anywhere you would use a series. We will have more on this important feature shortly, but first, we describe the basics of using expressions.

Operators

EViews expressions may include operators for the usual arithmetic operations. The operators for addition (+), subtraction (-), multiplication (*), division (/) and raising to a power (^) are used in standard fashion so that:

5 + 6 * 7.0 / 3

7 + 3e-2 / 10.2345 + 6 * 10^2 + 3e3

3^2 - 9

are all valid expressions. Notice that explicit numerical values may be written in integer, decimal, or scientific notation.

In the examples above, the first expression takes 5 and adds to it the product of 6 and 7.0 divided by 3 (5+14=19); the last expression takes 3 raised to the power 2 and subtracts 9 (9 – 9 = 0). These expressions use the order of evaluation outlined below.

The “-” and “+” operators are also used as the unary minus (negation) and unary plus operators. It follows that:

2-2

-2+2

2+++++++++++++-2

2---2

all yield a value of 0.

EViews follows the usual order in evaluating expressions from left to right, with operator precedence order as follows (from highest precedence to lowest):

unary minus (-), unary plus (+)

exponentiation (^)

multiplication (*), division (/)

addition (+), subtraction (-)

comparison (<, >, <=, >=, =)

and, or

The last two sets of operators are used in logical expressions.

To enforce a particular order of evaluation, you can use parentheses. As in standard mathematical analysis, terms which are enclosed in parentheses are treated as a subexpression and evaluated first, from the innermost to the outermost set of parentheses. We strongly recommend the use of parentheses when there is any possibility of ambiguity in your expression.

To take some simple examples,

-1^2 evaluates to (–1)^2 = 1, since the unary minus is evaluated prior to the power operator.

-1 + -2 * 3 + 4 evaluates to –1 + –6 + 4 = –3. The unary minus is evaluated first, followed by the multiplication, and finally the addition.

(-1 + -2) * (3 + 4) evaluates to –3 * 7 = –21. The unary minuses are evaluated first, followed by the two additions, and then the multiplication.

3*((2+3)*(7+4) + 3) evaluates to 3 * (5*11 + 3) = 3 * 58 = 174.
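
The precedence order above is not shared by every language, which is one more reason to use parentheses. The following is a small Python sketch, not EViews code, and the names `eviews_reading` and `python_default` are ours. It writes each example with the grouping described above made explicit. Python itself applies exponentiation before unary minus, so the unparenthesized form gives a different answer there.

```python
# Each example from the text, written with the grouping implied by the
# precedence order above (unary minus binds before ^) made explicit.
eviews_reading = {
    "-1^2": (-1) ** 2,
    "-1 + -2 * 3 + 4": (-1) + ((-2) * 3) + 4,
    "(-1 + -2) * (3 + 4)": ((-1) + (-2)) * (3 + 4),
    "3*((2+3)*(7+4) + 3)": 3 * ((2 + 3) * (7 + 4) + 3),
}

# Python's own rule: ** is applied before unary minus, so this is -(1 ** 2).
python_default = -1 ** 2

for expression, value in eviews_reading.items():
    print(expression, "=", value)
print("Python's -1 ** 2 =", python_default)
```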

A full listing of operators is presented in Appendix D, “Operator and Function Reference”, on page 573 of the Command and Programming Reference.

Series Expressions

Much of the power of EViews comes from the fact that expressions involving series operate on every observation, or element, of the series in the current sample. For example, the series expression:

2*y + 3

tells EViews to multiply every sample value of Y by 2 and then to add 3. We can also perform operations that work with multiple series. For example:

x/y + z

indicates that we wish to take every observation for X and divide it by the corresponding observation on Y, and add the corresponding observation for Z.
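
To make the observation-by-observation behavior concrete, here is a minimal Python sketch (not EViews code). The three short lists are hypothetical sample values standing in for the series X, Y and Z, and the names `scaled` and `combined` are ours.

```python
# Hypothetical sample values for three series of equal length.
x = [2.0, 6.0, 8.0]
y = [1.0, 2.0, 4.0]
z = [0.5, 1.0, 1.5]

# 2*y + 3: the operation is applied to every observation of y.
scaled = [2 * yi + 3 for yi in y]

# x/y + z: corresponding observations are combined position by position.
combined = [xi / yi + zi for xi, yi, zi in zip(x, y, z)]

print(scaled)
print(combined)
```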

Series Functions

EViews contains an extensive library of built-in functions that operate on all of the elements of a series in the current sample. Some of the functions are “element functions” which return a value for each element of the series, while others are “summary functions” which return scalars, vectors or matrices, which may then be used in constructing new series or working in the matrix language (see Chapter 3, “Matrix Language”, on page 23 of the Command and Programming Reference for a discussion of scalar, vector and matrix operations).

Most function names in EViews are preceded by the @-sign. For example, @mean returns the average value of a series taken over the current sample, and @abs takes the absolute value of each observation in the current sample.

All element functions return NAs when any input value is missing or invalid, or if the result is undefined. Functions which return summary information generally exclude observations for which data in the current sample are missing. For example, the @mean function will compute the mean for those observations in the sample that are non-missing.
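
The difference between the two kinds of function can be sketched in Python (not EViews code). Here `None` stands in for NA, and `elem_abs` and `mean_skip_na` are hypothetical helpers that only mimic the behavior described above for @abs and @mean.

```python
NA = None  # stand-in for a missing value

def elem_abs(series):
    """Element function: one result per observation, NA where the input is NA."""
    return [NA if v is NA else abs(v) for v in series]

def mean_skip_na(series):
    """Summary function: a single number computed over the non-missing observations."""
    present = [v for v in series if v is not NA]
    return sum(present) / len(present) if present else NA

x = [-1.5, NA, 2.0]
print(elem_abs(x))      # same length as x, with the NA kept in place
print(mean_skip_na(x))  # average of -1.5 and 2.0 only
```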

There is an extensive set of functions that you may use with series:

A list of mathematical functions is presented in Appendix D, “Operator and Function Reference”, on page 573 of the Command and Programming Reference.

Workfile functions that provide information about observation identifiers or allow you to construct time trends are described in Appendix E, “Workfile Functions”, on page 589 of the Command and Programming Reference.

Functions for working with strings and dates are documented in “String Function Summary” on page 129 of the Command and Programming Reference and “Date Function Summary” on page 152 of the Command and Programming Reference.

The remainder of this chapter will provide additional examples of expressions involving functions.

Series Elements

At times, you may wish to access a particular observation for a series. EViews provides you with a special function, @elem, which allows you to use a specific value of a series.

@elem takes two arguments: the first argument is the name of the series, and the second is the date or observation identifier.

For example, suppose that you want to use the 1980Q3 value of the quarterly series Y, or observation 323 of the undated series X. Then the functions:

@elem(y, 1980Q3)

@elem(x, 323)

will return the values of the respective series in the respective periods.

Numeric Relational Operators

Relational comparisons may be used as part of a mathematical operation, as part of a sample statement, or as part of an if-condition in programs.

A numeric relational comparison is an expression which contains the “=” (equal), “>=” (greater than or equal), “<=” (less than or equal), “<>” (not equal), “>” (greater than), or “<” (less than) comparison operators. These expressions generally evaluate to TRUE or FALSE, returning a 1 or a 0, depending on the result of the comparison.

Comparisons involving strings are discussed in “String Relational Operators” beginning on page 121 of the Command and Programming Reference.

Note that EViews also allows relational comparisons to take the value “missing” or NA. We defer this point to our discussion of missing values (see “Missing Values” on page 134).

We have already seen examples of expressions using relational operators in our discussion of samples and sample objects. For example, we saw the sample condition:

incm > 5000

which allowed us to select observations meeting the specified condition. This is an example of a relational expression—it is TRUE for each observation on INCM that exceeds 5000; otherwise, it is FALSE.

As described above in the discussion of samples, you may use the “and” and “or” conjunction operators to build more complicated expressions involving relational comparisons:

(incm>5000 and educ>=13) or (incm>10000)

It is worth emphasizing the fact that EViews uses the number 1 to represent TRUE and 0 to represent FALSE. This internal representation means that you can create complicated expressions involving logical subexpressions. For example, you can use relational operators to recode your data:

0*(inc<100) + (inc>=100 and inc<200) + 2*(inc>=200)

which yields 0 if INC<100, 1 if INC is greater than or equal to 100 and less than 200, and 2 for INC greater than or equal to 200.
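
The same recoding arithmetic can be traced in a short Python sketch (not EViews code). Python also treats TRUE as 1 and FALSE as 0 in arithmetic. The function name `recode` and the test values are ours.

```python
def recode(inc):
    # Each comparison is True or False, which Python treats as 1 or 0
    # in arithmetic, mirroring the representation described above.
    return 0 * (inc < 100) + (inc >= 100 and inc < 200) + 2 * (inc >= 200)

# Hypothetical INC values on either side of each cut-off.
recoded = [recode(v) for v in [50, 100, 199, 200, 500]]
print(recoded)
```

Exactly one of the three comparisons is true for any value, so the sum picks out the code attached to that comparison.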

The equality comparison operator “=” requires a bit more discussion, since the equal sign is used both in assigning values and in comparing values. We consider this issue in greater depth when we discuss creating and modifying series (see “Series” on page 137). For now, note that if used in an expression:

incm = 2000

evaluates to TRUE if INCM is exactly 2000, and FALSE otherwise.

Leads, Lags, and Differences

It is easy to work with lags or leads of your series. Simply use the series name, followed by the lag or lead enclosed in parentheses. Lags are specified as negative numbers and leads as positive numbers so that,

income(-4)

is the fourth lag of the income series, while:

sales(2)

is the second lead of sales.
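
The sign convention can be illustrated with a Python sketch (not EViews code). The helper `shift` is hypothetical, and filling positions that fall outside the data with a missing value (`None`) is our assumption for the illustration.

```python
NA = None  # stand-in for a missing value

def shift(series, k):
    """Read the series k observations ahead (k > 0, a lead) or behind (k < 0, a lag)."""
    n = len(series)
    return [series[t + k] if 0 <= t + k < n else NA for t in range(n)]

income = [10, 20, 30, 40, 50]  # hypothetical values
print(shift(income, -1))  # first lag: each observation sees the previous value
print(shift(income, 2))   # second lead: each observation sees the value two ahead
```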

While EViews expects lead and lag arguments to be integers, there is nothing to stop you from putting non-integer values in the parentheses. EViews will automatically convert the number to an integer; you should be warned, however, that the conversion behavior is not guaranteed to be systematic. If you must use non-integer values, you are strongly encouraged to use the @round, @floor, or @ceil functions to control the lag or lead behavior.

In many places in EViews, you can specify a range of lead or lag terms. For example, when estimating equations, you can include expressions of the form:

income(-1 to -4)

to represent all of the INCOME lags from 1 to 4. Similarly, the expressions:

sales sales(-1) sales(-2) sales(-3) sales(-4)

sales(0 to -4)

sales(to -4)

are equivalent methods of specifying the level of SALES and all lags from 1 to 4.

EViews also has several built-in functions for working with difference data in either levels or in logs. The “D” and “DLOG” functions will automatically evaluate the differences for you. For example, instead of taking differences explicitly,

income - income(-1)

log(income) - log(income(-1))

you may use the equivalent expressions,

d(income)

dlog(income)

You can take higher order differences by specifying the difference order. For example, the expressions:

d(income,4)

dlog(income,4)

represent the fourth-order differences of INCOME and log(INCOME).

If you wish to take seasonal differences, you should specify both the ordinary and the seasonal difference terms:

d(income,1,4)

dlog(income,1,4)

These commands produce first order differences with a seasonal difference at lag 4. If you want only the seasonal difference, specify the ordinary difference term to be 0:

d(income,0,4)

dlog(income,0,4)
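
As a rough guide to what these arguments do, the Python sketch below (not EViews code) assumes the usual lag-operator reading: d(x,n,s) applies the ordinary difference n times and then, if s is positive, one difference at lag s, and dlog does the same to the logs. The helper names `diff` and `dlog`, the sample values, and the use of `None` for unavailable observations are ours. The appendix referenced in this section remains the authoritative definition.

```python
import math

def diff(series, n=1, s=0):
    """Apply the ordinary difference n times, then one difference at lag s if s > 0."""
    def lag_diff(values, k):
        out = []
        for t, v in enumerate(values):
            prev = values[t - k] if t - k >= 0 else None
            out.append(None if v is None or prev is None else v - prev)
        return out

    result = list(series)
    for _ in range(n):
        result = lag_diff(result, 1)
    if s > 0:
        result = lag_diff(result, s)
    return result

def dlog(series, n=1, s=0):
    """The same differences taken on the natural logs of the series."""
    return diff([None if v is None else math.log(v) for v in series], n, s)

income = [1, 4, 9, 16, 25, 36]  # hypothetical values
print(diff(income))        # first difference
print(diff(income, 2))     # second-order difference
print(diff(income, 0, 4))  # seasonal difference only
print(diff(income, 1, 4))  # first difference plus seasonal difference at lag 4
```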

Mathematical details are provided in Appendix D, “Operator and Function Reference”, on page 573 of the Command and Programming Reference.

Missing Values

Occasionally, you will encounter data that are not available for some periods or observations, or you may attempt to perform mathematical operations where the results are undefined (e.g., division by zero, log of a negative number). EViews uses the code NA (not available) to represent these missing values.

For the most part, you need not worry about NAs. EViews will generate NAs for you when appropriate, and will automatically exclude observations with NAs from statistical calculations. For example, if you are estimating an equation, EViews will use the set of observations in the sample that have no missing values for the dependent and all of the independent variables.

There are, however, a few cases where you will need to work with NAs, so you should be aware of some of the underlying issues in the handling of NAs.

First, when you perform operations using multiple series, there may be alternative approaches for handling NAs. EViews will usually provide you with the option of casewise exclusion (common sample) or listwise exclusion (individual sample). With casewise exclusion, only those observations for which all of the series have non-missing data are used. This rule is always used, for example, in equation estimation. For listwise exclusion, EViews will use the maximum number of observations possible for each series, excluding observations separately for each series in the list of series. For example, when computing descriptive statistics for a group of series, you have the option to use a different sample for each series.

If you must work directly with NAs, just keep in mind that EViews NAs observe all of the rules of IEEE NaNs. This means that performing mathematical operations on NAs will generate missing values. Thus, each of the following expressions will generate missing values:

@log(-abs(x))

1/(x-x)

(-abs(x))^(1/3)

3*x + NA

exp(x*NA)
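
The propagation rule for arithmetic can be seen with ordinary IEEE NaNs in Python (a sketch, not EViews code). Only the last two expressions are reproduced, since Python raises an error for the log of a negative number and for division by zero instead of returning a missing value. The value of `x` is hypothetical.

```python
import math

NA = math.nan  # an IEEE NaN standing in for the EViews NA
x = 2.0        # hypothetical non-missing observation

results = [
    3 * x + NA,        # 3*x + NA
    math.exp(x * NA),  # exp(x*NA)
]

# Every arithmetic result that involves the NaN is itself a NaN.
all_missing = all(math.isnan(r) for r in results)
print(all_missing)
```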

For the most part, comparisons involving NA values propagate NA values. For example, the commands:

series y = 3

series x = NA

series equal = (y = x)

series greater = (y > x)

will create series EQUAL and GREATER that contain NA values, since the comparison between observations in a series involving an NA yields an NA.

Note that this behavior differs from EViews 4.1 and earlier in which NAs were treated as ordinary values for purposes of equality (“=”) and inequality (“<>”) testing. In these versions of EViews, the comparison operators “=” and “<>” always returned a 0 or a 1. The change in behavior was deemed necessary to support the use of string missing values. In all versions of EViews, comparisons involving ordering (“>”, “<”, “<=”, “>=”) propagate NAs.

It is still possible to perform comparisons using the previous methods. One approach is to use the special functions @EQNA and @NEQNA for performing equality and strict inequality comparisons without propagating NAs. For example, you may use the commands:

series equal1 = @eqna(x, y)

series nequal = @neqna(x, y)

so that NAs in either X or Y are treated as ordinary values for purposes of comparison. Using these two functions, EQUAL1 will be filled with the value 0, and NEQUAL will be filled with the value 1. Note that the @EQNA and @NEQNA functions do not compare their arguments to NA, but rather facilitate the comparison of values so that the results are guaranteed to be 0 or 1. See also “Version 4 Compatibility Mode” on page 97 of the Command and Programming Reference for settings that enable the previous behavior for element comparisons in programs.
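
The contrast between the propagating comparison and the two special functions can be sketched in Python (not EViews code). `None` stands in for NA, and `eq`, `eqna` and `neqna` are hypothetical helpers that only mimic the behavior described in this section for a single pair of observations.

```python
NA = None  # stand-in for a missing value

def eq(a, b):
    """Propagating '=': NA if either side is missing, otherwise 1 or 0."""
    if a is NA or b is NA:
        return NA
    return int(a == b)

def eqna(a, b):
    """Equality with NA treated as an ordinary value: always 1 or 0."""
    if a is NA or b is NA:
        return int(a is NA and b is NA)
    return int(a == b)

def neqna(a, b):
    """Inequality with NA treated as an ordinary value: always 1 or 0."""
    return 1 - eqna(a, b)

y, x = 3, NA  # one observation from each series in the example above
print(eq(y, x), eqna(x, y), neqna(x, y))
```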

To test whether individual observations in a series are NAs, you may use the @ISNA function. For example,

series isnaval = @isna(x)

will fill the series ISNAVAL with the value 1, since each observation in X is an NA.

There is one special case where direct comparison involving NAs does not propagate NAs. If you test equality or strict inequality against the literal NA value:

series equal2 = (x = NA)

series nequal2 = (y <> NA)

EViews will perform a special test against the NA value without propagating NA values. Note that these commands are equivalent to the comparisons involving the special functions:

series equal3 = @eqna(x, NA)

series nequal3 = @neqna(y, NA)

If used in a mathematical operation, a relational expression resulting in an NA is treated as an ordinary missing value. For example, for observations where the series X contains NAs, the mathematical expression

5*(x>3)

will yield NAs. However, if the relational expression is used as part of a sample or IF-statement, NA values are treated as FALSE.
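
The two treatments can be placed side by side in a Python sketch (not EViews code). `None` stands in for NA, the helpers `gt` and `times` are hypothetical, and the values of `x` are made up for the illustration.

```python
NA = None  # stand-in for a missing value

def gt(a, b):
    """Propagating '>': NA if either side is missing, otherwise 1 or 0."""
    return NA if a is NA or b is NA else int(a > b)

def times(k, v):
    """Multiplication that propagates NA."""
    return NA if v is NA else k * v

x = [5, NA, 1]

# Used in arithmetic, the NA comparison result stays missing: 5*(x>3).
scaled = [times(5, gt(v, 3)) for v in x]

# Used as a selection condition, an NA result counts as FALSE.
selected = [v for v in x if gt(v, 3)]

print(scaled)
print(selected)
```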
