Eviews5 / EViews5 / Docs / EViews 5 Users Guide.pdf
[Figure: a notched boxplot and a shaded boxplot]

Boxplots—409

Boxplots

What is a boxplot?

A boxplot, also known as a box and whisker diagram, summarizes the distribution of a set of data by displaying the centering and spread of the data using a few primary elements.

The box portion of a boxplot represents the first and third quartiles (middle 50 percent of the data). These two quartiles are collectively termed the hinges, and the difference between them represents the interquartile range, or IQR. The median is depicted using a line through the center of the box, while the mean is drawn using a symbol.

The inner fences are defined as the first quartile minus 1.5*IQR and the third quartile plus 1.5*IQR. The inner fences are not drawn, but graphic elements known as whiskers and staples show the values that are outside the first and third quartiles, but within the inner fences. The staple is a line drawn at the last data point within (or equal to) each of the inner fences. Whiskers are lines drawn from each hinge to the corresponding staple.
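The definitions above can be sketched in a few lines of Python. This is only an illustration: the function name `box_elements`, the sample data, and the choice of the "inclusive" quartile rule from Python's `statistics` module are assumptions, and EViews may compute quartiles differently.

```python
import statistics

def box_elements(data):
    """Return hinges, IQR, inner fences, and staples for one sample."""
    values = sorted(data)
    # Quartile definitions vary; the "inclusive" rule is an assumption here.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Staples sit at the most extreme data points on or inside the inner fences.
    inside = [v for v in values if lower_fence <= v <= upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "mean": statistics.fmean(values),
        "inner_fences": (lower_fence, upper_fence),
        "staples": (inside[0], inside[-1]),
    }

sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]  # hypothetical data
elements = box_elements(sample)
print(elements["inner_fences"])  # (-3.5, 14.5)
print(elements["staples"])       # (1, 9)
```

In this sample the whiskers would run from the hinges (3.25 and 7.75) to the staples at 1 and 9; the value 30 lies beyond the upper inner fence and is not reached by a whisker.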

[Figure: anatomy of a boxplot, labeling the far outlier, outer fence, near outliers, inner fence, staple, whisker, third quartile, mean, median, and first quartile]

Data points outside the inner fence are known as outliers. To further characterize outliers, we define the outer fences as the first quartile minus 3.0*IQR and the third quartile plus 3.0*IQR. Data between the inner and outer fences are termed near outliers, and those outside the outer fence are referred to as far outliers. A data point lying on an outer fence is considered a near outlier.
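As a rough illustration of these rules, the following Python sketch labels a single value given the two quartiles. The function name `classify` and the example quartiles are hypothetical; the boundary handling follows the text (a point on an inner fence is not an outlier, and a point on an outer fence is a near outlier).

```python
def classify(value, q1, q3):
    """Label a data point relative to the inner and outer fences."""
    iqr = q3 - q1
    inner_low, inner_high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outer_low, outer_high = q1 - 3.0 * iqr, q3 + 3.0 * iqr
    if inner_low <= value <= inner_high:
        return "within fences"
    # Points exactly on an outer fence count as near outliers.
    if outer_low <= value <= outer_high:
        return "near outlier"
    return "far outlier"

# Hypothetical quartiles: Q1 = 3.25, Q3 = 7.75, so IQR = 4.5,
# inner fences at -3.5 and 14.5, outer fences at -10.25 and 21.25.
labels = [classify(v, 3.25, 7.75) for v in (9, 18, 21.25, 30)]
print(labels)
# ['within fences', 'near outlier', 'near outlier', 'far outlier']
```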

A shaded region or notch may be added to the boxplot to display approximate confidence intervals for the median (under certain restrictive statistical assumptions). The bounds of the shaded or notched area are defined by the median ± 1.57*IQR/√N, where N is the number of observations. Notching is useful in indicating whether two samples were drawn from populations with the same median; roughly speaking, if the notches of two boxes do not overlap, then the medians may be said to differ with 95% confidence. It is worth noting that in some cases, most likely involving small numbers of observations, the notches may be bigger than the boxes.
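The notch calculation can be sketched as follows, taking the half-width to be 1.57*IQR divided by the square root of N. The function names and the numbers are hypothetical, and the overlap check is only the rough visual comparison described above, not a formal test.

```python
import math

def notch_bounds(median, iqr, n):
    """Approximate confidence interval for the median: median +/- 1.57*IQR/sqrt(N)."""
    half_width = 1.57 * iqr / math.sqrt(n)
    return (median - half_width, median + half_width)

def notches_overlap(a, b):
    """True if two (low, high) notch intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

# Hypothetical box with Q1 = 3.25, median = 5.5, Q3 = 7.75 and only 9 observations.
low, high = notch_bounds(median=5.5, iqr=4.5, n=9)
notch_exceeds_box = low < 3.25 and high > 7.75
print(round(low, 3), round(high, 3), notch_exceeds_box)
```

With so few observations the notch extends past both hinges, which is the small-sample case mentioned above where the notch is bigger than the box.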

410—Chapter 13. Statistical Graphs from Series and Groups

Boxplots are often drawn so that the widths of the boxes are uniform. Alternatively, the box widths can be varied as a measure of the sample size for each box, with widths drawn proportional to N, or proportional to the square root of N.
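A minimal sketch of the three width rules is given below. The function name `box_widths`, the mode strings, and the convention of scaling the widest box to `max_width` are assumptions made for illustration; the actual drawing scale used by EViews is not specified here.

```python
import math

def box_widths(counts, mode="fixed", max_width=1.0):
    """Relative box widths for samples of the given sizes."""
    if mode == "fixed":
        weights = [1.0] * len(counts)
    elif mode == "proportional":
        weights = [float(n) for n in counts]
    elif mode == "sqrt":
        weights = [math.sqrt(n) for n in counts]
    else:
        raise ValueError(f"unknown mode: {mode}")
    # Scale so that the widest box has width max_width (an assumed convention).
    largest = max(weights)
    return [max_width * w / largest for w in weights]

counts = [25, 100, 400]  # hypothetical sample sizes
print(box_widths(counts, "fixed"))         # [1.0, 1.0, 1.0]
print(box_widths(counts, "proportional"))  # [0.0625, 0.25, 1.0]
print(box_widths(counts, "sqrt"))          # [0.25, 0.5, 1.0]
```

The square-root rule compresses the differences: a sample sixteen times larger gets a box four times wider rather than sixteen times wider.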

Creating a boxplot

The boxplot view can be created from a series for various subgroups of your sample, or from a group.

Boxplots by Classification

From a series, select View/Descriptive Statistics/Boxplots by Classification… to display the Boxplots by Classification dialog.

In the Series/Group for classify field, enter series or group names that define your subgroups. You may type more than one series or group name; separate each name by a space.

If the classification field is left blank, statistics will be calculated for the entire sample of observations; otherwise, descriptive statistics will be calculated for each unique value of the classification series (unless automatic binning is performed).

You may specify the NA handling and the grouping options as described in “Stats by Classification” beginning on page 312. The Show boxplot for total option allows you to include a box of the summary statistics for the (ungrouped) series in the boxplot view.
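Conceptually, classification splits the series by each unique classifier value and summarizes each piece, optionally adding a summary of the whole series. The Python sketch below illustrates the idea; the function names, the data, and the "inclusive" quartile rule are assumptions, and NA handling and automatic binning are not modeled.

```python
import statistics
from collections import defaultdict

def summarize(values):
    """Quartiles and mean for one subgroup (needs at least two values)."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {"q1": q1, "median": median, "q3": q3, "mean": statistics.fmean(values)}

def summaries_by_class(values, classes, show_total=False):
    """One summary per unique classifier value, plus an optional 'Total' entry."""
    groups = defaultdict(list)
    for value, label in zip(values, classes):
        groups[label].append(value)
    result = {label: summarize(groups[label]) for label in sorted(groups)}
    if show_total:
        # The total box pools all observations, ignoring the classifier.
        result["Total"] = summarize(list(values))
    return result

f = [2.0, 4.0, 6.0, 8.0, 10.0, 20.0, 30.0, 40.0]  # hypothetical series
ids = [1, 1, 1, 1, 2, 2, 2, 2]                    # hypothetical classifier
boxes = summaries_by_class(f, ids, show_total=True)
print(list(boxes))               # [1, 2, 'Total']
print(boxes["Total"]["median"])  # 9.0
```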

The set of options provided in Display in boxplot allows you to customize the initial appearance of the boxplot. By default, the Fixed Width boxplots will show Medians, Means, Near outliers and Far outliers, as well as Shaded confidence intervals for the medians. You need not make final decisions here, since all box display elements may subsequently be shown or hidden by modifying the resulting view.

Lastly, the Options button may be used to open a dialog allowing you to customize the calculation of your quartiles.
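The choice of quartile definition matters because it moves the hinges and, through the IQR, the fences. The specific methods offered by EViews are not listed in this section, so the sketch below uses the two rules available in Python's `statistics.quantiles` purely as stand-ins to show the size of the effect on a small hypothetical sample.

```python
import statistics

sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]  # hypothetical data

results = {}
for method in ("exclusive", "inclusive"):
    # The two rules interpolate at different positions between sorted data points.
    q1, _, q3 = statistics.quantiles(sample, n=4, method=method)
    results[method] = {"q1": q1, "q3": q3, "iqr": q3 - q1}

for method, stats in results.items():
    print(method, stats)
```

For this sample the "exclusive" rule gives an IQR of 5.5 and the "inclusive" rule an IQR of 4.5, so the inner and outer fences shift accordingly.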

Here we elect to display the boxplots for the series F categorized by the series ID. All of the settings are at the default values, except that we have selected the Show boxplot for total option.

Since there are 10 distinct values for ID, we display a separate boxplot for each value, showing the quartiles, means, outliers and median confidence intervals for F. Also displayed is a boxplot labeled “Total”, corresponding to a group made up of all of the observations for F in the current sample.

As noted above, you may elect to modify the characteristics of your boxplot display. Simply click anywhere in the main graph to bring up the Graph Options dialog.

The left-hand side of the dialog repeats the display settings from the main boxplot create dialog. Here, you may choose to show or hide elements of the boxplot, and may modify the appearance of confidence intervals, or box widths.

In the right-hand portion of the dialog, you may customize individual elements of your graph. Simply select an element to customize by using the Element listbox, or by clicking on the depiction of a boxplot element in the Preview window, and then choose, as appropriate, the Color, Line pattern, Line/Symbol width, and Symbol type. Note that each boxplot element is represented by either a line or a symbol, and that the dialog will show the appropriate choice for the selected element.

The preview window will change to display the current settings for your graph. To keep the current settings, click on Apply. To revert to the original graph settings, click on Undo Edits.

It is worth pointing out that the Graph Options dialog for boxplots does not include a Type or a Legend tab. Boxplots, like a number of other statistical graphs, do not allow you to change the graph type. In addition, boxplots use text objects in place of legends for labeling the graph.

To specify the axis labels for a boxplot, click on the Axes & Scaling tab, and select the Bottom - Dates or Observations axis in the Edit Axis combo box. You may use the listbox on the right side of the dialog to edit, display, or hide individual box labels. You may also change the box label font or display the labels at an angle.

Boxplots for a Group of Series

To display a boxplot for each series in a group, open a group window and select View/Descriptive Statistics/Boxplots…. EViews will open the Group Boxplots dialog. You may use the dialog to customize the initial appearance of the boxplot.

By default, the Display options are set to show all of the boxplot components: Medians, Means, Near outliers and Far outliers, and Shaded confidence intervals for the medians. You may elect to hide any of the basic elements by unchecking the corresponding checkbox, and you may use the radio buttons to change the display of the median confidence intervals: None or Notched.

The Boxwidth settings are, by default, set to show Fixed Width boxes, but you may elect to draw boxes that have widths that are proportional to the number of observations in each series (Proportional), or proportional to the square root of the number of observations (Sqrt proportional).

You may select the Balance sample checkbox, so that EViews will eliminate from the calculation sample those observations with a missing value for any of the series in the group. If this option is not selected, EViews will use the individual sample for each series.
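The difference between a balanced sample and individual samples can be illustrated with a short Python sketch. The function names and data are hypothetical, and missing values are represented here by `None` or NaN rather than by the EViews NA value.

```python
import math

def is_missing(x):
    """Treat None and NaN as missing values."""
    return x is None or (isinstance(x, float) and math.isnan(x))

def group_samples(series, balance=False):
    """Return the observations used for each series' boxplot."""
    names = list(series)
    if balance:
        # Keep only observations (rows) where every series has a value.
        rows = [row for row in zip(*(series[name] for name in names))
                if not any(is_missing(x) for x in row)]
        return {name: [row[i] for row in rows] for i, name in enumerate(names)}
    # Otherwise each series keeps all of its own non-missing observations.
    return {name: [x for x in series[name] if not is_missing(x)] for name in names}

group = {  # hypothetical group of two series
    "X": [1.0, None, 3.0, 4.0],
    "Y": [10.0, 20.0, float("nan"), 40.0],
}
print(group_samples(group, balance=True))   # {'X': [1.0, 4.0], 'Y': [10.0, 40.0]}
print(group_samples(group, balance=False))  # {'X': [1.0, 3.0, 4.0], 'Y': [10.0, 20.0, 40.0]}
```

Balancing makes the boxes directly comparable observation by observation, at the cost of discarding data that is present in only some of the series.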

Lastly, the Options button may be used to open a dialog allowing you to customize the calculation of your quartiles.
