- Assignment.
- Work plan.
- About the implementation.
- General data.
- Analysis of viewing depth for each category and overall.
- Normality hypothesis. Modified χ² test.
- Exponentiality hypothesis. Kolmogorov-Smirnov test.
- Symmetry hypothesis. Quenouille's quick test.
- Output of the Python program:
- Results of the data analysis in Statgraphics.
- Summary table by section:
- Conclusions:
- Regression models.
- Combined output of the Python program.
- Category 14.
- Linear model.
- Multiplicative model.
- Reciprocal-X model.
- Conclusions:
- Analysis of relationships between categories. Correlation analysis.
- Example.
- Appendix. Source code.
Linear model.
Linear model: Y = a + b*X
Coefficients

| Parameter | Least Squares Estimate | Standard Error | T Statistic | P-Value |
| --- | --- | --- | --- | --- |
| Intercept | 1517.72 | 638.711 | 2.37622 | 0.0195 |
| Slope | -2.07708 | 2.12035 | -0.979591 | 0.3298 |

Analysis of Variance

| Source | Sum of Squares | Df | Mean Square | F-Ratio | P-Value |
| --- | --- | --- | --- | --- | --- |
| Model | 3.05868E7 | 1 | 3.05868E7 | 0.96 | 0.3298 |
| Residual | 2.96433E9 | 93 | 3.18746E7 | | |
| Total (Corr.) | 2.99492E9 | 94 | | | |

Correlation Coefficient = -0.101059
R-squared = 1.02129 percent
R-squared (adjusted for d.f.) = -0.0429978 percent
Standard Error of Est. = 5645.76
Mean absolute error = 2191.33
Durbin-Watson statistic = 0.232882 (P=0.0000)
Lag 1 residual autocorrelation = 0.532628
Col_2 = 1517.72 - 2.07708*Col_1
Since the P-value in the ANOVA table is greater than or equal to 0.05, there is no statistically significant relationship between Col_2 and Col_1 at the 95.0% or higher confidence level.
The R-Squared statistic indicates that the model as fitted explains 1.02129% of the variability in Col_2. The correlation coefficient equals -0.101059, indicating a relatively weak relationship between the variables. The standard error of the estimate shows the standard deviation of the residuals to be 5645.76. This value can be used to construct prediction limits for new observations by selecting the Forecasts option from the text menu.
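The summary statistics quoted here follow directly from the ANOVA sums of squares, so they can be cross-checked by hand. The sketch below is only an illustration: the function name `regression_summary` and its return keys are hypothetical, and the inputs are the rounded values printed in the linear-model ANOVA table, so the results agree with the report only up to rounding.

```python
import math

def regression_summary(ss_model, ss_residual, df_model, df_residual):
    """Derive F, R-squared, adjusted R-squared and the standard error
    of the estimate from ANOVA sums of squares (hypothetical helper)."""
    ss_total = ss_model + ss_residual
    df_total = df_model + df_residual
    ms_model = ss_model / df_model
    ms_residual = ss_residual / df_residual
    r_squared = ss_model / ss_total
    # Adjusted R-squared compares mean squares instead of sums of squares.
    adj_r_squared = 1.0 - ms_residual / (ss_total / df_total)
    return {
        "f_ratio": ms_model / ms_residual,
        "r_squared_pct": 100.0 * r_squared,
        "adj_r_squared_pct": 100.0 * adj_r_squared,
        "std_error_est": math.sqrt(ms_residual),
    }

# Rounded values copied from the linear-model ANOVA table.
linear = regression_summary(3.05868e7, 2.96433e9, 1, 93)
for name, value in linear.items():
    print(f"{name}: {value:.6g}")
```

With a single predictor the F-ratio is also the square of the slope's t statistic, which gives one more consistency check against the coefficients table.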
The mean absolute error (MAE) of 2191.33 is the average absolute value of the residuals. The Durbin-Watson (DW) statistic tests the residuals for significant correlation based on the order in which they occur in the data file. Since the P-value is less than 0.05, there is an indication of possible serial correlation at the 95.0% confidence level. Plot the residuals versus row order to see whether any pattern is visible.
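For reference, the Durbin-Watson statistic and the lag 1 residual autocorrelation can both be computed from the ordered residuals with the standard textbook formulas. The sketch below uses made-up residual sequences, not the report's data, and the function names are hypothetical. Statgraphics may use slightly different conventions, so this is an illustration rather than a reproduction of its output.

```python
def durbin_watson(residuals):
    """DW = sum of squared successive differences / sum of squared residuals."""
    diffs = [b - a for a, b in zip(residuals, residuals[1:])]
    return sum(d * d for d in diffs) / sum(e * e for e in residuals)

def lag1_autocorrelation(residuals):
    """Sample lag 1 autocorrelation of the mean-centred residuals."""
    mean = sum(residuals) / len(residuals)
    centred = [e - mean for e in residuals]
    num = sum(a * b for a, b in zip(centred, centred[1:]))
    return num / sum(e * e for e in centred)

# Hypothetical residuals: a slow drift (positive serial correlation)
# and a strict alternation (negative serial correlation).
drifting = [3.0, 2.0, 1.0, 0.0, -1.0, -2.0, -3.0]
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

dw_drift = durbin_watson(drifting)
dw_alt = durbin_watson(alternating)
r1_drift = lag1_autocorrelation(drifting)

print(dw_drift, dw_alt, r1_drift)
```

Values of DW well below 2, as in the reports here, point to positive serial correlation; values well above 2 point to negative serial correlation.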
Multiplicative model.
Multiplicative model: Y = a*X^b
Coefficients

| Parameter | Least Squares Estimate | Standard Error | T Statistic | P-Value |
| --- | --- | --- | --- | --- |
| Intercept | 9.81502 | 0.465532 | 21.0835 | 0.0000 |
| Slope | -1.90126 | 0.113276 | -16.7844 | 0.0000 |

NOTE: intercept = ln(a)
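
The note reflects how a multiplicative model is usually fitted: taking logarithms turns Y = a*X^b into ln(Y) = ln(a) + b*ln(X), which is an ordinary straight-line fit. The sketch below illustrates this with a hypothetical helper `fit_multiplicative` and synthetic, noise-free data generated from the reported coefficients. It is not the report's data and not necessarily how the report's Python program is written.

```python
import math

def fit_multiplicative(xs, ys):
    """Fit Y = a * X**b by least squares on ln(Y) versus ln(X).
    Returns (ln_a, b); all x and y values must be positive."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    b = sxy / sxx
    ln_a = mean_y - b * mean_x
    return ln_a, b

# Synthetic points lying exactly on the reported curve (assumption:
# used only to show that the log-log fit recovers the coefficients).
LN_A_REPORTED, B_REPORTED = 9.81502, -1.90126
xs = [10.0, 50.0, 200.0, 800.0, 1500.0]
ys = [math.exp(LN_A_REPORTED + B_REPORTED * math.log(x)) for x in xs]

ln_a_hat, b_hat = fit_multiplicative(xs, ys)
a_hat = math.exp(ln_a_hat)  # back-transform the intercept to a
print(ln_a_hat, b_hat, a_hat)
```

Because the fit is done on the log scale, the standard error and MAE reported for this model are in units of ln(Y), not Y.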
Analysis of Variance

| Source | Sum of Squares | Df | Mean Square | F-Ratio | P-Value |
| --- | --- | --- | --- | --- | --- |
| Model | 601.196 | 1 | 601.196 | 281.72 | 0.0000 |
| Residual | 198.466 | 93 | 2.13404 | | |
| Total (Corr.) | 799.662 | 94 | | | |

Correlation Coefficient = -0.867071
R-squared = 75.1812 percent
R-squared (adjusted for d.f.) = 74.9144 percent
Standard Error of Est. = 1.46084
Mean absolute error = 1.21891
Durbin-Watson statistic = 0.125032 (P=0.0000)
Lag 1 residual autocorrelation = 0.885756
Col_2 = exp(9.81502 - 1.90126*ln(Col_1))
Unusual Residuals

| Row | X | Y | Predicted Y | Residual | Studentized Residual |
| --- | --- | --- | --- | --- | --- |
| 93 | 1357.0 | 1.0 | 0.0202652 | 0.979735 | 2.89 |
| 94 | 1429.0 | 1.0 | 0.0183681 | 0.981632 | 2.97 |
| 95 | 1796.0 | 1.0 | 0.0118937 | 0.988106 | 3.35 |

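
The predicted values and residuals listed for rows 93 to 95 can be reproduced from the fitted equation Col_2 = exp(9.81502 - 1.90126*ln(Col_1)). The sketch below does that with a hypothetical `predict` helper. The residual is taken as Y minus predicted Y on the original scale, which matches the listed figures. The studentized residuals are not recomputed because they depend on the full data set, which is not shown here.

```python
import math

LN_A = 9.81502   # reported intercept, equal to ln(a)
B = -1.90126     # reported slope

def predict(x):
    """Predicted Y from the fitted multiplicative model."""
    return math.exp(LN_A + B * math.log(x))

# (row, X, Y) copied from the unusual-residuals listing.
rows = [(93, 1357.0, 1.0), (94, 1429.0, 1.0), (95, 1796.0, 1.0)]

results = []
for row, x, y in rows:
    y_hat = predict(x)
    results.append((row, x, y, y_hat, y - y_hat))
    print(f"row {row}: predicted = {y_hat:.7f}, residual = {y - y_hat:.6f}")
```

All three rows combine a large X with the minimum observed Y of 1.0, where the fitted curve has already dropped close to zero.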
Since the P-value in the ANOVA table is less than 0.05, there is a statistically significant relationship between Col_2 and Col_1 at the 95.0% confidence level.
The R-Squared statistic indicates that the model as fitted explains 75.1812% of the variability in Col_2. The correlation coefficient equals -0.867071, indicating a moderately strong relationship between the variables. The standard error of the estimate shows the standard deviation of the residuals to be 1.46084. This value can be used to construct prediction limits for new observations by selecting the Forecasts option from the text menu.
The mean absolute error (MAE) of 1.21891 is the average absolute value of the residuals. The Durbin-Watson (DW) statistic tests the residuals for significant correlation based on the order in which they occur in the data file. Since the P-value is less than 0.05, there is an indication of possible serial correlation at the 95.0% confidence level. Plot the residuals versus row order to see whether any pattern is visible.
