

REGR_ (Linear Regression) Functions

The linear regression functions are:

REGR_SLOPE

REGR_INTERCEPT

REGR_COUNT

REGR_R2

REGR_AVGX

REGR_AVGY

REGR_SXX

REGR_SYY

REGR_SXY

Syntax

linear_regr::=

{ REGR_SLOPE | REGR_INTERCEPT | REGR_COUNT | REGR_R2 | REGR_AVGX | REGR_AVGY | REGR_SXX | REGR_SYY | REGR_SXY } ( expr1 , expr2 ) [ OVER ( analytic_clause ) ]


See Also: "Analytic Functions" on page 6-10 for information on syntax, semantics, and restrictions

Purpose

The linear regression functions fit an ordinary-least-squares regression line to a set of number pairs. You can use them as both aggregate and analytic functions.

See Also:

"Aggregate Functions" on page 6-8

"About SQL Expressions" on page 4-2 for information on valid forms of expr

Oracle applies the function to the set of (expr1, expr2) pairs after eliminating all pairs for which either expr1 or expr2 is null. Oracle computes all the regression functions simultaneously during a single pass through the data.
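The pair-elimination rule can be illustrated outside the database. The following is a minimal Python sketch, not Oracle's implementation: None stands in for SQL NULL, and the function name and sample rows are hypothetical.

```python
# Illustrative only: None stands in for SQL NULL.
def non_null_pairs(pairs):
    """Keep only (expr1, expr2) pairs in which neither value is None."""
    return [(y, x) for (y, x) in pairs if y is not None and x is not None]

# Hypothetical sample rows, written as (expr1, expr2) = (y, x).
rows = [(2.0, 1.0), (None, 2.0), (4.1, None), (6.2, 3.0), (None, None)]

kept = non_null_pairs(rows)
print(kept)       # [(2.0, 1.0), (6.2, 3.0)]
print(len(kept))  # 2
```

Only the two complete pairs survive, so every regression function would work from those two pairs alone.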

expr1 is interpreted as a value of the dependent variable (a "y value"), and expr2 is interpreted as a value of the independent variable (an "x value"). Both expressions must be numbers.

REGR_SLOPE returns the slope of the line. The return value is a number and can be null. After the elimination of null (expr1, expr2) pairs, it makes the following computation:

COVAR_POP(expr1, expr2) / VAR_POP(expr2)
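As a rough illustration of this computation, the sketch below reproduces the formula in plain Python. The helper names mirror the SQL functions but are hypothetical, None stands in for NULL, and returning None when the variance of expr2 is zero is this sketch's assumption about the "can be null" case.

```python
def covar_pop(ys, xs):
    """Population covariance of two equally long lists."""
    n = len(xs)
    mean_y = sum(ys) / n
    mean_x = sum(xs) / n
    return sum((y - mean_y) * (x - mean_x) for y, x in zip(ys, xs)) / n

def var_pop(values):
    """Population variance of a list."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n

def regr_slope(pairs):
    """Slope of y on x, where pairs are (expr1, expr2) = (y, x)."""
    pairs = [(y, x) for y, x in pairs if y is not None and x is not None]
    if not pairs:
        return None
    ys = [y for y, _ in pairs]
    xs = [x for _, x in pairs]
    vx = var_pop(xs)
    if vx == 0:
        return None  # assumption: no slope is defined for a constant x
    return covar_pop(ys, xs) / vx

# Hypothetical points lying on y = 2x + 1, plus one incomplete pair.
print(regr_slope([(3.0, 1.0), (5.0, 2.0), (7.0, 3.0), (None, 4.0)]))
```

For these points the result is 2 (up to floating-point rounding), and the incomplete pair has no effect.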

REGR_INTERCEPT returns the y-intercept of the regression line. The return value is a number and can be null. After the elimination of null (expr1, expr2) pairs, it makes the following computation:

AVG(expr1) - REGR_SLOPE(expr1, expr2) * AVG(expr2)
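The intercept formula can be sketched the same way. This is an illustrative Python version under the same assumptions as before (None for NULL, hypothetical function name); the slope is computed from sums of deviations, which equals COVAR_POP / VAR_POP because the pair count cancels.

```python
def regr_intercept(pairs):
    """Intercept of y on x, where pairs are (expr1, expr2) = (y, x)."""
    pairs = [(y, x) for y, x in pairs if y is not None and x is not None]
    if not pairs:
        return None
    n = len(pairs)
    mean_y = sum(y for y, _ in pairs) / n
    mean_x = sum(x for _, x in pairs) / n
    sxx = sum((x - mean_x) ** 2 for _, x in pairs)
    if sxx == 0:
        return None  # assumption: undefined when x is constant
    slope = sum((y - mean_y) * (x - mean_x) for y, x in pairs) / sxx
    # AVG(expr1) - REGR_SLOPE(expr1, expr2) * AVG(expr2)
    return mean_y - slope * mean_x

# Hypothetical points lying on y = 2x + 1.
print(regr_intercept([(3.0, 1.0), (5.0, 2.0), (7.0, 3.0)]))
```

For points on y = 2x + 1 the result is 1 (up to floating-point rounding).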

REGR_COUNT returns an integer that is the number of non-null number pairs used to fit the regression line.

REGR_R2 returns the coefficient of determination (also called "R-squared" or "goodness of fit") for the regression. The return value is a number and can be null. VAR_POP(expr1) and VAR_POP(expr2) are evaluated after the elimination of null pairs. The return values are:

6-130 Oracle9i SQL Reference


NULL if VAR_POP(expr2) = 0

1 if VAR_POP(expr1) = 0 and VAR_POP(expr2) != 0

POWER(CORR(expr1, expr2), 2) if VAR_POP(expr1) > 0 and VAR_POP(expr2) != 0
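The three cases can be traced in a short Python sketch. It is an illustration of the rules listed above rather than Oracle's code: None stands in for NULL, the function name is hypothetical, and returning None for an empty input is an assumption. The squared correlation is written as covariance squared over the product of the variances.

```python
def regr_r2(pairs):
    """Coefficient of determination for pairs (expr1, expr2) = (y, x)."""
    pairs = [(y, x) for y, x in pairs if y is not None and x is not None]
    if not pairs:
        return None  # assumption: nothing to fit
    n = len(pairs)
    mean_y = sum(y for y, _ in pairs) / n
    mean_x = sum(x for _, x in pairs) / n
    var_y = sum((y - mean_y) ** 2 for y, _ in pairs) / n
    var_x = sum((x - mean_x) ** 2 for _, x in pairs) / n
    if var_x == 0:
        return None   # NULL if VAR_POP(expr2) = 0
    if var_y == 0:
        return 1.0    # 1 if VAR_POP(expr1) = 0 and VAR_POP(expr2) != 0
    covar = sum((y - mean_y) * (x - mean_x) for y, x in pairs) / n
    return covar ** 2 / (var_x * var_y)  # POWER(CORR(expr1, expr2), 2)

print(regr_r2([(1.0, 2.0), (3.0, 2.0)]))  # constant x: None
print(regr_r2([(5.0, 1.0), (5.0, 2.0)]))  # constant y: 1.0
print(regr_r2([(1.0, 1.0), (3.0, 2.0), (2.0, 3.0), (4.0, 4.0)]))  # about 0.64
```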

All of the remaining regression functions return a number and can be null:

REGR_AVGX evaluates the average of the independent variable (expr2) of the regression line. It makes the following computation after the elimination of null (expr1, expr2) pairs:

AVG(expr2)

REGR_AVGY evaluates the average of the dependent variable (expr1) of the regression line. It makes the following computation after the elimination of null (expr1, expr2) pairs:

AVG(expr1)
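Because the averages are taken only over complete pairs, they can differ from a plain average of the same column. The Python sketch below illustrates this with hypothetical data and function names; None stands in for NULL.

```python
def regr_avgx(pairs):
    """Average of x over pairs (y, x) in which both values are present."""
    kept = [x for y, x in pairs if y is not None and x is not None]
    return sum(kept) / len(kept) if kept else None

def regr_avgy(pairs):
    """Average of y over pairs (y, x) in which both values are present."""
    kept = [y for y, x in pairs if y is not None and x is not None]
    return sum(kept) / len(kept) if kept else None

def plain_avg(values):
    """Average that skips only the missing values of a single column."""
    kept = [v for v in values if v is not None]
    return sum(kept) / len(kept) if kept else None

# Hypothetical rows as (expr1, expr2) = (y, x); the middle pair has no y.
rows = [(10.0, 1.0), (None, 5.0), (30.0, 3.0)]

print(regr_avgx(rows))                  # 2.0
print(regr_avgy(rows))                  # 20.0
print(plain_avg([x for _, x in rows]))  # 3.0
```

The x value 5.0 is dropped from the pairwise average because its partner is missing, but it still counts in the single-column average.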

REGR_SXY, REGR_SXX, REGR_SYY are auxiliary functions that are used to compute various diagnostic statistics.

REGR_SXX makes the following computation after the elimination of null (expr1, expr2) pairs:

REGR_COUNT(expr1, expr2) * VAR_POP(expr2)

REGR_SYY makes the following computation after the elimination of null (expr1, expr2) pairs:

REGR_COUNT(expr1, expr2) * VAR_POP(expr1)

REGR_SXY makes the following computation after the elimination of null (expr1, expr2) pairs:

REGR_COUNT(expr1, expr2) * COVAR_POP(expr1, expr2)
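Multiplying a population variance or covariance by the pair count gives a plain sum of squared deviations or cross products. The Python sketch below, with hypothetical names and data, computes the three sums that way and shows one way such sums can be combined: the slope equals SXY / SXX and R-squared equals SXY squared over SXX times SYY.

```python
def regr_sums(pairs):
    """Return (sxx, syy, sxy) for pairs (expr1, expr2) = (y, x)."""
    pairs = [(y, x) for y, x in pairs if y is not None and x is not None]
    if not pairs:
        return None
    n = len(pairs)
    mean_y = sum(y for y, _ in pairs) / n
    mean_x = sum(x for _, x in pairs) / n
    # n * VAR_POP(x) is the sum of squared deviations of x, and so on.
    sxx = sum((x - mean_x) ** 2 for _, x in pairs)
    syy = sum((y - mean_y) ** 2 for y, _ in pairs)
    sxy = sum((y - mean_y) * (x - mean_x) for y, x in pairs)
    return sxx, syy, sxy

# Hypothetical data as (y, x).
sxx, syy, sxy = regr_sums([(1.0, 1.0), (3.0, 2.0), (2.0, 3.0), (4.0, 4.0)])
slope = sxy / sxx
r_squared = sxy ** 2 / (sxx * syy)
print(sxx, syy, sxy)      # 5.0 5.0 4.0
print(slope, r_squared)   # about 0.8 and 0.64
```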

The following examples are based on the sample tables sh.sales and sh.products.

General Linear Regression Example

The following example provides a comparison of the various linear regression functions:


SELECT s.channel_id,
       REGR_SLOPE(s.quantity_sold, p.prod_list_price) SLOPE,
       REGR_INTERCEPT(s.quantity_sold, p.prod_list_price) INTCPT,
       REGR_R2(s.quantity_sold, p.prod_list_price) RSQR,
       REGR_COUNT(s.quantity_sold, p.prod_list_price) COUNT,
       REGR_AVGX(s.quantity_sold, p.prod_list_price) AVGLISTP,
       REGR_AVGY(s.quantity_sold, p.prod_list_price) AVGQSOLD
  FROM sales s, products p
 WHERE s.prod_id = p.prod_id
   AND p.prod_category = 'Men'
   AND s.time_id = TO_DATE('10-OCT-2000')
 GROUP BY s.channel_id;

C      SLOPE     INTCPT       RSQR      COUNT   AVGLISTP   AVGQSOLD
- ---------- ---------- ---------- ---------- ---------- ----------
C -.03529838 16.4548382 .217277422         17 87.8764706 13.3529412
I  -.0108044 13.3082392 .028398018         43  116.77907 12.0465116
P -.01729665 11.3634927 .026191191         33 80.5818182 9.96969697
S -.01277499  13.488506 .000473089         71  52.571831 12.8169014
T -.01026734 5.01019929 .064283727         21       75.2 4.23809524
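The columns of this output are related by the intercept formula given earlier: INTCPT should equal AVGQSOLD - SLOPE * AVGLISTP for each channel. The short Python check below uses the figures copied from the output above; the variable and function names are hypothetical, and small differences are expected because the displayed values are rounded.

```python
# Rows copied from the output above:
# (channel, SLOPE, INTCPT, AVGLISTP, AVGQSOLD)
rows = [
    ("C", -0.03529838, 16.4548382, 87.8764706, 13.3529412),
    ("I", -0.0108044, 13.3082392, 116.77907, 12.0465116),
    ("P", -0.01729665, 11.3634927, 80.5818182, 9.96969697),
    ("S", -0.01277499, 13.488506, 52.571831, 12.8169014),
    ("T", -0.01026734, 5.01019929, 75.2, 4.23809524),
]

def intercept_from_averages(slope, avg_x, avg_y):
    """AVG(expr1) - REGR_SLOPE(expr1, expr2) * AVG(expr2)."""
    return avg_y - slope * avg_x

# Absolute difference between the recomputed and the reported intercept.
differences = {
    channel: abs(intercept_from_averages(slope, avg_x, avg_y) - intcpt)
    for channel, slope, intcpt, avg_x, avg_y in rows
}
for channel, diff in differences.items():
    print(channel, f"{diff:.1e}")
```

Every difference is far below the precision of the printed figures, so the reported intercepts agree with the formula.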

REGR_SLOPE and REGR_INTERCEPT Examples

The following example determines the slope and intercept of the regression line relating the amount sold (the dependent variable) to the quantity sold (the independent variable) for each fiscal year:

SELECT t.fiscal_year,
       REGR_SLOPE(s.amount_sold, s.quantity_sold) "Slope",
       REGR_INTERCEPT(s.amount_sold, s.quantity_sold) "Intercept"
  FROM sales s, times t
 WHERE s.time_id = t.time_id
 GROUP BY t.fiscal_year;

FISCAL_YEAR      Slope  Intercept
----------- ---------- ----------
       1998 49.3934247 71.6015479
       1999 49.3443482 70.1502601
       2000 49.2262135 75.0287476

The following example determines the cumulative slope and cumulative intercept of the regression line for the amount and quantity of sales of two products (270 and 260), for weekend transactions (day_number_in_week = 6 or 7) during the last three weeks (fiscal_week_number of 50, 51, or 52) of 1998:


SELECT t.fiscal_month_number "Month", t.day_number_in_month "Day",
       REGR_SLOPE(s.amount_sold, s.quantity_sold)
         OVER (ORDER BY t.fiscal_month_desc, t.day_number_in_month) AS CUM_SLOPE,
       REGR_INTERCEPT(s.amount_sold, s.quantity_sold)
         OVER (ORDER BY t.fiscal_month_desc, t.day_number_in_month) AS CUM_ICPT
  FROM sales s, times t
 WHERE s.time_id = t.time_id
   AND s.prod_id IN (270, 260)
   AND t.fiscal_year = 1998
   AND t.fiscal_week_number IN (50, 51, 52)
   AND t.day_number_in_week IN (6, 7)
 ORDER BY t.fiscal_month_desc, t.day_number_in_month;

     Month        Day  CUM_SLOPE   CUM_ICPT
---------- ---------- ---------- ----------
        12         12        -68       1872
        12         12        -68       1872
        12         13 -20.244898 1254.36735
        12         13 -20.244898 1254.36735
        12         19 -18.826087       1287
        12         20 62.4561404  125.28655
        12         20 62.4561404  125.28655
        12         20 62.4561404  125.28655
        12         20 62.4561404  125.28655
        12         26 67.2658228 58.9712313
        12         26 67.2658228 58.9712313
        12         27 37.5245541 284.958221
        12         27 37.5245541 284.958221
        12         27 37.5245541 284.958221
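Rows that share the same day show identical cumulative values. This is consistent with the default analytic window when only ORDER BY is given (RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW), under which rows with equal sort keys are treated as peers and all see the same frame. The Python sketch below imitates that behavior with a single sort key; the data and function names are hypothetical.

```python
def slope(pairs):
    """Least-squares slope for pairs (y, x); None if x is constant."""
    n = len(pairs)
    mean_y = sum(y for y, _ in pairs) / n
    mean_x = sum(x for _, x in pairs) / n
    sxx = sum((x - mean_x) ** 2 for _, x in pairs)
    if sxx == 0:
        return None
    return sum((y - mean_y) * (x - mean_x) for y, x in pairs) / sxx

def cumulative_slopes(rows):
    """For each row, fit all rows whose sort key is <= the row's key."""
    result = []
    for day, _, _ in rows:
        # RANGE semantics: peers with the same day are all in the frame.
        frame = [(y, x) for d, y, x in rows if d <= day]
        result.append((day, slope(frame)))
    return result

# Hypothetical rows: (day, amount_sold, quantity_sold), sorted by day.
rows = [
    (12, 100.0, 1.0),
    (12, 210.0, 2.0),
    (13, 290.0, 3.0),
    (13, 410.0, 4.0),
    (19, 480.0, 5.0),
]
for day, value in cumulative_slopes(rows):
    print(day, value)
```

Both rows for day 12 report the same slope, as do both rows for day 13, which mirrors the repeated lines in the output above.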

REGR_COUNT Examples

The following example returns the number of customers in the customers table (out of a total of 319) who have account managers.

SELECT REGR_COUNT(customer_id, account_mgr_id) FROM customers;

REGR_COUNT(CUSTOMER_ID,ACCOUNT_MGR_ID)

--------------------------------------

231

The following example computes the cumulative number of transactions for each day in April of 1998:

SELECT UNIQUE t.day_number_in_month,
       REGR_COUNT(s.amount_sold, s.quantity_sold)
         OVER (PARTITION BY t.fiscal_month_number
               ORDER BY t.day_number_in_month) "Regr_Count"
  FROM sales s, times t
 WHERE s.time_id = t.time_id
   AND t.fiscal_year = 1998
   AND t.fiscal_month_number = 4;

DAY_NUMBER_IN_MONTH Regr_Count
------------------- ----------
                  1        825
                  2       1650
                  3       2475
                  4       3300
.
.
.
                 26      21450
                 30      22200
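The analytic count is produced once per sales row, so every transaction on a given day carries the same running total, and UNIQUE collapses those repeats to one line per day. The Python sketch below imitates this with hypothetical data, assuming every pair is non-null and that rows with the same day are peers in the window.

```python
from collections import Counter

def cumulative_counts(days):
    """For each row, count the rows whose day is <= that row's day."""
    per_day = Counter(days)
    return [
        sum(count for d, count in per_day.items() if d <= day)
        for day in days
    ]

# Hypothetical transaction days: three sales on day 1, two on day 2, one on day 3.
days = [1, 1, 1, 2, 2, 3]

counts = cumulative_counts(days)
print(counts)    # [3, 3, 3, 5, 5, 6]

# Equivalent of SELECT UNIQUE: one (day, running count) line per day.
distinct = sorted(set(zip(days, counts)))
print(distinct)  # [(1, 3), (2, 5), (3, 6)]
```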

REGR_R2 Examples

The following example computes the coefficient of determination of the regression line for amount of sales greater than 5000 and quantity sold:

SELECT REGR_R2(amount_sold, quantity_sold) FROM sales WHERE amount_sold > 5000;

REGR_R2(AMOUNT_SOLD,QUANTITY_SOLD)

----------------------------------

.024087453

The following example computes the cumulative coefficient of determination of the regression line for monthly sales amounts and quantities for each month during 1998:

SELECT t.fiscal_month_number,
       REGR_R2(SUM(s.amount_sold), SUM(s.quantity_sold))
         OVER (ORDER BY t.fiscal_month_number) "Regr_R2"
  FROM sales s, times t
 WHERE s.time_id = t.time_id
   AND t.fiscal_year = 1998
 GROUP BY t.fiscal_month_number
 ORDER BY t.fiscal_month_number;

FISCAL_MONTH_NUMBER    Regr_R2
------------------- ----------
                  1
                  2          1
                  3 .927372984
                  4 .807019972
                  5 .932745567
                  6  .94682861
                  7 .965342011
                  8 .955768075
                  9 .959542618
                 10 .938618575
                 11 .880931415
                 12 .882769189
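The first two lines of this output follow from the REGR_R2 rules. With one monthly point the variance of the independent variable is zero, so the result is null; with two distinct points the line fits exactly, so the result is 1. The Python sketch below reproduces that pattern on hypothetical monthly totals; the function names are illustrative and None stands in for NULL.

```python
def regr_r2(pairs):
    """R-squared for pairs (y, x), following the three documented cases."""
    n = len(pairs)
    mean_y = sum(y for y, _ in pairs) / n
    mean_x = sum(x for _, x in pairs) / n
    var_y = sum((y - mean_y) ** 2 for y, _ in pairs) / n
    var_x = sum((x - mean_x) ** 2 for _, x in pairs) / n
    if var_x == 0:
        return None
    if var_y == 0:
        return 1.0
    covar = sum((y - mean_y) * (x - mean_x) for y, x in pairs) / n
    return covar ** 2 / (var_x * var_y)

def cumulative_r2(monthly):
    """R-squared over months 1..k for each k."""
    return [regr_r2(monthly[: k + 1]) for k in range(len(monthly))]

# Hypothetical monthly totals as (SUM(amount_sold), SUM(quantity_sold)).
monthly = [(100.0, 10.0), (150.0, 12.0), (120.0, 15.0)]

result = cumulative_r2(monthly)
print(result[0])  # None: a single point has no variance in x
print(result[1])  # 1.0: two distinct points always lie on a line
print(result[2])  # below 1 once a third, non-collinear point is added
```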

REGR_AVGY and REGR_AVGX Examples

The following example calculates the regression average for the amount and quantity of sales for each year:

SELECT t.fiscal_year,
       REGR_AVGY(s.amount_sold, s.quantity_sold) "Regr_AvgY",
       REGR_AVGX(s.amount_sold, s.quantity_sold) "Regr_AvgX"
  FROM sales s, times t
 WHERE s.time_id = t.time_id
 GROUP BY t.fiscal_year;

FISCAL_YEAR  Regr_AvgY  Regr_AvgX
----------- ---------- ----------
       1998 716.602044 13.0584283
       1999 714.910831 13.0665536
       2000 717.331304 13.0479781

The following example calculates the cumulative averages of the amount and quantity of sales for product 260 during the last two weeks of December 1998:

SELECT t.day_number_in_month,
       REGR_AVGY(s.amount_sold, s.quantity_sold)
         OVER (ORDER BY t.fiscal_month_desc, t.day_number_in_month) "Regr_AvgY",
       REGR_AVGX(s.amount_sold, s.quantity_sold)
         OVER (ORDER BY t.fiscal_month_desc, t.day_number_in_month) "Regr_AvgX"
  FROM sales s, times t
 WHERE s.time_id = t.time_id
   AND s.prod_id = 260
   AND t.fiscal_month_desc = '1998-12'
   AND t.fiscal_week_number IN (51, 52)
 ORDER BY t.day_number_in_month;

DAY_NUMBER_IN_MONTH  Regr_AvgY  Regr_AvgX
------------------- ---------- ----------
                 14        882       24.5
                 14        882       24.5
                 15        801      22.25
                 15        801      22.25
                 16      777.6       21.6
                 18 642.857143 17.8571429
                 18 642.857143 17.8571429
                 20      589.5     16.375
                 21        544 15.1111111
                 22 592.363636 16.4545455
                 22 592.363636 16.4545455
                 24 553.846154 15.3846154
                 24 553.846154 15.3846154
                 26        522       14.5
                 27      578.4 16.0666667

REGR_SXY, REGR_SXX, and REGR_SYY Examples

The following example computes the REGR_SXY, REGR_SXX, and REGR_SYY values for the regression analysis of amount and quantity of sales for each year in the sample sh.sales table:

SELECT t.fiscal_year,
       REGR_SXY(s.amount_sold, s.quantity_sold) "Regr_sxy",
       REGR_SYY(s.amount_sold, s.quantity_sold) "Regr_syy",
       REGR_SXX(s.amount_sold, s.quantity_sold) "Regr_sxx"
  FROM sales s, times t
 WHERE s.time_id = t.time_id
 GROUP BY t.fiscal_year;

FISCAL_YEAR   Regr_sxy   Regr_syy   Regr_sxx
----------- ---------- ---------- ----------
       1998 1620591607 2.3328E+11 32809865.2
       1999 1955866724 2.7695E+11 39637097.2
       2000 2127877398 3.0630E+11 43226509.7

The following example computes the cumulative REGR_SXY, REGR_SXX, and REGR_SYY statistics for amount and quantity of weekend sales for products 270 and 260 for each year-month value in 1998:

SELECT t.day_number_in_month, REGR_SXY(s.amount_sold, s.quantity_sold)

OVER (ORDER BY t.fiscal_year, t.fiscal_month_desc) "Regr_sxy",
