
These WINKS statistics tutorials explain the use and interpretation of standard statistical analysis techniques for Medical, Pharmaceutical, Clinical Trials, Marketing or Scientific Research. The examples include how-to instructions for WINKS SDA Version 6.0 Software.


Simple Linear Regression

This is one in a series of tutorials using examples from WINKS SDA.

Definition: Simple linear regression is used to develop an equation (a linear regression line) for predicting a value of the dependent variable given a value of the independent variable. The regression line is the line described by the equation, and the regression equation is the formula for that line. The regression equation is given by:

Y = a + bX

where X is the independent variable, Y is the dependent variable, a is the intercept and b is the slope of the line.
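
As a minimal sketch of how a and b can be estimated from data, the usual least-squares estimates are b = Sxy / Sxx and a = mean(Y) - b * mean(X). The Python below is illustrative only: the function name fit_line and the sample data are assumptions made for this example and are not part of WINKS.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line Y = a + bX."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squares of X and sum of cross-products of X and Y
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all X values are identical; slope is undefined")
    b = sxy / sxx               # slope
    a = mean_y - b * mean_x     # intercept
    return a, b


# Hypothetical data lying exactly on the line Y = 1 + 2X
a, b = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(f"Y = {a:.3f} + {b:.3f} * X")
```

Because the sample points fall exactly on a line, the fitted intercept and slope reproduce that line; with real data the points scatter around the fitted line.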

Assumptions: For a fixed value of X (the independent variable), the population of Y (the dependent variable) is normally distributed with equal variances across Xs.

Related statistics: The correlation coefficient, r, measures the strength of the association between X and Y.

Test: A test of whether the slope of the regression line is 0 is used to determine if the regression line shows a statistically significant linear relationship between X and Y. The hypotheses for this test are:

H0: slope = 0
Ha: slope <> 0

A low p-value for this test (less than 0.05) means that there is evidence to believe that the slope of the line is not 0, or that there is a statistically significant linear relationship between the two variables.

Note: This test is equivalent to the test rho = 0 in the correlation procedure.
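
The equivalence in the note above can be illustrated numerically. The sketch below computes the t statistic two ways: from the slope, as b divided by its standard error, and from the correlation coefficient, as r * sqrt(n - 2) / sqrt(1 - r^2). Both have n - 2 degrees of freedom. The function name and the data are hypothetical and chosen only for illustration.

```python
import math


def slope_t_and_r_t(xs, ys):
    """Return (t from the slope test, t from the correlation test)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    # Slope test: t = b / SE(b), where SE(b) = sqrt(MSE / Sxx)
    b = sxy / sxx
    a = mean_y - b * mean_x
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    mse = sse / (n - 2)
    t_slope = b / math.sqrt(mse / sxx)

    # Correlation test: t = r * sqrt(n - 2) / sqrt(1 - r^2)
    r = sxy / math.sqrt(sxx * syy)
    t_r = r * math.sqrt(n - 2) / math.sqrt(1 - r * r)
    return t_slope, t_r


# Hypothetical data (not from the CAR database)
t_slope, t_r = slope_t_and_r_t([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1])
print(f"t from slope = {t_slope:.4f}, t from r = {t_r:.4f}")
```

The two values agree up to floating-point rounding, which is why the two tests give the same p-value.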

Location in WINKS: Simple linear regression is located in the Regression and Correlation procedures menu.

Graphs: Graphs produced with the simple linear regression procedure are:

1. Scatterplot with fitted regression line.

2. Residuals by the independent variable.

3. Residuals by run order.

Examination of the graphs is useful to visually verify that the relationship is linear and that there is no pattern to the residuals. If there is a pattern to the residuals, remedial methods may need to be taken for the analysis. Reference: Neter, Wasserman and Kutner.
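
For readers who want to see what the residual plots are built from, here is a small sketch. A residual is the observed Y minus the fitted value a + bX. The data and the fitted line below are hypothetical, and the helper name residuals is an assumption made for this example.

```python
def residuals(xs, ys, a, b):
    """Observed Y minus fitted value a + b*X, kept in the original (run) order."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]


# Hypothetical observations and a hypothetical fitted line Y = 1 + 2X
xs = [3, 1, 4, 2]
ys = [7.5, 2.5, 9.0, 5.5]
res = residuals(xs, ys, a=1.0, b=2.0)

# Residuals by run order: pair each residual with its position 1, 2, 3, ...
by_run_order = list(enumerate(res, start=1))

# Residuals by the independent variable: pair each residual with its X value
by_x = sorted(zip(xs, res))

print(by_run_order)
print(by_x)
```

Plotting either list of pairs gives the corresponding residual plot. A curve, funnel shape, or trend in those points is the kind of pattern that suggests the linear model or its assumptions may not hold.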

Example: Use the Simple Linear Regression procedure to calculate a prediction equation for HP given the WEIGHT of a car using the CAR database. The partial results are:

 Dependent variable is HP, 1 independent variables, 38 cases.

Variable    Coefficient   St. Error   t-value     p(2-tail)
Intercept   3.4983434     7.3194069   .4779545    0.6326
WEIGHT      34.314394     2.4839847   13.814253   0.000

R-Square = 0.8413      Adjusted R-Square = 0.8369

Analysis of Variance Table
Source       Sum of Sqs   d.f.   Mean Sq     F          p-value
Regression   21768.775    1      21768.775   190.8336   0.000
Error        4106.593     36     114.07203
Total        25875.368    37

A low p-value suggests that the dependent variable HP may be linearly related to independent variable(s).

Pearson's r (Correlation Coefficient)= 0.9172    R-Square=.8413

The linear regression equation is:

  HP = 3.498343 + 34.3144 * WEIGHT

Test of hypothesis to determine significance of relationship:
  H(null): Slope = 0 or H(null): r = 0 (two-tailed test)

  t = 13.81 with 36 degrees of freedom p = 0.000
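
The figures in this output are tied together by simple arithmetic, and checking them is a useful way to understand the table. The sketch below recomputes t, F, R-Square, adjusted R-Square and r from the reported coefficient, standard error and sums of squares. The variable names are chosen for this example only; the numbers are copied from the output above.

```python
import math

# Values copied from the WINKS output above
coef, se = 34.314394, 2.4839847   # WEIGHT coefficient and its standard error
ms_reg = 21768.775                # regression mean square (1 d.f., so it equals the regression SS)
ss_err, df_err = 4106.593, 36     # error sum of squares and its d.f.
ss_total, df_total = 25875.368, 37

t = coef / se                     # t-value for the slope
ms_err = ss_err / df_err          # error mean square
f = ms_reg / ms_err               # F statistic; equals t squared in simple regression
r_square = ms_reg / ss_total      # proportion of variation in HP explained by WEIGHT
adj_r_square = 1 - (1 - r_square) * df_total / df_err
r = math.sqrt(r_square)           # positive root, because the slope is positive

print(f"t = {t:.2f}, F = {f:.4f}, t^2 = {t * t:.4f}")
print(f"R-Square = {r_square:.4f}, Adjusted R-Square = {adj_r_square:.4f}, r = {r:.4f}")
```

The recomputed values agree with the reported ones to the precision shown, which also confirms that the slope test and the ANOVA F test are the same test in simple linear regression.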

The plots produced for this example (the scatterplot with fitted regression line and the residual plots) show the linear relationship and the randomness of the residuals.


 

Warning: Using the regression equation to predict values of the dependent variable outside the range of the independent variable is not recommended since you have no evidence that the same linear relationship exists outside the observed range.
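
One way to respect this warning in practice is to check the requested X value against the observed range before predicting. The sketch below uses the regression equation from the example. The observed range of WEIGHT is not given in this tutorial, so the bounds 2.0 and 4.0 are hypothetical placeholders, as is the function name predict_hp.

```python
def predict_hp(weight, weight_min, weight_max):
    """Predict HP from WEIGHT, refusing to extrapolate beyond the observed range."""
    if not (weight_min <= weight <= weight_max):
        raise ValueError(
            f"WEIGHT {weight} is outside the observed range "
            f"[{weight_min}, {weight_max}]; prediction not recommended"
        )
    # Regression equation from the example: HP = 3.498343 + 34.3144 * WEIGHT
    return 3.498343 + 34.3144 * weight


# Hypothetical observed range; replace with the actual minimum and maximum of WEIGHT
print(f"{predict_hp(3.0, 2.0, 4.0):.1f}")   # prints 106.4
```

Raising an error is only one design choice; returning the prediction together with a warning flag would work as well if extrapolated values are sometimes needed for exploration.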

Related Topics: Correlation, multiple linear regression, polynomial regression.

Simple Linear Regression - Exercise

Data: A random sample of 14 elementary school students is selected, and each student is measured on a creativity score (x) using a well-defined testing instrument and on a task score (y) using a new instrument. The task score is the mean time taken to perform several hand-eye coordination tasks. The data are:

STUDENT   CREATIVITY(X)   TASKS(Y)
AE        28              4.5
FR        35              3.9
HT        37              3.9
IO        50              6.1
DP        69              4.3
YR        84              8.8
QD        40              2.1
SW        65              5.5
DF        29              5.7
ER        42              3.0
RR        51              7.1
TG        45              7.3
EF        31              3.3
TJ        40              5.2

Answer these questions:

1. Calculate the regression equation for this data, and enter it here:

  Y =

2. Is the relationship between tasks score and creativity score statistically significant?

3. What statistic do you base your answer on?

4. Look at a scattergram of Creativity by tasks. Does the relationship look linear?

5. Look at a plot of residuals. Do the residuals look random?

6. Do you think the task score does a good job of estimating the student's creativity score? Why?


© Copyright TexaSoft, 2004-2013