These WINKS SDA statistics tutorials explain the use and interpretation of standard statistical analysis techniques for Medical, Pharmaceutical, Clinical Trials, Marketing or Scientific Research. The examples include how-to instructions for WINKS SDA Version 7.0 Software.

Crosstabulation Analysis (Chi-square)

Crosstabulations can be used to perform a chi-square test for independence or a chi-square test for homogeneity. A two-way table is constructed that displays the number of counts for each category. It must be possible to assume that the data observations are independent and that each data value can be counted in one and only one category. It is also assumed that the number of observations is fixed. SDA allows you to enter data for a two-way table from the keyboard or from a data set.

You can enter data for this analysis from a data set, from the keyboard, or from a count data set.

Examples of each are provided here:

Example 1: Entering Data from a Data Set (Analyze/Crosstabs, Frequencies, Chi-Square/ Crosstabulations/ Chi-Square)

If you choose to enter the information from a data set, you will be prompted to indicate what tables are to be calculated. Select one or more fields for the “Data field” (top right hand list box) and select one or more fields for the “By Var” field (bottom right hand side list box).

For example, open the data file SALARY.SDA (salaries of professors at a college) and produce a table of RANK by SEX.
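Conceptually, building a table from a data set means tallying each record into exactly one cell. The following Python sketch illustrates this with a handful of made-up (RANK, SEX) records; the records and variable names are hypothetical and are not the contents of SALARY.SDA.

```python
from collections import Counter

# Hypothetical records, one (RANK, SEX) pair per professor; NOT the contents of SALARY.SDA.
records = [(1, 2), (1, 1), (2, 2), (3, 1), (3, 2), (4, 1), (2, 2), (4, 2)]

# Each record falls into exactly one cell, so a simple tally builds the table.
cell_counts = Counter(records)
ranks = sorted({rank for rank, _ in records})
sexes = sorted({sex for _, sex in records})
table = [[cell_counts[(rank, sex)] for sex in sexes] for rank in ranks]

# Print each row with its row total.
for rank, row in zip(ranks, table):
    print(rank, row, sum(row))
```

Because every record contributes to one and only one cell, the cell counts always add up to the number of records.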

Step 1: Select Analyze/Crosstabulations, Frequencies, Chi-Square/Crosstabulations, Chi-Square.

Step 2: For the variables to use, select Rank and Sex as shown here:

For all tables, you are prompted to specify what output options you want included in the output tables:

Frequencies, Total Percent, Row Percent, Column Percent, Expected Values, Chi-contribution, Residual, Standardized Residual, Adjusted Residual

For this example, select the “Expected Values” option. Click OK and the following output is produced:

RANKS(rows) by SEX (columns)

FREQUENCY|
EXPECTED |    1|    2|   TOTAL
------------------------------
        1|    7|   20|     27
         | 10.3| 16.7|
------------------------------
        2|   15|   33|     48
         | 18.4| 29.6|
------------------------------
        3|   27|   42|     69
         | 26.4| 42.6|
------------------------------
        4|   18|   13|     31
         | 11.9| 19.1|
------------------------------
TOTAL        67   108     175
           38.3  61.7   100.0

Statistic                       DF      Value         p-value
-----------------------------------------------------------------
Chi-Square                       3      7.905          0.049
Phi Coefficient                          .213
Cramer's V                               .213
Contingency Coefficient                  .208

The calculated Chi-Square value is 7.905 with 3 degrees of freedom. The p-value of 0.049 indicates marginal significance. Assuming the SEX code is 1=Female and 2=Male, the table shows that in the highest rank (4) the observed count of females is larger than expected (18 observed versus 11.9 expected) and the observed count of males is smaller than expected (13 observed versus 19.1 expected). A discrepancy of this kind might indicate a gender bias in how professors are promoted in rank.
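The expected values and the chi-square statistic in the output can be reproduced by hand. The sketch below is plain Python, not WINKS code; the function names are illustrative. Each expected count is the row total times the column total divided by the grand total. Cramer's V and the contingency coefficient use their usual textbook definitions, which are assumed here to match what WINKS computes.

```python
# Observed counts from the RANK by SEX table (rows = rank 1-4, columns = sex 1-2).
observed = [
    [7, 20],
    [15, 33],
    [27, 42],
    [18, 13],
]

def expected_counts(table):
    """Expected count for each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_square(table):
    """Pearson chi-square statistic and its degrees of freedom."""
    expected = expected_counts(table)
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

expected = expected_counts(observed)
stat, df = chi_square(observed)
n = sum(map(sum, observed))

# Textbook definitions (assumed to match the WINKS output).
cramers_v = (stat / (n * min(len(observed) - 1, len(observed[0]) - 1))) ** 0.5
contingency = (stat / (stat + n)) ** 0.5

for row in expected:
    print([round(e, 1) for e in row])
print(round(stat, 3), df, round(cramers_v, 3), round(contingency, 3))
```

Comparing each observed count with its expected count, cell by cell, is also how the question about rank=1 can be approached.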

Question: What do the differences between the expected and observed counts in rank=1 indicate?

This is a test of independence. For this analysis, the contingency table looks at two categorical variables from a single sample of one population and tests whether the two variables are related in some way (e.g., are sex and rank related?). The hypotheses being tested are:

Ho: The variables are independent of each other. (There is no association between them).

Ha: The variables are not independent of each other.

If there is no evidence of an association between them (p is greater than 0.05), there is no evidence of bias. A low p-value indicates rejection of the null hypothesis and, in this case, suggests an association between sex and rank that may reflect bias.

WINKS SDA reports both the chi-square statistic and the p-value. If the expected value in one or more cells is less than 5, the chi-square test may not be valid, and a warning to this effect appears on the screen. In the case of a 2 by 2 table, Fisher's Exact Test and the chi-square with Yates' correction are also performed and their results displayed.

Note: Tables as large as 15 columns by 100 rows may be created by reading data from a data set. If there are more categories than this, SDA combines the remaining categories in a group called REST. To prevent this, you might combine some groups before running the analysis.

Example 2: Entering Data from the keyboard (Analyze/Crosstabs, Frequencies, Chi-Square/ Crosstabulations/ Chi-Square – From Keyboard)

Data for this example are observations of the number of beetles and bugs on the upper and lower sides of leaves (Zar, 1974, page 292).

2 by 2 Contingency Table Data

             Beetles   Bugs
Upper Leaf        12      7
Lower Leaf         2      8

To perform this analysis, follow these steps:

Step 1: Select Analyze/Crosstabulations, Frequencies, Chi-Square/ Crosstabulations, Chi-Square - From Keyboard.

Step 2: You are first prompted to select output options. For this example, just select Frequencies. You are then prompted to indicate the size of the table. When asked for the number of rows and columns, type 2, 2 and press Enter. An empty table appears. Enter counts for each category into the appropriate cell, and choose Calculate. Preliminary results appear on the status bar at the bottom of the screen. You can perform calculations on several tables, and all results will appear in the viewer when you select Exit.

2-Way Contingency Table

FREQUENCY|     |     |   TOTAL
------------------------------
         |   12|    7|     19
------------------------------
         |    2|    8|     10
------------------------------
TOTAL        14    15      29
           48.3  51.7   100.0

WARNING - Some Expected values less than 5. Chi-Square may not be valid.

Statistic                     DF    Value         p-value
-------------------------------------------------------------
Chi-Square                     1    4.887          0.028
Yates' Chi-Square              1    3.312          0.069
Fisher's Exact Test (one-tail)                     0.033
                    (two-tail)                     0.050
Phi Coefficient                      .411
Cramer's V                           .411
Contingency Coefficient              .380
Relative Risk                       3.158
Odds Ratio                          6.857  95% C.I.=(1.124,41.829)
Sensitivity                          .857
Specificity                          .533

Sensitivity, Specificity and RR calculations are based on a
table where the cells are in the following pattern:

      TP   FP
      FN   TN

where T=True, F=False, P=Positive, N=Negative

Step 3: The calculated chi-square statistic is reported as 4.89 with a p-value of 0.028. The chi-square with Yates' correction is 3.31 with a p-value of 0.069, and Fisher's Exact Test (two-tailed) has a p-value of 0.050. Because one of the cells produces an expected value less than 5, SDA gives a warning that the chi-square analysis for these data may not be valid. Given this warning, it is best to rely on Fisher's Exact Test for making a decision.
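For readers who want to check these statistics outside WINKS, here is a minimal Python sketch using only the standard library. The variable names are illustrative. The two-tailed Fisher p-value sums the probabilities of all tables no more likely than the observed one; this is one common convention, assumed here rather than taken from the WINKS documentation.

```python
from math import comb

# Observed 2 by 2 counts: rows = upper/lower leaf, columns = beetles/bugs.
a, b = 12, 7
c, d = 2, 8

n = a + b + c + d
row1, row2 = a + b, c + d
col1, col2 = a + c, b + d

# Smallest expected count; a value below 5 is what triggers the warning.
min_expected = min(r * k / n for r in (row1, row2) for k in (col1, col2))

# Pearson chi-square for a 2 by 2 table, without and with Yates' continuity correction.
denom = row1 * row2 * col1 * col2
chi2 = n * (a * d - b * c) ** 2 / denom
chi2_yates = n * max(abs(a * d - b * c) - n / 2, 0) ** 2 / denom

def table_prob(x):
    """Hypergeometric probability of x in the top-left cell with all margins fixed."""
    return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

lo, hi = max(0, col1 - row2), min(row1, col1)
p_obs = table_prob(a)

# One-tail: tables at least as extreme as the observed one, in the observed direction.
p_upper = sum(table_prob(x) for x in range(a, hi + 1))
p_lower = sum(table_prob(x) for x in range(lo, a + 1))
fisher_one_tail = min(p_upper, p_lower)

# Two-tail: all tables no more probable than the observed one (assumed convention).
fisher_two_tail = sum(
    p for p in (table_prob(x) for x in range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9)
)

print(round(min_expected, 2))
print(round(chi2, 3), round(chi2_yates, 3))
print(round(fisher_one_tail, 3), round(fisher_two_tail, 3))
```

The smallest expected count comes out below 5, which is the condition behind the warning in the output.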

A low p-value indicates rejection of the null hypothesis. At the 0.05 significance level, the Fisher's Exact Test two-tailed p-value of 0.050 sits right at the threshold, so the evidence is borderline; taken at face value, it is just enough to reject the null hypothesis of independence and to conclude that leaf side and type of insect are not independent. In this case it appears that beetles prefer the upper sides of leaves (12 of 14), while bugs are about evenly split (7 upper, 8 lower). The Yates-corrected chi-square (p-value 0.069) would not reject the null hypothesis at the 0.05 level, so by that test the decision is marginal.
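The Relative Risk, Odds Ratio, Sensitivity and Specificity lines of the output follow directly from the four cell counts once they are read in the TP/FP/FN/TN pattern. The Python sketch below uses the standard formulas. The confidence interval uses the usual log-scale (Woolf) method with a multiplier of 1.96, which is an assumption about how WINKS computes it.

```python
from math import exp, log, sqrt

# Cell pattern used for these measures:
#   TP  FP
#   FN  TN
tp, fp = 12, 7
fn, tn = 2, 8

sensitivity = tp / (tp + fn)   # true positives among all actual positives
specificity = tn / (tn + fp)   # true negatives among all actual negatives

# Relative risk: proportion in the first column for row 1 divided by that for row 2.
relative_risk = (tp / (tp + fp)) / (fn / (fn + tn))

# Odds ratio (cross-product ratio) and an approximate 95% confidence interval
# computed on the log scale (Woolf's method; assumed, not confirmed for WINKS).
odds_ratio = (tp * tn) / (fp * fn)
se_log_or = sqrt(1 / tp + 1 / fp + 1 / fn + 1 / tn)
ci_low = exp(log(odds_ratio) - 1.96 * se_log_or)
ci_high = exp(log(odds_ratio) + 1.96 * se_log_or)

print(round(sensitivity, 3), round(specificity, 3))
print(round(relative_risk, 3), round(odds_ratio, 3))
print(round(ci_low, 3), round(ci_high, 3))
```

Sensitivity and specificity are only meaningful when the rows and columns really represent a test result and a true condition; for the leaf data they are computed but have no natural interpretation.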

Example 3: Entering Data from Count Data Set (Analyze/Crosstabs, Frequencies, Chi-Square/ Crosstabulations/ Chi-Square – from count data)

The following data are from a classic study from 1909 reported by Karl Pearson that observed the association between drinking and criminal behavior.

Step 1: Open CROSSTAB_COUNTS.SAV and select Analyze, Crosstabs, Frequencies, Chi-Square, Crosstab/Chi-Square (from count data).

Step 2: Select CRIME as the row variable, DRINKER as the column and COUNT as count. Click Ok.

Step 3: From the Options menu select Frequency and Standardized Residual. Click Ok. The following (partial) output is displayed (similar to Example 1.)

CRIME(C)(rows) by DRINKER(N) (columns)

FREQUENCY|  YES|   NO|   TOTAL
------------------------------
ARSON    |   50|   43|     93
------------------------------
RAPE     |   88|   62|    150
------------------------------
VIOLENCE |  155|  110|    265
------------------------------
STEALING |  379|  300|    679
------------------------------
COINING  |   18|   14|     32
------------------------------
FRAUD    |   63|  144|    207
------------------------------
TOTAL       753   673    1426
           52.8  47.2   100.0

Typical hypotheses tested include:
Test of independence: Ho: There is no association between the two variables.
or Test of homogeneity:  Ho: Distribution of each category is same across population.

Statistic                       DF      Value         p-value
-----------------------------------------------------------------
Chi-Square                       5     49.731         <0.001
Likelihood Ratio Chi-Square      5     50.517         <0.001
Phi Coefficient                          .187
Cramer's V                               .187
Contingency Coefficient                  .184

Since p<=0.05, the null hypothesis (of independence or homogeneity)
is rejected and multiple comparisons are performed.

continues...

Click the Graph option at the top of the screen to display the graph grouped by Drinker within Crime.
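To see how count data turn into the statistics above, the following Python sketch rebuilds the CRIME by DRINKER table from (row, column, count) records and computes the Pearson and likelihood ratio chi-square values. The record layout and variable names are illustrative. The standardized residual is taken here as (observed - expected) / sqrt(expected), which is an assumption about the WINKS definition.

```python
from math import log, sqrt

# Count records matching the CRIME by DRINKER table: (CRIME, DRINKER, COUNT).
records = [
    ("ARSON", "YES", 50), ("ARSON", "NO", 43),
    ("RAPE", "YES", 88), ("RAPE", "NO", 62),
    ("VIOLENCE", "YES", 155), ("VIOLENCE", "NO", 110),
    ("STEALING", "YES", 379), ("STEALING", "NO", 300),
    ("COINING", "YES", 18), ("COINING", "NO", 14),
    ("FRAUD", "YES", 63), ("FRAUD", "NO", 144),
]

# Accumulate counts into a two-way table keyed by (row, column).
counts = {}
for crime, drinker, count in records:
    counts[(crime, drinker)] = counts.get((crime, drinker), 0) + count

rows = list(dict.fromkeys(crime for crime, _, _ in records))
cols = list(dict.fromkeys(drinker for _, drinker, _ in records))
row_tot = {r: sum(counts.get((r, c), 0) for c in cols) for r in rows}
col_tot = {c: sum(counts.get((r, c), 0) for r in rows) for c in cols}
n = sum(row_tot.values())

chi2 = 0.0
lr_chi2 = 0.0
std_resid = {}
for r in rows:
    for c in cols:
        o = counts.get((r, c), 0)
        e = row_tot[r] * col_tot[c] / n
        chi2 += (o - e) ** 2 / e
        if o > 0:
            lr_chi2 += 2 * o * log(o / e)
        # Assumed definition: (observed - expected) / sqrt(expected).
        std_resid[(r, c)] = (o - e) / sqrt(e)

df = (len(rows) - 1) * (len(cols) - 1)
largest = max(std_resid, key=lambda cell: abs(std_resid[cell]))
print(round(chi2, 3), round(lr_chi2, 3), df)
print(largest, round(std_resid[largest], 2))
```

The cell with the largest absolute residual shows which crime category departs most from what independence would predict, which is the kind of information the Standardized Residual option is meant to surface.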

End of tutorial
