SCATTERPLOT
SCATTERPLOT--Graphing
the relationship between two variables (both of which should be
measured at the interval level or at least arrayed along a continuum).
1.
When you select the SCATTERPLOT option, a window appears in which
you indicate your independent variable (X-axis) and your dependent
variable (Y-axis).
A.
You can also indicate if you want a subset of the cases included
in the scatterplot.
B.
If you have a large sample or a small number of attributes on
either variable, the scatterplot may be difficult to interpret. Cases
with the same scores are plotted on top of each other, and on a
two-dimensional screen or printout you cannot tell how many cases
share a single point.
If this occurs, you might get a clearer scatterplot by
restricting the analysis to a subset of the original data set.
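The stacking problem described in 1.B can be checked outside MicroCase by counting how many cases fall on each point. The following is a minimal sketch in Python, not a MicroCase feature; the function name and the data are hypothetical and chosen only for illustration.

```python
from collections import Counter


def overplotted_points(xs, ys):
    """Return {(x, y): count} for every point shared by more than one case."""
    counts = Counter(zip(xs, ys))
    return {point: n for point, n in counts.items() if n > 1}


# Hypothetical data: two variables with only a few distinct values each.
x = [1, 1, 2, 2, 2, 3, 3, 1]
y = [5, 5, 6, 6, 7, 8, 8, 5]

stacked = overplotted_points(x, y)
distinct_positions = len(set(zip(x, y)))

print("cases:", len(x), "distinct plotted positions:", distinct_positions)
print("points hiding more than one case:", stacked)
```

If many cases share the same points, the plot shows fewer dots than there are cases, which is the situation in which a subset may give a clearer picture.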
2.
When you have indicated your independent and dependent variables
and any subset variables, click on the OK button in the upper right
corner of the screen.
A.
The outcome is a graph of the relationship between the two
variables.
B.
Remember that scatterplots are asymmetrical; if you reverse the
independent and dependent variables, you will get a different graph.
C.
The correlation coefficient (Pearson r), the number of
cases in the analysis, and the number of cases with missing data are
printed below the scatterplot. The probability value refers to the
statistical significance of the correlation coefficient.
D.
Click on CASE on the left side of the screen to find the point
that represents a particular case (a particular state, for example).
E.
Click on REGRESSION LINE on the left side of the screen.
Doing this will add the regression line to the scatterplot and
the regression equation (the slope is unstandardized) to
the information below the scatterplot.
You can also look at the residuals (deviations of actual
points from the regression line) by clicking on RESIDUAL on the left
side of the screen.
To remove the regression line, click a second time on REG.LINE.
F.
Outliers (unusual or extreme scores) exert a powerful
distorting effect on the slope of a regression line and on the
correlation coefficient.
Note that there is an option to identify and then eliminate outliers.
Click on OUTLIER on the left side of the screen.
A data point will be highlighted, its values on the independent and
dependent variables will be identified, and the correlation coefficient
that would result from eliminating it will be calculated.
You then choose whether to leave it in (click a second time on OUTLIER)
or take it out (click on REMOVE).
When using Ecological Data you can also click on individual points in
the scatterplot and evaluate their impact on the correlation
coefficient; once you have highlighted a point, you can take it out by
clicking on REMOVE.
Be very careful with this option; be sure to have a rationale for
removing a data point--more than just a bigger correlation coefficient.
Once you've removed a data point, you cannot add it back in.
You have to redo the analysis (click on the counterclockwise
arrow in the tool bar; then click on OK in the upper right corner of the
window).
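The quantities described in step 2 (Pearson r, the unstandardized regression line, the residuals, and the effect of removing one case) can be reproduced by hand to see what the program is reporting. The following is a minimal sketch in Python using the standard least-squares formulas. It is not MicroCase's own code; the function names and the data are hypothetical, and the probability value is not computed here.

```python
import math


def _sums(xs, ys):
    """Return (mean_x, mean_y, Sxx, Syy, Sxy) for paired lists of scores."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return mx, my, sxx, syy, sxy


def pearson_r(xs, ys):
    """Correlation coefficient (Pearson r)."""
    _, _, sxx, syy, sxy = _sums(xs, ys)
    return sxy / math.sqrt(sxx * syy)


def regression_line(xs, ys):
    """Return (intercept, unstandardized slope) for y = intercept + slope * x."""
    mx, my, sxx, _, sxy = _sums(xs, ys)
    slope = sxy / sxx
    return my - slope * mx, slope


def residuals(xs, ys):
    """Deviations of the actual points from the regression line."""
    intercept, slope = regression_line(xs, ys)
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


def r_without_each_case(xs, ys):
    """Pearson r recomputed with each case left out in turn."""
    return [
        pearson_r(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        for i in range(len(xs))
    ]


# Hypothetical data: five cases on a straight line plus one extreme case.
x = [1, 2, 3, 4, 5, 10]
y = [2, 4, 6, 8, 10, 1]

r_all = pearson_r(x, y)
intercept, slope = regression_line(x, y)
_, slope_reversed = regression_line(y, x)  # independent and dependent swapped
resid = residuals(x, y)
r_dropped = r_without_each_case(x, y)

print("r with all cases:", round(r_all, 3))
print("slope of y on x:", round(slope, 3), "slope of x on y:", round(slope_reversed, 3))
print("r with each case removed:", [round(r, 3) for r in r_dropped])
```

In this made-up example, the single extreme case turns a perfect positive relationship among the other five cases into a weak negative one, which is why a removal needs a rationale beyond the change in r. Swapping the independent and dependent variables changes the regression line but not r itself.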
3.
The icons in the top tool bar allow you to print or cut and paste
a screen, review and cut and paste the FILE NOTES, review and cut and
paste the variable definitions, return to the basic SCATTERPLOT screen,
or return to the MicroCase MENU screens.
for questions or comments contact me at mduncombe@coloradocollege.edu
last updated on August 19, 2003