MICROCASE

BASIC STATISTICS

UNIVARIATE STATISTICS

Statistics to describe one variable

1.  When you select the UNIVARIATE STATISTICS option, a window will appear for entering the primary variable you are interested in--your dependent variable.  Type in the variable # or type in the variable name (be careful about spaces and periods).

2.  If you are not sure which variable you want, there is a search option or you can scroll through the entire variable list.  If you highlight a variable in the variable list, the definition of the variable (often the question asked of survey respondents) and the possible attributes will appear in a box in the lower right of the window.

3.  Sometimes you will want to look at a variable for only part of the data set, for example, only for males or only for females, or only for people with college educations, or only for people living with a romantic partner.  To do this, type a variable # or name in the SUBSET box.  For example, if you want to examine a variable only for men, you would type SEX in the subset box; the computer would then prompt you whether you want to look only at men or only at women, and you would select men.  You can select more than one subset variable; for example, use sex and marital status to select married women.

4.  Be careful to eliminate "don't knows" and "no answers" by using the SUBSET function (or by defining them as missing data on the FILE SETTINGS screen as you open the data file) before you write down the statistics or copy the pie charts and bar graphs.

5.  When you've indicated the variable you're interested in and any subset variables, click on OK in the top right corner of the window.

6.  Three kinds of statistics are available from the UNIVARIATE STATISTICS option:

Pie Charts: Pictorial depictions of the percentage distribution of the variable.  If 100% of the sample gave one response or shared the same attribute, the whole pie would be one color; the size of the different colored pie wedges represents the proportion of the sample who share that attribute.

Bar Graphs: When there are more than 10 attributes in a variable, MicroCase automatically gives you a bar graph rather than a pie chart.  Here the heights of the bars represent the number (frequency) of the sample units which share each attribute.
--Note that by using the left and right arrows you can move from one bar to another.
--Note that there is a menu on the left margin where you can move among the UNIVARIATE STATISTICS options.

--Note that there is also an option for a Cumulative Bar Graph, which tells you the percentage of the sample with a particular attribute or a lower attribute.  Since Cumulative Bar Graphs are always interpreted with an "or lower" statement, they are appropriate only for variables whose attributes can be arrayed along a continuum, e.g., education or family size, but not religion or race.
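The frequency, percentage, and cumulative percentage figures behind these graphs can be reproduced by hand.  The following is a minimal Python sketch, not MicroCase output; the education values are hypothetical data invented for illustration.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical responses: years of education for ten respondents.
education = [12, 16, 12, 14, 10, 12, 16, 14, 12, 18]

counts = Counter(education)
values = sorted(counts)                    # attributes in ascending order
freqs = [counts[v] for v in values]        # frequency of each attribute
n = len(education)

percents = [100 * f / n for f in freqs]    # percentage distribution
cum_percents = list(accumulate(percents))  # percentage at "this value or lower"

for v, f, p, c in zip(values, freqs, percents, cum_percents):
    print(f"{v:>2} years: n={f}  {p:5.1f}%  cumulative {c:5.1f}%")
```

The cumulative column only makes sense because years of education can be sorted from low to high; sorting religion or race codes the same way would produce numbers with no meaningful "or lower" reading.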

Statistics: By clicking on the Statistics option on the left margin of the screen, you can get a frequency and percentage distribution, as well as the summary statistics.  There are three averages which you can see from this screen:

MODE: most frequently occurring score (score with the highest frequency or percentage).  This average is useful with categorical data (e.g., 1=men, 2=women) where there is no logical order to the categories; that is, although men are scored 1 and women are scored 2, it is nonsense to think men are less than women.

MEDIAN: score which divides the distribution in half; half of the scores are above this score and half are below this score.  This average is useful when the responses are arrayed along a continuum from high to low or low to high (e.g., 1=strongly agree, 2=agree, 3=can't choose, 4=disagree, 5=strongly disagree).
--Note, when there is more than one score in the median interval, MicroCase interpolates the exact position of the median between the upper and lower limits of the median interval.
--Note, sometimes GSS lists categories that you think could be ordered if they were rearranged (e.g., 1=agree, 2=disagree, 3=can't choose); you can use the COLLAPSE option on the DATA MANAGEMENT menu to recode these responses (e.g., 1=agree, 2=can't choose, 3=disagree).

MEAN: arithmetic average of the scores (add up all the scores and divide by the number of scores).  This average is useful when the response categories are ordered numerical categories, e.g., age in years, education in years, income in $s, etc.  
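To see how the three averages can differ for the same variable, here is a minimal Python sketch using the standard `statistics` module.  The scores are hypothetical, and `statistics.median` simply returns the middle score; it does not reproduce the interpolation within the median interval that MicroCase performs.

```python
import statistics

# Hypothetical data: years of education for seven respondents.
scores = [12, 12, 12, 14, 16, 16, 20]

mode = statistics.mode(scores)      # most frequently occurring score
median = statistics.median(scores)  # middle score of the sorted list
mean = statistics.mean(scores)      # sum of the scores / number of scores

print(f"mode={mode}  median={median}  mean={mean:.2f}")
```

In this made-up distribution the single high score of 20 pulls the mean above the median, while the mode stays at the most common value.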

VARIANCE: average squared variation around the mean.  The variance is derived by subtracting the mean of the distribution from each individual score (deviation scores), squaring each deviation score, summing the squared deviation scores (sum of squares), and dividing the sum by the number of scores in the distribution.  
a.   Statisticians square the deviation scores because the mean is defined as that point in a distribution, the sum of the deviations from which equals zero.
Person    Score    Deviation Score    (Deviation Score) squared
Alex        8           -1                      1
Jessie      9            0                      0
Les        10           +1                      1
Sam         9            0                      0
Sums       36            0                      2

The mean for this distribution of scores is 9 ([8+9+10+9]/4).  Alex's deviation score is -1, Jessie's deviation score is 0, Les' deviation score is +1, and Sam's deviation score is 0.  The sum of the deviation scores is zero.  This will be true for every distribution, assuming the mean and the deviation scores are calculated correctly.  The intuitive mean deviation would be the sum of the deviation scores divided by the number of deviation scores, but this number will always equal zero.  Therefore, statisticians square each of the deviation scores so that the sum of the squared deviation scores will not be zero unless every score is the same as the mean score.  The average of the squared deviations (2/4=.5 in the example above) is called the variance.
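The steps in this calculation can be sketched in Python with the four scores from the example.  This is an illustration of the divide-by-N definition given here; many statistics packages report a sample variance that divides by N-1 instead, which would give a different number.

```python
import statistics

scores = {"Alex": 8, "Jessie": 9, "Les": 10, "Sam": 9}
values = list(scores.values())

mean = sum(values) / len(values)                       # 36 / 4 = 9.0
deviations = {p: s - mean for p, s in scores.items()}  # score minus mean
sum_dev = sum(deviations.values())                     # sums to zero
sum_sq = sum(d ** 2 for d in deviations.values())      # sum of squares
variance = sum_sq / len(values)                        # divide by N

# statistics.pvariance uses the same divide-by-N definition.
check = statistics.pvariance(values)

print(deviations, sum_dev, sum_sq, variance, check)
```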

STANDARD DEVIATION: the square root of the variance.  The standard deviation converts the variance back to raw score units (instead of squared raw score units) and is interpreted as the average variation around the mean.  In the example above the square root of 0.5 is 0.71.  A researcher would describe the above distribution as having a mean of 9 and a standard deviation of .7.  In other words, the central tendency of the distribution is 9 and, on average, scores deviate from that central tendency by 7/10ths of a point.
More precise interpretations of the standard deviation can be derived from the normal curve.  Statisticians have determined that if a variable is normally distributed (i.e., when graphed, the shape of the distribution is the same as that of the normal curve), about 68% of the scores in a distribution will fall between plus and minus one standard deviation from the mean, about 95% of the scores will fall between plus and minus two standard deviations from the mean, and about 99.7% of the scores will fall between plus and minus three standard deviations from the mean.  
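Both the standard deviation in the example and the normal-curve percentages can be checked with a short Python sketch.  It uses `statistics.NormalDist` from the standard library (Python 3.8 or later) and assumes a perfectly normal distribution, which real survey variables only approximate.

```python
import math
from statistics import NormalDist

variance = 0.5               # variance of the four-person example
sd = math.sqrt(variance)     # back to raw score units
print(f"standard deviation = {sd:.2f}")

# Share of a normal distribution within k standard deviations of the mean.
std_normal = NormalDist(mu=0, sigma=1)
shares = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

for k, share in shares.items():
    print(f"within +/-{k} SD: {100 * share:.1f}%")
```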

CONFIDENCE INTERVALS OF THE MEAN: the range of scores that a researcher is confident--at different degrees of certainty--contains the actual population mean.  The 99% interval is always wider than the 95% interval, because greater certainty requires a wider range.
95% confidence interval +/- mean: 24-28
      The researcher is 95% confident that the interval 24-28
      captures the actual population mean. 
99% confidence interval +/- mean: 22-30
      The researcher is 99% confident that the interval 22-30
      captures the actual population mean.
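A confidence interval of the mean is commonly computed as the mean plus or minus a critical value times the standard error (the standard deviation divided by the square root of the sample size).  The sketch below follows that recipe with normal-curve critical values; the ages are hypothetical, the function name is made up for illustration, and MicroCase's exact method (for example, whether it uses t rather than z values) is not documented here, so its output may differ slightly.

```python
import math
import statistics
from statistics import NormalDist

def mean_confidence_interval(data, level):
    """Return (low, high) for the mean at the given level, e.g. 0.95."""
    n = len(data)
    mean = statistics.mean(data)
    # Standard error, using the sample (N-1) standard deviation.
    se = statistics.stdev(data) / math.sqrt(n)
    # Two-sided critical value from the normal curve (about 1.96 for 95%).
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return mean - z * se, mean + z * se

# Hypothetical sample: ages of ten respondents (mean = 26).
ages = [22, 25, 31, 24, 28, 27, 26, 29, 23, 25]

ci95 = mean_confidence_interval(ages, 0.95)
ci99 = mean_confidence_interval(ages, 0.99)
print(f"95% CI: {ci95[0]:.1f} to {ci95[1]:.1f}")
print(f"99% CI: {ci99[0]:.1f} to {ci99[1]:.1f}")
```

For any sample, raising the confidence level raises the critical value, so the 99% interval comes out wider than the 95% interval around the same mean.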

STANDARD SCORES (or z-scores): deviations above or below the mean in units of standard deviation.  A z-score is the deviation score divided by the standard deviation.  For example, Alex's z-score in the above distribution would be -1.41 (-1 divided by 0.71), which is interpreted as Alex scored 1.41 standard deviation units below the mean. 
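As a check on the arithmetic, the z-scores for the four-person example can be computed in Python.  This sketch uses `statistics.pstdev`, which divides by N as the variance definition in this section does.

```python
import statistics

scores = {"Alex": 8, "Jessie": 9, "Les": 10, "Sam": 9}
values = list(scores.values())

mean = statistics.mean(values)   # 9
sd = statistics.pstdev(values)   # square root of 0.5, about 0.71

# z-score = (score - mean) / standard deviation
z_scores = {person: (s - mean) / sd for person, s in scores.items()}

for person, z in z_scores.items():
    print(f"{person}: z = {z:+.2f}")
```

With the unrounded standard deviation, Alex's z-score is about -1.41; dividing by the standard deviation rounded to 0.7 would give -1.43 instead.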

7.  You can print the pie chart, the bar graph, or the statistics by clicking on the printer icon in the top tool bar.

8.  To copy a screen into a word processing document, click on the double-sheet-of-paper icon in the top tool bar; this copies the screen to the clipboard.  Minimize the MicroCase screen by clicking on the dash line in the very top right toolbar.  Open (or maximize) your word processor (either Word or WordPerfect) and paste the screen image into your word processing document.  The paste icon is a clipboard with a sheet of paper on top.  To return to MicroCase when you are finished with the word processor, click on the MicroCase icon in the taskbar along the bottom of your screen.

9.  Note that the MicroCase tool bar at the top also allows you to view the FILE NOTES screen for the data set (by clicking on the notebook icon) and the variable definitions (by clicking on the V icon).  If you open either of these documentation screens, be sure to click on OK to close them.

10.  To look at the UNIVARIATE STATISTICS for another variable, click on the counterclockwise arrow in the top tool bar.

11.  To move to another statistical option (e.g., CROSS TABULATION or COLLAPSE) or to exit from MicroCase, click on the MENU button in the top tool bar.  

 

for questions or comments contact me at mduncombe@coloradocollege.edu
last updated on November 25, 2002