Calculation of the Base

When calculating the base, IBM® SPSS® Data Collection Survey Reporter includes every case for which the case data stored in the variable is not Null. Null is a special marker meaning that the value is not known; it generally indicates that the question on which the variable is based was not asked. A Null value is different from an empty or zero value.

When a respondent is asked a categorical or open-ended question but for some reason does not answer, the case data generally stores an empty categorical value ({}) or an empty string ("") respectively (although some questions have one or more special categories to indicate that the respondent did not provide an answer). Consequently, for categorical and text data, it is possible to distinguish between a question that was asked but not answered and one that was not asked at all. However, in numeric data it is not possible to distinguish questions that were asked but not answered from those that were not asked at all, because the IBM® SPSS® Data Collection Data Model currently stores a Null value for both.

In a simple survey where a case corresponds to a respondent, the base generally includes every respondent who was asked the question on which the variable is based, regardless of whether the respondent actually answered it.
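The rule described above can be sketched in a few lines of Python. This is an illustration only, not Survey Reporter code: Null is modeled as None, an empty categorical value as an empty set, and the sample values and the helper name base are assumptions made for the example.

```python
def base(values):
    """Count the cases whose stored value is not Null (modeled here as None)."""
    return sum(1 for v in values if v is not None)

# Hypothetical case data. None stands for Null (question not asked).
# An empty set or empty string means the question was asked but not answered.
categorical = [{"Yes"}, set(), None, {"No"}]
text = ["Great exhibit", "", None]
# Numeric data stores Null both for "not asked" and for "asked but not answered".
numeric = [4, None, None]

categorical_base = base(categorical)  # the empty set still counts toward the base
text_base = base(text)                # the empty string still counts toward the base
numeric_base = base(numeric)          # only the one non-Null value counts

print(categorical_base, text_base, numeric_base)
```

In this sketch the unanswered categorical and text cases stay in the base, whereas an unanswered numeric case cannot be told apart from one that was never asked and drops out.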

When you use a subset of the categories in a variable on the side or top of the table, the base is the same as when you select all of the categories. To illustrate this, we will use the signs variable in the Museum XML survey data file to create an unfiltered one-dimensional table:

Notice that the base is 298, which is the sum of the counts in the three categories. This is a single response variable and all of the respondents who were asked the question answered it. Note that if any of the respondents had not answered the question, they would be included in the base too. Now let's include only the first two categories:

Notice that the base is still 298, but it no longer represents the sum of the counts in the categories on the table. This is because the base represents the number of respondents who were asked the question and is not based on the counts in the categories that have been selected for inclusion on the table. If you want the base to reflect only the respondents who selected the categories that are shown on the table, you would need to use a filter. For example, here is the table after adding a filter to exclude respondents who did not choose either of the two categories that are shown:

Notice that the base is now 271, which is the sum of the counts in the two categories that are shown on the table.
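The difference between selecting a subset of categories and applying a filter can be sketched as follows. This is a minimal Python illustration with invented data, far smaller than the Museum sample, so it does not reproduce the 298 and 271 figures. The function name table_counts_and_base and the list of answers are assumptions made for the example.

```python
# Hypothetical single-response answers. None stands for Null (question not asked).
answers = ["Yes", "Yes", "No", "Dont_Know", "No", None, "Yes"]
shown = {"Yes", "No"}  # the subset of categories displayed on the table


def table_counts_and_base(cases, shown_categories, case_filter=None):
    """Return the counts for the shown categories and the base.

    The base counts every non-Null case that passes the filter.
    It does not depend on which categories are shown.
    """
    included = [c for c in cases if case_filter is None or case_filter(c)]
    base = sum(1 for c in included if c is not None)
    counts = {cat: sum(1 for c in included if c == cat)
              for cat in sorted(shown_categories)}
    return counts, base


# Subset of categories, no filter: the base still covers everyone who was asked.
unfiltered_counts, unfiltered_base = table_counts_and_base(answers, shown)

# The same subset with a filter that keeps only respondents who chose a shown category.
filtered_counts, filtered_base = table_counts_and_base(
    answers, shown, case_filter=lambda c: c in shown
)

print(unfiltered_counts, unfiltered_base)
print(filtered_counts, filtered_base)
```

Without the filter, the base exceeds the sum of the displayed counts because it still includes the respondent who chose the hidden category. With the filter, the two agree.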

Now suppose we want to add a mean element based on the visits numeric variable to the axis in the unfiltered table:

The visits variable is a numeric variable, which means that it stores a Null value for respondents who were not asked or did not answer the question on which it is based. In the Museum sample, the visits variable stores a Null value for some of the respondents who are included in this table. When Survey Reporter calculates the base used by the mean value calculation, it includes only respondents who are included in the table base and for whom the numeric variable does not store a Null value.

The following table lists some hypothetical responses and shows whether the case is included in the base for the axis and the base for the mean.

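| Case | Value in Signs Variable | Value in Visits Variable | Included in Axis Base in Unfiltered Table | Included in Base for Mean Element |
|------|-------------------------|--------------------------|-------------------------------------------|-----------------------------------|
| 1    | {Yes}                   | 4                        | Yes                                       | Yes                               |
| 2    | {No}                    | Null                     | Yes                                       | No                                |
| 3    | {Dont_Know}             | 5                        | Yes                                       | Yes                               |
| 4    | Null                    | 2                        | No                                        | No                                |
| 5    | {}                      | Null                     | Yes                                       | No                                |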
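The five hypothetical cases in the table above can be worked through in a short Python sketch. Null is modeled as None and an empty categorical value as an empty set. The dictionary layout is an assumption made for the example, and the mean is computed as a simple unweighted average, which is also an assumption rather than a statement of how Survey Reporter computes it.

```python
# The five hypothetical cases from the table. None stands for Null.
cases = [
    {"case": 1, "signs": {"Yes"}, "visits": 4},
    {"case": 2, "signs": {"No"}, "visits": None},
    {"case": 3, "signs": {"Dont_Know"}, "visits": 5},
    {"case": 4, "signs": None, "visits": 2},
    {"case": 5, "signs": set(), "visits": None},
]

# Axis base: every case whose signs value is not Null (an empty set still counts).
in_axis_base = [c for c in cases if c["signs"] is not None]

# Base for the mean: cases in the axis base whose visits value is also not Null.
in_mean_base = [c for c in in_axis_base if c["visits"] is not None]

axis_base = len(in_axis_base)
mean_base = len(in_mean_base)
# Unweighted mean of visits over the mean base (assumed calculation).
mean_visits = sum(c["visits"] for c in in_mean_base) / mean_base

print([c["case"] for c in in_axis_base])  # cases 1, 2, 3 and 5
print([c["case"] for c in in_mean_base])  # cases 1 and 3
print(axis_base, mean_base, mean_visits)
```

Case 4 has a visits value but is excluded from both bases because its signs value is Null, which matches the last two columns of the table.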

Notes:

• When working with the hierarchical view of the data, empty levels are considered to be Null and are not counted in the base. See the seventh example table in Examples Showing Results Generated at Different Levels for more information.