Basic population estimates with British Social Attitudes Survey data using SPSS

Author

UK Data Service

Published

June 2024

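This exercise is part of the ‘Introduction to the British Social Attitudes Survey (BSA)’ online module. In the exercise, we examine data from the 2020 British Social Attitudes survey to find out what people think about taxation and public spending, how they say they voted in the EU referendum, and how much they think someone reaching State Pension age receives.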

Answers to the questions asked throughout the exercise can be found at the end of the page.

Getting started

Data can be downloaded from the UK Data Service website following registration. Download the compressed folder, unzip and save it somewhere accessible on your computer.

The examples below assume that the dataset has been saved in a new folder named UKDS on your Desktop (Windows computers). The path would typically be C:\Users\YOUR_USER_NAME\Desktop\UKDS. Feel free to change it to the location that best suits your needs.

Set this folder as your working directory in SPSS by replacing the file path in the code below with the correct path to the folder on your computer.

* Setting up the working directory.
* Change the path in the command below to match yours.

cd "C:\Users\YOUR_USER_NAME\Desktop\UKDS".
show DIRECTORY.

A screenshot of the SPSS Output window showing the current default directory

Output of the show DIRECTORY command

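If your working directory is set to that folder, the following code should open the BSA dataset. The file path is relative to the working directory, so adjust it if your unzipped folder is stored or named differently.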

GET FILE='BSA\UKDA-9005-spss\spss\spss25\bsa2020_archive.sav'.

A screenshot of the SPSS Data Editor window showing the BSA dataset in Variable View

BSA dataset in SPSS Variable View

1. Explore the dataset

Start by getting an overall feel for the dataset. Either inspect variables and cases in the data editor or use the code below to produce a summary of all the variables in the dataset.

CODEBOOK all. 

A screenshot of the SPSS Output window showing the output of the CODEBOOK command for the SERIAL variable

SPSS codebook output for the first variables
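For a quick check of the number of cases without scrolling through the codebook, the SHOW command can report it directly. This is a minimal sketch that assumes the BSA file is the active dataset; the number of variables can be read from the last row number in Variable View.

```spss
* Report the unweighted number of cases in the active dataset.
SHOW N.
```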

Questions

1. What is the overall sample size?

2. How many variables are in the dataset?

Now, focus on the three variables we will use.

CODEBOOK TAXSPEND EUVOTWHO PenExp2.  

A screenshot of the SPSS Output window showing the output of the CODEBOOK command for the TAXSPEND variable

SPSS codebook output for TAXSPEND
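As a more compact alternative to CODEBOOK, DISPLAY DICTIONARY lists the variable labels, value labels and declared missing values of the selected variables without the frequency counts. A minimal sketch, using the same three variables:

```spss
* Show dictionary information for the three variables only.
DISPLAY DICTIONARY
  /VARIABLES=TAXSPEND EUVOTWHO PenExp2.
```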

Question 3

What do the variables measure and how?

2. Missing values

Review the frequency tables in the codebook output, paying particular attention to the ‘not applicable’ and ‘don’t know’ categories.

Question 4

Why are there so many ‘not applicable’ responses for EUVOTWHO? You can check the documentation if needed. What does this mean when it comes to interpreting the percentages?

When analysing survey data, it is sometimes convenient to recode item nonresponses such as ‘Don’t know’ and ‘Prefer not to say’ as system missing so that they do not appear in the results. An example of the syntax required to do this for EUVOTWHO and TAXSPEND is provided in the appendix.

Unlike in some other surveys, ‘Don’t know’ and ‘Does not apply’ responses were not removed when the weights were computed for the BSA. As a result, weighted analyses (i.e. those that use the data to make inferences about the British population) need to retain these observations, otherwise the estimates might be incorrect.

3. Compare unweighted and weighted frequencies

Let’s examine the unweighted responses first.

WEIGHT Off.
*This line is probably unnecessary as we have not applied a weight yet; it has been included just to make sure we are looking at unweighted results.
FREQUENCIES VARIABLES=TAXSPEND EUVOTWHO
  /BARCHART PERCENT
  /ORDER=ANALYSIS.
*Here, we use the FREQUENCIES command for the categorical variables and the EXAMINE command for the continuous variable.
EXAMINE VARIABLES=PenExp2
  /PLOT HISTOGRAM
  /STATISTICS DESCRIPTIVES
  /MISSING LISTWISE
  /NOTOTAL.

A screenshot of the SPSS Output window showing the unweighted output of the FREQUENCIES command for the TAXSPEND and EUVOTWHO variables

SPSS output for the unweighted frequency distribution of TAXSPEND and EUVOTWHO

What is the unweighted percentage who say they voted remain in the EU referendum? About 58 percent of those who voted in the referendum say they voted to remain. This figure seems a bit high given that 48 percent voted Remain in the 2016 referendum, though people do not always report their vote accurately.

Let’s add the weight.

*The weight is added by the command below. It will remain on for all subsequent analyses. 
WEIGHT BY BSA20_wt_new.
FREQUENCIES VARIABLES=TAXSPEND EUVOTWHO
   /ORDER=ANALYSIS.
EXAMINE VARIABLES=PenExp2
  /PLOT HISTOGRAM
  /STATISTICS DESCRIPTIVES
  /CINTERVAL 95
  /MISSING LISTWISE
  /NOTOTAL.

*To stop weighting the data you can use the following command. 
WEIGHT off. 

A screenshot of the SPSS Output window showing the weighted output of the FREQUENCIES command for the TAXSPEND and EUVOTWHO variables

SPSS output for the weighted frequency distribution of TAXSPEND and EUVOTWHO

Now, what proportion say they voted remain in the EU referendum? It is about 54 percent, lower than the unweighted proportion and closer to the actual referendum results.
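Because a weight stays on until it is switched off, it is easy to lose track of whether results are weighted. The sketch below shows two ways of keeping control, assuming the same weight variable as above: SHOW WEIGHT reports the current weighting status, and placing TEMPORARY before WEIGHT should limit the weighting to the next procedure only.

```spss
* Check which variable, if any, is currently used as a weight.
SHOW WEIGHT.

* Apply the weight to the next procedure only.
TEMPORARY.
WEIGHT BY BSA20_wt_new.
FREQUENCIES VARIABLES=EUVOTWHO.

* The weighting status is now back to what it was before TEMPORARY.
SHOW WEIGHT.
```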

4. Confidence intervals

Add confidence intervals to the bar charts and to the mean to indicate the uncertainty due to sampling error.

WEIGHT BY BSA20_wt_new.
GRAPH
  /BAR(SIMPLE)=PCT BY TAXSPEND
  /INTERVAL CI(95.0).

GRAPH
  /BAR(SIMPLE)=PCT BY EUVOTWHO
  /INTERVAL CI(95.0).

EXAMINE VARIABLES=PenExp2
  /PLOT NONE
  /STATISTICS DESCRIPTIVES
  /CINTERVAL 95
  /MISSING LISTWISE
  /NOTOTAL.
  

A screenshot of the SPSS Output window showing bar charts of the frequency distributions of the TAXSPEND and EUVOTWHO variables

SPSS output for GRAPH BAR of TAXSPEND and EUVOTWHO

Question 5

What proportion think government should increase taxes and spend more on health, education and social benefits?

Question 6

How much do people think they will get at state pension age?

Additional question

Select two variables that interest you and examine their distribution.
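As a starting point, the commands used in this exercise can be adapted to other variables. In the sketch below, VAR1 and VAR2 are hypothetical placeholders, not variables in the BSA dataset; replace them with the names of the variables you chose, and use EXAMINE instead of FREQUENCIES for a continuous variable.

```spss
* VAR1 and VAR2 are placeholders to be replaced with real variable names.
CODEBOOK VAR1 VAR2.
WEIGHT BY BSA20_wt_new.
FREQUENCIES VARIABLES=VAR1 VAR2
  /BARCHART PERCENT.
WEIGHT OFF.
```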

Answers

  1. There are 3964 cases in the dataset.

  2. The total number of variables is 213.

  3. TAXSPEND records responses to the question of whether government should reduce, increase or maintain levels of taxation and spending. There are three possible responses to the question. EUVOTWHO records responses to the question ‘Did you vote to ‘remain a member of the EU’ or to ‘leave the EU’?’ The responses are Remain or Leave. PenExp2 contains responses to the question ‘How much do you think someone who reaches State Pension age today would receive in pounds per week?’ Responses are numeric.

  4. There are two reasons for the many ‘not applicable’ responses.

  • Routing: the question is only asked of those who said yes to a previous question (EURefV2).
  • Split sample: the BSA uses a split sample and the question is only asked in Versions 5 and 6.
  5. About 50% say the government should increase taxes and spend more.

  6. The amount people think they will get at state pension age varies between £0 and £7000, with an average in the region between £170 and £184 per week.

Appendix: recoding nonresponses as system missing

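The code below provides an example of how to set nonresponses, including ‘don’t know’ and ‘prefer not to say’, as missing.

The SPSS syntax below consists of the command, the variables and the relevant missing values in parentheses. Note that several variables can be listed in one MISSING VALUES command if they share the same missing value codes. Strictly speaking, MISSING VALUES declares these codes as user-missing rather than converting them to system-missing, but SPSS excludes both kinds of missing value from analyses by default.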

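* Copy each variable so that the original stays unchanged.
COMPUTE EUVOTWHO_m=EUVOTWHO.
COMPUTE TAXSPEND_m=TAXSPEND.
COMPUTE PenExp2_m=PenExp2.

* Declare the nonresponse codes of each copy as missing.
MISSING VALUES PenExp2_m (-1, 9998, 9999).
MISSING VALUES TAXSPEND_m (-1, 8, 9).
MISSING VALUES EUVOTWHO_m (-1, 3 THRU 9).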
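
To check that the declarations worked, the new variables can be tabulated. In this sketch, which assumes the commands above have been run, the declared codes should appear under ‘Missing’ in the frequency tables and be left out of the Valid Percent column. Because COMPUTE copies values but not value labels, the tables will show numeric codes rather than labels.

```spss
* Tabulate the copies to confirm which codes are now treated as missing.
FREQUENCIES VARIABLES=EUVOTWHO_m TAXSPEND_m.
```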