Supporting Harm Reduction Programs

PiTS Toolkit: Analysis Phase

Objectives

During this phase, syringe services programs who have recently implemented a Point in Time Survey (PiTS) will: 

  • Identify data inconsistencies, including missing and duplicate data 
  • Learn how to create a data table using pivot tables and charts
  • Outline approaches to data use 

Contents

To help you implement your Point in Time Survey, this page contains tutorials to walk you through data cleaning and analysis using Excel. It also includes screen recordings demonstrating skills needed to clean and analyze survey data using Excel. And finally, there are resources for you to develop a data analysis plan and create a data table using Excel skills.

Step 1: Complete a Data Analysis Plan

A data analysis plan is an important component of your survey project. It will tie your survey objectives to your analysis, confirming that you have actually achieved what you set out to do. You may consider drafting your data analysis plan during the planning phase and revisiting it throughout your project.

Step 2: Debrief the Survey Process with the Survey Team

After your survey implementation period is over, it is important to meet with your survey team to get a sense of how things went in order to inform your analysis and plan for future iterations of the survey. You may be surprised by the different perspectives team members have and the lessons learned you will walk away from this meeting with!

Step 3: Clean Your Data

In this section of the toolkit, we have created data cleaning tutorials utilizing a randomly generated fake dataset based on our sample point in time survey. You can follow along in Excel by downloading the raw (referred to as “messy”) dataset, and then practicing the skills covered in the tutorial as you watch. An answer key for each tutorial is also included so that you can check your work.

The purpose of cleaning your data is to identify errors, unusual values, outliers (very high or low numbers) and missing data using Excel tools including spell check, filters, conditional formatting, and the find/replace function. We present more information about what data cleaning is in the below video.

Training presentation about data cleaning

Sample Dataset

To practice the skills you learn from the tutorials in this toolkit, we have generated a sample dataset with 100 (fake) records for you to practice monitoring, cleaning, and analyzing. This dataset is based on the example point in time survey. Practicing these skills in advance will help you when you are working with your own dataset. Below is the raw, or “messy”, dataset that we will use for the data monitoring tutorials. A cleaned dataset and answer key is provided at the end of this series of tutorials.

Missing Data

This tutorial will demonstrate how to use filters and pivot tables to identify missing data within your dataset.

Brief video demonstrating how to identify missing data.

Below is an answer key that contains all of the missing values in the raw (messy) dataset. Take a look and make sure you found all the errors!

De-duplicating Data

This tutorial will demonstrate how to use conditional formatting to identify duplicate values within your dataset and show how to address duplicates.

Brief tutorial about identifying duplicate data in your dataset.

Below is an answer key that contains all the duplicate values in the raw (messy) dataset. Take a look and make sure you found the same duplicates.

Cleaned Dataset

Now that you have learned to clean your data by following these tutorials, you can check the rest of the dataset for additional errors. Make sure to use spell check as well as check for non-matching answers to find the remaining problems.

Done? Below is our cleaned dataset so that you can cross reference yours with ours to make sure you arrived at the correct answers.

Step 4: Analyze Your Data

Managing “Check All That Apply” Data

Before you get started, make sure all your data is in a form that allows you to do the analysis you want to do. This step is not data cleaning but it is also not analysis either – it is a stepping stone to do your analysis. Categorical survey questions that allow the respondent to “check all that apply” (versus check one) are unique because the answers are not mutually exclusive. In this tutorial we will share a method you can use to ensure each respondent appears as an individual with unique characteristics in your analysis.

Brief tutorial showing how to manage check all that apply data

Descriptive Statistics and Stratification

Now you’re ready to analyze your data! Often referred to as descriptive statistics, in this step you will be summarizing the data that you collected. You can look at the frequency of every response, as well as the median, mean, and other statistics for numeric responses. We will also discuss stratifying your data to compare subgroups within your data (i.e. the breakdown of housing status by gender within your respondents). Watch this tutorial to learn how to calculate descriptive statistics and stratify your data using pivot tables.

Brief tutorial on stratifying data

Special Topic: Numerators and Denominators

When analyzing your data, you need to be mindful of your denominators. Denominators let you know how many people have a certain characteristic of interest. Numerators let you know how many people are potentially at risk of having that certain characteristic of interest. A proportion is the denominator divided by the numerator. This presentation is important because it will help you to identify when to use the total number of surveys you collected, and when to use a smaller subset of your respondents to calculate a statistic. We have also included a summary handout of key terms used in the presentation.

Presentation on numerators, denominators, and proportions

Special Topic: Analyzing Questions Related to Drugs Used

Many SSPs will choose to ask respondents about what drugs they used in conjunction with frequency of use and route of administration. Not only can these questions be tricky to program, they can also be tricky to analyze since they are often combined. This resource will help you set up your analysis for this type of question.