Data Quality
To protect data quality, responses to the Minnesota Student Survey (MSS) undergo a series of checks designed to identify invalid responses. These checks may lead to removing an entire case from the dataset or to excluding specific answers while retaining the rest of the student's responses. Surveys are discarded if they show significant inconsistencies, show a pattern of likely exaggeration, were completed outside of school hours (except for online students), or appear to be a test of the online system. Surveys are also removed if only the background section is answered. Overall, approximately 2-4% of cases are eliminated from the data file following these quality checks.
Detailed Criteria for the Removal of Cases from the Minnesota Student Survey (MSS)
- Remove cases in which the student did not answer any questions beyond the background section.
- Remove cases in which age is sharply incompatible with reported grade. Also remove any student who reported being 21 or older, unless the student is in a special population group (e.g., JCF).
- Remove cases that show a pattern of exaggeration around drug or tobacco use. Exaggerated tobacco use is the reported use of all five kinds of tobacco (cigarettes, cigars, smokeless tobacco, e-cigarettes, and hookah) on all thirty of the past thirty days. Exaggerated drug use is the reported use of eight or more kinds of drugs on the maximum number of occasions (20 or more) in the last 12 months.
- Remove cases that show patterns of excessive marking, also called straight-line marking (so named because students literally draw a straight line through responses on the paper version of the survey). These patterns occur when questions are organized into lists or boxes and students mark the extreme answer for every question in the list. Eleven criteria were used in 2022. Cases were removed if the pattern of excessive marking was found on two or more of the eleven items. Excessive marking criteria:
- All race/ethnicity groups were checked.
- All gender identities were checked.
- All racial/ethnic subcategories were marked among students who marked "Asian/Asian American," "Black, African or African American," "Hispanic or Latino/Latina" and/or "Middle Eastern or North African."
- Students marked all 15 reasons for missing days or part days of school (for regular public school students) or scheduled classes or assigned activities (for online students).
- Students reported being bullied or harassed "every day" in the past 30 days for all eight reasons listed.
- Students reported going to all six locations listed after school on all five days of a typical week.
- Students reported participating in all eight kinds of organized activities on five or more days in a typical week.
- Students reported drinking all seven kinds of beverages listed four or more times per day during the last seven days.
- Students reported engaging in all five kinds of gambling "every day."
- All 14 sources of e-cigarettes were marked among students who reported using e-cigarettes.
- All 13 sources of alcohol were marked among students who reported using alcohol in the last 30 days.
- Remove cases showing a pattern of logical inconsistency. There are several situations in which survey answers can be clearly inconsistent. Ten such items were checked in 2022. Because an inconsistency can easily be accidental, especially on the no-yes questions, cases were removed only if there were inconsistencies on three or more of the ten items:
- Answering "none of these" and one or more dental problems
- Answering "no" and "yes" to mental health treatment
- Answering "no" and "yes" to alcohol/drug treatment
- Answering "none of these" and one or more adults student can talk to
- Answering "no" and "yes" to suicidal thoughts
- Answering "no" and "yes" to suicide attempts
- Answering "no" and "yes" to having ever been in foster care
- Answering "no" and "yes" to being homeless
- Answering "no" and "yes" to having had a parent in jail or prison
- Answering "no method was used" and one or more methods of birth control
- Because of skip patterns, not all students were asked these questions.
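The case-level thresholds above can be summarized in a short Python sketch. This is only an illustration of how the stated rules combine, not the procedure actually used to process the MSS. The `CaseFlags` structure, its field names, and the reason labels are hypothetical; the numeric thresholds come from the criteria listed above. The age/grade check is left out because the text does not quantify "sharply incompatible."

```python
from dataclasses import dataclass
from typing import List

# Thresholds taken from the criteria described above (2022 survey).
TOBACCO_KINDS = 5            # cigarettes, cigars, smokeless, e-cigarettes, hookah
TOBACCO_MAX_DAYS = 30        # all thirty of the past thirty days
DRUG_KINDS_THRESHOLD = 8     # eight or more kinds at the maximum frequency
EXCESSIVE_MARKING_MIN = 2    # two or more of the eleven criteria
INCONSISTENCY_MIN = 3        # three or more of the ten items


@dataclass
class CaseFlags:
    """Hypothetical per-case summary; field names are illustrative only."""
    answered_beyond_background: bool
    tobacco_days: List[int]        # days used in past 30, one entry per tobacco kind
    drugs_at_max_frequency: int    # drug kinds reported at "20 or more" occasions
    excessive_marking_hits: int    # how many of the 11 criteria were met
    inconsistency_hits: int        # how many of the 10 items were inconsistent


def removal_reasons(case: CaseFlags) -> List[str]:
    """Return the reasons (if any) a case would be removed under the rules above."""
    reasons = []
    if not case.answered_beyond_background:
        reasons.append("background_only")
    if (len(case.tobacco_days) == TOBACCO_KINDS
            and all(d == TOBACCO_MAX_DAYS for d in case.tobacco_days)):
        reasons.append("exaggerated_tobacco")
    if case.drugs_at_max_frequency >= DRUG_KINDS_THRESHOLD:
        reasons.append("exaggerated_drugs")
    if case.excessive_marking_hits >= EXCESSIVE_MARKING_MIN:
        reasons.append("excessive_marking")
    if case.inconsistency_hits >= INCONSISTENCY_MIN:
        reasons.append("logical_inconsistency")
    return reasons
```

Under these assumptions, a case with two inconsistent items and one excessive-marking hit returns an empty list and is retained, while a case with three inconsistent items is flagged for removal.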
Additional cases were removed for the following reasons:
- The survey was determined to be a test survey (i.e., a single survey or a small number of surveys taken on a day before the day on which a large number of surveys were taken). These were determined on a case-by-case basis.
- The survey was the only survey in a grade level or school building. Since some such cases are legitimate (especially in charter schools) and would still contribute toward state and county results, these were determined on a case-by-case basis.
- Responses to individual questions are removed if they are very unlikely or logically inconsistent and there is no pattern of mischievous responding. These responses are likely errors, so they are deleted while the rest of the student’s responses are retained. For example, if a student reported both having dental problems and not having dental problems, the responses to the dental problems questions would be deleted and the student’s responses to other questions retained.
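The item-level deletion described in the last point can be sketched as follows. This is a minimal illustration that assumes each response is stored as a dictionary of option keys; the function name `blank_inconsistent_item` and the keys such as `dental_none` are hypothetical, not actual MSS variable names. It is meant for cases that have already been retained, that is, cases without a pattern of inconsistency.

```python
def blank_inconsistent_item(responses, negative_key, positive_keys):
    """Return a copy of responses with one contradictory item set to None.

    negative_key is the "none of these" / "no" option; positive_keys are the
    options that contradict it. All other answers are left untouched.
    """
    cleaned = dict(responses)  # copy so the original record is not modified
    said_none = bool(responses.get(negative_key))
    said_some = any(responses.get(k) for k in positive_keys)
    if said_none and said_some:
        for key in (negative_key, *positive_keys):
            cleaned[key] = None  # delete only the contradictory item
    return cleaned


# Hypothetical record: the student marked "none of these" and a dental problem.
student = {
    "dental_none": True,
    "dental_toothache": True,
    "dental_cavity": False,
    "grade": 9,
}
cleaned = blank_inconsistent_item(
    student, "dental_none", ["dental_toothache", "dental_cavity"]
)
```

In this sketch the three dental keys in `cleaned` become `None`, and `grade` keeps its original value, mirroring the rule that only the contradictory responses are deleted.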