Understanding Identifiable Data

What is identifiable data?

Within the context of human subjects research, the definition of identifiable data can sometimes be complex. According to the federal regulations governing human subjects research, data is considered identifiable if the identity of the participant is known or may readily be ascertained.

A participant’s identity may be readily ascertained if a code key exists that links participant ID numbers to their identity.
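
To make the idea of a code key concrete, here is a minimal sketch in Python. The function name `split_code_key`, the field names, and the sample records are all hypothetical and are not part of any HRPP tool or requirement. It separates direct identifiers from the study data and keeps the link between the two in a separate code key.

```python
import secrets

def split_code_key(records, direct_fields):
    """Split records into coded study data and a separate code key.

    `records` is a list of dicts; `direct_fields` names the direct
    identifiers to strip out. Both are illustrative assumptions.
    """
    coded, code_key = [], {}
    for rec in records:
        # Random ID that carries no information about the participant.
        pid = "P" + secrets.token_hex(4)
        while pid in code_key:  # regenerate on the rare collision
            pid = "P" + secrets.token_hex(4)
        # The code key maps the participant ID back to the identity.
        code_key[pid] = {f: rec[f] for f in direct_fields}
        # The coded record keeps only the ID and the non-identifying fields.
        coded_rec = {"participant_id": pid}
        coded_rec.update({k: v for k, v in rec.items() if k not in direct_fields})
        coded.append(coded_rec)
    return coded, code_key

# Hypothetical sample data.
raw = [
    {"name": "Participant A", "phone": "555-010-0101", "score": 3},
    {"name": "Participant B", "phone": "555-010-0102", "score": 5},
]
coded_data, key = split_code_key(raw, ["name", "phone"])
```

As long as `key` exists and can be matched to `coded_data`, the coded data remains identifiable in the sense described above, which is why a code key is typically stored separately from the study data.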

In some cases, a participant’s identity can also be readily ascertained if indirect identifiers are combined with each other or with other known characteristics. For example, if survey data is collected from a small class that includes only one male student, and students are asked to report their sex on the survey, then that student’s responses could easily be attributed to him.
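
One way to check a dataset for this kind of risk is to count how many records share each combination of indirect identifiers and flag the combinations that occur only once. The following Python sketch illustrates the idea. The function name `flag_small_cells`, the `min_cell_size` threshold, and the sample survey are assumptions made for illustration, not an HRPP standard.

```python
from collections import Counter

def flag_small_cells(records, indirect_identifiers, min_cell_size=2):
    """Return the indices of records whose combination of indirect
    identifiers is shared by fewer than `min_cell_size` records."""
    keys = [tuple(rec[field] for field in indirect_identifiers) for rec in records]
    counts = Counter(keys)
    return [i for i, key in enumerate(keys) if counts[key] < min_cell_size]

# Hypothetical survey from a small class with one male student.
survey = [
    {"sex": "F", "response": 4},
    {"sex": "F", "response": 5},
    {"sex": "M", "response": 3},
    {"sex": "F", "response": 2},
]

at_risk = flag_small_cells(survey, ["sex"])
print(at_risk)  # [2]
```

Passing several field names (for example, sex together with year of graduation) checks combinations of indirect identifiers rather than a single variable. A record that is not flagged may still be identifiable when combined with information outside the dataset, so this check is a screening aid rather than proof that data is de-identified.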

What are direct identifiers?

Direct identifiers are variables that point explicitly to particular individuals or units. Examples include:

  • Names
  • Addresses
  • Telephone numbers, including area codes
  • Social Security numbers
  • Other linkable numbers such as driver’s license numbers, certification numbers, etc.

What are indirect identifiers?

Indirect identifiers are variables that may be used together or in conjunction with other information to identify individual participants. Examples include:

  • Detailed geographic information (e.g., state, county, province, or census tract of residence)
  • Organizations to which the participant belongs
  • Educational institutions from which the participant graduated and year of graduation
  • Detailed occupational titles
  • Place where participant grew up
  • Exact dates of events (e.g., birth, death, marriage, divorce)
  • Detailed income
  • Offices or posts held by participant

What about audio and video recordings?

Audio and video recordings are always considered to be identifiable in the context of human subjects research. Options for de-identifying audio and video recordings include altering an individual’s voice or blurring their face. Another option is to transcribe the recording, remove any potential identifiers from the transcript, keep the de-identified transcripts, and destroy the original recording.
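
The transcript option can be partly supported with a simple script. The Python sketch below replaces a list of known names and phone-number-like strings with placeholders. The function name `redact_transcript`, the placeholder labels, the phone pattern, and the sample transcript are hypothetical. Pattern matching will miss identifiers it was not written for, so a manual review of each transcript is still assumed.

```python
import re

# Assumed format: three digits, three digits, four digits,
# separated by a hyphen, period, or space.
PHONE_PATTERN = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")

def redact_transcript(text, known_names):
    """Replace known names and phone-like numbers with placeholders."""
    # Replace longer names first so a full name is handled before a first name.
    for name in sorted(known_names, key=len, reverse=True):
        pattern = r"\b" + re.escape(name) + r"\b"
        text = re.sub(pattern, "[NAME]", text, flags=re.IGNORECASE)
    return PHONE_PATTERN.sub("[PHONE]", text)

# Hypothetical transcript line.
line = "Interviewer: Thanks, Maria Lopez. Can we reach you at 555-010-0199?"
redacted = redact_transcript(line, ["Maria", "Maria Lopez"])
print(redacted)
# Interviewer: Thanks, [NAME]. Can we reach you at [PHONE]?
```

Indirect identifiers mentioned in conversation, such as an employer or a hometown, are not covered by this sketch and would need to be removed by a reader who knows the study context.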

Do I need to de-identify my data?

The HRPP understands that there are some instances in which keeping identifiable data is necessary to answer the research questions under study. In these instances, be sure to clearly inform participants in the consent form that you will be collecting and keeping identifiable data. Depending on the sensitivity of the information being collected, additional privacy and confidentiality protections may need to be put in place to protect participants.

What about HIPAA-protected information?

If you are working with HIPAA-protected information, note that there are different guidelines for what is considered identifiable data and how this data should be de-identified. See the links below or contact us for more information.