Designing a Quality In-depth Interview Study: How Many Interviews Are Enough?

Here is a topic that is worthy of more discussion in the research community: What is the optimal number of in-depth interviews to complete in an IDI study?  The appropriate number of interviews to conduct for a face-to-face IDI study needs to be considered at two key moments in the research process: the initial research design phase and the phase of field execution.  At the initial design stage, the number of IDIs is dictated by four considerations: 1) the breadth, depth, and nature of the research topic or issue; 2) the hetero- or homogeneity of the population of interest; 3) the level of analysis and interpretation required to meet research objectives; and 4) practical parameters such as the availability of and access to interviewees, travel and other logistics associated with conducting face-to-face interviews, as well as the budget or financial resources.  These four factors present the researcher with the difficult task of balancing the specific realities of the research components while estimating the optimal number of interviews to conduct.  Although the number of required interviews tends to move in direct step with the level of diversity and complexity in the research design, there is little guidance on sample size for the researcher at the planning stage.

The other key moment in time when the researcher considers the adequacy of the sample size is during the field phase when interviews are actually being conducted.  This is the point most widely discussed by researchers because it is then, when in the field, that the optimal number of interviews is determined.  Specifically, researchers utilizing grounded theory rely on the notion of “saturation” or the point in time when responses no longer reveal ‘fresh insights’.  On this basis, the researcher deems that a sufficient number of interviews have been conducted when no new themes or stark variations in interviewees’ responses are coming to light.  There are few guidelines for determining the number of interviews by way of saturation, and some have questioned its value given the lack of transparency.

A higher-quality approach to the question of how many face-to-face IDIs to conduct considers the design phase as well as results in the field but goes further.  It is not good enough to simply evaluate interview completions in the field based on the point of saturation.  While it is important to determine the degree to which interviews are or are not yielding new meaningful information (see the fourth question, below), there are many other quality concerns that need to be resolved.  To assess the number of face-to-face IDIs at the field stage, the researcher needs to more broadly review the quality of the interview completions based on the answers to these eight questions:

  • Did every IDI cover every question or issue important to the research?
  • Did all interviewees provide clear, unambiguous answers to key questions or issues?
  • Does the data answer the research objective?
  • To what extent are new ideas, themes, or information emerging from these interviews?
  • Can the researcher identify the sources of variations and contradictions in the data?
  • Does the data confirm or deny what is already known about the subject matter?
  • Does the data tell a story?  Does it make sense and does it describe the phenomenon or other subject of the study?
  • Are new, unexplored segments or avenues for further research emerging from the data?

From there, the researcher can determine whether additional interviews are justified.
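
The fourth question above (the extent to which new themes are still emerging) can be made a little more transparent with a simple tally. The following is a minimal sketch, not a decision rule: it assumes each completed interview has already been coded into a set of theme labels, and the function names and theme labels are hypothetical. It addresses only one of the eight questions; the others still require the researcher's judgment.

```python
def new_themes_per_interview(coded_interviews):
    """For each interview, in field order, return the themes not seen in any earlier interview."""
    seen = set()
    fresh = []
    for themes in coded_interviews:
        new = set(themes) - seen   # themes appearing for the first time
        fresh.append(new)
        seen |= new
    return fresh


def interviews_since_last_new_theme(coded_interviews):
    """Count how many of the most recent interviews added no new theme."""
    count = 0
    for new in reversed(new_themes_per_interview(coded_interviews)):
        if new:
            break
        count += 1
    return count


# Hypothetical theme codes for four completed interviews (illustrative only).
coded = [
    {"price", "trust"},
    {"trust", "convenience"},
    {"price", "convenience"},
    {"trust"},
]

new_counts = [len(new) for new in new_themes_per_interview(coded)]
quiet_run = interviews_since_last_new_theme(coded)
print(new_counts, quiet_run)
```

A long run of interviews with no new themes suggests, but does not establish, that further interviews may add little; it says nothing about whether answers were clear, complete, or sufficient to meet the research objective.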


  1. This is terrific, thanks. We are constantly faced with this design question from our (internal) clients. Saturation and level (fit for purpose) of analysis tend to drive our decisions, based on my limited experience. But your first 2 questions (item nonresponse, essentially) clarify something else for me – we end up losing part of our “sample” in many interview programs because informants turn out to be “ineligible” – they didn’t actually meet the selection criteria and that resulted in levels of missing or invalid data that turn them into the survey-equivalent disposition of “unusable”.

  2. Hello Margaret,
    The following is a bit too simplistic but I hope it will serve. In the first stages of grounded theory, interviews are led by the participants and regarded as a conversation between equals. We use very general ‘grand tour’ questions to start the conversation and follow where our interlocutor leads (the inductive phase).
    The interview is analysed as soon as possible afterwards and ideally before the next interview. When analysing, we notice incidents (a word, phrase, sentence, paragraph) and label/code/conceptualise the incidents with the aim of developing concepts. As we continue to analyse, we compare incident to incident, incident to concept and concept to concept, searching for the best fit; we seek a name/label/code which conceptualises the set of incidents we notice. We develop a ‘stable’ of concepts and each concept comprises many incidents. This is the constant comparison process.
    At a certain point, (once the core category has been identified), we theoretically sample, that is we conduct more interviews with more focused questions, designed to address gaps in the developing theory (the deductive phase). We cease analysis when we achieve theoretical completeness. That is, our theory explains the main concern of participants and how they resolve or process that concern; our theory fits, works, is relevant and is modifiable.
    Each concept will be theoretically saturated, and we know when theoretical saturation has been achieved due to a mechanism known as the ‘interchangeability of indicators’ as discovered by Paul Lazarsfeld (see Roots of Grounded Theory for the mathematical basis of GT).
    The interchangeability of indicators means that for any one concept, the next incident that is recognised could be exchanged for any one of the concept’s current indicators without altering the fit.
    Several points arise.
    The output of the grounded theory process is a grounded theory explaining the main concern of participants and how they resolve that main concern. That is all. It’s no good for anything else*.
    Data collection and data analysis are conducted rhythmically; an interview is coded as soon as possible after the interview has ended and ideally before the next interview is conducted. Analysis drives theoretical sampling and so the number of interviews needed (and interview is but one source of data) is only known at the end of the study. Most PhDs seem to use between 30 and 50.
    The method is rigorous and is a whole package which governs both data collection and data analysis. Adopting one grounded theory technique into a different research design discredits that technique. ‘Theoretical saturation’ is a powerful idea as part of the grounded theory method.
    So you are right when you say: ‘It is not good enough to simply evaluate interview completions in the field based on the point of saturation’ … where I understand you to mean that interviews are conducted sequentially until an impression of saturation is reached. Theoretical saturation is achieved through a rigorous process of analysis.
    Best wishes
    Helen Scott

    [*Its relevance as a market research tool might be that GT looks for patterns of behaviour and is useful for identifying dependent and independent variables. It produces a theory, an explanation but not truths. Its use would probably be more strategic?]

    1. Hello Helen,
      Just a short reply to thank you very much for your thoughtful and insightful comment. I am continually looking at and am interested in the marketing research version of qualitative research compared to how it is studied and practiced in other social and health science disciplines. Grounded theory is definitely one important point of differentiation.

