10 most common clinical trials data problems

One of the common tasks in my day-to-day job is reviewing analyses and summaries of clinical trial data. I’ve summarised the most common problems I see, in the hope that it might help you avoid them in the future:

  1. White blood cell differential units being mixed up – these lab tests are measured either in absolute terms (number of cells per volume) or as a percentage of the total white blood cell count, and often both are reported. A decision should be taken early in the trial as to how the data will be reported; I would recommend reporting in both absolute and percentage terms, which will often require converting from one to the other. It’s also very useful to check that the white blood cell components add up to something in the same ballpark as the reported WBC count (a sketch of this check follows the list).
  2. Reasons for withdrawing from the study – clinical trial investigators often take different approaches to reporting the main reason for a patient withdrawing from a study, and when looked at overall these differing patterns can make the results difficult to interpret. The reasons should be reviewed with the clinical team throughout the trial. It’s also useful to cross-check the number of patients who have withdrawn due to side effects against the reported adverse event data (see the second sketch after the list).
  3. Scale of graphs – to help interpretation, clinicians will often want to compare graphs, and to facilitate this the graphs need to be on the same scale. Additionally, if a graph shows pre-treatment versus post-treatment results, both axes should use the same scale, so that the 45-degree line marks the “no-difference” point (see the plotting sketch after the list).
  4. Footnotes – add footnotes to make sure that the results in a table are clear and can, as far as possible, be understood without reference to other documents. Ideally any abbreviations used in a table or graph should be footnoted.
  5. Number of subjects – usually tables present the number of subjects in the population under study (“big N”) as well as the number of subjects with available data (“little n”). The “big N” number needs to reflect the number of subjects under study, and this is frequently incorrect for subgroup analyses. I have seen cases where “little n” is bigger than “big N”, which is clearly rubbish (not from people in my company though!). A simple check for this is sketched after the list.
  6. Denominators for percentage calculations – related to the above point with “big N”, the denominators for summaries should be carefully planned and documented. If the denominator is not the total number of subjects in the analysis population, this needs to be justified.
  7. Fasting glucose – somehow, laboratories and people handling clinical data always seem to struggle with fasting versus non-fasting glucose. A patient has either fasted or they haven’t, and hopefully the state of the patient will be accurately recorded. This matters because the normal range for fasting glucose is different from the non-fasting range.
  8. Decimal places – often the number of decimal places is not appropriate. There should be enough decimal places to allow adequate interpretation of the results, and ideally no more!
  9. Units of measurement – always state the unit of measurement on every output and analysis.
  10. Outliers – a check should always be carried out when working with any continuous data (e.g. lab measurements, vital signs parameters) to ensure that any outlying values are not erroneous (a simple outlier screen is sketched at the end of this post).
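
For point 1, here is a minimal sketch of the differential sum check, assuming a pandas data frame with hypothetical column names for the absolute counts (in 10^9 cells/L) – your lab data will almost certainly be structured differently:

```python
import pandas as pd

# Hypothetical lab data: absolute differential counts in 10^9 cells/L.
# Subject 003 has a percentage (28%) mistakenly recorded in the absolute lymphocyte column.
labs = pd.DataFrame({
    "subject_id":  ["001", "002", "003"],
    "wbc":         [6.8, 5.2, 9.1],
    "neutrophils": [4.1, 3.0, 5.5],
    "lymphocytes": [2.0, 1.6, 28.0],
    "monocytes":   [0.5, 0.4, 0.6],
    "eosinophils": [0.15, 0.15, 0.15],
    "basophils":   [0.05, 0.05, 0.05],
})

components = ["neutrophils", "lymphocytes", "monocytes", "eosinophils", "basophils"]
labs["component_sum"] = labs[components].sum(axis=1)

# Flag records where the summed differential is not within ~10% of the reported WBC.
tolerance = 0.10
labs["suspect"] = (labs["component_sum"] - labs["wbc"]).abs() > tolerance * labs["wbc"]

print(labs.loc[labs["suspect"], ["subject_id", "wbc", "component_sum"]])
```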
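
For point 2, a sketch of the withdrawal-versus-adverse-event cross-check. The data frames and the `led_to_withdrawal` flag are hypothetical stand-ins for whatever your disposition and adverse event datasets actually contain:

```python
import pandas as pd

# Hypothetical disposition data: main reason for withdrawal per subject.
disposition = pd.DataFrame({
    "subject_id": ["001", "002", "003"],
    "withdrawal_reason": ["Adverse event", "Withdrew consent", "Adverse event"],
})

# Hypothetical adverse event data, with a flag for events leading to withdrawal.
adverse_events = pd.DataFrame({
    "subject_id": ["001", "001", "002"],
    "ae_term": ["Nausea", "Headache", "Dizziness"],
    "led_to_withdrawal": [True, False, False],
})

withdrew_for_ae = set(
    disposition.loc[disposition["withdrawal_reason"] == "Adverse event", "subject_id"]
)
ae_withdrawals = set(
    adverse_events.loc[adverse_events["led_to_withdrawal"], "subject_id"]
)

# Subjects reported as withdrawn for an AE but with no matching AE record, and vice versa.
print("Withdrawn for AE but no AE record:", withdrew_for_ae - ae_withdrawals)
print("AE leading to withdrawal but not in disposition:", ae_withdrawals - withdrew_for_ae)
```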
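
For point 3, a sketch of a pre- versus post-treatment plot with both axes on the same scale and the 45-degree “no-difference” line, using matplotlib; the values here are made up:

```python
import matplotlib.pyplot as plt

# Hypothetical pre- and post-treatment measurements for a handful of subjects.
pre = [5.1, 6.3, 4.8, 7.0, 5.9]
post = [4.9, 6.8, 4.5, 7.4, 6.1]

fig, ax = plt.subplots()
ax.scatter(pre, post)

# Same limits on both axes, so the 45-degree line marks "no change".
limits = [min(pre + post) * 0.9, max(pre + post) * 1.1]
ax.plot(limits, limits, linestyle="--", color="grey")
ax.set_xlim(limits)
ax.set_ylim(limits)
ax.set_aspect("equal")
ax.set_xlabel("Pre-treatment value")
ax.set_ylabel("Post-treatment value")
plt.show()
```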
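
For points 5 and 6, a sketch of the “big N” / “little n” sanity check, with the analysis population used as the default denominator for percentages. Again, the datasets and column names are hypothetical:

```python
import pandas as pd

# Hypothetical analysis population ("big N") and response data ("little n").
population = pd.DataFrame({
    "subject_id": ["001", "002", "003", "004"],
    "treatment": ["Drug", "Drug", "Placebo", "Placebo"],
})
responses = pd.DataFrame({
    "subject_id": ["001", "002", "004"],
    "responded": [True, False, True],
})

big_n = population.groupby("treatment")["subject_id"].nunique().rename("N")

merged = population.merge(responses, on="subject_id", how="left")
little_n = (
    merged.dropna(subset=["responded"])
    .groupby("treatment")["subject_id"].nunique().rename("n")
)

table = pd.concat([big_n, little_n], axis=1).fillna(0)

# Sanity check: "little n" can never exceed "big N".
assert (table["n"] <= table["N"]).all(), "little n exceeds big N somewhere"

# Percentages use the analysis population ("big N") as the denominator by default.
table["pct_with_data"] = 100 * table["n"] / table["N"]
print(table)
```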
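
Finally, for point 10, a sketch of a simple outlier screen on a continuous measurement, here using an interquartile-range rule on made-up systolic blood pressure values. Flagged values still need a human to decide whether they are genuine or erroneous:

```python
import pandas as pd

# Hypothetical systolic blood pressure readings (mmHg); one looks like a data-entry slip.
vitals = pd.DataFrame({
    "subject_id": ["001", "002", "003", "004", "005"],
    "sbp": [118, 132, 125, 1210, 141],
})

# Simple interquartile-range rule: flag anything well outside the bulk of the data.
q1, q3 = vitals["sbp"].quantile([0.25, 0.75])
iqr = q3 - q1
low, high = q1 - 3 * iqr, q3 + 3 * iqr
vitals["outlier"] = ~vitals["sbp"].between(low, high)

print(vitals.loc[vitals["outlier"]])
```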