Recently, I had the opportunity to sit down with Dr. John Birkmeyer, Director of the University of Michigan’s Center for Healthcare Outcomes and Policy, and lead author of a landmark study of surgical skill and outcomes recently published in the New England Journal of Medicine. Among other things, we discussed the motivation for the study, the main findings, and the implications for future research. Here is what he had to say:
Bradley Reames: What motivated you to conduct this study?
John Birkmeyer: We have known for many years that the outcomes of surgery vary widely across both hospitals and surgeons. We have made significant headway in understanding how components of a hospital’s structure or volume affect outcomes, and we have even studied how different aspects of process of care and perioperative practice affect outcomes. But there are still many unanswered questions about how the quality of the procedure itself drives outcomes after surgery.
BR: What was the most difficult part of completing a study like this?
JB: I think that the most challenging part of doing a study like this is getting surgeons to participate. Obviously, submitting videotapes of yourself operating and submitting to peer rating is threatening at multiple levels, and would be a serious challenge for many trying to replicate this type of study. I think what ultimately enabled our success was the social capital, accrued over many years, collaborating with Michigan bariatric surgeons on improvement activities.
BR: How would you summarize the main findings of this work?
JB: I think that the study has three major findings. The first is that despite the inherent subjectivity of the process, it is in fact feasible, and practical, to measure the skill of operating surgeons. The second is that, even among surgeons who are fully trained and practicing in a single specialty, there was remarkable variation in empirical ratings of their skill. And third, and perhaps most important: those empirical measures of skill were remarkably correlated with surgical outcomes, at least as reflected by risks of postoperative complications.
BR: What surprised you most about the results?
JB: I don’t think that any surgeon would be surprised by the general finding that the skill of the operating surgeon matters. What surprised us, however, was how readily a surgeon’s skill could be measured, and how powerfully the measures of surgeon skill were associated with outcomes.
BR: Were you surprised that duration of practice or fellowship training did not seem to influence skill?
JB: Well I have heard many people express surprise that fellowship training was not associated with surgeon skill, but that does not surprise me at all. I believe that subspecialty training is a really important strategy for young surgeons to accelerate their learning curve and their familiarity with a particular set of procedures, or a subspecialty. So I personally believe that fellowship training might differentiate the outcomes of a surgeon that is right out of fellowship. However, by the time a surgeon has been practicing for many years, I would expect that the fellowship training effect would have washed out over time, and that is what I think we are seeing in this study.
BR: Given that most current continuing medical education programs stress knowledge instead of technical proficiency, do you think this shows that surgeons become “set in their ways,” so to speak, after practicing for an extended period of time?
JB: I think my clinical observation, as well as previous studies by others, suggest that surgeons get on to the flat part of the curve of proficiency (with regard to technical skill and outcomes) by their mid-40s. The mean age of surgeons participating in this study was 45-50 years old, and thus I am not surprised that years of training and fellowship training were not a big driver of outcomes.
BR: What do you think are the most important implications of your findings for the surgical community?
JB: I think that it is easy to imagine implications for many different stakeholders involved in the process of producing or evaluating surgeons. You can imagine that these findings have implications for the process by which we decide who gets to be a surgeon in the first place. Right now, medical students get to be surgeons because that is what they want to do, rather than because that is where their talent directs them. Perhaps we need to rethink that, and to more directly or empirically assess the skill of future surgeons at an early stage in their careers.
For surgical residents, it means that rather than pass surgical residents from year-to-year based on time served, it is possible that we need to more empirically assess their skill along the way, and perhaps be more explicit about steering surgeons to the right types of subspecialties according to objectively measured skill.
After clinical training, you can imagine that empirical assessments of skill will have direct application to the American Board of Surgery and other agencies that are charged with board certification and recertification. Certainly cognitive skills, as reflected by in-service exams or by board tests, matter for some disciplines, but it is hard to believe that they matter more than the operative proficiency of surgeons.
You can also imagine the business case for better assessments of surgeon skill from other perspectives. Certainly hospitals, for example, have a strong incentive to recruit and retain the best surgeons; why not seek better information about surgeons while they are in the hiring process?
And finally, I think our findings have important implications for public health, and for improving care in general. Most of our collective activities to date have been around assessing and enhancing evidence-based processes of perioperative care. These findings suggest that maybe we should be paying as much attention to making surgeons better at what they do while they are in the operating room.
BR: This study opens multiple new avenues of scientific inquiry. How do you think these results should influence the surgical health services research agenda going forward?
JB: I think there are at least two obvious important questions that derive from our findings. The first is: to what extent can our findings from one specialty, in this case bariatric surgery, be extrapolated to other disciplines? Many would assume that they would be directly applicable to comparable subspecialties that imply a significant amount of technical complexity in the procedures they encompass, but only with empirical study will we have a true understanding of how much we should prioritize surgeon skill over other quality measures.
And a second really important question is: to what extent is this problem fixable? Whether it is musicians, or athletes, or surgeons, we all appreciate that there will inevitably be some variation in skill or other types of performance. We take it as a given that deliberate practice and coaching can improve skill at any level. But it remains to be seen whether there are practical or effective interventions that could similarly improve the skill of practicing surgeons.
BR: How might this research be translated to open operations and other specialties?
JB: I think that our study provides a pretty clear design blueprint of how to measure surgical skill and link these measures of skill to post-operative outcomes, and I believe a similar approach could be applied in virtually any surgical discipline. I think the narrower question is: what are the barriers to taking methods that were applied to videoscopic surgery, and applying them to open surgery? I think there are plenty of technology options, whether it is Google Glass or other headlight-mounted cameras, which could provide at least comparable recordings of open surgery, as we had for videoscopic surgery.
BR: What is the next step for your group in Michigan?
JB: I think that our broader group in Michigan is moving forward on two fronts. First, across many of the specialty collaboratives in Michigan, there is an interest in replicating our findings for other types of procedures and assessing the extent to which skill is associated not just with complications, but also with longer-term outcomes. Second, there is really innovative work going forward by Justin Dimick and Nancy Birkmeyer trying to assess the feasibility of population-wide coaching interventions. Their work is aimed not only at narrowing variation in skill, but also at making surgeons of every skill level better.
Certainly the conversation regarding measurement of technical skill and its implications for surgical education, accreditation and certification, clinical quality improvement, and health policy is just getting started. Going forward, it will be interesting to see where this study has the greatest impact.
In the meantime, we want to hear your thoughts. What do you think about the measurement of surgical skill and its potential application to the areas above? Post your comments below.
About Dr. John Birkmeyer:
John Birkmeyer, MD, is the George D. Zuidema Professor of Surgery and Director of the Center for Healthcare Outcomes & Policy at the University of Michigan. He is a graduate of Harvard Medical School. His research career has focused on performance measurement, understanding variation in hospital outcomes and cost-efficiency, and strategies for improvement. Formerly a series editor of the Dartmouth Atlas of Healthcare, Dr. Birkmeyer has leading roles in several regional collaborative improvement programs involving over 50 hospitals in Michigan, with support from Blue Cross Blue Shield of Michigan. He serves on the blue ribbon expert panel on hospital safety ratings for the Leapfrog Group and as Chief Scientific Officer for ArborMetrix, Inc. Dr. Birkmeyer was elected to the Institute of Medicine of the National Academy of Sciences in 2006.