Evaluating Connected Health Interventions In The Era Of Personalized Medicine

My colleague Anne Thorndike, MD, an expert in population studies and workplace wellness, contributed to this piece. She can be reached directly at athorndike@partners.org.

The concept of connected health recently received a blow when an article entitled “Telemonitoring in Patients with Heart Failure” appeared in December 2010 in the New England Journal of Medicine. The headline offered by this study was that “telemonitoring does not work”. More accurately, the intervention group did not have a significantly different hospital readmission rate than the control group. We live in a world where, accentuated by RSS feeds, hyperlinks and 140-character tweets, headlines are all we can consume most days. So whether or not the conclusion that telemonitoring does not work is justified, the headline did some damage.

This is an opportunity to re-examine how we evaluate health interventions in the era of personalized medicine. The gold standard for determining an intervention’s effectiveness is the randomized controlled trial. The design of this type of study is very predictable: decide what effect is to be tested and how large that effect is expected to be. From that, researchers can calculate the number of subjects to enroll in order to ensure that, if the effect exists, the study will detect it.
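
As a rough illustration of that arithmetic, here is a minimal sketch in Python using the standard two-sample, normal-approximation formula; the effect size, alpha, and power below are hypothetical choices, not figures from any particular study.

```python
from scipy.stats import norm

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate subjects needed per arm for a two-sided, two-sample
    comparison, given a standardized effect size (Cohen's d)."""
    z_alpha = norm.ppf(1 - alpha / 2)   # critical value for the false-positive rate
    z_beta = norm.ppf(power)            # critical value for the desired power
    return 2 * ((z_alpha + z_beta) / effect_size) ** 2

# Hypothetical example: to detect a "medium" effect (d = 0.5) with 80% power
# at alpha = 0.05, roughly 63 subjects would be needed in each group.
print(round(n_per_group(0.5)))
```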

This exercise is conceived from the point of view of population science.  The idea is to choose a sample large enough that individual differences between subjects randomized to the intervention group and the control group wash out, leaving the intervention as the only real difference between them.  When the intervention is a treatment, the intervention group is often analyzed as a whole, whether or not all subjects actually completed or received that treatment.  This is called an ‘intention-to-treat’ analysis.  Population scientists call this a measure of the effectiveness of the intervention: if you give the intervention to a population of folks, do they use it and does it work?  The logic is that when an intervention is made available to a population in the real world, not everyone will use it. The conventional wisdom is that if we don’t count folks who are non-adherent or who drop out, we will potentially bias the results.  For example, it is possible that non-adherent subjects are sicker and that excluding them will falsely inflate the perceived effect of the intervention.
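
Here is a toy sketch of the distinction, on a handful of made-up subjects; the arm assignments, adherence flags, and outcomes are invented purely for illustration.

```python
import pandas as pd

# Hypothetical simulated trial: each row is a subject, with the arm they were
# randomized to, whether they actually adhered, and whether they were readmitted.
df = pd.DataFrame({
    "arm":        ["intervention"] * 6 + ["control"] * 6,
    "adherent":   [True, True, True, False, False, True,
                   True, True, True, True, True, True],
    "readmitted": [0, 0, 1, 1, 1, 0,
                   1, 0, 1, 1, 0, 1],
})

# Intention-to-treat: analyze everyone in the arm they were assigned to,
# regardless of whether they actually used the intervention.
itt = df.groupby("arm")["readmitted"].mean()

# Per-protocol (one crude form): keep only the subjects who adhered.
per_protocol = df[df["adherent"]].groupby("arm")["readmitted"].mean()

print("ITT readmission rates:\n", itt)
print("Per-protocol readmission rates:\n", per_protocol)
```

In this toy example the per-protocol estimate flatters the intervention precisely because the non-adherent (and possibly sicker) subjects were dropped, which is the bias the intention-to-treat convention guards against.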

In the case of the NEJM article referenced above, the intervention was a program in which at-home heart failure patients weighed themselves on a bathroom scale and then telephoned the weight in to an interactive voice response line.  At the beginning of the study, ~10% of the intervention subjects did not use the intervention, and by the end of the six-month period 55% were not using it.   It is reasonable to conclude that this type of intervention does not lead to sufficient engagement to make a difference in hospital readmissions.  Does it follow that ‘telemonitoring does not work’?  What we’d really like to know is the efficacy of the intervention.  Population scientists define efficacy as a purer test of whether the intervention works when people actually use it.

At the Center for Connected Health, we have a heart failure telemonitoring program.  It differs in at least two ways from what is described in the NEJM article. One is that we use automatically uploaded vital sign data, which we think makes a difference. The main difference, though, is that we have nearly 100% adherence.  That is because each morning, if a patient has not uploaded his/her vital signs, a nurse calls to remind the patient to do so.  The objective data from the monitoring devices set up an opportunity to hold the patient accountable for a health outcome, an important principle of connected health.  Our program has consistently been correlated with a reduction of ~50% in both CHF-related and all-cause hospital admissions.
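
As a very rough sketch of what that morning check might look like in code; the patient records and field names here are invented for illustration and are not the Center's actual system.

```python
from datetime import date

# Hypothetical patient records; "last_upload" is the date of the most recent
# automatic vital-sign transmission.
patients = [
    {"name": "Patient A", "last_upload": date.today()},
    {"name": "Patient B", "last_upload": date(2011, 1, 3)},
]

def morning_call_list(patients, today=None):
    """Return the patients a nurse should call because no vitals arrived today."""
    today = today or date.today()
    return [p["name"] for p in patients if p["last_upload"] < today]

print(morning_call_list(patients))
```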

So we’d say that our program is efficacious.  We could do further study of its effectiveness at the population level.

The Healthrageous platform promises dynamic personalization, or connected personalized health.  These terms imply that each intervention will be tailored to meet an individual’s needs.  As we see this vision unfold, it seems antiquated to use a tool like a randomized controlled trial to test the intervention’s utility.  If there are 1,000 people in a group, the Healthrageous platform will customize 1,000 individual interventions.

The same anxiety is occurring as the hyper-individualized designer therapeutics of personalized medicine begin to hit the market.

Think of it… we are all individuals. Medicine is on the cusp of offering each one of us unique interventions. Of what value is it for me to know I was part of a clinical trial where 60% of recipients benefited from an intervention?  What I really want to know is whether I will benefit (100% or 0%).

There are a couple of ideas floating around that give me some hope. One is really an old idea made new in the context of modern technology.  In some settings, you can act as your own control. We can measure your baseline, measure results in the context of the intervention, withdraw the intervention and see if you go back to baseline, then add the intervention back again… In this way we can get a sense of what the effect of the intervention is on you and you alone.  This “n of 1” design works best if one can substitute a placebo for the intervention at various intervals in a way that is blind to the study subject.  For an entertaining example of this phenomenon, see the Quantified Self website and how people are doing ‘clinical trials of one’ using biometric monitoring devices.  A second strategy would be to perform a delayed-start study: during an early run-in phase, subjects who are non-adherent could be screened out.  While this will still draw some criticism, it should be a purer test of the intervention than an intention-to-treat design.
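
Here is a minimal sketch of the arithmetic behind such an “n of 1” comparison, using made-up daily readings from one hypothetical person across alternating off/on phases.

```python
from statistics import mean

# Hypothetical daily readings from a single person in an A-B-A-B design:
# A = baseline / placebo phases, B = intervention phases.
phases = {
    "A1": [148, 151, 149, 150],   # baseline
    "B1": [141, 139, 142, 140],   # intervention
    "A2": [147, 150, 149, 148],   # withdrawal: do readings return to baseline?
    "B2": [140, 138, 141, 139],   # intervention reintroduced
}

on  = phases["B1"] + phases["B2"]
off = phases["A1"] + phases["A2"]

# The within-person effect estimate is simply the difference in phase means;
# a return toward baseline during A2 is what distinguishes an intervention
# effect from a simple trend over time.
print("mean off:", mean(off))
print("mean on:", mean(on))
print("estimated within-person effect:", mean(on) - mean(off))
```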

Healthrageous is leading the way in creating interventions that are unique to each of us.  This is the phenotypic individualism that complements the genotypic mapping that has been popularized as personalized medicine.  We need a new strategy for evaluating the success of these highly personalized interventions.  Two possible solutions are ‘n of 1’ trials, especially where the intervention can be blinded, and a design where non-adherent participants are removed during a pretrial phase.

Are there other strategies we might employ?  Let me know your thoughts.