In 2018 I was part of an ambitious project to design a mobile service that empowers patients with type 2 diabetes mellitus to better manage their condition, and health practitioners to deliver better care. After creating it through user-centered and co-design methods, here’s how we evaluated it with real patients.
The project took place between January and December 2018. I was part of a team of developers, researchers, and junior designers, alongside Engineering (R&D IT Systems for Health) and Imec. I was responsible for the user research, the co-design activities, and the formative and summative evaluation, producing the major design and evaluation deliverables by December 2018.
My challenge was to coordinate the evaluation across three different sites, two in Italy and one in the Netherlands. The trials were all located outside Trento, where I am based, so the task required efficient communication and project management to coordinate remote teamwork. I also had to adapt the experimental design along the way, to cope with unforeseen contingencies and issues with recruitment.
In the spring of 2018, while we were still immersed in the design of the service, we started planning the experimental design and the recruitment. At first, I had planned a randomized controlled trial comparing the DMCoach app to standard treatment, and I ran different calculations based on the literature to compute the sample size with G*Power. Unfortunately for us, our main contact for the recruitment of patients was promoted in the early summer and moved abroad, which forced us to rethink our plans. We had to quickly adjust them to run an effective evaluation with fewer patients. We considered different options (e.g., a cross-over experimental design: a within-subjects design where each patient receives both conditions, with the advantage of requiring fewer participants to obtain the same statistical power) and finally decided to submit a “quality improvement project” with a simpler within-subjects repeated-measures design, which did not require ethical approval and allowed us to cope with this unforeseen delay.
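The trade-off we weighed can be sketched with a quick power calculation. This is a hedged illustration rather than our actual G*Power computation: it uses the normal approximation for a paired (within-subjects) t-test, so it slightly underestimates the exact t-distribution result.

```python
from math import ceil
from statistics import NormalDist

def paired_t_sample_size(d, alpha=0.05, power=0.80):
    """Approximate number of pairs for a paired t-test (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = z.inv_cdf(power)
    return ceil(((z_alpha + z_beta) / d) ** 2)

# Medium within-subject effect (Cohen's d = 0.5): roughly 32 pairs.
# G*Power's exact computation gives a slightly larger figure (~34).
print(paired_t_sample_size(0.5))
```

Because every patient acts as their own control, the within-subjects design needs far fewer people than a two-arm trial powered for the same between-group effect, which is exactly why it suited our shrunken recruitment pool.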
After the design phase, we proceeded with quick iterations of development and testing, combining expert evaluation and user testing with 15 users. This allowed us to refine the service and fix all the major bugs and UX and usability problems by September.
Then, we started the evaluation incrementally in the three sites. As soon as participants were available, they joined the study in three different groups. This involved 32 patients with DMT2 in Italy and 5 additional participants in Eindhoven to evaluate a smart band for detecting physiological parameters.
To cope with the hectic schedule, we tapped into our personal networks at work and found 15 people in Trento and Rome who used the mobile app daily for one month in August 2018. Users classified their remarks as “Crash”, “Bug”, “No app response”, “Response delay”, or “Desiderata” and assigned each a priority among “Low”, “Medium”, “High”, “Urgent”, and “Future development”. We collected all this user feedback in a Google spreadsheet, releasing new versions as the problems were fixed, and repeating the cycle. We also asked colleagues with UX expertise to run a heuristic evaluation to spot major usability problems.
We identified 108 problems, which we addressed and solved based on their priority.
We recruited 32 patients (mean age = 62; standard deviation = 10; 8 females) in two Italian cities: they joined the study as soon as they reached a minimum number, so the evaluation lasted 8 weeks for the first group and 4 weeks for the second group. We also ran a one-week test in Eindhoven to assess the acceptance and the quality of data gathered by a novel smartband prototype developed by Imec, which collects physiological data.
Our main goal was to evaluate the technological acceptance and the UX of the mobile app. We assessed persistence of use (how consistent were the patients in using the app?) and attrition rate (how many dropouts?). For persistence, we set a minimum target of 3 logins per user per week on average, and by the end of the study each patient had accessed the app on average 52 times per week (more than 7 times per day!), well above the threshold. Only 2 participants did not finish the study: 6%, compared to a 20% dropout threshold we had found in previous studies.
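Both metrics fall out of the app logs directly. The numbers below are invented for illustration (five patients, four weeks); the thresholds are the ones from the study.

```python
# Hypothetical per-patient weekly login counts; the real figures came
# from the app logs of 32 patients.
weekly_logins = {
    "p01": [55, 60, 48, 50],
    "p02": [40, 52, 47, 58],
    "p03": [70, 64, 66, 59],
    "p04": [33, 45, 50, 41],
    "p05": [2, 0, 0, 0],  # dropped out after the first days
}

PERSISTENCE_THRESHOLD = 3    # minimum mean logins per user per week
ATTRITION_THRESHOLD = 0.20   # dropout rate reported in prior studies

# A completer still shows activity in the final week of the trial.
completers = {p: w for p, w in weekly_logins.items() if w[-1] > 0}
mean_weekly = sum(sum(w) / len(w) for w in completers.values()) / len(completers)
attrition = 1 - len(completers) / len(weekly_logins)

print(f"mean logins/week: {mean_weekly:.1f}, attrition: {attrition:.0%}")
```

The "still active in the final week" rule for completion is an assumption of this sketch; the study counted dropouts from explicit withdrawals.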
To evaluate patients’ satisfaction with the app functionalities and the UX, we asked participants to fill in a set of questionnaires. We used both validated instruments and questionnaires I designed ad hoc based on the academic literature, to explore these dimensions:
- individual aspects of technology acceptance;
- self-reported compliance with healthy nutrition and physical activity;
- satisfaction with current self-management practices;
- quality of life;
- motivation to undertake a healthy lifestyle.
We designed the evaluation as a 5-phase process:
- Recruitment: health practitioners used a brief survey to identify eligible volunteers.
- Baseline assessment: participants were screened to exclude any obstacles to their participation and to define, together with their doctor, a set of lifestyle and data tracking goals. Based on goal-setting theory, we explicitly designed the service to allow patients and doctors to set lifestyle goals together. Patients were assisted in the installation of the app and guided through its functionalities. After the technical training, patients filled in the baseline questionnaires.
- DMCoach use: participants used the mobile app for 8 or 4 weeks. During this period, two health practitioners monitored the participants through the webapp.
- Final assessment: at the end of the study, patients underwent the final assessment and filled in the questionnaires.
- Smartband data collection: 5 volunteers in Eindhoven evaluated a smartband prototype developed by Imec, which collects data on skin conductance, skin temperature and heart rate. The goal of this test was to tune the integration of physiological data with DMCoach.
We analysed data from the app logs, questionnaires, and physical assessment. Here, I give a brief and generic summary of the main findings.
Patients used the app consistently throughout the trial to track nutrition and physical activity.
Regarding patients’ interaction with the app, the results were very encouraging: a large portion of the participants used the app every day during the experimentation. We computed insightful statistics from the app logs: most of the patients’ data tracking concerned lunch, followed by breakfast and dinner. But we also noted that, while aerobic workout was the second most frequently tracked item in the 8-week group, the 4-week group did not track physical activity nearly as much.
We wanted to understand why.
So, we analyzed the data further and found that the 4-week group had fewer goals on physical activity. This can depend on patients’ characteristics and on the health practitioner’s approach: as we had found out in the research phase, diabetes requires a highly individualized treatment and not all patients can be asked to do physical activity (e.g., due to comorbidities or mobility issues). However, this pointed to possible differences in usage patterns among the clinicians: for instance, even though the study lasted longer in the 8-week group, it was the health practitioner of the 4-week group who sent more direct personalized messages to the patients.
UX was high in both groups, with higher scores at the end of the study.
The user experience of DMCoach was rated high in both groups. For particular dimensions (ease of use for both groups and attractiveness for the 4-week group), participants reported a significantly higher score at the end of the evaluation. This suggests that after an initial learning period and prolonged use, the app became easier to use and more attractive to patients. These data also suggested that DMCoach remained widely accepted over the whole duration of the study, without any apparent effect of boredom.
By the end of the study, patients reported healthier nutrition and were more satisfied with how they managed diabetes.
This was a breakthrough in our study. Although we measured attitudes rather than behaviors here, we understood we were on the right path. Also, by the end of the study patients in the 4-week group reported being less attracted by unhealthy behaviors.
13 patients out of 22 lost weight by the end of the study.
Although we did not find statistically significant differences in participants’ weight between the beginning and the end of the study, more than half of the 22 patients for whom we were able to collect weight at both data points lost weight. This is promising, considering the relatively short period of the evaluation (4 and 8 weeks) and that significant weight reduction may require several weeks of treatment and a larger sample size.
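A quick way to see why a 13-out-of-22 split is not, by itself, statistically significant is an exact sign test on the direction of change. This is a hedged sketch, not the analysis we actually ran on the weight measurements:

```python
from math import comb

def sign_test_p(successes, n):
    """Two-sided exact sign test against a 50/50 chance (binomial tail, doubled)."""
    k = max(successes, n - successes)
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# 13 of 22 patients lost weight: p is far above 0.05, consistent with
# the study's finding of no statistically significant change.
print(round(sign_test_p(13, 22), 3))
```

With n = 22, roughly 17 patients would have needed to lose weight before the sign test alone crossed the conventional 0.05 threshold, which is why a longer trial and a larger sample matter here.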
The smartband prototype collected good quality data and the battery lasted 24h, although work still had to be done to improve comfort.
The data collected by the smartband was overall of good quality, enough to monitor physiological changes and activity during the day. Also, it had to be charged only once a day. However, participants were still concerned about the size of the wristband and the skin marks left by the sensors. This suggested that work had to be done to develop a smaller, lighter prototype and to increase ergonomic comfort. Participants also suggested displaying insights derived from the collected data in an interactive way.
Overall, the evaluation of DMCoach was promising. In 2019, the project was improved and obtained further funding from EIT Digital to fine-tune the service and develop additional functionalities.
DMCoach was funded by EIT Digital. In collaboration with: