Digital Biomarkers. 2018 Nov 7;2(3):126–138. doi: 10.1159/000493883

A Pilot Study to Assess the Feasibility of Collecting and Transmitting Clinical Trial Data with Mobile Technologies

Colleen Russell a, Nadir Ammour b,*, Toby Wells a, Nicolas Bonnet c, Matthias Kruse a, Agnes Tardat c, Christel Erales c, Thomas Shook a, Stephane Kirkesseli b, Lionel Hovsepian b, Sy Pretorius a
PMCID: PMC7015356  PMID: 32095763

Abstract

Background

The use of mobile technologies for data capture and transmission has the potential to streamline clinical trials, but researchers lack methods for collecting, processing, and interpreting data from these tools.

Objectives

To assess the performance of a technical platform for collecting and transmitting data from six mobile technologies in the clinic and at home, to apply methods for comparing them to clinical standard devices, and to measure their usability, including how willing subjects were to use them on a regular basis.

Methods

In part 1 of the study, conducted over 3 weeks in the clinic, we tested two device pairs (mobile vs. clinical standard blood pressure monitor and mobile vs. clinical standard spirometer) on 25 healthy volunteers. In part 2 of the study, conducted over 3 days both in the clinic and at home, we tested the same two device pairs as in part 1, plus four additional pairs (mobile vs. clinical standard pulse oximeter, glucose meter, weight scale, and activity monitor), on 22 healthy volunteers.

Results

Data collection reliability was 98.1% in part 1 of the study and 95.8% in part 2 (the percentages exclude the wearable activity monitor, which collects data continuously). In part 1, 20 of 1,049 overall expected measurements were missing (1.9%), and in part 2, 45 of 1,083 were missing (4.2%). The most common reason for missing data was a single malfunctioning spirometer (13 of 20 total missed readings) in part 1, and that the subject did not take the measurement (22 of 45 total missed readings) in part 2. Also in part 2, a higher proportion of at-home measurements than in-clinic readings were missing (12.6 vs. 2.7%). This experimental study was unable to establish repeatability or agreement for every mobile technology; only the pulse oximeter demonstrated repeatability, and only the weight scale demonstrated agreement with the clinical standard device. Most mobile technologies received high “willingness to use” ratings from the subjects on the questionnaires.

Conclusions

This study demonstrated that the wireless data transmission and processing platform was dependable. It also identified three critical areas of study for advancing the use of mobile technologies in clinical research: (1) if a mobile technology captures more than one type of endpoint (such as blood pressure and pulse), repeatability and agreement may need to be established for each endpoint to be included in a clinical trial; (2) researchers need to develop criteria for excluding invalid device readings (to be identified by algorithms in real time) for the population studied using ranges based on accumulated subject data and established norms; and (3) careful examination of a mobile technology's performance (reliability, repeatability, and agreement with accepted reference devices) during pilot testing is essential, even for medical devices approved by regulators.

Keywords: Mobile technologies, Feasibility study, Data transmission, Agreement, Correlation, Usability, Data collection

Introduction

Intensifying interest in the potential of new, portable technology tools such as mobile phones and wearables to transform the conduct of clinical trials is a natural outgrowth of the broader societal trend toward digitization [1]. However, despite widespread hype and hope, few studies have tackled the substantial technical, scientific, and methodological challenges of collecting, processing, and interpreting data from mobile technologies (MTs), including wearables, in formal clinical research [2, 3].

Two powerful forces in the biopharmaceutical industry are converging to accelerate activity in this field: (1) a renewed interest in mitigating the burden on patients and investigators who participate in clinical trials of experimental agents (i.e., a focus on how to make protocols more “patient centric”) and (2) the enticing possibility that frequent (or continuous) data collection from outside the clinic will provide more study endpoints and ones that reflect somewhat less artificial study environments [4].

If mobile data gathering technologies were proven to be at least as meaningful as the current “gold standards” used in clinical trials, widespread use of these tools might, among other things, encourage greater patient participation in clinical research (due to fewer site visits), reduce the costs of conducting studies, and enable sponsors to gather more data to generate evidence about treatment performance in real-world settings [5, 6].

In this study, Sanofi and PAREXEL jointly sponsored the evaluation and technical implementation of MTs that might be suitable for extended use as additional patient evaluation tools in both pre- and post-approval clinical studies. In the context of this work, the term “mobile technology” applies to a medical device or activity tracker that has a connectivity feature allowing recorded data to be transmitted wirelessly to a central server. The study objectives were (1) to assess the reliability of a technical platform for the collection and transmission of data from multiple MTs in the clinic and at home, (2) to explore analytical methods for comparing them to clinical standard devices (CSDs), and (3) to measure their usability, including how willing subjects were to use them on a regular basis.

Subjects and Methods

Because this was a pilot feasibility study to vet the operational aspects of data collection and transmission and to implement statistical methodologies for assessing the repeatability and agreement of MTs versus CSDs, we determined that a test on closely supervised healthy volunteers should precede testing in a patient population. Part 1 of the study was a 3-week, integrated add-on to the multiple ascending dose (MAD) portion of a Sanofi phase I first-in-human study on healthy volunteers. The add-on tested two MTs, a blood pressure monitor [7] and a spirometer [8], on 25 subjects. Measurements were taken in an early phase unit (EPU) clinical setting at pre-defined times (according to the protocol) first by the CSD, then by the MT (Fig. 1; Table 1).

Fig. 1. Mobile technology and clinical standard device pairs tested in part 1 and part 2 of the study.

Table 1.

Mobile technologies and clinical standard devices used in the pilot feasibility study – including the regulatory approval status

| Type | Blood pressure monitor | Spirometer | Pulse oximeter | Glucose monitor | Weight scale | Activity monitor |
| --- | --- | --- | --- | --- | --- | --- |
| Mobile technologies | | | | | | |
| Manufacturer | A&D | Vitalograph | Nonin | Entra | A&D | Striiv |
| Model | UA-767PBT-Ci¹ | asma-1 bt¹ | Onyx II 9560 | MyGlucoHealth | UC-351PBT-Ci | Fusion |
| FDA 510(k)² | Yes | Yes | Yes | Yes | Exempt | No⁵ |
| CE mark³ | Yes | Yes | Yes | Yes | Yes | Yes |
| Clinical standard devices | | | | | | |
| Manufacturer | General Electric Healthcare | Carefusion | General Electric Healthcare | Abbott⁴ | Seca | Philips Respironics |
| Model | Dinamap Procare 400¹ | Masterscope¹ | Dinamap Procare 400 | Freestyle Precision | 930 | Actiwatch |
| FDA 510(k)² | Yes | Yes | Yes | Yes | Exempt | Yes |
| CE mark³ | Yes | Yes | Yes | Yes | Yes | Yes |
¹ These mobile technologies and clinical standard devices were used both in part 1 (add-on part) and part 2 (extension part), while the other four types were used in part 2 only.

² A 510(k) is a premarket submission made to the US FDA to demonstrate that the device to be marketed is at least as safe and effective – that is, substantially equivalent – to a legally marketed device (21 CFR 807.92(a)(3)) that is not subject to premarket approval.

³ A CE mark is a legal requirement to place a device on the market in the European Union and indicates conformity with health, safety, and environmental protection standards for products sold within the European Economic Area.

⁴ The clinical standard is use of a glucose monitor – in this case, we used the Abbott Freestyle Precision glucose monitor – and laboratory results.

⁵ The Striiv Fusion is a consumer-grade activity monitor, not medical grade, and thus is not subject to FDA review or clearance.

Part 2 was a 3-day extension study which tested six MTs (the same two as in part 1 plus four additional MTs: a pulse oximeter [9], a glucose meter [10], a weight scale [11], and a wrist-worn activity monitor [12]) on 22 subjects (Fig. 1; Table 1). A crossover element was implemented: subjects in part 2 were randomly assigned so that half had the MT data captured first, followed by the CSD data, and the other half had the CSD data captured first, followed by the MT data. Participation in part 2 was optional; 18 subjects from part 1 elected to participate, and another 4 were recruited from the parent Sanofi study to participate in part 2 only. Thirteen subjects were randomly assigned to be housed in the EPU for 3 days during the extension study, while 9 were assigned to leave the EPU site on the afternoon of day 1 (after the last data measurement had been taken) and to take the remaining measurements at home using MTs only. In the extension study, four distinct physical tests were conducted to simulate typical clinical tests and to vary the conditions (such as rest, low activity, and high activity) under which data were collected from the MTs (Table 2). All MTs carried the CE mark (a legal requirement to place a device on the market in the European Union), and all MTs except the Striiv activity monitor and the A&D weight scale had received a 510(k) Substantial Equivalence determination from the FDA (Table 1).
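As an illustration only (not the study's actual randomization procedure), the following Python sketch shows one way to balance 22 subjects across the two measurement-order sequences; the sequence labels and seed are invented for the example:

```python
import random

def assign_measurement_order(subject_ids, seed=2017):
    """Illustrative only: split subjects evenly between the two
    device-order sequences used in part 2 (MT first vs. CSD first)."""
    rng = random.Random(seed)  # fixed seed for a reproducible example
    ids = list(subject_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    order = {s: "MT_first" for s in ids[:half]}
    order.update({s: "CSD_first" for s in ids[half:]})
    return order

print(assign_measurement_order(range(1, 23)))  # 22 subjects, 11 per sequence
```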

Table 2.

Summary of measurements, mobile technologies (MTs), clinical standard devices (CSDs), and physical tests conducted during part 2 of the study (extension)

| Measurement | MT | CSD | Physical test |
| --- | --- | --- | --- |
| Blood pressure | A&D Model UA-767PBT-Ci¹ | GE Healthcare Dinamap Procare 400¹ | Modified orthostatic challenge |
| Spirometry | Vitalograph asma-1 bt¹ | Carefusion Masterscope¹ | 8-min exercise on bicycle ergometer |
| Oximetry | Nonin Onyx II 9560 | GE Healthcare Dinamap Procare 400 | Voluntary hyperventilation provocation |
| Glucose | Entra MyGlucoHealth | Abbott Freestyle Precision (plus lab result) | Meal test |
| Weight | A&D Model UC-351PBT-Ci | Seca 930 | Not applicable |
| Activity | Striiv Fusion | Philips Respironics Actiwatch | 8-min exercise on bicycle ergometer |

GE, General Electric.

¹ These MTs and CSDs were used both in part 1 (add-on part) and part 2 (extension part), while the other four were used in part 2 only.

Data from MTs – both in the clinic and at home – were transmitted via a small plug-and-play radio communication unit, the Qualcomm Life 2net Hub (2net Hub) [13], which collected data over Bluetooth and delivered them to a Qualcomm Life 2net Cloud Platform [14] via a cellular network for storage. Platform performance was evaluated with regard to the following metrics: data collection reliability (expected vs. actual number of clinical measurements), missing data (the incidence of and reasons for missed readings), data transmission times (the time between a measurement being taken and the data arriving at the 2net Hub), and data processing times (the time between measurement data being received at the 2net Hub and the data appearing in the PAREXEL results database). The time from a measurement being taken to its appearance in the database is referred to as the end-to-end data flow time.
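As a sketch of how these metrics can be derived, the Python fragment below computes transmission, processing, and end-to-end times from timestamped records; the record structure, field names, and timestamps are invented for illustration and do not reflect the platform's actual data model:

```python
from datetime import datetime
from statistics import median, quantiles

FMT = "%Y-%m-%dT%H:%M:%S"

# Invented records: when a reading was taken, when it reached the
# 2net Hub, and when it appeared in the results database.
readings = [
    {"taken": "2017-03-01T08:00:05", "hub": "2017-03-01T08:00:21", "db": "2017-03-01T08:00:39"},
    {"taken": "2017-03-01T08:15:10", "hub": "2017-03-01T08:19:02", "db": "2017-03-01T08:19:30"},
    {"taken": "2017-03-01T12:30:00", "hub": "2017-03-01T12:30:12", "db": "2017-03-01T12:30:31"},
]

def secs(start, end):
    return (datetime.strptime(end, FMT) - datetime.strptime(start, FMT)).total_seconds()

transmission = [secs(r["taken"], r["hub"]) for r in readings]  # taken -> hub
processing = [secs(r["hub"], r["db"]) for r in readings]       # hub -> database
end_to_end = [secs(r["taken"], r["db"]) for r in readings]     # taken -> database

print("median end-to-end (s):", median(end_to_end))
print("90th percentile end-to-end (s):", quantiles(end_to_end, n=10)[-1])
```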

No electronic reminders to take readings were sent to the subjects; while in the EPU, they were reminded by the EPU staff. For the at-home readings, no reminders of any kind were given. This was intentional, so that the study could assess the impact of the absence of reminders.

This was not a formal MT validation study. However, an important secondary objective was to gain experience in comparing measurements taken by MTs and those taken by CSDs. For five of the MT-CSD pairs, agreement was assessed using two statistical methods: (1) the Bland-Altman approach [15] and (2) a modification of this approach by Francq and Govaerts [16]. Clinical acceptance ranges (CARs) were defined prior to data analysis (Table 3) [17, 18, 19, 20]. The activity monitors were assessed for correlation only (vs. agreement), because the MT activity watch and the CSD activity watch measured different endpoints [21].
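For orientation, here is a minimal Python sketch of the classic Bland-Altman calculation [15], checked against a pre-specified CAR. It assumes one paired reading per subject with invented values; the study's actual analyses, including the Francq-Govaerts modification [16], were more elaborate:

```python
import numpy as np

def bland_altman(mt, csd, car):
    """Limits of agreement per Bland-Altman, assuming one paired reading
    per subject; repeated measures call for the Francq-Govaerts
    extension, which is not sketched here."""
    diff = np.asarray(mt, float) - np.asarray(csd, float)
    bias = diff.mean()
    half_width = 1.96 * diff.std(ddof=1)
    loa = (bias - half_width, bias + half_width)
    within_car = car[0] <= loa[0] and loa[1] <= car[1]
    return bias, loa, within_car

# Invented weight readings (kg) checked against the +/-1.0 kg CAR of Table 3
mt = [71.2, 68.4, 80.1, 75.6, 66.9, 90.3]
csd = [71.3, 68.5, 80.0, 75.8, 67.0, 90.1]
print(bland_altman(mt, csd, car=(-1.0, 1.0)))
```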

Table 3.

Clinical acceptance ranges

| Clinical endpoint | Device comparison (MT vs. CSD) | Range (lower, upper) |
| --- | --- | --- |
| Blood pressure, mm Hg | A&D Model UA-767PBT-Ci vs. GE Healthcare Dinamap Procare 400 | Level 1: (−5, 5); Level 2: (−10, 10); Level 3: (−15, 15) |
| Pulse, bpm | A&D Model UA-767PBT-Ci vs. GE Healthcare Dinamap Procare 400 | Level 1: (−5, 5); Level 2: (−10, 10) |
| FEV1, L | Vitalograph asma-1 bt vs. Carefusion Masterscope | Level 1: (−5, 5); Level 2: (−10, 10) |
| Glucose, mmol/L | (1) Entra MyGlucoHealth vs. Abbott Freestyle Precision; (2) Entra MyGlucoHealth vs. laboratory result; (3) Abbott Freestyle Precision vs. laboratory result | (−0.67, 0.67) |
| Blood oxygen, % | Nonin Onyx II 9560 vs. GE Healthcare Dinamap Procare 400 | (−3, 3) |
| Weight, kg | A&D Model UC-351PBT-Ci vs. Seca 930 | (−1, 1) |

MT, mobile technology; CSD, clinical standard device; GE, General Electric.

The MT population included all data transmitted, whereas the per protocol population excluded invalid device readings. In this study, invalid device readings were defined as data points that were (1) erroneous due to how the device was used (e.g., a family member steps on the weight scale at home, and it transmits data from someone who is not a study subject) or (2) physically impossible (such as an increase in FEV1 of 2 L) and therefore invalid due to device malfunction. A small number of spare devices were available for use in case of malfunction. If a subject noticed an issue with a device, they could raise a concern with the EPU staff, who would alert technical staff so that a substitution could be made. If a subject did not notice an issue, no intervention was made. All MTs had “store-and-forward” capability, whereby a reading that could not immediately be sent was stored on the device until the next available opportunity.
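The following Python sketch illustrates the kind of plausibility screen this definition implies, flagging readings outside a population range or showing physically implausible jumps. The endpoints, limits, and step thresholds are illustrative assumptions, not the study's criteria:

```python
# Hypothetical plausibility screen: flag readings outside population norms
# or with physically implausible changes between consecutive readings.
# All limits below are placeholders, not the study's actual criteria.
PLAUSIBLE_RANGE = {"weight_kg": (30.0, 250.0), "fev1_l": (0.5, 6.0)}
MAX_STEP = {"weight_kg": 3.0, "fev1_l": 1.0}  # max credible change between readings

def flag_invalid(endpoint, values):
    """Return indices of suspect readings in one subject's time series."""
    lo, hi = PLAUSIBLE_RANGE[endpoint]
    flags = set()
    for i, v in enumerate(values):
        if not lo <= v <= hi:
            flags.add(i)
        if i > 0 and abs(v - values[i - 1]) > MAX_STEP[endpoint]:
            flags.add(i)  # e.g., a family member stepping on the scale
    return sorted(flags)

print(flag_invalid("weight_kg", [72.1, 72.3, 45.0, 72.2]))  # -> [2, 3]
```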

Because the research was exploratory, no statistical hypothesis was specified, no power calculations were conducted to determine the sample size, and no p values are reported. Various scenarios were explored for some devices and study parts (e.g., analyses based on data from part 1 days 1–14 pre-dose, or analyses based on data from part 2 tests). The devices were assessed for repeatability (the amount of variation within one subject on the same device under the same conditions) and agreement (the amount of variability and accuracy between the device pair relative to the CAR). Due to the exploratory nature of the study, definitive conclusions about device interchangeability were not made. Nevertheless, the data support general observations on device performance and recommendations for future study designs.

The subjects and investigators using the MTs evaluated the usability (“ease of use”) with a Subject Device Usability Questionnaire (DUQ) [22] and Investigator After-Scenario Questionnaire (ASQ) [23], respectively.

Results

Healthy volunteers were enrolled at a PAREXEL EPU between February 11 and June 13, 2017. In part 1 of the study, data collection reliability across the two devices was high: 98.1% of all expected readings were taken at the correct time and processed successfully. Compliance was slightly higher for the blood pressure monitor, at 99.2%, versus 96.5% for the spirometer. In part 2 of the study, data collection reliability across all devices (excluding the activity monitor) was also high at 95.8%. The devices with the highest compliance were the blood pressure monitor (98.9%) and spirometer (98.1%), followed by the weight scale (93.8%), pulse oximeter (89.9%), and glucose meter (88.9%). The activity monitor collected data continuously, so compliance was calculated differently: of a total of 66 expected data files, 55 (83%) were received and read successfully.

In part 1 of the study, 20 of 1,049 expected readings were missing. The majority of these (13) were due to a single spirometer malfunctioning. Five others were due to subjects not taking the reading, or not taking it at the correct time. Equipment malfunction caused the remaining 2 missing readings. None were due to platform processing issues. In part 2 of the study, of 1,083 expected measurements, 45 were missing (4.2%). By far the most common reason for a missing measurement was that it was not taken (22; 49%), which occurred for all device types. User error with the glucose meter accounted for the second largest category (15; 33% of all missing measurements), and error messages with the pulse oximeter accounted for the third largest (4; 9%). For the activity monitor there were two causes of missing data: the battery on a single device was drained and not recharged (5%), and some devices were turned off before final data transmission on day 3 (9%). All causes of missed measurements were at the device and data collection level; no data were lost in the data processing stages. At-home measurements accounted for a higher proportion of missed measurements than in-clinic readings in part 2 of the study (12.6 vs. 2.7%).

In both parts of the study, the transmission times (the time between a measurement being taken and the data arriving at the 2net Hub) for each device could be split into two categories: fast (when data flow encountered no interruptions) and slow (when there were connectivity issues), the boundary between the two being device dependent. The store-and-forward capability of the MTs studied proved very important, because the volume of messages with “slow” transmission times due to connectivity issues was higher than expected. There was no significant evidence that the transmission times of the measurements taken at home were different from those taken in the clinic. The processing times (the time between measurement data being received by the 2net Hub and the data appearing in the PAREXEL results database) were similar across the device types, whether at home or in the clinic.

An analysis of the end-to-end data flow times showed that in part 2 of the study, the measurement data arrived in the results database a median of 34 s after being taken. Overall, 90% of the measurements appeared in the results database within 11 min 9 s of being taken.

General observations on repeatability, agreement, and user acceptance for all six MTs are summarized in Table 4. This study was not able to establish repeatability or agreement for every MT (see online suppl. Material; see www.karger.com/doi/10.1159/000493883 for all online suppl. material). One MT (the pulse oximeter) demonstrated repeatability within the CAR and one device (the weight scale) demonstrated agreement (vs. the CSD) within the CAR. Below is a discussion of each MT.

Table 4.

Mobile technologies listed by repeatability, agreement, and user acceptance

| Device type and model | Repeatability¹ | Agreement¹ | User acceptance² |
| --- | --- | --- | --- |
| Weight scale: A&D Model UC-351PBT-Ci | Not within CAR | Within CAR | 21/22 (95%) |
| Pulse oximeter: Nonin Onyx II 9560 | Within CAR | Not within CAR | 16/16 (100%) |
| Spirometer: Vitalograph asma-1 bt | Not within CAR | Not within CAR | 19/22 (86%) |
| Blood pressure monitor: A&D Model UA-767PBT-Ci | Not within CAR | Not within CAR | 18/22 (82%) |
| Activity monitor: Striiv Fusion | n/a | n/a | 12/22 (55%) |
| Glucose meter: Entra MyGlucoHealth | Not within CAR | Not within CAR | 16/22 (73%) |

CAR, clinical acceptance range.

¹ Repeatability and agreement were assessed in relation to the CARs shown in Table 3.

² Willingness to use the device more than once per day for more than 1 month (percentage of subjects answering “agree” or “strongly agree”). See online supplementary material for more information regarding user acceptance data.

A&D Wireless Blood Pressure Monitor UA-767PBT-Ci

In the part 1 scenario based on one pre-dose measurement from each device on days 1–14, the systolic blood pressure (SBP) repeatability coefficient for the MT was 22.4 mm Hg, that is, on average, the absolute difference between two readings for the same subject using the MT under similar conditions is expected to be less than 22.4 mm Hg for 95% of the subjects. The SBP results with the CSD were similar (repeatability coefficient = 22.6 mm Hg, excluding an outlier), and both exceeded the highest level of CAR of 15 mm Hg, a reflection of the amount of variability in one SBP measurement.
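For reference, a repeatability coefficient of this kind can be computed as 1.96 · √2 times the within-subject standard deviation; the Python sketch below uses invented SBP replicates, and the study's exact estimator may differ:

```python
import numpy as np

def repeatability_coefficient(replicates_by_subject):
    """Repeatability coefficient, 1.96 * sqrt(2) * within-subject SD:
    for ~95% of subjects, two readings on the same device under the same
    conditions should differ by less than this value. A sketch assuming
    equal replicate counts per subject; the study's estimator may differ."""
    within_var = np.mean([np.var(r, ddof=1) for r in replicates_by_subject])
    return 1.96 * np.sqrt(2.0 * within_var)

# Invented SBP replicates (mm Hg), one list per subject
sbp = [[118, 126, 110], [135, 128, 140], [105, 112, 109]]
rc = repeatability_coefficient(sbp)
print(f"repeatability coefficient: {rc:.1f} mm Hg")
# Averaging k independent readings shrinks the within-subject SD by sqrt(k):
print(f"for means of 3 readings: {rc / np.sqrt(3):.1f} mm Hg")
```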

The difference between the SBP devices in pre-dose assessments from days 1–14 collected at the same time point can be observed with respect to the line of identity (x = y) in Figure 2. Figure 3 shows that both the limits of agreement (LoA) and tolerance intervals are similar to each other but outside the level 3 CAR (Table 3). The LoA/tolerance interval describes the interval within which the difference between an MT and a CSD (based on one measurement) can be expected to lie 95% of the time. The intervals are calculated from the individual standard deviation (SD) estimates, which are a combination of the between- and within-subject variances. In this scenario, the MT appeared to be accurate with respect to the CSD (mean difference, −0.3 mm Hg), but the variability for both devices led to a failure to establish agreement based on one measurement per time point. In a post hoc analysis considering the hypothetical situation based on the average of three measurements at each time point, the predictive intervals from correlated bivariate least-squares regression fell within level 3 CAR when SBP readings were around 110 mm Hg (Fig. 4). The diastolic blood pressure findings from part 2 morning readings for in-clinic subjects compared to level 1 CAR were consistent with SBP, although a small bias was observed (2.5 mm Hg). These data confirm that one blood pressure reading is highly variable, may not be representative of the true underlying blood pressure, and may make it difficult to establish agreement between devices. In addition, since blood pressure and pulse have different magnitudes and relative variability, the behavior of each biometric from the MT relative to the CSD should be fully understood if planned for use in a clinical trial.

Fig. 2. Plot of systolic blood pressure comparing mobile technology (MT) to clinical standard device (CSD) pre-dose on days 1–14 with the x = y line of identity during part 1 of the study. Each symbol represents an individual's measurement pairs over all time points.

Fig. 3. Plot of systolic blood pressure (SBP) difference versus average SBP with limits of agreement (LoA) and tolerance intervals (TI) pre-dose on days 1–14 during part 1 of the study. “×” = paired differences at planned time points; “○” = by-subject average difference. MT, mobile technology; CSD, clinical standard device.

Fig. 4. Systolic blood pressure (SBP) correlated bivariate least-squares regression for the average of three observations post hoc during part 1 of the study. “○” = by-subject average difference. MT, mobile technology; CSD, clinical standard device; PI, prediction interval based on 1 future observation; GI, prediction interval based on the average of 3 future observations; CB, confidence band for the regression line.

Vitalograph asma-1 bt Spirometer

The mean difference in both parts of the study was a negative bias, with more MT results being lower than the respective CSD results. FEV1 is a subject-dependent, effort-driven clinical parameter. Since the CSD in part 1 was used first at each measurement time and multiple FEV1 samples were collected in the MAD study (up to 8 trials), it was hypothesized that subject fatigue could have contributed to the lower MT readings. In part 2, the variability and bias were instead attributed to the clinical setting and conduct of sample collection. For the CSD, site staff accompanied the subjects and encouraged them to blow hard into the device in order to capture the maximum of three trials at each reading. For the MT, the subjects were provided some instruction and then left alone to conduct the three trials to simulate a home environment. These factors likely contributed to the bias and lack of repeatability.

Nonin Onyx II 9560 Pulse Oximeter

The Nonin Onyx II 9560 demonstrated repeatability in the scenario where three morning readings were taken by the subjects who took the device home. For the hyperventilation test, the LoA were outside of but close to the CAR. The narrow range of results from the healthy population and the upper limit of 100% may have been factors; an alternative data transformation, or testing the device in the disease population, could clarify this issue.

Entra MyGlucoHealth Glucose Meter

The Entra MyGlucoHealth measurements were consistently higher than both the standard glucose meter and the laboratory results (Fig. 5). Although these data were collected during a meal test (collected before a meal in a fasted state and after the meal at 30, 60, 90, and 120 min), the high degree of bias and variability was unexpected.

Fig. 5. Plot of glucose comparing mobile technology (MT) to laboratory results (CSL) from the meal test with the x = y line of identity during part 2 of the study (extension).

A&D Weight Scale UC-351PBT-Ci

Due to day-to-day weight fluctuations, the repeatability coefficient over three morning measurements was relatively high, at 1.5 kg for both devices; i.e., two measurements taken from the same subject on the same device are expected to be within 1.5 kg for 95% of the subjects. To assess agreement, day-to-day variability in weight was accounted for by using the difference between daily morning measurements from both devices. The LoA were within the CAR of ±1.0 kg, and the difference between the two devices was negligible (−0.09 kg).

Striiv Fusion Activity Watch

This version of the Striiv Fusion is primarily a step counter; it captures actual steps above a level of intensity (in contrast, the CSD, Actiwatch, can measure small motor activity). Motion such as cycling cannot be accurately counted as a step and will result in false-positive data. Thus, the study was not able to measure the correlation between total steps per hour from the Striiv and total activity per hour from the Actiwatch.

Usability Questionnaires

Most MTs registered high “willingness to use” ratings (defined as an answer of “agree” or “strongly agree” from subjects when asked whether they would be willing to use the device more than once per day for more than 1 month) in the DUQ (Table 4; see online suppl. material). The ASQ scores indicate that the investigators (5 total in part 1 and 2 in part 2) expressed moderate-to-high satisfaction with the blood pressure monitor and spirometer and did not express dissatisfaction with any of the devices. The DUQ ratings collected on day 3 indicate that the majority of subjects expressed moderate-to-high “overall” satisfaction with all of the devices.

Discussion/Conclusion

MTs should not be used in clinical research unless the wireless data flow and processing platform are dependable. This study demonstrated that the technical platform produced reliable and timely results, with little data loss (the minimal losses were primarily due to measurements not taken, user error, and malfunctioning devices). The device instructions for subjects and site training contributed to a high level of data collection reliability.

That 90% of the data were delivered to the main PAREXEL results database within 12 min demonstrates the potential of marrying MTs with wireless transmission to a clinical trial database. Nevertheless, the study highlighted potential problems with at-home data collection. Although most readings were actionable within a minute, connectivity issues between the MTs and transmission devices caused significant delays in some cases, and transmission times varied significantly between devices. Both factors must be considered when developing real-time algorithms. No active data monitoring occurred during the study, which contributed to a large proportion of the missed readings. Feedback to the subject when an expected reading had been missed might have prevented future missed readings due to forgetfulness, or triggered an investigation that would have identified a fault more quickly. Equally, monitoring device error messages or low battery levels could lead to interventions that prevent missed readings. Thus, this study highlighted the importance of a comprehensive device and data monitoring system that feeds back to the subject.
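A minimal sketch of the missed-reading monitor argued for here might look as follows; the measurement schedule, grace period, and notification channel are all assumptions for illustration:

```python
from datetime import datetime, timedelta

# Sketch of a missed-reading monitor of the kind the discussion calls for;
# the 30-min grace period and the reminder text are invented assumptions.
GRACE = timedelta(minutes=30)

def overdue_readings(expected_times, received_times, now):
    """Expected measurement times with no reading within the grace period."""
    return [t for t in expected_times
            if t + GRACE < now and not any(t <= r <= t + GRACE for r in received_times)]

now = datetime(2017, 3, 2, 10, 0)
expected = [datetime(2017, 3, 2, 8, 0), datetime(2017, 3, 2, 9, 0)]
received = [datetime(2017, 3, 2, 8, 10)]
for t in overdue_readings(expected, received, now):
    # In practice this would trigger a push notification, SMS, or site alert.
    print(f"Reminder: reading scheduled for {t:%H:%M} not received")
```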

There are limitations to the generalizability of our data. We studied healthy volunteers and results may not be the same with actual patients. Our sample sizes may not have been adequate to account for variables such as age, ethnicity, socioeconomic status, or educational level. Finally, since the study participants were compensated, they may have been less likely to express dissatisfaction with a device in the questionnaire.

The process of comparing the data from the MTs with those from the CSDs identified at least three critical areas of study for improving methods that will enable MTs to streamline clinical trials.

First, if an MT captures more than one type of endpoint (such as blood pressure and pulse), agreement between the MT-CSD device pair may need to be established for each endpoint to be included in a clinical trial. Specifically for the use of blood pressure monitors, we recommend more than one blood pressure reading per time point, because blood pressure is highly variable and one reading may not be representative of the true underlying blood pressure.

Second, researchers need to develop criteria for excluding invalid device readings from the intent-to-treat population using ranges based on accumulated subject data and established norms. To handle data that may not be valid, algorithms to identify suspect readings will need to be set up prospectively, preferably via the data transmission platform. It will be essential to have mobile phone apps that identify potentially erroneous and missing data and send prompts to subjects or patients in real time. Alternatively, a companion app could offer to the patient the ability to mark erroneous data by flagging them.

Finally, a key finding of this study is that documenting the performance of MTs (i.e., reliability, repeatability, and agreement with accepted reference devices) during pilot testing is essential, even for 510(k)-cleared MTs and those with a CE mark, to set expectations for use in clinical trials [6]. Study designs can then account for all known factors. MTs which fail to demonstrate reliability, repeatability, and agreement with reference devices for appropriately selected endpoints may need to be recalibrated or replaced with alternate MTs.

Statement of Ethics

The subjects (or their parents or guardians) enrolled in this study have given their written informed consent. Prior to the initiation of the study, the final clinical study protocol, subject information sheet, and informed consent form were approved by the Ärztekammer Berlin Ethics Committee.

Disclosure Statement

C.R., T.W., M.K., T.S., and S.P. are employed by PAREXEL International. N.A., N.B., A.T., C.E., S.K., and L.H. are employed by Sanofi-Aventis Recherche & Développement. The authors have no other conflicts of interest to declare.

Funding Sources

Both Sanofi-Aventis Recherche & Développement and PAREXEL funded the study and covered costs related to protocol development, study execution, and data collection and analysis. More than 50% of the direct and indirect costs were absorbed by PAREXEL.

Author Contributions

C.R. contributed to the secondary endpoint data analysis and reviewed the manuscript. T.W. contributed to the primary endpoint data analysis and reviewed the manuscript. T.S. contributed to the study design and reviewed the manuscript. S.P. reviewed the manuscript. M.K. was involved in the setup and design of the study, analyzed data, contributed to the original clinical study report, and reviewed the manuscript. N.A. contributed to the launch of this project and development of the study protocol and reviewed the final report. N.B. contributed to the development of the study protocol and statistical analysis plan and reviewed the study report. C.E. helped coordinate the conduct of the study and the collaboration with PAREXEL and reviewed the study report. A.T. acted as clinical trial manager of the phase I study that provided the subjects and clinical standard data for comparison to the wearable device data. S.K. contributed to the development of the study protocol and reviewed the final report. L.H. contributed to the review of the study report.

Supplementary Material

Supplementary data

Acknowledgements

Jean-Marc LeBideau provided the technical coordination and integration of platform technology with test devices. Ramona Borst served as Clinical Research Coordinator for Sanofi MAD study conduct and device measurements. Kay Reichenbach served as Clinical Research Coordinator for pilot study conduct, device measurements, and contribution to the development of the platform. Laurent Venerucci and Jens Janiszewski helped with device selection and testing, preparation of device kits, and allocation to subjects. Cliford Mbonbowo integrated data across platforms and created data sets. Alain Afios served as the Information Technology Solution Architect, helping with the review and design of the data flow, data collection, and data transmission.

References
