PLOS One. 2021 Jan 6;16(1):e0244302. doi: 10.1371/journal.pone.0244302

Insights into mobile health application market via a content analysis of marketplace data with machine learning

Gokhan Aydin 1,*, Gokhan Silahtaroglu 2
Editor: Farrukh Aslam Khan 3
PMCID: PMC7787530  PMID: 33406100

Abstract

Background

Despite the benefits offered by an abundance of health applications promoted on app marketplaces (e.g., Google Play Store), the wide adoption of mobile health and e-health apps is yet to come.

Objective

This study aims to investigate the current landscape of smartphone apps that focus on improving and sustaining health and wellbeing. Understanding the categories that popular apps focus on and the features provided to users that lead to higher user scores and downloads will offer insights to enable higher adoption in the general populace. This study of 1,000 mobile health applications aims to shed light on the reasons why particular apps are liked and adopted while many are not.

Methods

User-generated data (i.e. review scores) and company-generated data (i.e. app descriptions) were collected from app marketplaces and manually coded and categorized by two researchers. For analysis, Artificial Neural Networks, Random Forest and Naïve Bayes Artificial Intelligence algorithms were used.

Results

The analysis identified features that attracted more downloads and higher user scores. The findings suggest that apps that mention a privacy policy or provide videos in their description lead to higher user scores, whereas free apps with in-app purchase possibilities, social networking and sharing features, and feedback mechanisms lead to a higher number of downloads. Moreover, differences in user scores and the total number of downloads are detected in distinct subcategories of mobile health apps.

Conclusion

This study contributes to the current knowledge of m-health application use by reviewing mobile health applications using content analysis and machine learning algorithms. The content analysis adds significant value by providing classification, keywords and factors that influence download behavior and user scores in an m-health context.

Introduction

Maintaining public health costs and sustaining or improving public health are becoming harder throughout the world with the steady increase in median age and related diseases (e.g., diabetes, high blood pressure, etc.). This phenomenon is expected to lead to congestion in healthcare systems and higher healthcare costs. Consequently, preventive measures that can lead to a healthier life even at a later age emerge as a likely solution [1,2]. Given the recent Covid-19 pandemic, the fragility of healthcare systems worldwide has become more evident. Within this context, the effective use of mobile technologies and devices as interventions in sustaining health and well-being comes forward as a promising avenue for policymakers and relevant stakeholders, and it is drawing growing interest from researchers as well [3–6]. Moreover, mobile devices and apps are used more frequently due to the remote working and lockdowns that resulted from the Covid-19 pandemic, which has also drawn the interest of the public towards health & fitness apps [7]. However, to offer any value to the general public, mobile apps must be downloaded, installed, and actively used. As of 2019, more than seven billion people (95% of the global population) live in an area covered by a mobile cellular network (GSM). Furthermore, mobile broadband subscriptions, which are required for the effective use of smart mobile devices such as smartphones and tablets, have grown by more than 20% annually over the past five years and reached 4.1 billion globally by the end of 2019 [8]. Thus, the infrastructure for mobile health (m-health) initiatives is in place in most locations. Nevertheless, among the hundreds of thousands of mobile applications on display in mobile app marketplaces, attracting attention to a particular app is not straightforward, and the success of individual apps is related to the download behavior of users.
Certain apps are perceived to be more successful and effective than others, yet most are free to download. Considering similar functionalities, cost alone may not be indicative of performance or effectiveness entirely. Moreover, bearing in mind the different categories of health and wellbeing that mobile apps may focus on, different features may be influential on user choice and sentiment [9,10].

Against this backdrop, this study aims to be instrumental for policymakers, health institutions and mobile app developers by contributing to the discussion on e-health and m-health usage behavior and by providing practical implications on ways to increase m-health app use. Cultural factors and ever-changing technology and applications lead to a need for continued scientific studies to identify the best practices and pave the way to wider adoption of mobile applications for sustaining and improving health. Within this context, by analyzing both user-generated data (i.e. user review scores) and company-generated data (app definition, description and technical data), this study aims to:

  • Describe the m-healthcare characteristics of apps available on Google Play Store.

  • Determine the categories that m-health apps focus on and highlight the neglected areas.

  • Determine the app features and categories that users perceive more positively (i.e. higher user scores).

  • Identify the best practices and features (e.g., keywords) through a content analysis of application descriptions.

  • Test for potential relationships between data protection practices highlighted in app descriptions and download behavior/scores.

  • Identify the barriers and pain points users highlight in each mobile app subcategory.

  • Provide actionable insights such as features and keywords that can be used in promoting m-health apps.

Relevant studies have mostly been carried out in western countries, particularly the US, and a research gap is evident in emerging countries. The adaptability of these studies’ findings to emerging economies is questionable due to factors such as legislation and culture. Thus, findings from this study can be used to inform future policy development and the planning, design and development of m-health apps, specifically in emerging economies.

This article is organized in three main sections. The first part of this paper introduces different ways mobile apps can be used for improving and sustaining health and common relevant classification methods. This part also provides information on mobile app features, privacy concerns and the effect of price on mobile app use. The second part examines the research methodology, which is followed by the data analysis section, where the data analysis algorithms utilized along with the results are provided. Finally, we discuss the findings in the discussion section and finalize the paper by offering possible future research directions in the conclusion section.

Background and related work

Mobile health and health applications

The rapid adoption of smart devices in the last decade has greatly contributed to the promise of using mobile technologies for health improvement. The m-health (i.e. mHealth) term also emerged, which was defined by the World Health Organization (WHO) as: “medical and public health practice supported by mobile devices, such as mobile phones, patient monitoring devices, personal digital assistants (PDAs), and other wireless devices” [11]. Mobile devices provide good platforms for developers to design third-party apps called mobile apps, software programs that are specifically designed to run on mobile devices to improve the functionality of mobile devices. Mobile applications installed on the mobile devices can utilize hardware and sensors (i.e. accelerometers, gyroscopes, magnetometers, sensors to measure heart rate, geo sensors GPS and cameras) to obtain the desired outputs. Consequently, mobile apps provide new methods for the continuous monitoring of biological, behavioral or environmental data, health indicators, and trends related to health behavior. Mobile apps can help change attitudes and behaviors by distributing, collecting, processing and interpreting health-related information and by enabling interventions [12,13]. Therefore, various objectives may be met through mobile applications targeting a wide range of user groups. It is possible to develop applications targeting healthcare professionals, healthcare recipients and the general public. Mobile devices and applications are being used for the rapid delivery of clinical information to healthcare workers to equip rural healthcare personnel with up-to-date information in both developed and underdeveloped countries. Yet, since the applications targeting health personnel are not within the scope of this study, only the applications targeting general public and healthcare recipients are discussed from this point on. 
Such mobile health interventions have achieved success in varying degrees in adherence to medication and treatment outcomes [14]. The effectiveness of various cell phone application and cell phone text message interventions has been tested in clinical trials in areas such as diabetes control [15], hypertension control [16], and adherence to medication [17]. In cases where treatment is complex, such as with cancer patients, mobile applications are also used to improve health literacy in order to improve compliance [18]. Similarly, systematic reviews on the use of various smartphone apps and exercise platforms to improve diet, physical activity and sedentary behavior, and on the effectiveness of related interventions, have shown positive yet modest effects [12,19,20]. Mobile apps were also found to be influential in improving emotional and mental health. In a relevant study, it was observed that self-monitoring of personal mood with a mobile application reduces symptoms related to depression and anxiety and improves emotional well-being [21]. Consequently, mobile applications may be used as preventive medical tools in a multitude of ways [2,10,14].

Several studies have been conducted to analyze the features of mobile health software on app marketplaces, yet most of them have focused on a particular application area. Correspondingly, the m-health app use literature is diversified and numerous subcategories exist. Depending on the use case, mobile apps may be used as information sources, journals and personal digital assistants for quitting smoking, healthy eating, reducing calorie intake, increasing physical activity levels, communicating with the health system, improving adherence to treatments (e.g., on-time drug intake), medical monitoring and more [9,10,20,22–25]. Given the variety of apps available to the general public, only a limited number of studies analyzed all available categories [e.g., 4], creating a research gap. Such ambitious studies called for solid frameworks to categorize the available apps and utilized several popular health education, planning and promotion models. One such framework is the PRECEDE-PROCEED model (PPM) of health planning and health education; another is the Health Education Curriculum Analysis Tool (HECAT) health education content classification [4,26]. Additionally, disease management models such as the WHO Global Burden of Disease [27] have been utilized in relevant studies. The PPM is a widely applied ecological approach to the planning and promotion of health interventions [28]. The PPM has been applied to the m-health applications context by adopting the tripartite structure of predisposing, enabling, and reinforcing factors [4,26]. Within this framework, predisposing factors/mobile applications try to influence attitudes by providing information and increasing awareness of conditions and/or health outcomes, as well as by changing beliefs and attitudes to establish confidence among users so that they can change their behavior to avoid adverse outcomes. 
Enabling factors/mobile applications, on the other hand, aim to change behavior and formed habits by providing an opportunity to learn a new skill and to follow up on the progress of a subject (e.g., applications that allow daily/monthly recordings and follow-ups of running/cycling times and duration while doing sports). Lastly, reinforcing factors/applications aim to encourage certain behaviors that will help in improving and sustaining health through different reward and feedback systems provided to users [4,29]. For instance, apps with auto-sharing to social media sites such as Facebook, or apps that provide ways to communicate and get feedback from an online coach, are considered in this category.

Another framework used in the literature for categorizing m-health apps, which is also used in the present study, is HECAT [4] by the Centers for Disease Control and Prevention [30]. This framework focuses on health education curricula provided to students. The categories considered within the HECAT health education content classification are as follows: Alcohol and Other Drugs, Healthy Eating, Mental and Emotional Health, Personal Health and Wellness, Physical Activity, Safety, Sexual Health, Tobacco, Violence Prevention, Comprehensive Health Education.

Factors affecting m-health application use

Mobile application features

There are certain macro-level barriers to the use of m-health applications, such as low technology literacy, low income, limited access to mobile devices, and lack of infrastructure. Yet, certain other barriers are perceived barriers and may be influenced by mobile app developers and sponsors. The findings obtained in a large-scale study in the US have indicated that a significant part of the population does not use healthcare applications due to hidden or visible costs, high data entry burden, complex systems, and data security concerns [9]. Understanding and addressing user concerns is critical to ensure wide use of these applications. Perceptions of functionality, performance, trustworthiness, ease of use (e.g., interface, time required to learn etc.), and certain privacy concerns that are influential in usage [31] may be addressed by application developers through the careful design of mobile apps and communication in mobile marketplaces [32]. For instance, it has been shown that informational content, organizational attributes, technology-related features, and user control factors influence the trustworthiness of m-health applications [33]. Moreover, mobile health app features are a research area that has attracted the interest of researchers due to its significance in usage behavior [34]. Considering that these factors are commonly assessed by users while browsing app stores without experiencing the app itself, the proper use of available tools such as app descriptions is vital to success.

Pricing

Pricing and costs associated with using the app have been considered in the literature as a significant factor that hinders the wider adoption of mobile devices and apps. Several studies indicated that the costs associated with mobile app use are a barrier to adoption [9,10,35,36]. Moreover, increased app prices are shown to decrease app sales [37], yet most apps are offered free of charge by developers to users in several contexts [e.g., 38]. The revenue is generated from advertising displayed to users or from in-app purchases for premium features, higher convenience, etc. The latter model is also called the freemium model, which has been tried and tested in software and mobile application contexts. Thus, offering basic functionality free of charge, while offering in-app purchase options for higher functionality, is a viable model that is also used in the health app context [39,40].

Privacy, information disclosure and relevant legislations

Information privacy is a delicate topic in m-health apps context due to the personal and sensitive nature of the information gathered from the users [41]. Personal information is used to establish functionality and provide value to users but also to determine the content of the advertisements to be displayed to users in the case of free applications. Advertisement content such as for medical products may or may not be certified and their effectiveness or safety is questionable given the lack of effective control mechanisms. Keeping the personal information that is collected through apps secure and being transparent are vital issues for mobile application developers and sponsors [10,42]. User privacy concerns should be addressed and compliance with legal regulations and ethical concerns should be met [9,41,43]. Although attempts have been made to establish standards for mobile applications being released in the field of health, unfortunately, there is no globally accepted framework. The American Food and Drug Administration (FDA) emphasized the necessity of maintaining certain technical standards and data protection in the US and tried to set certain standards in the mobile application ecosystem. The FDA aims to establish standards within this framework in private institutions to develop applications that offer “reliable content, protect information privacy and security, and work as promised” [44]. Regardless of the efforts, control mechanisms are not sufficient in practice and health apps that are estimated to be in the hundreds of thousands, carry many security concerns as keeping the information secure is not straightforward [45,46]. Studies on mobile apps suggest that apps targeting patients could perform better, particularly regarding privacy and security issues [e.g., 47]. Despite its significance, privacy concerns have not been addressed properly in most m-health apps as studies suggest. 
For instance, in a review study on health applications in mobile application markets, only 30% of the 600 mobile applications evaluated were found to have a privacy policy. Moreover, approximately two-thirds of the available privacy policies were generic [48]. These findings indicate that there are deficiencies in providing the necessary transparency regarding the information collected and its security. This may be among the reasons why users prefer not to download certain apps.

Research methodology

To attain the research objectives, a cross-sectional study of m-health apps was performed to characterize and classify apps. Publicly available data on free and paid apps provided by app developers and users on the Google Play app store were collected, coded, classified and analyzed as detailed throughout this section. The Istanbul Medipol University Ethical Committee of Non-invasive Clinical Trials specifically approved the present study (document no: 10840098–604.01.01-E.8356).

Selection, data collection and screening

Mobile apps were selected by browsing the health and wellness category of the Google Play app store and carrying out searches using ‘health’ and ‘wellness’ as keywords. The data on apps were collected in the second half of January 2020 via an application programming interface (API) in a Python programming environment. Data on 520 free and 520 paid mobile applications classified under the health and wellness category were retrieved through the API. The app selection and exclusion process can be seen in Fig 1.

Fig 1. App selection process.

Data coding and classification and validation

In the data coding stage, 30 duplicates and 36 apps belonging to other categories or having different purposes (e.g., games, gym membership apps etc.) were left out of further analysis. App description data were cleaned with text mining tools. Firstly, all descriptions were translated into a single language (i.e. Turkish), as not all were in the same language. After that, the texts were tokenized and stemmed (i.e. words were converted to their base forms). In this process, for example, ‘goes’, ‘going’, ‘went’ and ‘gone’ were converted to their base form, ‘go’. Then, stop word (i.e. the most common words in a language) and punctuation filtering were applied to the data. In this stage, words such as ‘a’, ‘an’, ‘the’, ‘but’ and ‘only’ were filtered out. Additionally, punctuation marks such as ‘?’ and ‘;’ and bullet points were removed from the dataset. After this process, all text data were vectorized, in other words converted to binary dummy variables. Since this creates a large dataset that is hard to administer, frequency elimination was applied. Classical ways of calculating the weighting of words are Term Frequency (TF) and Document Frequency (DF). DF is the number of documents in which the term appears. One may think of the definite article ‘the’, which occurs more often than any other word in an English text. However, this does not mean that ‘the’ is more informative than other words. For this reason, the Inverse Document Frequency (IDF) method was used. This method weights each word inversely to the number of documents in which it occurs, considering all reviews. In this way, if a word occurs in too many of the documents, its weight drops. The vectorized description data were merged with the other variables. In Eq (1), fr_{t,d} is the frequency of each term in the documents, while ‘d’ represents the number of documents in which the term occurs. In our case d is the number of reviews.

$$TF_{t,d} = \frac{fr_{t,d}}{\sqrt{\sum_{t=1}^{n} fr_{t,d}^{2}}} \tag{1}$$

Inverse document frequency (IDF) is a measure which aims to capture how much information a term gives. The IDF measure does not depend on whether the term occurs frequently or rarely within a single document. It takes one or more documents in the corpus. N is the total number of documents in the corpus and n_t is the number of documents in which the term appears. If the term does not occur in any document, n_t = 0, which leads to a division-by-zero case. To avoid this problem, the denominator is smoothed as 1 + n_t.

$$TFIDF = TF_{t,d} \cdot \log\left(\frac{N}{n_t + 1}\right) \tag{2}$$
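The weighting described by Eqs (1) and (2) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: it assumes Eq (1) divides raw term counts by their Euclidean norm, that Eq (2) uses the natural logarithm of N / (n_t + 1), and it skips translation and stemming. The names `STOP_WORDS`, `tokenize`, `term_frequency` and `tf_idf`, as well as the sample descriptions, are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list; the study's actual list is not given.
STOP_WORDS = {"a", "an", "the", "but", "only"}


def tokenize(text):
    """Lowercase the text, drop punctuation and stop words (no stemming here)."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def term_frequency(tokens):
    """Assumed form of Eq (1): raw count divided by the L2 norm of all counts."""
    counts = Counter(tokens)
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {t: c / norm for t, c in counts.items()} if norm else {}


def tf_idf(documents):
    """Assumed form of Eq (2): TF * log(N / (n_t + 1)) for every term."""
    tokenized = [tokenize(d) for d in documents]
    n_docs = len(tokenized)
    # n_t: number of documents in which each term appears at least once.
    doc_freq = Counter(t for tokens in tokenized for t in set(tokens))
    weights = []
    for tokens in tokenized:
        tf = term_frequency(tokens)
        weights.append(
            {t: v * math.log(n_docs / (doc_freq[t] + 1)) for t, v in tf.items()}
        )
    return weights


# Made-up app descriptions, for illustration only.
descriptions = [
    "Track your running and cycling, the only tracker you need.",
    "Calorie counter and diet tracker.",
    "Meditation and sleep sounds.",
    "Quit smoking tracker.",
]
weights = tf_idf(descriptions)
```

In this toy corpus, ‘tracker’ appears in three of the four descriptions and receives a weight of zero, while a term unique to one description, such as ‘meditation’, keeps a positive weight. With this smoothing, a term that appears in every document would receive a slightly negative weight.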

Regarding the manual data classification, two distinct frameworks were used to categorize the mobile applications. First, the PPM [28] was utilized to categorize mobile applications with regard to how they promote health (i.e. their major aims and how they provide value to users). Within this framework, mobile applications were classified as predisposing, reinforcing, or enabling. This model has been widely used in health education planning, promotion, diagnosis and evaluation activities [49]. Similar to the methodology followed by West et al. [4] in a related study, the applications have been grouped with regard to the PPM framework in this study as well. Consequently, the coders coded the applications as predisposing, enabling or reinforcing. Within this framework, m-health apps were coded as predisposing if they provided health information and statistics related to influencing attitudes, knowledge, awareness, beliefs and values, or if they aimed to increase the confidence, motivation or self-efficacy of users (e.g., an application that provides smoking and cancer related statistics; ways to avoid adverse health outcomes). If the apps were used for tracking or recording status or progress (e.g., weight or calorie counting, geo-locating running/biking activities), or they facilitated behavior by teaching a skill (e.g., an app showing images or videos on proper posture), they were coded as enabling. Enabling apps are commonly used at the same time as the desired behavior. The apps were coded as reinforcing if they interfaced with a social networking site (e.g., apps with an automatic upload to Facebook), provided encouragement from trainers/coaches (e.g., an app that featured easy communication with a trainer), or included an evaluation based upon the user’s self-monitoring (e.g., an app that provided automated notifications). In the event that an app could be considered both enabling and reinforcing, the reinforcing category was coded. 
If an app could be considered both predisposing and enabling, the enabling category was coded. Examples of applications that were coded as predisposing are Health Articles, Info & Motivation–Lets Healthify, Ketogenic Diet and EO Guide. Applications that were coded as reinforcing are Drink water reminder, Soccer Training and Quit Smoking Tracker GOLD; those coded as enabling are Period Tracker Mia, YAZIO Calorie Counter, Nutrition Diary & Diet Plan, and Calorie Counter–MyFitnessPal.
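The tie-breaking rules above amount to a precedence order among the three PPM codes. A minimal sketch follows, assuming the two stated rules chain into reinforcing > enabling > predisposing; the case of an app that is both predisposing and reinforcing is not covered in the text, so resolving it to reinforcing is an assumption. The function name `resolve_ppm_category` is hypothetical.

```python
# Precedence implied by the two stated rules; the full chain is an assumption.
PPM_PRECEDENCE = ("reinforcing", "enabling", "predisposing")


def resolve_ppm_category(candidates):
    """Return the single PPM code for an app that fits one or more categories."""
    labels = {c.lower() for c in candidates}
    unknown = labels - set(PPM_PRECEDENCE)
    if unknown:
        raise ValueError(f"unknown PPM label(s): {sorted(unknown)}")
    # Walk the precedence order and return the first category that applies.
    for category in PPM_PRECEDENCE:
        if category in labels:
            return category
    raise ValueError("at least one candidate category is required")
```

For example, `resolve_ppm_category(["enabling", "reinforcing"])` yields `"reinforcing"`, and `resolve_ppm_category(["predisposing", "enabling"])` yields `"enabling"`, matching the two rules stated in the text.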

The second classification method used is HECAT health education content classification by the Centers for Disease Control and Prevention [30]. The categories of coding within the HECAT health education content classification are: AOD: Alcohol and Other Drugs, HE: Healthy Eating, MEH: Mental and Emotional Health, PHW: Personal Health and Wellness, PA: Physical Activity, S: Safety, SH: Sexual Health, T: Tobacco, V: Violence Prevention, CHE: Comprehensive Health Education.

Two research assistants working at a management information systems department with a focus on healthcare, who provided informed consent to participate in this study, were trained by the authors to collect and code application data into the aforementioned categories based on the two frameworks discussed. Two training sessions were held over the course of data coding by two authors and two research assistants, one before coding and another after the coding of 60 apps in a pilot study. The manual coding detailed in this section helped in obtaining further variables (i.e. the PPM and HECAT categories, app sponsor type, privacy policy availability) in addition to the information retrieved from Google Play Store. Table 1 presents all of the metrics and qualitative parameters that were extracted in this manner and fed into the machine learning models. These predictor variables were used in the analysis to predict the total number of downloads. Supervised learning algorithms need a class variable for training using the available data. In the current study, ‘the total number of downloads’ was chosen as the class variable parallel to the main aim of this study, which is to discover the ways to improve the use of m-health apps.

Table 1. Code sheet: Metrics and parameters.

Variable | Values | Description
Video | Yes–No | Whether there is a video provided by the developer/sponsor in the app description or not.
Description | Free text | Text provided by app developer to describe the app on marketplace.
Editor’s Choice | Yes–No | Editor’s Choice badge given to app or not.
Free | Yes–No | Free or paid app.
Price | 0 or a double value in US dollars | Price to be paid to run the app; recoded into 5 categories (see Table 2).
Days since last update | Integer between 11–2371 | Days since the app was last updated as of the data collection date. Recoded into 5 categories (see Table 2).
In App Purchase | Yes–No | In-app purchases provided or not.
Size | Double value in bytes | Size of the downloaded app package.
Required Android Version | Text | Name of the version such as 4.4 or up, 5.0 or up.
Content Rating | Everyone or Teen | Everyone or Teen with warning such as ‘use of alcohol, gambling, language’.
Interactive Elements | None, Digital Purchases, Users Interact, Shares Location, Shares Info | Info on interactivity provided in the app such as purchases, sharing and user interactions.
Score | Double value between 1–5 | Score provided by users to the app (min. 5 reviews required for score).
Number of Reviews | Integer between 0–2,125,979 | Total number of reviews users have provided on an app.
Installs | 10+, 50+, 100+, 500+, 1K+, 5K+, 10K+, 50K+, 100K+, 500K+, 1M+, 5M+, 10M+ | Number of times app is installed (categorized by Google app market).
App Sponsor/Origin | Government, large corporation, SME-Developer, individual | The main sponsor of the app.
Privacy Policy | Yes–No | Whether privacy policy is mentioned in description or not.
HECAT App Category | AOD: Alcohol & Other Drugs, HE: Healthy Eating, MEH: Mental & Emotional Health, PHW: Personal Health & Wellness, PA: Physical Activity, S: Safety, SH: Sexual Health, T: Tobacco, V: Violence Prevention, CHE: Comprehensive Health Education | A new category ‘Maternal and Infant Health’ was amended to the existing HECAT classification given that there are several apps aiming at new mothers and parents. AOD and T categories are joined as TAOD to carry out certain analysis due to the low number of apps in these categories.
PPM App Category | Predisposing, Enabling, Reinforcing | Details on PPM classification are provided in Section 2.1.

Inter-coder reliability was computed on 120 apps (approximately 12% of the total dataset) after the apps were coded. The degree of agreement between the two coders was calculated and the overall 87% agreement figure led to the conclusion that there was no significant inter-coder issue.
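The agreement figure reported above is a simple percentage of apps on which both coders assigned the same code. A minimal sketch of that calculation follows; the function name `percent_agreement` and the sample labels are hypothetical and do not come from the study's data.

```python
def percent_agreement(coder_a, coder_b):
    """Share of items on which two coders assigned the same label."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty set of apps")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)


# Made-up PPM codes for eight apps, for illustration only.
coder_1 = ["enabling", "enabling", "reinforcing", "predisposing",
           "enabling", "reinforcing", "enabling", "predisposing"]
coder_2 = ["enabling", "enabling", "reinforcing", "enabling",
           "enabling", "reinforcing", "enabling", "predisposing"]
agreement = percent_agreement(coder_1, coder_2)
```

Here the coders disagree on one of eight apps, so `agreement` is 0.875. Percentage agreement does not correct for agreement expected by chance.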

Data analysis and results

Following the re-coding process, the frequencies of each variable and the results of the classifications are provided in Table 2.

Table 2. Frequencies of apps and app classification(s).

Variable | # of apps | % of total | Variable | # of apps | % of total
Sponsor/Origin | | | Video in Description | |
Corporation | 175 | 18.00% | Yes | 193 | 19.80%
Government | 10 | 1.00% | No | 781 | 80.20%
Individual | 132 | 13.60% | Days since last update | |
SME-Developer | 657 | 67.50% | 1–60 days | 186 | 20.50%
PPM Categories | | | 61–120 days | 151 | 16.60%
Enabling | 669 | 68.70% | 121–240 days | 171 | 18.80%
Predisposing | 159 | 16.30% | 240–365 days | 128 | 14.10%
Reinforcing | 146 | 15.00% | 365+ days | 273 | 30.00%
HECAT Categories | | | Privacy Policy | |
Maternal & Infant Health | 32 | 3.00% | Yes | 81 | 8.30%
CHE | 43 | 4.10% | No | 893 | 91.70%
HE | 149 | 14.10% | Price (USD) | |
MEH | 150 | 14.20% | Free | 500 | 50.00%
PA | 368 | 34.80% | 0.99–1.50 | 107 | 11.00%
PHW | 247 | 23.40% | 1.51–3.00 | 175 | 18.00%
SH | 44 | 4.20% | 3.01–5.00 | 103 | 10.80%
TAOD | 23 | 2.20% | 5.01+ | 89 | 9.20%
Version Required | | | Installs | |
1.6 and up | 5 | 0.50% | 10+ | 54 | 5.60%
2.1 and up | 12 | 1.20% | 100+ | 80 | 8.20%
2.2 and up | 46 | 4.70% | 500+ | 63 | 6.50%
2.3 and up | 29 | 3.00% | 1,000+ | 174 | 17.90%
2.3.3 and up | 11 | 1.10% | 5,000+ | 75 | 7.70%
3.0 and up | 10 | 1.00% | 10,000+ | 120 | 12.30%
4.0 and up | 94 | 9.70% | 50,000+ | 61 | 6.30%
4.0.3 and up | 122 | 12.50% | 100,000+ | 133 | 13.70%
4.1 and up | 249 | 25.60% | 500,000+ | 41 | 4.20%
4.2 and up | 60 | 6.20% | 1,000,000+ | 103 | 10.60%
4.3 and up | 24 | 2.50% | 5,000,000+ | 19 | 2.00%
4.4 and up | 122 | 12.50% | 10,000,000+ | 51 | 5.20%
5.0 and up | 94 | 9.60% | Editors’ Choice | |
6.0 and up | 20 | 2.00% | No | 942 | 96.70%
Varies with device | 76 | 7.80% | Yes | 32 | 3.30%
Average Rating | | | Content Rating | |
1.5–3.0 | 50 | 5.70% | Everyone | 928 | 95.30%
3.1–4.0 | 214 | 24.60% | Teen | 34 | 3.50%
4.1–4.5 | 332 | 38.10% | Mature (17+) | 12 | 1.20%
4.6–5.0 | 275 | 31.60% | | |
Total | 974 | 100% | Total | 974 | 100%

As a next step, the means of different categories were compared via ANOVA in the SPSS software package to reveal the factors that are influential in review scores. The user score variable was set as the dependent variable, whereas the app sponsor, PPM categories, HECAT categories, content ratings, price groups, being an editors’ choice, having a privacy policy, having a video in the description, in-app purchases, the number of days since last update and interactive elements were set as independent variables. Of the 974 apps, 75 did not have any published review scores and were thus left out of this analysis. The results of the analysis are detailed in Table 3. According to the results, user scores did not differ significantly by PPM category, content rating, price group, days since last update or interactive elements; F values are not provided for these variables to save space.

Table 3. F-test compare means results.

Variable | Mean | N | Std. Dev. | Min. | Max. | F-value | Sig.
Video in Description | | | | | | |
No | 4.168 | 702 | 0.6125 | 1.5 | 5.0 | 5.124 | .024**
Yes | 4.275 | 197 | 0.4618 | 2.2 | 5.0 | |
Total | 4.192 | 899 | 0.5842 | 1.5 | 5.0 | |
HECAT | | | | | | |
Maternal & Infant Health | 4.438 | 26 | 0.4900 | 3.0 | 4.9 | 5.367 | .000***
CHE | 4.100 | 34 | 0.5836 | 2.7 | 4.9 | |
HE | 4.035 | 123 | 0.5989 | 2.2 | 5.0 | |
MEH | 4.283 | 125 | 0.5059 | 2.3 | 5.0 | |
PA | 4.275 | 335 | 0.5474 | 1.5 | 5.0 | |
PHW | 4.053 | 199 | 0.6405 | 2.0 | 5.0 | |
SH | 4.220 | 35 | 0.5825 | 2.2 | 5.0 | |
TAOD | 4.336 | 22 | 0.6314 | 2.4 | 4.9 | |
Total | 4.192 | 899 | 0.5842 | 1.5 | 5.0 | |
Sponsor/Origin | | | | | | |
Corporation | 4.119 | 159 | 0.6397 | 1.5 | 5.0 | 3.930 | .008***
Government | 3.680 | 10 | 0.5574 | 2.7 | 4.4 | |
Individual | 4.246 | 121 | 0.5271 | 2.6 | 5.0 | |
SME Developer | 4.208 | 609 | 0.5760 | 1.8 | 5.0 | |
Total | 4.192 | 899 | 0.5842 | 1.5 | 5.0 | |
Privacy Policy | | | | | | |
No | 4.181 | 819 | 0.5833 | 1.5 | 5.0 | 3.103 | .078*
Yes | 4.301 | 80 | 0.5862 | 2.2 | 4.9 | |
Total | 4.192 | 899 | 0.5842 | 1.5 | 5.0 | |
Editors’ Choice | | | | | | |
False | 4.179 | 867 | 0.5884 | 1.5 | 5.0 | 11.774 | .001***
True | 4.538 | 32 | 0.2938 | 3.8 | 4.8 | |
Total | 4.192 | 899 | 0.5842 | 1.5 | 5.0 | |
In-App Purchase | | | | | | |
False | 4.133 | 651 | 0.6132 | 1.5 | 5.0 | 21.567 | .000***
True | 4.344 | 248 | 0.4681 | 2.2 | 4.9 | |
Total | 4.192 | 899 | 0.5842 | 1.5 | 5.0 | |
Install Categories | | | | | | |
1,000,000+ | 4.418 | 173 | 0.4480 | 2.0 | 4.9 | 12.882 | .000***
1,000+ | 4.109 | 172 | 0.5961 | 2.3 | 5.0 | |
10–1000 | 4.194 | 125 | 0.6303 | 2.2 | 5.0 | |
5,000–50,000 | 4.013 | 195 | 0.6759 | 1.5 | 5.0 | |
50000–1,000,000+ | 4.232 | 234 | 0.4931 | 2.5 | 4.9 | |

* significant at the p < 0.10 level

** significant at the p < 0.05 level

*** significant at the p < 0.01 level.
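The F-values in Table 3 can be approximately reproduced from the reported group means, sample sizes and standard deviations. The Python sketch below does this for the ‘Video in Description’ variable. It assumes a one-way ANOVA; the function name `f_from_summary` is introduced here for illustration only and is not part of the authors' SPSS workflow. Small deviations from the published value are expected because the table entries are rounded.

```python
def f_from_summary(groups):
    """One-way ANOVA F statistic from (mean, n, std_dev) triples."""
    total_n = sum(n for _, n, _ in groups)
    grand_mean = sum(mean * n for mean, n, _ in groups) / total_n
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(n * (mean - grand_mean) ** 2 for mean, n, _ in groups)
    # Within-group sum of squares, recovered from the sample standard deviations.
    ss_within = sum((n - 1) * sd ** 2 for _, n, sd in groups)
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# (Mean, N, Std. Dev.) rows for "Video in Description" as reported in Table 3.
video = [(4.168, 702, 0.6125), (4.275, 197, 0.4618)]
f_video = f_from_summary(video)
print(round(f_video, 2))
```

With the rounded table values the result is roughly 5.2, close to the reported F-value of 5.124.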

As a further step of the analysis, three machine learning algorithms were used to predict the number of downloads. The overall workflow of the machine learning analysis is depicted in Fig 2.

Fig 2. Overall workflow.

A decision tree model was used to extract hidden patterns behind the number of downloads. Decision trees create IF-ELSE rules, which can be interpreted by humans and can also be implemented in third-party applications. The most frequently used decision tree algorithms are based on the Gini index or on entropy (gain ratio). However, single trees are known to be weak learners: they are easily affected by variations in the dataset and by outliers. To circumvent this overfitting problem, ensemble methods such as random forest decision tree (RFDT) models have been developed, and they yield successful results on many datasets. The decision tree model employed in this study is the random forest model. As the name suggests, the random forest model creates more than one tree. Each tree is built from randomly selected variables of the dataset [50]. The total number of trees in the forest is determined by the data scientist. The root of the tree is essential for a decision tree model, as the root variable is considered the most important one for explaining the target class variable. Thus, in a forest it is critical to know how many times each variable was selected by the algorithm as the root of a tree. Random forests may use Gini- or entropy-based decision tree algorithms. In this study, a Gini-based random forest algorithm, which relies on the probabilities indicated in Eq (3), was used. In the equation, p is the probability (relative frequency) of a class within a branch of the split.

$$G(\text{split}) = \sum_{i=1}^{n} \frac{n_i}{n}\left(1 - \sum_{j=1}^{n} p_j^{2}\right) \tag{3}$$
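To make Eq (3) concrete, the following Python sketch computes the size-weighted Gini impurity of a candidate split. It assumes the usual reading of the formula, in which the outer sum runs over the child nodes of the split and the inner sum over the classes within each node. The function names and the toy labels are hypothetical and do not come from the study's data.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity 1 - sum_j p_j^2 of the class labels in one node."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_split(child_nodes):
    """Size-weighted Gini impurity over the child nodes of a split."""
    n = sum(len(labels) for labels in child_nodes)
    return sum(len(labels) / n * gini_impurity(labels) for labels in child_nodes)


# Hypothetical split on a "free" flag; labels are made-up download classes.
free_apps = ["high", "high", "high", "low"]
paid_apps = ["low", "low", "high", "low"]
print(gini_split([free_apps, paid_apps]))
```

A split that separates the classes perfectly scores 0, so a tree-growing algorithm would prefer the candidate split with the lowest value.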

The second machine learning model we used is the Artificial Neural Network (ANN). ANNs are composed of layers that are interconnected via weights, as visualized in Fig 3. The simplest ANN has one input layer (variables used for learning), one hidden layer and one output layer (class variables). Hidden layers contain neurons that hold activation functions. Each neuron decides how strongly it should activate and is fed with weighted outputs from the previous layer. Backpropagation is applied to adjust the weights so that the prediction error is minimized. ANNs generate weight matrices as learning outputs [51]. Neural networks need to be fine-tuned via certain hyperparameters.

Fig 3. Artificial neural networks.

Thirdly, Naive Bayes (NB) models, which are among the most popular probabilistic classifiers in data science, were used. NB models are based on Bayes’ theorem of conditional probabilities. They tolerate noise and outliers well, their training time is relatively short, and they need few hyperparameters for fine-tuning. In Eq (4), P(x) is the probability of x in the document (data set), P(c) is the probability of the labeled class in the data set, and P(x/c) is the probability of variable x in a given class.

$$P(c/x) = \frac{P(x/c)\,P(c)}{P(x)} \tag{4}$$
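As a small worked instance of Eq (4), the Python sketch below evaluates the posterior probability of a class given a single feature value. The counts are invented for illustration and are not taken from the study's dataset. A full Naive Bayes classifier would multiply such likelihoods across all features under the independence assumption, which this sketch does not do.

```python
def posterior(prior_c, likelihood_x_given_c, evidence_x):
    """Bayes' theorem as in Eq (4): P(c/x) = P(x/c) * P(c) / P(x)."""
    return likelihood_x_given_c * prior_c / evidence_x


# Hypothetical counts: 100 apps, 40 of them in a "high downloads" class,
# 30 of those 40 are free, and 50 of all 100 apps are free.
p_c = 40 / 100          # P(c): share of the class in the data set
p_x_given_c = 30 / 40   # P(x/c): share of free apps within the class
p_x = 50 / 100          # P(x): share of free apps overall
p_c_given_x = posterior(p_c, p_x_given_c, p_x)
print(round(p_c_given_x, 3))
```

Under these made-up counts, knowing that an app is free raises the probability of the ‘high downloads’ class from the prior of 0.4 to about 0.6.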

The KNIME analytics platform was used to build all machine learning models and to apply the algorithms. For all algorithms, 70% of the data was used for training and 30% for validation. Stratified sampling was used to partition the dataset into training and validation parts. To overcome the minority class problem, the Synthetic Minority Over-sampling Technique (SMOTE) was employed. Hyperparameter selection for the algorithms was as follows:

  • The Gini index was chosen as the quality measure for the random forest decision tree model. The number of levels was set to 10 and the minimum node size to 3. The number of trees (n-estimator) was set to 100. A 5-fold sampling (without replacement) was done along with stratified sampling.

  • In the ANN, the sigmoid activation function was preferred, and z-score normalization was applied to the dataset to speed up the training. Stochastic depth and early stopping were used to prevent possible overfitting. After several trials, the best-performing ANN had two layers with 10 nodes in each layer.
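The 70/30 stratified partitioning described above was carried out in KNIME. As a rough stand-in, the following pure-Python sketch shows the idea: records are shuffled and split within each class, so that class proportions are approximately preserved in both parts. The function name, the seed and the toy labels are assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Return (train_idx, valid_idx) with class proportions roughly preserved."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    train_idx, valid_idx = [], []
    for indices in by_class.values():
        # Shuffle within each class, then cut that class at the train fraction.
        rng.shuffle(indices)
        cut = round(len(indices) * train_fraction)
        train_idx.extend(indices[:cut])
        valid_idx.extend(indices[cut:])
    return sorted(train_idx), sorted(valid_idx)


# Hypothetical imbalanced labels: 80 "low" and 20 "high" download classes.
labels = ["low"] * 80 + ["high"] * 20
train_idx, valid_idx = stratified_split(labels)
print(len(train_idx), len(valid_idx))
```

Splitting within each class keeps the minority class represented in the validation part, which a plain random split does not guarantee for small or imbalanced samples.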

We calculated the accuracy, precision, sensitivity, specificity, Cohen’s Kappa, and F-measure values from the true positive, true negative, false positive, and false negative cases to evaluate the learning quality and performance of the algorithms, as provided in Table 4 and Fig 4.

Table 4. Model prediction performance.

Algorithm Precision Sensitivity Specificity F-Measure Accuracy Cohen’s Kappa
ANN 0.719 0.700 0.908 0.704 0.738 0.627
NB 0.670 0.663 0.888 0.662 0.660 0.550
RFDT 0.724 0.722 0.909 0.720 0.724 0.632
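The measures in Table 4 are all derived from true positive, true negative, false positive and false negative counts. The Python sketch below shows these definitions for a single binary confusion matrix. This is a simplification: the study's target variable has several download classes, so the published figures presumably aggregate over classes in a way the text does not specify. The counts used here are invented.

```python
def binary_metrics(tp, fp, fn, tn):
    """Evaluation metrics from the four cells of a binary confusion matrix."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)   # also called recall
    specificity = tn / (tn + fp)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    # Cohen's kappa: observed agreement corrected for chance agreement.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total ** 2
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_measure": f_measure,
        "kappa": kappa,
    }


# Hypothetical counts for one class versus the rest.
metrics = binary_metrics(tp=40, fp=10, fn=20, tn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Cohen’s Kappa is typically lower than accuracy because it discounts the agreement that would be expected by chance, which is consistent with the pattern in Table 4.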

Fig 4. HECAT application categories.

The influences of different factors on download behavior were assessed using machine learning algorithms. In terms of accuracy, the best learner is the ANN, as can be seen in Table 4. The output of the ANN is not directly interpretable, yet the sensitivity of each input variable can be calculated and interpreted. Thus, we calculated the sensitivity of the input variables using the weights between the inputs and the first layer of the ANN. The weights were then normalized by dividing the weight of each variable by the grand total of the weights. The sensitivity analysis of the top 100 parameters (consolidated and filtered) led to the following influential keywords:

  • Keywords: part, young, band, weight, period, breath, pedometer, guts, treatment, calculate, education, smoking, use, outside, activity, timer, running, water, educate, Bluetooth, development, sutra, chest, glycemic, step, cycle, run, dance, points, plans, use, photo, music, minute, notification, progress
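The weight-based sensitivity calculation described above can be sketched as follows. The text does not state how the weights from one input to several first-layer neurons were combined, so this Python sketch assumes that absolute weights are summed per input before dividing by the grand total. The weights and variable names are invented.

```python
def input_sensitivity(first_layer_weights, names):
    """Share of total absolute first-layer weight attached to each input."""
    # first_layer_weights[i][j] is the weight from input i to hidden neuron j.
    per_input = [sum(abs(w) for w in row) for row in first_layer_weights]
    grand_total = sum(per_input)
    return {name: value / grand_total for name, value in zip(names, per_input)}


# Hypothetical weights for three inputs feeding two hidden neurons.
weights = [[0.8, -0.4], [0.1, 0.3], [-0.2, 0.2]]
names = ["free", "video", "privacy"]
shares = input_sensitivity(weights, names)
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

Because the shares are normalized by the grand total, they sum to 100%, as do the sensitivities reported in Table 5.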

Moreover, a second run of the ANN that excluded the description text data led to the findings provided in Table 5. Please note that the random forest and naive Bayes algorithms were not used further since they yielded lower accuracy than the ANN (Table 4).

Table 5. ANN sensitivity (excluding description text data).

Category Sensitivity
Required Android Version 21.75%
Interactive elements 16.42%
HECAT category 14.42%
Content Rating 12.95%
Free 7.63%
Sponsor 4.95%
User Score 4.60%
In-app Purchases 4.17%
Privacy 3.97%
PPM 2.73%
Days since last update 2.21%
Editor’s choice 2.19%
Price 1.29%
Video 0.72%

The top five factors related to the total number of downloads are ‘Required Android Version’, ‘Interactive elements’, ‘HECAT category’, ‘Content Rating’ and whether the app is ‘free’ or not. Lastly, the tree attribute selection results of the RFDT, the second-best performing algorithm, highlight the role of the following parameters in download behavior:

  • Variables: Sponsor, HECAT, Editor’s Choice, Android version, Free.

  • Keywords: Light, Routine, Follow, Information, Morning, Food, Exercise, Step recorder, Ovulation period, Develop, Relax, Friends, Heart.

These findings are discussed in the following section in detail.

Discussion

The most common m-health app category in the sample was Physical Activity (34.8%), as indicated in Table 2 and visualized in Fig 5. This was followed by Personal Health and Wellness (23.4%), Mental Health (14.2%), and Healthy Eating (14.1%). Apps that target quitting tobacco, alcohol use and/or drug abuse attracted the least attention (2.2%) from developers in the sample. It is evident that certain categories are represented and promoted at a higher frequency than others, suggesting that there may be areas where competition/supply is less intense.

Fig 5. Algorithm prediction performance results.

When the user scores provided in Table 3 were assessed, it became evident that there were significant differences in user scores between HECAT app categories. Moreover, the RFDT analysis also indicated that the HECAT category played an important role in download behavior (i.e. app popularity). Among all categories, users gave the lowest average score to Healthy Eating apps (4.04). Despite the relatively high number of apps in this category, users were not entirely content with the Healthy Eating apps they had downloaded. This indicates that there is room for improvement in this subcategory, which may benefit app sponsors when launching new apps or improving existing ones. Another common app category (23.4% of the total sample) that received relatively low scores (4.05) was Personal Health and Wellness, which encompasses a larger variety of apps than other categories. The low scores indicate that despite the high number and variety of apps, user satisfaction is yet to be established. On the other hand, mobile applications on ‘Maternal and Infant Health’ received the highest average score (4.44) in the sample. Because users of this category, which targets new mothers and parents, are already particularly satisfied with the available apps, competition may be challenging for new entrants. Surprisingly, apps for quitting smoking, alcohol and other drugs have received little interest from developers (~2% of the total) despite their high user scores (4.34) and the emergence of ‘smoking’ as a significant keyword related to the number of downloads. Given the high percentage of smokers (28% of the total population) in Europe [52] and the risks associated with alcohol use [53], more apps may be promoted to users. Moreover, there were no apps promoted by governmental institutions in the TAOD category in the sample. Governmental institutions may allocate more resources to this category to disseminate apps to more users.

As discussed in the background section, pricing and the costs associated with using an app are also influential in download behavior and have been considered as three distinct variables in this study: whether the app is free to use, the actual price to be paid, and whether in-app purchases are offered to users. The first two factors were shown not to be influential on user scores, yet the ‘free’ variable proved to be influential on the total number of downloads. It is to be expected that a free application attracts more downloads. The cost(s) associated with m-health app use have appeared as a barrier to adoption in several academic studies as well [9,10,36]. Thus, our finding that free apps attract more downloads is in accordance with the literature. Moreover, in-app purchases emerged as a notable factor related to user review scores. The apps offering in-app purchases received better user scores, which is in agreement with the results of a study by Biviji et al. [54]. This suggests that the freemium business model, also termed the in-app purchase strategy [39], in which basic functionality is offered for free while in-app purchase options give access to extended features, works well in promoting health apps. This has been evidenced in the literature as well [40].

Another significant factor is the ‘Editor’s Choice’ badge, which was found to be influential in both download behavior and user scores. If the app had an editor’s choice badge, both the user scores and the total number of downloads were observed to be higher. It should be noted that higher user scores and downloads may also lead to an ‘Editor’s Choice’ badge, so there is no clear causality in this relationship. A similar study on app marketplaces indicated that a comparable factor, the ‘best-seller rank’, affects consumers’ choice, which supports our finding [55]. Nonetheless, this is not a feature that can be directly controlled by m-health app sponsors or developers, and therefore it does not lead to significant applicable insights.

A further noteworthy factor associated with user scores is the presence or absence of a privacy policy regarding the use of the app. The apps that provide information on privacy policies in the description section received higher user scores than apps without such information. However, fewer than 9% of the mobile apps analyzed in the study mentioned a privacy policy in their descriptions. A similar finding was obtained in a review study on mobile health apps by Sunyaev et al. [48], where only 30% of the 600 apps analyzed were found to have a privacy policy. It appears that app developers have not improved their stance on privacy policies in the six years since that study. Further action by app developers, sponsors and policymakers is required to increase transparency regarding the privacy of personal health information collected by apps. Such initiatives regarding privacy policy development and use may lead to higher user satisfaction, as suggested by the higher user scores in the present study.

Another significant element in the ANOVA analysis that had a positive influence on user scores is ‘Videos’. This variable indicates whether there is a video provided by the app developer in the app description or not. Mobile apps with videos, which commonly show how the app works, received higher user scores than apps without videos. Interestingly, only 20% of the apps analyzed had videos provided in the description section. This leads to a practical implication that can benefit app sponsors/developers without spending considerable resources. Relevant descriptive app videos can be prepared through the use of common video maker and editor software that utilize screen captures and images, which can then be provided on marketplace app description pages.

Taking the keywords related to app features into consideration, those related to basic progress-tracking functionalities, such as step counting, recording and timers, emerged as influential in download behavior. Moreover, ‘notifications’ also emerged as a significant keyword related to the number of downloads, which highlights the role of feedback mechanisms in download behavior. Thus, as another practical implication, mentioning notifications and progress-tracking features in app descriptions, in addition to providing relevant keywords (see the following two paragraphs summarizing the ANN and RFDT results), is likely to be effective in reaching a higher number of downloads according to the findings.

Among analyzed keywords, ‘friends’ came to light as one of the most influential keywords in the RFDT analysis, indicating that social interactions and the ability to share in apps can lead to a higher number of downloads. As signified in relevant literature, apps with social networking features tend to perform better in facilitating behavior change [12]. Yet, similar to analogous classification studies, the findings of this study also reveal that only a limited percentage of apps offer sharing and social networking capabilities [56,57]. Similarly, another variable found to be important is the interactive elements variable, which incorporates the ‘Users Interact’, ‘Shares Info’, ‘Digital Purchase’, ‘Shares Location’ and ‘Unrestricted Internet’ values. The interactive element with the highest sensitivity emerged as ‘Users Interact’. This value denoting social interaction highlights both the role of social sharing and the interaction between users. Not surprisingly, in online consumer behavior literature, interactivity in several forms has been found to be a critical element of success in marketing communication, web design, and social media marketing [58,59]. Similar findings of studies in mobile contexts also support the significance of interactivity provided by relevant features in intentions and usage [60–62]. Consequently, providing user interaction has emerged as a viable strategy to influence the total number of downloads and therefore app installs.

Conversely, the last update time of the app was not found to be a significant factor related to either the number of downloads or user scores. This finding contradicts Krishnan and Selvam’s [40] study on diabetes smartphone apps, which may partly be attributed to the differences in context and sample. Yet, the ‘Android version’ variable emerged as a significant factor in the RFDT and the ANN sensitivity analysis. It may be inferred that when the app provides the required functionality, works on a wide range of devices (evidenced by the Android version variable) and is free of significant bugs, frequent updates are not vital to the success of the app.

Lastly, the findings suggest that the sponsor/developer of the app is an essential element that is related to the total number of downloads as well as the average user scores. Despite the higher popularity of state-sponsored apps, the user scores are significantly lower in this category. This suggests that more effort and resources should be devoted to the development and improvement of state-sponsored apps that can potentially reach a considerably higher number of people than other sponsor types. In this way, a higher user satisfaction may be established that will most likely turn into a higher usage frequency (i.e. behavior loyalty) and better health outcomes for the general population.

In line with the discussion above, the following five practical implications and strategies, applicable to a wide range of m-health apps, can be put forward as the major outcomes of the present study:

  • Mobile apps should be offered free of cost with in-app purchase options when possible.

  • Interactions between users (e.g., social networking and sharing options) should be available among features when relevant.

  • Videos should be provided in app description pages.

  • Information on privacy policy should be presented in the description section.

  • Notifications and similar feedback mechanisms should be specified among app features.

Conclusion

This study reviewed mobile health apps using content analysis, ANOVA and machine learning algorithms and contributes to the current knowledge on m-health application use in several ways. First, content analysis through manual coding of data yielded added value by providing classifications, keywords and factors that influence mobile health app download behavior and user scores. Second, there is no directly comparable study of this scale carried out regarding mobile health applications specifically in emerging countries such as Turkey. Thus, this study provides guidance to researchers, professionals and policymakers in similar nations as well. Given that the app data is publicly available on marketplaces, similar studies may be carried out to test for adaptability of findings to other contexts.

Despite the effort put into the study, it has a number of limitations. First, this study relied on data provided on app store pages (e.g., descriptions) for categorization and analysis. This created potential discrepancies between the m-health app description and actual features; hence, the functionality actually provided to users (i.e. under-reporting or over-reporting in descriptions) may have influenced user scores. This highlights a future research direction that can overcome such discrepancies: researchers may analyze the mobile applications by personally installing and using them and then scoring each app on various dimensions (e.g., features, ease of use, interactivity provided etc.) [63]. Furthermore, the number of app downloads may not be directly indicative of adoption and regular usage behavior, as an initial download represents a trial. This disparity points to a further research path. Researchers may enroll users and follow their usage behavior throughout a set time frame via custom mobile apps or personal diaries. In this way, repeat use and adoption behavior may be observed. Furthermore, deeper insights that can complement cross-sectional studies such as the present one may also be obtained. Second, the sample utilized in this study covered only a fraction of all available apps, as there are estimated to be hundreds of thousands of them. Manual coding of such data is unfortunately not feasible, and different approaches are needed to collect and recode such amounts of data. In addition, there is no exhaustive list of apps available to users/researchers in app marketplaces, rendering random sampling impractical and forcing researchers to use non-random sampling methods. This leads to a viable research direction, such as carrying out research in collaboration with mobile app marketplace sponsors (e.g., Apple). More generalized findings may be obtained in this way, as the researchers can access an exhaustive list of apps, which allows for better sampling.

Acknowledgments

The authors thank Zehra Nur Canbolat and Omer Berkay Aytac for their assistance with data extraction.

Data Availability

The dataset is hosted on OSF. Researchers can access the dataset via the following link and DOI: https://osf.io/fm5vc/.

Funding Statement

GA, Istanbul Medipol University Scientific Research Projects Fund. Project Grant No: 2019/08 URL: www.medipol.edu.tr. The sponsor did not play an active role in the study design, data collection, analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Agnihothri S, Cui L, Delasay M, Rajan B. The value of mHealth for managing chronic conditions. Health Care Manag Sci. 2020;23: 185–202. doi:10.1007/s10729-018-9458-2
  • 2. Bettiga D, Lamberti L, Lettieri E. Individuals’ adoption of smart technologies for preventive health care: a structural equation modeling approach. Health Care Manag Sci. 2020; 203–214. doi:10.1007/s10729-019-09468-2
  • 3. Riley WT, Rivera DE, Atienza AA, Nilsen W, Allison SM, Mermelstein R. Health behavior models in the age of mobile interventions: are our theories up to the task? Transl Behav Med. 2011;1: 53–71. doi:10.1007/s13142-011-0021-7
  • 4. West JH, Hall PC, Hanson CL, Barnes MD, Giraud-Carrier C, Barrett J. There’s an app for that: Content analysis of paid health and fitness apps. J Med Internet Res. 2012;14: 1–16. doi:10.2196/jmir.1977
  • 5. Stoyanov SR, Hides L, Kavanagh DJ, Zelenko O, Tjondronegoro D, Mani M. Mobile App Rating Scale: A New Tool for Assessing the Quality of Health Mobile Apps. JMIR mHealth uHealth. 2015;3: e27. doi:10.2196/mhealth.3422
  • 6. Davalbhakta S, Advani S, Kumar S, Agarwal V, Bhoyar S, Fedirko E, et al. A Systematic Review of Smartphone Applications Available for Corona Virus Disease 2019 (COVID19) and the Assessment of their Quality Using the Mobile Application Rating Scale (MARS). J Med Syst. 2020;44. doi:10.1007/s10916-020-01633-3
  • 7. Venkatraman A. Weekly Time Spent in Apps Grows 20% Year Over Year as People Hunker Down at Home. In: App Annie [Internet]. 2020 [cited 23 Apr 2020]. Available: https://www.appannie.com/en/insights/market-data/weekly-time-spent-in-apps-grows-20-year-over-year-as-people-hunker-down-at-home/.
  • 8. International Telecommunication Union. ICT facts and Figures: 2019. Geneva, Switzerland; 2019. Available: https://www.itu.int/en/ITU-D/Statistics/Pages/stat/default.aspx.
  • 9. Krebs P, Duncan DT. Health App Use Among US Mobile Phone Owners: A National Survey. JMIR mHealth uHealth. 2015;3: e101. doi:10.2196/mhealth.4924
  • 10. Fitzgerald M, McClelland T. What makes a mobile app successful in supporting health behaviour change? Health Educ J. 2017;76: 373–381. doi:10.1177/0017896916681179
  • 11. World Health Organization. mHealth: New Horizons for Health through Mobile Technologies. Glob Obs eHealth Ser. 2011;3.
  • 12. Helbostad J, Vereijken B, Becker C, Todd C, Taraldsen K, Pijnappels M, et al. Mobile Health Applications to Promote Active and Healthy Ageing. Sensors. 2017;17: 622. doi:10.3390/s17030622
  • 13. Kumar S, Nilsen WJ, Abernethy A, Atienza A, Patrick K, Pavel M, et al. Mobile Health Technology Evaluation: the mHealth evidence workshop. Am J Prev Med. 2013;45: 228–236. doi:10.1016/j.amepre.2013.03.017
  • 14. Kitsiou S, Paré G, Jaana M, Gerber B. Effectiveness of mHealth interventions for patients with diabetes: An overview of systematic reviews. PLoS One. 2017;12: 1–17. doi:10.1371/journal.pone.0173160
  • 15. Quinn C, Shardell M, Terrin M. Cluster-randomized trial of a mobile phone personalized behavioral intervention for blood glucose control. Diabetes Care. 2011;34: 1934–42. doi:10.2337/dc11-0366
  • 16. Carrasco MP, Salvador CH, Sagredo PG, Márquez-Montes J, González de Mingo MA, Fragua JA, et al. Impact of patient-general practitioner short-messages-based interaction on the control of hypertension in a follow-up service for low-to-medium risk hypertensive patients: A randomized controlled trial. IEEE Trans Inf Technol Biomed. 2008;12: 780–791. doi:10.1109/TITB.2008.926429
  • 17. Lester RT, Ritvo P, Mills EJ, Kariri A, Karanja S, Chung MH, et al. Effects of a mobile phone short message service on antiretroviral treatment adherence in Kenya (WelTel Kenya1): a randomised trial. Lancet. 2010;376: 1838–1845. doi:10.1016/S0140-6736(10)61997-6
  • 18. Kim H, Goldsmith JV, Sengupta S, Mahmood A, Powell MP, Bhatt J, et al. Mobile Health Application and e-Health Literacy: Opportunities and Concerns for Cancer Patients and Caregivers. J Cancer Educ. 2017. [cited 23 Nov 2017]. doi:10.1007/s13187-017-1293-5
  • 19. Coughlin SS, Whitehead M, Sheats JQ, Mastromonico J, Smith S, Mastrominico J, et al. A Review of Smartphone Applications for Promoting Physical Activity. Jacobs J community Med. 2016;2: 1–14. doi:10.1038/nature13736.Tyrosine
  • 20. Turner-McGrievy GM, Beets MW, Moore JB, Kaczynski AT, Barr-Anderson DJ, Tate DF. Comparison of traditional versus mobile app self-monitoring of physical activity and dietary intake among overweight adults participating in an mHealth weight loss program. J Am Med Informatics Assoc. 2013;20: 513–518. doi:10.1136/amiajnl-2012-001510
  • 21. Bakker D, Rickard N. Engagement in mobile phone app for self-monitoring of emotional wellbeing predicts changes in mental health: MoodPrism. J Affect Disord. 2018;227: 432–442. doi:10.1016/j.jad.2017.11.016
  • 22. Burke LE, Conroy MB, Sereika SM, Elci OU, Styn MA, Acharya SD, et al. The effect of electronic self-monitoring on weight loss and dietary intake: A randomized behavioral weight loss trial. Obesity. 2011;19: 338–344. doi:10.1038/oby.2010.208
  • 23. Mollee JS, Middelweerd A, Kurvers RL, Klein MCA. What technological features are used in smartphone apps that promote physical activity? A review and content analysis. Pers Ubiquitous Comput. 2017;21: 633–643. doi:10.1007/s00779-017-1023-3
  • 24. Zahry NR, Cheng Y, Peng W. Content Analysis of Diet-Related Mobile Apps: A Self-Regulation Perspective. Health Commun. 2016;31: 1301–1310. doi:10.1080/10410236.2015.1072123
  • 25. Charbonneau DH, Hightower S, Katz A, Zhang K, Abrams J, Senft N, et al. Smartphone apps for cancer: A content analysis of the digital health marketplace. Digit Heal. 2020;6: 1–7. doi:10.1177/2055207620905413
  • 26. Payne HE, Wilkinson J, West JH, Bernhardt JM. A content analysis of precede-proceed constructs in stress management mobile apps. mHealth. 2016;2: 5. doi:10.3978/j.issn.2306-9740.2016.02.02
  • 27. Martínez-Pérez B, De La Torre-Díez I, López-Coronado M. Mobile health applications for the most prevalent conditions by the world health organization: Review and analysis. J Med Internet Res. 2013;15. doi:10.2196/jmir.2600
  • 28. Crosby R, Noar SM. What is a planning model? An introduction to PRECEDE-PROCEED. J Public Health Dent. 2011;71. doi:10.1111/j.1752-7325.2011.00235.x
  • 29. Green LW, Kreuter MW. Health promotion planning: An educational and environmental approach. 4th ed. New York, NY, USA: McGraw-Hill; 2005.
  • 30. Centers for Disease Control and Prevention. Health Education Curriculum Analysis Tool (HECAT). 2017. [cited 5 Feb 2018]. Available: https://www.cdc.gov/healthyyouth/hecat/.
  • 31. Anderson K, Burford O, Emmerton L. Mobile Health Apps to Facilitate Self-Care: A Qualitative Study of User Experiences. van Ooijen PMA, editor. PLoS One. 2016;11: e0156164. doi:10.1371/journal.pone.0156164
  • 32. Lim D, Norman R, Robinson S. Consumer preference to utilise a mobile health app: A stated preference experiment. Torpey K, editor. PLoS One. 2020;15: e0229546. doi:10.1371/journal.pone.0229546
  • 33. van Haasteren A, Gille F, Fadda M, Vayena E. Development of the mHealth App Trustworthiness checklist. Digit Heal. 2019;5: 1–21. doi:10.1177/2055207619886463
  • 34. Carter DD, Robinson K, Forbes J, Hayes S. Experiences of mobile health in promoting physical activity: A qualitative systematic review and meta-ethnography. PLoS One. 2018;13. doi:10.1371/journal.pone.0208759
  • 35. Middelweerd A, Mollee JS, van der Wal CN, Brug J, te Velde SJ. Apps to promote physical activity among adults: A review and content analysis. Int J Behav Nutr Phys Act. 2014;11: 1–9. doi:10.1186/1479-5868-11-1
  • 36. Alqahtani F, Orji R. Insights from user reviews to improve mental health apps. Health Informatics J. 2020. doi:10.1177/1460458219896492
  • 37. Liang T-P, Li X, Yang C-T, Wang M. What in Consumer Reviews Affects the Sales of Mobile Apps: A Multifacet Sentiment Analysis Approach. Int J Electron Commer. 2015;20: 236–260. doi:10.1080/10864415.2016.1087823
  • 38. Zhang Y, Liu C, Luo S, Xie Y, Liu F, Li X, et al. Factors influencing patients’ intention to use diabetes management apps based on an extended unified theory of acceptance and use of technology model: Web-based survey. J Med Internet Res. 2019;21: 1–17. doi:10.2196/15023
  • 39. Appel G, Libai B, Muller E, Shachar R. On the monetization of mobile apps. Int J Res Mark. 2020. doi:10.1016/j.ijresmar.2019.07.007
  • 40. Krishnan G, Selvam G. Factors influencing the download of mobile health apps: Content review-led regression analysis. Heal Policy Technol. 2019;8: 356–364. doi:10.1016/j.hlpt.2019.09.001
  • 41. Rowland SP, Fitzgerald JE, Holme T, Powell J, McGregor A. What is the clinical value of mHealth for patients? npj Digit Med. 2020;3. doi:10.1038/s41746-019-0206-x
  • 42. Marley J, Farooq S. Mobile telephone apps in mental health practice: uses, opportunities and challenges. BJPsych Bull. 2015;39: 288–290. doi:10.1192/pb.bp.114.050005
  • 43. Shareef MA, Kumar V, Kumar U. Predicting mobile health adoption behaviour: A demand side perspective. J Cust Behav. 2014;13: 187–205. doi:10.1362/147539214X14103453768697
  • 44. Kamel Boulos MN, Brewer AC, Karimkhani C, Buller DB, Dellavalle RP. Mobile medical and health apps: state of the art, concerns, regulatory control and certification. Online J Public Health Inform. 2014;5: 1–23. doi:10.5210/ojphi.v5i3.4814
  • 45. Kotz D, Avancha S, Baxi A. A privacy framework for mobile health and home-care systems. Proceedings of the first ACM workshop on Security and privacy in medical and home-care systems - SPIMACS ‘09. New York, New York, USA: ACM Press; 2009. p. 1. doi:10.1145/1655084.1655086
  • 46. Martínez-Pérez B, de la Torre-Díez I, López-Coronado M. Privacy and Security in Mobile Health Apps: A Review and Recommendations. J Med Syst. 2015;39: 181. doi:10.1007/s10916-014-0181-3
  • 47. Levine DM, Co Z, Newmark LP, Groisser AR, Holmgren AJ, Haas JS, et al. Design and testing of a mobile health application rating tool. npj Digit Med. 2020;3. doi:10.1038/s41746-020-0268-9
  • 48. Sunyaev A, Dehling T, Taylor PL, Mandl KD. Availability and quality of mobile health app privacy policies. J Am Med Informatics Assoc. 2014; 28–33. doi:10.1136/amiajnl-2013-002605
  • 49. Timmreck TC. Planning, Program Development, and Evaluation: A Handbook for Health Promotion, Aging, and Health Services. Sudbury, Massachusetts: Jones & Bartlett Publishers, Inc.; 1995.
  • 50. Breiman L. Random forests. Mach Learn. 2001;45: 5–32. doi:10.1023/A:1010933404324
  • 51.Samarasinghe S. Neural Networks for Applied Sciences and Engineering: From Fundamentals to Complex Pattern Recognition. 1st ed. NY, USA: Auerbach Publications; 2006. 10.1201/9781420013061 [DOI] [Google Scholar]
  • 52.World Health Organization. Tobacco use data and statistics. In: Disease Prevention [Internet]. 2020 [cited 2 Feb 2020]. Available: http://www.euro.who.int/en/health-topics/disease-prevention/tobacco/data-and-statistics.
  • 53.World Health Organization. Fact sheet on alcohol consumption, alcohol-attributable harm and alcohol policy responses in European Union Member States, Norway and Switzerland. 2018. Available: http://www.euro.who.int/en/health-topics/disease-prevention/alcohol-use/data-and-statistics/fact-sheet-on-alcohol-consumption,-alcohol-attributable-harm-and-alcohol-policy-responses-in-european-union-member-states,-norway-and-switzerland-2018.
  • 54.Biviji R, Vest JR, Dixon BE, Cullen T, Harle CA. Factors Related to User Ratings and User Downloads of Mobile Apps for Maternal and Infant Health: Cross-Sectional Study. JMIR mHealth uHealth. 2020;8: e15663 10.2196/15663 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 55.Carare O. The impact of bestseller rank on demand: Evidence from the app market. Int Econ Rev (Philadelphia). 2012;53: 717–742. 10.1111/j.1468-2354.2012.00698.x [DOI] [Google Scholar]
  • 56.Capras R-D, Bolboaca SD. An Evaluation of Free Medical Applications for Android Smartphones. Appl Med Informatics. 2016;38: 117–132. Available: http://ami.info.umfcluj.ro/index.php/AMI/article/view/608%0Ahttp://ezproxy.leedsbeckett.ac.uk/login?url = http://search.ebscohost.com/login.aspx?direct=true&db=a9h&AN=120576150&site=eds-live&scope=site%0Ahttp://ami.info.umfcluj.ro/index.php/AMI/article/view. [Google Scholar]
  • 57.Ernsting C, Dombrowski SU, Oedekoven M, O’Sullivan JL, Kanzler E, Kuhlmey A, et al. Using smartphones and health apps to change and manage health behaviors: A population-based survey. J Med Internet Res. 2017;19: 1–12. 10.2196/jmir.6838 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58.Antoniadis I, Paltsoglou S, Patoulidis V. Post popularity and reactions in retail brand pages on Facebook. Int J Retail Distrib Manag. 2019;47: 957–973. 10.1108/IJRDM-09-2018-0195 [DOI] [Google Scholar]
  • 59.Cyr D, Head M, Ivanov A. Perceived interactivity leading to e-loyalty: Development of a model for cognitive-affective user responses. Int J Hum Comput Stud. 2009;67: 850–869. 10.1016/j.ijhcs.2009.07.004 [DOI] [Google Scholar]
  • 60.Kang JYM, Mun JM, Johnson KKP. In-store mobile usage: Downloading and usage intention toward mobile location-based retail apps. Comput Human Behav. 2015;46: 210–217. 10.1016/j.chb.2015.01.012 [DOI] [Google Scholar]
  • 61.Coursaris CK, Sung J. Antecedents and consequents of a mobile website’s interactivity. New Media Soc. 2012;14: 1128–1146. 10.1177/1461444812439552 [DOI] [Google Scholar]
  • 62.Lee T. The Impact of Perceptions of Interactivity on Customer Trust and Transaction Intentions in Mobile Commerce. J Electron Commer Res. 2005;6: 165. [Google Scholar]
  • 63.Breland JY, Yeh VM, Yu J. Adherence to evidence-based guidelines among diabetes self-management apps. Transl Behav Med. 2013;3: 277–286. 10.1007/s13142-013-0205-4 [DOI] [PMC free article] [PubMed] [Google Scholar]

Decision Letter 0

Farrukh Aslam Khan

8 Sep 2020

PONE-D-20-12673

Insights into mobile health application market via a content analysis of marketplace data with machine learning

PLOS ONE

Dear Dr. Aydin,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Oct 23 2020 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

We look forward to receiving your revised manuscript.

Kind regards,

Farrukh Aslam Khan

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Please clarify in your Methods section whether the two coders provided informed consent to participate in this study.

3. Thank you for including your ethics statement: 

"Istanbul Medipol University Ethical Committee of Non-invasive Clinical Trials. No: 10840098-604.01.01-E.8356".   

Please amend your current ethics statement to confirm that your named institutional review board or ethics committee specifically approved this study.

Once you have amended this statement in the Methods section of the manuscript, please add the same text to the “Ethics Statement” field of the submission form (via “Edit Submission”).

For additional information about PLOS ONE ethical requirements for human subjects research, please refer to http://journals.plos.org/plosone/s/submission-guidelines#loc-human-subjects-research.

4. We note that you have stated that you will provide repository information for your data at acceptance. Should your manuscript be accepted for publication, we will hold it until you provide the relevant accession numbers or DOIs necessary to access your data. If you wish to make changes to your Data Availability statement, please describe these changes in your cover letter and we will update your Data Availability statement to reflect the information you provide.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Partly

Reviewer #2: No

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: No

Reviewer #2: No

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: No

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: No

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: This manuscript presents an investigation of the current landscape of smartphone apps that aim to improve and sustain health and well-being. This study contributes to the current knowledge on m-health application use by reviewing mobile health applications using content analysis and machine learning algorithms. Although the approach is good, the manuscript requires major improvements in order to be considered for acceptance.

1. The overall presentation and structure of the paper are not good. Please consider the following comments:

a. Proper alignment (on right margin) is required.

b. The manuscript has many unevenly sized paragraphs; some are very short and others are too long. Please be consistent and make the paragraphs roughly the same size without disturbing the flow of concepts.

c. On page 4, lines 14, 16, 18, 20, 22 and 24, and page 5, line 1: please start each sentence with a capital letter.

d. The paper has many punctuation, typographical, and grammar mistakes; please proofread it carefully.

e. The Conclusion must be revised. Please write a concise conclusion and add future directions. The shortcomings and other relevant details can be added to a separate discussion section with more details.

2. Page 9, line 17 and page 11, lines 3 to 11: please provide references to justify your statements.

3. The figures are not clear. Please improve the image quality, and describe each figure in detail in the text. Figure 2 is not acceptable; please redraw it.

4. Please give more details on the proposed methodology to aid the reader’s understanding. You may add an algorithm or flow chart for the ease of the readers.

5. In Table 2, column 3, why is there an extra sub-column? In Table 3, column 3, please use 0 (zero) before the decimal point; for example, instead of writing “.6125”, please write “0.6125” throughout for readability.

6. Please add a table comparing the outputs/results of the different machine learning algorithms in the data analysis and results section.

7. Overall, all information (apps data, algorithms, results, etc.) is in textual format. Please rewrite using technical details.

8. Please make graphs to augment the result analysis.

9. The references are too old; please add some more recent references.

Reviewer #2: The manuscript presents an analysis of the mobile health application market and uses machine learning models to predict the download behavior of these apps. The following points need to be considered:

1. The machine learning models used in the study are supervised learning algorithms. How did the authors label the data for the models?

2. What kind of metrics/qualitative parameters were extracted from the raw data to feed into the machine learning model for prediction?

3. How did the authors choose the best factors that influence or affect the download behavior? What were the criteria for choosing the factors?

4. The results and analysis section needs some more details.

5. The English language needs proper revision.

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2021 Jan 6;16(1):e0244302. doi: 10.1371/journal.pone.0244302.r002

Author response to Decision Letter 0


6 Oct 2020

Reviewer #1: This manuscript presents an investigation of the current landscape of smartphone apps that aim to improve and sustain health and well-being. This study contributes to the current knowledge on m-health application use by reviewing mobile health applications using content analysis and machine learning algorithms. Although the approach is good, the manuscript requires major improvements in order to be considered for acceptance.

1. The overall presentation and structure of the paper are not good. Please consider the following comments:

a. Proper alignment (on right margin) is required.

Author(s): Thank you for highlighting this issue; the text is now properly aligned.

b. The manuscript has many unevenly sized paragraphs; some are very short and others are too long. Please be consistent and make the paragraphs roughly the same size without disturbing the flow of concepts.

Author(s): Several improvements were made throughout the text to improve readability, formatting and paragraph sizes.

c. On page 4, lines 14, 16, 18, 20, 22 and 24, and page 5, line 1: please start each sentence with a capital letter.

Author(s): Thank you; the problems have been corrected.

d. The paper has many punctuation, typographical, and grammar mistakes; please proofread it carefully.

Author(s): The paper was proofread by a native speaker to rectify typos and grammar mistakes.

e. The Conclusion must be revised. Please write a concise conclusion and add future directions. The shortcomings and other relevant details can be added to a separate discussion section with more details.

Author(s): We have revised the Conclusion and removed the irrelevant section to incorporate related future research directions.

2. Page 9, line 17 and page 11, lines 3 to 11: please provide references to justify your statements.

Author(s): References have been added to the related sections.

3. The figures are not clear. Please improve the image quality, and describe each figure in detail in the text. Figure 2 is not acceptable; please redraw it.

Author(s): The figure has been completely changed and redrawn.

4. Please give more details on the proposed methodology to aid the reader’s understanding. You may add an algorithm or flow chart for the ease of the readers.

Author(s): An overall model has been added as Figure 2 to clarify the model used in the analysis. Moreover, more details on the analysis method and the algorithms have been added to the paper.

5. In Table 2, column 3, why is there an extra sub-column? In Table 3, column 3, please use 0 (zero) before the decimal point; for example, instead of writing “.6125”, please write “0.6125” throughout for readability.

Author(s): The extra column has been removed and a leading 0 has been added before each decimal point.

6. Please add a table comparing the outputs/results of the different machine learning algorithms in the data analysis and results section.

Author(s): The results and performance of each algorithm are presented in Table 4. In addition, a figure has been added to the relevant section for better visual communication. After the first run of analysis, an extra ANN analysis was performed, and its results are displayed in Table 5. This part has been made clearer in the manuscript. Thank you for pointing this out.

7. Overall, all information (apps data, algorithms, results, etc.) is in textual format. Please rewrite using technical details.

Author(s): Formulas and figures were added to the paper to make it easier to understand the logic and mathematical foundations of each algorithm applied.

8. Please make graphs to augment the result analysis.

Author(s): A pie chart has been added to the discussion section to make the final results clearer. Moreover, a chart depicting the performance of each machine learning algorithm has been added to the text.

9. The references are too old; please add some more recent references.

Author(s): Six new studies published in the last two years have been added to the text.

● Agnihothri, S., Cui, L., Delasay, M., & Rajan, B. (2020). The value of mHealth for managing chronic conditions. Health Care Management Science, 23, 185–202. https://doi.org/10.1007/s10729-018-9458-2

● Bettiga, D., Lamberti, L., & Lettieri, E. (2020). Individuals’ adoption of smart technologies for preventive health care: a structural equation modeling approach. Health Care Management Science, 23, 203–214. https://doi.org/10.1007/s10729-019-09468-2

● Davalbhakta, S., Advani, S., Kumar, S., Agarwal, V., Bhoyar, S., Fedirko, E., Misra, D. P., Goel, A., Gupta, L., & Agarwal, V. (2020). A Systematic Review of Smartphone Applications Available for Corona Virus Disease 2019 (COVID19) and the Assessment of their Quality Using the Mobile Application Rating Scale (MARS). Journal of Medical Systems, 44(9). https://doi.org/10.1007/s10916-020-01633-3

● Levine, D. M., Co, Z., Newmark, L. P., Groisser, A. R., Holmgren, A. J., Haas, J. S., & Bates, D. W. (2020). Design and testing of a mobile health application rating tool. Npj Digital Medicine, 3(1). https://doi.org/10.1038/s41746-020-0268-9

● Rowland, S. P., Fitzgerald, J. E., Holme, T., Powell, J., & McGregor, A. (2020). What is the clinical value of mHealth for patients? Npj Digital Medicine, 3(1). https://doi.org/10.1038/s41746-019-0206-x

● Zhang, Y., Liu, C., Luo, S., Xie, Y., Liu, F., Li, X., & Zhou, Z. (2019). Factors influencing patients’ intention to use diabetes management apps based on an extended unified theory of acceptance and use of technology model: Web-based survey. Journal of Medical Internet Research, 21(8), 1–17. https://doi.org/10.2196/15023

Reviewer #2: The manuscript presents an analysis of the mobile health application market and uses machine learning models to predict the download behavior of these apps. The following points need to be considered:

1. The machine learning models used in the study are supervised learning algorithms. How did the authors label the data for the models?

Author(s): For classification the total number of downloads has been used as the class variable. This part has been made clearer in the manuscript. Thank you for highlighting this issue.
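
As a rough illustration of this labeling step, the sketch below turns a raw download count into a class label that a supervised classifier could use as its target. The band edges, the band names and the app names are hypothetical; the response above does not state which cut-offs the authors used.

```python
from bisect import bisect_right

# Hypothetical lower bounds of the "medium" and "high" bands.
# These cut-offs are assumptions for illustration, not values from the paper.
BAND_EDGES = [10_000, 1_000_000]
BAND_NAMES = ["low", "medium", "high"]


def download_class(total_downloads: int) -> str:
    """Map a total download count to an ordinal class label."""
    if total_downloads < 0:
        raise ValueError("download count cannot be negative")
    # bisect_right counts how many band edges are <= the download count,
    # which is exactly the index of the band the app falls into.
    return BAND_NAMES[bisect_right(BAND_EDGES, total_downloads)]


# Toy marketplace records (invented app names and counts).
apps = {"app_a": 5_000, "app_b": 10_000, "app_c": 2_500_000}
labels = {name: download_class(count) for name, count in apps.items()}
print(labels)
```

With labels produced this way, each app's coded features become the inputs and the download band becomes the class variable for training.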

2. What kind of metrics/ qualitative parameters have been extracted from the raw data to feed into the machine learning model for prediction?

Author(s): A detailed table, Table 1 Code Sheet, presents the metrics/ parameters that have been extracted and fed into the machine learning models. This has been made clearer in the manuscript, thank you for pointing this out.
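
To make the idea of coded parameters concrete, the sketch below represents each app as a vector of binary codes and fits a small Bernoulli Naive Bayes classifier, one of the three algorithm families named in the abstract. It is a minimal sketch and not the authors' implementation: the four feature names are taken from the features mentioned in the abstract, while the six toy rows, their labels and the smoothing value are invented for illustration.

```python
import math
from collections import defaultdict

# Binary codes per app (1 = feature present in the marketplace listing).
FEATURES = ["privacy_policy", "video", "in_app_purchase", "social_sharing"]


def train_bernoulli_nb(rows, labels, alpha=1.0):
    """Estimate log priors and smoothed per-feature probabilities per class."""
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    model = {}
    for label, members in by_class.items():
        n = len(members)
        log_prior = math.log(n / len(rows))
        # Laplace smoothing keeps every probability strictly between 0 and 1.
        probs = [
            (sum(r[j] for r in members) + alpha) / (n + 2 * alpha)
            for j in range(len(FEATURES))
        ]
        model[label] = (log_prior, probs)
    return model


def predict(model, row):
    """Return the class with the highest log posterior for one app."""
    def log_posterior(label):
        log_prior, probs = model[label]
        return log_prior + sum(
            math.log(p if x else 1.0 - p) for x, p in zip(row, probs)
        )
    return max(model, key=log_posterior)


# Invented toy data: six coded apps with a download band as the class.
rows = [
    [1, 1, 1, 1], [0, 1, 1, 1], [1, 0, 1, 1],
    [1, 0, 0, 0], [0, 0, 0, 0], [0, 1, 0, 0],
]
labels = ["high", "high", "high", "low", "low", "low"]

model = train_bernoulli_nb(rows, labels)
print(predict(model, [0, 0, 1, 1]))
print(predict(model, [1, 0, 0, 0]))
```

A real analysis would use the full code sheet and held-out data to estimate accuracy; the toy rows here only show how coded features and a class variable fit together.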

3. How did the authors choose the best factors that influence or affect the download behavior? What were the criteria for choosing the factors?

Author(s): All the relevant data available on the marketplaces regarding each app were taken into consideration. This study also focuses on the features and keywords that can be extracted from the textual data available in mobile app descriptions. Consequently, studies on mobile app acceptance were used as guidelines, yet the machine learning algorithms provided distinct keywords related to download behavior and user scores that were not known to the researchers beforehand.

4. The results and analysis section needs some more details.

Author(s): A new table and a chart have been added to the results section. Moreover, several formulas and a chart have been added to the analysis section to make the algorithms and the analysis methodology easier to understand. The results section has also been expanded with new graphs and several amendments to the text.

5. The English language needs proper revision.

Author(s): The paper was proofread by a native speaker to rectify typos and grammar mistakes.

Attachment

Submitted filename: Response to Reviewers.docx

Decision Letter 1

Farrukh Aslam Khan

11 Nov 2020

PONE-D-20-12673R1

Insights into mobile health application market via a content analysis of marketplace data with machine learning

PLOS ONE

Dear Dr. Aydin,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Dec 26 2020 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

We look forward to receiving your revised manuscript.

Kind regards,

Farrukh Aslam Khan

Academic Editor

PLOS ONE

Additional Editor Comments (if provided):

Please carefully check the paper for English language mistakes including punctuation, typos, etc. I would recommend getting the paper checked by a native speaker.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: (No Response)

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Reviewer #1: I am happy that the comments have been addressed carefully; however, there are still a few minor punctuation mistakes that may be addressed:

1. Please use a comma "," after e.g. as e.g., throughout.

Reviewer #2: Most of the revision prompted by the comments has been incorporated and no further explanation is required.

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Noshina Tariq

Reviewer #2: No

PLoS One. 2021 Jan 6;16(1):e0244302. doi: 10.1371/journal.pone.0244302.r004

Author response to Decision Letter 1


13 Nov 2020

Reviewer #1: I am happy that the comments have been addressed carefully; however, there are still a few minor punctuation mistakes that may be addressed:

1. Please use a comma "," after e.g. as e.g., throughout.

Authors: The authors thank the reviewer for the comments and suggestions, which definitely helped in improving the manuscript. This minor issue has been settled and a comma was added after `e.g.` throughout the manuscript.

Reviewer #2: Most of the revision prompted by the comments has been incorporated and no further explanation is required.

Authors: The authors thank the reviewer for the valuable input that helped us in improving the manuscript.

Attachment

Submitted filename: Response to Reviewers R2.docx

Decision Letter 2

Farrukh Aslam Khan

8 Dec 2020

Insights into mobile health application market via a content analysis of marketplace data with machine learning

PONE-D-20-12673R2

Dear Dr. Aydin,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Farrukh Aslam Khan

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

The authors have addressed all the reviewers’ comments. The paper is in good shape now.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: (No Response)

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Noshina Tariq

Acceptance letter

Farrukh Aslam Khan

15 Dec 2020

PONE-D-20-12673R2

Insights into mobile health application market via a content analysis of marketplace data with machine learning

Dear Dr. Aydin:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Farrukh Aslam Khan

Academic Editor

PLOS ONE

Associated Data

    Supplementary Materials

    Attachment

    Submitted filename: Response to Reviewers.docx

    Attachment

    Submitted filename: Response to Reviewers R2.docx

    Data Availability Statement

    The dataset is hosted on OSF. Researchers can access the dataset via the following link and DOI: https://osf.io/fm5vc/.


    Articles from PLoS ONE are provided here courtesy of PLOS
