Table 2.
Overview of all functions contained in the WhatsR package and their respective features
| Type | Name | Features |
|---|---|---|
| Processing functions | parse_chat() | Takes an exported WhatsApp chat log as input and converts it into a data frame with one row per message and 15 feature columns (see Table 1). It has additional parameters for anonymizing the chat log and for removing messages from non-consenting participants |
| | parse_ios() | Subfunction of parse_chat(). Separates the raw text file into different messages and distinguishes user-generated messages from WhatsApp system messages as per the iOS chat log structure |
| | parse_android() | Subfunction of parse_chat(). Separates the raw text file into different messages and distinguishes user-generated messages from WhatsApp system messages as per the Android chat log structure |
| Visualization functions | plot_messages()<sup>a</sup> | Function for visualizing the number of messages per sender as a bar plot, cumulative sum, heatmap, or pie chart |
| | plot_tokens()<sup>a</sup> | Function for visualizing the distributions of tokens sent per sender and message as a bar plot, box plot, violin plot, or heatmap |
| | plot_tokens_over_time()<sup>a</sup> | Function for visualizing the number of tokens per sender across time. Includes visualizations per year, per month, per day, per hour of day, per day of week, and all time |
| | plot_smilies()<sup>a</sup> | Function for visualizing the number of smileys sent per sender as a bar plot, cumulative sum, heatmap, or split bar plot |
| | plot_emoji()<sup>a</sup> | Function for visualizing the number of emojis sent per sender as a bar plot, cumulative sum, heatmap, or split bar plot |
| | plot_links()<sup>a</sup> | Function for visualizing the number of links or domains sent per sender as a bar plot, cumulative sum, heatmap, or split bar plot |
| | plot_media()<sup>a</sup> | Function for visualizing the number of media files or file types sent per sender as a bar plot, cumulative sum, heatmap, or split bar plot |
| | plot_wordcloud()<sup>a</sup> | Function for visualizing word clouds from tokenized versions of messages. Essentially a wrapper for the ggwordcloud R package<sup>b</sup> |
| | plot_network()<sup>a</sup> | Function for visualizing networks of user interactions in WhatsApp chat logs. Essentially a wrapper for the visNetwork R package.<sup>c</sup> Constructs an edge between two users for each consecutive message. Edges can be built based on sent tokens, emojis, smileys, locations, URLs, media files, or number of sent messages |
| | plot_lexical_dispersion()<sup>a</sup> | Function for visualizing the occurrence of specific tokens in the sent chat messages. Requires raw message texts to be present in the data frame |
| | plot_replytimes()<sup>a</sup> | Function for visualizing the distribution of time delays for responding to a previous message, or being responded to, for each participant in the chat |
| | plot_locations()<sup>a</sup> | Function for visualizing locations sent within the chats on a map. Essentially a wrapper for the ggmap R package.<sup>d</sup> Requires non-anonymized chat logs as input. Temporarily not available in the CRAN version due to pending changes in a dependency package |
| Summary functions | summarize_chat()<sup>a</sup> | Function for summarizing basic statistics about a WhatsApp chat log. Contains number of messages, tokens, participants, system messages, emojis, smileys, links, media files, and locations. Also computes datetime of first and last message and total duration of the chat |
| | summarize_tokens_per_person()<sup>a</sup> | Function for summarizing basic statistics about tokens sent per person. Contains timestamp of first and last message and distribution of sent tokens for each chat participant |
| Helper functions | download_emoji()<sup>a</sup> | Helper function for scraping a dictionary of emojis from the Unicode website<sup>e</sup> and building a corresponding data frame. Can be used to update the built-in emoji dictionary manually if new emojis are added to WhatsApp |
| | tailor_chat()<sup>a</sup> | Helper function to restrict a parsed WhatsApp chat log to specific timeframes or senders, or to exclude WhatsApp system messages |
| Testing function | create_chatlog()<sup>a</sup> | Function for creating files with the same structure as exported, unparsed WhatsApp chat logs using artificial names, telephone numbers, and lorem ipsum message text.<sup>f</sup> Contains parameters to control the operating system, language settings, time settings, first and last timestamp of the message, and number of users, emojis, unique emojis, links, locations, smileys, unique smileys, media, and self-deleting photos in the chat. These files can be used for testing the correct setup of the ChatDashboard framework (see Section 3) |
<sup>a</sup> All visualization functions have parameters for restricting plots to specified timeframes and senders, and for excluding system messages from plots. They return either a customizable ggplot2 object or the preprocessed data frame.
<sup>b</sup> https://cran.r-project.org/web/packages/ggwordcloud/vignettes/ggwordcloud.html
<sup>c</sup> https://www.rdocumentation.org/packages/visNetwork/versions/2.1.2
<sup>d</sup> https://cran.r-project.org/web/packages/ggmap/readme/README.html
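
The plot_network() entry in Table 2 states that an edge is constructed between two users for each consecutive message. The following base-R sketch illustrates that idea on a toy data frame. It is not taken from the WhatsR source code: the column name `sender`, the edge direction (from the author of the later message to the author of the preceding one), and the removal of self-loops are all assumptions made for illustration.

```r
# Toy parsed chat: one row per message, in chronological order.
# The column name `sender` is illustrative, not necessarily WhatsR's schema.
chat <- data.frame(
  sender = c("Alice", "Bob", "Alice", "Bob", "Carol", "Carol", "Alice"),
  stringsAsFactors = FALSE
)

# Pair the sender of each message with the sender of the preceding message.
n <- nrow(chat)
edges <- data.frame(
  from = chat$sender[-1],  # author of the later message (assumed direction)
  to   = chat$sender[-n],  # author of the preceding message
  stringsAsFactors = FALSE
)

# Assumption: a sender following up on their own message is not an interaction.
edges <- edges[edges$from != edges$to, ]

# Collapse repeated pairs into a weighted edge list (weight = message count).
edges$weight <- 1L
edge_list <- aggregate(weight ~ from + to, data = edges, FUN = sum)
print(edge_list)
```

Here each edge is weighted by the number of messages. Weighting by tokens, emojis, or links, as the table describes, would amount to summing a different per-message column in place of the constant `weight`.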