STAR Protocols. 2024 Feb 22;5(1):102902. doi: 10.1016/j.xpro.2024.102902

Protocol for programming a degree-to-careers dashboard in R using Posit and Shiny

Mark A. Perkins, Jonathan W. Carrier
PMCID: PMC10901136  PMID: 38393952

Summary

Here, we present a protocol for programming a degree-to-careers dashboard in R using Posit and Shiny. We describe steps for installing software, obtaining datasets, and munging and joining data. We then detail procedures for programming and publishing the dashboard as a Shiny web application that includes a filtered data table. The resulting dashboard links academic programs with careers and income data and may be useful to inform decision-making by higher education leaders and policymakers.

For complete details on the use and execution of this protocol, please refer to Perkins and Carrier (2023).1

Subject areas: Clinical Protocol, Computer sciences

Graphical abstract


Highlights

  • Steps for querying data from a local SQL server and from publicly available sources

  • Munging and formatting the data into one usable data table using Posit packages

  • Programming and publishing a Shiny dashboard including filtered data download


Publisher’s note: Undertaking any experimental protocol requires adherence to local institutional guidelines for laboratory safety and ethics.



Before you begin

The programming code for this project is published on the affiliated RPubs page at this link: https://rpubs.com/anonymous_dashboarder/1072766. The Shiny dashboard application is published at this link: https://anonymousdashboards.shinyapps.io/Application/. Be sure to have a working Windows or Macintosh computer with permissions to install new software and access to program datasets from an institution. Working with a particular institution or institutions ensures that these data can be obtained. It is also important to understand concepts such as CIP codes, SOC codes, and other terminology used in this protocol. Table 1 provides important terms and detailed descriptions of the data downloaded and used with this protocol.

Table 1.

Key terms, items, and definitions relevant to the protocol

Term Item(s) Definition
CIP Code
  • Data from one or more higher education institutions

  • CIP/SOC Crosswalk data from the National Center for Education Statistics (NCES)

“The Classification of Instructional Programs (CIP) provides a taxonomic scheme that supports the accurate tracking and reporting of fields of study and program completions activity.”2 It thus assigns a unique identifier to each program that can be used universally across institutions.
SOC Code
  • CIP/SOC Crosswalk data from the National Center for Education Statistics (NCES)

  • Labor Market Projections data

“The 2018 Standard Occupational Classification (SOC) system is a federal statistical standard used by federal agencies to classify workers into occupational categories for the purpose of collecting, calculating, or disseminating data. All workers are classified into one of 867 detailed occupations according to their occupational definition.”3
Program
  • Data from one or more higher education institutions

A major or minor program of study uniquely labeled for the institution but associated with a CIP code. May include certificate, associate's, bachelor's, or graduate programs.4
Cost of Living Index
  • Missouri Economic and Research Center data

A comparison of the expenses of a specific geographical area to a general geographical area. In this case, states are compared to the United States as a whole, which has an index of 100. Indexes >100 are therefore more expensive than the U.S. in general, and those <100 are less expensive.5
Income Data and Job projections data
  • Bureau of Labor Statistics data

  • Projections Central data

Median hourly and annual salary as well as ten-year employment numbers and percent growth.6,7
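Later in the protocol, reported incomes are divided by the cost of living index described above; a minimal base-R sketch of that adjustment, using made-up wage and index values:

```r
# Hypothetical example: an annual median wage of $60,000 in a state whose
# cost of living index is 120 (20% more expensive than the U.S. average).
annual_median_wage <- 60000
index <- 120 / 100            # the protocol rescales the index to a decimal

adjusted_wage <- annual_median_wage / index   # 50000

# Dividing each state's wage by its own index makes adjusted wages
# comparable across states.
```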

Finally, there are other dashboard tools, including Tableau and PowerBI. This protocol uses R and affiliated packages. R is an open-source tool that anyone can use at no cost, and it serves as an all-in-one environment that allows for data munging as well as dashboard programming. Python is also open source and provides dashboard capability as well as data manipulation; other studies could replicate this protocol in Python.

CRITICAL: The data scientist should learn fundamental skills in R, Posit, and Shiny before engaging in this protocol. There are a number of good tutorials to help with this. Books by Paradis,8 Wickham and Grolemund,9 and the Shiny online guide10 provide fundamental programming instruction and theory for R. Finally, even the most experienced R user must endure trial and error; thus, this protocol is best for R users with medium to advanced skills.

Institutional permissions

The programs’ data in the demonstrated product are anonymized. Other data are publicly available through government and agency websites. Be sure to obtain permission from participants before publishing their data.

Installation

Timing: ~1 h

This protocol requires the installation of the general R program and RStudio.

Use the console to install R packages for the libraries loaded later.

  • 1.
    Install R base on a computer.

> if(!require("devtools")) install.packages("devtools")

> install.packages(c("shiny", "shinydashboard", "shinyWidgets"))

> install.packages(c("tidyverse", "dplyr", "knitr", "data.table", "DT", "formattable", "shinydashboard", "openxlsx", "readxl", "zip"))

  • 2.
    Visit the Posit website to download RStudio.
    • a.
      Scroll to a blue button (Figure 1) that says “Download RStudio Desktop”.
    • b.
      The free version of this software is adequate for this protocol.
  • 3.

    Open RStudio and on the top toolbar, click on “File”, then “New” and then select “RMarkdown” (Figure 2).

Note: To change the look of the screen, go to “Tools”, then “Global Options”, and then “Appearance”. The demonstrated screen style for this protocol is set as “Vibrant Ink” under “Editor Theme”.

Note: The R programming code for this protocol is found at this link: https://rpubs.com/anonymous_dashboarder/careers.

  • 4.

    Load the installed libraries in RMarkdown as shown. See troubleshooting 1 for installation problems.

> library(readxl)

> library(openxlsx)

> library(tidyverse)

> library(dplyr)

> library(knitr)

> library(shiny)

> library(shinyWidgets)

> library(DT)

> library(formattable)

> library(data.table)

> library(shinydashboard)

> library(zip)

Figure 1.


Click this to download RStudio

Figure 2.


Click on “File” then “R Markdown”

Key resources table

REAGENT or RESOURCE SOURCE IDENTIFIER
Deposited data

Program information Institutions’ SQL Servers N/A
Occupational Wage Data By State Projections Central23 https://projectionscentral.org/
CIP/SOC Crosswalk National Center for Education Statistics4 https://nces.ed.gov/ipeds/cipcode/post3.aspx?y=56
Cost of Living Data by State Missouri Research and Information Center5 https://meric.mo.gov/data/cost-living-data-series
Job Projections by State U.S. Bureau of Labor Statistics24 https://www.bls.gov/oes/tables.htm

Software and algorithms

Personal computer Technology equipment The user must have a Windows or Macintosh PC with the ability to download R. Most standard computers are able to do this.
R and Posit Software Open-source software The user will need to download base R, RStudio, and Shiny.
data.table package in R This package aggregates data into tables25 https://cran.r-project.org/web/packages/data.table/index.html
Dplyr package in R This is a data grammar package that helps manipulate and munge data. It is part of the tidyverse but is sometimes loaded or called individually16 https://dplyr.tidyverse.org/
DT package in R This package works with data.table to provide user-friendly HTML data tables19 https://rstudio.github.io/DT/
Formattable package in R This package allows the user to format data frames and vectors for presentation20 https://www.rdocumentation.org/packages/formattable/versions/0.2.1
Knitr package in R This package allows users to generate iterative output on Rmarkdown. It also allows users to publish code and output17 https://www.r-project.org/nosvn/pandoc/knitr.html
Openxlsx package in R This package allows users to read, write, and edit .xlsx files21 https://cran.r-project.org/web/packages/openxlsx/index.html
Readxl package in R This package allows the user to read Excel files and write them as R data frames. This is a part of the tidyverse26 https://readxl.tidyverse.org/
Shiny package in R This package allows users to build interactive web applications14 https://cran.r-project.org/web/packages/shiny/index.html
Shinydashboard package in R This is an extension to Shiny allowing users to generate dashboards27 https://rstudio.github.io/shinydashboard/
shinyWidgets package in R This is an extension to Shiny adding extra widgets to web applications28 https://github.com/dreamRs/shinyWidgets
Tidyverse package in R This package is a conglomeration of several packages that help with data munging, visualization, and general data science29 https://www.tidyverse.org/
zip package in R This package allows the user to compress files30 https://cran.r-project.org/web/packages/zip/index.html

Note: The key resources table provides links to all the data and resources required for this protocol. Some links may expire over time, in which case a web search of key terms will lead to them. The first two resources cover the requirement of a personal computer with a Windows, macOS, or Linux operating system onto which the user will download the R and Posit software. The next list of resources includes the data sources used for this protocol as described in the above table. The final list includes the R packages to install to complete the task.

Step-by-step method details

This part of the protocol first describes the process of obtaining all datasets using SQL and the R programming language, then the process of munging the data in RMarkdown, followed by the process of programming the dashboard, including all user interface and server elements in Shiny. The result is an interactive dashboard linking institution degrees to careers, job projections, and income information. It is important to keep all data files in the same folder system as the RMarkdown file. This is also where files written in R will be saved, which makes it easy to link data directly to the code.

Obtaining academic program data

Timing: ~1 h

This section provides an overview of how to query the data from the academic institutions. This requires the use of Structured Query Language (SQL).

Note: This protocol gathered two separate queries, one from a university, and one from seven community colleges.

  • 1.

    Obtain data from the institutions’ SQL databases where they retain student records.

  • 2.

    Include institution name, the student id (which is later aggregated and excluded), the CIP code, the name of the program, the academic period (or term), and the academic year.

  • 3.

    Query from a SQL table called “DEGREE_TABLE” and limit to the 2016-17 academic year through the 2021-22 academic year.

  • 4.

    Save this query as a .csv file for RMarkdown.

> SELECT DISTINCT

> ID,

> Institution,

> TermCode,

> Year,

> Degree_Type,

> PROGRAM_DESC,

> CIP

> FROM DEGREE_TABLE

> WHERE Year IN ('2017', '2018', '2019', '2020', '2021', '2022')

> AND Institution IN ('College 1', 'College 2', 'College 3', 'College 4', 'College 5', 'College 6', 'College 7');

  • 5.

    Import the .csv file into R using the read.csv command, naming it “U_Count” with the “<-” assignment operator.

  • 6.

    Use the group_by command to count the number of degrees by institution, term, degree type, program, and CIP.

  • 7.

    Change the “Freq” column to “NumberOfGraduates”.

Note: Changing column names allows them to match the other colleges’ data to import later.

Note: Two queries resulted in an extraction of 207 de-duplicated CIP codes from eight institutions of higher education.

Note: One university file is named “Ucount.csv”.

Note: The college file is named “CCData.2.CSV”.

Note: All files are included on the dashboard.

> U_Count<- read.csv("U_Count.csv")%>%

group_by(Institution, TermCode, DegreeType, PROGRAM_DESC, CIP) %>%

summarize(Freq=n())%>%

rename("NumberOfGraduates" = Freq)

  • 8.
    The next step is to download all the data and vertically merge the databases.
    • a.
      Download the college data, naming it “CC” using the read.csv command.
    • b.
      Transform the CIP code into a character variable instead of a numeric variable.
    • c.
      Import the university data and transform the CIP variable into a character.
    • d.
      Vertically merge the university dataset to the college dataset using the bind_rows command.
    • e.
      Name the dataset “All_University”.

> CC<- read.csv("CCData2.csv")

> CC$CIP<- as.character(CC$CIP)

> U_Count<- read.csv("Ucount.csv")

> U_Count$CIP<- as.character(U_Count$CIP)

> All_University<- bind_rows(CC, U_Count)

Import wage and career data from public websites

Timing: ~1 h

This step illustrates how to obtain publicly accessible data, import them into R, munge them, and write them as data tables.

  • 9.

    Obtain the cost of living data from the Missouri Research and Information Center (see the above table for a link and more information) by copying and pasting it into an Excel file.

  • 10.

    Load the cost of living data and name it “col” using the code col<- read_excel("COL_Index.xlsx"), and save the file in the same folder as the other data and the RMarkdown file.

  • 11.

    Select only two columns from the original dataset: Area Name and Index.

  • 12.

    Transform the index variable into a decimal by dividing by 100.

> col<- read_excel('COL_Index.xlsx')%>%

select(Area_Name, Index)

> col$Index<- col$Index/100

  • 13.

    Next, load the wage data by state.

  • 14.

    Obtain occupational projections from Projections Central (see the above table for a link and more information), where they can be downloaded as an Excel file, and save the file in the same folder as the RMarkdown file.

  • 15.

    Name the database “wage” using the read_excel command.

  • 16.

    Rename the “OCC_CODE” variable to “SOC” to join this column with other datasets.

  • 17.

    Concatenate AREA_TITLE and SOC.

  • 18.

    Convert all the wage metrics into numeric values to use in mathematical formulas. Prior to this, they loaded as character values.

> wage<- read_excel('StateWage.xlsx')%>%

rename("SOC" = "OCC_CODE")%>%

mutate(StateSoc = paste(AREA_TITLE,SOC))

> wage$H_MEAN<- as.numeric(wage$H_MEAN)

> wage$A_MEAN<- as.numeric(wage$A_MEAN)

> wage$A_MEDIAN<- as.numeric(wage$A_MEDIAN)

> wage$H_MEDIAN<- as.numeric(wage$H_MEDIAN)
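The conversion matters because wage columns read from a spreadsheet can arrive as character strings, on which arithmetic fails; a tiny illustration with made-up values:

```r
# Hypothetical hourly mean wages as they might load from a spreadsheet.
h_mean_raw <- c("52", "48")      # character, not numeric

# mean(h_mean_raw) would error ("argument is not numeric or logical"),
# so convert first, as the protocol does for each wage column.
h_mean <- as.numeric(h_mean_raw)
mean(h_mean)                     # 50
```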

  • 19.

    Download the CIP-to-SOC crosswalk data from the National Center for Education Statistics as a .txt file. See troubleshooting 2 for potential problems with importing files.

Note: Download the CIP/SOC crosswalk data from the National Center for Education Statistics as a .txt file, as Excel often transforms CIP codes into dates. Use the read.delim command.

  • 20.

    Use the read.delim command to read the file.

  • 21.

    Select all fields from the original file except the CIP title, as this protocol uses the degree title from the institution data.

  • 22.

    Transform the CIP codes from numeric values to characters for merging.

  • 23.

    Rename variables to CIP and SOC to be compatible with the other datasets.

> cipsoc<- read.delim("CIP_SOC.txt")%>%

select(-c("CIP2020Title"))

> cipsoc$CIP2020Code<- as.character(cipsoc$CIP2020Code)

> cipsoc<- cipsoc%>%

rename("CIP" = "CIP2020Code")%>%

rename("SOC" = "SOC2018Code")%>%

rename("Career_Title" = "SOC2018Title")

  • 24.

    Finally, load the job projections data from the Bureau of Labor Statistics (see the above table for a link and more information).

  • 25.

    Save the job projections data in the same folder as the RMarkdown file as “ltprojections.csv”.

  • 26.

    Select all columns of this dataset except “stfips” by excluding that variable with the select command.

  • 27.

    Rename the variable “code” to “SOC” for compatibility in joining with the other data.

> projections<- read.csv("ltprojections.csv")%>%

select(-"stfips")%>%

rename("SOC" = "code")

Munge and link data into usable dataset

Timing: ~4 h

  • 28.

    Now work with the All_University file to generate some new variables and prepare it for a Shiny application dashboard (see Obtaining Academic Program Data 3.3). See troubleshooting 3, troubleshooting 4 and troubleshooting 5 for potential errors in programming syntax.

  • 29.

    Subset (or limit) the data only to include academic terms from the fall of 2016 to the summer of 2022.

  • 30.

    Next, using the mutate command, code a new variable called “Year” that includes each respective academic term.

> All_University<- All_University%>%

subset(TermCode %in% c("16/FA", "17/SP", "17/SU", "17/FA", "18/SP", "18/SU", "18/FA", "19/SP", "19/SU", "19/FA", "20/SP", "20/SU", "20/FA", "21/SP", "21/SU", "21/FA", "22/SP", "22/SU")) %>%

mutate(Year = case_when(

        TermCode == "16/FA" ~ "2016-17",

        TermCode == "17/SP" ~ "2016-17",

        TermCode == "17/SU" ~ "2016-17",

        TermCode == "17/FA" ~ "2017-18",

        TermCode == "18/SP" ~ "2017-18",

        TermCode == "18/SU" ~ "2017-18",

        TermCode == "18/FA" ~ "2018-19",

        TermCode == "19/SP" ~ "2018-19",

        TermCode == "19/SU" ~ "2018-19",

        TermCode == "19/FA" ~ "2019-20",

        TermCode == "20/SP" ~ "2019-20",

        TermCode == "20/SU" ~ "2019-20",

        TermCode == "20/FA" ~ "2020-21",

        TermCode == "21/SP" ~ "2020-21",

        TermCode == "21/SU" ~ "2020-21",

        TermCode == "21/FA" ~ "2021-22",

        TermCode == "22/SP" ~ "2021-22",

        TermCode == "22/SU" ~ "2021-22"))

Note: As developed, the dataset is in long form by term, meaning there is one column for term with a different term on each row. The code shows how to select all the columns needed in the dataset and then make every term its own column.

  • 31.

    Select the columns of interest.

  • 32.

    Use spread to convert it from long to wide with each term as its own column.

  • 33.

    Call the data table “ByTerm”.

  • 34.

    Re-write it to order the terms chronologically.

> ## Simplify file to these variables

> All_University<- select(All_University, Institution, TermCode, Year, DegreeType, PROGRAM_DESC, CIP, NumberOfGraduates)

> ## Spread/pivot the data so each term is a different row

> ByTerm<- select(All_University, Institution, TermCode,DegreeType, PROGRAM_DESC, CIP, NumberOfGraduates)

> ByTerm<- spread(ByTerm, key=TermCode, value = NumberOfGraduates)

> ## Re-order the data set

> ByTerm<- ByTerm %>%

select(Institution, DegreeType, PROGRAM_DESC, CIP, "16/FA", "17/SP", "17/SU", "17/FA", "18/SP", "18/SU", "18/FA", "19/SP", "19/SU", "19/FA", "20/SP", "20/SU", "20/FA", "21/SP", "21/SU", "21/FA", "22/SP", "22/SU")
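As a sanity check of the long-to-wide step above, the same reshape can be reproduced on a toy table in base R (the program and term values here are hypothetical):

```r
# One row per program x term, mirroring the long-form structure.
long <- data.frame(
  PROGRAM_DESC      = c("Biology", "Biology", "History"),
  TermCode          = c("16/FA", "17/SP", "16/FA"),
  NumberOfGraduates = c(10, 12, 5)
)

# direction = "wide" gives each TermCode its own column, like spread().
wide <- reshape(long, idvar = "PROGRAM_DESC", timevar = "TermCode",
                direction = "wide")

# Program-term combinations with no graduates become NA, which the next
# step of the protocol replaces with 0.
```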

  • 35.

    With the “ByTerm” dataset, replace missing values with 0 using the is.na command.

  • 36.

    Then calculate the total number of graduates by academic year using the mutate and rowSums commands.

> ### Calculate custom Sum Variables

> ByTerm<- ByTerm %>%

replace(is.na(.), 0)

> ByTerm<- ByTerm %>%

mutate("Total_16_17" = rowSums(across(5:7))) %>%

mutate("Total_17_18" = rowSums(across(8:10))) %>%

mutate("Total_18_19" = rowSums(across(11:13))) %>%

mutate("Total_19_20" = rowSums(across(14:16))) %>%

mutate("Total_20_21" = rowSums(across(17:19))) %>%

mutate("Total_21_22"= rowSums(across(20:22))) %>%

mutate("Grand_Tot" = rowSums(across(5:22)))
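The across(5:7) ranges above are positional and assume the term columns appear in exactly the order selected earlier; a small base-R sketch of the same zero-fill and row-total logic, with hypothetical counts and columns referenced by name:

```r
# Three term columns for one academic year; NA means no graduates.
terms <- data.frame(`16/FA` = c(10, NA),
                    `17/SP` = c(NA, 4),
                    `17/SU` = c(2, NA),
                    check.names = FALSE)

terms[is.na(terms)] <- 0       # equivalent of replace(is.na(.), 0)
terms$Total_16_17 <- rowSums(terms[, c("16/FA", "17/SP", "17/SU")])
# Name-based selection stays correct even if columns are reordered.
```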

  • 37.
    Join the “ByTerm” institution data set with the CIP/SOC crosswalk, job projections, and wage data. The code illustrates how to do this using the join commands.
    • a.
      Left join the crosswalk to the job projections data.
    • b.
      Inner join that table to the programs data.
    • c.
      Left join that table with the wage data.
  • 38.

    Rename variables to make them more readable to the end user.

  • 39.

    Order the dataset using the select command.

> dashdata<- left_join(cipsoc, projections, by = "SOC")

> dashdata<- inner_join(ByTerm, dashdata, by = "CIP")%>%

rename("Program" = "PROGRAM_DESC")%>%

rename("Area_Name" = "areaname")%>%

mutate(StateSoc = paste(Area_Name,SOC))

> FullCareer<- left_join(dashdata, wage, by = "StateSoc")%>%

rename("Area Name" = "Area_Name")%>%

rename("Occupation Name" = "Career_Title")%>%

rename("Percent Change" = "percentchange")%>%

rename("Average Annual Openings" = "avgannualopenings")%>%

rename("Projection" = "proj")%>%

rename("Base" = "base")%>%

rename("Change" = "change")%>%

select(Institution, Program, CIP, Total_16_17, Total_17_18, Total_18_19, Total_19_20, Total_20_21, Total_21_22, Grand_Tot, "Area Name", "Occupation Name", Base, Projection, Change, "Percent Change", "Average Annual Openings", H_MEAN, A_MEAN, H_MEDIAN, A_MEDIAN)
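The left and inner joins above correspond to base R's merge with different all arguments; a toy illustration with hypothetical CIP and program values shows why the choice matters:

```r
# merge(all.x = TRUE) keeps every left-table row (left join);
# merge(all = FALSE) keeps only matching rows (inner join).
cipsoc_toy   <- data.frame(CIP = c("01.0101", "52.0201"),
                           SOC = c("11-9013", "11-1021"))
programs_toy <- data.frame(CIP = "01.0101", Program = "Agribusiness")

left_toy  <- merge(cipsoc_toy, programs_toy, by = "CIP", all.x = TRUE)
inner_toy <- merge(cipsoc_toy, programs_toy, by = "CIP", all = FALSE)
# left_toy keeps both CIPs (the unmatched one gets NA for Program);
# inner_toy keeps only the matching CIP.
```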

  • 40.

    Use tidyverse to rename several of the fields to be more readable for the end user.

  • 41.

    Left join that dataset to the cost of living data.

  • 42.

    Calculate adjusted income by dividing the wage by the index.

> FullCareer<- FullCareer %>%

    rename("Area_Name" = "Area Name")%>%

    rename("Occupation.Name" = "Occupation Name")%>%

    rename("2018_Base" = Base)%>%

    rename("2028_Base" = Projection)%>%

    rename("Percent.Change" = "Percent Change")%>%

    rename("Average.Annual.Openings" = "Average Annual Openings")%>%

    rename("Hourly_Average_Wage" = H_MEAN)%>%

    rename("Annual_Average_Wage" = A_MEAN)%>%

    rename("Hourly_Median_Wage" = H_MEDIAN)%>%

    rename("Annual_Median_Wage" = A_MEDIAN)

> FullCareer<-left_join(FullCareer, col, by = "Area_Name")%>%

    mutate("Hourly_Adjusted_Mean" = Hourly_Average_Wage/Index)%>%

    mutate("Annual_Adjusted_Mean" = Annual_Average_Wage/Index)%>%

    mutate("Hourly_Adjusted_Median" = Hourly_Median_Wage/Index)%>%

    mutate("Annual_Adjusted_Median" = Annual_Median_Wage/Index)

  • 43.

    Create a database for the Shiny application and save it as a .csv in the same folder as the RMarkdown file and also in the same folder as the Shiny application file.

  • 44.

    Name the file “dashdata.csv”.

Note: Use “dashdata.csv” to program the Shiny application dashboard.

> dashdata<- FullCareer

> write.csv(dashdata, "../Data/Application/dashdata.csv")

> write.csv(dashdata, "dashdata.csv")

Program the dashboard

Timing: ~5 h

This section provides details on how to program the Shiny application. This requires several steps: developing the user interface, including the input controls and the table, and developing the server, which processes the application.

Note: Please refer to the RMarkdown page for all the code: https://rpubs.com/anonymous_dashboarder/careers

Note: It is possible to program a Shiny application dashboard under a chunk in RMarkdown. However, use a Shiny application file to publish on the web.

  • 45.

    Click on “File”, “New File”, then “Shiny Web App…” (Figure 3).

  • 46.

    Assign the application a name.

  • 47.

    The application will open with some pre-programmed code (Figure 4). Delete this and start afresh.

  • 48.

    Save the Application file in the same folder as “dashdata.csv”; this should be a separate folder from the RMarkdown file.

  • 49.

    Keep the Shiny file name as “App”.

Note: Shiny application files reference file names in the folders in which they are saved.

  • 50.

    Create a folder called “www” in the application folder to keep the background image.

  • 51.

    Figure 4 shows what Shiny looks like when first opened.

  • 52.

    The first task is to load all libraries. If any libraries are excluded, the application may run on the user’s machine, but it will not publish; if this happens, there will be an error message.

  • 53.

    There is no need to install any packages if they were already installed in a previous R session.

> library(tidyverse)

> library(dplyr)

> library(knitr)

> library(shiny)

> library(shinyWidgets)

> library(DT)

> library(formattable)

> library(data.table)

> library(shinydashboard)

> library(zip)

  • 54.

    Create the database using the “dashdata.csv” file and call it “Table1”.

  • 55.

    Program a few last changes and calculations to the database including removing the United States column, making the variables numeric, and calculating a percent change variable.

Note: These programming tasks in Shiny ensure consistent data for the application, as Excel and .csv files sometimes modify variables.

  • 56.

    Download the data dictionary to program a table where users can refer to it.

  • 57.

    Prepare a zip file with the main datasets so they can be downloaded from the dashboard application.

> ## Download Data

> zipfile<- zip("zipfile.zip", files = "StateWage.zip")

> Table1<- read.csv("dashdata.csv")

> Table1<- Table1[!Table1$Area_Name == "United_States",]

> Table1<- Table1 %>%

select(Institution, Program, CIP, Area_Name, Occupation.Name, Total_16_17, Total_17_18, Total_18_19, Total_19_20, Total_20_21, Total_21_22, Grand_Tot, X2018_Base, X2028_Base, Change, Percent.Change, Average.Annual.Openings,

Hourly_Adjusted_Median, Annual_Adjusted_Median, Index)%>%

mutate(across(where(is.numeric), ~ round(., digits = 2)))%>%

rename("2018_Base" = X2018_Base)%>%

rename("2028_Base" = X2028_Base)

> Table1$Percent.Change<- Table1$Percent.Change/100

> datadictionary<- read.csv("Data.csv")
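The X2018_Base and X2028_Base names handled above arise because read.csv's default check.names = TRUE prepends an “X” to column names that start with a digit; a small demonstration:

```r
# Write a one-row table whose column name starts with a digit,
# then read it back with read.csv's default settings.
tmp <- tempfile(fileext = ".csv")
df  <- data.frame(`2018_Base` = 100, check.names = FALSE)
write.csv(df, tmp, row.names = FALSE)

reread <- read.csv(tmp)    # check.names = TRUE by default
names(reread)              # "X2018_Base"
```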

Note: A Shiny web application consists of two main parts. The user interface (UI) is where the visual and functional features are programmed; it typically begins with “ui <-” followed by the programming that lays out the space of the dashboard. The second part of the dashboard is the server, which typically starts with “server <-”. This is where the operations behind the buttons and tabs are programmed and where tables or graphs are developed. The following code shows the first lines of the UI programming.
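The two-part structure described in this note can be seen in a minimal, generic skeleton (the names and text here are placeholders, not the protocol's actual code):

```r
library(shiny)

# User interface: layout and output slots.
ui <- fluidPage(
  titlePanel("Minimal skeleton"),
  textOutput("message")            # slot filled in by the server
)

# Server: the logic that fills the slots declared in the UI.
server <- function(input, output) {
  output$message <- renderText("Hello from the server")
}

# shinyApp() binds the two parts; runApp(app) would launch it locally.
app <- shinyApp(ui = ui, server = server)
```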

  • 58.

    Start the user interface with “ui <-”.

  • 59.

    Move to the next line to declare the main color of the dashboard skin.

  • 60.

    Establish a title.

  • 61.
    Name the tabs in the sidebar:
    • a.
      “Introduction”,
    • b.
      “Data Table,”
    • c.
      “Data Dictionary”,
    • d.
      “Sources”.

Note: Each of these has a special name assigned to them to reference with future programming. For example, “Introduction” is given the tab name of “intro”, which is the name used to program that tab item. Figure 5 shows the upper right corner of the application’s menu.

CRITICAL: It is very important to note the commas, brackets, and parentheses. R will mark them to help keep track, but it can take some time to get used to doing this in a way that reflects the goal of the application.

> ui <-

dashboardPage(skin = "green",

     dashboardHeader(title = "Degrees and Careers"),

     dashboardSidebar(

      sidebarMenu(

        menuItem("Introduction", tabName = "intro", icon = icon("user")),

        menuItem("Data Table", tabName = 'datatable', icon = icon("user")),

        menuItem("Data Dictionary", tabName = "dictionary", icon = icon("user")),

        menuItem("Sources", tabName = "references", icon = icon("user")))),

Note: After programming the sidebar, next program each of the tabs present in the sidebar menu. The first tab, the introduction, is named “intro” for programming purposes. The following code shows the first ten lines of this code.

  • 62.

    Program the body of the dashboard with the dashboardBody command, telling Shiny to include the tab items in the body with the tabItems command.

  • 63.

    Generate a fluid page, which adjusts the content to the size of the user’s screen.

  • 64.

    Set a background image and include text using HTML.

Note: The h1, h2, etc. commands give different level headings for font size. Write words in quotes.

dashboardBody(

      tabItems(

       tabItem(tabName = "intro",

         fluidPage(setBackgroundImage(src = "tree.png", shinydashboard = TRUE)),

         h1("About this Dashboard", align = "left", style = "color: white"),

         h2("The purpose of this dashboard is to allow the user to connect any higher education degree affiliated with a Classification of Instructional Program (CIP) code to a career associated with that CIP code to investigate ten-year job growth projections by any U.S. state, and to examine income information about that affiliated career within each state. Reported incomes are divided by the cost of living index for the state of the occupation so that adjusted income can be compared across states given degree and affiliated career.",

         style = "color: white"),

Note: The next tab item consists of the data table. This is where the user will go to filter the data, see a table of results, and download the data as filtered. This table consists of three input controls called “pickerInput”. This allows the user to check one or more selections from the list.

  • 65.
    Program picker input controls.
    • a.
      The first input is to select an institution (one or more of the colleges and universities).
      • i.
        Program directions to the user, such as “Select one or more institutions”.
      • ii.
        Program the text to be in white font.
      • iii.
        Alphabetize from the Table1 dataset with sort(unique(Table1$Institution)).
      • iv.
        Have University pre-selected.
    • b.
      The second input is to select one or more degree.
      • i.
        Program directions to the user, such as “Select one or more degrees”.
      • ii.
        Program the text to be in white font.
    • c.
      The third is to select one or more states to compare projections and wage data.
      • i.
        Program directions to the user such as “Select one or more areas for careers”.
      • ii.
        Program the text to be in white font.
    • d.
      Each input control needs a name that will be referred to when programming the server. For example, the input control for selected institutions is named “institutionInput”.
    • e.
      Program each action box so users can select all or none.
  • 66.

    Program a download button. This will be used to download the filtered data as a .csv file.

Note: If desirable, one could program the input controls to select specific degrees or states, or to select all of them.

tabItem(tabName = "datatable",

         h1("Degrees and Job Income Projections", align = "center", style = "color: white"),

         h2("First select an institution, then a degree program, then one or more states to compare career outcomes. Data will not appear until you make a selection. You can also download the filtered data as a .csv.",

          style = "color: white"),

         h3(tags$b("You can scroll horizontally to see income data by dragging the bar at the bottom of the table.",

                  style = "color: white")),

         h3(tags$b("Rotate your phone or tablet if you cannot scroll through all the columns.", style = "color: white")),

         fluidRow(

          pickerInput(

           inputId ="institutionInput",

           label = div("Select one or more institutions:", style = "color: white;"),

           choices = sort(unique(Table1$Institution)),

           options = list('actions-box' = TRUE),

           selected = "University",

           multiple = TRUE)),

         fluidRow(

          pickerInput(

           inputId ="credentialInput",

           label = div("Select one or more degrees:", style = "color: white;"),

           choices = sort(unique(Table1$Program)),

           options = list('actions-box' = TRUE),

           multiple = TRUE)),

         fluidRow(

          pickerInput(

           inputId = "areaInput",

           label = div("Select one or more areas for careers:", style = "color: white;"),

           choices = sort(unique(Table1$Area_Name)),

           options = list('actions-box' = TRUE),

           multiple = TRUE)),

         downloadButton("downloadData", "Download the Filtered Data"),

         fluidRow(

           box(width = 10, DT::dataTableOutput("projections")))),

  • 67.

    Program the data dictionary tab with a command to render the table.

  • 68.

    Program the references using HTML text.

tabItem(tabName = "dictionary",

         h1("Data Dictionary", align = "center", style = "color: white"),

         fluidRow(

          box(width = 10, DT::dataTableOutput("dictionarytable")))),

       tabItem(tabName = "references",

         h1("Data Sources", align = "center", style = "color: white"),

         h3("About the PMP (2021). Projections Central. Retrieved November 14, 2021 from https://projectionscentral.org/", style = "color: white"),

         h3("About the U.S. Bureau of Labor Statistics (2021). U.S. Bureau of Labor Statistics. Retrieved November 15, 2021 from https://www.bls.gov/bls/infohome.htm", style = "color: white"),

         h3("Missouri Economic and Research Information Center (2022). Cost of living data series. Retrieved January 3, 2022 from https://meric.mo.gov/data/cost-living-data-series", style = "color: white"),

         h3("National Center for Education Statistics (2021). CIP User Site. CIP SOC Crosswalk. Retrieved November 15, 2021 from https://nces.ed.gov/ipeds/cipcode/post3.aspx?y=56", style = "color: white"),

         h3("Occupational Employment and Wage Statistics Home Page (2021). Retrieved November 5, 2021, from https://www.bls.gov/oes/", style = "color: white"),

         h3("U.S. Bureau of Labor Statistics. (2021). May 2020 National Occupational Employment and Wage Estimates. Occupational Employment and Wage Statistics. OEWS Program Links. Retrieved November 5, 2021, from https://www.bls.gov/oes/current/oes_nat.htm#00-0000",

          style = "color: white"))))

Note: The next phase of coding is the server. This is where the UI elements are populated with data and utility.

  • 69.

    Name a series of objects and programming functions by declaring that the server begins with “server <-”.

  • 70.
    Program the input controls to link back to the user-facing buttons so that they interconnect.
    • a.
      Name the first reactive filter “institutionDegree” and link it to the UI input named “institutionInput”.
    • b.
      Do the same with the two other input controls as shown.

Note: The input controls filter their respective variables given the previously filtered field, so institutionInput filters for institution, credentialInput filters for credentials given the institution selected, and areaInput filters for the state given institution and degree. The result is a table with data relevant to the desired filters.

CRITICAL: Be sure to study the provided programming logic. The inputs must appear in the correct sequence for the filters to interact properly, and it may take some practice and time to get this right in different contexts with different data and labels.

server <- function(input, output, session){

 institutionDegree<- reactive({

  filter(Table1, Institution %in% input$institutionInput)

})

 observeEvent(institutionDegree(), {

  choices <- sort(unique(institutionDegree()$Program))

  updatePickerInput(session = session, inputId = "credentialInput", choices = choices, selected = choices)

})

 programDegree <- reactive({

  req(input$credentialInput)

  filter(institutionDegree(), Program %in% input$credentialInput)

})

 observeEvent(programDegree(), {

  choices <- sort(unique(programDegree()$Area_Name))

  updatePickerInput(session = session, inputId = "areaInput", choices = choices, selected = choices)

})

  • 71.
    Program the filterable data table.
    • a.
      Begin with the “output$projections<-“.
    • b.
      Follow with the DT package telling R to render the data table.

Note: This code tells R to filter the data with the input controls programmed previously, naming the last one “input$areaInput” after the filter command. The rest of the code gives R specific directions on the table’s look and feel.

output$projections <-

   DT::renderDataTable(datatable(extensions = c("FixedHeader", "FixedColumns"),

      options= list(autoWidth=TRUE,

        scrollX = TRUE, searching = FALSE, fixedColumns = list(leftColumns = 5), fixedHeader=TRUE),{

        req(input$areaInput)

        programDegree()%>%

        filter(Area_Name %in% input$areaInput)

})

  • 72.

    Format the percent change variable as a percentage and the wage values as currency with a dollar sign.

Note: Each line starts with a pipe (%>%) telling R to apply the command to the result of the preceding code.

%>%formatPercentage("Percent.Change", 0)

%>%formatCurrency(c("Hourly_Adjusted_Median", "Annual_Adjusted_Median"), currency = "$"))
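For orientation, the rendering and formatting commands from the two steps above can be assembled into a single pipeline. This is a sketch, not a replacement for the protocol's code; it assumes the programDegree() reactive, the areaInput control, and the Table1 column names used throughout this protocol:

```r
# Sketch: renderDataTable and the formatting pipes written as one expression.
# Assumes library(DT) and library(dplyr) are loaded, and that programDegree()
# and input$areaInput are defined as in the server code above.
output$projections <- DT::renderDataTable({
  req(input$areaInput)                                   # wait for a selection
  datatable(
    programDegree() %>% filter(Area_Name %in% input$areaInput),
    extensions = c("FixedHeader", "FixedColumns"),
    options = list(autoWidth = TRUE, scrollX = TRUE, searching = FALSE,
                   fixedColumns = list(leftColumns = 5), fixedHeader = TRUE)
  ) %>%
    formatPercentage("Percent.Change", 0) %>%            # show as 0-decimal %
    formatCurrency(c("Hourly_Adjusted_Median", "Annual_Adjusted_Median"),
                   currency = "$")                       # show as dollars
})
```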

  • 73.

    Program it to allow the user to download the filtered table data as a .csv file.

  • 74.

    Use a similar command to filter input$areaInput in the write.csv command line.

output$downloadData <- downloadHandler(

 filename = function() {

  paste("Careers", ".csv", sep="")

},

 content = function(file) {

  write.csv(programDegree()%>%

      filter(Area_Name %in% input$areaInput), file)

 }

)

  • 75.

    Finish the dashboard by telling R to include a table of the data dictionary on the appropriate tab of the dashboard.

  • 76.

    Close off the final bracket and tell Shiny to run the program as a shiny application with shinyApp(ui = ui, server = server).

output$dictionarytable <- DT::renderDataTable(

 datadictionary,

 options = list(scrollX = TRUE))

}

shinyApp(ui = ui, server = server)

  • 77.

    Test the dashboard by clicking the “Run” button in the upper-right corner.

  • 78.

    If it does not run, investigate the error message provided.

Note: Sites such as GitHub, Stack Overflow, and the R Shiny community help with troubleshooting. Much of using R is learning to self-diagnose problems, which can require hours of seemingly fruitless research and frustration; with persistence, the answer will come. Do not hesitate to post a question on one of these community sites.

Figure 3.


Click on “File”, “New File” then “Shiny Web App…” respectively

Figure 4.


Shiny when first opened

Figure 5.


The upper-right of the application’s menu

Publish the dashboard

Timing: ∼1 h

  • 79.

    Register a free account with Shiny by visiting this site https://www.shinyapps.io/ and clicking on “Sign Up.”

  • 80.

    Once registered, return to the web application, click the triangle in the upper-right of the screen, and then click “Manage Accounts” as shown in Figure 6.

  • 81.

    Click on “Connect” and then select “ShinyApps.io”. See Figure 7.

  • 82.

    Return to the shinyapps.io account and, on the left toolbar, select “Tokens”. Show the secret as shown in Figure 8.

  • 83.

    Copy the secret, return to Shiny, and paste it in the box. Click “Connect Account” as shown in Figure 9.

  • 84.

    Figure 10 shows the publication box in Shiny.

  • 85.

    Open the Shiny Application file. Run the application again.

  • 86.

    If it runs, click on publish in the upper right corner.

  • 87.

    Highlight the account and click “Publish”.

  • 88.

    It will publish the files shown in the left-side window; these are the files placed in the folder with the application. The programmer can select and deselect these, but excluding files may result in errors when publishing.

  • 89.

    If the application does not publish, check the console for the error message. With some troubleshooting, it will work.

  • 90.

    The dashboard is now ready to share with stakeholders.

Figure 6.


Click on “Manage Accounts”

Figure 7.


Connecting the application to ShinyApps.io

Figure 8.


Getting a token and secret

Figure 9.


Adding the secret to the Shiny application

Figure 10.


Publishing the Shiny application

Update dashboard

Timing: ∼2 h

This section provides directions and recommendations for updating the data since most applications and dashboards are useful only if current.

Note: We recommend updating this dashboard once every year. Each agency has a different time schedule of when they update their data.

  • 91.

    The Bureau of Labor Statistics updates its data tables every May. Tables include data from the previous year.

  • 92.

    The Missouri Economic Research and Information Center updates its data every quarter.

  • 93.

    The CIP-to-SOC crosswalk was last updated in 2020.

  • 94.

    Updates to CIP codes occur every ten years.

  • 95.

    The next update to the SOC codes occurs in 2028.

  • 96.

    Projections Central completed their database in 2020.

Note: Periodically check all sites for updates.

Expected outcomes

The expected outcome of this activity is a usable dashboard to provide users with information about the link between degrees, job projections, and outcomes. This dashboard may be useful to inform decision making about academic programs by higher education leaders and policymakers.

To ensure that the dashboard reflects valid data, randomly select rows of data from the dashboard and compare the values to the original source files for each metric. Make sure CIP codes match SOC codes, degree labels match CIP codes, careers match SOC codes, and the numbers match the careers. Repeat these steps on additional random rows. Finally, consult with institutional researchers or other people familiar with the data at the participating institutions to pilot test the dashboard and solicit their feedback on the validity of the results.

Quantification and statistical analysis

While the data sources provide the metrics for this Shiny application, this protocol requires calculation of an adjusted salary. Calculate this by dividing the income metric by the cost-of-living index, as indicated in step 15 of Munge and Link Data into Usable Dataset.
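The calculation above can be sketched with dplyr. The column names here (Hourly_Median, Annual_Median, COL_Index) are assumptions for illustration; substitute the names used in your merged dataset:

```r
library(dplyr)

# Sketch of the adjusted-salary calculation: adjusted wage = wage / index.
# Column names are placeholders. If the cost-of-living index is scaled
# around 100 (as in the MERIC series), divide by (COL_Index / 100) instead.
Table1 <- Table1 %>%
  mutate(Hourly_Adjusted_Median = Hourly_Median / COL_Index,
         Annual_Adjusted_Median = Annual_Median / COL_Index)
```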

Limitations

There are a few limitations to the data used in this protocol as well as to the methodology. First, the data obtained from institutions using SQL are only as accurate as their recording. Most registrars and institutional research offices have data-governance procedures in place; it is recommended to speak with an expert on the institution’s data, and preferably to ask an expert in SQL to extract those data. Second, the government data used in this study may not accurately predict actual long-term economic outcomes. For example, the projections data provide ten-year predictions of job openings based on models built from past data, and present-day anomalies may render those models inaccurate. Wage and other related data are aggregate values and do not represent beginning salaries. Finally, projects like this depend on programmer experience and knowledge. Those new to programming should familiarize themselves with the simpler tasks in R presented at the beginning of this protocol before taking on a project like this. R itself has limitations: the data published are only as accurate as their source and the programmer’s ability to capture them. This application should not be used with sensitive data covered by FERPA, HIPAA, or other related laws, as applications published with the free version of Shiny hosting are publicly accessible.

Troubleshooting

Note: Coding errors will result in error messages in R, in both Shiny and RMarkdown. The most common errors in Shiny relate to brackets, commas, or parentheses. Read the error message carefully and trace the code back to the source of the error. In addition, use a search engine or search communities such as the R Shiny community, GitHub, or Stack Overflow. The following is a list of common errors found in R and Shiny.31

Problem 1

Receiving an error that a package or function is not installed. For example, “Error in select(x, y, z): could not find function "select"”. This may occur here: Installation.

Potential solution

  • Search for the function on Google to locate the correct package.

  • Install the package.
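As a sketch, the fix for the select() example above is to install and load dplyr, the package that provides that function:

```r
# select() comes from the dplyr package. Install it once per machine,
# then load it at the top of each session or script.
install.packages("dplyr")
library(dplyr)

# Now the function is found; mtcars is a built-in example dataset.
selected <- select(mtcars, mpg, cyl)
```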

Problem 2

R will not import the file and the “Error in file . . . cannot open the connection” message appears. This may occur here: Import Wage and Career Data from Public Websites.

Potential solution

  • Make sure to type the file name correctly.

  • If that does not work, make sure the file to import is in the same folder as the RMarkdown or Shiny file, or that a correct file path is given.
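A quick way to diagnose a file-path problem is to check where R is looking. This is a sketch; the file name "wages.csv" and the example path are placeholders:

```r
# Check where R is looking and whether the file is actually there.
getwd()                    # current working directory
list.files()               # files R can see in that directory
file.exists("wages.csv")   # FALSE means the name or path is wrong

# If needed, supply a full path instead (path shown is a placeholder):
# wages <- read.csv("C:/Users/yourname/Documents/wages.csv")
```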

Problem 3

The programmer misspells variable names or other commands. For example, “Error in mutat(car = min + max): could not find function "mutat"”. This error may occur here: Munge and Link Data into Usable Dataset.

Potential solution

  • Check spelling.

  • In the above case, the correct spelling is “mutate(car = min+max)”.

  • Notice the spelling of “mutate”.

Problem 4

The programmer misses arguments in a function. For example, R reports, “Error in table(income)”. This may occur here: Munge and Link Data into Usable Dataset.

Potential solution

  • Make sure to provide all the information.

  • For example, “table(data$income)”.

  • The “$” between “data” and “income” tells R to generate a table from the table called “data” on the variable of “income”.

Problem 5

The programmer tries a function on an incorrect data type. For example, R reports, “Warning: argument is not numeric”. This may occur here: Munge and Link Data into Usable Dataset.

Potential solution

  • Check to make sure the data type matches the command.

  • If it does not, transform the variable using as.numeric(), as.character(), or other similar functions.
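For example, a wage column imported as text (often because of dollar signs or commas in the source file) can be inspected and converted. This is a sketch; the data frame and column names are placeholders:

```r
# Check the type first, then convert. gsub() strips characters such as
# "$" and "," that make R treat numbers as text.
data <- data.frame(income = c("$45,000", "$52,500"))   # placeholder data
class(data$income)                                     # "character"
data$income <- as.numeric(gsub("[$,]", "", data$income))
class(data$income)                                     # "numeric"
```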

Resource availability

Lead contact

Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, Mark A. Perkins, Ph.D., mperki17@uwyo.edu.

Technical contact

Technical questions on executing this protocol should be directed to and will be answered by the technical contact, Mark. A. Perkins, Ph.D., mperki17@uwyo.edu.

Materials availability

All materials and software are open source, and this protocol provides directions.

Data and code availability

All the data used in this demonstration are available as a downloadable link in the demonstration dashboard.

Acknowledgments

We acknowledge the R community, which has been instrumental in our own evolution and development as researchers, programmers, and data scientists.

Author contributions

M.A.P. - programmer and writer. J.W.C. - editor and conceptual consultant.

Declaration of interests

The authors declare no competing interests.

Contributor Information

Mark A. Perkins, Email: mperki17@uwyo.edu.

Jonathan W. Carrier, Email: jcarrie5@uwyo.edu.

References


