Abstract
This paper describes the methods developed and implemented to process research participant data generated by a high-fidelity driving simulator integrated with eye tracking equipment. The driving simulator is used in experimental studies to understand driving behavior. Solutions are implemented to programmatically process the output of the simulator and transform the raw data from these research experiments into an analysis-ready format. The algorithm is tested on the data of numerous participants across varying scenarios within the experiments and is further curated to meet the requirements and standards of the research studies that use a driving simulator to generate data.
Keywords: Data Reduction, Driving Simulator, Eye Tracking, Translational Research
I. INTRODUCTION
The advent of big data science and of platforms that turn such data into meaningful analysis has increased the sophistication of the techniques used to acquire all kinds of structured and unstructured data through sensors, Internet of Things devices and simulators [1]. This has resulted in a data-driven approach in many applications, research studies, and experiments. The growing use of large and complex data sets has also created the need to handle, translate and transform data in ways that traditional data processing methods cannot achieve. Data reduction is the process of transforming raw data acquired from such devices through tests and experiments into a correctly ordered and summarized form that can be used for further analysis. The data reduction solution in this project targets the data of a fully immersive, state-of-the-art driving simulator designed by Realtime Technologies, Inc. This simulator is used for translational research and transportation-related injury prevention studies that aim to understand the driving behaviors of at-risk drivers, with the ultimate goal of reducing the morbidity and mortality resulting from motor vehicle collisions. Driving simulator studies offer high experimental control in a virtual environment where the participant operates the simulator car through various driving scenarios and environments, and numerous parameters are recorded as defined by the study. The driving data are stored on the system in the default raw format used by the software interface and cannot be used directly for analysis because of their formatting, size and complexity. This extensive and intricate output makes it difficult for researchers to perform statistical analysis on the data directly [2, 3].
The work described herein aimed to implement a software algorithm that transformed the raw output of the simulator and reduced it according to the specifications stated by the research studies, and ultimately present it in a simpler, ordered and summarized tabular form ready for analysis by researchers. This solution can be applied to other studies using a driving simulator for data acquisition.
II. DRIVING SIMULATOR
A. Simulation Test
The driving simulation test places a human participant in a virtual driving environment that replicates real-world driving scenarios and performance. The participant drives the car through various virtual environments and encounters various driving situations and hazards. The simulator records multiple parameters relating to the participant’s driving performance and eye movements. These data are recorded continuously throughout the test and saved on the local system as drive data and eye data, respectively. With multiple driving tests for each of multiple participants, several simulator output files are saved for each participant.
The algorithm developed in this paper is a data reduction program that summarizes and compiles all the simulator data from its raw format to a single file containing data of all participants and can be used by researchers for further analysis.
B. Simulator Model
The driving simulator is a highly reliable and immersive device designed by Realtime Technologies, Inc. and optimized for driving studies. It is outfitted with a full-size car body housing a fully operational steering wheel, brakes, accelerator, gearbox, indicators and dashboard. Its placement and positioning are designed to emulate the height and movement of a real car. The simulator provides feedback in response to the tactile sensors placed across the model for accelerating and braking, engaging the participant in a practical, immersive driving experience. Figure 1 shows the arrangement of the simulator, where the hardware system that is interfaced with the software is built into the cockpit of the vehicle. The high-quality visual system consists of three 80-inch projected screens providing a 180-degree field of view to the driver. The visual system has a 60 Hz refresh rate and a measured latency of less than 50 ms from step input on the host to the visual output. A large screen is also fitted behind the vehicle to simulate the roadway environment behind the car, which can be seen through the rear-view mirror. LCD displays are placed in the side mirrors to account for the different calculated fields of view required for the two side mirrors. A 5.1 surround sound system simulates realistic vehicle and ambient traffic sounds. A separate room in the simulator area houses the computer systems that operate and monitor the simulation. These systems are used to interface with and calibrate the simulator before every test. The setup consists of multiple computers, each running different data acquisition software linked with the hardware of the simulator. The researcher controls and executes the simulation from this setup during the test. The data recorded from the experiment are stored in local storage.
Additional speakers are present inside the car body for warnings, guidance and communication between the researcher and the participant during the experiment [4].
Fig. 1. Simulator equipment setup
C. Simulation Software
Internet Scene Assembler: This software is used to create the virtual world scenario for the driving simulation. It is a 3D authoring tool that allows the user to create an interactive and dynamic 3D sequence. Built-in or custom models of objects (e.g., buildings, trees) can be used to assemble a scene, and animated objects (e.g., pedestrians) can be used for interactive segments of the simulation where the objects react to events in the scene and to the participant driver’s actions. The animation support of the software can quickly and easily animate object size, position, orientation, texture, color, intensity and transparency. The design of the entire virtual world is programmed in this tool and can then be simulated through the SimCreator software. The object interactivity feature of the scene assembler is leveraged to add driving hazards at specific times with respect to the participant’s driving actions through the sequence. Sensor points are also placed across the scene to trigger the software interface to record certain parameters at certain points during the driving scenario [5].
SimCreator: This software is capable of modelling and generating a simulation. Its interface allows components to be placed and connected. Each component can be a single function or can consist of multiple components. A component can be a mathematical function or a code component that executes a program written in C, C++, JavaScript, etc. The simulation model is developed by connecting these components in a meaningful way according to the simulation requirements. A model itself can be added as a group component to a larger design, or it can be added as an alias component, which is linked to the original model. The SimCreator software takes in the code components from the Internet Scene Assembler software that contain the program for the driving simulation and executes them. It is interfaced with the hardware car model of the simulator and synchronizes the hardware operations and inputs with the virtual scenario. While the simulation is in progress, it reads in the parameters of the drive data as defined in its interface according to the study. The simulator can be operated and monitored from the user interface of the system on which it is being executed. When the simulation is complete, it produces multiple output files with the driving data. The files of concern in this project are the “.plt” file, which contains the recorded parameters of the driving data, and the “.hdr” file, which contains the header names for the values in the .plt file [6].
Smart Eye Pro: The Smart Eye Pro system consists of multiple cameras placed in the car body and connected to the host system, where they are interfaced with the software. The camera placement, shown in Figure 2, captures the eye and head movement of the participant driver from all angles. This software is linked with the simulator through the Mapps Record Manager software for Smart Eye tracking. It outputs a timestamp in the stream it sends to the Record Manager. The main purpose of this software is to record the eye tracking data of the participant as the simulator goes through various segments. It captures gaze measurements with an accuracy of 0.5 degrees and eye movement at a 60 Hz sampling rate. It also performs head tracking based on the detection of basic facial features in each image frame. Upon completion of a simulation drive, this software generates many files containing the eye tracking information. The file of concern for this project is the log file with the “.log” extension, which contains the parameter values outlined by the software along with their header names [7].
Mapps Record Manager for Smart Eye Pro: The Record Manager by EyesDx is a 2D eye tracking analysis software that translates and manages the output data from the Smart Eye Pro’s data stream of physiological and behavioral information. The screen and camera capturing tools can record data from multiple computer screens and cameras across the network. The strategically placed Smart Eye Pro cameras in the simulator car body track the eye movements and the trajectory of the participant’s line of sight with respect to the scene that is being projected on the screen from the virtual scenario in real-time and records the eye parameters throughout the duration of the simulation. The software integrates together with the SimCreator software and seamlessly performs the time stamping and data synchronization with the drive data generated by the simulator and ensures a 1:1 mapping of data and time. It produces and saves useful output containing the eye tracking information that could be used for research [8].
Fig. 2. Smart Eye Pro cameras configuration
III. IMPLEMENTATION RESOURCES
A. Hardware Systems
The simulator produces a large amount of data that are saved on the local system that runs and monitors the simulator. These data are routinely transferred to the remote servers of a high performance computing (HPC) system that provides large storage space and high computing capabilities through multiple CPU cores and NVIDIA P100 GPUs. The HPC system provides an easy-to-use interface with customizable options to select the resources required for development and testing of the algorithm.
B. Remote Conda and Jupyter Notebook Environment
The Conda software manages the open source packages and development environments of a system and can be used on different operating systems [9]. The HPC system has provisions to host a remote Conda session that allows the user to create and maintain a customized environment with dependency packages defined and installed as required by the project. This environment is saved on the server and need not be defined again for later use. Multiple such environments can be created, saved, modified and deleted as needed for different projects. All these actions can be performed from the secure shell access of the system [10].
Jupyter Notebook is an open source application that creates a document called a “notebook”. This document contains live code with a rich graphical interface that allows the addition of paragraphs, visualizations and equations. The code can be executed step by step, and the output of each operation can be observed, which makes debugging and testing easy. The notebook can also be converted to a script that can be executed from a shell command line [11]. The HPC cluster can support a live Jupyter notebook session for development by starting an interactive job on the system. The user-friendly, web browser based front-end interface of the HPC system allows for easy selection of resources and initialization of a remote session.
C. Python Data Science Environment
Python is a general purpose programming language and is easy to use for analytical and quantitative problems. It provides a wide range of open source libraries that can be used for data intensive projects. Some libraries, such as “Pandas”, are specially designed to deal with data processing tasks on a large scale. This programming language is independent of a platform and can be extended and adapted to other platforms. It is a powerful tool especially when used in conjunction with Jupyter notebooks making it easy and efficient for testing [12].
IV. METHODS
A. IRB Training
The research involves the participation of, and data collection from, human subjects. Institutional Review Board for Human Use (IRB) training is completed to uphold the guiding ethical principles laid out by the board. It is periodically reviewed for the protocols convened by the IRB.
B. Data Security and Integrity
The data generated by the simulator for every test run can be categorized as 1) drive data (“.plt” and “.hdr” files) from the SimCreator software; and 2) eye data (“.log” file) from the Smart Eye Pro software. These files are stored in separate folders with the participant’s subject ID as the folder name. Any research study would require numerous participants and their tests to be performed in the simulator, resulting in a large amount of data.
It is of highest importance that the security and integrity of the data is maintained to meet the quality and ethical standards of the research. A data auditing process is executed to confirm that the files are stored in the correct location. It corrects and removes any errors that may have occurred during the test or handling of the data (e.g., human error in mislabeling an ID). Missing files are a common error that occurs due to files placed in a location other than typical, standard locations. The auditing process resolves this error and maintains the quality of the data.
Drive Data Audit: The files of these data are stored in the drive data directory under the research project’s main folder. The two drive files (.hdr and .plt) are expected to be present for every test of the respective subject ID. The auditing program detects the subject IDs for which any of the drive files is missing. These files are tracked down in the system and placed at the correct location. Every drive file has the participant’s unique ID in its filename by default, making the files easily identifiable.
Eye Data Audit: For the eye data, the output files under every participant’s ID folder share identical names (Subject_1.log) by default despite being from different scenario runs. This makes the files difficult to identify and manage. The auditing process adds the subject’s unique ID to the filename of the respective test output files. This makes the eye data files uniquely identifiable and easier to manage and process. The auditing program then detects the subject IDs with missing eye data files. These missing files are tracked down in the system and placed in their correct location in the eye data directory.
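The drive data check can be illustrated with a minimal sketch. The directory layout, the file naming, and the function name `find_missing_drive_files` are assumptions made for illustration; the source only states that each drive file carries the subject ID in its filename and that a “.hdr” and a “.plt” file are expected per test.

```python
from pathlib import Path

DRIVE_EXTENSIONS = (".hdr", ".plt")  # the two drive files expected per test


def find_missing_drive_files(drive_dir, subject_ids):
    """Return {subject_id: [missing extensions]} for incomplete subjects.

    Assumption: a file belongs to a subject when the subject ID appears
    in its file name. IDs that are substrings of other IDs would need a
    stricter naming rule than this simple check.
    """
    files = [p for p in Path(drive_dir).rglob("*") if p.is_file()]
    missing = {}
    for sid in subject_ids:
        absent = [
            ext for ext in DRIVE_EXTENSIONS
            if not any(p.suffix.lower() == ext and sid in p.name for p in files)
        ]
        if absent:
            missing[sid] = absent
    return missing
```

The returned dictionary lists only the subjects that need attention, which gives the auditor a short worklist of files to locate and move.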
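The renaming step of the eye data audit could look like the following sketch. It assumes one folder per subject ID directly under the eye data directory and uses a hypothetical `<subject ID>_<original name>` pattern; the actual naming scheme used in the project is not given in the text.

```python
from pathlib import Path


def tag_eye_logs(eye_dir):
    """Prefix each .log file with the name of its subject-ID folder."""
    renamed = []
    for log_path in sorted(Path(eye_dir).glob("*/*.log")):
        subject_id = log_path.parent.name
        if log_path.name.startswith(subject_id + "_"):
            continue  # already tagged on an earlier audit run
        target = log_path.with_name(f"{subject_id}_{log_path.name}")
        log_path.rename(target)
        renamed.append(target)
    return renamed
```

Skipping files that already carry the prefix makes the audit safe to run repeatedly on the same directory.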
C. Data Acquisition
For developing and testing the data reduction algorithm only one subject ID’s data files are required. The requirements of the current research studies that use this algorithm are used as standards for developing the data reduction program. Some of these requirements could be different for different studies. Provisions are made into the algorithm to make slight modifications as required but the core process of reducing this data remains the same.
The first step is to import the information from the data files into the programming environment. A list of subject IDs, or study participants, is acquired from the data files stored in the drive data and eye data directories. This list is filtered according to the requirements and process of the research study, and an input ID list is formed. The data of these participant IDs are imported into the program for further processing. The drive data files and the eye data files are stored in different formats and require different functions to import the data.
Drive Data: The “.hdr” file contains the header names for the column values of the driving data present in the “.plt” file. The information from these files is imported into a “pandas.DataFrame” object from the Pandas library in Python, which holds the data in a tabular format. A list of 81 column names is retrieved from the header file. The data from the drive file (“.plt”) are read into a dataframe object, and the column names from the header file and the data types of these columns are applied to this table. The subset of the data where the time measure is zero is filtered out while importing. The result is a dataframe table with 81 columns. This table is passed as input to another function that applies further conditions to the data. Parameters that are not required according to the outline of the study are removed. New names are assigned to the preserved columns with a prefix of ‘D’ to identify them as driving data. The UNIX timestamps generated by default are converted to UTC timestamps. Additional information on frequency and counter is derived from the time measurement and UTC date-timestamp values, respectively. This new information is added to the dataframe table as columns. Some of the distance measurements are converted from meters to miles and some from meters to feet.
Eye Data: The “.log” file is acquired using the import techniques of the Pandas library. These data contain the column headers for their values and are imported directly into a tabular dataframe object with 62 columns. This dataframe is passed on for further processing. The UTC ticks are converted to date-time stamps. ‘E’ is added as a prefix to the column names to identify them as eye data columns. Further information on frequency, counter and object insertions is derived and added to the dataframe table as new columns.
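A minimal sketch of the drive data import is given below. The file layouts are assumptions (one column name per line in the “.hdr” file, whitespace-delimited rows in the “.plt” file), and the column names `Time` and `UnixTime`, the `D_` prefix format, and both function names are hypothetical placeholders rather than the names used by SimCreator or by the authors.

```python
import pandas as pd

METERS_TO_FEET = 1 / 0.3048      # exact definition of the foot
METERS_TO_MILES = 1 / 1609.344   # exact definition of the mile


def load_drive_data(hdr_path, plt_path, time_col="Time"):
    # Assumed layout: one column name per line in the .hdr file.
    with open(hdr_path) as fh:
        names = [line.strip() for line in fh if line.strip()]
    # Assumed layout: whitespace-delimited rows without a header in the .plt file.
    drive = pd.read_csv(plt_path, sep=r"\s+", header=None, names=names)
    # Drop the samples recorded while the time measure is zero.
    return drive[drive[time_col] != 0].reset_index(drop=True)


def tidy_drive_data(drive, keep, unix_col="UnixTime", to_feet=()):
    out = drive[list(keep)].copy()          # keep only the required parameters
    for col in to_feet:
        out[col] = out[col] * METERS_TO_FEET
    out = out.rename(columns={c: f"D_{c}" for c in keep})
    # UNIX seconds to timezone-aware UTC timestamps.
    out["UTC"] = pd.to_datetime(drive[unix_col], unit="s", utc=True)
    return out
```

Keeping the loading and the tidying in separate functions mirrors the two-stage description above, so study-specific choices (which columns to keep or convert) stay out of the file-reading code.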
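The tick conversion and column prefixing might be sketched as follows. The text does not define the tick unit, so this example assumes the .NET convention (100 ns intervals counted from 0001-01-01); the column name `UTCTicks` and the `E_` prefix format are likewise hypothetical.

```python
import pandas as pd

# Number of 100 ns ticks between 0001-01-01 and 1970-01-01 (assumed .NET convention).
TICKS_AT_UNIX_EPOCH = 621_355_968_000_000_000


def ticks_to_utc(ticks):
    """Convert 100 ns ticks to timezone-aware UTC timestamps."""
    unix_ns = (pd.Series(ticks, dtype="int64") - TICKS_AT_UNIX_EPOCH) * 100
    return pd.to_datetime(unix_ns, unit="ns", utc=True)


def tidy_eye_data(eye, ticks_col="UTCTicks"):
    out = eye.add_prefix("E_")               # mark every column as eye data
    out["UTC"] = ticks_to_utc(eye[ticks_col])
    return out
```

If the log uses a different tick unit or epoch, only the constant and the multiplier in `ticks_to_utc` would need to change.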
D. Data Merging
The drive data and eye data retrieved from the acquisition functions are stored in different variables. To get a complete dataset of the simulator scenario run, these tables are merged to form a single dataframe table. The Pandas dataframe object provides techniques that can seamlessly merge two different tables into one. An outer merge is performed on the “UTC” and “groupcounter” columns that are present in both datasets. This merge aligns data with identical timestamps from the two tables in the same row. The merging operation results in a table with drive and eye data synchronized at the same instants of time, maintaining the accuracy of the data while reducing two datasets to one.
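The merge can be sketched with `pandas.merge`. The key names “UTC” and “groupcounter” come from the text; how the counter is derived is not specified, so `add_group_counter` shows one possible interpretation (numbering the samples that share the same UTC value) and should be read as an assumption.

```python
import pandas as pd


def add_group_counter(df, utc_col="UTC"):
    """Number the samples that share the same UTC value: 0, 1, 2, ..."""
    out = df.copy()
    out["groupcounter"] = out.groupby(utc_col).cumcount()
    return out


def merge_drive_eye(drive, eye):
    # An outer merge keeps rows that exist in only one of the two tables;
    # the columns of the other table are filled with NaN for those rows.
    return pd.merge(drive, eye, on=["UTC", "groupcounter"], how="outer")
```

With an outer merge no sample is discarded when one stream has more records than the other in a given time step, which is why unmatched rows show up with missing values instead of disappearing.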
E. Statistics for Regions of Interest
A set of 76 pre-defined statistical parameters is stored as constants. These parameters are specific to the eye data and drive data and align with the columns of the dataset. Appropriate variable names are generated from these parameters with the added prefixes “E” and “D” to identify the columns as eye and drive statistics, respectively. If any variable name is missing from the columns of the dataset, that parameter is filled with a missing-value constant. Mean and standard deviation values are calculated for each parameter, resulting in a list of 152 values (76 parameters × 2 statistics).
A map section is a specific region of interest in the driving simulation map (e.g., freeway). When the participant driver enters this region within the simulation scenario, the sensors of the simulator are triggered to record the current map section. These regions are defined during the design of the simulation map. Depending on the outline of the study, there could be many such map sections throughout the simulation. The map sections used for this test are given in Table I, which shows each section value and its corresponding meaning. The “MapSection” column of the data could have a NULL value, which means that the simulation at this instant is not in any of the driving environment regions of interest. It could also have a zero value, which is the default initialization value when the simulation is commencing and the participant has not yet encountered any map section. The NULL and zero values are filtered out because they are not of interest to the study.
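The calculation of a mean and a standard deviation per parameter, with a fill value for absent columns, can be sketched as below. The value of `MISSING`, the key naming pattern, and the function name `summarize` are hypothetical; the text only says that a missing-value constant is used.

```python
import pandas as pd

MISSING = -9999  # hypothetical missing-value constant


def summarize(df, variables, label="Overall"):
    """Return {name: value} with a mean and an SD entry for every variable."""
    stats = {}
    for var in variables:
        if var in df.columns and df[var].notna().any():
            stats[f"{label}_{var}_mean"] = df[var].mean()
            stats[f"{label}_{var}_sd"] = df[var].std()  # sample SD (ddof=1)
        else:
            stats[f"{label}_{var}_mean"] = MISSING
            stats[f"{label}_{var}_sd"] = MISSING
    return stats
```

Because every variable always contributes exactly two entries, the output length stays fixed (152 for 76 parameters) regardless of which columns a particular recording contains.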
TABLE I.
MAP-SECTION VALUES
| Section Values | Map Regions |
|---|---|
| 90 | Urban |
| 50 | Overpass |
| 80 | Freeway |
| 81 | OnRamp |
| 82 | OffRamp |
| 60 | CarFollow |
| 70 | Residential |
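
The removal of NULL and zero map sections described above amounts to a simple row filter. The sketch below assumes NULL values are read as NaN by Pandas; the function name is illustrative.

```python
import pandas as pd


def filter_map_sections(df, col="MapSection"):
    """Keep rows whose map section is neither NULL (NaN) nor zero."""
    return df[df[col].notna() & (df[col] != 0)]
```
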
These filtered data are used to calculate the statistical parameters. The data are further grouped into subsets and values are derived for each of the subset as well. The dataset is grouped into four categories for this operation.
1). Overall:
In this category, the statistical calculation is performed on the entire dataset (i.e., across all map sections) and the values are derived without further grouping of the dataset. A list of 152 parameters is generated by this method.
2). Map Sections:
A pre-defined list of roadway and environmental regions of interest is stored as a constant. The dataset is grouped into smaller subsets according to the map sections available in the data. The order of the map sections encountered during a scenario run is randomized, and there could be some regions that do not occur in a particular simulation test. In that case, a map section value from the constant list is not available in the data. A list of parameters is still created for the missing map section and is filled with a missing-value constant. The statistical parameters are derived for the subsets of the map sections that are available, and each subset is stored in its own list of 152 parameters. The number of parameters resulting in this category is equal to 152 times the number of map sections.
3). Combined Map Sections and Map Halfs:
The map halfs are the fixed roadway elements that occur in every map section. These map halfs are specific roadway elements (e.g., curve, intersection) nested within the map sections. The occurrence of each of these elements is recorded in the “MapHalf” column of the dataset. Table II gives the map half values and their roadway elements.
TABLE II.
MAP-HALF VALUES
| Map Half Values | Map Half Elements |
|---|---|
| 1 | Straight |
| 2 | Curve |
| 3 | Intersection |
This is a further subdivision of the subsets of the map sections. To achieve this, a combined grouping of map sections and map halfs is performed. A pre-defined list of map halfs is stored separately as a constant and is used to calculate the statistical values for these subsets. A list filled with a missing-value constant is generated for every map half element that is missing within a map section. Then the statistical parameters are calculated for each subset of combined map half and map section available in the data. This results in a list of 152 parameters for each road element within each map section.
All these lists are compiled together and stored in two variables: “map_stats”, which contains the statistical parameters of all the available subsets, and “map_missing”, which contains the lists of all the missing subsets.
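The grouping by map section and by combined map section and map half can be sketched as follows. The section and half values are taken from Tables I and II; the `MISSING` value, the key naming pattern, and the function names are hypothetical, and boolean masks are used here for clarity rather than as a claim about how the authors implemented the grouping.

```python
import itertools
import pandas as pd

MAP_SECTIONS = {90: "Urban", 50: "Overpass", 80: "Freeway", 81: "OnRamp",
                82: "OffRamp", 60: "CarFollow", 70: "Residential"}  # Table I
MAP_HALFS = {1: "Straight", 2: "Curve", 3: "Intersection"}          # Table II
MISSING = -9999  # hypothetical missing-value constant


def grouped_stats(df, variables, levels):
    """levels: ordered {column: {value: name}} describing the grouping."""
    stats, missing = {}, {}
    columns = list(levels)
    for combo in itertools.product(*(levels[c].items() for c in columns)):
        label = "_".join(name for _, name in combo)
        mask = pd.Series(True, index=df.index)
        for column, (value, _) in zip(columns, combo):
            mask &= df[column] == value
        subset = df[mask]
        for var in variables:
            if subset.empty:  # subset never occurred in this drive
                missing[f"{label}_{var}_mean"] = MISSING
                missing[f"{label}_{var}_sd"] = MISSING
            else:
                stats[f"{label}_{var}_mean"] = subset[var].mean()
                stats[f"{label}_{var}_sd"] = subset[var].std()
    return stats, missing


def map_statistics(df, variables):
    map_stats, map_missing = {}, {}
    for levels in ({"MapSection": MAP_SECTIONS},
                   {"MapSection": MAP_SECTIONS, "MapHalf": MAP_HALFS}):
        stats, missing = grouped_stats(df, variables, levels)
        map_stats.update(stats)
        map_missing.update(missing)
    return map_stats, map_missing
```

Iterating over the constant lists instead of over the groups found in the data is what guarantees that absent sections still produce entries, so every participant ends up with the same set of output columns.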
F. Statistics for Regions of Hazards
The same set of 76 pre-defined statistical parameters that are used for regions of interest i.e. the map sections is used to calculate the hazard specific statistics. Appropriate variable names are generated according to the parameters with added prefix of “E” and “D” to identify the columns as eye data and drive data statistics respectively.
The hazard view values are the hazardous roadway elements that the participant will come across during the simulation as they are driving through the virtual environment (e.g., pedestrian abruptly nearing roadway). These hazards are strategically placed throughout the simulation to record and understand the behavior of the participant driver in an event of a hazard. These hazard elements as described in table III are designed based on the real life scenarios and occur within the map sections of interest across the map. The hazard view values are recorded when the hazard appears in the frame of the screen and is in view of the participant. The “HazardView” column of the dataset contains the values of the various hazard elements. Some of these values could be a “NoHazardinView” value which is the default initialization value that is recorded when the simulation commences. “HazardContinues” value is recorded when the simulation is at the instance between two consecutive hazards. The instances at these two values are filtered out as they are not of interest to the study.
TABLE III.
HAZARD-VIEW VALUES
| Hazard Values | Hazard Elements |
|---|---|
| 0 | NoHazardinView |
| 1 | UrbanCar |
| 2 | UrbanPedestrian |
| 3 | FreewayDeer |
| 4 | FreewayStrandedPedestrian |
| 5 | ResidentialCar |
| 6 | ResidentialPedestrian |
| 9 | HazardContinues |
The occurrence of the hazards of interest depends on the map sections. Every region of the map sections consists of two hazard events. The sequence order of these two hazards is fixed, but only one of the two will randomly activate. Across the entire simulation test there will be a total of six hazard events in view, of which three will activate. The statistical parameters are calculated for all six hazard events in view.
A pre-defined list of hazard view elements and their values is stored as a constant. The dataset is grouped into smaller subsets according to the hazard view values available in the data. There could be a few cases in which a hazard view value is missing from the dataset. This occurs when the test did not go through the entire simulation map and therefore missed the hazard event. In this case a statistical parameter list is still generated for the missing hazard view value and filled with a missing-value constant. Then the statistical parameters are calculated for the subsets of the hazard view values that are available in the dataset. Two additional parameters, the location (map section) and the duration for which the hazard is in view, are also calculated for each subset. This results in a list of 154 parameters for each of the six hazard-in-view elements. These lists are compiled together and stored in two variables: “hazard_stats”, which contains the statistical parameters of all the available subsets, and “hazard_missing”, which contains the lists of the missing subsets.
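A sketch of the hazard statistics is shown below. The hazard values come from Table III with the excluded values 0 and 9 left out. How the duration and location are measured is not detailed in the text, so the example assumes the duration is the span between the first and last UTC timestamp of the subset and the location is the map section of its first row; `MISSING` and all key names are hypothetical.

```python
import pandas as pd

HAZARDS = {1: "UrbanCar", 2: "UrbanPedestrian", 3: "FreewayDeer",
           4: "FreewayStrandedPedestrian", 5: "ResidentialCar",
           6: "ResidentialPedestrian"}  # Table III without values 0 and 9
MISSING = -9999  # hypothetical missing-value constant


def hazard_statistics(df, variables):
    hazard_stats, hazard_missing = {}, {}
    for value, name in HAZARDS.items():
        subset = df[df["HazardView"] == value]
        if subset.empty:  # hazard never came into view in this drive
            for var in variables:
                hazard_missing[f"{name}_{var}_mean"] = MISSING
                hazard_missing[f"{name}_{var}_sd"] = MISSING
            hazard_missing[f"{name}_location"] = MISSING
            hazard_missing[f"{name}_duration_s"] = MISSING
            continue
        for var in variables:
            hazard_stats[f"{name}_{var}_mean"] = subset[var].mean()
            hazard_stats[f"{name}_{var}_sd"] = subset[var].std()
        # Assumed definitions of the two additional parameters.
        hazard_stats[f"{name}_location"] = subset["MapSection"].iloc[0]
        span = subset["UTC"].max() - subset["UTC"].min()
        hazard_stats[f"{name}_duration_s"] = span.total_seconds()
    return hazard_stats, hazard_missing
```

With 76 variables, each hazard contributes 152 statistics plus the two additional parameters, which matches the 154 values per hazard stated above.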
G. Final Output
The final output is obtained by combining all the lists of statistics and missing values for map sections and hazards. This combined list is passed to a Pandas dataframe constructor to obtain a table containing all the required information. An additional column with the subject ID is added to identify these data. The final output for this subject ID test is a table with one row and several thousand columns of statistical values. The algorithm is repeated for the test data of all remaining participants, and each result is added to the table, producing the final output table containing the data of all participants.
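The assembly of the output table can be sketched as follows, assuming each group of statistics is held as a dictionary of name and value pairs. The column name `SubjectID` and both function names are hypothetical.

```python
import pandas as pd


def participant_row(subject_id, *stat_dicts):
    """Flatten all statistic dictionaries into a one-row table."""
    row = {"SubjectID": subject_id}
    for stats in stat_dicts:
        row.update(stats)
    return pd.DataFrame([row])


def build_output(rows):
    """Stack the one-row tables of all participants into one table."""
    return pd.concat(rows, ignore_index=True)
```

The stacked table can then be written with `DataFrame.to_csv(path, index=False)`, which corresponds to the “.csv” output format mentioned in the conclusion.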
V. TESTING AND RESULTS
The application of this algorithm is dependent on the research study for which it is being used. It is applied and tested on different research study cases where the process for data reduction is the same but has different outlines for its implementation (e.g., different roadway scenarios, different number of scenario runs).
A. Single Drive Test
In this case a single drive test of the simulation is required from the participant for a single experiment. A total of three files are processed to produce the data of the test. After testing the algorithm on this study case, the final output table has dimensions of [1 rows × 5297 columns]. Applying this implementation to the rest of the participants’ data, the final output file has dimensions of [(No. of participants) rows × 5297 columns].
B. Multiple Drive Test
In this study case three drive tests of the simulation are required from the participant for a single experiment. There are three files for each drive test and thus a total of nine files to process. The algorithm has provisions to accommodate such requirements through slight modifications to some parameters of the program. Instead of one, the data of three tests are processed at the same time and combined to form a single output table. The algorithm generates specific variable names to distinguish between the data of different test drives.
The final output table in this case study has dimensions of [1 rows × 15891 columns]. Additional processing derives the average values of the statistical parameters across the three drives and adds them to this table. This results in a final output table with dimensions of [1 rows × 21176 columns] for a single subject ID. The data of the remaining study participants are then processed and reduced, giving a final output table with dimensions of [(No. of participants) rows × 21176 columns].
Each input data file consists of several data recordings, which are summarized into a single data entry for each participant. Figure 3 shows the average number of data recordings in the drive and eye data input files in comparison with the reduced output data for each participant.
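Combining several drives and averaging across them can be sketched as below. The drive labels, the `Avg_` prefix, and the `MISSING` value are hypothetical; masking the missing-value constant before averaging is an assumption made here so that placeholder values do not distort the means.

```python
import pandas as pd

MISSING = -9999  # hypothetical missing-value constant


def combine_drives(drive_stats):
    """drive_stats maps a drive label to its {name: value} statistics."""
    combined = {}
    for label, stats in drive_stats.items():
        combined.update({f"{label}_{k}": v for k, v in stats.items()})
    # Rows are statistics, columns are drives; hide placeholders from the mean.
    per_drive = pd.DataFrame(drive_stats).replace(MISSING, float("nan"))
    averages = per_drive.mean(axis=1)
    combined.update({f"Avg_{k}": v for k, v in averages.items()})
    return combined
```

Prefixing each statistic with its drive label keeps the three sets of columns distinguishable in the single output row, while the averaged columns are appended after them.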
Fig. 3. Average number of records for each participant ID
In its raw state the input data consists of multiple files with respect to each participant. However, a compiled dataset is required for further analysis of this data. Figure 4 gives the results of the reduction in the number of data files for the entire data used in this test.
Fig. 4. Average number of records for each participant ID
The overall size of the drive and eye data input files is several gigabytes. The data reduction algorithm reduces and summarizes the input data into a single output file that is significantly smaller. Figure 5 shows the performance achieved by the algorithm in reducing the size of the data.
Fig. 5. Average number of records for each participant ID
VI. CONCLUSION
A data reduction solution for a high-fidelity driving simulator is built. The algorithm performs efficiently and meets the requirements of the study cases.
More participant data from the experiment can easily be added to the algorithm’s process. Because of the very large size of the data, the algorithm can take a considerable amount of processing time.
The reduced and summarized data of the experiment are saved in a “.csv” file format that can be imported into other analytical tools and platforms. The data are ready for statistical analysis by the researchers of the respective project.
ACKNOWLEDGMENT
Research reported in this publication was supported by the Eunice Kennedy Shriver National Institute of Child Health & Human Development of the National Institutes of Health under Award Number R01HD089998. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. Special thanks to the UAB Translational Research for Injury Prevention Laboratory for data collection and entry, and support from the UAB Edward R. Roybal Center for Translational Research in Aging in Mobility (NIH/NIA grant no. 5 P30 AG022838-09) and a grant from the National Institute on Aging (NIH/NIA grant no. 5 R01 AG005739-24).
Contributor Information
Piyush Pawar, Dept. of Psychology, University of Alabama at Birmingham, Birmingham, USA.
Thomas Anthony, Dept. of Electrical and Computer Engineering, University of Alabama at Birmingham, Birmingham, USA.
REFERENCES
- [1].Zaslavsky A, Perera C and Georgakopoulos D (2012). “Sensing as a Service and Big Data”. Bangalore: International Conference on Advances in Cloud Computing (ACC).
- [2].Watson G, Papelis Y and Schikore M (2000). “A Multimedia, Interactive Data Verification and Reduction Tool for Use in Driving Simulator Research”. Scottsdale: IMAGE 2000 Conference.
- [3].Schreiner C, Zhang H, Guerrero C, Torkkola K and Zhang K (2007). “A Semi-Automatic Data Annotation Tool for Driving Simulator Data Reduction”. Iowa City: DSC 2007 North America.
- [4].FAAC. (2019). RDS-2000 Full Cab Research Simulators | Realtime Technologies. Available at: https://www.faac.com/realtime-technologies/products/rds-2000-full-cab-driving-simulator/
- [5].Parallelgraphics.com. (2019). Internet Scene Assembler 2.0 User’s Guide. Available at: http://www.parallelgraphics.com/l2/bin/isa2_guide.pdf
- [6].SimCreator User’s Manual. (2019). Realtime Technologies, Inc.
- [7].Smart Eye Pro Manual. (2017). Smart Eye.
- [8].Eyesdx.com. (2019). Available at: https://www.eyesdx.com/wp-content/uploads/2015/01/Flyer-2015-A.pdf
- [9].Conda.io. (2019). Conda — Conda documentation. Available at: https://conda.io/en/latest/
- [10].Docs.uabgrid.uab.edu. (n.d.). Anaconda - UABgrid Documentation. Available at: https://docs.uabgrid.uab.edu/wiki/Anaconda
- [11].Jupyter-notebook.readthedocs.io. (2019). The Jupyter Notebook — Jupyter Notebook 6.0.2 documentation. Available at: https://jupyter-notebook.readthedocs.io/en/stable/notebook.html.
- [12].Pandas.pydata.org. (n.d.). pandas: powerful Python data analysis toolkit — pandas 0.25.3 documentation. Available at: https://pandas.pydata.org/pandas-docs/stable/
