
Table 2.

Comparison of operations across ML lifecycle stages before and after the initialization of MLife

Badcase management and analysis
  Before Initialization:
    Tools: Excel and JIRA issues
    Workflow: At least 4 meetings among the Testing, Engineering, and Algorithm teams to align badcases and discuss the plan for data collection and ML model iteration
    Standardization: Hard to unify badcase names and priorities
    Scalability: At most 3 customers, due to redundant manual work on badcase management and alignment
  After Initialization:
    Tools: BAM in MLife
    Workflow: The Testing Team uploads all badcases to BAM; at most 2 meetings between the Engineering and Algorithm teams to discuss the plan for data collection and ML model iteration
    Standardization: All badcase-related metrics are standardized
    Scalability: Unlimited, by extending the APIs in the outer layer

Data management and analysis
  Before Initialization:
    Tools: Excel and folders in services
    Workflow: Manually move, copy, and analyze data using Excel and Linux commands
    Standardization: Hard to reuse data-analysis scripts, since algorithm engineers store data in different places and follow different naming strategies
    Scalability: At most 4 customers, due to redundant manual work on data management and preparation
  After Initialization:
    Tools: DAM in MLife
    Workflow: Upload the raw images to MLife, then manage and analyze them in DAM using the built-in functions
    Standardization: All built-in functions and APIs can be reused and extended by algorithm engineers and data scientists from different teams
    Scalability: Unlimited, by extending the APIs in the outer and inner layers

Model training and testing
  Before Initialization:
    Tools: Python scripts and WIKI
    Workflow: Copy all the data to a specific location, manually or with scripts; after packaging, the training and testing tasks can be triggered; training and testing results must be managed manually in different folders and WIKI pages
    Standardization: Hard to standardize the names and storage locations of trained ML models, logs, and testing results
    Scalability: At most 4 customers, due to redundant manual work on data management and preparation
  After Initialization:
    Tools: DAM, MTT, and MSM in MLife
    Workflow: All training and testing data are stored in DAM and linked to the training and testing scripts via CSV IDs; the training logs and testing results are then stored in MTT, and the trained ML models are stored in MSM and linked to MTT via model IDs (see the sketch after the table)
    Standardization: Names and storage locations of ML models, logs, and testing results are all standardized
    Scalability: Unlimited, by extending the APIs in the outer and inner layers

Model management and serving
  Before Initialization:
    Tools: Excel and folders in services
    Workflow: Manually move and copy ML models from different services for serving; send emails to the relevant people regarding the serving details
    Standardization: Hard to standardize the names and storage locations of ML models; hard to automate the serving workflow
    Scalability: At most 6 customers, due to redundant manual work on ML model management and testing after serving
  After Initialization:
    Tools: MSM in MLife
    Workflow: All ML models and serving logs are stored and managed in MSM; serving actions can be configured and triggered in MSM, and the serving details are automatically sent to the relevant people afterwards
    Standardization: Names and storage locations are standardized; serving details and the serving workflow are unified after configuration
    Scalability: Unlimited and automated, by extending the APIs in the two layers

Note that the number of customers is an estimate of how many a fixed number of Algorithm, Engineering, and Testing team members can support.
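
The table repeatedly describes artifacts being tied together through standardized IDs: CSV IDs link datasets in DAM to training and testing scripts in MTT, and model IDs link trained models in MSM back to their MTT runs. As a rough illustration only, the sketch below models that linkage with plain Python dataclasses. Every class, field, and ID format here is hypothetical; none of it is MLife's actual API, which the table does not specify.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List

# Hypothetical records modeling the ID linkage described in Table 2.
# The names are illustrative only; they show how badcases, datasets,
# training runs, and served models could be connected through
# standardized IDs instead of Excel sheets, folders, and WIKI pages.

@dataclass
class Badcase:                       # managed by BAM
    badcase_id: str
    name: str                        # unified naming instead of ad-hoc Excel entries
    priority: int                    # standardized priority scale

@dataclass
class Dataset:                       # managed by DAM
    csv_id: str                      # links raw data to training/testing scripts
    image_paths: List[str]

@dataclass
class TrainingRun:                   # managed by MTT
    run_id: str
    train_csv_id: str                # points back into DAM
    test_csv_id: str
    log_path: str                    # standardized location, no manual bookkeeping
    model_id: str                    # links the produced model into MSM

@dataclass
class ServedModel:                   # managed by MSM
    model_id: str
    source_run_id: str               # traceable back to the MTT run
    deployed_at: datetime = field(default_factory=datetime.now)

# Example linkage: a badcase drives data collection, training, and serving,
# with every artifact reachable through IDs rather than copied files.
bc = Badcase("BC-0012", "night_glare_false_positive", priority=1)
ds = Dataset("CSV-0458", ["raw/img_001.jpg", "raw/img_002.jpg"])
run = TrainingRun("RUN-0931", train_csv_id=ds.csv_id, test_csv_id="CSV-0459",
                  log_path="logs/RUN-0931", model_id="MODEL-0271")
served = ServedModel(model_id=run.model_id, source_run_id=run.run_id)

print(f"{bc.badcase_id} -> {ds.csv_id} -> {run.run_id} -> {served.model_id}")
```

Under this reading, the "standardization" gains in the table amount to making every artifact addressable by a stable ID, which is what lets the "scalability" rows replace per-customer manual work with API extensions in the outer and inner layers.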