Table 2. Comparison of operations at each stage of the ML lifecycle before and after initialization with MLife
| Stage | Aspect | Before Initialization | After Initialization |
|---|---|---|---|
| Badcase management and analysis | Tools | Excel and JIRA issues | BAM in MLife |
| | Workflow | At least 4 meetings between the Testing, Engineering, and Algorithm teams to align badcases and discuss the plan for data collection and ML model iteration | The Testing Team uploads all badcases to BAM; at most 2 meetings between the Engineering and Algorithm teams to discuss the plan for data collection and ML model iteration |
| | Standardization | Hard to unify badcase names and priorities | All badcase-related metrics are standardized |
| | Scalability | At most 3 customers, due to redundant manual work on badcase management and alignment | Unlimited, by extending the APIs in the outer layer |
| Data management and analysis | Tools | Excel and folders in services | DAM in MLife |
| | Workflow | Manual moving, copying, and analysis using Excel and Linux commands | Upload the raw images to MLife, then manage and analyze them in DAM using the built-in functions |
| | Standardization | Hard to reuse data-analysis scripts, since algorithm engineers store data in different places and follow different naming strategies | All built-in functions and APIs can be reused and extended by algorithm engineers and data scientists from different teams |
| | Scalability | At most 4 customers, due to redundant manual work on data management and preparation | Unlimited, by extending the APIs in the outer and inner layers |
| Model training and testing | Tools | Python scripts and WIKI | DAM, MTT, and MSM in MLife |
| | Workflow | Copy all the data to a specific location manually or via scripts; after packaging, the training and testing tasks can be triggered; training and testing results must be managed manually in different folders and WIKI pages | All training and testing data are stored in DAM and linked to the training and testing scripts via CSV IDs; training logs and testing results are stored in MTT; trained ML models are stored in MSM and linked to MTT via model IDs |
| | Standardization | Hard to standardize the names and storage locations of trained ML models, logs, and testing results | Names and storage locations of ML models, logs, and testing results are all standardized |
| | Scalability | At most 4 customers, due to redundant manual work on data management and preparation | Unlimited, by extending the APIs in the outer and inner layers |
| Model management and serving | Tools | Excel and folders in services | MSM in MLife |
| | Workflow | Manually move and copy ML models from different services for serving; send emails to the relevant people with the serving details | All ML models and serving logs are stored and managed in MSM; serving actions can be configured and triggered in MSM; serving details are automatically sent to the relevant people afterwards |
| | Standardization | Hard to standardize the names and storage locations of ML models; hard to automate the serving workflow | Names and storage locations are standardized; serving details and workflow are unified after configuration |
| | Scalability | At most 6 customers, due to redundant manual work on ML model management and testing after serving | Unlimited and automated, by extending the APIs in the two layers |
Note: the number of customers is an estimate of how many a fixed number of Algorithm, Engineering, and Testing team members can support. Illustrative code sketches of the "After Initialization" workflows follow below.
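
The badcase workflow hinges on the Testing Team uploading badcases to BAM through the outer-layer API. A minimal sketch of what such an upload might look like is below; the endpoint path, field names, and response schema are all hypothetical, since the table only states that badcases and their standardized metrics are registered in BAM.

```python
# Hedged sketch: uploading one badcase to BAM with standardized metadata.
# The URL, form fields, and response schema are illustrative assumptions.
import requests

BAM_URL = "http://mlife.internal/api/v1/badcases"  # hypothetical outer-layer endpoint

def upload_badcase(image_path: str, name: str, priority: str, customer: str) -> str:
    """Upload a badcase image plus standardized metadata; return its BAM ID."""
    with open(image_path, "rb") as f:
        resp = requests.post(
            BAM_URL,
            files={"image": f},
            data={"name": name, "priority": priority, "customer": customer},
            timeout=30,
        )
    resp.raise_for_status()
    return resp.json()["badcase_id"]  # hypothetical response field

if __name__ == "__main__":
    bc_id = upload_badcase("overexposed_frame_0421.png",
                           name="overexposure/backlight",
                           priority="P1",
                           customer="customer_A")
    print(f"registered badcase {bc_id} in BAM")
```

Because the names and priorities are validated server-side, the Engineering and Algorithm teams can filter and align badcases without the cross-team meetings the old Excel/JIRA flow required.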
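
DAM's value in the second row comes from reusable built-in analysis functions that replace ad-hoc Excel and Linux-command work. The sketch below shows the kind of small, shareable function the table implies; the function name and the metadata schema (a CSV export with an `image_id` and `label` column) are assumptions made for illustration.

```python
# Hedged sketch of a DAM-style built-in analysis function. The metadata
# schema is an assumption; the table only says DAM provides reusable
# built-in functions for data management and analysis.
import csv
from collections import Counter

def label_distribution(metadata_csv: str) -> Counter:
    """Count images per label in a DAM metadata export."""
    counts: Counter = Counter()
    with open(metadata_csv, newline="") as f:
        for row in csv.DictReader(f):
            counts[row["label"]] += 1
    return counts

if __name__ == "__main__":
    for label, n in label_distribution("dam_metadata.csv").most_common():
        print(f"{label:20s} {n}")
```

Once such a function lives behind a shared API rather than in one engineer's home directory, different teams can reuse and extend it regardless of where the raw data originally sat.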
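
The third row's key idea is traceability through IDs: a training run references its DAM dataset by CSV ID, its logs live in MTT, and the resulting model in MSM points back to the run by model ID. A minimal sketch of that linkage follows; the record fields and the stubbed training step are assumptions, not MLife's actual schema.

```python
# Hedged sketch of the ID-based linkage across DAM, MTT, and MSM.
# Record fields are illustrative assumptions.
from dataclasses import dataclass, field
import uuid

@dataclass
class TrainingRun:            # conceptually, a record stored in MTT
    run_id: str
    csv_id: str               # links the run to its DAM dataset
    logs: list = field(default_factory=list)

@dataclass
class ModelRecord:            # conceptually, a record stored in MSM
    model_id: str
    run_id: str               # links the model back to its MTT run

def train(csv_id: str):
    run = TrainingRun(run_id=str(uuid.uuid4()), csv_id=csv_id)
    run.logs.append(f"loaded dataset {csv_id} from DAM")
    run.logs.append("epoch 1/1 done")  # placeholder for the real training loop
    model = ModelRecord(model_id=str(uuid.uuid4()), run_id=run.run_id)
    return run, model

if __name__ == "__main__":
    run, model = train(csv_id="csv-20230115-007")
    print(run.logs)
    print(f"model {model.model_id} -> run {model.run_id} -> data {run.csv_id}")
```

With every artifact addressable by ID, the manual folder-and-WIKI bookkeeping of the old workflow disappears: given a model ID, the run and the exact training data can be recovered mechanically.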
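
Finally, the serving row states that serving actions are configured and triggered in MSM, with details sent to the relevant people automatically. The sketch below simulates that flow; the configuration keys and the notification step are assumptions standing in for whatever MSM actually does.

```python
# Hedged sketch of configuring and triggering a serving action in MSM.
# Config keys and the notification mechanism are illustrative assumptions.
SERVING_CONFIG = {
    "model_id": "msm-model-0042",       # hypothetical MSM model ID
    "target": "edge-cluster-a",         # hypothetical deployment target
    "notify": ["algo-team@example.com", "eng-team@example.com"],
}

def trigger_serving(config: dict) -> None:
    """Deploy the model, then notify stakeholders with the serving details."""
    print(f"deploying {config['model_id']} to {config['target']} ...")
    # A real MSM would push the model artifact here; we only simulate it.
    details = f"model {config['model_id']} now serving on {config['target']}"
    for recipient in config["notify"]:
        # Stand-in for MSM's automatic notification step.
        print(f"notify {recipient}: {details}")

if __name__ == "__main__":
    trigger_serving(SERVING_CONFIG)
```

Compared with manually copying models between services and emailing stakeholders, a single configured trigger keeps the serving workflow and its notifications uniform across customers.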