2020 Sep 29;21:428. doi: 10.1186/s12859-020-03774-1

Table 1.

Parameters in SnakeMake configuration file

config.yml parameter	Explanation	Example
1 input_data_folder	Path to folder in which input data can be found	/input_data
2 input_data_files	List of prefixes of data files	['input_data1', 'input_data2']
3 gold_standard_file	File name of the gold standard file for each input data set; must be in input_data_folder	{'input_data': 'gold_standard_file.txt'}
4 read_csv_kwargs	pandas.read_csv keyword arguments for input data	{'test_input': {'index_col':[0]}}
5 output_folder	Path to folder into which results should be written	/results
6 intermediates_folder	Name of subfolder to put intermediate results	clustering_intermediates
7 clustering_results	Name of subfolder to put aggregated results	clustering
8 clusterer_kwargs	Additional arguments to pass to clusterers	{'KMeans': {'random_state': 8}}
9 generate_parameters_addtl_kwargs	Additional keyword arguments for the hypercluster.AutoClusterer class	{'KMeans': {'random_search': true}}
10 evaluations	Names of evaluation metrics to use	['silhouette_score', 'number_clustered']
11 eval_kwargs	Additional kwargs per evaluation metric function	{'silhouette_score': {'random_state': 8}}
12 metric_to_choose_best	Which metric to maximize to choose the labels	silhouette_score
13 metric_to_compare_labels	Which metric to use to compare label results to each other	adjusted_rand_score
14 compare_samples	Whether to make a table and figure with counts of how often each pair of samples is in the same cluster	"true"
15 output_kwargs	pandas.to_csv and pandas.read_csv keyword arguments for output tables	{'evaluations': {'index_col':[0]}, 'labels': {'index_col':[0]}}
16 heatmap_kwargs	Arguments for seaborn.heatmap for pairwise visualizations	{'vmin':-2, 'vmax':2}
17 optimization_parameters	Which algorithms and corresponding hyperparameters to try	{'KMeans': {'n_clusters': [5, 6, 7] }}
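
To see how the parameters fit together, the example values from Table 1 can be collected into a single mapping, as they might appear once config.yml is loaded. This is a minimal sketch, not the package's own loading code. It assumes that the keys of gold_standard_file and read_csv_kwargs are input file prefixes (so 'input_data' and 'test_input' from the table are replaced here with 'input_data1'), and check_config is a hypothetical helper that is not part of hypercluster.

```python
# Example values taken from Table 1, arranged as the dictionary that a
# loaded config.yml might produce. Assumption: per-file settings are keyed
# by the prefixes listed in input_data_files.
config = {
    "input_data_folder": "/input_data",
    "input_data_files": ["input_data1", "input_data2"],
    "gold_standard_file": {"input_data1": "gold_standard_file.txt"},
    "read_csv_kwargs": {"input_data1": {"index_col": [0]}},
    "output_folder": "/results",
    "intermediates_folder": "clustering_intermediates",
    "clustering_results": "clustering",
    "clusterer_kwargs": {"KMeans": {"random_state": 8}},
    "generate_parameters_addtl_kwargs": {"KMeans": {"random_search": True}},
    "evaluations": ["silhouette_score", "number_clustered"],
    "eval_kwargs": {"silhouette_score": {"random_state": 8}},
    "metric_to_choose_best": "silhouette_score",
    "metric_to_compare_labels": "adjusted_rand_score",
    "compare_samples": "true",
    "output_kwargs": {
        "evaluations": {"index_col": [0]},
        "labels": {"index_col": [0]},
    },
    "heatmap_kwargs": {"vmin": -2, "vmax": 2},
    "optimization_parameters": {"KMeans": {"n_clusters": [5, 6, 7]}},
}


def check_config(cfg):
    """Hypothetical sanity check; returns a list of problem descriptions."""
    problems = []
    prefixes = set(cfg["input_data_files"])

    # Per-file settings should refer to a declared input file prefix.
    for key in ("gold_standard_file", "read_csv_kwargs"):
        for name in cfg.get(key, {}):
            if name not in prefixes:
                problems.append(f"{key}: '{name}' is not in input_data_files")

    # Assumption: the metric used to pick the best labels must be computed,
    # so it should appear among the requested evaluations.
    if cfg["metric_to_choose_best"] not in cfg["evaluations"]:
        problems.append("metric_to_choose_best is not listed in evaluations")

    return problems


print(check_config(config))
```

An empty list means the two illustrative checks passed; it says nothing about whether the workflow itself would accept the file.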