Algorithm 3: Measures and Concerns Detector
Input: tweets_p; [K]; [R]; threshold
Output: concerns[][], tweets_g_DF
1: spark ← createSparkSession()
2: tweets_DF ← spark.read(tweets_p)
3: features_DF ← generate_TFIDF_vector(tweets_DF)
4: LDAmodel ← get_best_model(LDA_clustering(features_DF, [K], [R]))
5: concernsProb_tw_DF ← train_best_model(LDAmodel)
6: concerns[][] ← LDAmodel.describeTopics()
7: concern_tw_DF ← assign_tweets_to_concern(concernsProb_tw_DF)
8: tweets_g_DF ← group_filter_tweets(concern_tw_DF, threshold)
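Steps 7 and 8 of the algorithm can be illustrated with a minimal, plain-Python sketch. It rests on assumptions the listing does not spell out: that each tweet carries a vector of per-concern probabilities (as an LDA topic distribution would provide), that a tweet is assigned to its most probable concern, and that `threshold` is a minimum probability a tweet must reach to be kept. The function names mirror the pseudocode, but their bodies are hypothetical, and in-memory lists stand in for the Spark DataFrames.

```python
from collections import defaultdict


def assign_tweets_to_concern(concern_probs):
    """Step 7 (assumed semantics): label each tweet with its most probable concern."""
    assigned = []
    for tweet_id, probs in concern_probs:
        # Index of the largest probability in the tweet's concern distribution.
        best = max(range(len(probs)), key=probs.__getitem__)
        assigned.append((tweet_id, best, probs[best]))
    return assigned


def group_filter_tweets(assigned, threshold):
    """Step 8 (assumed semantics): keep confident assignments, grouped by concern."""
    groups = defaultdict(list)
    for tweet_id, concern, prob in assigned:
        # Assumption: threshold is the minimum probability for keeping a tweet.
        if prob >= threshold:
            groups[concern].append(tweet_id)
    return dict(groups)


# Toy stand-in for concernsProb_tw_DF: (tweet id, per-concern probabilities).
concern_probs = [
    ("t1", [0.80, 0.15, 0.05]),
    ("t2", [0.40, 0.35, 0.25]),
    ("t3", [0.10, 0.20, 0.70]),
    ("t4", [0.05, 0.90, 0.05]),
]

assigned = assign_tweets_to_concern(concern_probs)
tweets_g = group_filter_tweets(assigned, threshold=0.6)
print(tweets_g)
```

With these toy values, tweet `t2` is dropped because its strongest concern only reaches 0.40, while the other three tweets are each grouped under their dominant concern.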