2021 Mar 10;7:e426. doi: 10.7717/peerj-cs.426

Algorithm 2. Data stream preprocessing: duplicate instance rule.

Input: An input stream S = s1s2…sn.
Output: Preprocessed stream.
Apply the Data Quality Rule (DQR) on input data // Duplicate Instance Rule
• Find Duplicate: Given an input stream S = s1s2…sn, where each si ∈ [m] and n > m, find a value a ∈ [m] that appears more than once in S. Because n > m, at least one such value must exist (pigeonhole principle).
         Begin
           For each si ∈ S do
             Get the next K records si+1 … si+K
             If any of these K records is similar to si then
               Flag = “Y”
               Skip si (treat it as a duplicate).
             Else
               Flag = “N”
               Keep si in the preprocessed stream.
             End If
           End For
         End
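
The duplicate instance rule above can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation, and it rests on several assumptions not stated in the algorithm: the stream is already buffered in a list, "similar" defaults to exact equality, and the lookahead window covers the K records after si. The names `find_duplicate`, `flag_duplicates`, `preprocess`, and the `similar` parameter are hypothetical.

```python
import operator
from typing import Callable, Hashable, List, Optional, Sequence, Tuple, TypeVar

T = TypeVar("T")


def find_duplicate(stream: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the first value seen twice, or None if all values are distinct."""
    seen = set()
    for value in stream:
        if value in seen:
            return value
        seen.add(value)
    return None


def flag_duplicates(
    stream: Sequence[T],
    k: int,
    similar: Callable[[T, T], bool] = operator.eq,  # assumption: exact match
) -> List[Tuple[T, str]]:
    """Pair each record with "Y" if one of the next k records is similar to it."""
    flagged = []
    for i, record in enumerate(stream):
        # Lookahead window: the (at most) k records that follow position i.
        window = stream[i + 1 : i + 1 + k]
        is_duplicate = any(similar(record, other) for other in window)
        flagged.append((record, "Y" if is_duplicate else "N"))
    return flagged


def preprocess(
    stream: Sequence[T],
    k: int,
    similar: Callable[[T, T], bool] = operator.eq,
) -> List[T]:
    """Keep only the records flagged "N" (no similar record in the window)."""
    return [rec for rec, flag in flag_duplicates(stream, k, similar) if flag == "N"]


if __name__ == "__main__":
    s = [1, 2, 1, 3, 2, 4]
    print(find_duplicate(s))
    print(flag_duplicates(s, k=2))
    print(preprocess(s, k=3))
```

Under this reading, a record is dropped when a similar record follows it within the window, so the last occurrence inside each window is the one retained. Duplicates that lie more than K positions apart are not detected, so the choice of K trades memory and comparison cost against coverage.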