The proposed CNN configuration. Numbers at the top give the dimensions
of the patch; numbers at the bottom give the dimensions of the feature maps.
Each layer is a convolution, pooling, or ReLU layer. The filter size is (5×5)
for the first 7 layers and (3×3) for the next 5 layers; the last two layers
are fully connected. The output is either F (foreground) or G (background).
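
The way the patch dimension shrinks through such a stack can be traced with the standard output-size formula for convolution and pooling layers. The following is a minimal sketch, not the configuration in the figure: the caption fixes only the filter sizes and layer counts, so the layer order, the 32×32 input patch, the "same" padding on convolutions, and the 2×2 stride-2 pooling are all assumptions made for illustration, and the names `out_size`, `LAYERS`, and `trace_sizes` are hypothetical.

```python
def out_size(size, kernel, stride=1, padding=0):
    """Spatial size of a square feature map after a conv or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


# ASSUMED layer order (not given in the caption):
# 7 layers in the 5x5 block, then 5 layers in the 3x3 block.
LAYERS = [
    ("conv", 5), ("relu", None), ("conv", 5), ("relu", None),
    ("pool", 2), ("conv", 5), ("relu", None),
    ("conv", 3), ("relu", None), ("conv", 3), ("relu", None),
    ("pool", 2),
]


def trace_sizes(patch):
    """Return [(layer_name, spatial_size), ...] for a square input patch."""
    sizes = []
    size = patch
    for name, k in LAYERS:
        if name == "conv":
            # ASSUMPTION: stride 1 with "same" padding, so size is preserved.
            size = out_size(size, k, stride=1, padding=k // 2)
        elif name == "pool":
            # ASSUMPTION: non-overlapping k x k pooling (stride equal to k).
            size = out_size(size, k, stride=k, padding=0)
        # ReLU is element-wise and leaves the spatial size unchanged.
        sizes.append((name, size))
    return sizes


if __name__ == "__main__":
    # ASSUMED 32x32 input patch; the real patch size is read from the figure.
    for name, size in trace_sizes(32):
        print(f"{name:5s} -> {size}x{size}")
```

Under these assumptions the feature maps entering the two fully connected layers are 8×8; with unpadded convolutions or a different patch size the numbers would differ, so the values in the figure take precedence.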