Algorithm 2: Log Softmax with ASGO.

// Log Softmax Function
Input: a vector of real numbers, denoted $x$.
Output: a vector of the same shape as $x$, where each element is the logarithm of the corresponding softmax probability.
Notation: let $f(x)$ denote the log softmax function applied to the vector $x$:

$$f(x) = \log(\mathrm{softmax}(x)), \qquad \mathrm{softmax}(x_i) = \frac{\exp(x_i)}{\sum_{j=1}^{n} \exp(x_j)},$$

where $n$ is the number of elements in the vector $x$.

Input → a vector of real numbers; $I$ → iterations; $K$ → log softmax steps
for $e = 1 \to I$ do
    for $k = 1 \to K$ do
        for each input $x_j^n$ in $c_j$ do
            CNN layer: pass $x_j^n$ through the CNN to obtain $f_j^n$
            Max pool: $r_{j,i}^n = \mathrm{maxpool}(f_j^n)$
            Batch normalization: $d_{j,i}^n$
            CNN layer: feed $r_{j,i}^n$ into the ASGO and update all network parameters using log softmax
            if $j = i$ then $f_j^n$ else $1 - f_j^n$ end
        end
    end
end
Calculate the image classes; if the image classes are classified, then break.
$f_j^n = \log(\mathrm{softmax}(x))$, with $\mathrm{softmax}(x)$ as defined above.

// Adaptive Subgradient Optimizer (Adagrad)
$\theta$: model parameters (weights and biases).
$J(\theta)$: the objective function to be minimized (typically the loss function).
$g_t$: the gradient of the objective function with respect to $\theta$ at time step $t$.
$\eta$: learning rate (a hyperparameter).
$\varepsilon$: a small constant to avoid division by zero.
$G$: a diagonal matrix whose element $G_{ii}$ accumulates the squared sum of past gradients for parameter $\theta_i$.

$$G_t = G_{t-1} + g_t \odot g_t \quad \text{(element-wise accumulation of squared gradients)} \tag{3}$$

$$\theta_t = \theta_{t-1} - \frac{\eta}{\sqrt{G_t + \varepsilon}} \odot g_t \quad \text{(element-wise division)} \tag{4}$$

Note: the square root is applied element-wise to $G_t$.
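Returning to the log softmax definition at the top of Algorithm 2, the following minimal sketch (NumPy; the function name is chosen here for illustration and is not from the paper) computes $f(x) = \log(\mathrm{softmax}(x))$ in a numerically stable way by shifting the logits by their maximum before exponentiating.

```python
import numpy as np

def log_softmax(x):
    """Numerically stable f(x) = log(softmax(x)).

    Shifting by max(x) leaves the result unchanged (the shift cancels in the
    numerator and denominator of softmax) but prevents overflow in exp().
    """
    x = np.asarray(x, dtype=float)
    shifted = x - np.max(x)
    return shifted - np.log(np.sum(np.exp(shifted)))

logits = np.array([2.0, 1.0, 0.1])
log_probs = log_softmax(logits)
print(log_probs)                # log-probabilities, all <= 0
print(np.exp(log_probs).sum())  # recovers softmax(x); sums to 1.0
```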
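A minimal sketch of the update rule in Equations (3) and (4), assuming plain NumPy arrays; the helper name `adagrad_step` and the quadratic objective used for the demonstration are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def adagrad_step(theta, g, G, eta=0.01, eps=1e-8):
    """One Adagrad update following Equations (3) and (4).

    theta : current parameter vector
    g     : gradient of J(theta) at the current time step
    G     : running element-wise sum of squared gradients (diagonal of G_t)
    """
    G = G + g * g                                 # Eq. (3): accumulate squared gradients
    theta = theta - eta / np.sqrt(G + eps) * g    # Eq. (4): element-wise scaled step
    return theta, G

# Example: minimize J(theta) = ||theta||^2, whose gradient is 2 * theta
theta = np.array([1.0, -2.0, 3.0])
G = np.zeros_like(theta)
for _ in range(100):
    grad = 2.0 * theta
    theta, G = adagrad_step(theta, grad, G, eta=0.5)
print(theta)  # moves toward the minimizer at the origin
```

Because $G_t$ only grows, the effective step size $\eta / \sqrt{G_t + \varepsilon}$ shrinks per parameter over time, which is what makes the update adaptive.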
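Putting the steps of Algorithm 2 together, the sketch below (PyTorch; the layer sizes, input resolution, class count, and random batch are illustrative assumptions, not values from the paper) passes images through a convolutional layer, max pooling, and batch normalization, applies log softmax to obtain class log-probabilities, and updates all network parameters with the adaptive subgradient optimizer (`torch.optim.Adagrad`), mirroring the updates in Equations (3) and (4).

```python
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    """Minimal CNN following the steps of Algorithm 2 (sizes are illustrative)."""
    def __init__(self, num_classes=4):
        super().__init__()
        self.conv = nn.Conv2d(3, 16, kernel_size=3, padding=1)  # CNN layer -> feature maps
        self.pool = nn.MaxPool2d(2)                             # max pooling
        self.bn = nn.BatchNorm2d(16)                            # batch normalization
        self.fc = nn.Linear(16 * 16 * 16, num_classes)          # assumes 32x32 inputs
        self.log_softmax = nn.LogSoftmax(dim=1)                 # f(x) = log(softmax(x))

    def forward(self, x):
        x = self.bn(self.pool(torch.relu(self.conv(x))))
        x = x.flatten(1)
        return self.log_softmax(self.fc(x))

model = SmallCNN(num_classes=4)
optimizer = torch.optim.Adagrad(model.parameters(), lr=0.01, eps=1e-8)  # ASGO / Adagrad
criterion = nn.NLLLoss()  # expects log-probabilities, pairs with LogSoftmax

# One illustrative training step on a random batch (a stand-in for the image loader)
images = torch.randn(8, 3, 32, 32)
labels = torch.randint(0, 4, (8,))
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()  # Adagrad applies the Eq. (3)/(4) update to every parameter
print(loss.item())
```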