Dilated Convolutional Neural Network for Skin Cancer Classification Based on Image Data

ABSTRACT

image data using Deep Learning is more precise than Machine Learning because Deep Learning is able to handle image data with accuracy and smooth margins (Demir et al., 2021).
One of the methods in Deep Learning is the Convolutional Neural Network (CNN) (Nugroho et al., 2020). The CNN was inspired by Hubel and Wiesel's research on the visual cortex of cats (Aprianto, 2021). It has become the most popular method because it can learn and generalize a problem in a way that resembles the human brain (Indolia et al., 2018). In recent years, the CNN has achieved good results in image classification (Lei et al., 2019; Maulana & Rochmawati, 2019).
Image classification with the Convolutional Neural Network method has been widely studied. Q. Li et al. (2014) automatically classified High Resolution Computed Tomography (HRCT) lung images of interstitial lung disease patterns with good accuracy. Pratt et al. (2016) obtained a sensitivity of 95% and an accuracy of 75% in classifying diabetic retinopathy images. Ragab et al. (2019) classified breast cancer images with an accuracy of 73.6%. Fu'adah et al. (2020) classified dermoscopic images of skin cancer with a best accuracy of 99%. Because of these capabilities, the method is suitable for complex problems and for problems that involve large-scale data. However, its computational process takes a long time on such problems (Putra et al., 2020; Krizhevsky et al., 2007; Qotrunnada & Utomo, 2022). To overcome this, this study uses the Dilated Convolutional Neural Network method because, according to Lei et al. (2019), it achieves better results and a shorter image classification time than the Convolutional Neural Network method.
Based on this background, this study proposes the Dilated Convolutional Neural Network method for classifying skin cancer from image data. The contribution of this study is a development of the Convolutional Neural Network method that modifies the dilation factor. The study is expected to contribute to technological progress, especially in basic knowledge and applied science, and to the application of mathematical computation in health technology.

B. METHODS
The data used in this study is the HAM10000 dataset (Codella et al., 2019), a dermoscopic image dataset consisting of 10015 images with a size of 600 × 450 pixels. The amount of data used in this study can be seen in Table 1. The original dataset in Table 1 is divided into two categories, a training dataset and a testing dataset, and the division for each class is shown in Table 2.
The research flowchart can be seen in Figure 1. Based on Figure 1, the first step is resizing the image data to 224 × 224. This size is used because several popular architectures, such as GoogLeNet, ResNet50, ResNet101, and AlexNet, use 224 × 224 input images. The next step is dividing the data into training data and testing data. The next step is building the architecture that will be used in the feature extraction stage of the Dilated Convolutional Neural Network. The architecture consists of 8 layers: an input layer, 2 dilated convolution layers, 1 average pooling layer, 3 ReLU layers, and a dropout layer. This architecture was chosen because, according to Khalifa et al. (2020), it gets better results than other architectures.
The next step is feeding the training data into the Dilated Convolutional Neural Network to find the best weights for each convolution layer, starting from randomly initialized weights. These weights are used to extract the image features used in the next process. The Dilated Convolutional Neural Network method is divided into two stages: feature extraction (dilated convolution layer, ReLU layer, pooling layer, and dropout layer) and the fully connected layer (classification process). After the best result is obtained, it is evaluated using the confusion matrix.

Pre-processing
At this stage, the dataset is prepared and each image is resized to the input size of 224 × 224. The data is then divided using the k-fold method. The k-fold method is a technique for measuring the performance of a model by taking a random sample of the dataset to be used in the testing process (Marcot & Hanea, 2021).
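The k-fold division described above can be sketched as follows. This is a minimal illustration, not the study's actual split: the use of four folds is an assumption taken from the fold indices reported in the results, and the function names and the seed are hypothetical.

```python
import random


def k_fold_indices(n_samples, n_folds=4, seed=0):
    """Shuffle the sample indices and split them into n_folds test folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every n_folds-th index, so fold sizes differ by at most one.
    return [indices[i::n_folds] for i in range(n_folds)]


def split_for_fold(folds, test_fold):
    """Use one fold for testing and all remaining folds for training."""
    test = folds[test_fold]
    train = [idx for i, fold in enumerate(folds) if i != test_fold for idx in fold]
    return train, test


# 10015 is the number of images stated for the dataset; 4 folds is assumed.
folds = k_fold_indices(n_samples=10015, n_folds=4)
train_idx, test_idx = split_for_fold(folds, test_fold=0)
```

Each image index appears in exactly one test fold, so every image is used for testing once across the four runs.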

Feature Extraction
a. Dilated Convolution Layer
The basic principle of the Dilated Convolutional Neural Network is to insert holes (spaces) between the points of the kernel when the input matrix is multiplied by the kernel. The Convolutional Neural Network method uses a dilation factor of d = 1, whereas the Dilated Convolutional Neural Network method uses d > 1 (Lin et al., 2018). The Dilated Convolutional Neural Network calculation can use Equation (1) with d > 1. The difference between convolution and dilated convolution can be seen in Figure 2.
Figure 2. Convolutional Neural Network; Dilated Convolutional Neural Network (Lei et al., 2019)

In the convolution layer, the output value is the result of multiplying the input value with the filter (Naranjo-Torres et al., 2020). The two-dimensional convolution operation is expressed by input $I(i, j)$ with filter $K(m, n)$:

$$S(i, j) = \sum_{m} \sum_{n} I(i - d\,m,\; j - d\,n)\, K(m, n) \tag{1}$$

where $d$ is the dilation factor, and $d = 1$ gives the standard convolution.
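The hole/space idea described above can be illustrated with a small pure-Python sketch. It is an illustration under stated assumptions rather than the implementation used in the study: the function name is hypothetical, there is no padding, the stride is 1, and the kernel is applied without flipping (the cross-correlation indexing that most deep learning libraries use for their convolution layers).

```python
def dilated_conv2d(image, kernel, dilation=1):
    """Valid (no padding, stride 1) 2D dilated convolution on nested lists."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    # A dilated kernel spans dilation * (size - 1) + 1 input positions per axis.
    span_r = dilation * (k_rows - 1) + 1
    span_c = dilation * (k_cols - 1) + 1
    out_rows = len(image) - span_r + 1
    out_cols = len(image[0]) - span_c + 1
    output = []
    for i in range(out_rows):
        row = []
        for j in range(out_cols):
            total = 0
            for m in range(k_rows):
                for n in range(k_cols):
                    # Neighbouring kernel points are `dilation` pixels apart.
                    total += image[i + dilation * m][j + dilation * n] * kernel[m][n]
            row.append(total)
        output.append(row)
    return output


# Toy 5 x 5 input whose entries are 0..24, and a 3 x 3 kernel of ones.
image = [[5 * r + c for c in range(5)] for r in range(5)]
kernel = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
standard = dilated_conv2d(image, kernel, dilation=1)  # 3 x 3 output
dilated = dilated_conv2d(image, kernel, dilation=2)   # 1 x 1 output
```

With dilation 2 the same nine kernel weights cover a 5 × 5 region of the input, which is how dilation enlarges the receptive field without adding parameters.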
Definition 1 (Dilated Convolution) (Chakraborty et al., 2019). Given an input $f: \mathbb{N} \to \mathbb{R}$ and a kernel $k: \{0, \dots, s - 1\} \to \mathbb{R}$, the dilated convolution function $(f *_d k): \mathbb{N} \to \mathbb{R}$ is:

$$(f *_d k)(x) = \sum_{i=0}^{s-1} k(i)\, f(x - d \cdot i)$$

where $\mathbb{N}$ is the set of natural numbers, $s$ is the size of the kernel, and $d$ is the dilation factor.

b. Average Pooling Layer
The average pooling layer is a method that reduces the size of a matrix. An illustration of the average pooling process can be seen in Figure 3.
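The size reduction performed by average pooling can be sketched as follows. The 2 × 2 window and stride of 2 are assumptions chosen for illustration, since the pooling size is not stated here, and the function name is hypothetical.

```python
def average_pool2d(matrix, pool_size=2, stride=2):
    """Replace each pool_size x pool_size window with its mean value."""
    rows = (len(matrix) - pool_size) // stride + 1
    cols = (len(matrix[0]) - pool_size) // stride + 1
    output = []
    for i in range(rows):
        row = []
        for j in range(cols):
            window = [
                matrix[i * stride + a][j * stride + b]
                for a in range(pool_size)
                for b in range(pool_size)
            ]
            row.append(sum(window) / len(window))
        output.append(row)
    return output


feature_map = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
pooled = average_pool2d(feature_map)  # 4 x 4 input becomes 2 x 2
```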
d. Dropout Layer
The dropout layer is a technique used to avoid overfitting. During each training iteration, several neurons are randomly removed from the network with a given probability, generally 0.5 (Labach et al., 2019). The illustration can be seen in Figure 4. Based on Figure 4, the dropout layer nullifies the contribution of some neurons to the next layer and leaves all others unmodified.
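A minimal sketch of dropout on a list of activations is given below. It uses the "inverted" variant that is common in deep learning libraries, in which the kept activations are rescaled by 1 / (1 - p) during training so that nothing has to change at test time. That rescaling is an assumption of this sketch and differs slightly from the description above, where the kept neurons are left unmodified. The function name is hypothetical.

```python
import random


def dropout(activations, p=0.5, training=True, rng=None):
    """Zero each activation with probability p during training."""
    if not training or p == 0.0:
        # At test time every neuron contributes and nothing is rescaled.
        return list(activations)
    rng = rng or random.Random()
    keep = 1.0 - p
    # Kept activations are divided by the keep probability (inverted dropout).
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


activations = [1.0, 2.0, 3.0, 4.0]
dropped = dropout(activations, p=0.5, rng=random.Random(0))
unchanged = dropout(activations, training=False)
```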

Classification
This stage consists of several layers in which the nodes/neurons are fully connected to the previous layer. It is computed with the forward method and the backward method (Rumelhart et al., 1986). The mathematical equation of the forward method can be seen in Equation (4).

$$y = \sum_{i=1}^{n} w_i x_i + b \tag{4}$$

where $x$ is the input, $y$ is the output, $b$ is the bias, $n$ is the vector length of the input, and $w$ is the weight. The mathematical equations of the backward method can be seen in Equation (5), from the output to the hidden layer, and Equation (6), from the hidden layer to the input.
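The forward and backward methods can be sketched for a single fully connected neuron as follows. This is an illustration under stated assumptions, not the study's training code: the squared-error loss is assumed here only to keep the gradients short, and all function names and the toy values are hypothetical. The learning rate of 0.1 matches the value reported for the experiments.

```python
def forward(x, w, b):
    """Weighted sum of the inputs plus the bias."""
    return sum(w_i * x_i for w_i, x_i in zip(w, x)) + b


def backward(x, w, b, target):
    """Gradients of L = 0.5 * (y - target) ** 2 by the chain rule."""
    y = forward(x, w, b)
    error = y - target                   # dL/dy
    grad_w = [error * x_i for x_i in x]  # dL/dw_i = dL/dy * dy/dw_i
    grad_b = error                       # dL/db   = dL/dy * 1
    grad_x = [error * w_i for w_i in w]  # passed back to the previous layer
    return grad_w, grad_b, grad_x


def sgd_step(x, w, b, target, learning_rate=0.1):
    """One gradient-descent update of the weights and the bias."""
    grad_w, grad_b, _ = backward(x, w, b, target)
    new_w = [w_i - learning_rate * g for w_i, g in zip(w, grad_w)]
    new_b = b - learning_rate * grad_b
    return new_w, new_b


x = [1.0, 2.0]
w = [0.5, -0.5]
b = 0.0
target = 1.0
y = forward(x, w, b)
new_w, new_b = sgd_step(x, w, b, target)
y_after = forward(x, new_w, new_b)
```

After one update the output moves closer to the target, which is the behaviour the backward method is meant to produce.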

Confusion Matrix
The confusion matrix contains the actual results and the predicted results of the classification system. The performance of the system can be evaluated from the following quantities in the matrix:
True Positive (TP): data of the positive class identified as the positive class.
True Negative (TN): data of the negative class identified as the negative class.
False Positive (FP): data of the negative class identified as the positive class.
False Negative (FN): data of the positive class identified as the negative class.
From Figure 5, the accuracy and sensitivity values can be calculated (X. Li et al., 2021). These values indicate how well the classification system performs. Accuracy represents the proportion of data whose predicted class matches the actual class. Thus, the greater the accuracy, the more data is classified correctly. The mathematical equation of accuracy can be seen in Equation (7).

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{7}$$

Sensitivity represents the proportion of positive-class data that is classified correctly. Thus, the greater the sensitivity, the better the classification system identifies the positive class. The mathematical equation of sensitivity can be seen in Equation (8).

$$\text{Sensitivity} = \frac{TP}{TP + FN} \tag{8}$$

Precision represents the number of relevant instances out of the total instances the model retrieved. The mathematical equation of precision can be seen in Equation (9).

$$\text{Precision} = \frac{TP}{TP + FP} \tag{9}$$

Recall represents the number of instances the model correctly identified as relevant out of the total relevant instances. The mathematical equation of recall can be seen in Equation (10).

$$\text{Recall} = \frac{TP}{TP + FN} \tag{10}$$

The F1-score is the harmonic mean of precision and recall. The mathematical equation of the F1-score can be seen in Equation (11).

$$\text{F1} = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \tag{11}$$
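The metrics above can be computed from the four confusion-matrix counts as in the sketch below. The counts are made-up illustration values, not results from this study, and the function name is hypothetical. The sketch assumes that no denominator is zero.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity, precision, recall, and F1 from the four counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)
    precision = tp / (tp + fp)
    recall = sensitivity  # recall and sensitivity share the same formula
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts chosen only to exercise the formulas.
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
```

For a multi-class problem such as this one, the same counts are usually taken per class (one class against all others) and the per-class values are then averaged.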

C. RESULT AND DISCUSSION
In this study, skin cancer was classified using the Dilated Convolutional Neural Network (DCNN) method. Several experiments were run with the dilation factor d set to 2, 4, 6, and 8 to obtain the optimal result. The types of skin cancer classified in this study include Benign Keratosis-like Lesions (BKL).
The Dilated Convolutional Neural Network architecture used in this study can be seen in Figure 1. The architecture is based on Khalifa et al. (2020), where it obtained good results without using too many layers. The parameters used in this study are epoch = 100, minibatch size = 8, learning rate = 0.1, and dropout = 0.5. The comparison of the accuracy obtained in this study can be seen in Table 3.
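To make the effect of the tested dilation factors concrete, the sketch below computes the spatial output size of one dilated convolution layer on a 224 × 224 input. The 3 × 3 kernel, stride of 1, and absence of padding are assumptions made for illustration, since these settings are not stated here, and the function name is hypothetical.

```python
def dilated_output_size(input_size, kernel_size, dilation=1, stride=1, padding=0):
    """Spatial output size of a convolution along one axis."""
    # A dilated kernel spans dilation * (kernel_size - 1) + 1 input positions.
    effective_kernel = dilation * (kernel_size - 1) + 1
    return (input_size + 2 * padding - effective_kernel) // stride + 1


# Output size per dilation factor for a 224-pixel axis and an assumed 3 x 3 kernel.
sizes = {d: dilated_output_size(224, kernel_size=3, dilation=d) for d in (1, 2, 4, 6, 8)}
```

Under these assumptions a larger d makes each kernel cover a wider area of the image, so the output feature map shrinks as d grows.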
Based on Table 3, several experiments were carried out with d = 2, d = 4, d = 6, and d = 8. Each dilation factor was tested using the k-fold method with k = 1, k = 2, k = 3, and k = 4. The accuracy is calculated using Equation (7). For d = 2, the accuracy is 84.49% for k = 1, 85.67% for k = 2, 81.79% for k = 3, and 83.58% for k = 4, so the best accuracy is 85.67% at k = 2. For d = 4, the accuracy is 74.88% for k = 1, 74.60% for k = 2, 75.08% for k = 3, and 74.67% for k = 4, so the best accuracy is 75.08% at k = 3. For d = 6, the accuracy is 73.96% for k = 1, 74.68% for k = 2, 75.40% for k = 3, and 75.55% for k = 4, so the best accuracy is 75.55% at k = 4. For d = 8, the accuracy is 68.97% for k = 1, 68.17% for k = 2, 71.33% for k = 3, and 72.71% for k = 4, so the best accuracy is 72.71% at k = 4. The best accuracy in this study is 85.67%, obtained with d = 2 at k = 2, as shown in Table 3 and Figure 13.
The confusion matrix of the best result can be seen in Figure 13. Based on Figure 13, the Dermatofibroma (DF) class tends to be predicted as the Melanocytic Nevi (NV) class. This is most likely because the dataset is imbalanced and dominated by the Melanocytic Nevi (NV) class: based on Table 1, NV accounts for about 67% of the total data. The same tendency can be seen in other classes that are predicted as NV. Based on the confusion matrix, the sensitivity is calculated using Equation (8), giving a value of 0.65.
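The per-fold accuracies quoted above can be summarized with a short sketch that finds the best fold and the mean accuracy for each dilation factor. The numbers are copied from the text; the variable and function names are hypothetical.

```python
# Accuracy (%) per fold k = 1..4 for each dilation factor d, as quoted in the text.
accuracy_by_dilation = {
    2: [84.49, 85.67, 81.79, 83.58],
    4: [74.88, 74.60, 75.08, 74.67],
    6: [73.96, 74.68, 75.40, 75.55],
    8: [68.97, 68.17, 71.33, 72.71],
}


def summarize(accuracies):
    """Best accuracy, the 1-based fold where it occurs, and the mean accuracy."""
    best = max(accuracies)
    return {
        "best": best,
        "best_fold": accuracies.index(best) + 1,
        "mean": sum(accuracies) / len(accuracies),
    }


summary = {d: summarize(acc) for d, acc in accuracy_by_dilation.items()}
best_d = max(summary, key=lambda d: summary[d]["best"])
```

The mean over the four folds for d = 2 is above 80%, which is the figure the average-accuracy claim in the discussion rests on.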
The results show that the Dilated Convolutional Neural Network method is able to classify skin cancer well. The contribution of this study is a development of the Convolutional Neural Network method. Several experimental scenarios were run with d set to 2, 4, 6, and 8 to get the optimal result, and the best accuracy obtained is 85.67%, as shown in Table 4. Table 4 compares the proposed Dilated Convolutional Neural Network with the standard Convolutional Neural Network method of Raja Subramanian et al. (2021) and the Convolutional Neural Network-SVM method of Yohannes & al Rivan (2022) on the same HAM10000 dataset. In terms of testing accuracy, the proposed method achieves 85.67%, compared with 83.04% for the Convolutional Neural Network and 65.33% for the Convolutional Neural Network-SVM. Table 4 thus shows that the proposed Dilated Convolutional Neural Network classifies the HAM10000 dataset with an average accuracy of more than 80%, better than the CNN and CNN-SVM methods.

D. CONCLUSION AND SUGGESTIONS
Based on the results obtained, it can be concluded that the Dilated Convolutional Neural Network method is capable of classifying the HAM10000 dataset with an average accuracy of more than 80%. The contribution of this study is a development of the Convolutional Neural Network method that modifies the dilation factor. Several experimental scenarios were run with d set to 2, 4, 6, and 8 to get the optimal result, and the best accuracy obtained is 85.67%. This compares with 83.04% for the Convolutional Neural Network method (Raja Subramanian et al., 2021) and 65.33% for the Convolutional Neural Network-SVM method (Yohannes & al Rivan, 2022) on the same HAM10000 dataset. The Dilated Convolutional Neural Network is therefore better than the standard Convolutional Neural Network and the Convolutional Neural Network-Support Vector Machine. However, this method is still unable to overcome data imbalance, which in this case means that the data is dominated by the NV class. Further research is expected to use balanced data or an oversampling technique for the imbalanced data.
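The oversampling suggested above could take the form of simple random oversampling, sketched below. This is one possible technique and was not used in this study; the function name and the toy class counts are hypothetical.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until all classes are equal."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = max(len(items) for items in by_class.values())
    out_samples, out_labels = [], []
    for label, items in by_class.items():
        # Draw extra copies with replacement to reach the majority-class size.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for sample in items + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Toy labels in which NV dominates, mirroring the imbalance described above.
labels = ["NV"] * 6 + ["DF"] * 1 + ["BKL"] * 2
samples = list(range(len(labels)))
balanced_samples, balanced_labels = random_oversample(samples, labels)
counts = Counter(balanced_labels)
```

Oversampling of this kind is normally applied to the training folds only, so that duplicated images do not leak into the testing data.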