Computer Vision Algorithm for the detection of fracture cracks in Oil Hardening Non-Shrinking (OHNS) die steel after machining process

ABSTRACT. A convolutional neural network (CNN) is a variant of neural network designed for processing images. It takes an image as input and extracts features from it through learnable parameters, which makes it effective for classification, detection, and many other tasks. In the present work, a U-Net convolutional neural network is implemented in Python on the Jupyter platform for fracture surface image segmentation of Oil Hardening Non-Shrinking (OHNS) die steel after the machining process. The results showed that the fracture cracks can be identified with high accuracy on the test images. The plot of accuracy vs. number of epochs showed an accuracy score of 1.0, which means that 100% of points were correctly labeled by the implemented algorithm.


INTRODUCTION
Computer vision is a branch of artificial intelligence (AI) that permits computers and systems to extract useful information from digital photos, videos, and other visual inputs and to execute actions or formulate predictions based on that information [1][2][3][4]. AI makes it possible for computers to think, while computer vision makes it possible for them to see, observe, and comprehend. Humans have an advantage over computers when it comes to how eyesight works. The benefit of human sight is that it has had a lifetime to learn how to distinguish between things, determine their distance from the viewer, whether they are moving, and whether something is off about an image [5][6][7]. Computer vision algorithms are implemented in manufacturing industries for improving both the mechanical and microstructural properties of fabricated components. For the structural investigation of the total pore space, Singh et al. [8] used a contour detection computer vision technique. Because fractures are typically thought of as planar structures, they performed a slice-by-slice study on 3D segmented images of cracked sandstones and carbonates. To distinguish between granular pores and fractures, the contours of the pores (both granular pores and fractures) and their composition and structure were taken into consideration. The exploratory research revealed that, for both sandstones and carbonates, two principal components are the ideal number needed for segregation. Zhou et al. [9] proposed a deep learning strategy based on convolutional neural networks (CNNs), namely the third version of the You Only Look Once detector (YOLOv3). Few images of shattered bolts are available in practice, which presents a problem for training a YOLOv3 detector.
Brightness transformation, Gaussian blur, flipping, perspective transformation, and scaling are the five data augmentation techniques developed to overcome this issue and provide additional labeled images. Six YOLOv3 networks were trained on six differently augmented training sets, and each network's performance was then assessed on the same testing set to determine the effectiveness of the various augmentation techniques. By analyzing inline measured process data, Hartl et al. [10] used CNNs to find voids inside friction stir welds. The goal was to test whether interior weld faults, rather than only surface defects, could also be identified using CNNs. For this purpose, 120 welds were produced and inspected by ultrasonic testing, and that data was used to label each weld as "good" or "defective." Different artificial neural network models were examined for their ability to predict the class of each weld. The method used to label the data was found to be important for the level of precision that could be attained. These artificial intelligence based algorithms can further be applied to various machining processes to determine the mechanical and microstructural properties of the fabricated components [11][12][13]. It has been demonstrated that detecting the energy release during fatigue tests of common engineering materials provides pertinent information on fatigue properties, cutting down on testing time and material usage. A static tensile test allows two separate phases to be evaluated. In the first phase (Phase I), all crystals are elastically strained, and the temperature trend is linear and follows the thermoelastic law. In the second phase (Phase II), certain crystals start to deform plastically, and the temperature trend becomes non-linear. The macroscopic transition stress between Phase I and Phase II could be related to the "limit stress" that, if repeatedly applied, would cause material failure.
Milone et al. [15] developed a universal methodology that uses neural networks to evaluate the variation in the temperature trend in order to estimate the limit stress. Buccino et al. [16] combined convolutional neural networks with a description of the micro-crack propagation mechanism. For the first time, a substantial collection of human synchrotron data from osteoporotic and healthy femoral heads tested under micro-compression served as the foundation for the artificial intelligence technique. In the present work, a U-Net convolutional neural network is implemented in Python on the Jupyter platform for fracture surface image segmentation of Oil Hardening Non-Shrinking (OHNS) die steel after the machining process. The results showed that the fracture cracks can be identified with high accuracy on the test images.

Obtaining Microstructures
A scanning electron microscope (SEM) was used to study the surface morphology and microstructures. Samples were etched and polished according to the conventional metallographic procedure before SEM images were taken. In EDM or PMEDM, material removal is governed by heat concentration. The abrupt temperature increase may cause some material to melt and be removed. Heat-induced metallographic changes in some of the nearby material lead to the creation of discharge pits and fracture cracks of various sizes. The flushing action of the dielectric fails to completely remove some of the molten material, and a small amount of this melted surface resolidifies as a result of the dielectric's cooling effect. The density and thickness of the numerous pockmarks, globules, and microcracks, which are all part of the resolidified layer known as the white layer, depend on the process parameters. Other layers may be seen beneath the white layer, and the number of layers varies from sample to sample. The images captured the surface morphology of OHNS die steel after powder mixed electrical discharge machining (PMEDM). In the present work, a total of fifty microstructure images were collected for training and testing.

U-Net architecture
U-Net was first developed for biomedical image segmentation. Its design can be described in general terms as an encoder network followed by a decoder network. In contrast to classification, where the deep network's final output is the only thing that matters, semantic segmentation requires not only pixel-level discrimination but also a mechanism for projecting the discriminative features learned at different stages of the encoder onto the pixel space. The first half of the architecture diagram is the encoder, as shown in Fig. 1. It is typically a pre-trained classification network such as VGG or ResNet, in which convolution blocks followed by max-pool downsampling encode the input image into feature representations at several levels. The second half of the architecture is the decoder. Its objective is to obtain a dense classification by semantically projecting the discriminative features (lower resolution) learned by the encoder onto the pixel space (higher resolution). The decoder consists of upsampling, concatenation, and standard convolution operations.
Figure 1: Representation of U-Net Architecture. White boxes denote copied feature maps, whereas blue boxes depict multi-channel feature maps. Different colored arrows denote various operations [14].
Fig. 2 shows the framework implemented in the present study. The collected microstructure images are stored in two folders, a training folder and a testing folder. The training folder has two sub-folders: one contains the original microstructure images and the other contains the corresponding mask of each microstructure. The training folder consisted of 40 microstructure images with their respective masks, while the testing folder consisted of 10 microstructure images. Training and testing were performed by pointing the code to the location of these folders on the local system. The masks were created using the Canny edge detector.
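The folder layout described above can be sketched in Python as follows. This is a minimal illustration and not the authors' code: the sub-folder names `images` and `masks`, the `.png` extension, the rule that a mask shares its image's file name, and the helper `pair_images_with_masks` are all assumptions made for the example. Only the two sub-folders under the training folder and the one-mask-per-image correspondence come from the text.

```python
from pathlib import Path


def pair_images_with_masks(train_dir, pattern="*.png"):
    """Return sorted (image_path, mask_path) pairs from a training folder.

    Assumes (hypothetically) that train_dir contains 'images' and 'masks'
    sub-folders and that each mask has the same file name as its image.
    """
    train_dir = Path(train_dir)
    pairs = []
    for image_path in sorted((train_dir / "images").glob(pattern)):
        mask_path = train_dir / "masks" / image_path.name
        if not mask_path.exists():
            # Fail early: an image without a mask cannot be used for training.
            raise FileNotFoundError(f"no mask found for {image_path.name}")
        pairs.append((image_path, mask_path))
    return pairs
```

Under these assumptions, calling the helper on a training folder holding the 40 images and 40 masks mentioned above would return 40 pairs, and a missing mask is reported before any training starts.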
Canny edge detection drastically reduces the amount of data that needs to be processed while still extracting meaningful structural information from the objects in an image, and it is widely used in computer vision systems. The image is first smoothed by linear filtering with a Gaussian kernel, and the edge strength and direction are then computed for each pixel of the noise-smoothed image. Candidate edge pixels are the pixels that survive a thinning process known as non-maximal suppression: each pixel has its edge strength set to zero if it is not greater than the edge strengths of its two neighbors along the gradient direction. The thinned edge magnitude image is then thresholded using hysteresis, which applies two edge strength thresholds. All candidate edge pixels below the low threshold are classified as non-edges, all pixels above the high threshold are classified as edges, and pixels above the low threshold are kept as edges only if they can be connected to a pixel above the high threshold by a chain of edge pixels. The user must supply three parameters to the Canny edge detector. The first is sigma, the standard deviation of the Gaussian filter in pixels. The second, low, is the low threshold, defined as a fraction of the computed high threshold. The third, high, is the high threshold used in the hysteresis, supplied as a percentage point in the distribution of gradient magnitude values of the candidate edge pixels. The Laplacian operator computes the second derivatives of an image. To obtain the gradient, we must first compute the two first derivatives, also known as Sobel derivatives, each of which captures intensity variation in one direction: one horizontal and the other vertical. The horizontal Sobel derivative (Sobel x) is obtained by convolving the image with a matrix called a kernel, which is always of odd size.
The simplest case is a kernel of size 3. The vertical Sobel derivative (Sobel y) is obtained in the same way, by convolving the image with the corresponding kernel for the vertical direction. The gradient strength and direction at each pixel are calculated using Eqns. 1 and 2:
$$G = \sqrt{G_x^2 + G_y^2} \tag{1}$$
$$\theta = \arctan\left(\frac{G_y}{G_x}\right) \tag{2}$$
where $G_x$ and $G_y$ are the outputs of the pair of convolution masks in the x and y directions.
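The gradient computation of Eqns. 1 and 2 and the hysteresis thresholding described earlier can be sketched in plain Python as follows. This is an illustrative sketch only, not the routine used to create the masks: it omits Gaussian smoothing and non-maximal suppression, so it is not a complete Canny detector. The function names, the tiny test grids, and the threshold values are all hypothetical.

```python
import math

# 3x3 Sobel kernels for the horizontal (x) and vertical (y) derivatives.
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1],
           [0, 0, 0],
           [1, 2, 1]]


def apply_kernel(image, kernel, row, col):
    """Weighted sum of the 3x3 neighbourhood centred on (row, col).

    The kernel is applied without flipping (cross-correlation), which is the
    usual convention in image-processing libraries.
    """
    return sum(
        kernel[i][j] * image[row + i - 1][col + j - 1]
        for i in range(3)
        for j in range(3)
    )


def sobel_gradient(image):
    """Return (strength, direction) grids; border pixels are left at zero.

    Direction is in radians and uses atan2 so that G_x = 0 is handled safely.
    """
    rows, cols = len(image), len(image[0])
    strength = [[0.0] * cols for _ in range(rows)]
    direction = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = apply_kernel(image, SOBEL_X, r, c)
            gy = apply_kernel(image, SOBEL_Y, r, c)
            strength[r][c] = math.hypot(gx, gy)      # Eqn. 1
            direction[r][c] = math.atan2(gy, gx)     # Eqn. 2
    return strength, direction


def hysteresis(strength, low, high):
    """Keep pixels >= high, plus pixels >= low connected to them (8-neighbours)."""
    rows, cols = len(strength), len(strength[0])
    edges = [[False] * cols for _ in range(rows)]
    stack = [(r, c) for r in range(rows) for c in range(cols)
             if strength[r][c] >= high]
    for r, c in stack:
        edges[r][c] = True
    while stack:
        r, c = stack.pop()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and not edges[nr][nc] and strength[nr][nc] >= low):
                    edges[nr][nc] = True
                    stack.append((nr, nc))
    return edges


# A 4x4 image with a vertical step edge: dark on the left, bright on the right.
step_edge = [[0, 0, 10, 10] for _ in range(4)]
strength, direction = sobel_gradient(step_edge)

# Hypothetical edge strengths: the 5 next to the 9 is kept, the isolated 5 is not.
demo_strength = [[0, 5, 0, 0],
                 [0, 9, 0, 5]]
edges = hysteresis(demo_strength, low=4, high=8)
```

In the step-edge example the gradient points along x, so the direction at the interior pixels is zero. In the hysteresis example a weak pixel survives only because it touches a strong one, which is the chain-of-edge-pixels rule described in the text.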
In the present work, the U-Net algorithm is used to identify fracture cracks in the microstructure images. The U-Net approach has several benefits for segmentation tasks. First, it uses localization and context simultaneously. Second, it performs well on segmentation tasks even with a small number of training examples. The network has a U-shaped architecture because it has both a contracting path and an expansive path. The contracting path is a standard convolutional network that repeatedly applies convolutions, each followed by a rectified linear unit (ReLU), and max pooling operations. The performance of the framework was evaluated using the loss function and the accuracy. The loss function is one of the most crucial components of a neural network; together with the optimizer, it is directly responsible for fitting the model to the provided training data. A loss function measures how well the neural network models the training data by comparing the target and predicted output values. During training, we try to reduce this difference between the predicted and target outputs. With the hyperparameters set, training finds the weights, w^T, and biases, b, that minimize the value of J, the average loss, as indicated in Eqn. 3.
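As a small illustration of the average loss J described above, the sketch below computes the mean of a per-example loss over a handful of pixels. The choice of binary cross-entropy, the function names, and the sample values are assumptions made for the example. The text does not state which per-example loss enters Eqn. 3.

```python
import math


def binary_cross_entropy(target, predicted, eps=1e-12):
    """Loss for one pixel: target is 0 or 1, predicted is a probability."""
    p = min(max(predicted, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1 - target) * math.log(1.0 - p))


def average_loss(targets, predictions):
    """J = (1/m) * sum of the per-example losses over m examples."""
    if len(targets) != len(predictions):
        raise ValueError("targets and predictions must have the same length")
    losses = [binary_cross_entropy(t, p) for t, p in zip(targets, predictions)]
    return sum(losses) / len(losses)


# Hypothetical mask values (1 = crack pixel) and two sets of predictions.
targets = [1, 0, 0, 1]
poor_fit = [0.6, 0.4, 0.5, 0.5]
good_fit = [0.9, 0.1, 0.1, 0.9]

loss_poor = average_loss(targets, poor_fit)
loss_good = average_loss(targets, good_fit)
```

Predictions that sit closer to the target values give a smaller J, which is the quantity that training drives down by adjusting the weights and biases.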

Microstructure analysis
This section discusses the surface analysis of OHNS die steel that has been machined using a copper electrode and tungsten powder, with electrical parameters such as a 5 A gap current, an 8-s pulse on time, and a 9-s pulse off time. The micrograph of the OHNS die steel material before machining is shown in Fig. 3. The cementite phase is visible as tiny white dots in the gray matrix of tempered martensite. Figs. 4 and 5 depict micrographs of the OHNS die steel material after EDM, taken at 100X and 1000X magnifications, respectively. Fig. 3 depicts the surface of OHNS die steel that has been machined by EDM using tungsten powder suspended in the dielectric. The topography shows several spherical droplets left on the machined surface, along with some discrete craters and volcano-like features, which suggests that the material was removed by melting and evaporation. In regions of extremely high temperature, the upper material vaporizes while the material below it melts. The original material and the recast layer, which is a brighter white, are clearly distinguishable regions. A lower pulse current produces a smaller thermal gradient, which results in a thinner recast layer. Because a steeper thermal gradient sets up at higher pulse currents, potentially producing a thermal effect beneath the melting zone, the recast layer appears thicker as the pulse current rises. Due to this phenomenon, the molten layer that adheres to the machined surface and is not washed out by the dielectric fluid is removed to a higher extent. The electrode material, the type of dielectric, and the flushing conditions all affect the thickness of the recast layer. EDM erodes the surface randomly, and the surface finish is poor due to more frequent breakdown of the dielectric fluid and expulsion of metal. The temperature gradient across the surface causes the microcracks on the resolidified features seen in Fig. 5.
Fig. 5 shows the microstructures of OHNS die steel, including compound formation, pock mark formation, regions with and without white layer deposition, and globule formation. The EDX result for the chemical composition of the OHNS die steel material is given in Tab. 1, and the EDX spectrum of OHNS die steel is displayed in Fig. 6. The percentage of carbon has increased from 0.94% to 8.87%, chromium from 0.35% to 0.86%, tungsten from 0.025% to 1.891%, and vanadium from 0.13% to 0.95%. The increase in the percentage of carbon, tungsten, and silicon confirms the improvement in the hardness of the machined surface. The increase in tungsten in the OHNS die steel indicates possible deposition of tungsten in free form. Similarly, the carbon in the work material can be present in free form, as carbides, or as alloyed cementite. This can be understood by examining the XRD pattern. XRD peaks were obtained by scanning the workpiece at a scan speed of 5°/min over a 2θ range from 5° to 100°. Fig. 7 shows the XRD pattern of the OHNS die steel material. The peaks at 27° and 35° show the presence of tungsten carbide (WC) and iron carbide (Fe3C), respectively. This confirms the transfer of tungsten powder suspended in the dielectric, and the migration of carbon from the dielectric, onto the surface of the die material.
From the above analysis it is confirmed that the increase in micro-hardness is due to the presence of tungsten carbide and iron carbide. Similarly, the distribution of micro-cavities with shallow crater depth confirms the moderate value of surface roughness observed in the micrographs. Other types of carbides observed on OHNS die steel materials include Cr4C, Cr7C3, (FeCr)3C, and Fe5C2.

Identification of fracture cracks
Fig. 8 shows samples of the reference training microstructure images and Fig. 9 shows the masks of the corresponding microstructure images. Image masking is the simplest technique to show or hide a particular section of an image. It lets editors extract the desired regions and separate them from the background, and it also makes it possible to cut objects out of the background. With layer masking, parts of an image can be hidden or made visible, so the method can be used to remove an image's background. Layer masking is helpful for product photographs, for example, because the image can be used more freely once the background is removed. In the present work, the masks are created using the Canny edge detector. Canny edge detection computes the edge strength and direction for each pixel of the image after smoothing it by linear filtering with a Gaussian kernel. Candidate edge pixels are those that survive a thinning process known as non-maximal suppression, in which each pixel has its edge strength set to zero if it is not greater than the edge strengths of its two neighbors along the gradient direction. The thinned edge magnitude image is then thresholded using hysteresis, which applies two edge strength thresholds.
All candidate edge pixels below the low threshold are classified as non-edges, and pixels above the low threshold are classified as edges if they can be connected to a pixel above the high threshold by a chain of edge pixels. The major goal of a CNN is to learn the feature mapping of an image and use it to build more refined feature mappings. This works well for classification problems because the image is converted into a vector that can then be used for classification. In image segmentation, however, we must not only convert the feature map into a vector but also reconstruct an image from this vector. This is a huge undertaking because it is much more difficult to turn a vector into an image than the opposite. This issue is at the center of U-Net's entire design. The contraction, bottleneck, and expansion sections of the U-Net architecture are depicted in Fig. 10. The contraction section is made up of several contraction blocks. Each block takes an input, applies two 3×3 convolution layers, and then applies a 2×2 max pooling. The number of kernels or feature maps doubles after each block, allowing the architecture to learn complex structures effectively. The bottom layer mediates between the contraction section and the expansion section. It uses two 3×3 convolution layers followed by a 2×2 up-convolution layer. The core of this architecture, however, lies in the expansion section. Like the contraction section, it consists of several expansion blocks. Each block passes its input through two 3×3 convolution layers followed by a 2×2 upsampling layer. To maintain symmetry, the number of feature maps used by the convolutional layers is halved after each block. In addition, the feature maps from the corresponding contraction block are appended to the input of each expansion block. This ensures that the features learned while contracting the image are used to reconstruct it. The number of expansion blocks is equal to the number of contraction blocks.
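The doubling and halving of feature maps described above can be traced with a few lines of Python. This is a bookkeeping sketch under stated assumptions, not the network used in this work: the input size of 256 pixels, 64 initial feature maps, a depth of 4, and padded convolutions that leave the spatial size unchanged are all hypothetical values chosen for illustration. The function name `unet_shapes` is likewise invented for the example.

```python
def unet_shapes(input_size=256, base_filters=64, depth=4):
    """Trace (stage, spatial_size, feature_maps) through a U-Net-like network.

    Assumes padded 3x3 convolutions (spatial size unchanged), 2x2 max pooling
    that halves the size, and 2x2 up-convolutions that double it.
    """
    if input_size % (2 ** depth) != 0:
        raise ValueError("input_size must be divisible by 2**depth")
    shapes = []
    size, filters = input_size, base_filters
    for level in range(depth):                 # contraction blocks
        shapes.append((f"down{level + 1}", size, filters))
        size //= 2                             # 2x2 max pooling
        filters *= 2                           # feature maps double
    shapes.append(("bottleneck", size, filters))
    for level in range(depth, 0, -1):          # expansion blocks
        size *= 2                              # 2x2 up-convolution
        filters //= 2                          # feature maps halve
        shapes.append((f"up{level}", size, filters))
    return shapes


for stage, size, filters in unet_shapes():
    print(f"{stage:>10}: {size}x{size} pixels, {filters} feature maps")
```

Each expansion stage ends with the same spatial size and feature-map count as the contraction stage of the same level, which is what allows the feature maps from the contraction path to be joined to the expansion path.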
Following that, the resulting mapping passes through a final 3×3 convolution layer with as many feature maps as the desired number of segments. U-Net employs a rather unique loss weighting scheme in which each pixel receives a weight that is larger near the borders of segmented objects. This loss weighting scheme helped the U-Net model segment cells in biomedical images in a discontinuous fashion, so that individual cells can be easily identified within the binary segmentation map. A pixel-wise softmax is first applied to the output image, followed by a cross-entropy loss function. Each pixel is therefore assigned to one class. The idea is that, even in segmentation, every pixel must belong to some class, and we only need to make sure that it does. The segmentation problem is thus converted into a multiclass classification problem, and this performed very well compared with conventional loss functions. Fig. 11 a) shows the plot of the loss function against the number of epochs. The loss decreases as the number of epochs increases. Fig. 11 b) shows the accuracy of the prediction of fracture cracks present in the microstructure images.
The U-Net architecture reaches an accuracy score of 1.0, which indicates that it is highly effective for the characterization of fracture surfaces.
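The pixel-wise softmax, cross-entropy loss, and accuracy score discussed above can be illustrated with a small dependency-free sketch. It is a hedged example rather than the evaluation code behind Fig. 11: the two-class setup (0 = background, 1 = crack), the 2×2 score map, and the function names are assumptions, and the border-weighting scheme mentioned in the text is left out for brevity.

```python
import math


def softmax(scores):
    """Convert a list of class scores into probabilities that sum to 1."""
    shifted = [s - max(scores) for s in scores]   # for numerical stability
    exps = [math.exp(s) for s in shifted]
    total = sum(exps)
    return [e / total for e in exps]


def pixelwise_loss_and_accuracy(score_map, mask):
    """score_map[r][c] is a list of class scores; mask[r][c] is the true class.

    Returns (mean cross-entropy, fraction of correctly labelled pixels).
    """
    total_loss, correct, count = 0.0, 0, 0
    for score_row, mask_row in zip(score_map, mask):
        for scores, true_class in zip(score_row, mask_row):
            probs = softmax(scores)
            total_loss += -math.log(probs[true_class])
            predicted_class = probs.index(max(probs))
            correct += int(predicted_class == true_class)
            count += 1
    return total_loss / count, correct / count


# Hypothetical 2x2 output with two classes: 0 = background, 1 = crack.
score_map = [[[2.0, 0.0], [0.0, 3.0]],
             [[1.0, 0.0], [0.5, 0.0]]]
mask = [[0, 1],
        [0, 1]]
loss, accuracy = pixelwise_loss_and_accuracy(score_map, mask)
```

In this toy case three of the four pixels are labeled correctly, so the accuracy is 0.75. An accuracy of 1.0, as reported above, corresponds to the predicted class of every pixel matching its mask value.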
Figure 11: a) Plot of loss function with number of epochs b) Accuracy evaluation

CONCLUSION
In order to elicit the appropriate responses and help humans with a variety of production-related tasks, computer vision in manufacturing concentrates on creating artificial systems that can capture, process, and recognize visual inputs from the physical world (primarily factories and other industrial sites). The simplest forms of computer vision, used in manufacturing as well as other industries, identify particular objects and prompt a response using a rule-based principle. Specifically, they identify key characteristics in the captured visual elements and determine whether they match a set of predetermined parameters. This method is less effective at handling the finer distinctions and variances that frequently appear in unstructured sources of information such as images, and it is prone to misclassification. The following conclusions are drawn from the current study:
• The abrupt temperature increase may cause some material to melt and be removed. Heat-induced metallographic changes in some of the nearby material lead to the creation of discharge pits of various sizes. The flushing action of the dielectric fails to completely remove some of the molten material, and a small amount of this melted surface resolidifies as a result of the dielectric's cooling effect. The density and thickness of the numerous pockmarks, globules, and microcracks, which are all part of the resolidified layer known as the white layer, depend on the process parameters.
• The goal of image segmentation is to separate an image into a number of smaller segments, which simplify the subsequent computations on the objects in the image. Masks are another crucial prerequisite for image segmentation tasks. A mask is essentially a binary image made up of zero and nonzero values, and it provides the desired output for the segmentation task. With the images and their corresponding masks, the key elements of the image identified during segmentation can be described and used for a variety of future tasks.
• Future work can implement the proposed framework with embedded machine learning, so that a human operator can identify the presence of fracture cracks in real time.