Classification of ductile cast iron specimens: a machine learning approach

In this paper an automatic procedure based on a machine learning approach is proposed to classify ductile cast iron specimens according to the American Society for Testing and Materials guidelines. The mechanical properties of a specimen are strongly influenced by the peculiar morphology of its graphite elements, so useful characteristics, the features, are extracted from the specimens' images; these features describe the shape, the distribution and the size of the graphite particles in the specimen, the nodularity and the nodule count. Principal component analysis is used to provide a more efficient representation of these data. Support vector machines are trained to classify the data through sequential binary classification steps. Numerical analysis is performed on a significant number of images, providing robust results even in the presence of dust, scratches and measurement noise.


INTRODUCTION
Discovered in the years 1943-48, ductile cast irons (DCIs) offer a very interesting combination of the peculiarities of cast irons (first of all, castability) and of the mechanical properties of carbon steels (e.g., toughness) [1]. Small additions of elements like Mg or Ce make it possible to modify the shape of the graphite elements from lamellae (extremely dangerous) to spheroids (with a decrease of the stress intensification near the graphite elements): these grades are widely used to produce pressure pipes and fittings, in the automotive industry (e.g., crankshafts), and in road and construction applications. The morphological peculiarities of the graphite elements (e.g., shape, dimension, distribution) are crucial in defining the mechanical properties of DCIs.
Image analysis has been used extensively in the last two decades to automatically characterize specimens in materials science [2][3][4]. The aim is to provide a quantitative characterization of the materials in order to determine mechanical properties and establish relationships with damaging mechanisms [5][6]. Nevertheless, up to now the official guide of the International Standard [7] is applied almost entirely manually, by visual inspection. Only a few attempts have been made to automate the classification of microstructural image data [8].
In this paper the aim is to provide an automatic procedure to classify specimens according to the American Society for Testing and Materials (ASTM) standard with respect to the shape of the graphite elements, the "Type" parameter. As will be recalled in Section II, the shape is the first characteristic to be evaluated in order to determine whether the graphite has a desirable shape or not. If the shape is not nodular, different levels are possible, and it can be determined whether the graphite has a vermicular aspect, whether it contains exploded nodules, and so on. Given the images classified by two experts, useful features are extracted and rearranged by principal component analysis (PCA) [9] in order to enhance the informative content of the data. The classification is performed by a suitably trained support vector machine (SVM) [10], a versatile tool useful to classify signals of different nature [11][12]. The classes identified with respect to the Type in the ASTM 2016 standard are seven; nevertheless, binary classifiers are trained in order to simplify the classification step and guarantee the modularity of the procedure.
The paper is organized as follows: in Section II, after the description of the data, the image analysis and feature extraction are described; then the training and classification procedure by the SVM is outlined. In Section III numerical results are presented and discussed, whereas in Section IV conclusions and future work are presented.

MATERIALS AND METHODS
In this section the procedure for the image acquisition and classification is outlined. Different DCIs have been considered, focusing only on the morphological peculiarities of the graphite elements and not on the metal matrix microstructure. Specimens have been obtained by means of a metallographic preparation according to the following procedure:
- specimen sectioning by abrasive cutting;
- specimen mounting;
- specimen grinding (decreasing grit sizes for abrasive papers up to P1200) and polishing (6 micron diamond followed by 1 micron diamond on low napped polishing cloths);
- observation of the metallographically prepared specimen by means of a light optical microscope (LOM).
Graphite elements characterization is usually performed by means of visual inspection and a qualitative evaluation according to the standards [7,13]. The standardized procedure is based on the visual comparison between the observed images and the charts that are available in the standards. To classify images of ductile cast iron specimens automatically, the idea is to extract features useful to describe the specimens and, once the classes of interest are defined, to train a classifier able to assign each image to a specific class. This implies the identification of a sort of signature of the images, so that a new unknown image can be classified by evaluating its signature. On the basis of the International Standard ASTM [7], the information to be retrieved from the images is:
- the shape, in particular a measure of how nodular it is; the classes with respect to the shape are indicated by Type I-II-III-IV-V-VI-VII;
- the distribution of the graphite in the specimen: it is particularly important in rating flake graphite, and it is described by the letters A-B-C-D-E;
- the size of the graphite particles, whose classes are indicated by 1-2-3-4-5-6-7-8 depending on the actual dimension;
- the nodularity, measured as the percentage of nodular particles present in the microstructure;
- the nodule count, evaluated as the number of nodules per mm² at a magnification of 100x.
In Fig. 1 examples of specimens belonging to Type I, IV and VII (whose mutual differences are more evident) are shown.
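As a small illustration of the last two quantities in the list above, the following Python sketch computes the nodularity and the nodule count from particle counts. The function names and the count-based definition of nodularity are assumptions made here for illustration; the standard may prescribe a different (e.g., area-based) evaluation.

```python
def nodularity(n_nodular, n_total):
    """Percentage of graphite particles that are nodular (count-based; an assumption)."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if not 0 <= n_nodular <= n_total:
        raise ValueError("n_nodular must lie between 0 and n_total")
    return 100.0 * n_nodular / n_total


def nodule_count(n_nodules, field_area_mm2):
    """Number of nodules per square millimetre of the examined field."""
    if field_area_mm2 <= 0:
        raise ValueError("field_area_mm2 must be positive")
    return n_nodules / field_area_mm2


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    print(nodularity(45, 50))     # percentage of nodular particles
    print(nodule_count(60, 0.5))  # nodules per mm^2
```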

It is worth noting that, starting from the classification with respect to the shape (the Type), all the characterizations (distribution, size, nodularity, nodule count) could be further particularized with respect to the other properties, suggesting a sequential procedure for the classification. Therefore, first the Type class to which the specimen belongs is determined, and then the other characterizations are established. An efficient procedure to classify the specimens with respect to the Type is to use binary classifiers in a sequential way:
Step 1: with a binary classifier C1, first establish whether a specimen can be assigned to the Type I class or not.
- If it belongs to the Type I class, one can refine the classification with respect to the other characteristics.
- If the specimen does not belong to Type I, one proceeds to Step 2.
Step 2: with a binary classifier C2, establish whether the specimen (that is not of Type I) may be classified as Type II or as Type III-IV-V-VI-VII. If it belongs to Type II, then again one refines the classification with respect to the other characteristics; otherwise one goes to Step 3, using another binary classifier C3, and so on.
As can be noted, the core of the global procedure is the binary classification step. From now on we will refer to the first step, in which one wants to classify a specimen as belonging to Class 1 (Type I specimens) or Class 2 (Type II-III-IV-V-VI-VII specimens), thus determining the classifier C1. In this way it is possible to distinguish the specimens with normal and well-formed nodules from all the other situations. In Fig. 2 a scheme of the overall classification procedure, simplified by considering only three types, is presented. The classifier distinguishes the specimens on the basis of suitable features that are evaluated from a simplified representation of the image obtained by a segmentation procedure; the features are then transformed by principal component analysis, which provides a more efficient data representation. Finally a classifier is obtained by using the support vector machine. The block diagram of the binary classification step is outlined in Fig. 3.
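The sequential use of binary classifiers described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `classify_type`, the callable interface of the classifiers and the toy threshold rules are assumptions introduced here, standing in for the trained SVM classifiers C1, C2, ... of the paper.

```python
def classify_type(features, classifiers, type_names):
    """Apply binary classifiers C1, C2, ... in sequence and return the Type name.

    classifiers[i](features) is assumed to return True when the specimen
    belongs to type_names[i] and False when it belongs to a later Type.
    """
    # k Types are resolved by k - 1 binary decisions.
    if len(classifiers) != len(type_names) - 1:
        raise ValueError("expected one classifier fewer than the number of types")
    for clf, name in zip(classifiers, type_names):
        if clf(features):
            return name
    # No classifier accepted the specimen: it belongs to the last Type.
    return type_names[-1]


if __name__ == "__main__":
    # Toy stand-ins for trained classifiers: thresholds on one made-up feature.
    c1 = lambda f: f[0] > 0.8
    c2 = lambda f: f[0] > 0.5
    types = ["Type I", "Type II", "Type III"]
    for sample in ([0.9], [0.6], [0.1]):
        print(sample, classify_type(sample, [c1, c2], types))
```

Because each classifier only separates one Type from the remaining ones, adding a further Type requires training one more binary classifier and appending it to the list, which reflects the modularity mentioned in the text.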

Image analysis and features extraction
Given an image, even one of good quality, a segmentation process is required in order to evaluate the properties of each nodule and their spatial distribution. Segmentation with respect to the gray level represents the data with a reduced number of gray levels, thus making it possible to retrieve useful information on the nodules, such as their area, eccentricity or spatial distribution. Different segmentation methods could be applied [14,15]; in this case, with the nodules well defined over the background, the results obtained with different methods are essentially equivalent. Moreover, since the images are of good quality, a binarization is sufficient to enhance the nodules with respect to the background and to determine the properties of interest.
The features to be extracted from the images should be chosen so as to best characterize the data. The indications in the International Standard ASTM 2016 suggest that the useful information for determining the classifier C1 concerns the roundness of the nodules and their area. Therefore the following features are identified:
- [...] are the numbers of nodules with area (in pixels) in the intervals [...], respectively. Nodules with area less than 25 pixels are discarded, since they could be associated with dust or measurement noise; nodules with area greater than 900 pixels are in general not present;
- feature $f_4$, defined as the number of elements with area greater than the minimum one (25 pixels), normalized with respect to the area of the background: it is a measure of the presence of the nodules;
- [...] for each image.
This information is collected in a dataset matrix $D$ in which the $k$-th row contains the $n_f$ features of the specimen $S_k$.
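The area-based features can be illustrated with a minimal pure-Python sketch: the image is binarized, connected components are measured, components smaller than 25 pixels are discarded, and the remaining nodules are counted per area interval and normalized by the background area. This is not the procedure used in the paper (which relies on a discrete level set binarization): the fixed threshold, the 4-connectivity, the assumption that graphite is darker than the matrix, the interval edges and all function names are assumptions made here for illustration.

```python
from collections import deque

MIN_AREA = 25  # pixels; smaller components are treated as dust or noise


def binarize(gray, threshold):
    """Return a 0/1 image: 1 where the pixel is darker than the threshold."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def component_areas(binary):
    """Areas (pixel counts) of the 4-connected foreground components."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    areas = []
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill of one component.
            queue, area = deque([(r, c)]), 0
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                area += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and binary[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            areas.append(area)
    return areas


def area_features(binary, bin_edges):
    """Nodule counts per area interval [lo, hi) and nodule density."""
    areas = [a for a in component_areas(binary) if a >= MIN_AREA]
    counts = [sum(lo <= a < hi for a in areas)
              for lo, hi in zip(bin_edges, bin_edges[1:])]
    background = sum(row.count(0) for row in binary)
    density = len(areas) / background if background else 0.0
    return counts, density


if __name__ == "__main__":
    # Synthetic 10x10 image: one 6x6 dark "nodule" and one dark dust pixel.
    gray = [[255] * 10 for _ in range(10)]
    for r in range(1, 7):
        for c in range(1, 7):
            gray[r][c] = 0
    gray[8][8] = 0
    # Hypothetical interval edges, in pixels.
    print(area_features(binarize(gray, 128), [25, 100, 900]))
```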
The $n_f$ features have been chosen as the characteristics best suited to distinguish specimens of Class 1 from specimens of Class 2; nevertheless, if these features are used directly to train a classifier, they may not represent the data in the best way, or some of them may carry the same information. To address this, principal component analysis, which is briefly recalled here, yields a more efficient data representation [16]. PCA is a linear data transformation aiming at reducing the redundancy in the data covariance matrix and maximizing the information retrieved; in the new reference coordinates the new variables are uncorrelated with one another. One can consider feature selection, when a subset of the original features is retained, or feature extraction, when a new set of features is built by suitably weighting the information of interest. Of course, when the dimensionality of the data is reduced it is mandatory to quantify the loss of information. In the principal component representation
$$Z = D V \qquad (1)$$
$V = [v_1, \dots, v_{n_f}]$ is the matrix constituted by the ordered eigenvectors of the covariance matrix of $D$. Therefore, for example, the first principal component of the first specimen is $D_1 v_1$, $D_1$ being the first row of matrix $D$. Generally the number of principal components $n_p$ is chosen in order to retain a fraction $p$ of the information content, that is:
$$\frac{\sum_{i=1}^{n_p} \lambda_i}{\sum_{i=1}^{n_f} \lambda_i} \ge p \qquad (2)$$
where $\lambda_1 \ge \dots \ge \lambda_{n_f}$ are the eigenvalues of the covariance matrix. This means that from now on, instead of trying to classify the data collected in the matrix $D$ of dimension $n \times n_f$, the data to be considered are the first $n_p$ principal components.
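The criterion for choosing the number of principal components, i.e. retaining the smallest number of leading eigenvalues whose cumulative sum reaches a fraction $p$ of the total, can be sketched as follows. The function name `num_components` and the eigenvalues in the example are hypothetical; they are not the values obtained in the paper.

```python
def num_components(eigenvalues, p):
    """Smallest n_p such that the n_p largest eigenvalues retain a fraction >= p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    ordered = sorted(eigenvalues, reverse=True)  # decreasing order
    total = sum(ordered)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    cumulative = 0.0
    for n_p, lam in enumerate(ordered, start=1):
        cumulative += lam
        if cumulative / total >= p:
            return n_p
    return len(ordered)


if __name__ == "__main__":
    # Made-up covariance eigenvalues, for illustration only.
    lambdas = [4.0, 3.0, 2.0, 0.5, 0.3, 0.2]
    print(num_components(lambdas, 0.85))
```

Sorting inside the function keeps the result independent of the order in which the eigenvalues are supplied.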

Training and classification
PCA reduces the dimensionality of the data while adequately preserving the information; therefore each image $X$ is now described by a new set of features. The aim is to determine a classifier able to assign each set of features (and therefore each image) to Class 1 or to Class 2.
To train a classifier able to separate the available data into two classes, the set of $n$ images is split into two groups, the training set $N_{tr}$ and the test set $N_{test}$. Label 1 is assigned to the data corresponding to images belonging to Class 1, whereas label 0 is assigned to the data belonging to Class 2.
The training set $N_{tr}$ is divided into two groups, $N_{tr}^1$ and $N_{tr}^2$: the first one is used to train the classifier; the second one, $N_{tr}^2$, is used to determine the classification accuracy. The support vector machine determines the optimal hyperplane that splits the data into two groups [17]; it is a tradeoff between the requirement of minimizing the error on misclassified points and maximizing the Euclidean distance between the closest points, see Fig. 4. The optimal hyperplane is obtained as the solution of the quadratic programming problem:
$$\min_{w, b, \xi} \; \frac{1}{2} w^T w + H \sum_{i=1}^{N} \xi_i$$
with the constraint:
$$y_i \left( w^T \phi(x_i) + b \right) \ge 1 - \xi_i, \qquad \xi_i \ge 0$$
where $w$ is the vector perpendicular to the separating hyperplane, $\xi_i$ are the slack variables of the $N$ training points $x_i$, $y_i = \pm 1$ encodes the class of $x_i$, and $H>0$ is a penalty parameter on the error term. To make the elements $x_i$ of the two classes linearly separable, the data are mapped by $\phi$ into a richer space, and the separating hyperplane is determined in that space. A possible choice for the mapping function $\phi$ is the radial basis function and, denoting with $\|\cdot\|_2$ the $L_2$-norm, for the kernel function it is assumed:
$$K(x_i, x_j) = \phi(x_i)^T \phi(x_j) = e^{-\gamma \|x_i - x_j\|_2^2}$$
The two parameters to be evaluated, $H$ and $\gamma$, are determined during the training phase by 10-fold cross validation [18]. The classification is performed by the SVM algorithm LIBSVM 3.18 [19][20]. Once the optimal parameters $(\gamma, H)$ have been determined, the classifier is trained. The same procedure is applied to train the classifier C2, able to assign a specimen (not belonging to the Type I class) to the Type II class or to the Type III-IV-V-VI-VII class, and so on, according to the scheme of Fig. 2.
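Two ingredients of the training step, the radial basis function kernel and the partition of the training data used by a 10-fold cross validation, can be sketched in plain Python. This sketch does not call LIBSVM and is not the authors' implementation; the function names, the symbol `gamma` for the kernel width and the deterministic interleaved fold assignment are assumptions made here for illustration.

```python
import math


def rbf_kernel(x, z, gamma):
    """Radial basis function kernel exp(-gamma * ||x - z||_2^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def k_fold_indices(n, k=10):
    """Split range(n) into k folds whose sizes differ by at most one."""
    if not 1 < k <= n:
        raise ValueError("k must satisfy 1 < k <= n")
    return [list(range(i, n, k)) for i in range(k)]


def cv_splits(n, k=10):
    """Yield (train_indices, validation_indices) for each of the k folds."""
    folds = k_fold_indices(n, k)
    for i, validation in enumerate(folds):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, validation


if __name__ == "__main__":
    print(rbf_kernel([1.0, 0.0], [0.0, 0.0], gamma=1.0))
    # 40 training points, as an example size.
    for train, validation in cv_splits(40, k=10):
        print(len(train), len(validation))
```

In a grid search, each candidate pair of penalty and kernel parameters would be scored by the average validation accuracy over the folds, and the best pair retained for the final training.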

NUMERICAL RESULTS AND DISCUSSION
In this section the results of the classification procedure are described. As can be noted in the International Standard [7], the specimens of Type I, II and III, though they may be similar to each other, differ significantly from the other types. Therefore our attention will be focused on Type I, II and III, even if the overall analysis may be extended to the classification of all the types. The first step is the classification of a specimen as Type I or Type II-III. If the specimen is of Type II-III, a further classification step decides whether the specimen is of Type II or III.
The classification accuracy is calculated as the average of the accuracy evaluated over 20 different random choices of the training and test sets, to ensure that the results do not depend on lucky choices, obtaining a percentage of success over 99%. With this calculation the off-line step is over. The results over the test set (containing images not used in the training phase) yield a percentage of success of 97.3% ± 2.7. The results of the classifier C1 appear satisfactory; moreover, it has also been investigated whether the classifier C1 makes mistakes more often with images of Class 1 (Type I data) or with images of Class 2 (Type II and III data) and, within Class 2, whether more errors are made when testing with images of Type II or III. This separate test on 10 images of each type, repeated 20 times, shows that images of Type III are always correctly classified (percentage of success of 100%), whereas the results on Type I and Type II yield percentages of success of 97.5% ± 5.5 and 94.5% ± 10.5, respectively. A possible explanation is that images of Type III differ from those of Type I more than images of Type II do.
For the images classified by the classifier C1 as belonging to Class 2, the second classifier C2 must be applied in order to discriminate between images of Type II and those of Type III. Also in this case all the results have been repeated for 20 different random choices of the training and test sets. The classifier C2, trained using only images of Type II and Type III, has a classification accuracy of 98.9%. The test yields a percentage of success of almost 100% on a test set of 10 images belonging to the Type II class and of 98.9% ± 3.15 on a test set of 10 images of Type III. The results of the classifier C2 are even more satisfactory than those of classifier C1, since the training has been more specific. The classifier C2 has the aim of determining the class membership of images of Type II and III; when it is applied to an image of Type I, for example because the classifier C1 has provided an erroneous classification, in more than 91% of the cases C2 assigns the Type I specimen to the class of Type II images. This is the preferable outcome, the images of Type II being the most similar to those of Type I.
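The way a percentage of success and its spread over repeated random splits can be computed is sketched below. It is assumed here, for illustration, that the reported spread is the sample standard deviation over the repetitions; the paper does not state this explicitly, and the function names and numbers are hypothetical.

```python
import statistics


def accuracy(predicted, expected):
    """Percentage of labels in `predicted` that match `expected`."""
    if len(predicted) != len(expected) or not expected:
        raise ValueError("label lists must be non-empty and of equal length")
    hits = sum(p == e for p, e in zip(predicted, expected))
    return 100.0 * hits / len(expected)


def summarize(accuracies):
    """Mean and sample standard deviation of accuracies over repeated runs."""
    mean = statistics.mean(accuracies)
    spread = statistics.stdev(accuracies) if len(accuracies) > 1 else 0.0
    return mean, spread


if __name__ == "__main__":
    # Made-up per-repetition accuracies (in %), for illustration only.
    runs = [100.0, 95.0, 100.0, 90.0, 100.0]
    mean, spread = summarize(runs)
    print(f"{mean:.1f}% ± {spread:.1f}")
```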

CONCLUSIONS AND FUTURE WORK
In this paper an automatic procedure to support the classification of the microstructure of graphite in iron castings is proposed. By training binary support vector machine classifiers it is possible to determine efficiently the type of the specimen according to the American Society for Testing and Materials guidelines, and therefore to proceed in the classification by specifying the size, the nodularity and the nodule count. Three classes (Type I, Type II and Type III) are identified by the proposed procedure, but it could be extended to as many classes as needed. The choice of binary classifiers operating sequentially aims at yielding a simple, efficient and modular procedure.

The classifier uses features evaluated on the original specimens' images and subsequently transformed by principal component analysis, which reduces the complexity and yields a more efficient representation of the information. The results appear satisfactory, and future work will be devoted to: classifying the images of the specimens with respect to all the properties (size, nodule count, ...); determining the most suitable features in order to better characterize each nodule present in the specimen; considering different classification schemes, for example by using polling systems, and evaluating their robustness.
The binary classification procedure of Fig. 3 consists of two steps. The first one is off-line and aims at determining the classifier after a proper data processing (image segmentation, feature computation and extraction) and training. The second step is on-line and represents the application of the classifier to images of specimens not used for training.

Figure 3: Block diagram of the classification procedure.
In this paper PCA is used for feature extraction. More precisely, the covariance matrix $C_D$, of size $n_f \times n_f$, of the data matrix $D$ is evaluated and its eigenvalues $\lambda_1, \dots, \lambda_{n_f}$ are sorted in decreasing order. The corresponding unit eigenvectors $v_i$, $i = 1, 2, \dots, n_f$, are the directions of maximum variance of the data, and the new data representation in the principal components $Z$ is obtained by projecting the data onto them.

Figure 4: Representation of the classification problem.
Once the classifier has been trained, the classification accuracy, evaluated on $N_{tr}^2$, is defined as the percentage of correctly classified data with the optimal choice of $(\gamma, H)$, and it is a property of the classifier. With this calculation the off-line phase of the classification procedure is over. The obtained classifier is then tested over the test set $N_{test}$, not used for the training, simulating the situation of unlabeled data. The percentage of misclassified images is the error of the classifier.
A set of 192 images of specimens is considered: 64 are of specimens of Type I, 64 of Type II and 64 of Type III. The images have been previously classified manually by an expert. To obtain the features a binarization procedure is applied; binarization by the discrete level set approach [15] has been chosen, and the ten features described in Section II have been evaluated, thus obtaining three matrices of size $64 \times 10$, collected together in the data matrix $D$ of size $192 \times 10$. To deal with data of comparable magnitude, a normalization is applied. The covariance matrix $C_D$, of size $10 \times 10$, of the data matrix $D$ is evaluated; after evaluating its eigenvalues, by using formula (2) $n_p = 6$ principal components are considered, thus preserving more than 94% of the original information. The training set $N_{tr}$ contains 54 images: 27 of Type I, randomly chosen among the set of 64 Type I data, and 27 of Type II and III, randomly chosen among the set of 128 images of specimens of these types. The test set consists of 20 images, equally distributed between Type I and Type II-III. The set $N_{tr}^1$ contains 40 elements and the remaining 14 are used for the set $N_{tr}^2$. The number of images of specimens of Class 1 (i.e., Type I specimens) and of Class 2 (i.e., Type II and III, equally distributed) is the same in the groups involved in the training and testing steps, to avoid bias in the results. As said, the parameters $(\gamma, H)$ are determined by 10-fold cross validation, which also provides the optimized value for $b$. The SVM implementation used, LIBSVM 3.18, is a simple and efficient open source software.
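The class-balanced random choice of the training and test sets described above (27 + 27 training images and 10 + 10 test images) can be sketched as follows. The function name `balanced_split`, the integer image identifiers and the use of a seed are assumptions made here for illustration; the sketch also does not enforce the equal share of Type II and Type III images within Class 2 mentioned in the text.

```python
import random


def balanced_split(class1, class2, n_train_per_class, n_test_per_class, seed=None):
    """Draw disjoint, class-balanced training and test sets.

    Returns two lists of (item, label) pairs, with label 1 for Class 1
    and label 0 for Class 2.
    """
    rng = random.Random(seed)
    train, test = [], []
    for label, items in ((1, class1), (0, class2)):
        # Sampling without replacement keeps training and test sets disjoint.
        picked = rng.sample(items, n_train_per_class + n_test_per_class)
        train += [(x, label) for x in picked[:n_train_per_class]]
        test += [(x, label) for x in picked[n_train_per_class:]]
    rng.shuffle(train)
    return train, test


if __name__ == "__main__":
    type_1 = list(range(64))           # hypothetical ids of Type I images
    type_2_3 = list(range(100, 228))   # hypothetical ids of Type II and III images
    train, test = balanced_split(type_1, type_2_3, 27, 10, seed=0)
    n_tr1 = train[:40]   # used to train the classifier
    n_tr2 = train[40:]   # used to evaluate the classification accuracy
    print(len(train), len(test), len(n_tr1), len(n_tr2))
```

Repeating the call with different seeds gives the different random choices of training and test sets over which the accuracy is averaged.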