YANG Qirui, XU Kaizhou, ZHENG Xiaohu, XIAO Lei (肖雷), BAO Jinsong
1 College of Mechanical Engineering, Donghua University, Shanghai 201600, China
2 Shanghai Space Propulsion Technology Research Institute, Shanghai 201100, China
Abstract: The health condition of the milling tool has a strong influence on the machining quality of titanium components. It is therefore important to recognize the health condition of the tool and replace a damaged cutter at the right time. To this end, a method based on long short term memory (LSTM) is proposed in this paper to recognize the tool health state. The signals collected in the tool wear experiments were analyzed by time-domain statistics, and the extracted features were then reduced by principal component analysis (PCA). The data preprocessed by PCA are passed to the LSTM model for recognition. Compared with the back propagation neural network (BPNN) and the support vector machine (SVM), the proposed method can effectively exploit the temporal regularities in the data to achieve higher recognition speed and accuracy.
Key words: health condition recognition; milling tool; principal component analysis (PCA); long short term memory (LSTM)
With the rapidly growing demand for titanium alloy components in the aerospace industry, high-speed machining technology has become more widely used. However, due to the high strength-to-weight ratio and high toughness of titanium alloys, real-time monitoring of the tool health condition is important for high-speed machining. If the tool health condition cannot be recognized correctly, machining efficiency and machining quality may suffer. Therefore, it is important to recognize the tool health condition.
The traditional method of monitoring the tool health condition is based on off-line measurement with an optical microscope. Although the traditional method gives an intuitive recognition result, it does not help to increase productivity. At the same time, because high productivity is required, more consideration should be given to real-time monitoring. To satisfy this demand, principal component analysis (PCA) is used to analyze the various processing state data and a long short term memory (LSTM) network is used to recognize the health condition. The popular methods for recognizing the tool health condition are data-driven, mainly neural network-based methods. Chen et al.[1] introduced information such as machine vibration signals and machine operating conditions into a neural network model based on logistic regression to evaluate the reliability of CNC machine tools. Wu et al.[2] developed a new optimization method based on the traditional deep neural network, in which data from adjacent time nodes are passed to the model for evaluation. Gebraeel and Lawley[3] introduced features extracted from the bearing vibration signal into a neural network to predict the life distribution of the bearing. The data-driven method has become the mainstream of tool monitoring.
In recent research, it has become common practice to introduce multiple state features into neural network models. Compared with early neural network models that use a single state feature as input, such models can significantly improve accuracy and robustness. The input features can be extracted from the vibration signal of the machine. However, traditional deep neural network models often ignore the temporal patterns hidden in the feature set. Although Wu et al.[2] passed the features of adjacent time nodes into the deep network, the traditional deep neural network has two significant problems. (1) The number of model weights grows with the number of time nodes of the incoming features, and the number of weight values w increases geometrically. (2) The model does not fully learn the temporal patterns of the collected data set. To solve these problems, which may arise in the determination of tool health, this paper proposes a method based on long short term memory (LSTM). LSTM has been shown to be very effective for training and testing on time series data. Yang and Kim[4] showed that LSTM achieves better results in other areas of machinery. Qin et al.[5] found that LSTM performs better in sensor fault diagnosis of autonomous underwater vehicles. In the application of state monitoring and fault diagnosis of energy systems in particular, Lei et al.[6] attained higher test accuracy by using LSTM. Yuan et al.[7] used the LSTM framework to build a model for hybrid fault diagnosis. Zhao et al.[8] adopted a combined framework of convolutional neural networks (CNN) and bi-directional long short term memory (BLSTM) to diagnose the tool wear state from the perspective of machine vision. Zhang et al.[9] successfully applied the LSTM method to life prediction of lithium batteries. Liu et al.[10] used the LSTM method for remaining useful life estimation of proton exchange membrane fuel cells.
In this paper, the LSTM network is used as a model to judge the health status of milling tools. Compared with the traditional deep learning network, it can mine the temporal patterns of the incoming data to obtain higher recognition accuracy. The feature data passed to the model in the experiment are the time-domain features obtained by mean square error and root variance processing of the cutting force in the three axial directions of the main shaft.
The rest of this article is organized as follows. Section 1 introduces the data collected in the experiments and the method for recognizing the health condition of the tool by using PCA and the LSTM network. Section 2 introduces the optimal selection of model parameters and the comparison between the proposed model and the traditional methods. The conclusion is given in Section 3.
The recognition model structure is mainly divided into two parts: the principal component analysis part and the LSTM recognition model part. The overall process is shown in Fig. 1. The 20 time-domain features will be introduced in the data collection section.
Fig. 1 Model process
The first part is the preprocessing part, which uses PCA to reduce the amount of input data and the size of the model, and thereby improve the training speed. When applying PCA, a total of 20 time-domain features of the three axes are transformed orthogonally into eight linearly uncorrelated components.
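The reduction from 20 features to eight components can be sketched as follows. This is a minimal illustration that assumes PCA is computed by singular value decomposition of the mean-centered feature matrix; the function name `pca_reduce` and the random stand-in data are hypothetical and are not taken from the experiments.

```python
import numpy as np

def pca_reduce(X, n_components=8):
    """Project the rows of X onto the leading principal components.

    X has shape (n_samples, n_features), e.g. (500, 20) in this paper.
    Returns (scores, components, explained_variance_ratio).
    """
    X = np.asarray(X, dtype=float)
    X_centered = X - X.mean(axis=0)              # PCA works on zero-mean columns
    # Rows of Vt are the orthonormal principal directions
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]               # (n_components, n_features)
    scores = X_centered @ components.T           # (n_samples, n_components)
    explained = (S ** 2) / np.sum(S ** 2)        # share of variance per component
    return scores, components, explained[:n_components]

# Synthetic stand-in for the 500 x 20 feature matrix (assumption, not real data)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(500, 20))
Z, W, ratio = pca_reduce(X_demo, n_components=8)
print(Z.shape)
```

The columns of `Z` are mutually uncorrelated, which is the property the text refers to as linear independence of the reduced parameters.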
The second part is the recognition part, which adopts the LSTM model. The processed data are passed to the model for recognition. Compared with the recurrent neural network (RNN), the LSTM cell adds an input gate, a forget gate, and an output gate. The output of the previous cell $h_{t-1}$, the input of the current cell $X_t$, and the previous cell state $C_{t-1}$ are passed into the current cell. The forget gate reads $h_{t-1}$ and $X_t$ to determine how much of the previous cell state is retained. The input gate updates the cell state in the current cell. The output gate is used to generate the output of the current cell $h_t$.
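The gate mechanism described above can be sketched as one forward step of a standard LSTM cell. This is only an illustration of the textbook formulation, not the trained model of this paper: the weights are random, and the input size 8 and hidden size 50 are taken from the PCA output and the hyper-parameters reported later, while all names (`lstm_step`, `params`) are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, params):
    """One forward step of a standard LSTM cell."""
    concat = np.concatenate([h_prev, x_t])                   # [h_{t-1}, X_t]
    f = sigmoid(params["Wf"] @ concat + params["bf"])        # forget gate
    i = sigmoid(params["Wi"] @ concat + params["bi"])        # input gate
    c_hat = np.tanh(params["Wc"] @ concat + params["bc"])    # candidate state
    c_t = f * c_prev + i * c_hat                             # new cell state C_t
    o = sigmoid(params["Wo"] @ concat + params["bo"])        # output gate
    h_t = o * np.tanh(c_t)                                   # new output h_t
    return h_t, c_t

n_in, n_hidden = 8, 50          # 8 PCA components, hidden size 50
rng = np.random.default_rng(0)
params = {}
for gate in ("f", "i", "c", "o"):
    # Random weights stand in for trained ones (assumption)
    params["W" + gate] = rng.normal(scale=0.1, size=(n_hidden, n_hidden + n_in))
    params["b" + gate] = np.zeros(n_hidden)

h, c = np.zeros(n_hidden), np.zeros(n_hidden)
sequence = rng.normal(size=(5, n_in))   # five consecutive feature vectors
for x_t in sequence:
    h, c = lstm_step(x_t, h, c, params)
print(h.shape, c.shape)
```

Because the cell state is carried from step to step, the output for the last sample depends on the earlier samples in the sequence, which is what lets the model use the temporal order of the data.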
In this study, high-speed milling experiments on titanium alloy were conducted, and two data sets were collected for the model. A carbide milling cutter with a TiAlN heat-resistant coating was used in the experiments. Its diameter is 10 mm, it has two cutting edges, and its grain size is 0.6 μm. The experiments were carried out on a DMC635V vertical machining center equipped with the Siemens 840D CNC system. The force sensor is a YDXM-M97 three-way piezoelectric quartz dynamometer, and the sampling frequency is set to 50 kHz per channel. The experimental platform is shown in Fig. 2. During data acquisition, the tool is placed under the microscope to measure the amount of flank wear. The microscope-based wear measurement is shown in Fig. 3.
Fig. 2 DMC635V vertical machining center equipped with the Siemens 840D data system
Fig. 3 Wear measurement based on microscope
The purpose of the tool wear experiment is to study the relationship between tool wear and the state parameters acquired by the sensors, and to provide a data basis for accurate recognition of the tool health condition. The mean value VB of the wear in the middle part of the flank wear zone is used as an indicator of the health of the tool. VB is defined as the degree of flank wear. Because the milling tool has multiple cutting edges, the data set uses the maximum value of VB over the cutting edges as the true value of milling wear. A schematic diagram of VB is shown in Fig. 4. The tool wear condition is divided according to the tool wear value VB, as shown in Table 1.
Fig. 4 Schematic diagram of VB
Table 1 Tool wear condition by wear value VB
Category    Tool wear value/mm    Wear level
1           0-0.1                 Light wear
2           0.1-0.5               Moderate wear
3           >0.5                  Heavy wear
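The labeling rule in Table 1 can be written as a small function. The sketch below is an illustration only: the table does not say which category the boundary value 0.1 mm belongs to, so assigning it to light wear is an assumption, and the function names are hypothetical.

```python
def wear_category(vb_mm):
    """Map a flank wear value VB (in mm) to (category, wear level)."""
    if vb_mm < 0:
        raise ValueError("VB must be non-negative")
    if vb_mm <= 0.1:            # boundary 0.1 assigned to light wear (assumption)
        return 1, "Light wear"
    if vb_mm <= 0.5:            # heavy wear is strictly greater than 0.5
        return 2, "Moderate wear"
    return 3, "Heavy wear"

def tool_wear_label(vb_per_edge):
    """Label a tool by the maximum VB over its cutting edges, as in the text."""
    return wear_category(max(vb_per_edge))

# Hypothetical measurements for a two-edge cutter
print(tool_wear_label([0.08, 0.12]))   # (2, 'Moderate wear')
```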
The data are collected under wear conditions with the spindle speed of 10 000 r/min, the feed rate of 200 mm/min and the cutting depth of 0.2 mm.
The data are acquired simultaneously from multiple sensors that collect machine state parameters. The data set is mainly divided into the main cutting force signal, spindle load rate data, spindle power data, and spindle current data. However, because of the large volume of the collected signals, they cannot be used directly for the recognition of tool wear. It is necessary to use signal processing methods to generate characteristic quantities of the signals that reflect the tool wear.
We also considered frequency-domain features for model recognition. However, due to the low sampling frequency of the machine tool, no usable frequency-domain information can be obtained from the sampled machine tool signals. Therefore, the final recognition features are based on time-domain features.
Time-domain feature extraction is a commonly used approach in signal processing. It reflects the change of the signal with time. Time-domain analysis is mainly divided into amplitude analysis, autocorrelation analysis, cross-correlation analysis and time series analysis. Among them, amplitude analysis is the most widely used. Amplitude analysis is a statistical analysis of the probability density distribution of sensor signals, using quantities such as the maximum, average, root mean square (RMS), variance, standard deviation, skewness, kurtosis, peak-to-peak value and peak coefficient. After final selection, we use 20 time-domain features over the three axes, such as the average spindle load, the RMS value of the spindle load, the peak load, the peak load factor, the load waveform coefficient, and the axle load pulse factor.
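Several of the amplitude statistics listed above can be computed as in the following sketch. It uses the common textbook definitions (peak factor as peak over RMS, waveform coefficient as RMS over mean absolute value, pulse factor as peak over mean absolute value); whether the paper uses exactly these definitions is an assumption, and the sine wave is only a stand-in for a real sensor signal.

```python
import math

def time_domain_features(signal):
    """Amplitude statistics of one signal segment (signal must not be all zeros)."""
    n = len(signal)
    mean = sum(signal) / n
    rms = math.sqrt(sum(v * v for v in signal) / n)
    peak = max(abs(v) for v in signal)
    abs_mean = sum(abs(v) for v in signal) / n
    variance = sum((v - mean) ** 2 for v in signal) / n
    return {
        "mean": mean,
        "rms": rms,
        "peak": peak,
        "variance": variance,
        "std": math.sqrt(variance),
        "peak_to_peak": max(signal) - min(signal),
        "peak_factor": peak / rms,          # also called crest factor
        "waveform_factor": rms / abs_mean,  # also called shape factor
        "pulse_factor": peak / abs_mean,    # also called impulse factor
    }

# Ten periods of a unit sine wave as a synthetic test signal
demo = [math.sin(2 * math.pi * k / 100) for k in range(1000)]
feats = time_domain_features(demo)
print(round(feats["rms"], 4), round(feats["peak_factor"], 4))
```

For a pure sine wave the RMS is the amplitude divided by the square root of 2, which gives a quick sanity check of the implementation.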
In this paper, a total of 20 time-domain features are selected to form the signal feature quantity, including the cutting force, the cutting current, and, for the three axes, the average spindle load, the RMS value of the spindle load, the peak load, the peak load factor, the load waveform coefficient, and the axle load pulse coefficient. Ten examples of features acquired on the Y-axis are shown in Table 2.
Table 2 Features acquired on the spindle and Y-axis
In order to avoid interference from physical units and to prevent gradient explosion when training the model, this paper uses a linear conversion function to scale the data into a small fixed interval. Since the features differ in unit and magnitude, they need to be normalized before they can take part in the calculation, and their values are mapped to a fixed numerical interval by a function transformation. The expression is
$$y = \frac{x - \min(x)}{\max(x) - \min(x)}$$
where y is the new value after normalization, x is the original value, max(x) is the maximum value in the set of data, and min(x) is the minimum value in the set of data.
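A minimal sketch of this normalization is given below, assuming the usual min-max form y = (x - min(x)) / (max(x) - min(x)) applied to each feature column separately. The handling of a constant column and the toy values are assumptions made for illustration.

```python
def min_max_normalize(values):
    """Scale one feature column linearly to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to 0 (assumption)
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def normalize_columns(rows):
    """Apply min-max normalization to every column of a list of rows."""
    columns = list(zip(*rows))
    scaled = [min_max_normalize(col) for col in columns]
    return [list(r) for r in zip(*scaled)]

# Two hypothetical features with very different magnitudes
data = [[10.0, 200.0], [20.0, 400.0], [15.0, 300.0]]
print(normalize_columns(data))   # [[0.0, 0.0], [1.0, 1.0], [0.5, 0.5]]
```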
After the metal cutting experiments, 500 sets of data under different processing conditions were obtained, and 20 effective feature values were extracted from each set through the above feature engineering. Therefore, a normalized 500×20 valid data set under different processing parameters is obtained.
In order to analyze the accuracy of the proposed method, two recognition methods are selected as comparison models: the back propagation neural network (BPNN) and the support vector machine (SVM). In this paper, a two-class SVM classification model is used. The basic model is defined as the linear classifier with the largest margin in the feature space, and the learning strategy is to maximize the margin. The gamma value of the SVM model is 50, and the radial basis function (RBF) is chosen as the kernel. The deep neural network is divided into three layers. The parameters of the model are shown in Table 3.
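To give a sense of what a gamma value of 50 means, the RBF kernel can be evaluated directly. The sketch assumes the common parameterization K(x, z) = exp(-gamma * ||x - z||^2); the paper does not state which SVM library or parameterization was used, and the sample points are hypothetical.

```python
import math

def rbf_kernel(x, z, gamma=50.0):
    """RBF kernel K(x, z) = exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

# Similarity between normalized feature vectors at increasing distance
for d in (0.0, 0.1, 0.3):
    print(d, rbf_kernel([0.0], [d]))
```

With features scaled to [0, 1], the similarity at a distance of 0.3 is already about 0.011, so under this parameterization each support vector influences only a small neighborhood.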
Table 3 Parameters of the deep neural network
At the same time, in order to verify the effect of the LSTM model, in addition to recognizing the obtained data directly, this paper also partitions the data to investigate the recognition performance of the model with unbalanced samples. Specifically, training sets in which the severe wear data account for 5% and 10% of the samples are constructed to examine the generalization ability of the model. The first comparison verifies the impact of the PCA method on the time spent on model training. When PCA is not used, a total of 20 time-domain features need to be taken into account, which greatly increases the complexity of the model and reduces the speed of model training. Table 4 shows the impact of adopting PCA in model training.
Table 4 Impact of adopting PCA in model training
The hyper-parameters of the LSTM model proposed in this paper are shown in Table 5. Case 1 and Case 2 are compared to find the impact of the hidden layer size. Case 2 and Case 3 are compared to find whether the number of iterations affects the test accuracy.
Table 5 Different hyper-parameters for LSTM model structure
The accuracy for different hidden layer sizes is shown in Fig. 5. When the hidden layer size reaches 50 in Case 2, the model achieves an accuracy of 95.74% after 500 iterations of training, which is more accurate than that in Case 1.
Fig. 5 Accuracy of LSTM model with different hyper-parameters
It can be seen from the results that the model reaches an accuracy of 95.74% with 500 iterations and a hidden layer size of 50. That is higher than the accuracy of 92.16% with 250 iterations.
During the machining process, the tool is mostly in the moderate wear state. Because the data are therefore unbalanced, it is necessary to improve the accuracy of recognizing the wear level.
In order to study the accuracy of the proposed method better, this paper designed three experimental data settings. In setting 1, the training data contain 5% severe wear samples. In setting 2, the training data contain 10% severe wear samples. Setting 3 uses all pre-processed data for training.
In this comparison, the LSTM model has a hidden layer size of 50 and an iteration number of 500. The hyper-parameters of BPNN and SVM are the same as those introduced above. The recognition results of the three models are shown in Fig. 6.
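One way to build such settings is to keep all non-severe samples and draw just enough severe wear samples to reach the target share. The sketch below is a hypothetical construction: the paper does not describe how the subsets were drawn, and the class counts used here are invented for illustration.

```python
import random

def make_imbalanced_training_set(samples, labels, severe_label, severe_fraction, seed=0):
    """Return (sample, label) pairs in which severe wear has the given share."""
    rng = random.Random(seed)
    severe = [(s, y) for s, y in zip(samples, labels) if y == severe_label]
    others = [(s, y) for s, y in zip(samples, labels) if y != severe_label]
    # Solve n_severe / (n_severe + len(others)) = severe_fraction for n_severe
    n_severe = round(severe_fraction * len(others) / (1.0 - severe_fraction))
    if n_severe > len(severe):
        raise ValueError("not enough severe wear samples for this fraction")
    subset = others + rng.sample(severe, n_severe)
    rng.shuffle(subset)
    return subset

# Hypothetical class counts for a 500-sample data set (not from the paper)
labels = [1] * 100 + [2] * 300 + [3] * 100
samples = list(range(500))
for fraction in (0.05, 0.10):
    train = make_imbalanced_training_set(samples, labels, 3, fraction)
    share = sum(1 for _, y in train if y == 3) / len(train)
    print(fraction, len(train), round(share, 3))
```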
Fig. 6 Accuracy of the model with different experimental data settings
It can be concluded from Fig. 6 that the LSTM model still has better recognition accuracy than the other two models under the condition of unbalanced samples.
In order to test how well the model fits untrained data, cross-validation of the model is required. The cross-validation method divides the data set into two parts: one part is used as the training set and the other as the test set. The two parts are then swapped and training and testing are repeated. With cross-validation, the generalization ability of the model can be judged intuitively. In this verification, the model was trained on the original test set and then evaluated on the original training set. The results of cross-validation are shown in Fig. 7.
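The swap procedure can be sketched as two-fold cross-validation. To keep the example self-contained, a simple nearest-centroid classifier stands in for the LSTM model, and the data are synthetic; both are assumptions made purely for illustration.

```python
import random

def fit_centroids(X, y):
    """Stand-in classifier: store the mean feature vector of each class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, v in enumerate(row):
            acc[j] += v
        counts[label] = counts.get(label, 0) + 1
    return {k: [v / counts[k] for v in acc] for k, acc in sums.items()}

def predict(centroids, row):
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(row, c))
    return min(centroids, key=lambda k: sq_dist(centroids[k]))

def accuracy(centroids, X, y):
    hits = sum(predict(centroids, r) == t for r, t in zip(X, y))
    return hits / len(y)

def two_fold_cross_validation(X, y, seed=0):
    """Train on one half and test on the other, then swap the halves."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    half = len(idx) // 2
    folds = (idx[:half], idx[half:])
    scores = []
    for train_idx, test_idx in (folds, folds[::-1]):
        model = fit_centroids([X[i] for i in train_idx], [y[i] for i in train_idx])
        scores.append(accuracy(model, [X[i] for i in test_idx], [y[i] for i in test_idx]))
    return scores

# Three well-separated synthetic classes (not the experimental data)
rng = random.Random(1)
X = [[rng.gauss(c, 0.1), rng.gauss(c, 0.1)] for c in (0.0, 1.0, 2.0) for _ in range(50)]
y = [c for c in (0, 1, 2) for _ in range(50)]
scores = two_fold_cross_validation(X, y)
print(scores)
```

If the two scores are close to each other, the model is not overly dependent on which half it was trained on, which is the intuition behind using the swap as a check of generalization.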
Fig. 7 Comparison of discriminant accuracy rates when LSTM is cross-validated
In this paper, a method for judging the health state of milling tools based on PCA and LSTM is proposed. Time-domain processing is performed on the collected processing parameters to initially reduce the amount of data. Principal component analysis is then performed on the time-domain features for further data reduction. The final processed data are passed to the LSTM model for tool health status recognition.
By combining the PCA method with the LSTM network, this method can mine the temporal patterns in the data. Compared with the traditional SVM and BPNN, this method has been shown to have higher recognition accuracy.
Journal of Donghua University (English Edition), 2019, Issue 4