What is Data Mining
With the rapid development of information technology and the widespread adoption of database management systems, the amount of information held by enterprises keeps growing. Enterprises have therefore begun to analyze this large volume of data to obtain more information to support decision-making. Regrettably, more than half of the data still sits unused in the data warehouse after being stored, so hidden correlations in the data go undiscovered and future trends cannot be predicted from it.
Driven by business needs and by advances in science and technology, using database technology to manage data and artificial intelligence to analyze it, thereby mining the knowledge hidden behind the data, has become a topic of great concern to enterprises. Within this field, data mining is the most important part of business intelligence. "Data mining" is the process of analyzing the data in a large data warehouse to extract useful information, knowledge, models, or rules. Its applications span virtually every industry, and manufacturing is one of the industries where it is most widely used.
Process of Data Mining
"It is better to teach a man to fish than to give him a fish." As described above, data mining is a process that seeks useful knowledge in a large database to support decision-making. Behind the acquisition of the "fish" (useful knowledge), what we really need to understand is the "fishing" (how to follow a standard process). The data mining process itself is therefore very important, even though it is easy to overlook.
Surveys suggest that the CRISP-DM process (as shown in the following figure) is widely recognized as an influential process among the many data mining processes, and IBM SPSS Modeler integrates the CRISP-DM concept into its analysis workflow to obtain data mining results more efficiently. The CRISP-DM process consists of six phases: Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, and Deployment.
Analytical Application in Manufacturing Industry
In manufacturing management, business processes and production systems generate large amounts of data, and the relevant departments often use statistical analysis and data mining techniques to solve the problems they face. Common problems in manufacturing production management include statistical quality management and predictive analysis for equipment maintenance, and these problems can be addressed effectively with the SPSS product family.
As a mature business intelligence analysis tool, SPSS offers functions for manufacturing quality management that include data management, statistical analysis, trend research, and tabulation. Statistical analysis of quality management with SPSS means processing and analyzing product quality data at every stage of production and management, using the SPSS product family together with statistical methods. The main analysis methods are data collection and processing, descriptive and multivariate statistics of quality indicators, statistical quality control charts, experimental design, acceptance sampling, and equipment maintenance prediction and analysis. They fall into four areas:
- Design of quality analysis experiment
- Statistical quality management chart analysis
- Analysis of factors affecting quality indicators
- Prediction of product quality and equipment operation
ANOVA in SPSS is mainly used to analyze experimental data to determine which factors, or combinations of factors, affect the quality characteristics of a product, and to select the best model, process, or formula. The analysis-of-variance tools in SPSS include ANOVA, ANCOVA, and MANOVA. The orthogonal design function of SPSS can effectively improve the efficiency of experimental design.
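To make the idea behind one-way ANOVA concrete, here is a minimal sketch in plain Python. It is not SPSS syntax and does not reproduce SPSS output; the function name `one_way_anova_f` and the sample measurements are assumptions made for illustration. It compares the variation between group means with the variation inside each group and returns the resulting F statistic.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups.

    Each group is a list of measurements taken under one factor level
    (for example, one machine setting). Hypothetical helper for illustration.
    """
    k = len(groups)                                # number of factor levels
    n = sum(len(g) for g in groups)                # total number of observations
    grand_mean = mean([x for g in groups for x in g])

    # Variation explained by the factor (between groups)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Residual variation (within groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within

# Made-up quality measurements under three hypothetical process settings
settings = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_value = one_way_anova_f(settings)
print(f"F = {f_value:.2f}")
```

A large F value suggests that the factor affects the quality characteristic; judging significance still requires comparing F with the F distribution for (k - 1, n - k) degrees of freedom, which a statistics package does automatically.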
The quality control chart module of SPSS can monitor and control each quality indicator of a product, track changes in the indicators during production in real time, and support immediate analysis or adjustment of the production process, so that the production line keeps operating normally.
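The logic of a control chart can be sketched in a few lines of Python. This is a simplified illustration rather than the SPSS module: it assumes three-sigma limits estimated from the sample standard deviation of an in-control baseline, whereas production charts often estimate sigma from subgroup or moving ranges. The function names and data are hypothetical.

```python
from statistics import mean, stdev

def control_limits(baseline):
    """Return (lower limit, center line, upper limit) from in-control baseline data."""
    center = mean(baseline)
    sigma = stdev(baseline)          # sample standard deviation (simplifying assumption)
    return center - 3 * sigma, center, center + 3 * sigma

def out_of_control(samples, lcl, ucl):
    """Return the indices of samples that fall outside the control limits."""
    return [i for i, x in enumerate(samples) if x < lcl or x > ucl]

# Made-up baseline measurements of one quality indicator
baseline = [10.0, 10.2, 9.8, 10.1, 9.9]
lcl, center, ucl = control_limits(baseline)

# New measurements arriving from the production line
new_samples = [10.1, 10.6, 9.4, 10.0]
flagged = out_of_control(new_samples, lcl, ucl)
print("Samples needing attention:", flagged)
```

Points outside the limits are the ones that would prompt an immediate review of the process.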
The multivariate statistical functions of SPSS can also be used to analyze the factors that affect quality indicators. Regression analysis in SPSS is mainly used to find the relationship between quality characteristics and different production factors, in order to make statistical predictions or determine the best operating conditions. The regression procedures mainly include linear regression, Probit, Logit, multivariate regression, Logistic regression, and nonlinear and constrained nonlinear regression (NLR and CNLR).
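As an illustration of the simplest case, the sketch below fits a straight line by ordinary least squares in plain Python. It is not SPSS syntax; the function name `fit_line`, the variable names, and the data are assumptions chosen for the example, which relates a single production factor to a single quality characteristic.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up data: a production factor (x) and a quality characteristic (y)
factor = [1.0, 2.0, 3.0, 4.0]
quality = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(factor, quality)
predicted = slope * 5.0 + intercept   # prediction at a new, hypothetical setting
print(f"slope={slope:.2f}, intercept={intercept:.2f}, predicted={predicted:.2f}")
```

With several production factors the same idea extends to multiple regression, which is where a statistics package becomes more practical than hand-written code.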
Data sampled during production often form a time series, and time series techniques can analyze the relationships within such data (for example, autocorrelation). The time series procedures include ARIMA, EXSMOOTH, SEASON, SPECTRA, and AREG, all of which are powerful tools for analyzing a production process. SPSS thus provides a variety of tools that range from product design and production process analysis to product quality monitoring and forecasting.
The historical data of quality indicators reflects the operating status of the existing production line and implicitly captures both the inherent quality factors of the various raw materials and the random combinations of process conditions. By extracting important production rules from this large and cluttered body of data and building a prediction model, an enterprise can improve raw material supply management and process optimization and adjustment. This in turn further improves process stability and product quality, and keeps the equipment running more stably at the existing level of quality management.
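Autocorrelation, mentioned above, measures how strongly a series is related to a shifted copy of itself. The following is a minimal Python sketch of the standard sample autocorrelation at a given lag; it is not one of the SPSS procedures listed, and the function name and data are hypothetical.

```python
def autocorrelation(series, lag):
    """Return the sample autocorrelation of a series at the given lag."""
    n = len(series)
    m = sum(series) / n
    denom = sum((x - m) ** 2 for x in series)
    # Products of deviations between each point and the point `lag` steps later
    num = sum((series[t] - m) * (series[t + lag] - m) for t in range(n - lag))
    return num / denom

# Made-up measurements taken at consecutive sampling times
measurements = [1.0, 2.0, 3.0, 4.0, 5.0]
r1 = autocorrelation(measurements, 1)
print(f"lag-1 autocorrelation: {r1:.2f}")
```

Values close to 1 or -1 indicate that consecutive samples are strongly related, which is the kind of structure that models such as ARIMA are designed to exploit.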