
Diagnosis and Handling of Abnormal Data, with Applications in Chemical Engineering

Paper ID: HG178  Word count: 12,365; pages: 32

Abstract: Assuring data quality matters greatly in chemical engineering measurements, and abnormal data must be diagnosed and handled accurately. This paper surveys several methods currently in common use for diagnosing and handling abnormal chemical data, analyzes their basic principles and algorithmic steps, and summarizes the strengths and weaknesses of each. It then takes observations of an aircraft's remaining fuel as an example to examine how statistics-based methods handle this kind of problem. The data are first treated with ordinary multiple regression, which yields a very small linear correlation coefficient. As an improvement, stepwise regression analysis is adopted: it removes the significantly outlying values and then regresses on the remaining ones, and the linear correlation of the data treated this way improves markedly. Because both conclusions rest on the prior assumption that the data are inherently linear, the paper refines the treatment once more by introducing polynomial regression. The data treated this way clearly show a very good linear correlation, and the result is satisfactory.

Keywords: abnormal data; delayed coking; linear regression analysis; multivariate stepwise regression; polynomial regression
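The three-stage workflow summarized in the abstract can be illustrated with a minimal sketch. It is written in Python rather than the Matlab the thesis uses, and everything in it is an assumption made for illustration: the data are synthetic (a quadratic trend with one injected gross error, not the aircraft fuel observations), there is a single predictor instead of several, and a simple residual-threshold screen (drop points whose residual exceeds `k` standard deviations) stands in for the thesis's stepwise procedure, whose details are not given here. The helper names `polyfit`, `predict`, `r_squared`, and `drop_outliers` are hypothetical.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial coefficients [c0, c1, ...] via the normal equations."""
    n = degree + 1
    # Normal equations A c = b with A[i][j] = sum(x^(i+j)) and b[i] = sum(y * x^i).
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


def predict(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


def r_squared(xs, ys, coeffs):
    """Coefficient of determination of the fit on (xs, ys)."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - predict(coeffs, x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def drop_outliers(xs, ys, coeffs, k=2.0):
    """Keep points whose residual is within k residual standard deviations (assumed rule)."""
    residuals = [y - predict(coeffs, x) for x, y in zip(xs, ys)]
    sigma = (sum(r * r for r in residuals) / len(residuals)) ** 0.5
    kept = [(x, y) for x, y, r in zip(xs, ys, residuals) if abs(r) <= k * sigma]
    return [x for x, _ in kept], [y for _, y in kept]


# Synthetic data (assumption): quadratic decline with one gross error at x = 4.
xs = [float(i) for i in range(10)]
ys = [100.0 - 2.0 * x - 0.5 * x * x for x in xs]
ys[4] += 40.0

# Stage 1: plain linear fit on the raw data.
line_raw = polyfit(xs, ys, 1)
r2_raw = r_squared(xs, ys, line_raw)

# Stage 2: screen out large residuals, then refit the line.
clean_x, clean_y = drop_outliers(xs, ys, line_raw)
line_clean = polyfit(clean_x, clean_y, 1)
r2_clean = r_squared(clean_x, clean_y, line_clean)

# Stage 3: polynomial (degree 2) fit on the screened data.
poly = polyfit(clean_x, clean_y, 2)
r2_poly = r_squared(clean_x, clean_y, poly)

print("R^2 linear, raw data:      ", round(r2_raw, 4))
print("R^2 linear, screened data: ", round(r2_clean, 4))
print("R^2 quadratic, screened:   ", round(r2_poly, 4))
```

On this toy data the goodness of fit rises from stage to stage, which mirrors the progression the abstract describes; it says nothing about the magnitudes obtained in the thesis itself.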

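Contents

Abstract (Chinese)

Abstract (English)

Contents

1. Introduction

1.1 Significance of data quality assurance in chemical testing

1.2 Sources of abnormal data

1.3 Characteristics of abnormal data

1.4 Methods currently in common use for diagnosing and handling abnormal data

1.4.1 Physical discrimination

1.4.2 Statistics-based methods

1.4.3 Distance-based methods

1.4.4 Deviation-based methods

1.4.5 Density-based methods

1.4.6 Summary of the strengths and weaknesses of each method

2. Experimental

2.1 Multiple linear regression analysis

2.1.1 The multiple linear regression model

2.1.2 Testing the multiple linear regression model

2.1.3 Worked example

2.2 Multivariate stepwise regression analysis

2.2.1 Principle

2.2.2 Steps of multivariate stepwise regression analysis

2.2.3 Worked example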

2.3 Polynomial regression analysis

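2.3.1 The polynomial regression model

2.3.2 Matlab implementation

3. Results and discussion

3.1 Results of the multiple linear regression method

3.2 Results of multivariate stepwise regression analysis

3.3 Advantages of combining the three methods

4. Conclusions and outlook

Acknowledgements

References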
