文章目录
- 1.Three way ANOVA
- 2.Latin square design
- 2.Hierarchical (nested) ANOVA
- 3.Split-plot ANOVA
- 4.Repeated measures ANOVA
- 5.Mixed effect models
1.Three way ANOVA
三因素相关分析
单因子分析的代码
data(mtcars)
nrow(mtcars) # 32
mtcars$cyl = as.factor(mtcars$cyl)
levels(mtcars$cyl) # “4” “6” “8”
model = aov(mpg~cyl, data = mtcars)
summary(model)
三因子分析的model
# Three way ANOVA
Dat = read.table('d:/ioz/statistics/2015/3way.ANOVA.txt',
sep=' ', header=T)
Dat$species <- as.factor(Dat$species)
model <- aov(rate ~ species * temp* sex, data=Dat)
summary(model)
summary.lm(model)
species:temp:sex大p值,删除它,它的作用不大。
模型选择是删除变量。
model <- aov(rate ~ species * temp* sex
- species : temp: sex,
data=Dat)
回归的情况下的,分析二阶相互作用
model <- aov(rate ~ species * temp* sex
- species : temp: sex - temp:sex - sex,
data=Dat)
线性模型是解释大致的关系,不解决机制的关系。
2.Latin square design
行变量,列变量
行列和treatments
没有任何重复需要做4*4*4次,拉丁方是特别省实验量的数据。
error是他们的交互作用,为什么是三个参数呢?这个还需有待探索。
library(faraway)
data(abrasion)
lines <-
"id run position material wear
1 1 1 C 235
2 1 2 D 236
3 1 3 B 218
4 1 4 A 268
5 2 1 A 251
6 2 2 B 241
7 2 3 D 227
8 2 4 C 229
9 3 1 D 234
10 3 2 C 273
11 3 3 A 274
12 3 4 B 226
13 4 1 B 195
14 4 2 A 270
15 4 3 C 230
16 4 4 D 225"
abrasion.data <- read.table(con <-
textConnection(lines), header=TRUE)
close(con)matrix(abrasion.data$material, 4, 4)
abrasion.data$run = as.factor(abrasion.data$run)
abrasion.data$position = as.factor(abrasion.data$position)
fit1 = aov(wear ~ run + position + material, abrasion.data)
fit2 = lm (wear ~ run + position + material, abrasion.data)
summary(fit1)
summary(fit2)
标准值之间的分析。
2.Hierarchical (nested) ANOVA
嵌套分析,因素是嵌套在一起的。
这个是嵌套的,x之前是有区别。
下面是非嵌套的情况的。
希望嵌套的差距越小越好,实际则是越大越好。
看treatment能不能被盖住。
平均值,考虑到treatment,在加上年份的影响,在加上其他的影响
和双因素的方差分析相比,第二个式子有差别。
方差都是需要相除。这里面是除以嵌套的值,与之前双因素分析的存在差别。
像是方差,做实验的yao qiu你
r里面的计算方法是有问题的。蚊子的
对其进行调整,
3.Split-plot ANOVA
列区实验实际。需要做双因素,或多因素。
相当于有重复的双因素
# Split-plot
# Tensile strength in paper manufacturing
Y <- c(30,35,37,36,34,41,38,42,29,26,33,36,
28,32,40,41,31,36,42,40,31,30,32,40,
31,37,41,40,35,40,39,44,32,34,39,45)
block <- gl(3,12,36) # Three blocks
A <- gl(3,4,36) # Three pulp preparation methods
B <- gl(4,1,36) # Four different temperatures
Dat <- cbind(Y, block, A, B)
fit <- aov(Y ~ A*B + Error(block/A))
summary(fit)
# Compare regular ANOVA
summary(aov(Y ~ A*B + block))
在不同水平上,检查显著性
summary(aov(Y ~ A*B + Error(block/A))) # spilt plot
summary(aov(Y ~ (A + B + block)^2)) # regular ANOVA
irrigation和den是可以变
4.Repeated measures ANOVA
重复实验设计。能减少不同人之间的差距。
减少之间的差距。
时间序列的研究。
重复对其的影响。
重复实验设计。这个条件需要球性,sphericity
球形检验(Mauchly’s test of sphericity),适用于重复测量时检验不同测量之间的差值的方差是否相等,用于三次以及三次之上(想也能够想明白,两次重复测量根本就没有办法比较差值的方差,因为只有一个方差)。
参考:
https://blog.csdn.net/qq_41989587/article/details/82351591
检验这个球性,不能拒绝假设,说明
# data
Group <- c("A","A","A","A","A","A","A","A","B","B","B","B","B","B","B","B",
"C","C","C","C","C","C","C","C")
Value <- c(1,2,4,1,1,2,2,3,3,4,4,2,3,4,4,3,4,5,3,5,5,3,4,6)
Participant <- c("1","2","3","4","5","6","7","8","1","2","3","4","5","6","7","8",
"1","2","3","4","5","6","7","8")
data <- data.frame(Participant, Group, Value)
# make a matrix such that the rows are the within-subject factor (Participant)
# and the columns are the groups to compare (Group)
matrix <- with(data, cbind(Value[Group == "A"], Value[Group == "B"], Value[Group == "C"]))
# build a multivariate linear model with the matrix you've just created
model <- lm(matrix ~ 1)
# define the design of the study, make a list of the independent variable
design <- factor(c("A", "B", "C"))
# load car package, which has Anova() function including Mauchly's test
library(car)
options(contrasts=c("contr.sum", "contr.poly"))
aov <- Anova(model, idata = data.frame(design), idesign = ~design, type = "III")
# 第三类的方差和
summary(aov, multivariate = F)
# Repeated measures ANOVA
face = read.table("d:/ioz/statistics/repeated_ANOVA/face.csv", header = T, sep = ",")
face$aspect <- as.factor(face$aspect)
face$id <- as.factor(face$id)
# id / aspect (aspect within id)
face.aov = aov(time ~ aspect + Error(id / aspect), data = face)
face.aov = aov(time ~ aspect + Error(id), data = face) # same
summary(face.aov)
# pairwise comparison
with(face, pairwise.t.test(time, aspect, p.adjust.method="holm", paired=T))
线性模型方差可以分解,连续变量做回归。多于两个因子做方差分析。嵌套不考虑相互作用。
5.Mixed effect models
不同的x变量都有相对重要性。我们可能只想要一种变量的显著性。
a代表代表平均值。
混合效益模型来进行分析。随机项的变异要表现出来。
随机项是一个值。混合模型用了极大似然估计
# Randomized Block Design
# Carbon dioxygen density, 8 incubators and 4 treatments
CO2 <- data.frame(ID=1:32, group=NA, treat=NA, density=NA)
n <- 0
for(i in 1:8){for(j in c('A','B','C','D')){n <- n+1CO2$group[n] = iCO2$treat[n] = j
}}
CO2$group <- factor(CO2$group)
CO2$treat <- factor(CO2$treat)
CO2$density <- c(5.27,5.27,5.94,5.53,5.27,5.22,4.88,4.96,5.88,5.83,
5.38,5.53,5.44, 5.38,5.27,5.32,5.66, 5.44,5.38,4.88,6.22,
6.22,5.61,5.92,5.83,5.72,5.38,4.88,5.27,5.11,5.12,4.44)
fit1 <- aov (density ~ treat, data = CO2) # one way ANOVA
fit2 <- aov (density ~ group + treat, data = CO2) # Randomized Block Design
library(lme4) # package
fit3 <- lmer(density ~ treat + (1|group), CO2) # mixed effect model
summary(fit1)
summary(fit2)
summary(fit3)
可以比较two way分析和混合分析的各个值,它们存在很多地方相似的部分。
# Crop products
Y <- c(30,35,37,36,34,41,38,42,29,26,33,36,
28,32,40,41,31,36,42,40,31,30,32,40,
31,37,41,40,35,40,39,44,32,34,39,45)
block <- gl(3,12,36) # Three blocks
A <- gl(3,4,36) # Three different fertilizers
B <- gl(4,1,36) # Four different pesticides
Dat <- data.frame(Y, block, A, B); head(Dat)
model <- aov(Y ~ A*B + Error(block/A)) # split plot
# Mixed-effect models
library(nlme)
model1 <- lme(Y ~ A*B, random=~1|block/A, data=Dat)
summary(model1)
library(lme4)
model2 <- lmer(Y ~ A*B+(1|block/A), data=Dat)
summary(model2)
anova(model2)
summary(model2)
anova(model1)
重复实验设计也是可以用混合模型,这里的时间是连续变量,固定效益是B
# Crop products
Y <- c(30,35,37,36,34,41,38,42,29,26,33,36,
28,32,40,41,31,36,42,40,31,30,32,40,
31,37,41,40,35,40,39,44,32,34,39,45)
block <- gl(3,12,36) # Three blocks
time <- gl(8,2,36); time <- as.numeric(time) # 8 time periods
B <- gl(4,1,36) # Four different pesticides
Dat <- data.frame(Y, block, time, B); head(Dat)
# Mixed-effect models
library(nlme)
model <- lme(Y ~ B, random = ~ time | block, data = Dat)
summary(model)
混合效益模型,空间上的差别。
# plot 5 autocorrelation types in package nlme
library(nlme)
par(mfrow=c(2,3))
D <- seq(from = 0, to = 1, by = 0.1); Mydata <- data.frame(D = D)
autocor <- corSpher(c(0.8, 0.1), form = ~ D, nugget = TRUE)
autocor <- Initialize(autocor, data = Mydata)
semivar <- Variogram(autocor)
plot(semivar[,2], semivar[,1], type = "l", col = 1, xlab = 'Distance', ylab =
'Semivariogram'
, main = 'corSpher')
有包可以计算空间自相关的情况。
AED这个数据可以直接下载,我们这里是直接用它的函数。
现在混合效益模型是当前的主流。
需要考虑空间自相关的因素。
lme(response ~ factorA, random=~1|factorB)
#library(nlme)
如何用混合效益模型来分析其他的效益。这个都可以用于其他的分析中去。
大的文章
control 是有计划,是控制的。关注你关心的变量。
balance 不正态,对f分布的影响差别不大。不正态对p值的影响值