Data Mining: Introduction and Practice (数据挖掘入门与实战), WeChat official account: datadw
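How can we use R to measure how similar the price movements of two stocks are? For example, we know that gold and the US dollar are negatively correlated, but how do we actually calculate that?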
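As a quick illustration of what "calculating" a negative relationship can look like, here is a minimal sketch using base R's cor(). The two series are simulated and the names gold and dollar are only labels chosen for this example; they are not real quotes.

```r
# Simulated data only: two hypothetical price series built to move in
# opposite directions. These are NOT real gold or dollar quotes.
set.seed(42)
n <- 250
shock  <- rnorm(n)                                   # shared daily shock
gold   <- 100 * exp(cumsum( 0.01 * shock + 0.002 * rnorm(n)))
dollar <- 100 * exp(cumsum(-0.01 * shock + 0.002 * rnorm(n)))

# Pearson correlation of the two price series; the value lies in [-1, 1],
# and a negative value means the series tend to move in opposite directions.
r_pearson <- cor(gold, dollar)

# Spearman rank correlation, which only uses the ordering of the values.
r_spearman <- cor(gold, dollar, method = "spearman")

print(c(pearson = r_pearson, spearman = r_spearman))
```

Both coefficients come out negative here because the simulation was built that way; on real data the sign and size are what you would be trying to find out.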
The analysis below uses three stocks, AAPL, GOOGL, and MSFT, as a worked example:
library(quantmod)
library(ggplot2)
# get data
getSymbols("MSFT", src = "yahoo", from = "2008-01-01", to = "2011-12-31")
getSymbols("GOOGL", src = "yahoo", from = "2008-01-01", to = "2011-12-31")
getSymbols("AAPL", src = "yahoo", from = "2008-01-01", to = "2011-12-31")
# Scale each closing-price series by its own maximum so the three are comparable
df <- data.frame(time(MSFT), Cl(MSFT)/max(Cl(MSFT)),
                 Cl(GOOGL)/max(Cl(GOOGL)), Cl(AAPL)/max(Cl(AAPL)))
colnames(df) <- c("date","MS","GO","AP" )
# Plot the normalized closing prices, one line per stock
p <- ggplot(data=df, aes(x=date, y=GO, color="GOOGL")) + geom_line() + theme_bw() +
  geom_line(aes(y=MS, color="MSFT")) + geom_line(aes(y=AP, color="AAPL")) +
  labs(y="Normalized Close", colour="Stock")
p
# Extract the closing prices used in the regressions
m <- Cl(MSFT)
g <- Cl(GOOGL)
a <- Cl(AAPL)
# Regress GOOGL on MSFT with no intercept (the "0 +" term)
> lm.mg <- lm(g ~ 0+m)
> summary(lm.mg)
# Output
Call:
lm(formula = g ~ 0 + m)
Residuals:
     Min       1Q   Median       3Q      Max
-147.435  -29.555   -3.387   36.031  137.898
Coefficients:
  Estimate Std. Error t value Pr(>|t|)
m 19.56864    0.06888   284.1   <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 56.72 on 1008 degrees of freedom
Multiple R-squared: 0.9877, Adjusted R-squared: 0.9877
F-statistic: 8.071e+04 on 1 and 1008 DF, p-value: < 2.2e-16
# Regress GOOGL on AAPL with no intercept (the "0 +" term)
> lm.ag <- lm(g ~ 0+a)
> summary(lm.ag)
# Output
Call:
lm(formula = g ~ 0 + a)
Residuals:
    Min      1Q  Median      3Q     Max
-270.43  -51.71  128.00  159.25  311.00
Coefficients:
  Estimate Std. Error t value Pr(>|t|)
a   1.9644     0.0192   102.3   <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 151.4 on 1008 degrees of freedom
Multiple R-squared: 0.9122, Adjusted R-squared: 0.9121
F-statistic: 1.047e+04 on 1 and 1008 DF, p-value: < 2.2e-16
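One thing to keep in mind when reading the R-squared values above: for a model without an intercept, summary.lm() measures R-squared against zero rather than against the mean of the response, so the number can be high even when the two series are unrelated. The sketch below shows this on simulated data; x and y are made-up series with roughly the scale of the closing prices above, not the real quotes.

```r
# Simulated data only: y has a large mean and does not depend on x at all.
set.seed(1)
x <- runif(500, min = 20, max = 30)
y <- 500 + rnorm(500, sd = 20)

fit_origin    <- lm(y ~ 0 + x)   # forced through the origin, as in the models above
fit_intercept <- lm(y ~ x)       # same regression with an intercept

r2_origin    <- summary(fit_origin)$r.squared
r2_intercept <- summary(fit_intercept)$r.squared

# With an intercept, R-squared equals the squared Pearson correlation of x and y.
print(c(origin = r2_origin, intercept = r2_intercept, cor_squared = cor(x, y)^2))
```

If the goal is to compare how closely two stocks track a third, refitting with an intercept (or looking at the correlation directly) is one way to get values that are comparable on the usual 0 to 1 scale.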
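Brief conclusions:
(1) Judging from the plot, the normalized closing price of MSFT tracks GOOGL more closely than the normalized closing price of AAPL does.
(2) In both regressions the p-value of the slope coefficient is very small (< 2.2e-16), so both slopes are statistically significant.
(3) MSFT is more strongly related to GOOGL than AAPL is, as measured by the adjusted R-squared of these no-intercept regressions:
MSFT vs GOOGL (0.9877) > AAPL vs GOOGL (0.9121).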
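A common complementary check is to correlate daily log returns instead of price levels, since prices that merely trend in the same direction can look related even when their day-to-day moves are not. The following is a minimal sketch under that assumption. The helper names log_returns and sim_price are hypothetical, and the prices are simulated, so the printed numbers say nothing about the real stocks; with the data downloaded above one could pass as.numeric(Cl(MSFT)), as.numeric(Cl(GOOGL)), and as.numeric(Cl(AAPL)) instead.

```r
# Hypothetical helper: daily log returns from a vector of prices.
log_returns <- function(price) diff(log(price))

# Simulated closing prices for illustration only (NOT real quotes).
set.seed(7)
n <- 1009
market <- rnorm(n, sd = 0.015)                 # shared market factor
sim_price <- function(start, beta) {
  # Random-walk price driven by the shared factor plus its own noise.
  start * exp(cumsum(beta * market + rnorm(n, sd = 0.01)))
}
prices <- data.frame(MS = sim_price(30, 1.0),
                     GO = sim_price(500, 1.0),
                     AP = sim_price(200, 0.3))

# One column of log returns per stock, one row fewer than the prices.
returns <- as.data.frame(lapply(prices, log_returns))

# Pairwise Pearson correlations: a symmetric 3 x 3 matrix with 1 on the diagonal.
cor_mat <- cor(returns)
print(round(cor_mat, 3))

# Significance test for one pair of return series.
ct <- cor.test(returns$MS, returns$GO)
print(ct$p.value)
```

Unlike the regression slope, each entry of the correlation matrix is bounded between -1 and 1 and does not depend on which stock is treated as the response, which makes the pairs easier to compare.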