Learning Objectives

This post follows the R tutorials shared by Lizongzhang on Bilibili.
Today's topics are fitting population trend equations in R
and using fct_reorder2().

Data

The data used in this post:
[Data file used in the video: 10city population 2000-21.xls]

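Population Trend Equations

Importing the Data

The main content starts here.
First import the data (adjust the path to wherever the file lives on your machine),
then create a data frame named result to hold the computed values.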

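# Import the data
library(readxl)
data <- read_excel("D:/R语言/10city population 2000-21.xls")
# View(data)
# The rows run from 2021 down to 2000, so t = 22 is 2021 and t = 1 is 2000
data$t <- 22:1
data$year <- 2021:2000
# One row per city; columns 2 to 11 of data hold the ten cities
result <- data.frame(
  city = colnames(data)[2:11]
)
result[, c("growth_size", "growth_rate")] <- NA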

Processing the Data

Use a for loop to fit one linear trend per city.

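library(tidyverse)
for (i in 1:10) {
  result$city[i] %>% print()
  # Linear trend: regress the city's population on the time index t
  eq1 <- lm(get(result$city[i]) ~ t, data)
  eq1 %>% print()
  # Column 2 (growth_size): the slope, rounded to a whole number
  result[i, 2] <- eq1$coefficients[2] %>% round()
  # Forecast the next three years (t = 23, 24, 25)
  predict(eq1, data.frame(t = 23:25)) %>%
    round() %>% print()
}
# get() takes the column name as a string and returns the column itself
# The coefficient on t is the average annual increase in population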
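
The same slopes can be collected without get() and without a counter variable. The following is only a minimal sketch on a made-up data frame: toy, its columns A and B, and the values in it are hypothetical and are not taken from the spreadsheet.

```r
# Toy data (hypothetical values): two "cities" observed at t = 1..5
toy <- data.frame(
  t = 1:5,
  A = c(10, 12, 14, 16, 18),   # grows by exactly 2 per period
  B = c(5, 5.5, 6, 6.5, 7)     # grows by exactly 0.5 per period
)
cities <- c("A", "B")

# Fit one linear trend per column and keep only the slope on t
slopes <- sapply(cities, function(city) {
  fit <- lm(toy[[city]] ~ t, data = toy)
  unname(coef(fit)["t"])
})
print(slopes)
```

Here toy[[city]] selects the column by its name string, which is the job get() does in the loop above.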

The output is as follows:

[1] "北京市"

Call:
lm(formula = get(result$city[i]) ~ t, data = data)

Coefficients:
(Intercept)            t  
    1331.87        46.73  

   1    2    3 
2407 2453 2500 
[1] "上海市"

Call:
lm(formula = get(result$city[i]) ~ t, data = data)

Coefficients:
(Intercept)            t  
    1666.64        44.06  

   1    2    3 
2680 2724 2768 
[1] "重庆市"

Call:
lm(formula = get(result$city[i]) ~ t, data = data)

Coefficients:
(Intercept)            t  
    2703.59        22.47  

   1    2    3 
3220 3243 3265 
[1] "广州市"

Call:
lm(formula = get(result$city[i]) ~ t, data = data)

Coefficients:
(Intercept)            t  
     704.90        54.77  

   1    2    3 
1965 2019 2074 
[1] "深圳市"

Call:
lm(formula = get(result$city[i]) ~ t, data = data)

Coefficients:
(Intercept)            t  
     582.01        42.01  

   1    2    3 
1548 1590 1632 
[1] "成都市"

Call:
lm(formula = get(result$city[i]) ~ t, data = data)

Coefficients:
(Intercept)            t  
     919.10        51.75  

   1    2    3 
2109 2161 2213 
[1] "珠海市"

Call:
lm(formula = get(result$city[i]) ~ t, data = data)

Coefficients:
(Intercept)            t  
    111.615        4.471  

  1   2   3 
214 219 223 
[1] "厦门市"

Call:
lm(formula = get(result$city[i]) ~ t, data = data)

Coefficients:
(Intercept)            t  
     180.59        16.44  

  1   2   3 
559 575 592 
[1] "天津市"

Call:
lm(formula = get(result$city[i]) ~ t, data = data)

Coefficients:
(Intercept)            t  
     960.31        24.64  

   1    2    3 
1527 1552 1576 
[1] "长沙市"

Call:
lm(formula = get(result$city[i]) ~ t, data = data)

Coefficients:
(Intercept)            t  
     526.19        20.26  

   1    2    3 
 992 1012 1033

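Here the coefficient on t is the average increase in population per year.
An alternative specification takes the log of population (the response variable, not an explanatory variable),
which turns eq1 into eq2. The coefficient on t is then approximately the annual growth rate.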

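library(tidyverse)
for (i in 1:10) {
  result$city[i] %>% print()
  # Log-linear trend: regress log(population) on the time index t
  eq2 <- lm(log(get(result$city[i])) ~ t, data)
  eq2 %>% print()
  # Row i, column 3 (growth_rate): the slope, rounded to 3 decimals
  result[i, 3] <- eq2$coefficients[2] %>% round(3)
  # Forecast t = 23, 24, 25 on the log scale, then convert back
  predict(eq2, data.frame(t = 23:25)) %>%
    exp() %>% # exponentiate to undo the log
    round() %>% print()
}
# get() takes the column name as a string and returns the column itself
# The coefficient on t is approximately the annual growth rate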
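
The two loops correspond to two trend equations, written out below. The notation is introduced here for illustration and is not from the video: P_t is a city's population at time index t, and a, b, a', b' are the fitted coefficients.

```latex
\begin{aligned}
\text{eq1:}\quad & P_t = a + b\,t, & P_{t+1} - P_t &= b \\
\text{eq2:}\quad & \log P_t = a' + b'\,t, & \frac{P_{t+1}}{P_t} &= e^{b'} \approx 1 + b' \quad (\text{small } b')
\end{aligned}
```

The slope of eq1 is therefore a fixed increase per year, while the slope of eq2 approximates a fixed growth rate per year; the exact implied rate is e^{b'} - 1. Inverting the log also explains why the eq2 forecasts are passed through exp().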

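Applying fct_reorder2

Tidying the Data

Reshape the data from wide format (one column per city) to long format (one row per city and year):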

long<-data %>% 
  pivot_longer(cols=2:11,
               names_to="city",
               values_to="population")
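
To see what pivot_longer() does to the shape of the table, here is a minimal sketch on a made-up wide table. The data frame wide, its columns A and B, and its values are hypothetical and are not from the spreadsheet.

```r
library(tidyr)

# Toy wide table (hypothetical values): one column per "city"
wide <- data.frame(
  year = c(2000, 2001),
  A = c(1, 2),
  B = c(5, 6)
)

# Stack the city columns: names go to "city", values go to "population"
long_toy <- pivot_longer(wide,
                         cols = c(A, B),
                         names_to = "city",
                         values_to = "population")
print(long_toy)
```

Two rows and two city columns become four rows, each holding one year, one city, and one population value.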

Open long to inspect the reshaped data.

Visualizing the Data

Plot the reshaped data:

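long %>%
  ggplot(aes(year, population, col = city)) +
  geom_point() +
  geom_smooth(method = "lm", # fit the trend lines with lm
              se = FALSE) +
  labs(title = "Resident population, 2000-2021 (10,000 people)") # add a title
# By default the legend is ordered by city name, not by the position of the lines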

The plot is hard to read because the legend order does not match the order of the lines, so reorder the data and plot again.

Using fct_reorder2

Use fct_reorder2() to reorder the city factor, so that the legend entries follow the same order as the lines in the plot:

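long %>%
  # Order the cities by population in the latest year, largest first
  mutate(city = fct_reorder2(city, year,
                             population)) %>%
  ggplot(aes(year, population, col = city)) +
  geom_point() +
  geom_smooth(method = "lm",
              se = FALSE) +
  labs(title = "Resident population, 2000-2021 (10,000 people)")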

The legend entries now appear in the same order as the lines in the plot.
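
The reordering rule is easiest to see on a tiny example. The sketch below uses made-up data: toy, the city labels A, B, C, and the values are hypothetical. With its default settings, fct_reorder2(f, x, y) orders the levels of f by the y value found at the largest x, in descending order.

```r
library(forcats)

# Toy data (hypothetical values): three "cities", two years each
toy <- data.frame(
  city = factor(c("A", "A", "B", "B", "C", "C")),
  year = c(2000, 2001, 2000, 2001, 2000, 2001),
  population = c(1, 2, 5, 6, 3, 4)
)

# Default level order is alphabetical
default_levels <- levels(toy$city)
print(default_levels)

# Reorder by population in the latest year (2001), largest first
reordered <- fct_reorder2(toy$city, toy$year, toy$population)
print(levels(reordered))
```

In 2001 the populations are A = 2, B = 6 and C = 4, so the levels become B, C, A, which is also the top-to-bottom order of the lines at the right edge of a plot.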

Removing the Points

To keep the lines and drop the points, use geom_line():

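long %>%
  mutate(city = fct_reorder2(city, year,
                             population)) %>%
  ggplot(aes(year, population, col = city)) +
  # geom_line() connects the observed values year by year (no fitted trend)
  geom_line() +
  labs(title = "Resident population, 2000-2021 (10,000 people)")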