Chapter 8 R语言ggplot2热图

8.1 pheatmap热图

R语言里做热图最快捷的方式是用pheatmap这个R包,优点是用非常少的代码就可以出一个比较好看的图，缺点是细节修改不是很方便，比如要用热图展示基因表达量的数据，准备数据的格式如下

pheatmap不是R语言自带的R包，第一次使用需要先安装,安装直接使用命令install.packages("pheatmap")

读取数据

library(readxl)
dat01<-read_excel("example_data/06-lineplot/dat08.xlsx")

这里需要注意的一个点是热图数据通常需要把第一列的基因名作为整个数据的行名，但是读取excel的函数好像没有指定列为行名的函数，当然可以将数据集读取进来以后再进行转换，另外一种方式就是把数据另存为csv格式，然后用读取csv格式数据的函数

这里需要注意的一点是转换为csv格式的时候选择如下截图中的2，不要选择1

读取csv格式数据

dat01<-read.delim(file = "example_data/08-heatmap/01pheatmap_example.csv",
                  header=TRUE,
                  sep=",",
                  row.names = 1)

作图代码

dat01<-read.delim(file = "example_data/08-heatmap/01pheatmap_example.csv",
                  header=TRUE,
                  sep=",",
                  row.names = 1)
library(pheatmap)
pheatmap(dat01)

如果要在色块上添加文本，再单独准备一个和热图数据格式一样的数据，然后用display_numbers参数添加文本,这里我就直接使用热图的数据

dat01<-read.delim(file = "example_data/08-heatmap/01pheatmap_example.csv",
                  header=TRUE,
                  sep=",",
                  row.names = 1)
library(pheatmap)
pheatmap(dat01)

pheatmap(dat01,
         display_numbers = dat01)

以上是对pheatmap这个R包的简要介绍

ggplot2也有直接做热图的函数 geom_tile(),ggplot2做热图可能代码稍微繁琐，但是优点是细节调整方便，基本上所有的细节都可以用代码来调整

ggplot2做热图还需要掌握的一个知识点是长格式数据和宽格式数据，ggplot2作图的输入数据都是长格式数据，长格式数据如下，一列x，一列y，还有一个数据

宽格式数据截图如下

这个长宽格式转化是ggplot2作图必须理解的一个概念

R语言里提供了长宽格式数据互相转化的函数，这里我以tidyverse这个R包里的函数作为介绍，tidyverse主要是用来在数据处理的，也不是R语言自带的R包，需要运行安装命令install.packages("tidyverse")

宽格式数据转换为长格式用到的函数是pivot_longer()

转换代码

library(readxl)
dat01<-read_excel("example_data/08-heatmap/02_wide_data.xlsx")
head(dat01)

## # A tibble: 2 x 3
##   gene_name sample1 sample2
##   <chr>       <dbl>   <dbl>
## 1 gene1           4       3
## 2 gene2           5       4

library(tidyverse)

dat01 %>% 
  pivot_longer(-gene_name,names_to = "A",values_to = "B")

## # A tibble: 4 x 3
##   gene_name A           B
##   <chr>     <chr>   <dbl>
## 1 gene1     sample1     4
## 2 gene1     sample2     3
## 3 gene2     sample1     5
## 4 gene2     sample2     4

长格式转换为宽格式的函数是pivot_wider()

转换代码

library(readxl)
dat01<-read_excel("example_data/08-heatmap/02_long_data.xlsx")
head(dat01)

## # A tibble: 4 x 3
##   x     y       value
##   <chr> <chr>   <dbl>
## 1 gene1 sample1     3
## 2 gene1 sample2     4
## 3 gene2 sample1     5
## 4 gene2 sample2     6

library(tidyverse)

dat01 %>% 
  pivot_wider(names_from = y,values_from = value)

## # A tibble: 2 x 3
##   x     sample1 sample2
##   <chr>   <dbl>   <dbl>
## 1 gene1       3       4
## 2 gene2       5       6

这个是最基本的长宽格式数据转换，如果数据集有很多列，有时候转换会相对比较复杂，这里就不做介绍，因为我也搞不懂有时候

8.2 ggplot2热图

以下介绍ggplot2做热图的代码都是假设已经拿到了长格式数据

示例数据如下

最基本的热图代码

library(readxl)
dat01<-read_excel("example_data/08-heatmap/03_heatmap_example.xlsx")
head(dat01)

## # A tibble: 6 x 3
##   gene_name Sample   expr_value
##   <chr>     <chr>         <dbl>
## 1 gene_1    sample_1          1
## 2 gene_1    sample_2          5
## 3 gene_1    sample_3          8
## 4 gene_1    sample_4          4
## 5 gene_1    sample_5          2
## 6 gene_1    sample_6          5

library(ggplot2)

ggplot(data=dat01,aes(x=gene_name,y=Sample))+
  geom_tile(aes(fill=expr_value),color="red")

ggplot(data=dat01,aes(x=gene_name,y=Sample))+
  geom_tile(aes(fill=expr_value),color=NA)

热图经常遇到的操作是调整坐标轴的顺序，这个可以通过赋予因子水平来实现

library(readxl)
dat01<-read_excel("example_data/08-heatmap/03_heatmap_example.xlsx")
head(dat01)

## # A tibble: 6 x 3
##   gene_name Sample   expr_value
##   <chr>     <chr>         <dbl>
## 1 gene_1    sample_1          1
## 2 gene_1    sample_2          5
## 3 gene_1    sample_3          8
## 4 gene_1    sample_4          4
## 5 gene_1    sample_5          2
## 6 gene_1    sample_6          5

dat01$gene_name<-factor(dat01$gene_name,
                        levels = c("gene_1","gene_2","gene_3",
                                   "gene_4","gene_5","gene_6",
                                   "gene_7","gene_8","gene_9",
                                   "gene_10"))

library(ggplot2)

ggplot(data=dat01,aes(x=gene_name,y=Sample))+
  geom_tile(aes(fill=expr_value),color="red")

dat01$Sample<-factor(dat01$Sample,
                     levels = c("sample_2","sample_4","sample_6",
                                "sample_1","sample_3","sample_5"))

ggplot(data=dat01,aes(x=gene_name,y=Sample))+
  geom_tile(aes(fill=expr_value),color="red")

更改热图色块填充的颜色

更改热图填充颜色有很多种方式，这里我介绍我自己最常用的一种方式

参考链接

https://r-charts.com/

这里用到额外的一个R包 paletteer

https://github.com/EmilHvitfeldt/paletteer

library(readxl)
dat01<-read_excel("example_data/08-heatmap/03_heatmap_example.xlsx")
head(dat01)

## # A tibble: 6 x 3
##   gene_name Sample   expr_value
##   <chr>     <chr>         <dbl>
## 1 gene_1    sample_1          1
## 2 gene_1    sample_2          5
## 3 gene_1    sample_3          8
## 4 gene_1    sample_4          4
## 5 gene_1    sample_5          2
## 6 gene_1    sample_6          5

dat01$gene_name<-factor(dat01$gene_name,
                        levels = c("gene_1","gene_2","gene_3",
                                   "gene_4","gene_5","gene_6",
                                   "gene_7","gene_8","gene_9",
                                   "gene_10"))

library(ggplot2)

ggplot(data=dat01,aes(x=gene_name,y=Sample))+
  geom_tile(aes(fill=expr_value),color="red")

library(paletteer)

## Warning: package 'paletteer' was built under R version 4.0.5

ggplot(data=dat01,aes(x=gene_name,y=Sample))+
  geom_tile(aes(fill=expr_value),color="red")+
  scale_fill_paletteer_c("ggthemes::Classic Orange-White-Blue")

ggplot(data=dat01,aes(x=gene_name,y=Sample))+
  geom_tile(aes(fill=expr_value),color="red")+
  scale_fill_paletteer_c("ggthemes::Classic Orange-White-Blue",
                         direction = -1)

调整坐标轴文本标签的位置，y轴左右，x轴是上下

library(readxl)
dat01<-read_excel("example_data/08-heatmap/03_heatmap_example.xlsx")
head(dat01)

## # A tibble: 6 x 3
##   gene_name Sample   expr_value
##   <chr>     <chr>         <dbl>
## 1 gene_1    sample_1          1
## 2 gene_1    sample_2          5
## 3 gene_1    sample_3          8
## 4 gene_1    sample_4          4
## 5 gene_1    sample_5          2
## 6 gene_1    sample_6          5

dat01$gene_name<-factor(dat01$gene_name,
                        levels = c("gene_1","gene_2","gene_3",
                                   "gene_4","gene_5","gene_6",
                                   "gene_7","gene_8","gene_9",
                                   "gene_10"))

library(ggplot2)

library(paletteer)

ggplot(data=dat01,aes(x=gene_name,y=Sample))+
  geom_tile(aes(fill=expr_value),color="red")+
  scale_fill_paletteer_c("ggthemes::Classic Orange-White-Blue",
                         direction = -1)+
  scale_x_discrete(position = "top")+
  scale_y_discrete(position = "right")

调整坐标轴的文本方向

library(readxl)
dat01<-read_excel("example_data/08-heatmap/03_heatmap_example.xlsx")
head(dat01)

## # A tibble: 6 x 3
##   gene_name Sample   expr_value
##   <chr>     <chr>         <dbl>
## 1 gene_1    sample_1          1
## 2 gene_1    sample_2          5
## 3 gene_1    sample_3          8
## 4 gene_1    sample_4          4
## 5 gene_1    sample_5          2
## 6 gene_1    sample_6          5

dat01$gene_name<-factor(dat01$gene_name,
                        levels = c("gene_1","gene_2","gene_3",
                                   "gene_4","gene_5","gene_6",
                                   "gene_7","gene_8","gene_9",
                                   "gene_10"))

library(ggplot2)

library(paletteer)

ggplot(data=dat01,aes(x=gene_name,y=Sample))+
  geom_tile(aes(fill=expr_value),color="red")+
  scale_fill_paletteer_c("ggthemes::Classic Orange-White-Blue",
                         direction = -1)+
  scale_x_discrete(position = "top")+
  scale_y_discrete(position = "right")+
  theme(axis.text.x = element_text(angle=60))

ggplot(data=dat01,aes(x=gene_name,y=Sample))+
  geom_tile(aes(fill=expr_value),color="red")+
  scale_fill_paletteer_c("ggthemes::Classic Orange-White-Blue",
                         direction = -1)+
  scale_x_discrete(position = "top")+
  scale_y_discrete(position = "right")+
  theme(axis.text.x = element_text(angle=60,hjust=0,vjust=1))

去掉整个的灰色背景和坐标轴的小短线

library(readxl)
dat01<-read_excel("example_data/08-heatmap/03_heatmap_example.xlsx")
head(dat01)

## # A tibble: 6 x 3
##   gene_name Sample   expr_value
##   <chr>     <chr>         <dbl>
## 1 gene_1    sample_1          1
## 2 gene_1    sample_2          5
## 3 gene_1    sample_3          8
## 4 gene_1    sample_4          4
## 5 gene_1    sample_5          2
## 6 gene_1    sample_6          5

dat01$gene_name<-factor(dat01$gene_name,
                        levels = c("gene_1","gene_2","gene_3",
                                   "gene_4","gene_5","gene_6",
                                   "gene_7","gene_8","gene_9",
                                   "gene_10"))

library(ggplot2)

library(paletteer)

ggplot(data=dat01,aes(x=gene_name,y=Sample))+
  geom_tile(aes(fill=expr_value),color="red")+
  scale_fill_paletteer_c("ggthemes::Classic Orange-White-Blue",
                         direction = -1)+
  scale_x_discrete(position = "top")+
  scale_y_discrete(position = "right")+
  theme(axis.text.x = element_text(angle=60,hjust=0,vjust=1),
        axis.ticks = element_blank(),
        panel.background = element_blank())

如果看起来坐标轴的文本距离图比较远的话还可以调整

library(readxl)
dat01<-read_excel("example_data/08-heatmap/03_heatmap_example.xlsx")
head(dat01)

## # A tibble: 6 x 3
##   gene_name Sample   expr_value
##   <chr>     <chr>         <dbl>
## 1 gene_1    sample_1          1
## 2 gene_1    sample_2          5
## 3 gene_1    sample_3          8
## 4 gene_1    sample_4          4
## 5 gene_1    sample_5          2
## 6 gene_1    sample_6          5

dat01$gene_name<-factor(dat01$gene_name,
                        levels = c("gene_1","gene_2","gene_3",
                                   "gene_4","gene_5","gene_6",
                                   "gene_7","gene_8","gene_9",
                                   "gene_10"))

library(ggplot2)

library(paletteer)

ggplot(data=dat01,aes(x=gene_name,y=Sample))+
  geom_tile(aes(fill=expr_value),color="red")+
  scale_fill_paletteer_c("ggthemes::Classic Orange-White-Blue",
                         direction = -1)+
  scale_x_discrete(position = "top",
                   expand = expansion(mult = c(0,0)))+
  scale_y_discrete(position = "right",
                   expand = expansion(mult = c(0,0)))+
  theme(axis.text.x = element_text(angle=60,hjust=0,vjust=1),
        axis.ticks = element_blank(),
        panel.background = element_blank())

在色块上添加文本，用geom_text()函数来添加

library(readxl)
dat01<-read_excel("example_data/08-heatmap/03_heatmap_example.xlsx")
head(dat01)

## # A tibble: 6 x 3
##   gene_name Sample   expr_value
##   <chr>     <chr>         <dbl>
## 1 gene_1    sample_1          1
## 2 gene_1    sample_2          5
## 3 gene_1    sample_3          8
## 4 gene_1    sample_4          4
## 5 gene_1    sample_5          2
## 6 gene_1    sample_6          5

dat01$gene_name<-factor(dat01$gene_name,
                        levels = c("gene_1","gene_2","gene_3",
                                   "gene_4","gene_5","gene_6",
                                   "gene_7","gene_8","gene_9",
                                   "gene_10"))

library(ggplot2)

library(paletteer)

ggplot(data=dat01,aes(x=gene_name,y=Sample))+
  geom_tile(aes(fill=expr_value),color="red")+
  scale_fill_paletteer_c("ggthemes::Classic Orange-White-Blue",
                         direction = -1)+
  scale_x_discrete(position = "top",
                   expand = expansion(mult = c(0,0)))+
  scale_y_discrete(position = "right",
                   expand = expansion(mult = c(0,0)))+
  theme(axis.text.x = element_text(angle=60,hjust=0,vjust=1),
        axis.ticks = element_blank(),
        panel.background = element_blank())+
  geom_text(aes(label=expr_value))

调整图例的细节

参考公众号推文 ggplot2画热图展示相关系数的简单小例子

截断和标签是在scale_fill函数里设置breaks和labels

图例的位置是在主题里进行设置

其他一些细节在guides函数里设置

library(readxl)
dat01<-read_excel("example_data/08-heatmap/03_heatmap_example.xlsx")
head(dat01)

## # A tibble: 6 x 3
##   gene_name Sample   expr_value
##   <chr>     <chr>         <dbl>
## 1 gene_1    sample_1          1
## 2 gene_1    sample_2          5
## 3 gene_1    sample_3          8
## 4 gene_1    sample_4          4
## 5 gene_1    sample_5          2
## 6 gene_1    sample_6          5

dat01$gene_name<-factor(dat01$gene_name,
                        levels = c("gene_1","gene_2","gene_3",
                                   "gene_4","gene_5","gene_6",
                                   "gene_7","gene_8","gene_9",
                                   "gene_10"))

library(ggplot2)

library(paletteer)

ggplot(data=dat01,aes(x=gene_name,y=Sample))+
  geom_tile(aes(fill=expr_value),color="red")+
  scale_fill_paletteer_c("ggthemes::Classic Orange-White-Blue",
                         direction = -1,
                         breaks=c(1.1,3,5,7,9,9.9),
                         labels=c(1,3,5,7,9,10))+
  scale_x_discrete(position = "top",
                   expand = expansion(mult = c(0,0)))+
  scale_y_discrete(position = "right",
                   expand = expansion(mult = c(0,0)))+
  theme(axis.text.x = element_text(angle=60,hjust=0,vjust=1),
        axis.ticks = element_blank(),
        panel.background = element_blank())+
  geom_text(aes(label=expr_value))+
  theme(legend.position = "top")+
  guides(fill=guide_colorbar(title = "AAA",
                             title.position = "top",
                             title.hjust = 0.5,
                             barwidth = 20,
                             ticks = FALSE,
                             label = TRUE))

以上介绍的用来填充颜色的数据是连续型的，数据是离散的也是可以的，比如只关心某个基因在样本中是否表达，并不关心这个基因的表达量高低,示例数据如下

这里A代表基因表达B代表基因不表达，这个AB可以用任意字符代替

library(readxl)
dat01<-read_excel("example_data/08-heatmap/04_heatmap_example.xlsx")
head(dat01)

## # A tibble: 6 x 3
##   gene_name Sample   expr 
##   <chr>     <chr>    <chr>
## 1 gene_1    sample_1 B    
## 2 gene_1    sample_2 B    
## 3 gene_1    sample_3 B    
## 4 gene_1    sample_4 A    
## 5 gene_1    sample_5 A    
## 6 gene_1    sample_6 A

dat01$gene_name<-factor(dat01$gene_name,
                        levels = c("gene_1","gene_2","gene_3",
                                   "gene_4","gene_5","gene_6",
                                   "gene_7","gene_8","gene_9",
                                   "gene_10"))

library(ggplot2)


ggplot(data=dat01,aes(x=gene_name,y=Sample))+
  geom_tile(aes(fill=expr),color="blue")

ggplot(data=dat01,aes(x=gene_name,y=Sample))+
  geom_tile(aes(fill=expr),color="blue")+
  scale_fill_manual(values = c("A"="#fe0904",
                               "B"='#f9b54f'))

ggplot(data=dat01,aes(x=gene_name,y=Sample))+
  geom_tile(aes(fill=expr),color="blue")+
  scale_fill_manual(values = c("A"="#fe0904",
                               "B"='white'))+
  theme_bw()

8.3 ggplot2气泡热图图

如果x 和 y都是离散的，把热图函数geom_tile()换成geom_point()函数，然后用表达量的值映射点的大小同时映射颜色也可以归为热图的一种

比如

library(readxl)
dat01<-read_excel("example_data/08-heatmap/03_heatmap_example.xlsx")
head(dat01)

## # A tibble: 6 x 3
##   gene_name Sample   expr_value
##   <chr>     <chr>         <dbl>
## 1 gene_1    sample_1          1
## 2 gene_1    sample_2          5
## 3 gene_1    sample_3          8
## 4 gene_1    sample_4          4
## 5 gene_1    sample_5          2
## 6 gene_1    sample_6          5

dat01$gene_name<-factor(dat01$gene_name,
                        levels = c("gene_1","gene_2","gene_3",
                                   "gene_4","gene_5","gene_6",
                                   "gene_7","gene_8","gene_9",
                                   "gene_10"))

library(ggplot2)


ggplot(data=dat01,aes(x=gene_name,y=Sample))+
  geom_point(aes(fill=expr_value,
                 size=expr_value),color="blue",shape=21)

更改默认的配色和点的大小

library(readxl)
dat01<-read_excel("example_data/08-heatmap/03_heatmap_example.xlsx")
head(dat01)

## # A tibble: 6 x 3
##   gene_name Sample   expr_value
##   <chr>     <chr>         <dbl>
## 1 gene_1    sample_1          1
## 2 gene_1    sample_2          5
## 3 gene_1    sample_3          8
## 4 gene_1    sample_4          4
## 5 gene_1    sample_5          2
## 6 gene_1    sample_6          5

dat01$gene_name<-factor(dat01$gene_name,
                        levels = c("gene_1","gene_2","gene_3",
                                   "gene_4","gene_5","gene_6",
                                   "gene_7","gene_8","gene_9",
                                   "gene_10"))

library(ggplot2)
library(paletteer)

ggplot(data=dat01,aes(x=gene_name,y=Sample))+
  geom_point(aes(fill=expr_value,
                 size=expr_value),color="blue",shape=21)+
  scale_fill_paletteer_c("ggthemes::Classic Orange-White-Blue",
                         direction = -1)+
  scale_size_continuous(range = c(1,20),
                        guide = guide_legend(override.aes = list(size = c(1,2,3,5)) ))

8.4 一个完整的例子

来源 https://www.royfrancis.com/a-guide-to-elegant-tiled-heatmaps-in-r-2019/

library(reshape2)

## 
## Attaching package: 'reshape2'

## The following object is masked from 'package:tidyr':
## 
##     smiths

library(ggplot2)
m <- read.csv("example_data/08-heatmap/measles_lev1.txt",
              header = T,
              stringsAsFactors = F,
              skip = 2)
m2 <- melt(m,id.vars = c("YEAR", "WEEK"))
#rename column names
colnames(m2) <- c("Year", "Week", "State", "Value")

#inspect data
head(m2)

##   Year Week   State Value
## 1 1928    1 ALABAMA  3.67
## 2 1928    2 ALABAMA  6.25
## 3 1928    3 ALABAMA  7.95
## 4 1928    4 ALABAMA 12.58
## 5 1928    5 ALABAMA  8.03
## 6 1928    6 ALABAMA  7.27

table(m2$State)

## 
##              ALABAMA               ALASKA              ARIZONA             ARKANSAS 
##                 3952                 3952                 3952                 3952 
##           CALIFORNIA             COLORADO          CONNECTICUT             DELAWARE 
##                 3952                 3952                 3952                 3952 
## DISTRICT.OF.COLUMBIA              FLORIDA              GEORGIA               HAWAII 
##                 3952                 3952                 3952                 3952 
##                IDAHO             ILLINOIS              INDIANA                 IOWA 
##                 3952                 3952                 3952                 3952 
##               KANSAS             KENTUCKY            LOUISIANA                MAINE 
##                 3952                 3952                 3952                 3952 
##             MARYLAND        MASSACHUSETTS             MICHIGAN            MINNESOTA 
##                 3952                 3952                 3952                 3952 
##          MISSISSIPPI             MISSOURI              MONTANA             NEBRASKA 
##                 3952                 3952                 3952                 3952 
##               NEVADA        NEW.HAMPSHIRE           NEW.JERSEY           NEW.MEXICO 
##                 3952                 3952                 3952                 3952 
##             NEW.YORK       NORTH.CAROLINA         NORTH.DAKOTA                 OHIO 
##                 3952                 3952                 3952                 3952 
##             OKLAHOMA               OREGON         PENNSYLVANIA         RHODE.ISLAND 
##                 3952                 3952                 3952                 3952 
##       SOUTH.CAROLINA         SOUTH.DAKOTA            TENNESSEE                TEXAS 
##                 3952                 3952                 3952                 3952 
##                 UTAH              VERMONT             VIRGINIA           WASHINGTON 
##                 3952                 3952                 3952                 3952 
##        WEST.VIRGINIA            WISCONSIN              WYOMING 
##                 3952                 3952                 3952

camelCase <- function(string=NULL,separator="\\."){
  if(is.null(string)) 
    stop("No input string.n")
  s <- strsplit(string, separator)
  s <- tolower(s[[1]])
  paste(toupper(substring(s, 1, 1)), substring(s, 2),sep = "", collapse = " ")
}


#change variable to character, convert to camel case,
#remove dot and change variable back to factor
m2$State <- factor(as.character(sapply(as.character(m2$State),camelCase)))
m2$Year <- factor(m2$Year)
m2$Week <- factor(m2$Week)
#also converts '-' to NA
m2$Value <- as.numeric(m2$Value)

## Warning: NAs introduced by coercion

naSum <- function(x)
{
  if(all(is.na(x))) val <- sum(x,na.rm=F)
  if(!all(is.na(x))) val <- sum(x,na.rm=T)
  return(val)
}

#sums incidences for all weeks into one year
m3 <- plyr::ddply(m2,c("Year","State"),Incidence=round(naSum(Value),0),plyr::summarise)

#inspect data
head(m3)

##   Year      State Incidence
## 1 1928    Alabama       335
## 2 1928     Alaska        NA
## 3 1928    Arizona       201
## 4 1928   Arkansas       482
## 5 1928 California        69
## 6 1928   Colorado       207

str(m3)

## 'data.frame':    3876 obs. of  3 variables:
##  $ Year     : Factor w/ 76 levels "1928","1929",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ State    : Factor w/ 51 levels "Alabama","Alaska",..: 1 2 3 4 5 6 7 8 9 10 ...
##  $ Incidence: num  335 NA 201 482 69 207 635 256 536 120 ...

m3$State <- factor(as.character(m3$State),levels=rev(levels(m3$State)))

#create a new variable from incidence
m3$IncidenceFactor <- cut(m3$Incidence,
                          breaks = c(-1,0,1,10,100,500,1000,max(m3$Incidence,na.rm=T)),
                          labels=c("0","0-1","1-10","10-100","100-500","500-1000",">1000"))

#change level order
m3$IncidenceFactor <- factor(as.character(m3$IncidenceFactor),
                             levels=rev(levels(m3$IncidenceFactor)))
textcol <- "grey40"

#modified ggplot
p <- ggplot(m3,aes(x=Year,y=State,fill=IncidenceFactor))+
  geom_tile()+
  #redrawing tiles to remove cross lines from legend
  geom_tile(colour="white",size=0.25, show_guide=FALSE)+
  #remove axis labels, add title
  labs(x="",y="",title="Incidence of Measles in the US")+
  #remove extra space
  scale_y_discrete(expand=c(0,0))+
  #custom breaks on x-axis
  scale_x_discrete(expand=c(0,0),
                   breaks=c("1930","1940","1950","1960","1970","1980","1990","2000"))+
  #custom colours for cut levels and na values
  scale_fill_manual(values=c("#d53e4f","#f46d43","#fdae61",
                             "#fee08b","#e6f598","#abdda4","#ddf1da"),na.value="grey90")+
  #mark year of vaccination
  geom_vline(aes(xintercept = 36),size=3.4,alpha=0.24)+
  #equal aspect ratio x and y axis
  #coord_fixed()+
  #set base size for all font elements
  theme_grey(base_size=10)+
  #theme options
  theme(
    #remove legend title
    legend.title=element_blank(),
    #remove legend margin
    legend.margin = grid::unit(0,"cm"),
    #change legend text properties
    legend.text=element_text(colour=textcol,size=7,face="bold"),
    #change legend key height
    legend.key.height=grid::unit(0.8,"cm"),
    #set a slim legend
    legend.key.width=grid::unit(0.2,"cm"),
    #set x axis text size and colour
    axis.text.x=element_text(size=10,colour=textcol),
    #set y axis text colour and adjust vertical justification
    axis.text.y=element_text(vjust = 0.2,colour=textcol,size=4),
    #change axis ticks thickness
    axis.ticks=element_line(size=0.4),
    #change title font, size, colour and justification
    plot.title=element_text(colour=textcol,hjust=0,size=14,face="bold"),
    #remove plot background
    plot.background=element_blank(),
    #remove plot border
    panel.border=element_blank())

## Warning: `show_guide` has been deprecated. Please use `show.legend` instead.

## Warning: `legend.margin` must be specified using `margin()`. For the old behavior use
## legend.spacing

library(RColorBrewer)

## Warning: package 'RColorBrewer' was built under R version 4.0.5

#change the scale_fill_manual from previous code to below
p+
  scale_fill_manual(values=rev(brewer.pal(7,"YlGnBu")),
                    na.value="grey90")

## Scale for 'fill' is already present. Adding another scale for 'fill', which will replace
## the existing scale.