R:使用R连接数据库处理数据

1.数据库连接

library(DBI)

library(dplyr)

library(dbplyr)

library(odbc)

con <- dbConnect(odbc::odbc(), "Oracle DB")

2.使用DBI查询数据库

dbGetQuery(con,'

select "month_idx", "year", "month",

sum(case when "term_deposit" = 'yes' then 1.0 else 0.0 end) as subscribe,

count(*) as total

from "bank"

group by "month_idx", "year", "month"

')

3.使用 dplyr 语法查询

q1 <- tbl(con, "bank") %>%
  group_by(month_idx, year, month) %>%
  summarise(
    subscribe = sum(ifelse(term_deposit == "yes", 1, 0)),
    total = n())
show_query(q1)

4.R Notebooks查询

“`{sql, connection=con, output.var = “mydataframe”}

SELECT "month_idx", "year", "month", SUM(CASE WHEN ("term_deposit" = 'yes') THEN (1.0) ELSE (0.0) END) AS "subscribe",
COUNT(*) AS "total"
FROM ("bank") 
GROUP BY "month_idx", "year", "month"
```
```{r}
library(ggplot2)
ggplot(mydataframe, aes(total, subscribe, color = year)) +
  geom_point() +
  xlab("Total contacts") +
  ylab("Term Deposit Subscriptions") +
  ggtitle("Contact volume")
```
bankggplot

5.三种查询方式比较

方法 比较
  1. DBI::dbGetQuery
  • Fewer dependencies required
  1. dplyr syntax
  • Use the same syntax for R and database objects
  • No knowledge of SQL required
  • Code is standard across SQL variants
  • Lazy evaluation
  1. R Notebook SQL engine
  • Copy and paste SQL – no formatting required
  • SQL syntax is highlighted

如若转载,请注明出处:https://www.ouq.net/1317.html

(0)
打赏 微信打赏,为服务器增加50M流量 微信打赏,为服务器增加50M流量 支付宝打赏,为服务器增加50M流量 支付宝打赏,为服务器增加50M流量
上一篇 03/05/2022 21:08
下一篇 03/06/2022

相关推荐