为什么要使用to_sql方法
表结构如下:
1 2 3 4 5 6 7 8 |
CREATE TABLE `my_balance` ( `id` int(11) NOT NULL AUTO_INCREMENT, `balance` decimal(20,4) NOT NULL COMMENT '余额', `account` varchar(30) DEFAULT NULL COMMENT '账户', `charges` decimal(10,4) NOT NULL COMMENT '交易手续费/每笔', PRIMARY KEY (`id`) ) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=utf8 COMMENT='账户表' |
- 1
- 2
- 3
- 4
- 5
- 6
- 7
现在要向my_balance
表中插入一下数据,下面我们来看三段代码:
- 普通pymysql sql insert 写法(强烈不推介这种写法)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
<span class="token keyword">import</span> pymysql<span class="token punctuation">.</span>cursors <span class="token comment"># 建立数据库连接</span> db <span class="token operator">=</span> pymysql<span class="token punctuation">.</span>connect<span class="token punctuation">(</span>host<span class="token operator">=</span><span class="token string">'127.0.0.1'</span><span class="token punctuation">,</span> user<span class="token operator">=</span><span class="token string">'root'</span><span class="token punctuation">,</span> passwd<span class="token operator">=</span><span class="token string">'root'</span><span class="token punctuation">,</span> db<span class="token operator">=</span><span class="token string">'tushare'</span><span class="token punctuation">,</span> charset<span class="token operator">=</span><span class="token string">'utf8'</span><span class="token punctuation">)</span> balance <span class="token operator">=</span> <span class="token number">11500.0000</span> account <span class="token operator">=</span> <span class="token number">8888</span> charges <span class="token operator">=</span> <span class="token number">0.0005</span> <span class="token comment"># insert 语句</span> sql <span class="token operator">=</span> f<span class="token string">"insert into `tushare`.`my_balance` ( `balance`, `account`, `charges`) values ( '{balance}', '{account}', '{charges}')"</span> <span class="token comment"># 执行新增或更新数据操作,返回受影响的行数</span> cursor <span class="token operator">=</span> db<span class="token punctuation">.</span>cursor<span class="token punctuation">(</span><span class="token punctuation">)</span> res_row <span class="token operator">=</span> cursor<span class="token punctuation">.</span>execute<span class="token punctuation">(</span>sql<span class="token punctuation">)</span> <span class="token comment"># 提交事务</span> db<span class="token punctuation">.</span>commit<span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token comment"># 关闭链接</span> db<span class="token punctuation">.</span>close<span class="token punctuation">(</span><span class="token punctuation">)</span> |
- 1
- 2
- 3
- 4
- 5
- 6
- 7
- 8
- 9
- 10
- 11
- 12
- 13
- 14
- 15
- 16
- 优点:代码比较直观易懂,适合初学者读
- 缺点:每次一个
insert
语句都要复制一堆代码,代码复用性为0.
- pandas拼接insert写法(这种写法有BUG)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 |
<span class="token keyword">import</span> pandas <span class="token keyword">as</span> pd <span class="token keyword">import</span> pymysql<span class="token punctuation">.</span>cursors <span class="token keyword">def</span> <span class="token function">init_db</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">:</span> <span class="token comment"># 建立数据库连接</span> db <span class="token operator">=</span> pymysql<span class="token punctuation">.</span>connect<span class="token punctuation">(</span>host<span class="token operator">=</span><span class="token string">'127.0.0.1'</span><span class="token punctuation">,</span> user<span class="token operator">=</span><span class="token string">'root'</span><span class="token punctuation">,</span> passwd<span class="token operator">=</span><span class="token string">'root'</span><span class="token punctuation">,</span> db<span class="token operator">=</span><span class="token string">'tushare'</span><span class="token punctuation">,</span> charset<span class="token operator">=</span><span class="token string">'utf8'</span><span class="token punctuation">)</span> <span class="token keyword">return</span> db <span class="token keyword">def</span> <span class="token function">insert_data_pandasType</span><span class="token punctuation">(</span>table_name<span class="token punctuation">,</span> valuses<span class="token punctuation">,</span> db<span class="token punctuation">)</span><span class="token punctuation">:</span> <span class="token triple-quoted-string string">''' 向数据库中新增数据 table_name:库名 valuses:字段名称和对应数据,组成的数组 '''</span> pop <span class="token operator">=</span> pd<span class="token punctuation">.</span>Series<span class="token punctuation">(</span>valuses<span class="token punctuation">)</span> into <span class="token operator">=</span> <span class="token string">','</span><span class="token punctuation">.</span>join<span class="token punctuation">(</span>pop<span class="token punctuation">.</span>keys<span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span>values<span class="token punctuation">)</span> val <span class="token operator">=</span> <span class="token string">','</span><span class="token punctuation">.</span>join<span class="token punctuation">(</span><span class="token builtin">str</span><span class="token punctuation">(</span>v<span class="token punctuation">)</span> <span class="token keyword">for</span> v <span class="token keyword">in</span> pop<span class="token punctuation">.</span>values<span class="token punctuation">)</span> sql <span class="token operator">=</span> f<span class="token string">"insert into {table_name}({into}) values({val})"</span> cursor <span class="token operator">=</span> db<span class="token punctuation">.</span>cursor<span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">return</span> cursor<span class="token punctuation">.</span>execute<span class="token punctuation">(</span>sql<span class="token punctuation">)</span> <span class="token comment"># 执行SQL</span> db <span class="token operator">=</span> init_db<span class="token punctuation">(</span><span class="token punctuation">)</span> values <span class="token operator">=</span> <span class="token punctuation">{</span><span class="token string">'balance'</span><span class="token punctuation">:</span> <span class="token number">11400.0000</span><span class="token punctuation">,</span> <span class="token string">'account'</span><span class="token punctuation">:</span> <span class="token string">'8866.ZH'</span><span class="token punctuation">,</span> <span class="token string">'charges'</span><span class="token punctuation">:</span> <span class="token number">0.0005</span><span class="token punctuation">}</span> insert_data_pandasType<span class="token punctuation">(</span><span class="token string">'my_balance'</span><span class="token punctuation">,</span> values<span class="token punctuation">,</span> db<span class="token punctuation">)</span> db<span class="token punctuation">.</span>commit<span class="token punctuation">(</span><span class="token punctuation">)</span> db<span class="token punctuation">.</span>close<span class="token punctuation">(</span><span class="token punctuation">)</span> |
- 1
- 2
- 3
- 4
- 5
- 6
- 7
- 8
- 9
- 10
- 11
- 12
- 13
- 14
- 15
- 16
- 17
- 18
- 19
- 20
- 21
- 22
- 23
- 24
- 25
- 26
- 27
- 28
- 29
- 30
'account': '8866.ZH'
问题出在这里,错误如下:
You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ‘ZH,0.0005)
所以只能存储没带标点符号的字符串~!
so,优缺点就不说了,这BUG太明显,压根不能用~!
- to_sql写法
1 2 3 4 5 6 7 8 9 |
<span class="token keyword">import</span> pandas <span class="token keyword">as</span> pd <span class="token keyword">from</span> sqlalchemy <span class="token keyword">import</span> create_engine engine <span class="token operator">=</span> create_engine<span class="token punctuation">(</span> <span class="token string">"mysql+pymysql://root:root@localhost:3306/tushare?charset=utf8"</span><span class="token punctuation">)</span> df <span class="token operator">=</span> pd<span class="token punctuation">.</span>DataFrame<span class="token punctuation">(</span><span class="token punctuation">{</span><span class="token string">'balance'</span><span class="token punctuation">:</span> <span class="token punctuation">[</span><span class="token number">10000.000</span><span class="token punctuation">]</span><span class="token punctuation">,</span> <span class="token string">'account'</span><span class="token punctuation">:</span> <span class="token punctuation">[</span><span class="token string">'13579.SH'</span><span class="token punctuation">]</span><span class="token punctuation">,</span> <span class="token string">'charges'</span><span class="token punctuation">:</span> <span class="token punctuation">[</span><span class="token number">0.0005</span><span class="token punctuation">]</span><span class="token punctuation">}</span><span class="token punctuation">)</span> df<span class="token punctuation">.</span>to_sql<span class="token punctuation">(</span>name<span class="token operator">=</span><span class="token string">'my_balance'</span><span class="token punctuation">,</span> con<span class="token operator">=</span>engine<span class="token punctuation">,</span> if_exists<span class="token operator">=</span><span class="token string">'append'</span><span class="token punctuation">,</span> index<span class="token operator">=</span><span class="token boolean">False</span><span class="token punctuation">,</span> index_label<span class="token operator">=</span><span class="token string">'id'</span><span class="token punctuation">)</span> |
- 1
- 2
- 3
- 4
- 5
- 6
- 7
- 8
你没有看错,就这么几行就搞定了insert了,是不是非常爽~!但是有几点需要注意下:
错误信息:Execution failed on sql ‘SELECT name FROM sqlite_master WHERE type=‘table’ AND name=?;’: not all arguments converted during string formatting
数据库链接不再使用pymysql
,而改用sqlalchemy
,con=engine
而不是con=db
官方文档
最最最坑的地方出现了,大家注意:
错误:No module named ‘MySQLdb’
百度到的代码,基本99%都是这么写的:
1 2 |
mysql<span class="token operator">+</span>mysqldb<span class="token punctuation">:</span><span class="token operator">//</span>root<span class="token punctuation">:</span>root@localhost<span class="token punctuation">:</span><span class="token number">3306</span><span class="token operator">/</span>tushare?charset<span class="token operator">=</span>utf8 |
- 1
但是,如果按照如上写法,在python3.6(我的python版本)环境下会出现找不到mysqldb
模块错误!
正确的写法如下,因为python3将mysqldb
改为pymysql
了!!!
1 2 |
mysql<span class="token operator">+</span>pymysql<span class="token punctuation">:</span><span class="token operator">//</span>root<span class="token punctuation">:</span>root@localhost<span class="token punctuation">:</span><span class="token number">3306</span><span class="token operator">/</span>tushare?charset<span class="token operator">=</span>utf8 |
- 1
if_exists 的参数说明
fail的意思如果表存在,啥也不做
replace的意思,如果表存在,删了表,再建立一个新表,把数据插入
append的意思,如果表存在,把数据插入,如果表不存在创建一个表!!
index=False, index_label=‘id’
如果你数据库中主键名称为index
则不用设置这两项,如果不是,请按照主键名称设置!
附上一篇文章:to_sql与普通insert性能对比测试
关于to_sql事务问题
看到这里,有编程经验的朋友,一定会发现,使用
to_sql
的代码中,没有看到事务,在有多个insert
或者update
的情况时,如果回滚提交呢?从下面这篇文章中寻找答案吧~!
不能翻墙的小伙伴请看下面代码示例:
1 2 3 4 5 6 7 8 9 10 11 12 |
engine <span class="token operator">=</span> create_engine<span class="token punctuation">(</span> <span class="token string">"mysql+pymysql://root:root@localhost:3306/tushare?charset=utf8"</span><span class="token punctuation">)</span> df <span class="token operator">=</span> pd<span class="token punctuation">.</span>DataFrame<span class="token punctuation">(</span><span class="token punctuation">{</span><span class="token string">'balance'</span><span class="token punctuation">:</span> <span class="token punctuation">[</span><span class="token number">10000.000</span><span class="token punctuation">]</span><span class="token punctuation">,</span> <span class="token string">'account'</span><span class="token punctuation">:</span> <span class="token punctuation">[</span><span class="token string">'13579.SH'</span><span class="token punctuation">]</span><span class="token punctuation">,</span> <span class="token string">'charges'</span><span class="token punctuation">:</span> <span class="token punctuation">[</span><span class="token number">0.0005</span><span class="token punctuation">]</span><span class="token punctuation">}</span><span class="token punctuation">)</span> df1 <span class="token operator">=</span> pd<span class="token punctuation">.</span>DataFrame<span class="token punctuation">(</span><span class="token punctuation">{</span><span class="token string">'balance'</span><span class="token punctuation">:</span> <span class="token punctuation">[</span><span class="token number">1234.000</span><span class="token punctuation">]</span><span class="token punctuation">,</span> <span class="token string">'account'</span><span class="token punctuation">:</span> <span class="token punctuation">[</span><span class="token string">'17886.SH'</span><span class="token punctuation">]</span><span class="token punctuation">,</span> <span class="token string">'charges'</span><span class="token punctuation">:</span> <span class="token punctuation">[</span><span class="token number">0.0005</span><span class="token punctuation">]</span><span class="token punctuation">}</span><span class="token punctuation">)</span> <span class="token comment"># 开始事物</span> <span class="token keyword">with</span> engine<span class="token punctuation">.</span>begin<span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> conn<span class="token punctuation">:</span> df<span class="token punctuation">.</span>to_sql<span class="token punctuation">(</span>name<span class="token operator">=</span><span class="token string">'my_balance'</span><span class="token punctuation">,</span> con<span class="token operator">=</span>conn<span class="token punctuation">,</span> if_exists<span class="token operator">=</span><span class="token string">'append'</span><span class="token punctuation">,</span> index<span class="token operator">=</span><span class="token boolean">False</span><span class="token punctuation">,</span> index_label<span class="token operator">=</span><span class="token string">'id'</span><span class="token punctuation">)</span> <span class="token comment"># 故意出现错误的代码,测试事物回滚</span> df1<span class="token punctuation">.</span>to_sql<span class="token punctuation">(</span>name<span class="token operator">=</span><span class="token string">'my_balance'</span><span class="token punctuation">,</span> con<span class="token operator">=</span>conn<span class="token punctuation">,</span> if_exists<span class="token operator">=</span><span class="token string">'append'</span><span class="token punctuation">)</span> |