Pandas & SQL Syntax Compared: A Complete Summary

Pandas and SQL are probably the two tools people use most for working with data. Both let you analyze a dataset in depth and mine it for valuable insights, but their syntax differs in many ways. In this article we summarize, side by side, how Pandas and SQL syntax compare.

Importing data

With Pandas, we first need to load the datasets before doing any further analysis and mining:

import pandas as pd
airports = pd.read_csv('data/airports.csv')
airport_freq = pd.read_csv('data/airport-frequencies.csv')
runways = pd.read_csv('data/runways.csv')
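The CSV files above come from the article's own dataset; as a side note, Pandas can also read directly from a SQL database via read_sql. A minimal sketch, using an in-memory SQLite table with made-up rows as a stand-in for the real airports data:

```python
import sqlite3
import pandas as pd

# Hypothetical stand-in for the airports dataset: an in-memory SQLite table
conn = sqlite3.connect(':memory:')
conn.execute("create table airports (ident text, type text)")
conn.execute("insert into airports values ('KLAX', 'large_airport')")
conn.commit()

# read_sql runs the query and returns the result set as a DataFrame
airports = pd.read_sql('select * from airports', conn)
```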
Basic syntax

In SQL, we use SELECT to query data, WHERE to filter rows, DISTINCT to deduplicate values, and LIMIT to cap the number of results returned:

## SQL
select * from airports
## Pandas
airports

## SQL
select * from airports limit 3
## Pandas
airports.head(3)

## SQL
select id from airports where ident = 'KLAX'
## Pandas
airports[airports.ident == 'KLAX'].id

## SQL
select distinct type from airports
## Pandas
airports.type.unique()
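The correspondences above can be exercised end to end. A minimal sketch, using a small hypothetical airports frame (the real dataset has many more columns):

```python
import pandas as pd

# Hypothetical stand-in for the airports table
airports = pd.DataFrame({
    'id': [1, 2, 3, 4],
    'ident': ['KLAX', 'KSFO', '00A', '00AK'],
    'type': ['large_airport', 'large_airport', 'heliport', 'small_airport'],
})

first_three = airports.head(3)                     # select * ... limit 3
klax_id = airports[airports.ident == 'KLAX'].id    # select id ... where ident = 'KLAX'
distinct_types = airports.type.unique()            # select distinct type ...
```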
Filtering on multiple conditions

To filter on the intersection of several conditions:

## SQL
select * from airports
where iso_region = 'US-CA' and
type = 'seaplane_base'
## Pandas
airports[(airports.iso_region == 'US-CA') &
         (airports.type == 'seaplane_base')]

## SQL
select ident, name, municipality from airports
where iso_region = 'US-CA' and
type = 'large_airport'
## Pandas
airports[(airports.iso_region == 'US-CA') &
         (airports.type == 'large_airport')][['ident', 'name', 'municipality']]
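One pitfall worth showing with a runnable sketch: each condition must be parenthesized, because & binds tighter than ==. DataFrame.query offers an equivalent, often more readable form. The data below is hypothetical:

```python
import pandas as pd

airports = pd.DataFrame({
    'ident': ['KLAX', 'KSFO', 'C07', 'W55'],
    'iso_region': ['US-CA', 'US-CA', 'US-CO', 'US-WA'],
    'type': ['large_airport', 'large_airport', 'seaplane_base', 'seaplane_base'],
})

# Boolean masks combined with & -- parentheses around each condition are required
mask = (airports.iso_region == 'US-CA') & (airports.type == 'large_airport')
subset = airports[mask][['ident', 'type']]

# Equivalent filter expressed as a query string
subset2 = airports.query("iso_region == 'US-CA' and type == 'large_airport'")
```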
Sorting

Pandas sorts in ascending order by default; for a descending sort, set the ascending parameter to False:

## SQL
select * from airport_freq
where airport_ident = 'KLAX'
order by type
## Pandas
airport_freq[airport_freq.airport_ident == 'KLAX'].sort_values('type')

## SQL
select * from airport_freq
where airport_ident = 'KLAX'
order by type desc
## Pandas
airport_freq[airport_freq.airport_ident == 'KLAX'].sort_values('type', ascending=False)
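Both directions can be checked on a tiny hypothetical airport_freq frame:

```python
import pandas as pd

# Hypothetical stand-in for the airport-frequencies table
airport_freq = pd.DataFrame({
    'airport_ident': ['KLAX', 'KLAX', 'KLAX', 'KSFO'],
    'type': ['TWR', 'APP', 'GND', 'TWR'],
})

# Ascending (the default), then descending via ascending=False
asc = airport_freq[airport_freq.airport_ident == 'KLAX'].sort_values('type')
desc = airport_freq[airport_freq.airport_ident == 'KLAX'].sort_values('type', ascending=False)
```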
Filtering against a list of values

Pandas uses the isin() method where SQL uses IN:

## SQL
select * from airports
where type in ('heliport', 'balloonport')
## Pandas
airports[airports.type.isin(['heliport', 'balloonport'])]

## SQL
select * from airports
where type not in ('heliport', 'balloonport')
## Pandas
airports[~airports.type.isin(['heliport', 'balloonport'])]
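A minimal sketch of both the IN and NOT IN forms, with hypothetical rows; the ~ operator negates the boolean mask:

```python
import pandas as pd

airports = pd.DataFrame({
    'ident': ['00A', '00AK', 'KLAX'],
    'type': ['heliport', 'balloonport', 'large_airport'],
})

# type in ('heliport', 'balloonport')
in_list = airports[airports.type.isin(['heliport', 'balloonport'])]
# type not in ('heliport', 'balloonport') -- ~ flips the mask
not_in_list = airports[~airports.type.isin(['heliport', 'balloonport'])]
```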
Deleting rows

Pandas deletes rows with the drop() method, or simply by keeping the complement of a filter:

## SQL
delete from df where type = 'MISC'
## Pandas (either form works)
df = df[df.type != 'MISC']
df.drop(df[df.type == 'MISC'].index, inplace=True)
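The two idioms above behave the same; note that drop() returns a new frame unless inplace=True is passed. A sketch with made-up rows:

```python
import pandas as pd

df = pd.DataFrame({'name': ['a', 'b', 'c'], 'type': ['MISC', 'TWR', 'MISC']})

# Option 1: keep only the rows we want (usually the clearest)
kept = df[df.type != 'MISC']

# Option 2: drop by index; without inplace=True this returns a new frame
dropped = df.drop(df[df.type == 'MISC'].index)
```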
Updating data

SQL updates data with the UPDATE and SET keywords; in Pandas the counterpart is assignment through .loc:

## SQL
update airports set home_link = '......'
where ident = 'KLAX'
## Pandas
airports.loc[airports['ident'] == 'KLAX', 'home_link'] = '......'
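The .loc[row_mask, column] = value pattern updates the frame in place, touching only the matched rows. A sketch with a hypothetical URL:

```python
import pandas as pd

airports = pd.DataFrame({
    'ident': ['KLAX', 'KSFO'],
    'home_link': ['', ''],
})

# Like UPDATE ... SET: only the KLAX row's home_link changes
airports.loc[airports['ident'] == 'KLAX', 'home_link'] = 'https://example.com'
```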
Calling aggregate functions

First preview the runways table with runways.head(). Then apply the min(), max(), mean(), and median() functions to the length_ft column:

## SQL
select max(length_ft), min(length_ft),
avg(length_ft), median(length_ft) from runways
## Pandas
runways.agg({'length_ft': ['min', 'max', 'mean', 'median']})
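When agg() is given a dict mapping a column to a list of function names, it returns a DataFrame indexed by those function names. A sketch with hypothetical runway lengths:

```python
import pandas as pd

runways = pd.DataFrame({'length_ft': [1000, 2000, 3000, 8000]})

# One row per aggregate function, one column per aggregated column
stats = runways.agg({'length_ft': ['min', 'max', 'mean', 'median']})
```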
Combining two result sets

Pandas combines tables with pd.concat(), the counterpart of UNION ALL in SQL:

## SQL
select name, municipality from airports
where ident = 'KLAX'
union all
select name, municipality from airports
where ident = 'KLGB'
## Pandas
pd.concat([airports[airports.ident == 'KLAX'][['name', 'municipality']],
           airports[airports.ident == 'KLGB'][['name', 'municipality']]])
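Like UNION ALL, concat keeps duplicates and simply stacks the two result sets in order. A sketch with hypothetical airport rows:

```python
import pandas as pd

airports = pd.DataFrame({
    'ident': ['KLAX', 'KLGB', 'KSFO'],
    'name': ['Los Angeles Intl', 'Long Beach', 'San Francisco Intl'],
    'municipality': ['Los Angeles', 'Long Beach', 'San Francisco'],
})

# Stack the KLAX result set on top of the KLGB one
both = pd.concat([
    airports[airports.ident == 'KLAX'][['name', 'municipality']],
    airports[airports.ident == 'KLGB'][['name', 'municipality']],
])
```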
Grouping

Pandas uses the groupby() method, matching SQL's GROUP BY:

## SQL
select iso_country, type, count(*) from airports
group by iso_country, type
order by iso_country, type
## Pandas
airports.groupby(['iso_country', 'type']).size()
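size() counts the rows in each group, like count(*); the result is a Series indexed by the (iso_country, type) pairs. A sketch with hypothetical rows:

```python
import pandas as pd

airports = pd.DataFrame({
    'iso_country': ['US', 'US', 'US', 'CA'],
    'type': ['heliport', 'heliport', 'large_airport', 'heliport'],
})

# Series with a MultiIndex of (iso_country, type), values are row counts
counts = airports.groupby(['iso_country', 'type']).size()
```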
Filtering after grouping

In Pandas, we call filter() after groupby(), whereas SQL uses the HAVING clause:

## SQL
select type, count(*) from airports
where iso_country = 'US'
group by type
having count(*) > 1000
order by count(*) desc
## Pandas
(airports[airports.iso_country == 'US']
 .groupby('type')
 .filter(lambda g: len(g) > 1000)
 .groupby('type')
 .size()
 .sort_values(ascending=False))
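GroupBy.filter keeps the rows of every group that satisfies the predicate, so we group a second time to produce the counts. A sketch with hypothetical rows and a threshold of 2 standing in for 1000:

```python
import pandas as pd

airports = pd.DataFrame({
    'iso_country': ['US'] * 5 + ['CA'],
    'type': ['heliport', 'heliport', 'heliport', 'closed', 'closed', 'heliport'],
})

# WHERE, then HAVING count(*) > 2, then ORDER BY count(*) DESC
result = (airports[airports.iso_country == 'US']
          .groupby('type')
          .filter(lambda g: len(g) > 2)   # drops whole groups, keeps rows
          .groupby('type')
          .size()
          .sort_values(ascending=False))
```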
Top N records

## SQL
select col_name from table_name
order by col_name desc
limit 10
## Pandas
table_name.nlargest(10, columns='col_name')
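nlargest(n, columns=...) is equivalent to ORDER BY ... DESC LIMIT n, without needing an explicit sort. A sketch with made-up data, taking the top 2 instead of 10:

```python
import pandas as pd

df = pd.DataFrame({'name': ['a', 'b', 'c', 'd'], 'size': [10, 40, 30, 20]})

# The two rows with the largest 'size', already in descending order
top2 = df.nlargest(2, columns='size')
```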

