pandas Examples - Merge - Fictitious Names

Author: 橘猫吃不胖 | Published: 2020-05-07 17:55

Continuing the earlier exercises (see the previous articles in this series).

Load the dataset:

import pandas as pd

# First group of names, subject_id 1-5 (ids are stored as strings)
raw_data_1 = {
        'subject_id': ['1', '2', '3', '4', '5'],
        'first_name': ['Alex', 'Amy', 'Allen', 'Alice', 'Ayoung'],
        'last_name': ['Anderson', 'Ackerman', 'Ali', 'Aoni', 'Atiches']}

# Second group of names, subject_id 4-8 (4 and 5 overlap with the first group)
raw_data_2 = {
        'subject_id': ['4', '5', '6', '7', '8'],
        'first_name': ['Billy', 'Brian', 'Bran', 'Bryce', 'Betty'],
        'last_name': ['Bonder', 'Black', 'Balwner', 'Brice', 'Btisan']}

# Test ids keyed by subject_id (there is no entry for subject_id 6)
raw_data_3 = {
        'subject_id': ['1', '2', '3', '4', '5', '7', '8', '9', '10', '11'],
        'test_id': [51, 15, 15, 61, 16, 14, 15, 1, 61, 16]}

df1 = pd.DataFrame(raw_data_1)
df2 = pd.DataFrame(raw_data_2)
df3 = pd.DataFrame(raw_data_3)

1. Join the two dataframes along rows and assign to all_data

(I ran the prompt through Youdao Translate to read it in Chinese, and it did an impressive job.)

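In other words, stack the two DataFrames on top of each other.
In the previous article we did this with the append function:

df1.append(df2)  # pandas < 2.0 only: DataFrame.append was deprecated in 1.4 and removed in 2.0

That appends directly. There is also another function we can use here,
pandas.concat: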

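pd.concat([df1, df2])

To discard the original index labels and renumber the rows, pass ignore_index=True, and assign the result to all_data as the exercise asks:

all_data = pd.concat([df1, df2], ignore_index=True)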
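To make the effect of ignore_index concrete, here is a minimal sketch that compares the resulting index with and without it. The frames are trimmed copies of df1 and df2 (last_name is left out for brevity), and the names stacked and renumbered are only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'subject_id': ['1', '2', '3', '4', '5'],
                    'first_name': ['Alex', 'Amy', 'Allen', 'Alice', 'Ayoung']})
df2 = pd.DataFrame({'subject_id': ['4', '5', '6', '7', '8'],
                    'first_name': ['Billy', 'Brian', 'Bran', 'Bryce', 'Betty']})

# Without ignore_index each frame keeps its own labels, so 0..4 appears twice
stacked = pd.concat([df1, df2])

# With ignore_index=True the result gets a fresh 0..9 index
renumbered = pd.concat([df1, df2], ignore_index=True)

print(stacked.index.tolist())     # [0, 1, 2, 3, 4, 0, 1, 2, 3, 4]
print(renumbered.index.tolist())  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Duplicate index labels are legal, but they make label-based lookups such as stacked.loc[0] return two rows, which is usually not what you want.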

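2. Join the two dataframes along columns and assign to all_data_col

This is similar to the previous exercise, except that this time the frames are placed side by side, column by column:

all_data_col = pd.concat([df1, df2], axis=1)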
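One thing worth checking is what the column-wise result looks like: rows are paired by index label (not by subject_id), and the column names repeat. The sketch below shows this, and also uses the keys argument of pd.concat to label each block; the labels 'df1' and 'df2' and the variable labeled are illustrative choices, not part of the exercise.

```python
import pandas as pd

df1 = pd.DataFrame({'subject_id': ['1', '2', '3', '4', '5'],
                    'first_name': ['Alex', 'Amy', 'Allen', 'Alice', 'Ayoung'],
                    'last_name': ['Anderson', 'Ackerman', 'Ali', 'Aoni', 'Atiches']})
df2 = pd.DataFrame({'subject_id': ['4', '5', '6', '7', '8'],
                    'first_name': ['Billy', 'Brian', 'Bran', 'Bryce', 'Betty'],
                    'last_name': ['Bonder', 'Black', 'Balwner', 'Brice', 'Btisan']})

# Side-by-side concatenation: row 0 of df1 sits next to row 0 of df2
all_data_col = pd.concat([df1, df2], axis=1)
print(all_data_col.shape)             # (5, 6)
print(all_data_col.columns.tolist())  # each of the three names appears twice

# Optional: keys= adds an outer column level so the two blocks stay distinguishable
labeled = pd.concat([df1, df2], axis=1, keys=['df1', 'df2'])
print(labeled['df2']['first_name'].tolist())
```

Because the pairing is by index, Alex (subject_id 1) ends up on the same row as Billy (subject_id 4); use merge when rows should be matched on subject_id instead.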

3. Merge all_data and data3 along the subject_id value

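Take the row-wise concatenation from exercise 1 (all_data) and join it with data3, which is df3 in the code here.

This calls for the merge function:

pd.merge(all_data, df3, on='subject_id')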

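I will cover how these functions work in a separate post. For now, note that merge performs an inner join by default (how='inner'), just like a plain JOIN in SQL: only rows whose subject_id appears in both frames are kept.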
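The default inner join can silently drop rows, so it helps to compare it with how='left'. This is a minimal sketch using trimmed copies of the data (last_name omitted); the variable names inner and left are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'subject_id': ['1', '2', '3', '4', '5'],
                    'first_name': ['Alex', 'Amy', 'Allen', 'Alice', 'Ayoung']})
df2 = pd.DataFrame({'subject_id': ['4', '5', '6', '7', '8'],
                    'first_name': ['Billy', 'Brian', 'Bran', 'Bryce', 'Betty']})
df3 = pd.DataFrame({'subject_id': ['1', '2', '3', '4', '5', '7', '8', '9', '10', '11'],
                    'test_id': [51, 15, 15, 61, 16, 14, 15, 1, 61, 16]})

all_data = pd.concat([df1, df2], ignore_index=True)

# Default how='inner': subject_id 6 has no row in df3, so Bran is dropped
inner = pd.merge(all_data, df3, on='subject_id')
print(len(inner))  # 9

# how='left' keeps every row of all_data; Bran's test_id becomes NaN
left = pd.merge(all_data, df3, on='subject_id', how='left')
print(len(left))   # 10
print(left.loc[left['subject_id'] == '6', 'test_id'].isna().all())  # True
```

Subject ids 9, 10 and 11 exist only in df3, so they appear in neither result; how='right' or how='outer' would be needed to keep them.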

4. Merge only the data that has the same 'subject_id' on both data1 and data2

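Now join df1 and df2 with each other.

merge does the job again; with the default inner join, only subject_id 4 and 5 survive, because those are the only ids present in both frames:

pd.merge(df1, df2, on='subject_id')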
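Since df1 and df2 share the column names first_name and last_name, merge has to rename them; by default it appends _x and _y. The sketch below shows the default and the suffixes parameter, where the suffix strings '_1' and '_2' are just an illustrative choice.

```python
import pandas as pd

df1 = pd.DataFrame({'subject_id': ['1', '2', '3', '4', '5'],
                    'first_name': ['Alex', 'Amy', 'Allen', 'Alice', 'Ayoung'],
                    'last_name': ['Anderson', 'Ackerman', 'Ali', 'Aoni', 'Atiches']})
df2 = pd.DataFrame({'subject_id': ['4', '5', '6', '7', '8'],
                    'first_name': ['Billy', 'Brian', 'Bran', 'Bryce', 'Betty'],
                    'last_name': ['Bonder', 'Black', 'Balwner', 'Brice', 'Btisan']})

# Default suffixes: overlapping columns become first_name_x, first_name_y, ...
default = pd.merge(df1, df2, on='subject_id')
print(default.columns.tolist())

# Custom suffixes make it clearer which frame each column came from
named = pd.merge(df1, df2, on='subject_id', suffixes=('_1', '_2'))
print(named.columns.tolist())
print(named['subject_id'].tolist())  # ['4', '5']
```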

5. Merge all values in data1 and data2, with matching records from both sides where available

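This reminds me of FULL JOIN in SQL.
merge supports it through the how parameter; the pandas documentation describes the option as: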
outer: use union of keys from both frames, similar to a SQL full outer join; sort keys lexicographically.

pd.merge(df1, df2, on='subject_id', how='outer')
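To see which side each row of the outer join came from, merge accepts indicator=True, which adds a _merge column. This is a minimal sketch on trimmed copies of the data (last_name omitted); the variable names outer and counts are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'subject_id': ['1', '2', '3', '4', '5'],
                    'first_name': ['Alex', 'Amy', 'Allen', 'Alice', 'Ayoung']})
df2 = pd.DataFrame({'subject_id': ['4', '5', '6', '7', '8'],
                    'first_name': ['Billy', 'Brian', 'Bran', 'Bryce', 'Betty']})

# indicator=True adds a '_merge' column: 'left_only', 'right_only' or 'both'
outer = pd.merge(df1, df2, on='subject_id', how='outer', indicator=True)

counts = outer['_merge'].value_counts()
print(len(outer))            # 8 distinct subject ids in total
print(counts['both'])        # 2 (subject_id 4 and 5)
print(counts['left_only'])   # 3 (subject_id 1, 2, 3)
print(counts['right_only'])  # 3 (subject_id 6, 7, 8)
```

Rows found on only one side get NaN in the columns that belong to the other frame, exactly as unmatched rows do in a SQL full outer join.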

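That wraps up this set of exercises. They mostly revolve around two functions, concat and merge; for details on how to use them, see the next post.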


Article link: https://www.haomeiwen.com/subject/tsslghtx.html