Friday, April 20, 2018

Python - Add a column based on conditions, commonly used to group data - Pandas conditional creation of a dataframe column

Information:

System version : Windows 10 64-bit
Python version : Python 3.6.0 :: Anaconda 4.3.1 (64-bit)

Code:

import pandas as pd
import numpy as np

df1 = pd.DataFrame()
number = [1,2,3,4,5]
sex = ['male','male','female','female','female']
df1['number'] = number
df1['sex'] = sex
df1['income'] = [500,2000,500,2000,500]
print(df1)
print('='*45)

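# Each condition is a boolean Series; for every row, np.select uses
# the first condition in the list that is True.
conditions = [
    (df1['sex']=='male') & (df1['income']>1000),
    (df1['sex']=='male') & (df1['income']<1000),
    (df1['sex']=='female') & (df1['income']>1000)
]

# choices[i] is the label assigned where conditions[i] is True.
# Rows that match no condition (here, females with income below 1000)
# receive the default value.
choices = ['male-high_income','male-low_income', 'female-high_income']
df1['group'] = np.select(conditions, choices, default='no_group')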
print(df1)

Result:

   number     sex  income
0       1    male     500
1       2    male    2000
2       3  female     500
3       4  female    2000
4       5  female     500
=============================================
   number     sex  income               group
0       1    male     500     male-low_income
1       2    male    2000    male-high_income
2       3  female     500            no_group
3       4  female    2000  female-high_income
4       5  female     500            no_group
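
Variation:

In the result above, rows 2 and 4 fall through to 'no_group' because no condition covers females with low income, and an income of exactly 1000 would also match nothing. The following is a minimal sketch of one way to cover those cases. The 'female-low_income' label, the choice to count an income of exactly 1000 as low income, and the extra sample row are assumptions made for illustration and are not part of the original code.

```python
import numpy as np
import pandas as pd

# Sample data; the last row (income exactly 1000) is added for illustration.
df = pd.DataFrame({
    'sex': ['male', 'male', 'female', 'female', 'male'],
    'income': [500, 2000, 500, 2000, 1000],
})

# Build the boolean masks once so the conditions stay readable.
is_male = df['sex'] == 'male'
is_female = df['sex'] == 'female'
is_high = df['income'] > 1000   # assumption: exactly 1000 counts as low income

conditions = [
    is_male & is_high,
    is_male & ~is_high,
    is_female & is_high,
    is_female & ~is_high,
]
choices = [
    'male-high_income',
    'male-low_income',
    'female-high_income',
    'female-low_income',   # hypothetical label, not in the original post
]

# The default is now only reached by rows whose 'sex' value is
# neither 'male' nor 'female'.
df['group'] = np.select(conditions, choices, default='no_group')
print(df)
```

Because the four conditions are mutually exclusive, their order does not matter here. When conditions overlap, np.select keeps the first match for each row, so the more specific conditions should come first.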
