This article shows examples of how to add, update, and delete rows and columns in a pandas DataFrame object.
1. The Example Original DataFrame Data.
- The Python source code below generates the original DataFrame object that the later examples modify.

    import pandas as pd


    def create_dataframe_from_2_dimension_array():
        '''Create a pandas DataFrame object from a 2-dimensional list.'''
        pd.set_option('display.unicode.east_asian_width', True)

        # Define a 2-dimensional list; each inner list is one row that contains
        # the position number, programming language, and operating system.
        data = [[1, 'python', 'Windows'], [5, 'java', 'Linux'], [8, 'c++', 'macOS']]

        # Define the column list; each element in the list is a column label.
        columns = ['Position', 'Programming Language', 'Operating System']

        # Define the row index labels.
        name = ['USA', 'UK', 'CA']

        df = pd.DataFrame(data=data, index=name, columns=columns)

        print('========================== original DataFrame data ================================')
        print(df, '\r\n')

        # Return the pandas DataFrame object.
        return df


    if __name__ == '__main__':
        create_dataframe_from_2_dimension_array()
- When you run the above code, it prints the DataFrame object’s rows and columns.

    ========================== original DataFrame data ================================
         Position  Programming Language  Operating System
    USA         1                python           Windows
    UK          5                  java             Linux
    CA          8                   c++             macOS
2. Add Columns & Rows In Pandas DataFrame Object.
2.1 Add Columns.
- You can add a column to the DataFrame object in the following ways.
- Add a column to DataFrame directly.
- Add the DataFrame column with the DataFrame loc attribute.
- Add the DataFrame column with the DataFrame object’s insert method.
- Below is the example source code; the function name is add_columns_in_dataframe().

    import pandas as pd


    def add_columns_in_dataframe():
        df = create_dataframe_from_2_dimension_array()

        print('\r\n========================== add column Database-1 at the end of the DataFrame columns ================================\r\n')
        # Add a column by providing the column name and data list.
        df['Database-1'] = ['Oracle', 'MySQL', 'SQLite']
        print(df)

        print('\r\n========================== use DataFrame loc attribute to add column Database-2 at the end of the DataFrame columns ================================\r\n')
        # Add a column with the DataFrame loc attribute.
        df.loc[:, 'Database-2'] = ['MongoDB', 'SQL Server', 'Access']
        print(df)

        print('\r\n========================== insert column Coding-Language after DataFrame object first column ================================\r\n')
        # Add a column at position 1 with the DataFrame insert method.
        df.insert(1, 'Coding-Language', ['C++', 'C', 'C#'])
        print(df)


    if __name__ == '__main__':
        add_columns_in_dataframe()
- The above example prints the output below.

    ========================== original DataFrame data ================================
         Position  Programming Language  Operating System
    USA         1                python           Windows
    UK          5                  java             Linux
    CA          8                   c++             macOS

    ========================== add column Database-1 at the end of the DataFrame columns ================================

         Position  Programming Language  Operating System  Database-1
    USA         1                python           Windows      Oracle
    UK          5                  java             Linux       MySQL
    CA          8                   c++             macOS      SQLite

    ========================== use DataFrame loc attribute to add column Database-2 at the end of the DataFrame columns ================================

         Position  Programming Language  Operating System  Database-1  Database-2
    USA         1                python           Windows      Oracle     MongoDB
    UK          5                  java             Linux       MySQL  SQL Server
    CA          8                   c++             macOS      SQLite      Access

    ========================== insert column Coding-Language after DataFrame object first column ================================

         Position  Coding-Language  ...  Database-1  Database-2
    USA         1              C++  ...      Oracle     MongoDB
    UK          5                C  ...       MySQL  SQL Server
    CA          8               C#  ...      SQLite      Access

    [3 rows x 6 columns]
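The insert method differs from the other two approaches in two ways that are easy to miss: it takes a zero-based position, and it refuses a column label that already exists unless told otherwise. The following is a minimal sketch of that behavior on a small stand-alone DataFrame (the data here is illustrative and not part of the example above).

```python
import pandas as pd

# Illustrative data only; this frame is built just for the sketch.
df = pd.DataFrame(
    {'Position': [1, 5, 8], 'Operating System': ['Windows', 'Linux', 'macOS']},
    index=['USA', 'UK', 'CA'],
)

# insert() modifies df in place; position 1 makes the new column the second one.
df.insert(1, 'Programming Language', ['python', 'java', 'c++'])
column_order = list(df.columns)

# Inserting a label that already exists raises ValueError by default.
try:
    df.insert(0, 'Position', [0, 0, 0])
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True

print(column_order)
print(duplicate_rejected)
```

By contrast, assigning with df['Position'] = ... on an existing label overwrites that column, so insert is the safer choice when an accidental overwrite should fail loudly.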
2.2 Add Rows.
- There are 2 ways to add rows to a pandas DataFrame object.
- Use the DataFrame object’s loc attribute.
- Use the pandas concat function. Older code used the DataFrame object’s append method for this, but append was deprecated in pandas 1.4 and removed in pandas 2.0.
- Below is the example function add_rows_in_dataframe().

    import pandas as pd


    def add_rows_in_dataframe():
        df = create_dataframe_from_2_dimension_array()

        print('\r\n========================== add one row in DataFrame ================================\r\n')
        # Add a DataFrame row with the loc attribute.
        # The position is an integer so the Position column keeps its numeric type.
        df.loc['CN'] = [2, 'Python', 'Windows']
        print(df)

        print('\r\n========================== add multiple rows in DataFrame ================================\r\n')
        # Create a Python dictionary object that maps column labels to values.
        dict_insert = {'Position': [3, 6, 9],
                       'Programming Language': ['Go', 'R', 'Php'],
                       'Operating System': ['Linux', 'macOS', 'Windows']}
        name = ['JP', 'TW', 'KO']
        df_1 = pd.DataFrame(data=dict_insert, index=name)

        # Concatenate the new DataFrame object onto the existing one.
        # (df.append(df_1) did this before it was removed in pandas 2.0.)
        df = pd.concat([df, df_1])
        print(df)


    if __name__ == '__main__':
        add_rows_in_dataframe()
- The above example prints the output below.

    ========================== original DataFrame data ================================
         Position  Programming Language  Operating System
    USA         1                python           Windows
    UK          5                  java             Linux
    CA          8                   c++             macOS

    ========================== add one row in DataFrame ================================

         Position  Programming Language  Operating System
    USA         1                python           Windows
    UK          5                  java             Linux
    CA          8                   c++             macOS
    CN          2                Python           Windows

    ========================== add multiple rows in DataFrame ================================

         Position  Programming Language  Operating System
    USA         1                python           Windows
    UK          5                  java             Linux
    CA          8                   c++             macOS
    CN          2                Python           Windows
    JP          3                    Go             Linux
    TW          6                     R             macOS
    KO          9                   Php           Windows
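One detail worth keeping in mind when adding a row through loc: the assignment only adds a row when the label is new. If the label already exists, the same statement overwrites that row instead. A minimal sketch, using a small illustrative DataFrame rather than the one from the example:

```python
import pandas as pd

# Illustrative data only.
df = pd.DataFrame(
    [[1, 'python', 'Windows'], [5, 'java', 'Linux']],
    index=['USA', 'UK'],
    columns=['Position', 'Programming Language', 'Operating System'],
)

# 'CN' is not in the index yet, so this adds a third row.
df.loc['CN'] = [2, 'Python', 'Windows']
rows_after_add = len(df)

# 'UK' already exists, so this replaces the row and the row count stays at 3.
df.loc['UK'] = [6, 'Go', 'macOS']
rows_after_overwrite = len(df)

print(rows_after_add, rows_after_overwrite)
print(df)
```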
3. Update DataFrame Object Data.
3.1 Update DataFrame Object Column Title Labels.
- You can update the DataFrame object’s column labels through its columns attribute or its rename method.

    import pandas as pd


    def update_column_title_in_dataframe():
        df = create_dataframe_from_2_dimension_array()

        print('\r\n========================== update DataFrame column title ================================\r\n')
        # Update the DataFrame column names through its columns attribute.
        df.columns = ['Pos', 'PL', 'OS']
        print(df)

        print('\r\n========================== update DataFrame multiple columns title by rename method ================================\r\n')
        dict_column_title_change = {'Pos': 'Order', 'PL': 'Coding Language', 'OS': 'Operating System'}
        # Update the DataFrame column names with its rename method.
        df.rename(columns=dict_column_title_change, inplace=True)
        print(df)


    if __name__ == '__main__':
        update_column_title_in_dataframe()
- The above example prints the output below.

    ========================== original DataFrame data ================================
         Position  Programming Language  Operating System
    USA         1                python           Windows
    UK          5                  java             Linux
    CA          8                   c++             macOS

    ========================== update DataFrame column title ================================

         Pos      PL       OS
    USA    1  python  Windows
    UK     5    java    Linux
    CA     8     c++    macOS

    ========================== update DataFrame multiple columns title by rename method ================================

         Order  Coding Language  Operating System
    USA      1           python           Windows
    UK       5             java             Linux
    CA       8              c++             macOS
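The two approaches are not interchangeable in every case. Assigning to the columns attribute replaces all labels at once and needs exactly one label per column, while rename changes only the labels named in the mapping and, without inplace=True, returns a new DataFrame. The sketch below illustrates both points on a small stand-alone DataFrame (illustrative data).

```python
import pandas as pd

# Illustrative data only.
df = pd.DataFrame(
    [[1, 'python', 'Windows'], [5, 'java', 'Linux']],
    index=['USA', 'UK'],
    columns=['Position', 'Programming Language', 'Operating System'],
)
original_columns = list(df.columns)

# rename() without inplace=True returns a new DataFrame; unmapped labels are kept.
renamed = df.rename(columns={'Position': 'Pos'})
renamed_columns = list(renamed.columns)

# Assigning to df.columns with the wrong number of labels raises ValueError.
try:
    df.columns = ['Pos', 'PL']
    length_mismatch_rejected = False
except ValueError:
    length_mismatch_rejected = True

print(original_columns)
print(renamed_columns)
print(length_mismatch_rejected)
```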
3.2 Update DataFrame Object Row Index Labels.
- You can use the DataFrame object’s index attribute or rename method to update the DataFrame object’s row index labels.
    import pandas as pd


    def update_row_index_title_in_dataframe():
        df = create_dataframe_from_2_dimension_array()

        print('\r\n========================== update DataFrame row index label ================================\r\n')
        # Update the DataFrame object's row index labels through its index attribute.
        df.index = list('123')
        print(df)

        # Get the original DataFrame object again.
        df = create_dataframe_from_2_dimension_array()

        print('\r\n========================== update DataFrame multiple rows index title by rename method ================================\r\n')
        dict_row_index_title_change = {'USA': '7', 'UK': '8', 'CA': '9'}
        # Call the DataFrame object's rename method to update the row labels;
        # axis=0 means the mapping applies to rows.
        df.rename(dict_row_index_title_change, axis=0, inplace=True)
        print(df)


    if __name__ == '__main__':
        update_row_index_title_in_dataframe()
- The above example prints the output below.

    ========================== original DataFrame data ================================
         Position  Programming Language  Operating System
    USA         1                python           Windows
    UK          5                  java             Linux
    CA          8                   c++             macOS

    ========================== update DataFrame row index label ================================

       Position  Programming Language  Operating System
    1         1                python           Windows
    2         5                  java             Linux
    3         8                   c++             macOS

    ========================== original DataFrame data ================================
         Position  Programming Language  Operating System
    USA         1                python           Windows
    UK          5                  java             Linux
    CA          8                   c++             macOS

    ========================== update DataFrame multiple rows index title by rename method ================================

       Position  Programming Language  Operating System
    7         1                python           Windows
    8         5                  java             Linux
    9         8                   c++             macOS
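Besides a dictionary with axis=0, rename also accepts the index keyword, and the mapping can be a function that is applied to every label. This is a short sketch of that variant, assuming a small illustrative DataFrame; it is one possible usage, not the only way to relabel rows.

```python
import pandas as pd

# Illustrative data only.
df = pd.DataFrame(
    [[1, 'python', 'Windows'], [5, 'java', 'Linux'], [8, 'c++', 'macOS']],
    index=['USA', 'UK', 'CA'],
    columns=['Position', 'Programming Language', 'Operating System'],
)

# index=... is equivalent to passing the mapping with axis=0.
# A function is called once per label; str.lower lowercases each row label.
lowered = df.rename(index=str.lower)

# A dictionary only needs the labels that should change.
partly_renamed = df.rename(index={'USA': 'US'})

print(list(lowered.index))
print(list(partly_renamed.index))
print(list(df.index))  # df itself is unchanged because inplace defaults to False
```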
3.3 Update DataFrame Object Rows & Columns Data.
- You can use the DataFrame object’s loc or iloc attribute to update the DataFrame object’s data.
    import pandas as pd


    def update_row_column_data_in_dataframe():
        df = create_dataframe_from_2_dimension_array()

        print('\r\n========================== update DataFrame entire row data ================================\r\n')
        df.loc['CA'] = [1, 'Python', 'Linux']
        df.iloc[1, :] = [2, 'Java & Python', 'Unix']
        print(df)

        print('\r\n========================== update DataFrame entire column data ================================\r\n')
        df.loc[:, 'Operating System'] = ['macOS', 'Linux', 'Windows']
        df.iloc[:, 1] = ['Java', 'Python', 'C++']
        print(df)

        print('\r\n========================== update DataFrame data by row and column index ================================\r\n')
        df.loc['CA', 'Programming Language'] = 'Python & JavaScript'
        df.iloc[1, 1] = 'Python & R'
        print(df)


    if __name__ == '__main__':
        update_row_column_data_in_dataframe()
- The above example prints the output below.

    ========================== original DataFrame data ================================
         Position  Programming Language  Operating System
    USA         1                python           Windows
    UK          5                  java             Linux
    CA          8                   c++             macOS

    ========================== update DataFrame entire row data ================================

         Position  Programming Language  Operating System
    USA         1                python           Windows
    UK          2         Java & Python              Unix
    CA          1                Python             Linux

    ========================== update DataFrame entire column data ================================

         Position  Programming Language  Operating System
    USA         1                  Java             macOS
    UK          2                Python             Linux
    CA          1                   C++           Windows

    ========================== update DataFrame data by row and column index ================================

         Position  Programming Language  Operating System
    USA         1                  Java             macOS
    UK          2            Python & R             Linux
    CA          1   Python & JavaScript           Windows
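The example mixes loc and iloc, so it may help to spell out the difference: loc addresses cells by row and column labels, while iloc addresses them by zero-based integer positions. The sketch below, on an illustrative DataFrame, reads the same cell both ways and uses the index objects' get_loc method to translate a label into its position.

```python
import pandas as pd

# Illustrative data only.
df = pd.DataFrame(
    [[1, 'python', 'Windows'], [5, 'java', 'Linux'], [8, 'c++', 'macOS']],
    index=['USA', 'UK', 'CA'],
    columns=['Position', 'Programming Language', 'Operating System'],
)

# The same cell, addressed by labels and by integer positions.
by_label = df.loc['UK', 'Programming Language']
by_position = df.iloc[1, 1]

# get_loc() translates a label into its integer position.
row_pos = df.index.get_loc('UK')
col_pos = df.columns.get_loc('Programming Language')

print(by_label, by_position)
print(row_pos, col_pos)
```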
4. Delete DataFrame Object Rows & Columns Data.
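- Call the DataFrame object’s drop(labels=None, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise') method.
- labels: the row or column label, or list of labels, to drop.
- axis: axis=0 means drop rows, axis=1 means drop columns; the default value is 0.
- index: the row labels to drop; equivalent to passing labels with axis=0.
- columns: the column labels to drop; equivalent to passing labels with axis=1.
- level: for a MultiIndex, the level from which the labels are dropped.
- inplace: True changes the original DataFrame object directly and returns None; False returns a new DataFrame object.
- errors: 'raise' (the default) raises a KeyError when a label does not exist; 'ignore' skips missing labels.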
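As a quick illustration of how these parameters interact, here is a minimal sketch on a small stand-alone DataFrame (illustrative data). It uses the default inplace=False, so every call returns a new object and df itself stays intact; the last part shows the errors parameter of drop, which controls what happens when a label is missing.

```python
import pandas as pd

# Illustrative data only.
df = pd.DataFrame(
    [[1, 'python', 'Windows'], [5, 'java', 'Linux'], [8, 'c++', 'macOS']],
    index=['USA', 'UK', 'CA'],
    columns=['Position', 'Programming Language', 'Operating System'],
)

# index= drops rows and columns= drops columns; no axis argument is needed.
without_uk = df.drop(index='UK')
without_os = df.drop(columns=['Operating System'])

# A missing label raises KeyError by default.
try:
    df.drop(index='JP')
    missing_label_raised = False
except KeyError:
    missing_label_raised = True

# errors='ignore' skips labels that do not exist.
unchanged = df.drop(index='JP', errors='ignore')

print(list(without_uk.index))
print(list(without_os.columns))
print(missing_label_raised)
print(unchanged.shape)
```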
4.1 Drop DataFrame Object Columns Data.
- Below is the example function delete_column_in_dataframe().

    import pandas as pd


    def delete_column_in_dataframe():
        df = create_dataframe_from_2_dimension_array()

        print('\r\n========================== drop DataFrame "Position" column data ================================\r\n')
        # axis=1 means to drop a column, so the first parameter is the column name.
        df.drop('Position', axis=1, inplace=True)
        print(df)

        print('\r\n========================== drop DataFrame "Programming Language" column data ================================\r\n')
        # Pass the column name to the columns parameter to drop it.
        df.drop(columns='Programming Language', inplace=True)
        print(df)

        print('\r\n========================== drop DataFrame "Operating System" column data ================================\r\n')
        # Pass the column name to the labels parameter with axis=1 to drop it.
        df.drop(labels='Operating System', axis=1, inplace=True)
        print(df)


    if __name__ == '__main__':
        delete_column_in_dataframe()
- The above example prints the output below.

    ========================== original DataFrame data ================================
         Position  Programming Language  Operating System
    USA         1                python           Windows
    UK          5                  java             Linux
    CA          8                   c++             macOS

    ========================== drop DataFrame "Position" column data ================================

         Programming Language  Operating System
    USA                python           Windows
    UK                   java             Linux
    CA                    c++             macOS

    ========================== drop DataFrame "Programming Language" column data ================================

         Operating System
    USA           Windows
    UK              Linux
    CA              macOS

    ========================== drop DataFrame "Operating System" column data ================================

    Empty DataFrame
    Columns: []
    Index: [USA, UK, CA]
4.2 Drop DataFrame Object Rows Data.
- Below is the example function delete_row_in_dataframe().

    import pandas as pd


    def delete_row_in_dataframe():
        df = create_dataframe_from_2_dimension_array()

        print('\r\n========================== drop DataFrame "USA" row data ================================\r\n')
        # axis=0 means to drop a row, so the first parameter is the row name.
        df.drop('USA', axis=0, inplace=True)
        print(df)

        print('\r\n========================== drop DataFrame "UK" row data ================================\r\n')
        # Pass the row name to the labels parameter; axis defaults to 0, so a row is dropped.
        df.drop(labels='UK', inplace=True)
        print(df)

        print('\r\n========================== drop DataFrame "CA" row data ================================\r\n')
        # Pass the row name to the labels parameter with axis=0 to drop it.
        df.drop(labels='CA', axis=0, inplace=True)
        print(df)


    if __name__ == '__main__':
        delete_row_in_dataframe()
- The above example prints the output below.

    ========================== original DataFrame data ================================
         Position  Programming Language  Operating System
    USA         1                python           Windows
    UK          5                  java             Linux
    CA          8                   c++             macOS

    ========================== drop DataFrame "USA" row data ================================

        Position  Programming Language  Operating System
    UK         5                  java             Linux
    CA         8                   c++             macOS

    ========================== drop DataFrame "UK" row data ================================

        Position  Programming Language  Operating System
    CA         8                   c++             macOS

    ========================== drop DataFrame "CA" row data ================================

    Empty DataFrame
    Columns: [Position, Programming Language, Operating System]
    Index: []