This article will tell you how to use pandas DataFrame object’s loc, iloc, and iat attribute to query a DataFrame object’s rows, columns, and elements at a special index(row_index, column_index) with examples.
1. Example Data.
- In this example, we will create a DataFrame object that contains the below data.
- Below is the python source code that generates the DataFrame data.
import pandas as pd ''' This function create a python pandas DataFrame object with a 2 dimension array. ''' def create_dataframe_from_2_dimension_array(): pd.set_option('display.unicode.east_asian_width', True) ''' Define a 2 dimension array, each element in the array's first dimension is a list. It contains the position number, programming language and operating system ''' data = [[1, 'python', 'Windows'], [5, 'java', 'Linux'],[8, 'c++', 'macOS']] # Define the column list, each element in the list is the column label. columns = ['Position', 'Programming Language', 'Operating System'] name = ['USA', 'UK', 'CA'] df = pd.DataFrame(data=data, index=name, columns=columns) print(df, '\r\n') # Return the python pandas DataFrame object. return df if __name__ == '__main__': create_dataframe_from_2_dimension_array()
- Below is the DataFrame object data. We can see that there are 3 columns ( Position, Programming Language, Operating System ). The row index name is USA, UK, CA.
Position Programming Language Operating System USA 1 python Windows UK 5 java Linux CA 8 c++ macOS
2. Pandas DataFrame loc, iloc, iat Attribute.
- loc(row_name_list, column_name_list): Get the DataFrame object’s data by row name label list, and column name label list.
- iloc(row_index_number_list, column_index_number_list): Get the DataFrame object’s data by row index number list, and column index number list.
- iat(element_row_index_number, element_column_index_number): Get the DataFrame object located at the provided row index number and column index number ( not a number list).
3. PandasDataFrame loc, iloc, iat Attribute Examples.
- query_one_row().
import pandas as pd def query_one_row(): df = create_dataframe_from_2_dimension_array() # query the data row with index name 'USA' row_data_usa = df.loc['USA'] print('row_data_usa = \r\n', row_data_usa, '\r\n') # query the data row with the row index number 2. row_data_number_2 = df.iloc[2] print('row_data_number_2 = \r\n', row_data_number_2, '\r\n') if __name__ == '__main__': query_one_row() ============================================================================ Below is the execution output. Position Programming Language Operating System USA 1 python Windows UK 5 java Linux CA 8 c++ macOS row_data_usa = Position 1 Programming Language python Operating System Windows Name: USA, dtype: object row_data_number_2 = Position 8 Programming Language c++ Operating System macOS Name: CA, dtype: object
- query_multiple_rows().
import pandas as pd def query_multiple_rows(): df = create_dataframe_from_2_dimension_array() # query 2 rows with the provided row index name list. row_data_usa_uk = df.loc[['USA', 'UK']] print('row_data_usa_uk = \r\n', row_data_usa_uk, '\r\n') # query 2 rows with the provided row index number list. row_data_number_1_2 = df.iloc[[1, 2]] print('row_data_number_1_2 = \r\n', row_data_number_1_2, '\r\n') # query 1 row with the provided row index number list slice. row_data_number_1 = df.iloc[1:2] print('row_data_number_1 = \r\n', row_data_number_1, '\r\n') if __name__ == '__main__': query_multiple_rows() ===================================================================================== Below is the execution output. Position Programming Language Operating System USA 1 python Windows UK 5 java Linux CA 8 c++ macOS row_data_usa_uk = Position Programming Language Operating System USA 1 python Windows UK 5 java Linux row_data_number_1_2 = Position Programming Language Operating System UK 5 java Linux CA 8 c++ macOS row_data_number_1 = Position Programming Language Operating System UK 5 java Linux
- query_one_column().
import pandas as pd def query_one_column(): df = create_dataframe_from_2_dimension_array() # set the first arguments to : to query all rows's 'Programming Language' column. column_data_pl = df.loc[:, ['Programming Language']] print('column_data_pl = \r\n', column_data_pl, '\r\n') # the first argument is the row index number list( : means all), the second argument is the column index number. column_data_os = df.iloc[:, [2]] print('column_data_os = \r\n', column_data_os, '\r\n') if __name__ == '__main__': query_one_column() ================================================================================= Below is the execution output. Position Programming Language Operating System USA 1 python Windows UK 5 java Linux CA 8 c++ macOS column_data_pl = Programming Language USA python UK java CA c++ column_data_os = Operating System USA Windows UK Linux CA macOS
- query_multiple_columns().
import pandas as pd def query_multiple_columns(): df = create_dataframe_from_2_dimension_array() # : means all rows, ['Programming Language', 'Operating System'] means the 2 columns. column_data_pl_os = df.loc[:, ['Programming Language', 'Operating System']] print('column_data_pl_os = \r\n', column_data_pl_os, '\r\n') # : means all rows, 1:2 means column index number 1 only not include column with index number 2. column_data_1 = df.iloc[:, 1:2] print('column_data_1 = \r\n', column_data_1, '\r\n') column_data_pl_os = df.iloc[:,1:] print('column_data_pl_os = \r\n', column_data_pl_os, '\r\n') column_data_pos_pl = df.iloc[:,[0, 1]] print('column_data_pos_pl = \r\n', column_data_pos_pl, '\r\n') if __name__ == '__main__': query_multiple_columns() ======================================================================================== Below is the above code execution result. Position Programming Language Operating System USA 1 python Windows UK 5 java Linux CA 8 c++ macOS column_data_pl_os = Programming Language Operating System USA python Windows UK java Linux CA c++ macOS column_data_1 = Programming Language USA python UK java CA c++ column_data_pl_os = Programming Language Operating System USA python Windows UK java Linux CA c++ macOS column_data_pos_pl = Position Programming Language USA 1 python UK 5 java CA 8 c++
- query_by_row_and_column().
import pandas as pd def query_by_row_and_column(): df = create_dataframe_from_2_dimension_array() # query DataFrame with both rows and columns specified by row name list and column name list. rows_and_columns = df.loc[['USA','UK'], ['Programming Language', 'Operating System']] print('rows_and_columns =\r\n ', rows_and_columns, '\r\n') # query DataFrame with both rows and columns specified by row index number and column index number list. rows_and_columns = df.iloc[2, :2] print('rows_and_columns =\r\n ', rows_and_columns) if __name__ == '__main__': query_by_row_and_column() =============================================================================================== Below is the above source code execution result. Position Programming Language Operating System USA 1 python Windows UK 5 java Linux CA 8 c++ macOS rows_and_columns = Programming Language Operating System USA python Windows UK java Linux rows_and_columns = Position 8 Programming Language c++ Name: CA, dtype: object
- query_at_index().
import pandas as pd def query_at_index(): df = create_dataframe_from_2_dimension_array() # query the DataFrame element at the specified row & column index number. column_data_at_index_1_1 = df.iat[1, 1] print('column_data_at_index = \r\n', column_data_at_index_1_1, '\r\n') if __name__ == '__main__': query_at_index() ===================================================================================== Below is the above example source code execution result. Position Programming Language Operating System USA 1 python Windows UK 5 java Linux CA 8 c++ macOS column_data_at_index = java
- query_bigger_than().
import pandas as pd def query_bigger_than(): df = create_dataframe_from_2_dimension_array() # query the data at provided rows and column, and then compare the column value to an integer. print(df.loc[:, 'Position'] > 5) if __name__ == '__main__': query_bigger_than() ==================================================================================== Above example source code execution result. Position Programming Language Operating System USA 1 python Windows UK 5 java Linux CA 8 c++ macOS USA False UK False CA True Name: Position, dtype: bool