The DataFrames we work with in real life are often large and contain many columns. When it is not practical to check the column names visually, it helps to have them in a list.
In this short how-to article, we will learn how to create a list from column names in Pandas and PySpark DataFrames.
Pandas
The columns attribute (it is a property, not a method, so it is accessed without parentheses) returns an Index object that contains all the column names. It can be converted to a list with the list constructor or the tolist method.
# with list constructor
col_list = list(df.columns)
# with tolist method
col_list = df.columns.tolist()
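To make the two Pandas options concrete, here is a minimal, self-contained sketch. The sample DataFrame and its column names (id, name, score) are made up for illustration and do not come from any particular dataset.

```python
import pandas as pd

# Hypothetical sample data; the column names are assumptions for this sketch.
df = pd.DataFrame({"id": [1, 2], "name": ["a", "b"], "score": [0.5, 0.9]})

# df.columns is an Index object, not a plain Python list.
columns_index = df.columns

# Option 1: pass the Index to the list constructor.
col_list = list(columns_index)

# Option 2: call the tolist method on the Index.
col_list_alt = columns_index.tolist()

print(col_list)                  # ['id', 'name', 'score']
print(col_list == col_list_alt)  # True
```

Both options give the same plain Python list, so the choice between them is mostly a matter of taste.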
PySpark
In PySpark, the columns attribute already returns a Python list of column names. Thus, we do not need an additional conversion.
col_list = df.columns
This question is also being asked as:
- Pandas column names to list
- Displaying the column names of a DataFrame