Replace NaN Values with Zeros in a Pandas or PySpark DataFrame

NaN values, also called missing values, mark data that we do not have. Nobody likes missing values in a dataset, but in some cases they are unavoidable, so we need to know how to handle them properly.

There are different ways of handling missing values. In this how-to article, we will learn how to replace NaN values with zeros in Pandas and PySpark DataFrames.

Pandas

The fillna function replaces missing values. We pass the replacement value as its argument, and it returns a new object with the missing values filled, so we assign the result back.

# Replace all missing values in the DataFrame
df = df.fillna(0)

# Replace missing values in a specific column
df["f2"] = df["f2"].fillna(0)
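
To make the two snippets above easier to try out, here is a minimal, self-contained sketch. The sample data and the names all_filled and f2_filled are made up for illustration; only the column name f2 comes from the snippets above.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: three NaN values spread over two columns
df = pd.DataFrame({
    "f1": [1.0, np.nan, 3.0],
    "f2": [np.nan, 5.0, np.nan],
})

# Replace every NaN in the DataFrame with 0 (fillna returns a new DataFrame)
all_filled = df.fillna(0)

# Replace NaN only in column "f2"; the NaN in "f1" is left as it is
f2_filled = df.copy()
f2_filled["f2"] = f2_filled["f2"].fillna(0)

print(all_filled)
print(f2_filled)
```

Working on a copy in the second case keeps the original df untouched, which makes it easy to compare the results side by side.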

PySpark

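We can use either the fillna or the na.fill function. They are aliases of each other and return the same result. With a numeric replacement such as 0, only numeric columns are filled; columns of other types are left unchanged.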

# Replace all missing values in the DataFrame
df = df.na.fill(0)

# Replace missing values in a specific column
df = df.na.fill(0, subset=["f2"])

This question is also being asked as:

How to replace NaN values in Python?
How to replace NaN value with some other value in Pandas?

Aporia Team