A dictionary is a built-in Python data structure that consists of key-value pairs. In this short how-to article, we will learn how to convert a dictionary to a DataFrame in Pandas and PySpark.
The DataFrame constructor can create a DataFrame directly from a dictionary. The keys become the column names, and each value (a list here) supplies the entries of that column, so the i-th items of the lists together form the i-th row.
import pandas as pd

# create a dictionary
A = {
    "name": ["John", "Jane"],
    "age": [20, 24]
}

# convert to a DataFrame
df = pd.DataFrame(A)
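The constructor treats each key as a column. When a dictionary instead maps each key to one record, that default layout is not what you want. A minimal sketch, assuming a hypothetical dictionary `records` keyed by name (the `city` values are made up for illustration), uses `pd.DataFrame.from_dict` with `orient="index"` so that the keys become row labels:

```python
import pandas as pd

# hypothetical dictionary: each key maps to one record (row), not one column
records = {"John": [20, "NY"], "Jane": [24, "LA"]}

# orient="index" turns the keys into row labels;
# columns= names the positions inside each list
df_rows = pd.DataFrame.from_dict(records, orient="index", columns=["age", "city"])
```

If the names should be a regular column rather than the index, calling `df_rows.reset_index()` moves them into one.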
Although alternatives exist, the most practical way to create a PySpark DataFrame from a dictionary is to first convert the dictionary to a Pandas DataFrame and then convert that to a PySpark DataFrame.
import pandas as pd
from pyspark.sql import SparkSession

# start (or reuse) a Spark session
spark = SparkSession.builder.getOrCreate()

# create a dictionary
A = {
    "name": ["John", "Jane"],
    "age": [20, 24]
}

# convert to a Pandas DataFrame
df = pd.DataFrame(A)

# from Pandas to PySpark
df_pyspark = spark.createDataFrame(df)
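One of the alternatives that skips Pandas is to transpose the dictionary into a list of row tuples and hand those to `spark.createDataFrame` together with the column names. The sketch below assumes every list in the dictionary has the same length. The helper names `dict_to_rows` and `dict_to_spark` are hypothetical and not part of PySpark.

```python
def dict_to_rows(data):
    """Transpose a column-oriented dict into (rows, column_names)."""
    columns = list(data.keys())
    # zip(*values) pairs up the i-th item of every list into one row tuple
    rows = list(zip(*data.values()))
    return rows, columns


def dict_to_spark(spark, data):
    """Build a PySpark DataFrame from a dict, given an existing SparkSession."""
    rows, columns = dict_to_rows(data)
    # a list of column names as the schema lets Spark infer the column types
    return spark.createDataFrame(rows, schema=columns)


A = {"name": ["John", "Jane"], "age": [20, 24]}
rows, columns = dict_to_rows(A)
```

With an active session, `dict_to_spark(spark, A)` should give the same two columns as the Pandas route. Going through Pandas is still the shorter option when Pandas is already installed.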