A dictionary is a built-in Python data structure that stores key-value pairs. In this short how-to article, we will learn how to convert a dictionary to a DataFrame in Pandas and PySpark.
The DataFrame constructor can be used to create a DataFrame from a dictionary. The keys become the column names, and each value (a list here) supplies the entries of that column, so the i-th items of all the lists together form the i-th row.
    import pandas as pd

    # create a dictionary
    A = {
        "name": ["John", "Jane"],
        "age": [20, 24]
    }

    # convert to a DataFrame
    df = pd.DataFrame(A)
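To check how the keys and values are mapped, the short sketch below inspects the resulting DataFrame. It also shows `pd.DataFrame.from_dict` with `orient="index"`, which is not covered in the article and is included only for contrast: with that option the keys become row labels instead of column names. The variable name `df_by_key` is illustrative.

```python
import pandas as pd

# Same dictionary as in the article: keys are column names, lists are column values.
A = {"name": ["John", "Jane"], "age": [20, 24]}

df = pd.DataFrame(A)
print(df.columns.tolist())  # column labels come from the dictionary keys
print(df.shape)             # (number of rows, number of columns)

# For contrast: with orient="index" the keys become row labels instead.
df_by_key = pd.DataFrame.from_dict(A, orient="index")
print(df_by_key.index.tolist())
```

The default orientation is usually what you want when each key holds one column of data.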
Although some alternatives exist, the most practical way of creating a PySpark DataFrame from a dictionary is to first convert the dictionary to a Pandas DataFrame and then convert that to a PySpark DataFrame.
    import pandas as pd
    from pyspark.sql import SparkSession

    # start (or reuse) a Spark session
    spark = SparkSession.builder.getOrCreate()

    # create a dictionary
    A = {
        "name": ["John", "Jane"],
        "age": [20, 24]
    }

    # convert to a Pandas DataFrame
    df = pd.DataFrame(A)

    # from Pandas to PySpark
    df_pyspark = spark.createDataFrame(df)
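As a rough sketch of one of the alternatives mentioned above, the dictionary can be transposed into a list of row tuples and passed to `spark.createDataFrame` together with the column names, skipping Pandas entirely. The helper names `dict_to_rows` and `dict_to_spark` are hypothetical, and the sketch assumes every list in the dictionary has the same length (`zip` would silently truncate to the shortest one otherwise).

```python
def dict_to_rows(data):
    """Transpose a dict of equal-length columns into (rows, column_names)."""
    columns = list(data.keys())
    rows = list(zip(*(data[c] for c in columns)))
    return rows, columns


def dict_to_spark(spark, data):
    """Hypothetical helper: build a PySpark DataFrame without going through Pandas."""
    rows, columns = dict_to_rows(data)
    # createDataFrame accepts a list of tuples plus a list of column names
    return spark.createDataFrame(rows, schema=columns)


A = {"name": ["John", "Jane"], "age": [20, 24]}
rows, columns = dict_to_rows(A)
print(rows)
print(columns)
```

With an active session, `dict_to_spark(spark, A)` would be called in place of the Pandas round trip. Spark infers the column types from the Python values in this case.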