In this short how-to article, we will learn how to group DataFrame rows into a list in Pandas and PySpark. Rows will be grouped by the distinct values of one column, and for each group the values of another column will be combined into a list.
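Both snippets below assume a DataFrame named df with a "Team" column and a "Member" column. As a point of reference, a hypothetical sample (the column names match the snippets, but the data itself is made up for illustration) could be built like this:

import pandas as pd

# Hypothetical sample data: each row pairs a team with one of its members.
df = pd.DataFrame({
    "Team":   ["Red", "Red", "Blue", "Blue", "Blue"],
    "Member": ["Ana", "Bob", "Cara", "Dan", "Eve"],
})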
In Pandas, we group the rows with the groupby function and then apply the list constructor to the column that contains the values:
# Group by "Team" and collect each group's "Member" values into a list
Members = df.groupby("Team", as_index=False).agg(Members=("Member", list))
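With the hypothetical sample data above, the result has one row per team, with that team's members collected into a list. It would look roughly like this:

print(Members)
#    Team           Members
# 0  Blue  [Cara, Dan, Eve]
# 1   Red        [Ana, Bob]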
To do the same operation in PySpark, we can use the collect_list function together with groupby:
from pyspark.sql import functions as F

# Group by "Team" and collect each group's "Member" values into a list
Members = df.groupby("Team").agg(F.collect_list("Member"))
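To run this end to end, you also need a SparkSession and a Spark version of the sample data. A minimal, self-contained sketch (the session setup and sample data are assumptions for illustration, not part of the original snippet):

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Hypothetical end-to-end example: create a session, build the sample data,
# and apply the same collect_list aggregation as above.
spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame(
    [("Red", "Ana"), ("Red", "Bob"), ("Blue", "Cara"), ("Blue", "Dan"), ("Blue", "Eve")],
    ["Team", "Member"],
)

Members = df.groupby("Team").agg(F.collect_list("Member"))
Members.show(truncate=False)
# Each row now holds a team and the list of its members,
# e.g. Blue -> [Cara, Dan, Eve] and Red -> [Ana, Bob]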