Ensure reliable, on-target Gen-AI responses
Protect intellectual property and ensure compliance
Safely navigate GenAI: Detect and avoid off-topic conversations
Keep interactions tasteful, filter NSFW content
Secure company data: Detect and anonymize sensitive info
Shield data from smart LLM SQL queries
Detect and filter out malicious input for prompt integrity
Safeguard LLM: Keep model instructions confidential
Explore LLM interactions for user engagement insights
Track costs, queries, and tokens for budget control
Tailored production ML dashboards to monitor key metrics
Real-time ML monitoring to detect drifts and monitor predictions
Direct Data Connectors: Monitor and observe billions of predictions
Root Cause Analysis to gain actionable insights and explore model predictions
LLM Observability for your ML: Monitor, troubleshoot and enhance efficiency
Explainable AI to understand, ensure trust, and communicate predictions
Tailored Aporia Observe for your models: Integrate any model in minutes
Integrate Aporia to every LLM and tool in the market
Empower tabular models with Aporia
Streamline AI Act compliance with Aporia Guardrails and Observe
Unlock potential in CV & NLP models
A team of Cybersecurity, Compliance, and AI Experts that ensures Aporia users top-tier protection
Optimize LLM & GenAI apps for peak performance
Your go-to resource for Aporia insights and guides
Integrate Aporia to your LLM as a Proxy with Guardrail Policies
Integrate Aporia with Your Firewall for AI Tool Security
Easily Integrate and Monitor ML Models in Production
Define ML Observability Resources as Code with SDK
Learn about AI control from our experts
Your dictionary for AI terminology.
Step-by-step guides to master AI
Dive into our GitHub projects and examples
Unlock AI secrets with our eBooks
Elevate your GenAI and LLM knwoledge
Navigate the core of ML observability
Metrics, feature importance and more
DataFrames are great for data cleaning, analysis, and visualization. However, they cannot be used in storing or transferring data. Once we are done with our analysis, we need to write the DataFrame into a file.
One of the commonly used file formats for this purpose is CSV. In this how-to article, we will learn how to write Pandas and PySpark DataFrames to a CSV file
The to_csv function can be used for this task. We just need to specify the file path.
If we use the default parameter values, row names (i.e. index) are written in the file so when we read back the csv file into a DataFrame, the index will be shown as a new column. We can change this behavior by setting the index parameter as False.
We also have the option to write only some of the columns, which is useful if there are some redundant columns in the DataFrame. The list of columns to be written is given to the columns parameter.
We can use the csv method provided by the DataFrameWriter class.
This line of code creates a folder named results and writes the csv file in it. PySpark writes files in partitions by default. We can have the data written in a single csv file by using the repartition method.
Unlike Pandas, PySpark does not write the header (i.e. column names) to the csv file. We can change this behavior by setting the header option as True.