Skip to main content

One post tagged with "Polars"

View All Tags

Polars versus Spark

· 6 min read
Hayssam Saleh
Starlake Core Team

Introduction

Polars is often compared to Spark. In this post, I will highlight the main differences and the best use cases for each in my data engineering activities.

As a Data Engineer, I primarily focus on the following goals:

  1. Parsing files, validating their input, and loading the data into the target data warehouse.
  2. Once the data is loaded, applying transformations by joining and aggregating the data to build KPIs.

However, on a daily basis, I also need to develop on my laptop and test my work locally before delivering it to the CI pipeline and then to production.

What about my fellow data scientist colleagues? They need to run their workload on production data through their favorite notebook environment.