
Best Self-hosted Distributed Data Engines

A curated collection of the best self-hosted distributed data engines. Engines such as Apache Spark, Trino, and Dremio execute data processing tasks across clusters of machines. They handle batch and streaming workloads with support for SQL, Python, or custom transformations, and they power ETL, machine learning pipelines, and big data processing at scale.

 

 
 
