Pachyderm

Performs distributed computations using Docker containers.

Made by Hewlett Packard Enterprise

  • Email/Help Desk

  • Knowledge Base

  • FAQs/Forum

What is Pachyderm?

Pachyderm is an open source data science platform that combines data lineage with end-to-end pipelines on Kubernetes, engineered for the enterprise. It is a cost-effective solution that lets data engineering teams automate complex pipelines with sophisticated data transformations across any type of data. Pachyderm provides parallelized processing of multi-stage, language-agnostic pipelines with data versioning and data lineage tracking, which makes it a CI/CD engine for data. Pachyderm positions itself as a leader in data versioning and pipelines for MLOps, providing the data foundation that lets data science teams automate and scale their machine learning lifecycle while keeping results reproducible. With over $40 million raised in three funding rounds from leading investors, Pachyderm offers both a commercial Pachyderm Enterprise Edition and an open source Pachyderm Community Edition, helping customers get their ML and AI projects to market faster, lower data processing and storage costs, and meet strict data governance requirements.

Highlights

  • Data Lineage and End-to-End Pipelines
  • Parallelized Processing of Multi-stage, Language-agnostic Pipelines
  • Data Versioning and Reproducibility for Machine Learning Lifecycle Automation

Platforms

  • Mobile iPhone
  • Cloud, SaaS, Web-based
  • Mac
  • Linux
  • Web
  • On-Premise Windows
  • Web-based
  • Mobile Android
  • Desktop Chromebook
  • Mobile iPad
  • Desktop Linux
  • On-Premise Linux
  • Desktop Mac
  • Desktop Windows

Languages

  • English

Features

    • Deployed with CoreOS

    • Git-like File System

    • Microservice Architecture

    • Dockerized MapReduce
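
To make the "Git-like File System" and data lineage ideas more concrete, here is a minimal Python sketch of a content-addressed commit store. It is a toy illustration only and is not Pachyderm's API: the `ToyRepo` class, its `commit` and `lineage` methods, and the file names are all hypothetical. It assumes that a commit is identified by a hash of its file contents and its upstream (provenance) commits, so identical inputs always produce the same identifier.

```python
import hashlib
import json


class ToyRepo:
    """Toy content-addressed commit store (hypothetical, not Pachyderm's API)."""

    def __init__(self):
        # Maps commit id -> {"files": {...}, "provenance": [...]}
        self.commits = {}

    def commit(self, files, provenance=()):
        # Hash each file's bytes, then hash the whole record, so the commit id
        # depends only on the data and on the upstream commits it came from.
        record = {
            "files": {
                name: hashlib.sha256(data).hexdigest()
                for name, data in sorted(files.items())
            },
            "provenance": sorted(provenance),
        }
        payload = json.dumps(record, sort_keys=True).encode()
        commit_id = hashlib.sha256(payload).hexdigest()[:12]
        self.commits[commit_id] = record
        return commit_id

    def lineage(self, commit_id):
        # Walk provenance links to collect every upstream commit exactly once.
        seen = []
        stack = list(self.commits[commit_id]["provenance"])
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.append(current)
                stack.extend(self.commits[current]["provenance"])
        return seen


repo = ToyRepo()
raw = repo.commit({"data.csv": b"1,2,3\n"})
cleaned = repo.commit({"data.csv": b"1,2,3"}, provenance=[raw])
model = repo.commit({"model.bin": b"weights"}, provenance=[cleaned])
print(repo.lineage(model) == [cleaned, raw])
```

In this sketch, asking for the lineage of the `model` commit returns the cleaned data commit followed by the raw data commit, which is the kind of traceability the listing refers to as data lineage. A real deployment would track this inside the platform rather than in application code.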