Monorepos for AI Projects: The Good, the Bad, and the Ugly
Recently, I met a team that used an AI‑focused monorepo containing everything: notebooks, training pipelines, microservices, and infrastructure code. In this post I share my observations on how their data scientists, engineers, and DevOps teams collaborated, and where things broke down. I also explore how KitOps helped introduce structure at a critical point: the transition from experimentation to production.
✅ The Good
When a monorepo works, it works because the alignment and velocity benefits outweigh any drawbacks. Here is what I noticed:
Shared context: Everyone, from data scientists to platform engineers, had visibility into the same repository, which fostered fast collaboration and fewer misunderstandings.
Fast iteration: A data scientist could tweak a model and then message an engineer to wire it up to an API within the same codebase.
Unified CI/CD: Teams could run pipelines for end‑to‑end tests, integrate model‑training jobs into GitHub Actions, and deploy inference microservices using the same scripts.
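To give a feel for what that looked like, here is a minimal sketch of a shared CI entry point; the stage names and script paths are hypothetical, not the team's actual layout:

```python
# ci.py -- a hypothetical single entry point shared by every pipeline stage.
import subprocess
import sys

# Each stage maps to the same scripts, whether triggered by a pull
# request, a scheduled training run, or a deployment.
STAGES = {
    "test": ["pytest", "tests/"],
    "train": ["python", "pipelines/train.py"],
    "deploy": ["bash", "scripts/deploy_inference.sh"],
}

def run_stage(name: str) -> None:
    """Run one pipeline stage, failing the build if it fails."""
    subprocess.run(STAGES[name], check=True)

if __name__ == "__main__":
    # e.g. `python ci.py test train` from a GitHub Actions step
    for stage in sys.argv[1:]:
        run_stage(stage)
```

Everyone calling the same scripts is exactly what made iteration fast, and, as we will see, also what made the pipelines brittle later.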
❌ The Bad
This setup had major flaws, some of them critical to production readiness.
No model provenance: Models trained in notebooks were often dumped into S3 buckets with ad‑hoc names; there was no versioning or traceability. Teams embedded their names in the model filenames, a practice that did not age well (the sketch after this list shows the pattern).
Reproducibility gaps: Because experiments were often driven from notebooks, they lacked pinned dependencies or runtime configuration. Rerunning a past experiment was, at best, guesswork.
Security blind spots: With no SBOMs or attestations, the security team had no idea what was running in production, creating a compliance risk.
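The first two problems compounded each other. A hypothetical reconstruction of the upload pattern (bucket and file names invented) shows how little context survives the notebook session:

```python
# Hypothetical reconstruction of the ad-hoc upload anti-pattern.
import boto3

s3 = boto3.client("s3")

# No version, no commit, no dataset reference, no pinned environment --
# just a name typed in a notebook cell. Six months later, nobody can
# say what produced "teamA_final_v2_REAL.pt" or how to retrain it.
s3.upload_file("model.pt", "ml-models", "teamA_final_v2_REAL.pt")
```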
😬 The Ugly
Some things technically “worked,” but only through tribal knowledge, individual heroics, or duct‑taped workflows.
Manual model handoffs: Data scientists pinged infrastructure engineers on Slack with pointers to model files. There was no formalized way to package a model.
Inconsistent naming conventions: Some model folders were named `teamXXX_final_modelv2`, while others used names like `modelXX_2024_05_19`. Pipelines frequently broke when a model name changed or a new model appeared.
Overloaded CI pipelines: A single Git push could retrigger training, redeploy the inference container, and run unrelated tests. The infrastructure was brittle because the monorepo lacked boundaries between experimentation and production (the sketch after this list shows the kind of path‑based boundary that was missing).
Blurred ownership: When a model in production failed, nobody knew whether to call the data scientist, ML engineer, or platform SRE. The repository did not encode accountability.
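For illustration, a boundary can be as simple as gating CI jobs on the paths a push actually touched. This is a hypothetical sketch with assumed directory names, not what the team had:

```python
# Hypothetical sketch: choose CI jobs from the changed paths, so a
# notebook edit no longer retrains and redeploys everything.
import subprocess

def changed_paths(base: str = "origin/main") -> list[str]:
    """List files changed relative to the base branch."""
    diff = subprocess.run(
        ["git", "diff", "--name-only", base],
        capture_output=True, text=True, check=True,
    )
    return diff.stdout.splitlines()

def jobs_to_run(paths: list[str]) -> set[str]:
    """Map changed paths to the pipeline jobs they should trigger."""
    jobs = set()
    for path in paths:
        if path.startswith(("notebooks/", "training/")):
            jobs.add("train")
        if path.startswith("services/inference/"):
            jobs.add("deploy")
        if path.startswith("tests/"):
            jobs.add("test")
    return jobs

if __name__ == "__main__":
    print(sorted(jobs_to_run(changed_paths())))
```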
🧰 How KitOps Helped
KitOps introduced structure at the artifact level without forcing the team to refactor the entire repository.
Clear handoff via ModelKit artifacts
Data scientists used the `kit` CLI and `pykitops` to export trained models as self‑contained, versioned ModelKits that included:
Weights
Metadata (input and output schema)
Optional model cards as README files
Runtime dependencies (for example, tokenizers, configuration files, and sometimes Python code)
These kits became immutable units that downstream teams could trust.
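Here is a minimal sketch of that export flow, driving the `kit` CLI from Python. The Kitfile fields follow the KitOps documentation, but the package name, registry, and file paths are invented for illustration (the same flow can be scripted with `pykitops`):

```python
# Minimal sketch: describe the artifact in a Kitfile, then pack and
# push it with the kit CLI. Names, paths, and registry are illustrative.
import subprocess
from pathlib import Path

KITFILE = """\
manifestVersion: "1.0"
package:
  name: sentiment-classifier
  version: 1.2.0
model:
  name: sentiment-classifier
  path: ./model.pt
code:
  - path: ./tokenizer/
docs:
  - path: ./README.md
"""

Path("Kitfile").write_text(KITFILE)

TAG = "registry.example.com/ml/sentiment-classifier:1.2.0"

# Pack weights, metadata, and docs into one immutable, versioned artifact...
subprocess.run(["kit", "pack", ".", "-t", TAG], check=True)
# ...and push it to the registry for downstream teams to pull.
subprocess.run(["kit", "push", TAG], check=True)
```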
Decoupled training and inference
ModelKits were pushed to an OCI‑compatible registry, where inference microservices could pull them at runtime. Training scripts no longer needed to be bundled into deployment images, and the same model could be pulled into staging, production, or offline evaluation environments with confidence, allowing platform engineers to treat inference containers as cattle rather than pets.
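At the serving end, the flow might look like this sketch: the container stays generic, and the model reference becomes configuration (tag, environment variable, and paths are invented):

```python
# Hypothetical inference-service startup: pull the ModelKit at runtime
# instead of baking weights into the container image.
import os
import subprocess

# Which model to serve is configuration, so one container image works
# in staging, production, and offline evaluation alike.
tag = os.environ.get(
    "MODELKIT_TAG", "registry.example.com/ml/sentiment-classifier:1.2.0"
)

# Unpack the ModelKit's contents into the serving directory.
subprocess.run(["kit", "unpack", tag, "-d", "/srv/model"], check=True)

# ...load the weights from /srv/model and start serving from here.
```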
Auditability and compliance
The team had not yet added SBOMs to ModelKits, but they recorded the monorepo commit SHA as an attestation alongside each ModelKit they created. This gave the security team visibility into what was running in production and where it came from, easing a key compliance bottleneck.
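One lightweight way to do this, sketched below, is to capture the commit SHA at pack time and ship it inside the kit. The exact mechanism here (a PROVENANCE.txt listed in the Kitfile's docs section, plus a SHA‑stamped tag) is an assumption, not necessarily the team's implementation:

```python
# Sketch: stamp each ModelKit with the monorepo commit that produced it,
# so every deployed model traces back to an exact source tree.
import subprocess
from pathlib import Path

sha = subprocess.run(
    ["git", "rev-parse", "HEAD"],
    capture_output=True, text=True, check=True,
).stdout.strip()

# Ship the provenance record inside the kit (assumes the Kitfile's
# docs section lists PROVENANCE.txt).
Path("PROVENANCE.txt").write_text(f"monorepo commit: {sha}\n")

# Encode the commit in the tag as well, so it is visible in the registry.
tag = f"registry.example.com/ml/sentiment-classifier:sha-{sha[:12]}"
subprocess.run(["kit", "pack", ".", "-t", tag], check=True)
subprocess.run(["kit", "push", tag], check=True)
```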
Standardization without a repo rewrite
The team adopted one simple convention: if a model is going to production, it must be exported as a ModelKit. That rule turned Git chaos into structured deployment boundaries.
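A convention like that is easy to enforce mechanically. Here is a hypothetical deploy gate; it assumes `kit inspect --remote`, which reads a ModelKit's manifest from the registry without pulling it, so treat the exact flags as an assumption:

```python
# Hypothetical deploy gate: production deploys only accept references
# that resolve to a ModelKit in the registry.
import subprocess
import sys

def is_modelkit(reference: str) -> bool:
    """True if the reference resolves to a ModelKit in the registry."""
    result = subprocess.run(
        ["kit", "inspect", "--remote", reference], capture_output=True
    )
    return result.returncode == 0

if __name__ == "__main__":
    ref = sys.argv[1]
    if not is_modelkit(ref):
        sys.exit(f"refused: {ref} is not a ModelKit; run `kit pack` first")
    print(f"ok: deploying {ref}")
```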
Monorepos are a double‑edged sword. Their collaboration benefits are impressive, but scaling them, especially in AI and ML systems, requires discipline.
KitOps did not “fix” the monorepo; arguably, it did not need fixing. Instead, it created clean seams where they mattered most: at the handoff between teams and in the lifecycle from experimentation to production. That was enough.