- When is this the right engagement versus a smaller scope?
- When you have a working pilot and the next step is paying customers. If you're earlier — exploring the idea, validating fit — the 2-Week Production Pilot is a better starting point. If you already have a v1 in production and need ongoing engineering support, Fractional AI Engineering is a better fit.
- What does it cost?
- Fixed fee for the full engagement, scoped on the first discovery call. Pricing depends on the size of the product surface, integration complexity, and how large the launch-readiness gap is. Inference costs are passed through at cost, with no markup on tokens.
- What does production mean to you?
- Real users, real money or real-stakes decisions, real monitoring. That means rate limits, retries, queues, observability, cost controls, auth, billing or entitlement plumbing, an eval that runs in CI, alerts that wake the right person, and a documented runbook. If any of that is missing, it isn't production.
- Who does the work?
- Three to four engineers from our Toronto-based team, led by Vatsal. The people who scope the engagement are the people who write the code. The full team is named on our team page — you can see and talk to them before we start.
- Do we own the code?
- Yes. Everything ships into your repository from commit one — the application code, the eval cases, the prompts, the infrastructure-as-code, the runbook. No vendor lock-in, no recurring license tax.
- What if the launch hits a wall — bad eval results, slow adoption, surprise cost?
- We share progress and risk weekly, not at the end. Bad results show up in the Friday eval report and we cut scope or change the approach in week six rather than week twelve. If the product needs to be repositioned, you hear it from us early enough to act on it.
- Where does it run?
- Default is your existing cloud: AWS, Azure, or GCP. For Canadian data residency, we deploy on AWS ca-central-1 (including Amazon Bedrock in that region) or Azure Canada Central. Inference can run through Anthropic, OpenAI, Google, or open-weights models, depending on cost, latency, and residency requirements.
- What happens after week 12?
- Most teams keep us on a Fractional AI Engineering retainer for the next 3–6 months to ship the next set of features and respond to launch feedback. Some teams take it from there with their own engineers. Both paths are designed in from the start.