Written by Gareth Simono, Founder and CEO of Agentik {OS}. Full-stack developer and AI architect with years of experience shipping production applications across SaaS, mobile, and enterprise platforms. Gareth orchestrates 267 specialized AI agents to deliver production software 10x faster than traditional development teams.
Battle-tested patterns for using Claude Code in production apps. Project setup, CLAUDE.md config, testing, and deployment automation that actually works.

I've used Claude Code on over a dozen production applications. Some went smoothly from day one. Others were a mess until I figured out what I was doing wrong.
The difference was never Claude Code's capability. It was how I set up the project for it.
After all those projects, I have a clear picture of what separates the teams that get extraordinary results from the teams that get mediocre ones. It comes down to preparation, not the model.
The single most impactful thing you can do before writing a line of code: write a thorough CLAUDE.md file. Think of it as your project's operating manual for AI agents. Not optional documentation that lives in a corner. The difference between an agent that guesses and an agent that knows.
I've seen CLAUDE.md files that are two paragraphs. I've seen ones that are five pages. The five-page ones produce dramatically better agent output. Every time.
Here is what a production-grade CLAUDE.md contains.
Architecture decisions with their rationale. Not just "we use Convex" but "we use Convex instead of Supabase because our data access patterns are highly relational but query-dynamic. We need reactive queries without building WebSocket infrastructure manually." When the agent understands the why, it makes better how decisions.
Coding conventions, specified precisely. "Use camelCase for variables" is too vague. Production CLAUDE.md reads like this:
```markdown
## Coding Conventions
- Boolean variables: always use is/has/should prefix (isLoading, hasError, shouldRedirect)
- Async functions: always use explicit return types
- Error handling: always use our custom AppError class, never throw raw strings
- Database queries: always through the repository layer, never direct from components
- Environment variables: always validate at startup with Zod, never read process.env inline
```

These constraints prevent entire categories of mistakes. The agent follows them automatically.
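To make the conventions concrete, here is a minimal sketch of code that satisfies them. The AppError class and parsePort function are hypothetical illustrations, not the project's actual implementations.

```typescript
// Hypothetical AppError class illustrating the "never throw raw strings" rule:
// every error carries a machine-readable code plus a human-readable message.
class AppError extends Error {
  constructor(
    public readonly code: string,
    message: string,
  ) {
    super(message);
    this.name = "AppError";
  }
}

// A function following the conventions above: explicit return type,
// errors thrown only via AppError, env value validated rather than trusted.
function parsePort(raw: string | undefined): number {
  if (raw === undefined) {
    throw new AppError("ENV_MISSING", "PORT is not set");
  }
  const port = Number(raw);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new AppError("ENV_INVALID", `PORT must be 1-65535, got "${raw}"`);
  }
  return port;
}
```

Because every rule is mechanical ("always X, never Y"), the agent can apply them without judgment calls, and a reviewer can spot violations at a glance.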
Testing requirements, non-negotiable. "Unit tests for all business logic. Integration tests for all API routes. Accessibility tests for all new components. Tests must pass before any PR description mentions the feature as complete."
Deployment procedures, fully specified. Build commands. Environment variable requirements. Deployment targets. Rollback procedures. The agent should be able to deploy without asking a single question.
Known gotchas. "The Convex client throws a specific error format when called from server components. Always wrap Convex calls in a try-catch that handles ConvexError." Every project accumulates these. Document them.
A CLAUDE.md that tells the agent everything it needs to know is worth more than a week of prompt engineering. Write it once. Maintain it obsessively.
Claude Code navigates well-organized codebases like a fish in water. It struggles with messy ones the same way every developer does, except it has less ability to ask clarifying questions.
Rules I follow on every project without exception:
Files under 500 lines. If a component file approaches that limit, it needs splitting. This is not just for AI. It is good engineering. But agents benefit disproportionately because they can hold an entire file in context without truncation. A 2,000-line component file means the agent cannot see the whole thing at once and makes decisions based on partial information.
Consistent naming that reflects purpose. Every route file follows the same pattern. Every component file follows the same pattern. Every utility follows the same pattern. Consistency lets the agent predict where things are and how they should look. Deviation creates confusion.
Clear separation of concerns. Business logic separate from UI logic. Database queries separate from business logic. API handling separate from data transformation. When responsibilities are cleanly separated, the agent modifies the right file every time.
```text
src/
  app/                  # Next.js App Router pages
    (dashboard)/        # Route group
      layout.tsx
      page.tsx
    api/                # API routes
      users/
        route.ts
  components/
    ui/                 # shadcn/ui components (never modify)
    features/           # Feature-specific components
      user-profile/
        index.tsx
        UserProfileHeader.tsx
        UserProfileForm.tsx
        types.ts
  lib/
    repositories/       # Database access layer
      user.repository.ts
    services/           # Business logic layer
      user.service.ts
    utils/              # Pure utility functions
  types/                # Shared TypeScript types
```
This structure is opinionated. It is opinionated deliberately. The agent knows exactly where every piece of code belongs without thinking about it.
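A minimal sketch of what the repository/service split looks like in practice. All names are illustrative, and the in-memory Map stands in for the real database client.

```typescript
import { randomUUID } from "node:crypto";

interface User {
  id: string;
  email: string;
}

// Repository layer (lib/repositories/): the only code that touches storage.
// Here a Map stands in for the actual database.
class UserRepository {
  private users = new Map<string, User>();

  async findByEmail(email: string): Promise<User | null> {
    for (const user of this.users.values()) {
      if (user.email === email) return user;
    }
    return null;
  }

  async insert(user: User): Promise<User> {
    this.users.set(user.id, user);
    return user;
  }
}

// Service layer (lib/services/): business rules only, no storage details.
class UserService {
  constructor(private readonly repo: UserRepository) {}

  async register(email: string): Promise<User> {
    if (await this.repo.findByEmail(email)) {
      throw new Error("email already registered");
    }
    return this.repo.insert({ id: randomUUID(), email });
  }
}
```

With this split, "change the duplicate-email rule" maps to exactly one file, which is precisely the predictability the agent needs.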
I was skeptical about AI-generated tests. My concern was that they would test what the code does rather than what it should do. That they would be shallow.
Then I read the coverage reports.
Claude Code generates tests I would not have thought to write. Edge cases with null inputs. Boundary conditions on pagination. Race conditions in async operations. Timezone mismatches that manifest as off-by-one errors on specific dates. The agent thinks about failure modes more systematically than I do because it does not get bored, does not get impatient, and does not assume anything works.
The key practice: configure test generation alongside feature development, not after. When the agent writes the feature and the tests in the same session, the tests reflect the actual behavior of the code. Tests written after the fact tend to test what the developer thinks the code does.
```typescript
// Example: agent-generated test for a user creation endpoint
describe('POST /api/users', () => {
  describe('successful creation', () => {
    it('creates a user with valid data', async () => {
      const response = await request(app)
        .post('/api/users')
        .send({ email: 'test@example.com', name: 'Test User' })
        .expect(201);

      expect(response.body).toMatchObject({
        id: expect.any(String),
        email: 'test@example.com',
        name: 'Test User',
        createdAt: expect.any(String),
      });
    });

    it('returns the user without sensitive fields', async () => {
      const response = await request(app)
        .post('/api/users')
        .send({ email: 'test@example.com', name: 'Test User' });

      expect(response.body).not.toHaveProperty('passwordHash');
      expect(response.body).not.toHaveProperty('apiKey');
    });
  });

  describe('validation errors', () => {
    it('rejects missing email', async () => {
      await request(app)
        .post('/api/users')
        .send({ name: 'Test User' })
        .expect(400);
    });

    it('rejects invalid email format', async () => {
      await request(app)
        .post('/api/users')
        .send({ email: 'notanemail', name: 'Test User' })
        .expect(400);
    });

    it('rejects email longer than 254 characters', async () => {
      const longEmail = 'a'.repeat(250) + '@example.com';
      await request(app)
        .post('/api/users')
        .send({ email: longEmail, name: 'Test User' })
        .expect(400);
    });

    it('rejects duplicate email with appropriate error message', async () => {
      await createUser({ email: 'existing@example.com' });

      const response = await request(app)
        .post('/api/users')
        .send({ email: 'existing@example.com', name: 'New User' })
        .expect(409);

      expect(response.body.error).toContain('email');
    });
  });

  describe('authentication', () => {
    it('requires authentication', async () => {
      await request(app)
        .post('/api/users')
        .set('Authorization', '')
        .expect(401);
    });

    it('requires admin role', async () => {
      await request(app)
        .post('/api/users')
        .set('Authorization', `Bearer ${regularUserToken}`)
        .expect(403);
    });
  });
});
```

I did not write any of that. The agent did. It covers cases I would have missed under deadline pressure. The fact that these tests exist means regressions get caught before production.
I have strong opinions about TypeScript. Strict mode is not a stylistic preference. It is the primary mechanism through which types give you compile-time guarantees.
```json
// tsconfig.json
{
  "compilerOptions": {
    "strict": true,
    "noUncheckedIndexedAccess": true,
    "noImplicitReturns": true,
    "exactOptionalPropertyTypes": true,
    "noPropertyAccessFromIndexSignature": true
  }
}
```

These settings catch entire categories of agent mistakes at compile time. The agent knows this because your CLAUDE.md says "TypeScript strict mode. Zero any. Every function has explicit return types." So it writes code that satisfies these constraints.
Zero any. Every any is a hole in your type system. One any cascades into many. The agent will try to use any when it cannot figure out the right type. Make it figure out the right type by configuring your linter to disallow any.
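One way to enforce this, assuming a typescript-eslint setup, is to make the relevant rules hard errors in the lint config. This is a sketch of a legacy-style .eslintrc.json fragment, not the project's actual configuration:

```json
{
  "extends": ["plugin:@typescript-eslint/recommended"],
  "rules": {
    "@typescript-eslint/no-explicit-any": "error",
    "@typescript-eslint/explicit-function-return-type": "error"
  }
}
```

With these as errors rather than warnings, the CI lint step blocks any agent output that reaches for any.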
Explicit return types on all functions. Not for style. Because it forces the agent to commit to what a function returns. A function returning Promise<User | null> cannot silently start returning Promise<undefined>.
Zod for all external data. API inputs, environment variables, database responses. Everything that crosses a trust boundary goes through Zod validation. The agent generates Zod schemas automatically when configured correctly.
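The startup validation pattern looks roughly like this. To keep the sketch dependency-free, the checks are hand-rolled rather than expressed as an actual Zod schema; in a real project loadEnv would be a schema plus a safeParse call, and all names here are illustrative.

```typescript
interface Env {
  DATABASE_URL: string;
  PORT: number;
}

// Validate once at startup; everything downstream sees a typed Env object,
// never raw process.env. Each failure names the field and the reason,
// and all failures are reported together.
function loadEnv(source: Record<string, string | undefined>): Env {
  const errors: string[] = [];

  const url = source["DATABASE_URL"];
  if (!url || !/^postgres(ql)?:\/\//.test(url)) {
    errors.push("DATABASE_URL must be a postgres:// URL");
  }

  const port = Number(source["PORT"]);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    errors.push("PORT must be an integer between 1 and 65535");
  }

  if (errors.length > 0) {
    throw new Error(`Invalid environment:\n  ${errors.join("\n  ")}`);
  }
  return { DATABASE_URL: url!, PORT: port };
}
```

Calling loadEnv(process.env) in the application entry point means a misconfigured deployment fails loudly at boot instead of silently at request time.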
I made every mistake possible in my first six months of production Claude Code usage. Here are the ones that cost the most time.
Vague CLAUDE.md. "Follow best practices" is useless. "Use Zod for all API input validation with detailed error messages that specify which field failed and why" is useful. Every instruction in CLAUDE.md should be specific enough that there is only one way to interpret it.
Inconsistent project structure. I had one project where half the API routes used one pattern and half used another. The agent would pick whichever pattern appeared more recently in context. I spent a day standardizing everything and the output quality improved immediately.
Not reviewing early outputs line by line. In the first few sessions on a new project, review everything. Correct patterns early. The agent learns your project's patterns from the code that already exists. Bad early patterns get replicated. Good early patterns get replicated.
Skipping type definitions. The shortcut of leaving types vague or using any to move faster always costs more time than it saves. Types are the guidance system for agents. Weak types produce wandering agents.
Large context sessions without checkpoints. Very long agent sessions accumulate context that can drift from the original intent. I now checkpoint every 45-60 minutes: summarize what has been built, reset, and continue. The output quality stays higher.
The goal is a workflow where merging to main triggers automatic quality gates and production deployment. No manual steps. No "let me just check one thing before deploying."
Claude Code integrates with CI/CD pipelines naturally. The pipeline runs the build, type checker, test suite, and linter. If everything passes, it deploys. If anything fails, it blocks and reports.
```yaml
# .github/workflows/deploy.yml
name: Deploy to Production

on:
  push:
    branches: [main]

jobs:
  quality:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
      - run: npm ci
      - run: npm run type-check
      - run: npm run lint
      - run: npm run test
      - run: npm run build

  deploy:
    needs: quality
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npx vercel --prod --yes --token ${{ secrets.VERCEL_TOKEN }}
```

Deployment confidence comes from test coverage and type safety, not from manual review. If your tests are comprehensive and your types are strict, you can deploy every commit to production safely. Claude Code gets you to that level of confidence faster than traditional development.
The upfront investment in a thorough CLAUDE.md, clean project structure, strict TypeScript, and fast test suite feels expensive. It is not.
It pays back within the first week. Every subsequent week compounds the return. A well-configured project with thorough documentation and clean architecture gets outstanding results indefinitely. A messy project with no CLAUDE.md gets mediocre results that degrade as the codebase grows.
The math is clear. Put in the setup work.
For the next level, see how autonomous coding agents fit into the full picture, and how AI-powered development workflows compound these individual practices into something transformative.
Q: What is a CLAUDE.md file and why is it important?
A CLAUDE.md file is the project operating manual for Claude Code — a comprehensive configuration file that contains architecture decisions, coding conventions, testing requirements, deployment procedures, and known gotchas. It is the single highest-leverage investment in AI-assisted development because it transforms the agent from guessing to knowing. Projects with thorough CLAUDE.md files produce dramatically better agent output than those without.
Q: What are the best practices for using Claude Code in production?
The key best practices are: write a thorough CLAUDE.md file (5+ pages, not 2 paragraphs), maintain files under 500 lines with consistent naming, use TypeScript strict mode with zero any types, configure fast testing infrastructure (under 2 minutes), use Zod for all external data validation, and review agent output line-by-line during the first few sessions on any new project.
Q: How should I structure a project for optimal Claude Code performance?
Use a clean, conventional project structure with clear separation of concerns: separate business logic from UI, database queries from business logic, and API handling from data transformation. Keep files under 500 lines, use consistent naming patterns, and organize code into predictable directories (components/features, lib/repositories, lib/services, types). Agents navigate well-organized codebases efficiently and struggle with messy ones.
Q: Should I use TypeScript strict mode with Claude Code?
Yes, TypeScript strict mode is essential, not optional. Enable strict, noUncheckedIndexedAccess, noImplicitReturns, and exactOptionalPropertyTypes. Types are the primary feedback mechanism for AI agents — they catch entire categories of mistakes at compile time. Agents with TypeScript produce dramatically better output than agents working in dynamically typed languages.
Q: Can Claude Code generate reliable tests?
Yes, Claude Code generates comprehensive tests that often cover edge cases human developers miss — null inputs, boundary conditions, race conditions, and timezone mismatches. The key is to configure test generation alongside feature development (not after) so tests reflect actual behavior. Agent-generated test suites typically achieve 80-95% coverage compared to 40-60% with manual testing.