
LLM Directives: Evolution of Custom Instructions

DRAFT
Published: Jun 5, 2025
Punta Cana, Dominican Republic

If you build something with an LLM that people will depend on, read the code. In fact, you’ll probably do more than that. You’ll spend 5-10 minutes knocking it back into your own style. LLMs are showing signs of adapting to local idiom, but we’re not there yet.

Newer LLMs, especially those that rely on test-time compute (often called reasoning models), are especially good at following the instructions and directives handed down by their overlords.

While custom instructions have become best practice across the industry, they are often hard to enforce.

Let’s give custom instructions to LLMs.

One practice that helps my workflow is asking the LLM to treat my instructions as directives, and to acknowledge each one in its responses by its directive abbreviation.

Example design directive:

# Design Directives

## DESIGN-1: What Before How

Sequentially address two distinct tasks: First, precisely define *what* the program/system must achieve. Second, determine *how* it will achieve it. Avoid mixing these concerns.

## DESIGN-2: Scrutinize Requirements

Critically examine requirements. Do you need all the features requested? Simplifying the "what" is often the most effective way to reduce complexity and coding effort. Focus on what is essential.

## DESIGN-3: Design Above Code

Create the high-level design (the "how") at a level of abstraction significantly above the code. This design should be conceptual, mathematical, and independent of implementation language details like specific data structures or procedures unless they are fundamental to the *concept*.

[...]
(This example is complete, it can be run "as is")

When paired with response directives, the LLM is pushed to follow our instructions more closely:

# Response Directives

## Before Response

Before every response, explicitly state which rule(s) or directive(s) you are applying, written as:

APPLYING RULES: RULE-1, RULE-2, RULE-3, ..., RULE-n

## After Response

After every response, explicitly state which rule(s) or directive(s) you applied, written as:

APPLIED RULES: RULE-1, RULE-2, RULE-3, ..., RULE-n
(This example is complete, it can be run "as is")
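Because the markers have a fixed shape, they are easy to check programmatically: you can reject or retry any response that fails to announce and confirm its rules. A small sketch of such a validator (the regex and function are my own, not part of any standard):

```python
import re

# Matches lines like "APPLYING RULES: DESIGN-1, DESIGN-2"
MARKER = re.compile(
    r"^(APPLYING|APPLIED) RULES: ([A-Z]+-\d+(?:, [A-Z]+-\d+)*)$",
    re.MULTILINE,
)

def check_response(text: str) -> bool:
    """True if the response both announces (APPLYING) and confirms
    (APPLIED) its rules, and the two lists name the same directives."""
    found = {kind: rules for kind, rules in MARKER.findall(text)}
    if "APPLYING" not in found or "APPLIED" not in found:
        return False
    return set(found["APPLYING"].split(", ")) == set(found["APPLIED"].split(", "))
```

A wrapper around your LLM client could call `check_response` and re-prompt when it returns `False`, turning a soft instruction into a hard loop.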

What does that look like in practice?

Let’s imagine a user asks the following:

Hey LLM, I need to build a new e-commerce platform from scratch. It needs to handle user accounts, product listings, a shopping cart, payment processing, order history, and admin features for managing products and orders. Can you outline the architecture and tell me what technologies to use?

Example LLM Response applying your directive structure:

APPLYING RULES: DESIGN-1, DESIGN-2, DESIGN-3

Hello! That's an exciting project. Building an e-commerce platform involves many components. Before we dive into specific technologies or a detailed architecture (the "how"), let's ensure we have a solid grasp of the requirements (the "what"), and simplify where possible.

**(Applying DESIGN-1: What Before How & DESIGN-2: Scrutinize Requirements)**

First, let's break down *what* the system must achieve, and critically examine these needs. You've listed several core areas:
1.  **User Accounts:** Registration, login, profile management.
    *   *Scrutiny (DESIGN-2):* Is social login essential for V1? Is password recovery via email sufficient, or do you need SMS too?
2.  **Product Listings:** Display products, categories, search, filtering.
    *   *Scrutiny (DESIGN-2):* How complex does filtering need to be initially? Are user reviews and ratings a must-have for launch?
3.  **Shopping Cart:** Add/remove items, update quantities.
    *   *Scrutiny (DESIGN-2):* Does it need to persist indefinitely for unauthenticated users, or is a session-based cart okay to start?
4.  **Payment Processing:** Integration with a payment gateway.
    *   *Scrutiny (DESIGN-2):* Which payment methods are absolutely critical for your target audience at launch?
5.  **Order History:** Users can view past orders.
    *   *Scrutiny (DESIGN-2):* What level of detail is required in the order history display for V1?
6.  **Admin Features:** Manage products (CRUD), manage orders (view, update status).
    *   *Scrutiny (DESIGN-2):* What are the absolute minimum admin capabilities needed to operate the store initially?

Simplifying the "what" by focusing on the most essential features for a Minimum Viable Product (MVP) will significantly reduce initial complexity. Could you clarify which of these features, and their sub-components, are truly non-negotiable for your first release?

**(Applying DESIGN-3: Design Above Code)**

Once we have a refined and scrutinized list of "what" the MVP must achieve, we can then move to the high-level "how" – the conceptual design. At this stage, we'd think about:

*   **Core Entities:** Conceptually, what are the main "things" in your system? E.g., `User`, `Product`, `Order`, `CartItem`. What are their essential attributes and relationships? (e.g., an `Order` contains multiple `OrderItems`, which refer to `Products`).
*   **Key Services/Modules (Conceptual):** What are the broad functional areas? E.g., `AuthenticationService`, `ProductCatalogService`, `OrderingService`, `PaymentService`. We're not defining APIs or choosing microservices vs. monolith yet, just identifying logical groupings of responsibilities.
*   **Data Flow (Conceptual):** How does information move between these conceptual services? For example, when a user places an order, what sequence of interactions happens at a high level?

This conceptual design will be independent of specific programming languages or database choices. For instance, "Persistence for `Product` data" is a conceptual need; whether it's SQL, NoSQL, or flat files comes much later.

Let me know your thoughts on the essential "what" for your MVP, and then we can iterate on the conceptual "how."

APPLIED RULES: DESIGN-1, DESIGN-2, DESIGN-3
(This example is complete, it can be run "as is")
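On the consuming side, the marker lines can be stripped back out before showing the answer to an end user. A sketch, assuming the markers always occupy their own lines (this helper is hypothetical, not from any library):

```python
def strip_markers(text: str) -> str:
    """Remove APPLYING/APPLIED RULES lines, keeping only the answer body."""
    lines = [
        line for line in text.splitlines()
        if not line.startswith(("APPLYING RULES:", "APPLIED RULES:"))
    ]
    return "\n".join(lines).strip()
```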

Cloudflare experiment

The Cloudflare team recently published workers-oauth-provider, a library written largely with Claude. In the project's own words:

This library (including the schema documentation) was largely written with the help of Claude, the AI model by Anthropic. Claude’s output was thoroughly reviewed by Cloudflare engineers with careful attention paid to security and compliance with standards. Many improvements were made on the initial output, mostly again by prompting Claude (and reviewing the results). Check out the commit history to see how Claude was prompted and what code it produced.

Max Mitchell published I Read All Of Cloudflare’s Claude-Generated Commits, a review of the project’s commit history.

Here’s what Max said:

The lead engineer, @kentonv, started as an AI skeptic. ‘I was trying to validate my skepticism. I ended up proving myself wrong.’ Two months later, Claude had generated nearly all of the code in what became a production-ready authentication library.

When Claude got something wrong, the fix was often a short corrective prompt in the style of:

You did X, but we should do Y. pls fix.