- circumscribe LLM calls with requirement verifiers. We will see variations on this principle throughout the tutorial.
- Generative programs should use simple and composable prompting styles. Mellea takes a middle ground between the “framework chooses the prompt” and “client code chooses the prompt” paradigms. By keeping prompts small and self-contained, and then chaining many such prompts together, we can usually get by with one of a few prompt styles. When a new prompt style is needed, that prompt should be co-designed with the software that will use it. In Mellea, we encourage this by decomposing generative programs into Components; more on this in Architecture.
- Generative models and inference-time programs should be co-designed. Ideally, the style and domain of prompting used at inference time should match the style and domain of prompting used in pretraining, mid-training, and/or post-training. Similarly, models should be built with runtime components and use-patterns in mind. We will see some early examples of this in Adapters.
- Generative programs should carefully manage context. Each Component manages the context of a single call, as we see in Quickstart, Architecture, Generative Slots, and MObject. Additionally, Mellea provides some useful mechanisms for reusing context across multiple calls (Context Management).
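To make the first principle concrete, here is a minimal sketch of circumscribing an LLM call with requirement verifiers: generate, check every requirement against the output, and retry on failure. The `generate` callable, `Requirement` type, and `checked_generate` helper are illustrative stand-ins, not Mellea's actual API; Mellea's own session and requirement machinery is covered later in the tutorial.

```python
from typing import Callable, List

# A requirement verifier is any predicate over the model's output.
Requirement = Callable[[str], bool]

def checked_generate(generate: Callable[[str], str],
                     prompt: str,
                     requirements: List[Requirement],
                     max_attempts: int = 3) -> str:
    """Circumscribe an LLM call with requirement verifiers:
    regenerate until every requirement passes or attempts run out."""
    for _ in range(max_attempts):
        output = generate(prompt)
        if all(req(output) for req in requirements):
            return output
    raise RuntimeError("no output satisfied all requirements")

# Usage with a deterministic stub standing in for a real model:
stub_model = lambda prompt: "SUMMARY: the meeting covered Q3 planning."
result = checked_generate(
    stub_model,
    "Summarize the meeting notes.",
    requirements=[lambda s: s.startswith("SUMMARY"),
                  lambda s: len(s) < 200],
)
```

Because each verified call is small and self-contained, such calls compose: the checked output of one call can feed the prompt of the next, which is the chaining style the second principle describes.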