Your blog has 50 posts. One has a date field that reads "Aprll 10" instead of a proper date. Another is missing its required tags array entirely. A third references hero-iamge.jpg — a file that does not exist. You discover none of this until a reader reports a broken page in production.
Content Collections prevent all three failures. Every field in every content file is validated against a Zod schema at build time. A typo in a date? Build fails. Missing tag array? Build fails. A hero image pointing to a file that is not on disk? Build fails — before your readers ever see the damage.
In Part 1 we scaffolded the project. In Part 2 we wired up islands for interactivity. Now we build the content layer that makes the whole thing worth deploying.
The Content Layer API
Astro 5 replaced the legacy content system with the Content Layer API. The old approach relied on magic directories and implicit conventions. The new system is explicit: you define collections with loaders, validate with Zod schemas, and query with type-safe functions.
The single source of truth is src/content.config.ts. Not src/content/config.ts (that was Astro 4). Not a contentlayer.config.ts. One file, one schema definition, full TypeScript inference.
Defining Collections
A collection needs two things: a loader that tells Astro where to find files, and a schema that tells Astro what shape the data must have.
Here is the blog collection from Blogcraft:
import { defineCollection, z } from 'astro:content';import { glob } from 'astro/loaders';
const blog = defineCollection({ loader: glob({ pattern: '**/*.mdx', base: './src/content/blog' }), schema: ({ image }) => z.object({ title: z.string().max(100), description: z.string().max(300), pubDate: z.coerce.date(), updatedDate: z.coerce.date().optional(), heroImage: image(), heroAlt: z.string(), author: z.string().default('default'), tags: z.array(z.string()), category: z.enum([ 'engineering', 'architecture', 'data', 'devops', 'ai-ml', 'rust', 'frontend', 'tutorial' ]), draft: z.boolean().default(false), featured: z.boolean().default(false), series: z.object({ slug: z.string(), order: z.number(), }).optional(), toc: z.boolean().default(true), readingTime: z.number().optional(), }),});Every piece of this schema earns its place. z.string().max(100) on the title prevents runaway SEO titles. z.coerce.date() accepts both 2026-04-10 and "2026-04-10T00:00:00Z" — it coerces the string to a Date object. z.enum() on category restricts values to a known set, so a typo like "tutoral" fails the build instead of silently creating an uncategorized post.
The image() helper deserves special attention.
Image Validation with image()
The image() function is not a standard Zod type. It is an Astro-provided schema helper passed through the destructured { image } parameter in the schema function. It does two things no regular z.string() can:
-
Validates the file exists at build time. If your frontmatter says
heroImage: ../../assets/images/heroes/missing.jpgand that file is not on disk, the build fails immediately with a clear error pointing to the exact file and field. -
Returns processed image metadata. The resolved value is not a string path — it is an image object with
src,width,height, andformatproperties. This metadata feeds directly into Astro’s<Image>and<Picture>components for responsive image generation.
When you use the processed image in a layout:
---import { Picture } from 'astro:assets';
const { post, headings, readingTime } = Astro.props;---
<article> <Picture src={post.data.heroImage} alt={post.data.heroAlt} widths={[400, 800, 1200]} formats={['avif', 'webp', 'jpeg']} class="hero-image" /> <!-- ... --></article>Astro’s image service (Sharp) generates optimized variants at build time — AVIF, WebP, and JPEG fallback at multiple widths. The <Picture> component emits a <picture> element with <source> tags and proper srcset attributes. Zero runtime cost, zero layout shift (because width and height are known from the schema).
Multiple Collection Types
Content Collections are not limited to MDX. Any structured data with a loader and schema qualifies. Blogcraft uses three collections:
const authors = defineCollection({ loader: glob({ pattern: '**/*.json', base: './src/content/authors' }), schema: z.object({ name: z.string(), bio: z.string(), avatar: z.string(), social: z.object({ github: z.string().url().optional(), twitter: z.string().url().optional(), linkedin: z.string().url().optional(), }).optional(), }),});
const series = defineCollection({ loader: glob({ pattern: '**/*.yaml', base: './src/content/series' }), schema: ({ image }) => z.object({ title: z.string().max(100), description: z.string().max(300), coverImage: image(), coverAlt: z.string(), status: z.enum(['ongoing', 'complete']), comingSoon: z.array(z.string()).default([]), }),});
export const collections = { blog, authors, series };JSON for authors. YAML for series metadata. MDX for posts. The glob loader handles all of them — you just change the pattern and base path. Each collection gets its own fully-typed schema, and Astro generates TypeScript types for all of them.
Querying Collections
Astro 5 provides two primary query functions: getCollection for lists and getEntry for single items.
import { getCollection, getEntry } from 'astro:content';
// All published posts, sorted newest firstconst posts = (await getCollection('blog', ({ data }) => !data.draft)) .sort((a, b) => b.data.pubDate.valueOf() - a.data.pubDate.valueOf());
// Single author by ID (filename without extension)const author = await getEntry('authors', 'default');
// Filter by tagconst rustPosts = await getCollection('blog', ({ data }) => !data.draft && data.tags.includes('rust'));
// Filter by seriesconst seriesPosts = (await getCollection('blog', ({ data }) => !data.draft && data.series?.slug === 'astro-deep-dive')).sort((a, b) => (a.data.series?.order ?? 0) - (b.data.series?.order ?? 0));The filter callback in getCollection runs at build time. It receives the full typed data object, so TypeScript knows data.draft is a boolean, data.tags is string[], and data.series is { slug: string; order: number } | undefined. No type assertions. No as any. The schema is the type.
Dynamic Routes with getStaticPaths
Every blog post needs a URL. In Astro 5, dynamic routes use getStaticPaths to generate pages at build time:
---import { getCollection, render } from 'astro:content';import BlogPostLayout from '../../layouts/BlogPostLayout.astro';import Callout from '../../components/mdx/Callout.astro';import ComparisonTable from '../../components/mdx/ComparisonTable.astro';import Tabs from '../../components/mdx/Tabs.astro';// ... other MDX components
export async function getStaticPaths() { const posts = await getCollection('blog', ({ data }) => !data.draft); return posts.map((post) => ({ params: { slug: post.id }, props: { post }, }));}
const { post } = Astro.props;const { Content, headings, remarkPluginFrontmatter } = await render(post);const readingTime = remarkPluginFrontmatter?.minutesRead ?? '5 min read';
const mdxComponents = { Callout, ComparisonTable, Tabs, // ... register all MDX components};---<BlogPostLayout post={post} headings={headings} readingTime={readingTime}> <Content components={mdxComponents} /></BlogPostLayout>Two critical Astro 5 changes to note here:
-
render()is a standalone import, not a method on the entry. The oldentry.render()from Astro 4 is gone. You importrenderfromastro:contentand callrender(entry). -
post.idis the content identifier. In Astro 5’s Content Layer,idis derived from the filename (minus the extension). There is no separateslugproperty —idserves as both the identifier and the route parameter.
Type Safety: Schema to TypeScript
When you run astro dev or astro build, Astro generates TypeScript types from your schemas into .astro/types.d.ts. Every getCollection('blog') call returns CollectionEntry<'blog'>[] where CollectionEntry<'blog'> has a fully typed data property matching your Zod schema.
This means:
- Autocomplete works on
post.data.title,post.data.tags,post.data.series?.order - Accessing
post.data.nonexistentis a compile error - The
categoryfield autocompletes to'engineering' | 'architecture' | 'data' | ... post.data.pubDateis typed asDate, notstring
You never write an interface for your frontmatter. The schema is the interface.
Series Support
Linking posts within a series requires the series field in the schema and a query that groups by series slug:
import { getCollection } from 'astro:content';
export async function getSeriesPosts(seriesSlug: string) { return (await getCollection('blog', ({ data }) => !data.draft && data.series?.slug === seriesSlug )).sort((a, b) => (a.data.series?.order ?? 0) - (b.data.series?.order ?? 0) );}
export function getSeriesNavigation( posts: Awaited<ReturnType<typeof getSeriesPosts>>, currentOrder: number,) { const currentIndex = posts.findIndex( (p) => p.data.series?.order === currentOrder, ); return { prev: currentIndex > 0 ? posts[currentIndex - 1] : null, next: currentIndex < posts.length - 1 ? posts[currentIndex + 1] : null, total: posts.length, current: currentIndex + 1, };}The layout calls getSeriesNavigation and renders prev/next links. Because the series order is a number in the schema, sorting is deterministic. No string comparison surprises.
Schema Parity with Your CMS
If you use a CMS like Keystatic alongside Content Collections, the CMS field definitions must match the Zod schema exactly. A field that is required in the schema but optional in the CMS means authors can save content that fails the build.
Blogcraft enforces parity with a prebuild script:
// Parse the Zod schema and Keystatic config programmatically// Compare field names, types, required/optional status// Fail the build if any drift is detected
// Runs during: pnpm build → prebuild hook// Exit code 1 = drift detected, build abortedThis script runs before every build. If someone adds a field to the Keystatic config without updating content.config.ts (or vice versa), the build fails with a clear diff showing exactly what drifted.
The Content Pipeline
MDX files do not go straight from source to HTML. They pass through a pipeline of remark and rehype plugins, each transforming the content:
Each plugin operates on the AST (abstract syntax tree) at a specific stage. Remark plugins work on the Markdown AST (mdast). Rehype plugins work on the HTML AST (hast). The order within each stage matters — rehypeSlug must run before rehypeAutolinkHeadings because autolink needs the IDs that slug generates.
Here is the relevant section from astro.config.ts:
markdown: { remarkPlugins: [remarkGfm, remarkReadingTime], rehypePlugins: [ rehypeSlug, [rehypeAutolinkHeadings, { behavior: 'append' }], [rehypeMermaid, { strategy: 'inline-svg', mermaidConfig: { fontFamily: 'General Sans, sans-serif', theme: 'neutral', }, }], ],},The remarkReadingTime plugin is a custom remark plugin that counts words and injects a minutesRead string into remarkPluginFrontmatter. The render() function exposes this via its return value, so the layout can display “8 min read” without any client-side calculation.
Integration Order: Why It Matters
In astro.config.ts, the integrations array is ordered. This is not cosmetic — it determines the build pipeline:
integrations: [ expressiveCode({ /* ... */ }), // MUST come first mdx(), // MUST come after expressiveCode react(), keystatic(), sitemap(),],Expressive Code must register before MDX. It intercepts code fences and transforms them into styled blocks with syntax highlighting, line numbers, and frame decorations. If MDX processes the code fences first, Expressive Code never sees them and your code blocks render as unstyled <pre> tags.
This is not documented prominently. It will cost you an hour of debugging if you get it wrong.
| Approach | Schema Validation | Type Safety | Image Processing | Multi-Format |
|---|---|---|---|---|
| Astro Content Layer | Zod at build time | Full TypeScript inference | Built-in image() helper | MDX, JSON, YAML, custom loaders |
| Contentlayer (deprecated) | Zod-like, custom DSL | Generated types | Manual | MDX, JSON |
| Manual frontmatter parsing | None (runtime errors) | Manual interfaces | Manual | Whatever you build |
| gray-matter + manual Zod | Possible but DIY | Manual wiring | Manual | Whatever you build |
Putting It All Together
Here is the complete content.config.ts that powers Blogcraft — three collections, three formats, full type safety:
import { defineCollection, z } from 'astro:content';import { glob } from 'astro/loaders';
const blog = defineCollection({ loader: glob({ pattern: '**/*.mdx', base: './src/content/blog' }), schema: ({ image }) => z.object({ title: z.string().max(100), description: z.string().max(300), pubDate: z.coerce.date(), updatedDate: z.coerce.date().optional(), heroImage: image(), heroAlt: z.string(), author: z.string().default('default'), tags: z.array(z.string()), category: z.enum([ 'engineering', 'architecture', 'data', 'devops', 'ai-ml', 'rust', 'frontend', 'tutorial', ]), draft: z.boolean().default(false), featured: z.boolean().default(false), series: z.object({ slug: z.string(), order: z.number() }).optional(), toc: z.boolean().default(true), readingTime: z.number().optional(), }),});
const authors = defineCollection({ loader: glob({ pattern: '**/*.json', base: './src/content/authors' }), schema: z.object({ name: z.string(), bio: z.string(), avatar: z.string(), social: z.object({ github: z.string().url().optional(), twitter: z.string().url().optional(), linkedin: z.string().url().optional(), }).optional(), }),});
const series = defineCollection({ loader: glob({ pattern: '**/*.yaml', base: './src/content/series' }), schema: ({ image }) => z.object({ title: z.string().max(100), description: z.string().max(300), coverImage: image(), coverAlt: z.string(), status: z.enum(['ongoing', 'complete']), comingSoon: z.array(z.string()).default([]), }),});
export const collections = { blog, authors, series };Every MDX file, every JSON author profile, every YAML series definition passes through Zod validation before a single HTML page is generated. A malformed date, a missing image, a misspelled category — the build catches it.
What We Covered
This post established the content foundation:
- Content Layer API with
defineCollectionandglobloader replaces the legacy content system - Zod schemas validate every field at build time, including the
image()helper that catches missing files content.config.tsis the single source of truth for all content typesgetCollectionandrenderprovide type-safe querying and rendering (note:render(entry), notentry.render())- Dynamic routes via
getStaticPathsgenerate one HTML page per content entry - Remark and rehype plugins transform MDX through a deterministic pipeline
- Integration order in
astro.config.tsis load-bearing — Expressive Code before MDX, always
In Part 4, we deploy all of this to Cloudflare Pages with GitHub Actions, Lighthouse CI, and the configuration that makes deploy day boring.