MDX Content and Virtual Modules
Building a custom Vite plugin to extract MDX metadata without full compilation, using virtual modules and the query parameter pattern.
Part 4 of the "Building a Modern Portfolio Without a Meta-Framework" series
I wanted to write in Markdown but reach for React components when Markdown fell short. MDX gives you exactly that—standard Markdown that compiles to React at build time. The catch is getting metadata out of MDX files without importing the entire component tree.
The Problem
I wanted MDX files to contain both content and metadata (title, description, tags, etc). The content gets rendered as React components, but I also needed to extract the metadata for blog lists, SEO tags, and Open Graph images.
The naive approach: use Vite's import.meta.glob to import all MDX files in a directory. But I didn't want to fully parse and compile every MDX file just to read a title and a description. That's slow, wasteful, and bloats the output. The obvious solution—two separate glob imports—doesn't work:
// ❌ Does not work!
// Import only metadata eagerly
const metadataModules = import.meta.glob('@/content/blog/*.mdx', {
  eager: true,
  import: 'metadata',
});
// Import full content lazily
const contentModules = import.meta.glob('@/content/blog/*.mdx');
Vite notices that both globs target the same files and optimizes them into a single module. Both imports end up being full MDX modules—exactly what we were trying to avoid. The fix: a custom Vite plugin that extracts metadata from MDX files, forcing module separation. Is it a hack? Yes. Does it work reliably? Also yes.
Why Not Frontmatter?
Frontmatter would be the conventional choice, but JS exports give you more flexibility: computed values, imported assets, TypeScript type checking. MDX's own documentation presents ESM exports as its built-in alternative to frontmatter for much the same reasons.
Virtual Modules in Vite
Vite's plugin API lets us create "virtual" modules that don't exist on disk:
import { type Plugin, transformWithEsbuild } from 'vite';
import { readFileSync } from 'fs';

const METADATA_QUERY = '?metadata';
// A leading \0 marks the ID as virtual so other plugins leave it alone
const VIRTUAL_PREFIX = '\0';

export function mdxMetadataPlugin(): Plugin {
  return {
    name: 'mdx-metadata',
    enforce: 'pre',
    resolveId(id, importer) {
      // Intercept MDX imports with ?metadata query
    },
    async load(id) {
      // Load and transform the MDX file to extract metadata
    },
  };
}
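To make the skeleton a bit more concrete, here is a minimal sketch of the ID bookkeeping the two hooks rely on. The helper names `toVirtualId` and `fromVirtualId` are mine for illustration and are not part of Vite; the plugin itself inlines the same string operations.

```typescript
const METADATA_QUERY = '?metadata';
const VIRTUAL_PREFIX = '\0';

// Hypothetical helper: wrap a resolved file path into the virtual module ID
// that resolveId hands back to Vite.
function toVirtualId(resolvedPath: string): string {
  return VIRTUAL_PREFIX + resolvedPath + METADATA_QUERY;
}

// Hypothetical helper: recover the on-disk path inside load(), or return null
// when the ID does not belong to this plugin.
function fromVirtualId(id: string): string | null {
  if (!id.startsWith(VIRTUAL_PREFIX) || !id.endsWith(METADATA_QUERY)) {
    return null;
  }
  return id.slice(VIRTUAL_PREFIX.length, -METADATA_QUERY.length);
}
```

The round trip matters because `load` only ever sees the ID that `resolveId` returned, so the prefix and the query are the only signal that a request is for metadata rather than the full MDX module.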
The ?metadata Query Pattern
To differentiate between full MDX imports and metadata-only imports, we use a special query parameter ?metadata. When we want just the metadata, we append this query to the import path:
import metadata from '@/content/blog/my-post.mdx?metadata';
// Or using glob
const metadataModules = import.meta.glob('@/content/blog/*.mdx?metadata', {
  eager: true,
});
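As a rough sketch of what happens to the glob result next, the function below turns it into a sorted blog list. Everything here is an assumption for illustration: the `PostMetadata` shape (including the `date` field), the `toPostList` name, and the exact format of the record keys, which is why the slug extraction tolerates a trailing query.

```typescript
// Assumed metadata shape; the real fields depend on what your posts export.
interface PostMetadata {
  title: string;
  date: string; // ISO 8601 date, e.g. '2024-03-02'
}

// An eagerly globbed module exposes the metadata as its default export.
type MetadataModule = { default: PostMetadata };

interface PostSummary extends PostMetadata {
  slug: string;
}

// Hypothetical helper: flatten the glob record into a newest-first list.
function toPostList(modules: Record<string, MetadataModule>): PostSummary[] {
  return Object.entries(modules)
    .map(([path, mod]) => {
      const fileName = path.split('/').pop() ?? path;
      // Strip the ".mdx" extension and any query string to get the slug.
      const slug = fileName.replace(/\.mdx(\?.*)?$/, '');
      return { slug, ...mod.default };
    })
    .sort((a, b) => b.date.localeCompare(a.date));
}
```

In the app you would pass `metadataModules` straight into `toPostList`; ISO dates sort correctly as plain strings, so no `Date` parsing is needed.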
Extracting Metadata
Inside the load hook of our plugin, we read the MDX file, parse it to extract the exports, and return a module that only exports the metadata:
async load(id) {
  if (id.startsWith(VIRTUAL_PREFIX) && id.endsWith(METADATA_QUERY)) {
    const actualPath = id.slice(
      VIRTUAL_PREFIX.length,
      -METADATA_QUERY.length,
    );
    const content = readFileSync(actualPath, 'utf-8');
    // Extract import statements (to support imports used in metadata)
    const importMatches = content.match(
      /^import\s+.+\s+from\s+['"][^'"]+['"];?\s*$/gm,
    );
    const imports = importMatches ? importMatches.join('\n') : '';
    // Extract the metadata export block
    const match = content.match(
      /export\s+const\s+metadata\s*=\s*(\{[\s\S]*?\n\});/,
    );
    if (!match) {
      throw new Error(`No metadata export found in ${actualPath}`);
    }
    const jsxCode = `
      import React from 'react';
      ${imports}
      const metadata = ${match[1]};
      export default metadata;
    `;
    // Transform JSX to JS using esbuild
    const result = await transformWithEsbuild(jsxCode, 'metadata.tsx', {
      jsx: 'automatic',
      loader: 'tsx',
    });
    return result.code;
  }
}
The plugin reads the raw MDX file as text, extracts only the import statements and metadata export, then uses esbuild to handle JSX transformation. This means metadata can reference imported assets (like OG images) or use arbitrary JS logic—without pulling in the entire MDX compilation pipeline.
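To see what the two regexes actually capture, here is a small standalone sketch that runs them against an in-memory MDX string instead of a file. The `extractMetadataSource` helper and the sample post are hypothetical; the regexes are the ones from the plugin, and the only change is trimming each matched import line.

```typescript
// Hypothetical helper: apply the plugin's two regexes to raw MDX text.
function extractMetadataSource(content: string): {
  imports: string;
  objectLiteral: string;
} {
  // Single-line `import ... from '...'` statements anywhere in the file.
  const importMatches = content.match(
    /^import\s+.+\s+from\s+['"][^'"]+['"];?\s*$/gm,
  );
  const imports = importMatches
    ? importMatches.map((line) => line.trim()).join('\n')
    : '';
  // The object literal assigned to `export const metadata`, up to the first
  // line that consists of `};`.
  const match = content.match(
    /export\s+const\s+metadata\s*=\s*(\{[\s\S]*?\n\});/,
  );
  if (!match || match[1] === undefined) {
    throw new Error('No metadata export found');
  }
  return { imports, objectLiteral: match[1] };
}

// Made-up sample post, for illustration only.
const sampleMdx = [
  "import ogImage from './og.png';",
  '',
  'export const metadata = {',
  "  title: 'Hello',",
  '  image: ogImage,',
  '};',
  '',
  '# Hello',
  '',
  'Body text.',
].join('\n');

const extracted = extractMetadataSource(sampleMdx);
console.log(extracted.imports);
console.log(extracted.objectLiteral);
```

The pattern is deliberately simple, which brings two constraints worth knowing: the metadata object must end with `};` at the start of its own line, and multi-line import statements are not picked up.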
Read Time Calculation
Since the plugin already has access to the raw file content, computing derived metadata like read time is trivial—strip out code blocks and markup, count the remaining words, divide by reading speed:
async load(id) {
  if (id.startsWith(VIRTUAL_PREFIX) && id.endsWith(METADATA_QUERY)) {
    const actualPath = id.slice(
      VIRTUAL_PREFIX.length,
      -METADATA_QUERY.length,
    );
    const content = readFileSync(actualPath, 'utf-8');
    // Extract import statements (to support imports used in metadata)
    const importMatches = content.match(
      /^import\s+.+\s+from\s+['"][^'"]+['"];?\s*$/gm,
    );
    const imports = importMatches ? importMatches.join('\n') : '';
    // Extract the metadata export block
    const match = content.match(
      /export\s+const\s+metadata\s*=\s*(\{[\s\S]*?\n\});/,
    );
    if (!match) {
      throw new Error(`No metadata export found in ${actualPath}`);
    }
    // Perform read time calculation
    const words = content
      // Remove code blocks first so a stray "<" in code cannot swallow a fence
      .replace(/```[\s\S]*?```/g, '')
      .replace(/<[^>]+>/g, '') // Remove HTML/JSX tags
      // Remove import/export statements
      .replace(/^import\s+.+\s+from\s+['"][^'"]+['"];?\s*$/gm, '')
      .replace(
        /^export\s+const\s+([A-Za-z0-9_]+)\s*=\s*\{[\s\S]*?\n\};/m,
        '',
      )
      .split(/\s+/)
      .filter(Boolean).length; // Ignore empty tokens left by surrounding whitespace
    const readTime = Math.ceil(words / 200); // Assuming 200 WPM
    const jsxCode = `
      import React from 'react';
      ${imports}
      const metadata = ${match[1]};
      metadata.readTime = ${readTime};
      export default metadata;
    `;
    // Transform JSX to JS using esbuild
    const result = await transformWithEsbuild(jsxCode, 'metadata.tsx', {
      jsx: 'automatic',
      loader: 'tsx',
    });
    return result.code;
  }
}
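Pulled out of the plugin, the read time step is just a pure function over the file text, which makes it easy to check in isolation. This is a sketch under my own naming (`estimateReadTime`, `wordsPerMinute`); it strips code blocks before tags and drops empty tokens before counting.

```typescript
// Hypothetical standalone version of the read time calculation.
function estimateReadTime(content: string, wordsPerMinute = 200): number {
  const words = content
    // `{3} matches a triple-backtick fence; fenced code is not prose.
    .replace(/`{3}[\s\S]*?`{3}/g, '')
    // Drop HTML/JSX tags such as <Chart />.
    .replace(/<[^>]+>/g, '')
    // Drop single-line import statements.
    .replace(/^import\s+.+\s+from\s+['"][^'"]+['"];?\s*$/gm, '')
    // Drop the first exported object literal (the metadata block).
    .replace(/^export\s+const\s+[A-Za-z0-9_]+\s*=\s*\{[\s\S]*?\n\};/m, '')
    .split(/\s+/)
    // Splitting leaves empty strings at the edges; they are not words.
    .filter(Boolean).length;
  return Math.ceil(words / wordsPerMinute);
}
```

A 450-word post comes out at 3 minutes, since 450 / 200 rounds up. The 200 WPM figure is the same assumption the plugin makes, exposed here as a parameter so it can be tuned.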
HMR Support
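Vite has no way of knowing that the virtual metadata module depends on the MDX file on disk, so editing the file does not invalidate it automatically. A handleHotUpdate hook closes the gap: when an MDX file changes, it adds the matching metadata module to the list of modules to update, which keeps the metadata current during development.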
handleHotUpdate({ file, server, modules }) {
  // When an MDX file changes, also invalidate its metadata module
  if (file.endsWith('.mdx')) {
    const metadataModuleId = VIRTUAL_PREFIX + file + METADATA_QUERY;
    const metadataModule =
      server.moduleGraph.getModuleById(metadataModuleId);
    if (metadataModule) {
      return [...modules, metadataModule];
    }
  }
}
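The decision the hook makes can be sketched as a pure function, which is handy for checking the logic without a dev server. The `ModuleLike` and `ModuleGraphLike` interfaces are minimal stand-ins I made up for the parts of Vite's module graph the hook touches, and `modulesToUpdate` is a hypothetical name.

```typescript
// Minimal structural stand-ins (assumptions, not Vite's real types).
interface ModuleLike {
  id: string | null;
}
interface ModuleGraphLike {
  getModuleById(id: string): ModuleLike | undefined;
}

// Hypothetical helper: given the changed file and the modules Vite already
// plans to update, append the matching metadata module when one is loaded.
function modulesToUpdate(
  file: string,
  modules: ModuleLike[],
  graph: ModuleGraphLike,
): ModuleLike[] {
  if (!file.endsWith('.mdx')) {
    return modules;
  }
  // Same ID scheme as the plugin: virtual prefix + path + ?metadata query.
  const metadataModule = graph.getModuleById('\0' + file + '?metadata');
  return metadataModule ? [...modules, metadataModule] : modules;
}
```

If the metadata module was never imported, the lookup returns `undefined` and the original list goes back unchanged, so the hook only has an effect on pages that actually use the metadata.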
Alternatives I Considered
Frontmatter with gray-matter: The standard approach. Parse YAML at the top of each file. Simple, well-supported, but no TypeScript checking and no way to reference imported assets in metadata.
Contentlayer: A proper content layer with type-safe schemas, validation, and build-time processing. It's the right choice for content-heavy sites. For my use case (a handful of blog posts and projects), a roughly 80-line Vite plugin was simpler.
Separate metadata files: Keep metadata in .json or .ts files alongside each MDX file. Avoids the extraction problem entirely, but means maintaining two files per piece of content. Easy to forget updating one when you change the other.
The Vite plugin approach sits in a sweet spot: metadata lives with content, types are inferred, and the extraction is invisible during development.
The Full Plugin
import { type Plugin, transformWithEsbuild } from 'vite';
import { readFileSync } from 'fs';

const METADATA_QUERY = '?metadata';
const VIRTUAL_PREFIX = '\0';

export function mdxMetadataPlugin(): Plugin {
  return {
    name: 'mdx-metadata',
    enforce: 'pre',
    resolveId(id, importer) {
      if (id.endsWith(METADATA_QUERY)) {
        // Remove the query and resolve the actual file
        const actualId = id.slice(0, -METADATA_QUERY.length);
        return this.resolve(actualId, importer).then((resolved) => {
          if (resolved) {
            // Add virtual prefix so MDX plugin skips this
            return VIRTUAL_PREFIX + resolved.id + METADATA_QUERY;
          }
        });
      }
    },
    async load(id) {
      if (id.startsWith(VIRTUAL_PREFIX) && id.endsWith(METADATA_QUERY)) {
        const actualPath = id.slice(
          VIRTUAL_PREFIX.length,
          -METADATA_QUERY.length,
        );
        const content = readFileSync(actualPath, 'utf-8');
        // Extract import statements (to support imports used in metadata)
        const importMatches = content.match(
          /^import\s+.+\s+from\s+['"][^'"]+['"];?\s*$/gm,
        );
        const imports = importMatches ? importMatches.join('\n') : '';
        // Extract the metadata export block
        const match = content.match(
          /export\s+const\s+metadata\s*=\s*(\{[\s\S]*?\n\});/,
        );
        if (!match) {
          throw new Error(`No metadata export found in ${actualPath}`);
        }
        // Perform read time calculation
        const words = content
          // Remove code blocks first so a stray "<" in code cannot swallow a fence
          .replace(/```[\s\S]*?```/g, '')
          .replace(/<[^>]+>/g, '') // Remove HTML/JSX tags
          // Remove import/export statements
          .replace(/^import\s+.+\s+from\s+['"][^'"]+['"];?\s*$/gm, '')
          .replace(
            /^export\s+const\s+([A-Za-z0-9_]+)\s*=\s*\{[\s\S]*?\n\};/m,
            '',
          )
          .split(/\s+/)
          .filter(Boolean).length; // Ignore empty tokens left by surrounding whitespace
        const readTime = Math.ceil(words / 200); // Assuming 200 WPM
        const jsxCode = `
          import React from 'react';
          ${imports}
          const metadata = ${match[1]};
          metadata.readTime = ${readTime};
          export default metadata;
        `;
        // Transform JSX to JS using esbuild
        const result = await transformWithEsbuild(jsxCode, 'metadata.tsx', {
          jsx: 'automatic',
          loader: 'tsx',
        });
        return result.code;
      }
    },
    handleHotUpdate({ file, server, modules }) {
      // When an MDX file changes, also invalidate its metadata module
      if (file.endsWith('.mdx')) {
        const metadataModuleId = VIRTUAL_PREFIX + file + METADATA_QUERY;
        const metadataModule =
          server.moduleGraph.getModuleById(metadataModuleId);
        if (metadataModule) {
          return [...modules, metadataModule];
        }
      }
    },
  };
}
About 80 lines of plugin code, and MDX files become first-class content with typed metadata, read time estimates, and clean separation between content and metadata modules. Not bad for a "hack."