r-visualization-pipeline
User needs to create publication-quality R visualizations with ggplot2, or needs to validate visualization quality
Changelog
260420: multiple edits
- v_migrate: Changelog migrated from table to YYMMDD H3 format per versioning-standard rule 2 (V1.6 of skills upgrade plan)
- v6: Added license, sources per V6.1/V6.2 of skills upgrade plan.
- v1.5: Added `## Quality Checks` section per V1.5 of ~/vault/plans/2026-04-20-vault-skills-upgrade-plan.md
260403: Added Visual Enrichment section + self-improving-agent-patterns cross-reference
260331: Initial creation
Description
A comprehensive R visualization system deployed as an MCP server, built around an 1800-line specification (content/plan.md) that codifies publication-quality ggplot2 practices into a reproducible, brand-aware pipeline. The system covers the full lifecycle: chart family selection, script generation from 36 battle-tested templates, brand token theming, and automated 108-check quality validation. Deployed on Railway as a remote MCP server. See internal brand system for the full project context.
Interface
Trigger: User needs a publication-quality R visualization, wants to select the right chart type for their data, or needs to validate an existing visualization against quality standards.
Inputs:
- data_description: what data is being visualized (structure, dimensions, purpose)
- chart_family: one of 9 families (see chart family table below)
- brand_template: one of 4 brand templates: Slate, Aurora, Earth, Journal
- quality_target: target rubric band: BAD (1-3), OKAY (4-5), GOOD (6-7), EXCELLENT (8-10)
Outputs:
- r_script: complete, self-contained R script with all dependencies declared
- validation_report: 108-check quality score with per-check pass/fail/warning
- publication_ready_plot: PNG or PDF with brand theming applied
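As a rough illustration of how the inputs above constrain a request, the following base-R sketch validates them before any script is generated. The function `validate_request()` and its defaults are hypothetical and not part of the deployed server; only the field names and allowed values come from the Interface description.

```r
# Hypothetical request validator; field names mirror the Inputs list,
# but this function is an illustration, not the server's actual code.
chart_families  <- c("Comparison", "Composition", "Correlation", "Distribution",
                     "Geospatial", "Network", "Statistical", "Survival",
                     "Time Series")
brand_templates <- c("Slate", "Aurora", "Earth", "Journal")
quality_bands   <- c("BAD", "OKAY", "GOOD", "EXCELLENT")

validate_request <- function(data_description, chart_family,
                             brand_template = "Slate",       # documented default
                             quality_target = "EXCELLENT") { # assumed default
  stopifnot(is.character(data_description),
            length(data_description) == 1,
            nzchar(data_description))
  # match.arg() stops with an error when the value is not an allowed choice
  list(
    data_description = data_description,
    chart_family     = match.arg(chart_family, chart_families),
    brand_template   = match.arg(brand_template, brand_templates),
    quality_target   = match.arg(quality_target, quality_bands)
  )
}

req <- validate_request("monthly revenue by region", "Time Series")
```

Note that `match.arg()` also accepts unambiguous prefixes (for example `"Dist"` resolves to `"Distribution"`), which may or may not be desirable in a stricter interface.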
Visualization Specification (content/plan.md)
16-section spec covering: grammar of graphics philosophy, 40+ vetted R packages, plot selection guide, color system (colorblind-safe), theme/typography, annotation, scales, multi-panel composition, export settings, 10 worked examples, 26-point quality checklist, edge cases, data ingestion, publication tables, advanced charts.
Nine Chart Families (36 battle-tested scripts)
| Family | Code | Scripts | Types |
|---|---|---|---|
| Comparison | cmp_ | 4 | Grouped bar, lollipop, violin, dumbbell |
| Composition | com_ | 4 | Stacked bar, treemap, waffle, alluvial |
| Correlation | cor_ | 4 | Scatter+trend, labeled scatter, matrix, bubble |
| Distribution | dst_ | 7 | Histogram, density, ridgeline, boxplot, raincloud, ECDF |
| Geospatial | geo_ | 4 | Choropleth, bubble map, hex tile, faceted |
| Network | net_ | 3 | Force-directed, tree, circular |
| Statistical | sta_ | 4 | Forest plot, PCA biplot, regression diagnostics, QQ |
| Survival | sur_ | 2 | Kaplan-Meier, KM+risk table |
| Time Series | ts_ | 4 | Multiline, stacked area, line+ribbon, dual facet |
See topics/brand-token-chart-families for the full mapping of families to brand tokens.
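The family codes in the table lend themselves to a simple lookup. The sketch below assumes, purely for illustration, that each script's file name starts with its family code (for example `dst_ridgeline.R`, a made-up name); the helper `family_from_script()` is hypothetical.

```r
# Family codes and script counts transcribed from the table above.
family_codes <- c(cmp = "Comparison", com = "Composition", cor = "Correlation",
                  dst = "Distribution", geo = "Geospatial", net = "Network",
                  sta = "Statistical", sur = "Survival", ts = "Time Series")

scripts_per_family <- c(cmp = 4, com = 4, cor = 4, dst = 7, geo = 4,
                        net = 3, sta = 4, sur = 2, ts = 4)

# Hypothetical helper: infer the family from a file name prefix.
# Assumes names look like "<code>_<chart>.R"; returns NA for unknown codes.
family_from_script <- function(path) {
  prefix <- sub("_.*$", "", basename(path))
  unname(family_codes[prefix])
}

family_from_script("scripts/dst_ridgeline.R")  # "Distribution"
sum(scripts_per_family)                        # 36
```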
Brand Token System (4 templates x 35+ tokens)
Each brand template defines typography, surfaces, palettes (qualitative, sequential, diverging), data-viz tokens, spacing, and semantic colors:
- Slate: Helvetica / Paul Tol Bright. Clean corporate default.
- Aurora: Avenir / Paul Tol Vibrant. High-energy, saturated.
- Earth: Georgia / Paul Tol Muted. Academic, subdued.
- Journal: Palatino / Tableau-10. Publication-ready, classic serif.
All palettes are colorblind-safe by construction (Paul Tol or Tableau lineage).
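To make the token idea concrete, here is a minimal sketch of how a token set might drive a ggplot2 theme and colour scale. Only the Slate font family and the Paul Tol Bright palette come from the text above; the token names (`font_family`, `base_size`, `qualitative`), the base size, and the helpers `theme_brand()` and `scale_colour_brand()` are assumptions, and the real templates define many more tokens.

```r
library(ggplot2)

# Illustrative subset of a Slate token set. Token names and base_size are
# assumptions; the hex values are Paul Tol's "bright" qualitative scheme.
slate_tokens <- list(
  font_family = "Helvetica",
  base_size   = 11,
  qualitative = c("#4477AA", "#EE6677", "#228833", "#CCBB44",
                  "#66CCEE", "#AA3377", "#BBBBBB")
)

# Hypothetical helper: build a theme from tokens instead of hard-coding fonts.
theme_brand <- function(tokens) {
  theme_minimal(base_size = tokens$base_size,
                base_family = tokens$font_family) +
    theme(plot.title.position = "plot",
          panel.grid.minor = element_blank())
}

# Hypothetical helper: qualitative colour scale drawn from the same tokens.
scale_colour_brand <- function(tokens, ...) {
  scale_colour_manual(values = tokens$qualitative, ...)
}

p <- ggplot(mpg, aes(displ, hwy, colour = drv)) +
  geom_point(alpha = 0.6) +
  theme_brand(slate_tokens) +
  scale_colour_brand(slate_tokens)
```

Swapping templates then means passing a different token list, with no changes to the plotting code itself.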
Quality Validation Pipeline (108 checks)
See topics/r-viz-quality-validation-pipeline for the full validation architecture.
- 12 automated mechanical checks (regex on script content)
- 12 family-specific checks x 9 families = 108 total
- Rubric scoring: BAD/OKAY/GOOD/EXCELLENT with specific criteria per family
- Current baseline: 36/36 scripts pass, 0 errors, 37 warnings
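A regex-on-script-content check of the kind described above could look roughly like this. The four patterns and the names `mechanical_checks`, `run_checks()`, and `score_band()` are invented for illustration and are not the pipeline's actual rules; the band boundaries follow the rubric given in the Interface section.

```r
# Hypothetical mechanical checks: each regex must match the script text.
mechanical_checks <- c(
  has_labs     = "labs\\(",
  has_ggsave   = "ggsave\\(",
  declares_dpi = "dpi\\s*=\\s*\\d+",
  loads_ggplot = "library\\(ggplot2\\)"
)

# Run every check against a script given as a character vector of lines.
run_checks <- function(script_text, checks = mechanical_checks) {
  src <- paste(script_text, collapse = "\n")
  vapply(checks, function(rx) grepl(rx, src, perl = TRUE), logical(1))
}

# Map an integer rubric score (1-10) to its band:
# BAD (1-3), OKAY (4-5), GOOD (6-7), EXCELLENT (8-10).
score_band <- function(score) {
  as.character(cut(score, breaks = c(0, 3, 5, 7, 10),
                   labels = c("BAD", "OKAY", "GOOD", "EXCELLENT")))
}

script <- c("library(ggplot2)",
            "p <- ggplot(mtcars, aes(wt, mpg)) + geom_point()",
            "ggsave('out.png', p, dpi = 300)")
result <- run_checks(script)  # has_labs fails: the script never calls labs()
```

Plain pattern matching cannot tell whether a call actually runs (it would match commented-out code), so checks like these are best read as a fast first gate rather than proof of correctness.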
MCP Server (deployed on Railway)
Part of the topics/mcp-server-ecosystem. Exposes:
- Resources: plan sections, skill metadata, prompt templates, example scripts, validation rules
- Tools: `get_plan_section`, `search_plan`, `list_chart_families`, `get_quality_checklist`
- Prompts: 9 family-specific generators (`create_scatter`, `create_distribution`, `create_timeseries`, etc.)
Provenance
Born from the brand token system experiment (Feb 2026, predates experiments dimension): the insight that visualization quality is reproducible when you externalize design decisions into tokens and validate mechanically. The 1800-line plan emerged from iterating on the 36 scripts until all 108 checks passed: the spec is a distillation of what worked, not a theoretical document.
Key milestones:
- 36 scripts across 9 families: each script is self-contained, tested, and brand-aware
- 108-check validation pipeline: automated quality gate that catches the most common visualization failures (missing labels, colorblind-unsafe palettes, poor aspect ratios, overplotting)
- 4 brand templates: production-tested token sets that map to real publication contexts
- MCP deployment: server on Railway exposes the full pipeline programmatically
Usage Notes
- Start with `list_chart_families` to select the right family for your data shape
- Always specify a brand template: the default (Slate) is safe but generic
- The quality target should be EXCELLENT (8-10) for anything going into a report or publication
- Distribution family (`dst_`) has the most scripts (7) because distributions are the most common visualization need and the most commonly botched
- Survival family (`sur_`) has only 2 scripts but they are the most complex (KM+risk table is ~200 lines)
- The validation pipeline catches issues that look fine on screen but fail in print (DPI, font embedding, color contrast)
- Geospatial scripts require additional system dependencies (GDAL/PROJ): check `geo_` prerequisites before running
Quality Checks
- Chart renders without ggplot errors. `Rscript <chart>.R 2>&1 | grep -c 'Error\|Warning'` returns 0.
- Export DPI correct. Print: 300 DPI. Web: 144 DPI. Verify with `exiftool <chart>.png | grep -i dpi`.
- Fonts embed in PDF export. `pdffonts <chart>.pdf` shows no `Type 3` fonts (= rasterized); all fonts embedded.
- Family code matches spec. `DST` (distribution), `COR` (correlation), `TS` (time-series), `CMP` (comparison), `COM` (composition). Per [topics/visual-output-routing](/topics/visual-output-routing).
- No overplotting. For scatter plots with n>100, alpha-blending or jittering present. Verify visually or via density-overlay check.
- Axes labeled, title + caption present. Every chart has `ggtitle`, `labs(x=, y=)`, and a source-note caption.
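Several of the checks above can be satisfied at authoring time. The sketch below uses the built-in `mpg` dataset with illustrative file names and dimensions (not taken from the 36 scripts) to show alpha-blending for n > 100, a full set of labels with a source caption, and exports at the print and web DPI values listed.

```r
library(ggplot2)

# mpg has 234 rows, so alpha-blending is applied to limit overplotting.
p <- ggplot(mpg, aes(displ, hwy)) +
  geom_point(alpha = 0.4) +
  labs(title   = "Highway mileage falls with engine displacement",
       x       = "Engine displacement (litres)",
       y       = "Highway mileage (mpg)",
       caption = "Source: ggplot2::mpg")

# Output location and plot size are assumptions for this example.
out_dir   <- tempdir()
print_png <- file.path(out_dir, "chart_print.png")
web_png   <- file.path(out_dir, "chart_web.png")

ggsave(print_png, p, width = 6, height = 4, dpi = 300)  # print target
ggsave(web_png,   p, width = 6, height = 4, dpi = 144)  # web target
```

Because `width` and `height` are given in inches, the print export comes out at 1800 x 1200 pixels and the web export at 864 x 576; the physical size stays the same and only the pixel density changes.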
Visual Enrichment
| Medium | Type | Description |
|---|---|---|
| R | STA forest plot | 108-check pass rates by family |
| Figma | Flowchart | Pipeline: data -> family -> script -> brand -> validation |
Self-Improvement Cross-Reference
This skill is the R-side implementation of the routing described in topics/visual-output-routing. It is both a tool and a product of Pattern 4 (Compiler Wiki): the quality validator is a lint cycle. For the master reference on all 6 self-improvement patterns, see skills/self-improving-agent-patterns.