Methodology

How a drug page is built, end to end.

1 · Source retrieval

Per-regulator scrapers run on cron schedules and respect each source’s rate limits. Raw payloads are retained for an audit trail.
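
As a rough illustration of the retrieval step, the sketch below spaces out requests to one source and keeps every raw payload alongside a content hash. The class name, its fields, and the minimum-interval approach are assumptions made for this example, not a description of the production scrapers; the transport is injected so the sketch runs without a network.

```python
import hashlib
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class RateLimitedFetcher:
    """Hypothetical helper: spaces out requests and retains raw payloads."""

    fetch: Callable[[str], bytes]        # injected transport, e.g. an HTTP GET
    min_interval_s: float                # assumed per-source minimum delay
    clock: Callable[[], float] = time.monotonic
    sleep: Callable[[float], None] = time.sleep
    audit_log: List[Dict[str, object]] = field(default_factory=list)
    _last_request: float = field(default=float("-inf"), init=False)

    def get(self, url: str) -> bytes:
        # Wait until at least min_interval_s has passed since the last request.
        wait = self.min_interval_s - (self.clock() - self._last_request)
        if wait > 0:
            self.sleep(wait)
        self._last_request = self.clock()
        payload = self.fetch(url)
        # Keep the untouched payload plus a hash so it can be verified later.
        self.audit_log.append(
            {
                "url": url,
                "sha256": hashlib.sha256(payload).hexdigest(),
                "payload": payload,
            }
        )
        return payload
```

In a real deployment the audit log would presumably live in durable storage rather than in memory; the list here only shows what gets recorded.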

2 · Normalization

Regulator-specific fields are mapped to a consistent schema. Active ingredients are normalized via RxNorm and the WHO INN list; brands are cross-referenced to their ingredient per country; classification follows the WHO ATC hierarchy.
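
A minimal sketch of the mapping idea follows. The regulator keys, field names, and the `DrugRecord` shape are hypothetical, and the small dictionaries are in-memory stand-ins for RxNorm, WHO INN, and ATC lookups rather than calls to any real service.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Illustrative stand-ins for RxNorm / WHO INN / ATC lookups (not real API calls).
INN_SYNONYMS = {"acetaminophen": "paracetamol"}
ATC_BY_INGREDIENT = {"paracetamol": "N02BE01"}
# Brands are keyed by country because the same brand can differ across markets.
BRAND_TO_INGREDIENT = {("US", "tylenol"): "paracetamol"}
# Hypothetical per-regulator field names mapped onto one schema.
FIELD_MAPS = {
    "reg_a": {"generic_name": "ingredient", "brand_name": "brand"},
    "reg_b": {"activeSubstance": "ingredient", "medicineName": "brand"},
}


@dataclass(frozen=True)
class DrugRecord:
    country: str
    ingredient: str
    brand: str
    atc_code: Optional[str]


def normalize(regulator: str, country: str, raw: Dict[str, str]) -> DrugRecord:
    field_map = FIELD_MAPS[regulator]
    # Rename regulator-specific keys to schema keys; ignore unmapped fields.
    mapped = {field_map[k]: v for k, v in raw.items() if k in field_map}
    name = mapped.get("ingredient", "").strip().lower()
    brand = mapped.get("brand", "").strip()
    if not name:
        # Fall back to the per-country brand cross-reference.
        name = BRAND_TO_INGREDIENT.get((country, brand.lower()), "")
    ingredient = INN_SYNONYMS.get(name, name)
    return DrugRecord(country, ingredient, brand, ATC_BY_INGREDIENT.get(ingredient))
```

For example, `normalize("reg_a", "US", {"brand_name": "Tylenol"})` resolves the ingredient through the brand table, while an ingredient with no ATC entry yields `atc_code=None` instead of a guess.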

3 · AI-assisted compilation

A language model receives the structured data plus verbatim source text and is prompted to compile, not author, each page from fixed templates. Safety-critical sections are carried over as verbatim quotes.
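
One way to picture the compile-only prompt is the sketch below. The instruction wording, the tag format, and the set of safety-critical section names are assumptions for illustration; the actual templates are not shown here, and no model is called.

```python
import json
from typing import Dict

# Assumed section names; the real list of safety-critical sections may differ.
SAFETY_CRITICAL = {"boxed_warning", "contraindications"}


def build_prompt(structured: Dict[str, object], source_sections: Dict[str, str]) -> str:
    parts = [
        "Compile the page from the material below. Do not add facts.",
        "Sections marked VERBATIM must be quoted exactly as given.",
        "STRUCTURED DATA:",
        json.dumps(structured, sort_keys=True, ensure_ascii=False),
    ]
    for name, text in source_sections.items():
        # Safety-critical text is tagged so it can be checked after compilation.
        tag = "VERBATIM" if name in SAFETY_CRITICAL else "SOURCE"
        parts.append(f"[{tag}:{name}]\n{text}\n[/{tag}:{name}]")
    return "\n\n".join(parts)
```

Tagging the verbatim sections explicitly also gives the later quality checks a fixed list of quotes to look for in the compiled page.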

4 · Quality verification

Before anything is published, each compiled page must clear a series of automated quality checks: grounding against the cited source, verbatim preservation for safety-critical sections, citation discipline (no source names in body text), readability, and length discipline. Content that fails a check is held back rather than published.
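
The gate described above could be sketched as follows. Only three of the checks are shown (verbatim preservation, citation discipline, length); grounding and readability are omitted because they need more than string matching. The function names, the word limit, and the choice to collapse whitespace before comparing quotes are all assumptions of this example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CheckResult:
    name: str
    passed: bool


def _squash(text: str) -> str:
    # Collapse whitespace so line wrapping does not break a verbatim match.
    return " ".join(text.split())


def check_verbatim(page: str, required_quotes: List[str]) -> CheckResult:
    body = _squash(page)
    return CheckResult("verbatim", all(_squash(q) in body for q in required_quotes))


def check_citation_discipline(page: str, source_names: List[str]) -> CheckResult:
    lowered = page.lower()
    return CheckResult("citations", not any(n.lower() in lowered for n in source_names))


def check_length(page: str, max_words: int) -> CheckResult:
    return CheckResult("length", len(page.split()) <= max_words)


def review(
    page: str,
    required_quotes: List[str],
    source_names: List[str],
    max_words: int = 1500,  # assumed limit
) -> Dict[str, object]:
    results = [
        check_verbatim(page, required_quotes),
        check_citation_discipline(page, source_names),
        check_length(page, max_words),
    ]
    failed = [r.name for r in results if not r.passed]
    # Any single failure holds the page back.
    return {"publish": not failed, "failed": failed}
```

A page that drops a required quote or names a source in its body comes back with `publish` set to `False` and the failing check names listed, which matches the hold-back behaviour described above.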

5 · Publishing

Approved content is cached and rendered as static pages. Sitemap lastmod uses the regulator’s last revision date, not our publish date.
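
To make the lastmod rule concrete, here is a small sketch that builds a sitemap with Python's standard library. The page dictionary keys (`loc`, `regulator_revised`, `published`) and the example URL are hypothetical; the namespace and the `urlset` / `url` / `loc` / `lastmod` elements come from the sitemap protocol.

```python
import xml.etree.ElementTree as ET
from datetime import date
from typing import Dict, List

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages: List[Dict[str, object]]) -> str:
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["loc"]
        # lastmod reflects the regulator's revision, never our own publish date.
        ET.SubElement(url, "lastmod").text = page["regulator_revised"].isoformat()
    return ET.tostring(urlset, encoding="unicode")


example = build_sitemap(
    [
        {
            "loc": "https://example.test/drugs/paracetamol",
            "regulator_revised": date(2024, 3, 1),
            "published": date(2024, 6, 15),  # deliberately ignored
        }
    ]
)
```

The `published` value is carried in the input only to show that it plays no part in the output.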

Limitations we disclose

Information may lag - some regulators publish updates infrequently. Each page shows the last regulator revision date.
Pill identification is US-only (DailyMed images).
Comprehensive drug-drug interactions are not in V1.
Translation accuracy depends on the clarity of the source label.
Errors happen - all known errors are logged publicly in /corrections.