Release history

Changelog

Versioned releases of the BUDOVA corpus and platform.

  1. v0.4 · Apr 2026 · Current

    Platform hardening. Password reset flow, inter-annotator agreement aggregation endpoint, coverage/provenance APIs, redesigned auth pages, email delivery via ACS.

    platform, auth, api
  2. v0.3 · Apr 2026

    Azure deployment complete: custom domain budov.org, nightly Postgres backups, Application Insights, Playwright smoke tests in CI/CD, PostHog product analytics.

    infra, observability
  3. v0.2 · Mar 2026

    Annotation platform v1: task creation, NER annotator with span editing, skipped-task unskip flow, admin panel, speech upload + in-browser recording, lexicon editor.

    platform, annotation
  4. v0.1 · Feb 2026

    Project bootstrap. Hero landing, docs structure, LINGUA grant received through Microsoft AI for Good Lab. Initial seed data.

    launch
  5. v1.0 · Q3 2026 · Planned

    First public dataset release on Hugging Face: 100 h of speech, 10M tokens of annotated text, and a lexicon snapshot. Baseline NER model trained and benchmarked.

    dataset, model
Collaboration

Join BUDOVA

We are looking for researchers, construction professionals, and language specialists to participate in the project.

Supported by
Microsoft AI for Good Lab