v0.4.0
Central Extraction Service, Unified Grading & Language Intelligence
New Features
- Central Extraction Service. New modular extraction pipeline with native text, pix2text OCR, and LLM Vision paths. Automatic GPU detection, vector/raster figure heuristics, and a vision_handled flag that prevents redundant LLM calls in smart mode. Runs behind a feature flag for gradual rollout.
- Unified grading orchestrator. Canvas and Offline grading consolidated into a single shared orchestrator, eliminating duplicated logic and ensuring both paths receive identical pipeline improvements.
- Language-aware grading. Automatic language detection from PDF text and grading instructions. Non-English exams receive localised feedback headers and language-specific prompts, with the annotation language forwarded end-to-end from API to grading worker.
- Simple schema bypass. Schemas with zero tasks now skip the full schema pipeline and route directly to basic grading, reducing latency and avoiding unnecessary processing.
- Schema builder sync. Preview-editor scroll synchronisation with drag-based region selection, allowing examiners to highlight PDF areas and map them to schema fields interactively.
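As a rough illustration of the simple schema bypass and the vision_handled flag described above, the routing decisions might look like the sketch below. Only vision_handled and "smart mode" come from these notes; the ExtractionResult and Schema classes, the function names, and the route labels are hypothetical and are not the project's actual API.

```python
from dataclasses import dataclass, field


@dataclass
class ExtractionResult:
    """Hypothetical extraction result; only `vision_handled` is named in the notes."""
    text: str
    vision_handled: bool = False


@dataclass
class Schema:
    """Hypothetical schema model holding a list of grading tasks."""
    tasks: list = field(default_factory=list)


def choose_grading_route(schema: Schema) -> str:
    # Assumption: a schema with zero tasks skips the full schema pipeline
    # and goes straight to basic grading.
    return "basic" if not schema.tasks else "schema"


def needs_vision_call(result: ExtractionResult, mode: str) -> bool:
    # Assumption: in smart mode, a page already processed by the LLM Vision
    # path during extraction does not need a second vision call.
    if mode == "smart" and result.vision_handled:
        return False
    return True
```

Keeping both decisions as small pure functions makes them easy to exercise from either the Canvas or the Offline path once they share one orchestrator.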
Improvements
- All image processing consolidated to WebP format, removing the image format setting and reducing storage footprint.
- Vision-based annotation placement now used for offline grading, matching Canvas workflow quality.
- Schema grading instructions loaded from the linked schema file rather than requiring manual re-entry.
- Structured feedback added to the vision-only grading prompt for richer single-call results.
- Schema pipeline JSON repair improved with LLM-assisted recovery for malformed extraction output.
- UI refinements across analytics, SEO meta tags, and dashboard layout.
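The JSON repair improvement above can be pictured as a parse-then-fallback step. The following is a minimal sketch, assuming the repair step is exposed as a callable that takes the malformed string and returns corrected JSON text; parse_with_repair and the repair parameter are hypothetical names, and in the real pipeline the callable would presumably wrap an LLM request.

```python
import json
from typing import Any, Callable


def parse_with_repair(raw: str, repair: Callable[[str], str]) -> Any:
    """Parse extraction output, falling back to a repair step on failure."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        # Hypothetical fallback: hand the malformed text to the repair
        # callable, then parse its answer. A second failure propagates
        # to the caller instead of being retried.
        return json.loads(repair(raw))
```

Well-formed output never reaches the repair callable, so the extra cost is only paid for malformed extraction results.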
Bug Fixes
- Fixed double bullet prefix in structured feedback formatting.
- Fixed schema builder modal losing schemaJson when reopened after initial configuration.
- Fixed hide-unpublished filter and schema prompt persistence on save.
- Fixed vision_only missing from supported vision strategies, causing strategy validation failures.
- Fixed coordinate regressions and permission checks in the grading pipeline.
- Fixed PyMuPDF error noise with a suppress_mupdf_errors context manager.
- Fixed handwritten exam page-split detection for non-English content.
- Fixed scroll sync warm-up timing after preview render completes.
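For readers curious how a suppress_mupdf_errors context manager might work, here is one possible shape. It is a sketch, not the project's implementation: it assumes the tools argument behaves like PyMuPDF's TOOLS object, whose mupdf_display_errors method sets the flag when given a bool and returns the current setting when called without arguments. Passing the object in as a parameter is a choice made here so the example runs without PyMuPDF installed.

```python
from contextlib import contextmanager


@contextmanager
def suppress_mupdf_errors(tools):
    """Temporarily silence MuPDF error output on stderr.

    `tools` is assumed to expose `mupdf_display_errors(value=None)` in the
    style of PyMuPDF's `TOOLS` object (signature of this wrapper is hypothetical).
    """
    previous = tools.mupdf_display_errors()  # read the current setting
    tools.mupdf_display_errors(False)        # turn error display off
    try:
        yield
    finally:
        # Restore the earlier setting even if the wrapped code raises.
        tools.mupdf_display_errors(previous)
```

With PyMuPDF available, this would be used as `with suppress_mupdf_errors(fitz.TOOLS):` around the calls that open or render noisy PDFs.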