batch_exams_gen
An LLM pipeline that turns ten years of F=ma exam PDFs into a structured problem corpus, then generates new variants through composable, skill-orchestrated mutation.
PhysElo needs a steady supply of olympiad-level physics problems. Manual authoring doesn’t keep up, and pure LLM generation produces low-quality output. batch_exams_gen is the hybrid in between: real anchors, structured mutation, and verification, producing problems good enough to put into competitive contests.
Extraction
F=ma exam problems are extracted from their PDFs as structured JSON with six fields: body, choices, answer, solution, topic, and difficulty. New olympiad papers are released once a year, so ingest is a manual yearly drop; automating the fetch isn’t worth it at that cadence.
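As a rough illustration of what one extracted record could look like, the sketch below loads a JSON object with the six fields named above and runs basic checks. The field types (a list of strings for choices, a letter label for the answer, an integer difficulty) and the names `Problem` and `parse_problem` are assumptions made for this example, not the pipeline's actual schema.

```python
import json
from dataclasses import dataclass

# Field names come from the text above; the types are assumed for illustration.
REQUIRED = ("body", "choices", "answer", "solution", "topic", "difficulty")


@dataclass(frozen=True)
class Problem:
    body: str
    choices: list[str]
    answer: str        # assumed to be a letter label such as "A"
    solution: str
    topic: str
    difficulty: int    # assumed to be an integer scale


def parse_problem(raw: str) -> Problem:
    """Parse one JSON record and reject it if a field is missing or inconsistent."""
    data = json.loads(raw)
    missing = [key for key in REQUIRED if key not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    # Build the valid labels ("A", "B", ...) from the number of choices.
    labels = [chr(ord("A") + i) for i in range(len(data["choices"]))]
    if data["answer"] not in labels:
        raise ValueError(f"answer {data['answer']!r} is not one of {labels}")
    return Problem(**{key: data[key] for key in REQUIRED})


# Hypothetical record, written for this example only.
raw = json.dumps({
    "body": "A ball is dropped from rest at a height of 5 m. How long does it fall?",
    "choices": ["0.5 s", "1.0 s", "1.5 s", "2.0 s", "2.5 s"],
    "answer": "B",
    "solution": "Use h = g t^2 / 2 with g = 10 m/s^2, so t = 1.0 s.",
    "topic": "kinematics",
    "difficulty": 1,
})
problem = parse_problem(raw)
```

Validating at load time keeps malformed extractions out of the anchor set before any mutation step sees them.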
Mutation and generation
Anchor problems are the base. New variants come from parameter perturbation and LLM-guided rephrasing — the mutation step is where problem volume comes from. Without it, the corpus is just the roughly 300 extracted anchors.
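Parameter perturbation can be pictured with a minimal sketch like the one below. It assumes an anchor can be expressed as a text template plus a function that recomputes the answer from the numeric parameters; the names `perturb`, `make_variant`, and `fall_time`, the 20% range, and the example problem are all hypothetical and are not taken from the pipeline. LLM-guided rephrasing is not shown.

```python
import math
import random
from typing import Callable

Params = dict[str, float]


def perturb(params: Params, rel: float, rng: random.Random) -> Params:
    """Scale each parameter by a random factor in [1 - rel, 1 + rel]."""
    return {
        name: round(value * rng.uniform(1 - rel, 1 + rel), 2)
        for name, value in params.items()
    }


def make_variant(
    template: str,
    params: Params,
    solve: Callable[[Params], float],
    rel: float = 0.2,
    seed: int = 0,
) -> dict:
    """Build one variant: new numbers, a re-rendered body, a recomputed answer."""
    rng = random.Random(seed)  # seeded so a variant can be reproduced
    new_params = perturb(params, rel, rng)
    return {
        "body": template.format(**new_params),
        "answer": solve(new_params),
        "params": new_params,
    }


def fall_time(p: Params) -> float:
    """Time to fall from rest through height h, taking g = 9.8 m/s^2."""
    return math.sqrt(2 * p["h"] / 9.8)


variant = make_variant(
    "A ball is dropped from rest at a height of {h} m. How long does it fall?",
    {"h": 5.0},
    fall_time,
    seed=1,
)
```

Recomputing the answer from the perturbed values, rather than asking a model to restate it, keeps each variant's answer key tied to its numbers.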
Composable skills
Each generation step (write, verify figure, draw TikZ) is a separate Claude Code skill, kept small and composable. Skills run through the Claude CLI rather than the API to keep batch generation cheap. The LLM invocation point is isolated so it can be swapped for API calls if recurring production runs are ever needed.
Each skill sees only its own problem. That isolation is deliberate, and it is exactly the blind spot the Physics Corpus Quality Platform exists to fix.
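One way to picture the single, swappable invocation point is the sketch below. It assumes each skill reduces to a prompt template and that the steps run in sequence, each receiving the previous step's output; the `Skill` class, `run_pipeline`, the prompt wording, and the recording backend are hypothetical stand-ins and do not show how Claude Code skills are actually defined or invoked.

```python
from dataclasses import dataclass
from typing import Callable

# The one invocation point: any function from prompt text to completion text.
# A CLI wrapper or an API client could both satisfy this signature (assumption).
Backend = Callable[[str], str]


@dataclass(frozen=True)
class Skill:
    name: str
    prompt_template: str  # must contain the placeholder "{problem}"

    def run(self, problem: str, backend: Backend) -> str:
        # The skill sees only the one problem text it is handed.
        return backend(self.prompt_template.format(problem=problem))


def run_pipeline(skills: list[Skill], problem: str, backend: Backend) -> dict[str, str]:
    """Run skills in order, feeding each step's output to the next."""
    outputs: dict[str, str] = {}
    current = problem
    for skill in skills:
        current = skill.run(current, backend)
        outputs[skill.name] = current
    return outputs


# Stand-in backend that records prompts instead of calling a model.
calls: list[str] = []


def recording_backend(prompt: str) -> str:
    calls.append(prompt)
    return f"<output of call {len(calls)}>"


skills = [
    Skill("write", "Write a full problem statement from this seed:\n{problem}"),
    Skill("verify_figure", "Check that the figure description matches:\n{problem}"),
    Skill("draw_tikz", "Draw the figure as TikZ for:\n{problem}"),
]
results = run_pipeline(skills, "A block slides down a frictionless incline.", recording_backend)
```

Because the skills depend only on the `Backend` signature, switching from a CLI call to an API call would mean replacing one function rather than touching any skill.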
On the source
The F=ma problems are owned by AAPT (the American Association of Physics Teachers). The extracted corpus and its derivatives stay private to avoid IP exposure, as does the platform that consumes them.