AI pipeline

batch_exams_gen

An LLM pipeline that turns ten years of F=ma exam PDFs into a structured problem corpus, then generates new variants through composable, skill-orchestrated mutation.

Ongoing · private

Source private — F=ma exam problems are AAPT IP

Python · Claude Code skills · Claude CLI · OpenAI · TikZ

PhysElo needs a steady supply of olympiad-level physics problems. Manual authoring doesn’t keep up, and pure LLM generation produces low-quality output. batch_exams_gen is the hybrid in between: real anchors, structured mutation, and verification, producing problems good enough to put into competitive contests.

Extraction

F=ma exam problems are extracted from their PDFs as structured JSON records with six fields: body, choices, answer, solution, topic, and difficulty. New olympiad papers are released once a year, so ingestion is a manual yearly drop; automating the fetch isn’t worth it at that cadence.
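
As a rough illustration of what one extracted record might look like, here is a minimal sketch of a typed record with basic validation. The six field names come from the list above; everything else is an assumption made for the example: the field types, the letter-labelled answer, the `Problem` and `parse_problem` names, and the validation rules. The sample in the usage note is a made-up placeholder, not an actual exam problem.

```python
import json
from dataclasses import dataclass

# Assumed answer labels; the real corpus may encode answers differently.
LABELS = ["A", "B", "C", "D", "E"]


@dataclass
class Problem:
    """One extracted problem. Field names follow the text; types are assumed."""
    body: str
    choices: list[str]
    answer: str        # assumed to be a letter label indexing into `choices`
    solution: str
    topic: str
    difficulty: int    # assumed integer scale

    def validate(self) -> None:
        if not self.body.strip():
            raise ValueError("problem body is empty")
        if not 2 <= len(self.choices) <= len(LABELS):
            raise ValueError("unexpected number of choices")
        if self.answer not in LABELS[: len(self.choices)]:
            raise ValueError(f"answer {self.answer!r} does not match any choice")


def parse_problem(raw: str) -> Problem:
    """Parse one JSON record and reject it if a basic check fails."""
    data = json.loads(raw)
    problem = Problem(**data)  # raises TypeError on missing or extra fields
    problem.validate()
    return problem
```

A record such as `{"body": "...", "choices": ["1 m/s", "2 m/s"], "answer": "B", "solution": "...", "topic": "kinematics", "difficulty": 2}` would pass, while one whose answer label has no matching choice would be rejected at ingestion rather than surfacing later in generation.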

Mutation and generation

Anchor problems are the base. New variants come from parameter perturbation and LLM-guided rephrasing — the mutation step is where problem volume comes from. Without it, the corpus is just the roughly 300 extracted anchors.
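
Parameter perturbation can be sketched as follows. This is an illustration under stated assumptions, not the pipeline's actual mutation code: `Anchor`, `Variant`, `perturb`, the `spread` parameter, and the example anchor are all hypothetical, and the example problem is a generic placeholder rather than an F=ma problem. The idea is that an anchor carries a text template, its numeric parameters, and a solver, so the answer is recomputed from the perturbed values instead of being copied.

```python
import random
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Anchor:
    template: str                                # body with {name} placeholders
    params: dict[str, float]                     # original numeric values
    solve: Callable[[dict[str, float]], float]   # recomputes the answer


@dataclass(frozen=True)
class Variant:
    body: str
    params: dict[str, float]
    answer: float


def perturb(anchor: Anchor, rng: random.Random, spread: float = 0.3) -> Variant:
    """Scale each parameter by a random factor in [1 - spread, 1 + spread]."""
    new_params = {
        name: round(value * rng.uniform(1 - spread, 1 + spread), 1)
        for name, value in anchor.params.items()
    }
    return Variant(
        body=anchor.template.format(**new_params),
        params=new_params,
        answer=anchor.solve(new_params),  # answer follows the new numbers
    )


# Placeholder anchor for illustration only (Newton's second law, a = F / m).
EXAMPLE_ANCHOR = Anchor(
    template=(
        "A {F} N force pushes a {m} kg block across a frictionless "
        "surface. What is the block's acceleration?"
    ),
    params={"F": 10.0, "m": 2.0},
    solve=lambda p: p["F"] / p["m"],
)
```

Calling `perturb(EXAMPLE_ANCHOR, random.Random(seed))` with different seeds yields different variants of the same anchor, and a fixed seed makes a batch reproducible. A real multiple-choice variant would also need its distractor choices regenerated, which this sketch leaves out.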

Composable skills

Each generation step (write, verify figure, draw TikZ) is a separate Claude Code skill, kept small and composable. The skills run through the Claude CLI rather than the API to keep batch generation cheap. The LLM invocation point is structured so it can be swapped for API calls if recurring production runs are ever needed.
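
One way to keep the invocation point swappable is to hide it behind a small interface. The sketch below is an assumption about the shape of such a design, not the project's code: `LLMBackend`, `ClaudeCLIBackend`, `STEPS`, and `run_steps` are hypothetical names, the real steps are Claude Code skills rather than the plain prompt templates used here, and feeding each step's output to the next is assumed. It also assumes the `claude` executable is on `PATH` and is invoked in print mode with `-p`.

```python
import subprocess
from typing import Protocol


class LLMBackend(Protocol):
    """Single invocation point; any backend only has to provide complete()."""

    def complete(self, prompt: str) -> str: ...


class ClaudeCLIBackend:
    """Shells out to the Claude CLI (assumed installed and authenticated)."""

    def complete(self, prompt: str) -> str:
        result = subprocess.run(
            ["claude", "-p", prompt],  # -p: non-interactive print mode
            capture_output=True,
            text=True,
            check=True,                # raise if the CLI exits with an error
        )
        return result.stdout.strip()


# Hypothetical stand-ins for the three skills named in the text.
STEPS = [
    ("write", "Write a variant of the following problem:\n{input}"),
    ("verify_figure", "Check that the figure description is consistent:\n{input}"),
    ("draw_tikz", "Draw the figure for this problem in TikZ:\n{input}"),
]


def run_steps(backend: LLMBackend, problem_text: str) -> dict[str, str]:
    """Run each step in order, passing the previous output forward."""
    outputs: dict[str, str] = {}
    current = problem_text
    for name, template in STEPS:
        current = backend.complete(template.format(input=current))
        outputs[name] = current
    return outputs
```

With this shape, `run_steps(ClaudeCLIBackend(), anchor_text)` covers the batch case, and an API-backed class implementing the same `complete` method could be passed in instead without touching the steps. A fake backend that returns canned strings also makes the chain testable without any LLM call.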

Each skill sees only its own problem. That is a deliberate design choice — and it is exactly the blind spot the Physics Corpus Quality Platform exists to fix.

On the source

The F=ma problems are owned by AAPT (the American Association of Physics Teachers). The extracted corpus and its derivatives stay private to avoid IP exposure, as does the platform that consumes them.