Getting Started
Let's go from nothing on your disk to a working, reproducible analysis. You can read this top to bottom without running anything, or follow along — every command is copy-paste ready.
What you'll build: a small two-output analysis that fits a linear model on
a public dataset and sweeps one methodological decision (whether to standardize
features). The result is two universes, baseline and raw, each with its
own r2 metric and fit_plot figure — a clean comparison ready for a paper
figure.
Make sure you've finished the install first.
1. Create a project
lc init is a one-shot setup. It creates a small, opinionated directory
layout and, when Claude Code, Codex, or Pi is on your PATH, installs the
shared lightcone bundle into the corresponding user-scoped agent config.
2. What you got
r2-decision-demo/
├── astra.yaml
├── CLAUDE.md
├── .gitignore
├── .git
├── .venv
├── .claude/
│   └── settings.json   # Claude permissions tier only
├── .lightcone/
│   └── lightcone.yaml
├── Containerfile
├── requirements.txt
├── universes/
│ └── baseline.yaml
├── src/
└── results/
The shared lightcone bundle — skills, hooks, Codex manifest, and Pi extension —
installs user-scoped outside the project. lc init does not copy those assets
into .claude/; the only project-local Claude file is settings.json, which
holds the permissions tier for this repo.
The two files you'll actually look at:
- astra.yaml: the single source of truth for your analysis. Inputs,
  outputs, methodological decisions, recipes. Everything else lightcone-cli does
  is downstream of this file. The boilerplate from lc init has one example
  output and an empty decisions block: enough to run lc run and see something
  materialize, but not yet a real analysis.
- CLAUDE.md: a short project note for the agent. Claude Code reads it
  automatically; the same note is still useful when you're driving the project
  from Codex or Pi.
3. Open an agent CLI
Launch your agent CLI (Claude Code, Codex, or Pi) inside the project
directory. lc init already installed whichever agent integrations it could,
so the lightcone entry points should be available on first launch.
4. The slash commands
Inside Claude Code, Codex, or Pi, the entry commands (/lc-new and the
/lc-from-* family) are organized by what you're starting from. We'll use
/lc-new in this guide; the others work the same way.
| Command | Use it when… |
|---|---|
| /lc-new | You're starting from a research question and an empty astra.yaml. |
| /lc-from-code | You have an existing codebase you want wrapped in ASTRA. |
| /lc-from-paper | You have a published paper (DOI / arXiv ID) you want to reproduce. |
| /lc-feedback | Something broke and you want to file a GitHub issue without leaving the session. |
These are structured entry points for common starting situations. Once inside a
project you can also just describe what you're trying to do to the agent —
astra.yaml, lc run, and lc verify keep things tracked regardless of how
you got there.
5. Scope the analysis with /lc-new
Type /lc-new at the agent prompt.
The agent banner switches to RESEARCH QUESTION and asks something like "What are you trying to learn?" Reply in plain prose:
I want to know how much R² changes on the diabetes dataset depending
on whether I standardize features before fitting a linear regression.
A few follow-ups will sharpen this. After Phase 1 your astra.yaml already
has a name, description, and version; open it in another window if
you're curious. It's under 30 lines.
In Phase 2 (ANALYSIS STRUCTURE) the agent asks about inputs, outputs, and whether this should be one analysis or split into stages. For our case, one analysis is right:
- Input: diabetes (sklearn's bundled toy dataset).
- Output 1: r2, type metric.
- Output 2: fit_plot, type figure.
In Phase 3 (DEEP DIVE), say "skip the literature pass" to keep this a quick demo. The agent will still walk you through identifying the decision: does it preprocess? what options? what's the default?
You'll end up with something like this in astra.yaml:
version: "1.0"
name: "R² with and without feature standardization"
description: "Linear regression on the diabetes dataset, sweeping the standardization choice."
inputs: []
decisions:
  standardize:
    label: "Feature standardization"
    rationale: "Standardizing changes coefficient scales and can shift R² for ridge-like models."
    default: standardized
    options:
      standardized: { label: "StandardScaler before fit" }
      raw: { label: "No preprocessing" }
outputs:
  - id: r2
    type: metric
    description: "Coefficient of determination on the test split."
    recipe:
      command: python scripts/fit.py --standardize {standardize} --output {output[0]}
  - id: fit_plot
    type: figure
    description: "Predicted vs true scatter."
    recipe:
      command: python scripts/plot.py --r2_dir {input.r2} --output {output[0]}
      inputs: [r2]
container: Containerfile
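The recipe commands above contain placeholders such as {standardize}, {output[0]}, and {input.r2}. As a rough illustration of how a template like this could resolve to a concrete shell command, the sketch below uses Python's built-in str.format. It is not lightcone-cli's actual rendering code: the helper name render_command and the results/ paths are hypothetical, chosen only for this example.

```python
from types import SimpleNamespace


def render_command(template, decisions, inputs, outputs):
    """Fill a recipe template with decision values, named inputs, and output paths.

    Hypothetical helper for illustration; the real tool's logic may differ.
    """
    return template.format(
        input=SimpleNamespace(**inputs),  # enables {input.r2}
        output=list(outputs),             # enables {output[0]}
        **decisions,                      # enables {standardize}
    )


fit_cmd = render_command(
    "python scripts/fit.py --standardize {standardize} --output {output[0]}",
    decisions={"standardize": "raw"},
    inputs={},
    outputs=["results/raw/r2"],  # assumed path layout
)
plot_cmd = render_command(
    "python scripts/plot.py --r2_dir {input.r2} --output {output[0]}",
    decisions={"standardize": "raw"},
    inputs={"r2": "results/raw/r2"},  # assumed path layout
    outputs=["results/raw/fit_plot"],
)
print(fit_cmd)
print(plot_cmd)
```

Under these assumptions, switching the standardize value from raw to standardized changes only the rendered command line, which is what lets one spec drive several universes.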
Phase 4 (FINALIZE) runs astra validate astra.yaml, writes
universes/baseline.yaml, and fills in the narrative: block. You're handed
back a short summary table — two outputs, one decision, zero prior insights.
The agent may suggest /clear to free up context. Take its advice.
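To picture how a decisions block can map to universes, the sketch below assumes a simple scheme: the baseline universe takes every default, and each alternative universe overrides exactly one decision and is named after the option it selects. Both the scheme and the function name enumerate_universes are assumptions for illustration, not documented lightcone-cli behavior.

```python
# Mirrors the decisions block from the spec, reduced to what this sketch needs.
decisions = {
    "standardize": {
        "default": "standardized",
        "options": ["standardized", "raw"],
    },
}


def enumerate_universes(decisions):
    """Return {universe_name: {decision: option}} under the assumed scheme."""
    baseline = {name: spec["default"] for name, spec in decisions.items()}
    universes = {"baseline": baseline}
    for name, spec in decisions.items():
        for option in spec["options"]:
            if option != spec["default"]:
                # Override a single decision and keep every other default.
                universes[option] = {**baseline, name: option}
    return universes


universes = enumerate_universes(decisions)
print(universes)
```

For the single standardize decision in this guide, that yields the two universes named in the introduction: baseline and raw.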
6. Implement the spec
/clear
Implement this analysis from astra.yaml. Write the scripts, run the baseline universe, and verify the result.
The agent reads the spec, the universe file, and the empty scripts/ dir,
then makes an implementation checklist:
1. Add Python deps (scikit-learn, matplotlib) to requirements.txt
2. Write Containerfile if missing
3. scripts/fit.py — accepts --standardize {standardized,raw}, writes r2.json
4. scripts/plot.py — reads r2_dir, writes fit_plot.png
5. lc run --universe baseline
6. lc status
7. astra validate astra.yaml
8. lc verify
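For orientation, here is a minimal sketch of what a scripts/fit.py satisfying item 3 might look like. The agent's actual script will differ. The assumptions here are that --output names a directory that receives r2.json, that the split is 80/20 with a fixed seed, and that the JSON keys are r2 and standardize; none of these details come from the spec.

```python
import argparse
import json
from pathlib import Path


def build_parser():
    parser = argparse.ArgumentParser(description="Fit a linear model and report R².")
    parser.add_argument("--standardize", choices=["standardized", "raw"], required=True)
    parser.add_argument("--output", type=Path, required=True,
                        help="Directory that will receive r2.json (assumed layout).")
    return parser


def fit_and_score(standardize):
    """Return the test-split R² for linear regression on the diabetes dataset."""
    # Imported lazily so argument parsing works without scikit-learn installed.
    from sklearn.datasets import load_diabetes
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    X, y = load_diabetes(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=0  # assumed split and seed
    )
    steps = [StandardScaler()] if standardize == "standardized" else []
    model = make_pipeline(*steps, LinearRegression())
    model.fit(X_train, y_train)
    return float(model.score(X_test, y_test))


def write_metric(output_dir, r2, standardize):
    """Write r2.json into output_dir and return its path."""
    output_dir.mkdir(parents=True, exist_ok=True)
    path = output_dir / "r2.json"
    path.write_text(json.dumps({"r2": r2, "standardize": standardize}, indent=2))
    return path


def main(argv=None):
    args = build_parser().parse_args(argv)
    r2 = fit_and_score(args.standardize)
    write_metric(args.output, r2, args.standardize)
```

A real script would finish by calling main() under the usual `if __name__ == "__main__":` guard. Because the decision arrives as a command-line flag restricted by choices, an option name that is not in the spec fails at argument parsing rather than silently fitting the wrong model.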
It works through the checklist one item at a time. You'll see commands like:
Expected lc status output:
lc verify and astra validate should exit cleanly — no tampering, no broken
chains. If anything fails, ask the agent to fix the concrete error and rerun.
The agent commits after each successful output, so your git log is a clean
record of the build.
7. Verify integrity
lc verify recomputes data hashes for every output and walks the input chain
back to check whether anything has been tampered with since materialization. Useful
pre-publication, when archiving a project, or any time you want a stronger
guarantee than lc status.
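The idea of recomputing hashes and walking the input chain can be sketched as follows. This is a toy model, not lightcone-cli's implementation: the manifest layout, the use of SHA-256, and the in-memory contents dictionary standing in for files on disk are all assumptions.

```python
import hashlib


def sha256_bytes(data):
    return hashlib.sha256(data).hexdigest()


def verify(output_id, manifests, contents, seen=None):
    """Return a list of problems for output_id and everything upstream of it.

    manifests: {output_id: {"output_hash": str, "inputs": {input_id: hash}}}
    contents:  {output_id: bytes currently stored for that output}
    Both structures are hypothetical stand-ins for real sidecar files.
    """
    seen = set() if seen is None else seen
    if output_id in seen:
        return []
    seen.add(output_id)

    manifest = manifests[output_id]
    problems = []
    # 1. Does the data still match the hash recorded at materialization?
    if sha256_bytes(contents[output_id]) != manifest["output_hash"]:
        problems.append(f"{output_id}: data changed since materialization")
    # 2. Does each recorded input hash match the upstream output's manifest?
    for input_id, recorded in manifest["inputs"].items():
        if recorded != manifests[input_id]["output_hash"]:
            problems.append(f"{output_id}: input {input_id} is a broken link")
        problems.extend(verify(input_id, manifests, contents, seen))
    return problems


r2_bytes = b'{"r2": 0.5}'          # placeholder data, not a real result
plot_bytes = b"fake image bytes"   # placeholder data
manifests = {
    "r2": {"output_hash": sha256_bytes(r2_bytes), "inputs": {}},
    "fit_plot": {"output_hash": sha256_bytes(plot_bytes),
                 "inputs": {"r2": sha256_bytes(r2_bytes)}},
}
contents = {"r2": r2_bytes, "fit_plot": plot_bytes}

print(verify("fit_plot", manifests, contents))  # untouched data
contents["r2"] = b'{"r2": 0.9}'                 # simulate tampering upstream
print(verify("fit_plot", manifests, contents))
```

In this toy model, verifying fit_plot also flags the edited r2 data, because the walk follows the recorded inputs upstream.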
What just happened
- astra.yaml was the only file you "wrote", and the agent did most of the typing.
- The agent wrote scripts/fit.py and scripts/plot.py with argparse-driven decision injection.
- lc run generated .lightcone/Snakefile from your spec, dispatched each rule through Snakemake, and wrote a per-output sidecar manifest recording the recipe, container image, decisions, input hashes, and output hash.
- lc status and lc verify rely on those manifests; they don't re-execute anything, they just check.
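As a rough picture of what a per-output sidecar manifest could hold, the sketch below models the five recorded items as a dataclass and serializes it to JSON. The field names, the choice of SHA-256, and the placeholder image name are assumptions; the real manifest schema is not shown in this guide.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Manifest:
    """Hypothetical sidecar record; field names are illustrative only."""
    output_id: str
    recipe: str
    container_image: str
    decisions: dict
    input_hashes: dict = field(default_factory=dict)
    output_hash: str = ""


def build_manifest(output_id, recipe, container_image, decisions, input_hashes, data):
    """Create a manifest whose output_hash is the SHA-256 of the output bytes."""
    return Manifest(
        output_id=output_id,
        recipe=recipe,
        container_image=container_image,
        decisions=dict(decisions),
        input_hashes=dict(input_hashes),
        output_hash=hashlib.sha256(data).hexdigest(),
    )


r2_data = b'{"r2": 0.5}'  # placeholder bytes, not a real result
r2_manifest = build_manifest(
    output_id="r2",
    recipe="python scripts/fit.py --standardize raw --output results/raw/r2",
    container_image="example-image:placeholder",  # hypothetical image name
    decisions={"standardize": "raw"},
    input_hashes={},
    data=r2_data,
)
print(json.dumps(asdict(r2_manifest), indent=2, sort_keys=True))
```

Keeping the recipe, decisions, and hashes together in one record is what makes a later check possible without re-running the recipe.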
If your laptop dies tomorrow and you git clone the repo on a fresh machine
and run lc run, you'll get bit-identical results.
Where to next
- The Agentic Workflow — what each entry command does in detail.
- Running on a Cluster — take the same project to SLURM.
- Troubleshooting — when something goes sideways.
- Glossary — terms like universe, decision, and manifest in plain language.