ActiveX
Xinyang Han (UCB) · Jiaming Zheng (MIT) · Lu Li (USC) · Yiyou Sun (UCB)

Materials Science

Part 1

Overview

Materials science work follows a continuously iterated closed loop that operates across a value chain from discovery to qualification. Understanding this landscape is essential for identifying where AI agents can contribute.

The Core Loop

The most stable, general abstraction of materials work is a continuously iterated closed loop.

  • Design: define the target; plan the computations/experiments that will validate it
  • Make: synthesize/grow/fabricate the material
  • Measure: characterize the sample; transform raw data into metrics
  • Decide: compare evidence to the goal; choose the next action
  • Repeat
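The closed loop above can be sketched as a plain iteration; every function name, the stubbed steps, and the stopping rule are illustrative assumptions, not part of any real pipeline:

```python
# Sketch of the Design -> Make -> Measure -> Decide loop (names are illustrative).
def run_loop(target, propose, make, measure, max_iters=5, tol=0.05):
    """Iterate until the measured property is within `tol` of `target`."""
    history = []
    candidate = propose(history)            # Design: pick the next candidate
    for _ in range(max_iters):
        sample = make(candidate)            # Make: realize it (stubbed here)
        value = measure(sample)             # Measure: extract the metric
        history.append((candidate, value))
        if abs(value - target) <= tol:      # Decide: stop or iterate
            break
        candidate = propose(history)
    return candidate, value, history
```

With toy stubs (make as identity, measure returning 0.3 × candidate, propose incrementing), the loop walks candidates 0, 1, 2, 3 and stops at 3 for a target of 0.9.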

Core Value Chain

A stage-based view that makes clear where "computer-completable" work emerges—not just in simulation, but also in measurement analysis and qualification documentation.

Stage-based View

  1. Discovery / Design (Digital): mostly on computers (literature, databases, simulation setup)
  2. Synthesis / Growth (Physical): mostly physical (furnaces/reactors) with digital logging
  3. Processing / Integration (Mixed): physical fabrication + digital design/simulation
  4. Characterization / Testing (Mixed): instrument → computer; acquisition is physical, interpretation is digital
  5. Scale-up / Qualification (Mixed): physical production + QA systems + documentation

Key insight: Even when the pipeline includes physical steps, the highest-leverage, most repeatable work often lives in the digital analysis + deliverable packaging layers.

The Three-Board Model

How teams actually work in parallel and hand off artifacts. Progress is driven by moving evidence between Compute, Make, and Measure, plus extensions for product settings.

Collaboration View

Compute: simulate/model; generate predictions and derived quantities
  Digital artifacts: structure files, input decks, job scripts, postprocessing notebooks, plots/tables
  Often computer-completable and scorable (QC, extraction, reporting)

Make: synthesize/process; realize candidates physically
  Digital artifacts: protocol drafts, run logs, structured ELN records
  Hard to benchmark directly; adjacent digital support tasks are feasible

Measure: instrument → data → interpretation; produce evidence
  Digital artifacts: raw instrument files, QC flags, extracted measurements, annotated plots
  Analysis/reporting layers are repeatable; strong candidates

Handoffs:
  • Compute → Make: candidates + how to attempt them
  • Make → Measure: sample + provenance
  • Measure → Compute: evidence that changes the next loop

Product Extensions (appear in product settings)

Integrate: CAD/CAE/CAM; compare artifacts to design intent
  Digital artifacts: CAD models, reconstructions, segmentation masks
  Geometry- and metric-based ground truth; often benchmarkable

Qualify: standards, reliability, QA; ship-ready evidence
  Digital artifacts: SOPs, QC dashboards, qualification packs
  Documentation-heavy and verifiable; data often private

Digital maturity: high (Compute), medium (Measure), low (Make)
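When scripting provenance checks over this cycle, the three handoffs reduce to a small lookup; the board names mirror the model above, while the function and payload strings are only an illustrative sketch:

```python
# Minimal encoding of the compute -> make -> measure handoff cycle.
HANDOFFS = {
    ("compute", "make"): "candidates + how to attempt them",
    ("make", "measure"): "sample + provenance",
    ("measure", "compute"): "evidence that changes the next loop",
}

def next_board(board: str) -> str:
    """Follow the single outgoing arrow from `board` in the cycle."""
    for (src, dst) in HANDOFFS:
        if src == board:
            return dst
    raise ValueError(f"unknown board: {board!r}")
```

Because each board has exactly one outgoing arrow, three applications of `next_board` return to the starting board, matching the closed loop in Part 1.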
Part 2

Where LLM Agents Fit

Analysis of roles and representative workflows in materials science, with a focus on where AI agents can reliably contribute to execution and evidence packaging.

Part 3

Example Tasks

Benchmarkable tasks that evaluate end-to-end execution on a computer: given raw files and constraints, the agent must use real tools to produce deliverables that a reviewer can verify.

Core Tasks (3)

Compute

DFT Run-Directory QC + Report Packaging

A computational researcher needs to quickly determine whether a run is trustworthy and produce submission-ready plots and a structured summary.
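One slice of such a QC check can be sketched in a few lines, assuming the per-step total energies have already been extracted from the run directory (e.g. by a parser such as pymatgen's Vasprun); the threshold mirrors a typical EDIFF-style setting but its exact value is an assumption:

```python
def qc_energy_convergence(energies, ediff=1e-4):
    """Flag whether the last step changed the total energy (eV) by < ediff.

    `energies` is the already-extracted per-step total-energy series; this
    checks only one convergence criterion and is not a full trust report.
    """
    if len(energies) < 2:
        return {"converged": False, "reason": "fewer than 2 steps"}
    delta = abs(energies[-1] - energies[-2])
    return {"converged": delta < ediff,
            "final_energy": energies[-1],
            "last_delta": delta}
```

A real report would combine several such flags (forces, k-point settings, warnings in the output) into the structured summary the task asks for.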

Measure

XRD Phase Identification

A characterization scientist needs to quickly produce the most likely phases with evidence, packaged as a reviewable deliverable.
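A toy version of the matching step behind such a deliverable is shown below; the tolerance, the peak lists, and the fraction-of-reference-peaks score are illustrative assumptions, not a validated phase-identification method:

```python
def match_phases(observed_peaks, reference_patterns, tol=0.2):
    """Rank candidate phases by the fraction of their reference peaks
    (2-theta, degrees) that fall within `tol` of an observed peak."""
    scores = {}
    for phase, ref_peaks in reference_patterns.items():
        hits = sum(
            1 for r in ref_peaks
            if any(abs(r - obs) <= tol for obs in observed_peaks)
        )
        scores[phase] = hits / len(ref_peaks)
    # Highest score first: the most plausible phase leads the report.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

In a reviewable deliverable, each ranked phase would carry its matched-peak evidence, not just a score.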

Integrate

CT Segmentation + Compare-to-CAD

An engineering team needs to verify whether the processed geometry matches design intent (porosity, defects, dimensional deviation).
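The porosity part of this comparison can be sketched with a simple global threshold on a grayscale CT volume; the threshold value is an assumption, and real pipelines typically use Otsu or model-based segmentation instead:

```python
import numpy as np

def porosity(volume, material_threshold):
    """Void fraction: share of voxels below `material_threshold`
    in a grayscale CT volume (assumes a single global threshold)."""
    solid = volume >= material_threshold
    return 1.0 - solid.mean()
```

Dimensional deviation would then come from registering the segmented surface against the CAD model, a separate step not shown here.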

Recommended Tool Stack

Common Python tooling for data handling and visualization, domain-aligned parsers for computational outputs, and scriptable imaging/volume toolkits.
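One possible environment for such a stack might be pinned as below; the specific library choices are illustrative assumptions, not a mandated list:

```
# requirements.txt (illustrative)
numpy            # arrays and numerics
pandas           # tabular data handling
matplotlib       # plots for deliverables
pymatgen         # parsers for computational outputs (e.g., VASP)
ase              # atomic structure handling
scikit-image     # scriptable imaging/segmentation
tifffile         # CT volume I/O
```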

Contribute to Materials Science

We seek high-level, representative contributions—not exhaustive documentation. Share your expertise in any of these areas:

Our Commitments to Contributors

  • Evaluation Only: All contributions are used exclusively for agent evaluation, never for model training.
  • Partner Review: Industry partners can review and approve task specifications before public release.
  • Data Control: Contributors can exclude sensitive or proprietary data from submissions.