Project Aperio: AI Schema Discovery with LangExtract

The Innovation

🔍

Zero Configuration

No manual YAML schemas required. LangExtract analyzes your text and suggests extraction patterns automatically.

🧠

Domain Intelligence

Adapts extraction patterns to the type of text: medical notes get different schemas than scientific papers.

⚡

Instant Results

From sample text to structured data in seconds. No domain expertise or manual configuration needed.

Traditional Approach

# Manual schema.yaml - hours of work
extraction_schema:
  patient_info:
    age: "Extract age near 'year-old'"
    gender: "Extract male or female"
  medications:
    drug: "Extract medication names"
    dosage: "Extract dosage amounts"
# ... 50+ more lines of configuration

Aperio + LangExtract

# AI discovers schema automatically
import langextract as lx

result = lx.extract(
    text_or_documents=medical_text,
    prompt_description="Extract patient info for analytics",
    examples=[simple_example],
)
# ✅ Complete schema discovered automatically
# ✅ No manual configuration required
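
As a rough illustration of what "discovering a schema" could mean in practice, the sketch below collects the categories and attribute names that appear in a list of extractions. The `Extraction` dataclass is a stand-in written for this example (it is not the LangExtract class), `discover_schema` is a hypothetical helper, and the sample entities are invented rather than taken from the demo run.

```python
from dataclasses import dataclass, field


@dataclass
class Extraction:
    """Stand-in for one extracted entity (hypothetical, for illustration only)."""
    extraction_class: str
    extraction_text: str
    attributes: dict = field(default_factory=dict)


def discover_schema(extractions):
    """Map each category to the sorted attribute names observed for it."""
    schema = {}
    for item in extractions:
        keys = schema.setdefault(item.extraction_class, set())
        keys.update(item.attributes)  # adds the attribute names (dict keys)
    return {category: sorted(keys) for category, keys in schema.items()}


# Invented sample data, not output from the Aperio demo.
sample = [
    Extraction("MEDICATION", "metformin", {"dosage": "500 mg"}),
    Extraction("MEDICATION", "lisinopril", {"dosage": "10 mg", "frequency": "daily"}),
    Extraction("CONDITION", "type 2 diabetes"),
]

schema = discover_schema(sample)
print(schema)
```

The schema here is simply whatever the extractions contain, which is why no hand-written configuration file is involved.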

Live Demo Results

Total Entities Extracted: 15
Domains Analyzed: 2
Schema Categories Discovered: 10
Manual Configuration Files: 0

Medical Domain Schema

Automatically discovered categories:

PATIENT_DEMO - Demographics and patient info
CONDITION - Medical conditions and diagnoses
MEDICATION - Drugs, dosages, frequencies
TREATMENT - Procedures and recommendations
DIAGNOSIS - Clinical assessments

Scientific Domain Schema

Completely different patterns discovered:

METHOD - Research techniques and approaches
PERFORMANCE - Accuracy metrics and results
DATASET - Research data sources
INFRASTRUCTURE - Hardware and training details
IMPROVEMENT - Performance gains and innovations
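
The two category lists above can be compared directly. This minimal sketch uses only the category names listed in this section; `compare_schemas` is a hypothetical helper name, not part of Aperio or LangExtract.

```python
# Category names copied from the two schema lists in this section.
MEDICAL = {"PATIENT_DEMO", "CONDITION", "MEDICATION", "TREATMENT", "DIAGNOSIS"}
SCIENTIFIC = {"METHOD", "PERFORMANCE", "DATASET", "INFRASTRUCTURE", "IMPROVEMENT"}


def compare_schemas(first, second):
    """Report the categories two schemas share and the size of their union."""
    return {
        "shared": sorted(first & second),
        "total_categories": len(first | second),
    }


summary = compare_schemas(MEDICAL, SCIENTIFIC)
print(summary)  # {'shared': [], 'total_categories': 10}
```

No category is shared between the domains, and the union accounts for the 10 schema categories reported in the demo results.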

Knowledge Graph Visualization

Interactive network graphs showing relationships between extracted entities across both domains.

[Figure: knowledge graphs showing medical and scientific domain extractions]

Left: Medical domain relationships (patient data, conditions, treatments)
Right: Scientific domain relationships (methods, performance, datasets)
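
One simple way to turn extractions into a graph is to link each entity to its category. The page does not describe how Aperio builds its edges, so the sketch below is an assumption for illustration only: `build_graph` is a hypothetical helper and the entities are invented sample data.

```python
from collections import defaultdict


def build_graph(entities):
    """Build an undirected adjacency map linking each entity to its category.

    `entities` is an iterable of (category, text) pairs.
    """
    graph = defaultdict(set)
    for category, text in entities:
        graph[category].add(text)
        graph[text].add(category)
    return graph


# Invented sample entities, not output from the Aperio demo.
entities = [
    ("MEDICATION", "metformin"),
    ("CONDITION", "type 2 diabetes"),
    ("MEDICATION", "lisinopril"),
]

graph = build_graph(entities)
print(sorted(graph["MEDICATION"]))  # ['lisinopril', 'metformin']
```

An adjacency map like this can be handed to a plotting library to draw the network; richer edges, such as entity-to-entity relationships, would need information the extraction step supplies.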

Try It Yourself

Ready to explore AI-powered schema discovery? Get started in minutes.

Clone & Run Locally

git clone https://github.com/knightsri/aperio
cd aperio
python -m venv aperio_env
source aperio_env/bin/activate  # on Windows (Git Bash): source aperio_env/Scripts/activate
pip install -r requirements.txt
cp .env.example .env

# Add your Gemini API key to .env
jupyter notebook aperio_demo.ipynb
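
Before launching the notebook, it can help to confirm that the key is actually filled in. The sketch below parses `.env`-style text with the standard library only. The variable name `LANGEXTRACT_API_KEY` is an assumption; check `.env.example` for the name this project actually uses. `read_env` and `has_key` are hypothetical helpers.

```python
def read_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def has_key(text, name="LANGEXTRACT_API_KEY"):  # assumed variable name
    """Return True when `name` is present with a non-empty value."""
    return bool(read_env(text).get(name))


# Example: a freshly copied file where the key is still blank.
sample_env = "# Aperio settings\nLANGEXTRACT_API_KEY=\n"
print(has_key(sample_env))  # False
```

To check a real file, pass its contents, for example `has_key(open(".env").read())`.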