pyFIA Architecture

What is pyFIA?

pyFIA is a Python library for analyzing USDA Forest Inventory and Analysis (FIA) data. It provides:

  • Statistical estimation functions for forest metrics (area, volume, biomass, etc.)
  • FIA-standard API following official FIA methodology
  • High performance using DuckDB and Polars
  • Proper FIA methodology with EVALID-based statistical validity

Core Architecture

graph TB
    %% Entry Points
    subgraph "Entry Layer"
        PY[Python API<br/>import pyfia]
        CLI[pyfia CLI<br/>Direct Functions]
    end

    %% Core Components
    subgraph "Core Layer"
        FIA[FIA Class<br/>Database Connection<br/>EVALID Management]
        DR[Data Reader<br/>DuckDB Interface]
    end

    %% Processing
    subgraph "Processing Layer"
        EST[Estimation Functions<br/>area, volume, biomass<br/>tpa, mortality, growth]
        FILT[Filters<br/>Domain, EVALID<br/>Grouping, Joins]
        UTILS[Utilities<br/>Statistical Calculations<br/>Stratification]
    end

    %% Data
    subgraph "Data Layer"
        DB[(DuckDB<br/>FIA Database)]
    end

    %% Direct Path
    PY --> FIA
    CLI --> FIA
    FIA --> EST
    EST --> FILT
    EST --> UTILS
    FIA --> DR
    DR --> DB

    style FIA fill:#e74c3c
    style EST fill:#2ecc71
    style DB fill:#34495e

Data Flow

Direct API Flow

sequenceDiagram
    participant User
    participant pyFIA
    participant FIA Class
    participant Estimator
    participant Database

    User->>pyFIA: area(db, evalid=372301)
    pyFIA->>FIA Class: Get filtered data
    FIA Class->>Database: Query with EVALID
    Database-->>FIA Class: Plot/Condition data
    FIA Class-->>pyFIA: Filtered DataFrames
    pyFIA->>Estimator: Calculate estimates
    Estimator-->>pyFIA: Results with SE
    pyFIA-->>User: DataFrame with estimates
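
The sequence above can be sketched with toy stand-ins. This is a minimal illustration of the call order only, not pyFIA's implementation: `ToyFIA`, its `conditions()` method, the `total_acres` parameter, and the in-memory rows are all hypothetical, and the standard error uses a plain simple-random-sampling formula rather than FIA's stratified estimators.

```python
from math import sqrt
from statistics import mean, stdev

# Toy stand-in for the database: (EVALID, plot id, forested proportion of plot).
ROWS = [
    (372301, 1, 1.0),
    (372301, 2, 0.5),
    (372301, 3, 0.0),
    (372201, 1, 1.0),  # same plot, different evaluation
]


class ToyFIA:
    """Hypothetical stand-in for the FIA class: holds data, filters by EVALID."""

    def __init__(self, rows):
        self.rows = rows

    def conditions(self, evalid):
        # "Query with EVALID": keep only rows from the requested evaluation.
        return [prop for ev, _plot, prop in self.rows if ev == evalid]


def area(db, evalid, total_acres):
    """Toy estimator: returns an estimate together with its standard error."""
    props = db.conditions(evalid)  # filtered data from the FIA class
    estimate = total_acres * mean(props)
    se = total_acres * stdev(props) / sqrt(len(props))
    return {"estimate": estimate, "se": se, "n_plots": len(props)}


result = area(ToyFIA(ROWS), evalid=372301, total_acres=3000.0)
print(result)
```

The point of the sketch is the division of labor: the entry function delegates filtering to the database-facing class and only then computes the estimate and its uncertainty.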

Key Components

Core Components

| Component   | Purpose                    | Key Functions                          |
|-------------|----------------------------|----------------------------------------|
| FIA Class   | Main interface to database | clipFIA(), readFIA(), findEvalid()     |
| Data Reader | Database abstraction       | Handles DuckDB connections and queries |
| Settings    | Configuration management   | Database paths, default options        |

Estimation Functions

| Function    | Calculates       | Key Features                           |
|-------------|------------------|----------------------------------------|
| area()      | Forest land area | By forest type, ownership, size class  |
| biomass()   | Tree biomass     | Above/below ground, carbon content     |
| volume()    | Wood volume      | Net/gross, merch/sound, board feet     |
| tpa()       | Trees per acre   | By species, size, status               |
| mortality() | Annual mortality | Trees, volume, biomass                 |
| growth()    | Annual growth    | Net growth accounting for mortality    |

Filter System

| Filter Type    | Purpose              | Example                            |
|----------------|----------------------|------------------------------------|
| EVALID         | Statistical validity | Only use data from one evaluation  |
| Domain         | Tree/area filtering  | "DIA >= 5", "OWNGRPCD == 10"       |
| Grouping       | Result aggregation   | By species, size class, ownership  |
| Classification | Tree categorization  | Live/dead, growing stock           |
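
To make the Domain row concrete, here is a minimal sketch of how a domain expression such as "DIA >= 5" could be turned into a reusable predicate and combined with another one. It assumes a simple `COLUMN OP VALUE` form with numeric values; `parse_domain` and the sample rows are hypothetical and do not show how pyFIA itself parses domain strings.

```python
import operator

# Comparison operators supported by this toy parser.
OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    "<": operator.lt,
}


def parse_domain(expr):
    """Turn 'COLUMN OP VALUE' into a function that tests one row (a dict)."""
    column, op, value = expr.split()
    threshold = float(value)
    return lambda row: OPS[op](row[column], threshold)


# Illustrative tree records using FIA column names from the table above.
trees = [
    {"DIA": 4.2, "OWNGRPCD": 10},
    {"DIA": 7.5, "OWNGRPCD": 10},
    {"DIA": 12.0, "OWNGRPCD": 40},
]

large = parse_domain("DIA >= 5")
owngrp_10 = parse_domain("OWNGRPCD == 10")

# Filters compose by requiring every predicate to hold.
selected = [t for t in trees if large(t) and owngrp_10(t)]
print(selected)
```

Because each filter is just a function of a row, any number of them can be chained without the estimators needing to know which filters are active.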

Design Principles

1. Statistical Validity First

  • EVALID-based filtering ensures proper population estimates
  • All estimators follow FIA statistical methodology
  • Standard errors and confidence intervals included

2. Performance Optimized

  • DuckDB for fast analytical queries
  • Polars for efficient data manipulation
  • Lazy evaluation where possible

3. FIA Standards Compliance

  • Function signatures follow FIA methodology
  • Parameter names use FIA standard conventions
  • Statistical outputs match FIA standards

4. Modular Design

  • Estimation functions are independent
  • Filters can be composed
  • Easy to add new estimators

5. User Friendly

  • Consistent function signatures
  • Clear parameter names
  • Rich documentation and examples

File Organization

src/pyfia/
├── core/           # Database connection, EVALID management
├── estimation/     # Statistical estimation functions
├── filters/        # Data filtering and processing
├── cli/            # Command-line interface
├── database/       # Database utilities and schema
├── models/         # Data models (Pydantic)
└── locations/      # Geographic parsing utilities

Key Concepts

EVALID System

The heart of FIA's statistical design:

  • Groups plots into valid populations
  • Ensures proper expansion factors
  • Links to specific time periods
  • Required for all population estimates
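
A small sketch shows why estimates should draw on a single evaluation. It assumes, as a simplification, that a plot can be assigned to more than one evaluation; the assignment rows and plot ids below are invented for illustration.

```python
# Toy plot-to-evaluation assignments: (EVALID, plot id). Invented data.
assignments = [
    (372201, 101),
    (372201, 102),
    (372301, 101),
    (372301, 102),
    (372301, 103),
]

# Without an EVALID filter, plots 101 and 102 are counted twice.
all_plots = [plot for _evalid, plot in assignments]

# Restricting to one evaluation yields each plot exactly once.
one_eval = [plot for evalid, plot in assignments if evalid == 372301]

print(len(all_plots), len(set(all_plots)))
print(len(one_eval), len(set(one_eval)))
```

Pooling rows across evaluations would apply expansion factors to the same plot more than once, which inflates totals; filtering on one EVALID avoids that.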

Stratification

FIA uses post-stratified estimation:

  1. Plots are assigned to strata
  2. Strata have expansion factors
  3. Estimates are calculated by stratum
  4. Stratum estimates are combined into population totals
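
The four steps can be worked through on a tiny example. This is a sketch under stated assumptions: the strata, plot values, and expansion factors are invented, `expns` is treated as acres represented by each plot in a stratum, and the variance is the basic stratified form without the additional post-stratification correction term.

```python
from collections import defaultdict
from math import sqrt
from statistics import variance

# Step 1: plots assigned to strata -> (stratum, forested proportion of plot).
plots = [("A", 1.0), ("A", 0.5), ("B", 0.0), ("B", 0.5)]

# Step 2: each stratum has an expansion factor (acres represented per plot).
expns = {"A": 1000.0, "B": 2000.0}

by_stratum = defaultdict(list)
for stratum, value in plots:
    by_stratum[stratum].append(value)

# Step 3: estimate within each stratum.
stratum_totals = {s: expns[s] * sum(ys) for s, ys in by_stratum.items()}

# Step 4: combine strata into a population total.
total = sum(stratum_totals.values())

# Simplified variance of the total: sum over strata of expns^2 * n_h * s_h^2.
var_total = sum(
    expns[s] ** 2 * len(ys) * variance(ys) for s, ys in by_stratum.items()
)
se = sqrt(var_total)

print(stratum_totals, total, round(se, 1))
```

Here stratum A contributes 1000 × 1.5 = 1500 acres and stratum B contributes 2000 × 0.5 = 1000 acres, for a total of 2500 forested acres.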

FIA Methodology

pyFIA follows official FIA methodology:

  • Standard FIA estimation procedures
  • Official statistical methodology
  • FIA-compliant output structures
  • Proper expansion factors and variance calculation

This architecture provides a solid foundation for forest inventory analysis following official FIA statistical standards and methodology.