Site Index Estimation

Estimate area-weighted mean site index as a measure of site productivity.

Overview

The site_index() function calculates area-weighted mean site index from FIA condition data. Site index represents the expected height (in feet) of dominant trees at a specified base age, indicating the inherent productivity of a forest site.

import pyfia

db = pyfia.FIA("georgia.duckdb")
db.clip_by_state("GA")

# Basic site index estimation
result = pyfia.site_index(db)

# Site index by county
by_county = pyfia.site_index(db, grp_by="COUNTYCD")

Function Reference

site_index

site_index(db: str | FIA, grp_by: str | list[str] | None = None, land_type: str = 'forest', area_domain: str | None = None, plot_domain: str | None = None, most_recent: bool = False, eval_type: str | None = None) -> DataFrame

Estimate area-weighted mean site index from FIA data.

Calculates area-weighted site index estimates using FIA's design-based estimation methods. Site index represents expected dominant tree height (in feet) at a specified base age, indicating site productivity.

Results are always grouped by SIBASE (base age) because site index values are not comparable across different base ages.

PARAMETER DESCRIPTION
db

Database connection or path to FIA database.

TYPE: str | FIA

grp_by

Column name(s) to group results by. Common grouping columns:

Site Index Species:

  • 'SISP': Species code used for site index determination

Forest Characteristics:

  • 'FORTYPCD': Forest type code
  • 'STDSZCD': Stand size class
  • 'OWNGRPCD': Ownership group

Location:

  • 'STATECD': State FIPS code
  • 'COUNTYCD': County code
  • 'UNITCD': FIA survey unit

TYPE: str or list of str DEFAULT: None

land_type

Land type to include:

  • 'forest': All forestland
  • 'timber': Timberland only (unreserved, productive)
  • 'all': All land types

TYPE: ('forest', 'timber', 'all') DEFAULT: 'forest'

area_domain

SQL-like filter for condition-level attributes. Examples:

  • "OWNGRPCD == 40": Private land only
  • "STDAGE > 20": Stands over 20 years old
  • "FORTYPCD IN (161, 162)": Specific forest types

TYPE: str DEFAULT: None

plot_domain

SQL-like filter for plot-level attributes. Examples:

  • "COUNTYCD == 183": Single county
  • "LAT >= 35.0 AND LAT <= 36.0": Latitude range

TYPE: str DEFAULT: None

most_recent

If True, automatically select most recent evaluation.

TYPE: bool DEFAULT: False

eval_type

Evaluation type to use when most_recent=True. When left as None, 'ALL' is used.

TYPE: str DEFAULT: None

RETURNS DESCRIPTION
DataFrame

Site index estimates with columns:

  • YEAR : int - Inventory year
  • SIBASE : int - Base age (always included)
  • [grouping columns] : varies - Columns from grp_by
  • SI_MEAN : float - Area-weighted mean site index (feet)
  • SI_SE : float - Standard error of mean
  • SI_VARIANCE : float - Variance of estimate
  • N_PLOTS : int - Number of plots in estimate
  • N_CONDITIONS : int - Number of conditions with site index

See Also

  • pyfia.area : Estimate forest area
  • pyfia.volume : Estimate tree volume

Notes

Site index estimation uses the area-weighted mean formula:

SI_mean = sum(SICOND * CONDPROP_UNADJ * ADJ_FACTOR * EXPNS) / sum(CONDPROP_UNADJ * ADJ_FACTOR * EXPNS)

This ratio-of-means estimator requires proper variance calculation accounting for covariance between numerator and denominator.
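
To show why the covariance term matters, here is a minimal sketch of the linearized (delta-method) variance of a ratio. It assumes plots are a simple random sample and ignores FIA stratification, so it is not pyFIA's implementation; the function name `ratio_variance` and the plot values are hypothetical.

```python
from statistics import mean


def ratio_variance(y, x):
    """Linearized variance of R = sum(y) / sum(x) for paired plot values.

    Assumption: plots form a simple random sample. FIA's stratified
    design is NOT handled here.
    """
    n = len(y)
    if n < 2 or n != len(x):
        raise ValueError("need at least two paired plot values")
    y_bar, x_bar = mean(y), mean(x)
    ratio = y_bar / x_bar
    # Sample variances and covariance of the plot-level values
    var_y = sum((yi - y_bar) ** 2 for yi in y) / (n - 1)
    var_x = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)
    cov_yx = sum((yi - y_bar) * (xi - x_bar) for yi, xi in zip(y, x)) / (n - 1)
    # The covariance term lowers the variance when y and x move together
    variance = (var_y + ratio**2 * var_x - 2 * ratio * cov_yx) / (n * x_bar**2)
    return ratio, variance


# Hypothetical plot-level values: y = site index times area weight, x = area weight
y = [70.0, 36.0, 85.0, 48.0]
x = [1.0, 0.5, 1.0, 0.75]
ratio, variance = ratio_variance(y, x)
print(f"ratio = {ratio:.2f}, SE = {variance ** 0.5:.2f}")
```

Dropping the covariance term would overstate the variance here, because plots with more forested area also contribute more to the numerator.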

Important Considerations:

  1. Base Age Comparability: Results are always grouped by SIBASE because site index values are only meaningful within the same base age. Common base ages are 25 years (southern pines) and 50 years (northern species).

  2. Species Specificity: SISP indicates which species equation was used for site index determination. Different species may have different site index scales.

  3. Missing Values: Conditions without site index (non-productive land, recently disturbed, etc.) are excluded from calculations.

Examples:

Basic site index estimation:

>>> from pyfia import FIA, site_index
>>> with FIA("path/to/fia.duckdb") as db:
...     db.clip_by_state(37)  # North Carolina
...     results = site_index(db)

Site index by ownership group:

>>> results = site_index(db, grp_by="OWNGRPCD")

Site index by site index species:

>>> results = site_index(db, grp_by="SISP")

Site index for private timberland:

>>> results = site_index(
...     db,
...     land_type="timber",
...     area_domain="OWNGRPCD == 40",
... )

County-level site index:

>>> results = site_index(db, grp_by="COUNTYCD")
Source code in src/pyfia/estimation/estimators/site_index.py
def site_index(
    db: str | FIA,
    grp_by: str | list[str] | None = None,
    land_type: str = "forest",
    area_domain: str | None = None,
    plot_domain: str | None = None,
    most_recent: bool = False,
    eval_type: str | None = None,
) -> pl.DataFrame:
    """
    Estimate area-weighted mean site index from FIA data.

    Calculates area-weighted site index estimates using FIA's design-based
    estimation methods. Site index represents expected dominant tree height
    (in feet) at a specified base age, indicating site productivity.

    Results are always grouped by SIBASE (base age) because site index
    values are not comparable across different base ages.

    Parameters
    ----------
    db : str | FIA
        Database connection or path to FIA database.
    grp_by : str or list of str, optional
        Column name(s) to group results by. Common grouping columns:

        **Site Index Species:**
        - 'SISP': Species code used for site index determination

        **Forest Characteristics:**
        - 'FORTYPCD': Forest type code
        - 'STDSZCD': Stand size class
        - 'OWNGRPCD': Ownership group

        **Location:**
        - 'STATECD': State FIPS code
        - 'COUNTYCD': County code
        - 'UNITCD': FIA survey unit
    land_type : {'forest', 'timber', 'all'}, default 'forest'
        Land type to include:

        - 'forest': All forestland
        - 'timber': Timberland only (unreserved, productive)
        - 'all': All land types
    area_domain : str, optional
        SQL-like filter for condition-level attributes. Examples:

        - "OWNGRPCD == 40": Private land only
        - "STDAGE > 20": Stands over 20 years old
        - "FORTYPCD IN (161, 162)": Specific forest types
    plot_domain : str, optional
        SQL-like filter for plot-level attributes. Examples:

        - "COUNTYCD == 183": Single county
        - "LAT >= 35.0 AND LAT <= 36.0": Latitude range
    most_recent : bool, default False
        If True, automatically select most recent evaluation.
    eval_type : str, optional
        Evaluation type if most_recent=True. Default is 'ALL'.

    Returns
    -------
    pl.DataFrame
        Site index estimates with columns:

        - **YEAR** : int - Inventory year
        - **SIBASE** : int - Base age (always included)
        - **[grouping columns]** : varies - Columns from grp_by
        - **SI_MEAN** : float - Area-weighted mean site index (feet)
        - **SI_SE** : float - Standard error of mean
        - **SI_VARIANCE** : float - Variance of estimate
        - **N_PLOTS** : int - Number of plots in estimate
        - **N_CONDITIONS** : int - Number of conditions with site index

    See Also
    --------
    pyfia.area : Estimate forest area
    pyfia.volume : Estimate tree volume

    Notes
    -----
    Site index estimation uses the area-weighted mean formula:

    SI_mean = sum(SICOND * CONDPROP_UNADJ * ADJ_FACTOR * EXPNS) /
              sum(CONDPROP_UNADJ * ADJ_FACTOR * EXPNS)

    This ratio-of-means estimator requires proper variance calculation
    accounting for covariance between numerator and denominator.

    **Important Considerations:**

    1. **Base Age Comparability**: Results are always grouped by SIBASE
       because site index values are only meaningful within the same base
       age. Common base ages are 25 years (southern pines) and 50 years
       (northern species).

    2. **Species Specificity**: SISP indicates which species equation was
       used for site index determination. Different species may have
       different site index scales.

    3. **Missing Values**: Conditions without site index (non-productive
       land, recently disturbed, etc.) are excluded from calculations.

    Examples
    --------
    Basic site index estimation:

    >>> from pyfia import FIA, site_index
    >>> with FIA("path/to/fia.duckdb") as db:
    ...     db.clip_by_state(37)  # North Carolina
    ...     results = site_index(db)

    Site index by ownership group:

    >>> results = site_index(db, grp_by="OWNGRPCD")

    Site index by site index species:

    >>> results = site_index(db, grp_by="SISP")

    Site index for private timberland:

    >>> results = site_index(
    ...     db,
    ...     land_type="timber",
    ...     area_domain="OWNGRPCD == 40",
    ... )

    County-level site index:

    >>> results = site_index(db, grp_by="COUNTYCD")
    """
    # Validate inputs
    inputs = validate_estimator_inputs(
        land_type=land_type,
        grp_by=grp_by,
        area_domain=area_domain,
        plot_domain=plot_domain,
        variance=False,
        totals=False,
        most_recent=most_recent,
    )

    # Ensure db is a FIA instance
    db_instance, owns_db = ensure_fia_instance(db)

    # Ensure EVALID is set
    ensure_evalid_set(
        db_instance, eval_type=eval_type or "ALL", estimator_name="site_index"
    )

    # Create config
    config = {
        "grp_by": inputs.grp_by,
        "land_type": inputs.land_type,
        "area_domain": inputs.area_domain,
        "plot_domain": inputs.plot_domain,
        "most_recent": inputs.most_recent,
        "eval_type": eval_type,
    }

    try:
        estimator = SiteIndexEstimator(db_instance, config)
        return estimator.estimate()
    finally:
        if owns_db and hasattr(db_instance, "close"):
            db_instance.close()

Key Concepts

Base Age (SIBASE)

Site index values are only comparable within the same base age. Results are always grouped by SIBASE to ensure comparability.

Base Age | Common Usage
25 years | Southern pines, fast-growing species
50 years | Northern hardwoods, slower-growing species
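
Because each base age has its own rows, any ranking or comparison should be restricted to one SIBASE value first. The sketch below uses plain dictionaries as a stand-in for the result rows; the values and the helper `rank_within_base_age` are hypothetical.

```python
# Hypothetical output rows, shaped like the columns site_index() returns
rows = [
    {"SIBASE": 25, "COUNTYCD": 1, "SI_MEAN": 68.2},
    {"SIBASE": 50, "COUNTYCD": 1, "SI_MEAN": 81.5},
    {"SIBASE": 25, "COUNTYCD": 3, "SI_MEAN": 72.9},
]


def rank_within_base_age(rows, sibase):
    """Rank groups by SI_MEAN using only rows that share one base age."""
    same_base = [row for row in rows if row["SIBASE"] == sibase]
    return sorted(same_base, key=lambda row: row["SI_MEAN"], reverse=True)


ranked = rank_within_base_age(rows, sibase=25)
for row in ranked:
    print(row["COUNTYCD"], row["SI_MEAN"])
```

On the polars DataFrame that site_index() returns, the same restriction should be expressible as `result.filter(pl.col("SIBASE") == 25)` before sorting.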

Site Index Species (SISP)

The SISP column indicates which species' height-age equation was used to calculate site index. Different species may have different site index scales even at the same base age.

Technical Notes

Site index estimation uses:

  • COND table for site index values (SICOND, SIBASE, SISP)
  • Condition-level estimation (not tree-level like volume or TPA)
  • Area-weighted mean: Each condition's site index is weighted by its proportion of plot area
  • Conditions without site index (null SICOND) are excluded

Calculation Method

The area-weighted mean is calculated as:

SI_mean = Σ(SICOND × CONDPROP_UNADJ × ADJ_FACTOR × EXPNS) / Σ(CONDPROP_UNADJ × ADJ_FACTOR × EXPNS)

This ratio-of-means estimator follows Bechtold & Patterson (2005) methodology with proper variance calculation.
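
As a worked illustration of the formula, the following sketch computes the area-weighted mean for a few conditions and skips those with a null SICOND. The condition values and the function name `area_weighted_site_index` are made up for the example; it covers the point estimate only, not the variance.

```python
def area_weighted_site_index(conditions):
    """Area-weighted mean SICOND over conditions that have a site index.

    Each condition is a dict with SICOND, CONDPROP_UNADJ, ADJ_FACTOR, EXPNS.
    Returns None when no condition carries a site index.
    """
    numerator = 0.0
    denominator = 0.0
    for cond in conditions:
        if cond["SICOND"] is None:
            continue  # conditions without site index are excluded
        weight = cond["CONDPROP_UNADJ"] * cond["ADJ_FACTOR"] * cond["EXPNS"]
        numerator += cond["SICOND"] * weight
        denominator += weight
    if denominator == 0:
        return None
    return numerator / denominator


# Hypothetical conditions; the third has no site index and is ignored
conditions = [
    {"SICOND": 70, "CONDPROP_UNADJ": 1.0, "ADJ_FACTOR": 1.0, "EXPNS": 6000.0},
    {"SICOND": 90, "CONDPROP_UNADJ": 0.5, "ADJ_FACTOR": 1.0, "EXPNS": 6000.0},
    {"SICOND": None, "CONDPROP_UNADJ": 0.5, "ADJ_FACTOR": 1.0, "EXPNS": 6000.0},
]
si_mean = area_weighted_site_index(conditions)
print(f"{si_mean:.1f} ft")
```

The full-plot condition counts twice as much as the half-plot condition, so the mean sits closer to 70 than to 90.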

Examples

Statewide Site Index

result = pyfia.site_index(db)
# Results hold one row per base age (SIBASE), so report every row
for row in result.iter_rows(named=True):
    print(f"Mean Site Index: {row['SI_MEAN']:.1f} ft at base age {row['SIBASE']}")

Site Index by County

result = pyfia.site_index(db, grp_by="COUNTYCD")
print(result.sort("SI_MEAN", descending=True).head(10))

Site Index by Ownership

result = pyfia.site_index(db, grp_by="OWNGRPCD")
# OWNGRPCD: 10=Forest Service, 20=Other Federal, 30=State/Local, 40=Private
print(result)

Site Index by Forest Type

result = pyfia.site_index(db, grp_by="FORTYPCD")
result = pyfia.join_forest_type_names(result, db)
print(result.sort("SI_MEAN", descending=True).head(10))

Site Index by Site Index Species

Group by the species equation used to calculate site index:

result = pyfia.site_index(db, grp_by="SISP")
result = pyfia.join_species_names(result, db, spcd_column="SISP")
print(result)

Private Timberland Only

result = pyfia.site_index(
    db,
    land_type="timber",
    area_domain="OWNGRPCD == 40"
)
print(result)

Productive Sites Only

Filter to high productivity sites (site class 1-3):

result = pyfia.site_index(
    db,
    area_domain="SITECLCD IN (1, 2, 3)"
)

Multiple Grouping Variables

result = pyfia.site_index(
    db,
    grp_by=["OWNGRPCD", "FORTYPCD"]
)
print(result)

Interpreting Results

Column       | Description
YEAR         | Inventory year
SIBASE       | Base age in years (always included)
SI_MEAN      | Area-weighted mean site index (feet)
SI_SE        | Standard error of the mean
SI_VARIANCE  | Variance of the estimate
N_PLOTS      | Number of plots contributing to estimate
N_CONDITIONS | Number of conditions with site index values
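
One way to read SI_MEAN together with SI_SE is to turn them into an approximate confidence interval. The sketch below assumes a normal approximation (z = 1.96 for roughly 95%) and that SI_SE is the square root of SI_VARIANCE; the row values and the helper `confidence_interval` are hypothetical.

```python
import math


def confidence_interval(si_mean, si_se, z=1.96):
    """Approximate interval: mean plus or minus z standard errors."""
    half_width = z * si_se
    return si_mean - half_width, si_mean + half_width


# Hypothetical result row
row = {"SI_MEAN": 75.0, "SI_SE": 2.0, "SI_VARIANCE": 4.0}

low, high = confidence_interval(row["SI_MEAN"], row["SI_SE"])
relative_se_pct = 100 * row["SI_SE"] / row["SI_MEAN"]

# Assumed relationship between the two uncertainty columns
assert math.isclose(row["SI_SE"], math.sqrt(row["SI_VARIANCE"]))

print(f"{low:.2f} to {high:.2f} ft, relative SE {relative_se_pct:.1f}%")
```

Estimates with few plots (low N_PLOTS) tend to have wide intervals, so it is worth checking N_PLOTS before comparing groups.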

Comparison with Other Estimators

Unlike tree-based estimators (volume(), tpa(), mortality()), site_index():

  • Uses condition-level data, not tree-level
  • Has no measure or tree_type parameters
  • Always groups by SIBASE (base age)
  • Returns a mean value, not a total