Live Tree Carbon Estimation

Estimate live tree carbon using the NSVB biomass framework with species-specific carbon fractions.

Overview

The live_tree() function recomputes above-ground live tree biomass from scratch using the National Scale Volume and Biomass (NSVB) framework of Westfall et al. (2023, GTR-WO-104) and converts it to carbon via the species-specific carbon fractions in Table S10a. The resulting estimates align with the EPA NGHGI LULUCF live tree pool and match FIADB's pre-computed CARBON_AG column for NSVB-era inventories (September 2023 onward).

import pyfia

db = pyfia.FIA("georgia.duckdb")
db.clip_by_state("GA")
db.clip_most_recent(eval_type="VOL")

# Above-ground live tree carbon
result = pyfia.live_tree(db, pool="ag")

# Total carbon (AG + BG bridge)
total = pyfia.live_tree(db, pool="total")

Function Reference

live_tree

live_tree(db: str | FIA, pool: str = 'ag', grp_by: str | list[str] | None = None, by_species: bool = False, by_size_class: bool = False, land_type: str = 'forest', tree_domain: str | None = None, area_domain: str | None = None, plot_domain: str | None = None, totals: bool = True, variance: bool = False, most_recent: bool = False) -> DataFrame

Estimate live tree carbon from FIA data using the NSVB framework.

Recomputes above-ground live tree biomass from scratch using the National Scale Volume and Biomass (NSVB) framework of Westfall et al. (2023, GTR-WO-104) — the same framework USDA FIA uses to populate the FIADB CARBON_AG column for inventories from September 2023 onward. Species-specific live carbon fractions from Table S10a (GTR-WO-104) replace the flat ~0.47 multiplier used by pyfia.biomass(), producing carbon estimates that align with the EPA NGHGI LULUCF live tree pool.

Belowground carbon is bridged directly to the FIADB pre-computed TREE.CARBON_BG column; a native NSVB coarse-root model is deferred.

PARAMETER	DESCRIPTION
`db`	Database connection or path to FIA database. Can be either a path string to a DuckDB/SQLite file or an existing FIA connection object. TYPE: `str \| FIA`
`pool`	Live tree carbon pool to estimate: 'ag': Above-ground live tree carbon via the NSVB pipeline (stem wood + stem bark + branches), harmonized to the directly predicted total AGB and converted to carbon via species-specific S10a fractions. Foliage is excluded (not part of AGB in NSVB). 'bg': Below-ground live tree carbon (coarse roots) via a bridge to FIADB `TREE.CARBON_BG`. A native NSVB root model is deferred. 'total': `'ag' + 'bg'` (NSVB AG + FIADB BG bridge). TYPE: `('ag', 'bg', 'total')` DEFAULT: `'ag'`
`grp_by`	Column name(s) to group results by. Can be any column from the FIA tables used in the estimation (PLOT, COND, TREE). Common grouping columns include: 'FORTYPCD': Forest type code 'OWNGRPCD': Ownership group (10=National Forest, 20=Other Federal, 30=State/Local, 40=Private) 'STATECD': State FIPS code 'COUNTYCD': County code 'INVYR': Inventory year 'STDAGE': Stand age class 'SITECLCD': Site productivity class For complete column descriptions, see USDA FIA Database User Guide. TYPE: `str or list of str` DEFAULT: `None`
`by_species`	If True, group results by species code (SPCD). Convenience parameter equivalent to adding 'SPCD' to `grp_by`. TYPE: `bool` DEFAULT: `False`
`by_size_class`	If True, group results by diameter size classes in inches (1.0-4.9, 5.0-9.9, 10.0-19.9, 20.0-29.9, 30.0+). TYPE: `bool` DEFAULT: `False`
`land_type`	Land type to include in estimation: 'forest': All forestland 'timber': Productive timberland only (unreserved, productive) 'all': All land conditions TYPE: `('forest', 'timber', 'all')` DEFAULT: `'forest'`
`tree_domain`	SQL-like filter expression for tree-level filtering. Example: `"DIA >= 10.0 AND SPCD == 131"`. Applied on top of the live-tree filter (`STATUSCD == 1`), which is always on for this function. TYPE: `str` DEFAULT: `None`
`area_domain`	SQL-like filter expression for area/condition-level filtering. Example: `"OWNGRPCD == 40 AND FORTYPCD == 161"`. TYPE: `str` DEFAULT: `None`
`plot_domain`	SQL-like filter expression for plot-level filtering. TYPE: `str` DEFAULT: `None`
`totals`	If True, include population-level total estimates in addition to per-acre values. TYPE: `bool` DEFAULT: `True`
`variance`	If True, calculate and include variance and standard error estimates following Bechtold & Patterson (2005). TYPE: `bool` DEFAULT: `False`
`most_recent`	If True and no EVALID is already set on the database, automatically filter to the most recent EXPVOL evaluation for each state before estimation. TYPE: `bool` DEFAULT: `False`
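
As a rough illustration of the `by_size_class` grouping, the sketch below bins diameters into the five classes listed in the table. The function name `size_class` and the label strings are assumptions made for this example, not pyfia's internal implementation; trees under 1.0 in. are assumed to fall outside every class.

```python
from bisect import bisect_right
from typing import Optional

# Lower bounds (inches) of the diameter classes listed for by_size_class.
BREAKS = [1.0, 5.0, 10.0, 20.0, 30.0]
LABELS = ["1.0-4.9", "5.0-9.9", "10.0-19.9", "20.0-29.9", "30.0+"]


def size_class(dia: float) -> Optional[str]:
    """Return the size-class label for a diameter, or None below 1.0 in."""
    idx = bisect_right(BREAKS, dia) - 1  # index of the last lower bound <= dia
    return LABELS[idx] if idx >= 0 else None


for dia in (0.5, 4.9, 5.0, 19.9, 34.2):
    print(dia, size_class(dia))
```

Using lower bounds with `bisect_right` keeps each boundary value (5.0, 10.0, and so on) in the class that starts at it.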

RETURNS DESCRIPTION

DataFrame

Live tree carbon estimates with the following columns:

YEAR : int Evaluation reference year from EVALID.
POOL : str Pool identifier — one of 'AG', 'BG', 'TOTAL'.
CARBON_ACRE : float Carbon per acre in short tons.
CARBON_TOTAL : float (if totals=True) Total carbon in short tons expanded to population level.
CARBON_ACRE_SE : float (if variance=True) Standard error of the per-acre estimate.
CARBON_TOTAL_SE : float (if variance=True and totals=True) Standard error of the population total.
N_PLOTS : int Number of FIA plots included in the estimation.
N_TREES : int Number of individual tree records.
[grouping columns] : various Any columns specified in grp_by or via by_species / by_size_class.
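
The per-acre estimate and its standard error can be combined into an approximate confidence interval, and short tons of carbon can be converted to metric tonnes of CO2 equivalent for reporting. The following is a sketch with made-up input values; the helper names are hypothetical, and the 1.96 multiplier assumes an approximately normal sampling distribution.

```python
SHORT_TON_TO_TONNE = 0.90718474  # 1 short ton = 2000 lb = 0.90718474 metric tonnes
C_TO_CO2 = 44.0 / 12.0           # molecular weight ratio of CO2 to C


def confidence_interval(estimate, se, z=1.96):
    """Approximate interval: estimate +/- z * standard error."""
    return estimate - z * se, estimate + z * se


def short_tons_c_to_tonnes_co2e(short_tons_c):
    """Convert short tons of carbon to metric tonnes of CO2 equivalent."""
    return short_tons_c * SHORT_TON_TO_TONNE * C_TO_CO2


# Illustrative values only, standing in for CARBON_ACRE and CARBON_ACRE_SE.
carbon_acre, carbon_acre_se = 25.0, 0.5
low, high = confidence_interval(carbon_acre, carbon_acre_se)
print(f"{carbon_acre:.1f} short tons C/acre (95% CI {low:.2f} to {high:.2f})")
print(f"{short_tons_c_to_tonnes_co2e(carbon_acre):.1f} t CO2e/acre")
```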

See Also

biomass : Estimate tree biomass (dry weight) using FIA's pre-computed DRYBIO columns.
pyfia.estimation.estimators.carbon.carbon : Legacy carbon estimator that reads the FIADB CARBON_AG / CARBON_BG columns directly. live_tree is the NSVB-native alternative; both should agree at the tree level for NSVB-era inventories (Sep 2023 and later).
pyfia.carbon.nsvb.equations.compute_nsvb_biomass : The vectorized NSVB biomass pipeline this function wraps.

Notes

NSVB Pipeline

For each live tree the function predicts:

Stem inside-bark wood volume (S1a)
Stem bark volume (S2a)
Stem bark biomass (S6a)
Branch biomass (S7a)
Total AGB (S8a), predicted directly from D and H

The wood, bark, and branch weights derived from the first four predictions are summed and scaled proportionally to the directly predicted total AGB (which is treated as the reference value), yielding w_wood + w_bark + w_branch == agb by construction. Cull-reduced wood weight uses the Harmon et al. (2011) DECAYCD=3 density proportions (0.54 hardwood, 0.92 softwood). The hardwood/softwood split follows the SPCD < 300 rule, which is consistent with the NSVB Model 2 k constant selection and correctly classifies SPCD=10 (fir spp.) as softwood despite S10a's misclassification.

Carbon = AGB × species-specific S10a fraction. Species missing from S10a fall back to the S10a arithmetic mean (~0.4741), with a warn-once log entry.

Belowground Bridge

The current implementation does not include the Heath et al. (2009) coarse-root model. When pool in ('bg', 'total'), the function reads FIADB TREE.CARBON_BG directly and adds it to the estimate. A native NSVB BG model is deferred.
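
A minimal sketch of how the three pool options relate, written against plain per-tree numbers rather than pyfia's estimator classes. The function names are hypothetical; the 2024 cutoff mirrors the year check in the source listing and is only a best-effort proxy for the September 2023 transition.

```python
VALID_POOLS = {"ag", "bg", "total"}


def pool_carbon(pool, carbon_ag_nsvb, carbon_bg_fiadb):
    """Pick the carbon value for one tree according to the requested pool."""
    pool = pool.lower()
    if pool not in VALID_POOLS:
        raise ValueError(
            f"Invalid pool '{pool}'. Must be one of: {sorted(VALID_POOLS)}"
        )
    if pool == "ag":
        return carbon_ag_nsvb
    if pool == "bg":
        return carbon_bg_fiadb
    # 'total' adds the FIADB below-ground value to the NSVB above-ground value.
    return carbon_ag_nsvb + carbon_bg_fiadb


def mixes_eras(pool, evaluation_year):
    """True when NSVB AG would be combined with pre-NSVB FIADB BG values."""
    return pool.lower() == "total" and evaluation_year < 2024


print(pool_carbon("total", 100.0, 20.0))
print(mixes_eras("total", 2021))
```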

EVALID Handling

If no EVALID is set on the database and most_recent=True, the function auto-selects the most recent EXPVOL evaluation. For explicit control, call db.clip_by_evalid(...) before calling live_tree.

Examples:

Above-ground live tree carbon per acre on forestland:

>>> results = live_tree(db, pool="ag")
>>> print(f"Carbon: {results['CARBON_ACRE'][0]:.1f} tons/acre")

Total live tree carbon (AG + BG bridge) by ownership group:

>>> results = live_tree(db, pool="total", grp_by="OWNGRPCD")
>>> for row in results.iter_rows(named=True):
...     print(f"OWNGRPCD {row['OWNGRPCD']}: {row['CARBON_ACRE']:.2f} tons/acre")

Above-ground carbon by species on timberland with standard errors:

>>> results = live_tree(
...     db,
...     pool="ag",
...     by_species=True,
...     land_type="timber",
...     variance=True,
... )

Large live tree carbon (≥ 20" DBH) by forest type:

>>> results = live_tree(
...     db,
...     pool="ag",
...     grp_by="FORTYPCD",
...     tree_domain="DIA >= 20.0",
...     totals=True,
... )

Source code in src/pyfia/carbon/live_tree.py

def live_tree(
    db: str | FIA,
    pool: str = "ag",
    grp_by: str | list[str] | None = None,
    by_species: bool = False,
    by_size_class: bool = False,
    land_type: str = "forest",
    tree_domain: str | None = None,
    area_domain: str | None = None,
    plot_domain: str | None = None,
    totals: bool = True,
    variance: bool = False,
    most_recent: bool = False,
) -> pl.DataFrame:
    """
    Estimate live tree carbon from FIA data using the NSVB framework.

    Recomputes above-ground live tree biomass from scratch using the
    National Scale Volume and Biomass (NSVB) framework of Westfall et al.
    (2023, GTR-WO-104) — the same framework USDA FIA uses to populate the
    FIADB ``CARBON_AG`` column for inventories from September 2023 onward.
    Species-specific live carbon fractions from Table S10a (GTR-WO-104)
    replace the flat ~0.47 multiplier used by ``pyfia.biomass()``, producing
    carbon estimates that align with the EPA NGHGI LULUCF live tree pool.

    Belowground carbon is bridged directly to the FIADB pre-computed
    ``TREE.CARBON_BG`` column; a native NSVB coarse-root model is
    deferred.

    Parameters
    ----------
    db : str | FIA
        Database connection or path to FIA database. Can be either a path
        string to a DuckDB/SQLite file or an existing FIA connection object.
    pool : {'ag', 'bg', 'total'}, default 'ag'
        Live tree carbon pool to estimate:

        - 'ag': Above-ground live tree carbon via the NSVB pipeline —
          stem wood + stem bark + branches, harmonized to the directly-
          predicted total AGB and converted to carbon via species-specific
          S10a fractions. Foliage is excluded (not part of AGB in NSVB).
        - 'bg': Below-ground live tree carbon (coarse roots) via a bridge
          to FIADB ``TREE.CARBON_BG``. A native NSVB root model is deferred.
        - 'total': ``'ag' + 'bg'`` (NSVB AG + FIADB BG bridge).
    grp_by : str or list of str, optional
        Column name(s) to group results by. Can be any column from the
        FIA tables used in the estimation (PLOT, COND, TREE). Common
        grouping columns include:

        - 'FORTYPCD': Forest type code
        - 'OWNGRPCD': Ownership group (10=National Forest, 20=Other Federal,
          30=State/Local, 40=Private)
        - 'STATECD': State FIPS code
        - 'COUNTYCD': County code
        - 'INVYR': Inventory year
        - 'STDAGE': Stand age class
        - 'SITECLCD': Site productivity class

        For complete column descriptions, see USDA FIA Database User Guide.
    by_species : bool, default False
        If True, group results by species code (SPCD). Convenience parameter
        equivalent to adding 'SPCD' to ``grp_by``.
    by_size_class : bool, default False
        If True, group results by diameter size classes (1.0-4.9", 5.0-9.9",
        10.0-19.9", 20.0-29.9", 30.0+ in).
    land_type : {'forest', 'timber', 'all'}, default 'forest'
        Land type to include in estimation:

        - 'forest': All forestland
        - 'timber': Productive timberland only (unreserved, productive)
        - 'all': All land conditions
    tree_domain : str, optional
        SQL-like filter expression for tree-level filtering. Example:
        ``"DIA >= 10.0 AND SPCD == 131"``. Applied on top of the live-tree
        filter (``STATUSCD == 1``), which is always on for this function.
    area_domain : str, optional
        SQL-like filter expression for area/condition-level filtering.
        Example: ``"OWNGRPCD == 40 AND FORTYPCD == 161"``.
    plot_domain : str, optional
        SQL-like filter expression for plot-level filtering.
    totals : bool, default True
        If True, include population-level total estimates in addition to
        per-acre values.
    variance : bool, default False
        If True, calculate and include variance and standard error
        estimates following Bechtold & Patterson (2005).
    most_recent : bool, default False
        If True and no EVALID is already set on the database, automatically
        filter to the most recent EXPVOL evaluation for each state before
        estimation.

    Returns
    -------
    pl.DataFrame
        Live tree carbon estimates with the following columns:

        - **YEAR** : int
            Evaluation reference year from EVALID.
        - **POOL** : str
            Pool identifier — one of ``'AG'``, ``'BG'``, ``'TOTAL'``.
        - **CARBON_ACRE** : float
            Carbon per acre in short tons.
        - **CARBON_TOTAL** : float (if ``totals=True``)
            Total carbon in short tons expanded to population level.
        - **CARBON_ACRE_SE** : float (if ``variance=True``)
            Standard error of the per-acre estimate.
        - **CARBON_TOTAL_SE** : float (if ``variance=True`` and ``totals=True``)
            Standard error of the population total.
        - **N_PLOTS** : int
            Number of FIA plots included in the estimation.
        - **N_TREES** : int
            Number of individual tree records.
        - **[grouping columns]** : various
            Any columns specified in ``grp_by`` or via ``by_species`` /
            ``by_size_class``.

    See Also
    --------
    biomass : Estimate tree biomass (dry weight) using FIA's pre-computed DRYBIO columns.
    pyfia.estimation.estimators.carbon.carbon : Legacy carbon estimator that reads the
        FIADB ``CARBON_AG`` / ``CARBON_BG`` columns directly. ``live_tree`` is the
        NSVB-native alternative; both should agree at the tree level for NSVB-era
        inventories (Sep 2023 and later).
    pyfia.carbon.nsvb.equations.compute_nsvb_biomass : The vectorized NSVB biomass
        pipeline this function wraps.

    Notes
    -----
    **NSVB Pipeline**

    For each live tree the function predicts:

    1. Stem inside-bark wood volume (S1a)
    2. Stem bark volume (S2a)
    3. Stem bark biomass (S6a)
    4. Branch biomass (S7a)
    5. Total AGB (S8a), predicted directly from D and H

    The wood, bark, and branch weights derived from the first four
    predictions are summed and scaled proportionally to the directly
    predicted total AGB (treated as the reference value), yielding
    ``w_wood + w_bark + w_branch == agb`` by construction. Cull-reduced
    wood weight uses the Harmon et al. (2011) ``DECAYCD=3`` density
    proportions (0.54 hardwood, 0.92 softwood). The hardwood/softwood
    split is the ``SPCD < 300`` rule, which is consistent with the
    NSVB Model 2 ``k`` constant selection and correctly classifies
    SPCD=10 (fir spp.) as softwood despite S10a's misclassification.

    Carbon = AGB × species-specific S10a fraction. Species missing from
    S10a fall back to the S10a arithmetic mean (~0.4741), with a
    warn-once log entry.

    **Belowground Bridge**

    The current implementation does not include the Heath et al. (2009)
    coarse-root model. When ``pool in ('bg', 'total')``, the function reads
    FIADB ``TREE.CARBON_BG`` directly and adds it to the estimate. A native
    NSVB BG model is deferred.

    **EVALID Handling**

    If no EVALID is set on the database and ``most_recent=True``, the
    function auto-selects the most recent EXPVOL evaluation. For explicit
    control, call ``db.clip_by_evalid(...)`` before calling
    ``live_tree``.

    Examples
    --------
    Above-ground live tree carbon per acre on forestland:

    >>> results = live_tree(db, pool="ag")
    >>> print(f"Carbon: {results['CARBON_ACRE'][0]:.1f} tons/acre")

    Total live tree carbon (AG + BG bridge) by ownership group:

    >>> results = live_tree(db, pool="total", grp_by="OWNGRPCD")
    >>> for row in results.iter_rows(named=True):
    ...     print(f"OWNGRPCD {row['OWNGRPCD']}: {row['CARBON_ACRE']:.2f} tons/acre")

    Above-ground carbon by species on timberland with standard errors:

    >>> results = live_tree(
    ...     db,
    ...     pool="ag",
    ...     by_species=True,
    ...     land_type="timber",
    ...     variance=True,
    ... )

    Large live tree carbon (≥ 20" DBH) by forest type:

    >>> results = live_tree(
    ...     db,
    ...     pool="ag",
    ...     grp_by="FORTYPCD",
    ...     tree_domain="DIA >= 20.0",
    ...     totals=True,
    ... )
    """
    from ..validation import (
        validate_boolean,
        validate_domain_expression,
        validate_grp_by,
        validate_land_type,
    )

    # ----- Validate pool -----
    pool = pool.lower()
    valid_pools = {"ag", "bg", "total"}
    if pool not in valid_pools:
        raise ValueError(
            f"Invalid pool '{pool}'. Must be one of: {sorted(valid_pools)}"
        )

    # ----- Validate standard estimator inputs -----
    land_type = validate_land_type(land_type)
    grp_by = validate_grp_by(grp_by)
    tree_domain = validate_domain_expression(tree_domain, "tree_domain")
    area_domain = validate_domain_expression(area_domain, "area_domain")
    plot_domain = validate_domain_expression(plot_domain, "plot_domain")
    by_species = validate_boolean(by_species, "by_species")
    by_size_class = validate_boolean(by_size_class, "by_size_class")
    totals = validate_boolean(totals, "totals")
    variance = validate_boolean(variance, "variance")
    most_recent = validate_boolean(most_recent, "most_recent")

    # ----- Resolve db + EVALID -----
    db, owns_db = ensure_fia_instance(db)
    # Live tree carbon uses EXPVOL evaluations (same as biomass).
    if most_recent and db.evalid is None:
        db.clip_most_recent(eval_type="VOL")
    else:
        ensure_evalid_set(db, eval_type="VOL", estimator_name="live_tree")

    # ----- Build config and run estimator -----
    config = {
        "pool": pool,
        "grp_by": grp_by,
        "by_species": by_species,
        "by_size_class": by_size_class,
        "land_type": land_type,
        "tree_type": "live",
        "tree_domain": tree_domain,
        "area_domain": area_domain,
        "plot_domain": plot_domain,
        "totals": totals,
        "variance": variance,
        "most_recent": most_recent,
    }

    try:
        estimator = LiveTreeEstimator(db, config)
        if pool == "total":
            # The cross-era warning is best-effort: if we can't determine
            # the inventory year (EVALID parse failures, missing POP_EVAL,
            # type coercion problems), skip the warning rather than fail
            # the whole estimation.
            try:
                year = estimator._extract_evaluation_year()
                if int(year) < 2024:
                    logger.warning(
                        "live_tree(pool='total'): selected EVALID year (%d) "
                        "pre-dates the NSVB framework transition "
                        "(September 2023). The BG bridge reads FIADB "
                        "TREE.CARBON_BG directly, which for pre-NSVB "
                        "inventories was computed via legacy Jenkins-based "
                        "allometry — combining it with NSVB-recomputed AG "
                        "may produce cross-era inconsistencies. Use "
                        "pool='ag' if you need NSVB-only consistency.",
                        int(year),
                    )
            except (ValueError, TypeError, AttributeError, IndexError, KeyError) as exc:
                logger.debug("Skipping live_tree year warning: %s", exc)
        return estimator.estimate()
    finally:
        if owns_db and hasattr(db, "close"):
            db.close()

Carbon Pools

Pool	Description	Method
`"ag"`	Above-ground (default)	NSVB pipeline: stem wood + bark + branches, harmonized to total AGB, then multiplied by species-specific S10a carbon fractions
`"bg"`	Below-ground (coarse roots)	Bridge to FIADB `TREE.CARBON_BG` (Phase 1 shortcut; native NSVB root model planned)
`"total"`	AG + BG	NSVB above-ground + FIADB below-ground bridge

How It Differs from `biomass()`

	`live_tree()`	`biomass()`
Biomass source	Recomputed from scratch via NSVB equations	Reads FIADB pre-computed `DRYBIO_*` columns
Carbon fraction	Species-specific S10a (0.40-0.55)	Flat 0.47 multiplier
Coefficient lookup	3-level precedence (DIVISION, species, Jenkins)	N/A (pre-computed)
Cull adjustment	NSVB cull formula with DECAYCD=3 density prop	Built into FIADB values
Transparency	Full recompute, auditable	Black-box FIADB values

For NSVB-era inventories (September 2023 onward), both should agree closely. live_tree() is the preferred path for carbon accounting work that needs methodological transparency.

Technical Notes

The NSVB pipeline predicts five quantities per tree:

Stem inside-bark wood volume (S1a) × wood density × 62.4 = gross wood weight
Stem bark volume (S2a)
Stem bark biomass (S6a)
Branch biomass (S7a)
Total above-ground biomass (S8a), directly predicted

The wood, bark, and branch weights are summed and scaled proportionally to the directly predicted total AGB. Cull-reduced wood uses the Harmon et al. (2011) DECAYCD=3 density proportion (0.54 hardwood, 0.92 softwood). Carbon = harmonized AGB × species-specific S10a fraction.
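
The proportional harmonization step can be illustrated with a few lines of arithmetic. This is a sketch under stated assumptions: the function names and input numbers are invented for the example, "wood density" is assumed to be a dimensionless specific gravity so that multiplying by 62.4 (lb per cubic foot of water) gives pounds, and the cull reduction is left out.

```python
WATER_DENSITY = 62.4  # lb per cubic foot; assumed meaning of the 62.4 factor


def gross_wood_weight(volume_ft3, wood_specific_gravity):
    """Gross stem wood weight in pounds from volume and specific gravity."""
    return volume_ft3 * wood_specific_gravity * WATER_DENSITY


def harmonize(w_wood, w_bark, w_branch, agb):
    """Scale the three components proportionally so they sum to agb."""
    component_sum = w_wood + w_bark + w_branch
    if component_sum <= 0:
        return 0.0, 0.0, 0.0
    scale = agb / component_sum
    return w_wood * scale, w_bark * scale, w_branch * scale


# Made-up tree: 20 cubic feet of wood, specific gravity 0.5.
w_wood = gross_wood_weight(20.0, 0.5)
wood, bark, branch = harmonize(w_wood, 80.0, 150.0, agb=900.0)
print(f"wood={wood:.1f} bark={bark:.1f} branch={branch:.1f}")
print(f"sum={wood + bark + branch:.1f}")
```

Because every component is multiplied by the same factor, their relative shares are preserved while the total is forced to equal the directly predicted AGB.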

The optional PLOTGEOM.ECOSUBCD join activates Level 2 of the NSVB coefficient precedence (SPCD + Bailey DIVISION), closing a ~3% growing-stock biomass bias present in the species-level-only fallback. When PLOTGEOM is missing from older databases, the estimator falls back gracefully with a one-shot log warning.
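
A tiered coefficient lookup of this kind might be sketched as below. Everything here is hypothetical: the table names, the coefficient values, the division code, and the Jenkins group labels are placeholders, and the sketch assumes the most specific match (species plus division) is tried first, then species alone, then the Jenkins group.

```python
# Hypothetical coefficient tables; keys and values are placeholders.
BY_SPCD_DIVISION = {(131, "230"): {"a": 0.11, "b": 1.9}}
BY_SPCD = {131: {"a": 0.10, "b": 2.0}}
BY_JENKINS_GROUP = {"pine": {"a": 0.09, "b": 2.1}}
JENKINS_GROUP = {131: "pine", 110: "pine"}


def lookup(spcd, division=None):
    """Return (source, coefficients) using the most specific match available."""
    # division is None when the PLOTGEOM join is unavailable.
    if division is not None and (spcd, division) in BY_SPCD_DIVISION:
        return "division", BY_SPCD_DIVISION[(spcd, division)]
    if spcd in BY_SPCD:
        return "species", BY_SPCD[spcd]
    return "jenkins", BY_JENKINS_GROUP[JENKINS_GROUP[spcd]]


print(lookup(131, "230"))  # species + division match
print(lookup(131))         # no division available, species-level match
print(lookup(110, "230"))  # neither table has the species, Jenkins group used
```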

Examples

Above-Ground Carbon Per Acre

result = pyfia.live_tree(db, pool="ag")
print(f"Carbon: {result['CARBON_ACRE'][0]:.2f} tons/acre")

Carbon by Species

result = pyfia.live_tree(db, pool="ag", by_species=True)
result = pyfia.join_species_names(result, db)
print(result.sort("CARBON_ACRE", descending=True).head(10))

Carbon by Ownership Group

result = pyfia.live_tree(
    db,
    pool="total",
    grp_by="OWNGRPCD",
    totals=True,
    variance=True,
)
# OWNGRPCD: 10=National Forest, 20=Other Federal,
#           30=State/Local, 40=Private
print(result)
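
To make grouped output easier to read, the ownership codes from the comment above can be mapped to labels. The sketch below works on plain dictionaries shaped like the rows that `iter_rows(named=True)` yields; the carbon values are invented and the `OWNER` key is an assumption for this example.

```python
# Code-to-label mapping taken from the OWNGRPCD comment above.
OWNGRP_LABELS = {
    10: "National Forest",
    20: "Other Federal",
    30: "State/Local",
    40: "Private",
}

# Stand-in rows with made-up carbon values.
rows = [
    {"OWNGRPCD": 40, "CARBON_ACRE": 21.3},
    {"OWNGRPCD": 10, "CARBON_ACRE": 30.1},
]

# Add a readable label to each row, leaving unknown codes marked as such.
labelled = [
    {**row, "OWNER": OWNGRP_LABELS.get(row["OWNGRPCD"], "Unknown")}
    for row in rows
]

for row in labelled:
    print(f"{row['OWNER']}: {row['CARBON_ACRE']:.2f} tons/acre")
```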

Large Tree Carbon by Forest Type

result = pyfia.live_tree(
    db,
    pool="ag",
    grp_by="FORTYPCD",
    tree_domain="DIA >= 20.0",
)
result = pyfia.join_forest_type_names(result, db)
print(result)

Carbon on Timberland with Standard Errors

result = pyfia.live_tree(
    db,
    pool="ag",
    land_type="timber",
    variance=True,
)
print(f"Carbon: {result['CARBON_ACRE'][0]:.2f} +/- "
      f"{result['CARBON_ACRE_SE'][0]:.2f} tons/acre")