Spatial Filtering¶
PyFIA supports spatial filtering and grouping using polygon boundaries. This enables analysis of custom regions like watersheds, management units, or administrative boundaries.
Overview¶
PyFIA provides two spatial methods:
| Method | Purpose | Use Case |
|---|---|---|
clip_by_polygon() |
Filter plots to polygon boundary | Analyze custom regions |
intersect_polygons() |
Join polygon attributes to plots | Group estimates by polygon attributes |
Supported Formats¶
Both methods accept any GDAL-supported spatial file format:
- Shapefile (
.shp) - GeoJSON (
.geojson,.json) - GeoPackage (
.gpkg) - GeoParquet (
.parquet)
Clipping to a Region¶
Use clip_by_polygon() to filter plots within a polygon boundary:
from pyfia import FIA, tpa, area
with FIA("southeast.duckdb") as db:
db.clip_by_state(37) # North Carolina
db.clip_most_recent(eval_type="VOL")
# Filter to custom region
db.clip_by_polygon("my_region.geojson")
# Estimates now only include plots within the polygon
result = tpa(db, tree_type="live")
Multiple Format Examples¶
# GeoJSON
db.clip_by_polygon("boundary.geojson")
# Shapefile
db.clip_by_polygon("counties.shp")
# GeoPackage
db.clip_by_polygon("regions.gpkg")
How It Works¶
- Loads the DuckDB spatial extension
- Performs point-in-polygon test using plot coordinates (LAT, LON)
- Stores matching plot CNs as a filter
- Applies filter when loading PLOT, TREE, and COND tables
Grouping by Polygon Attributes¶
Use intersect_polygons() to join polygon attributes to plots for use in grp_by:
from pyfia import FIA, tpa, area
with FIA("southeast.duckdb") as db:
db.clip_by_state(37)
db.clip_most_recent(eval_type="VOL")
# Join county attributes to plots
db.intersect_polygons("counties.shp", attributes=["NAME", "FIPS"])
# Group estimates by county name
result = tpa(db, grp_by=["NAME"], tree_type="live")
Output Example¶
| NAME | TPA | BAA | N_PLOTS |
|---|---|---|---|
| Wake | 523.4 | 89.2 | 45 |
| Durham | 612.1 | 102.3 | 38 |
| Orange | 487.9 | 78.6 | 29 |
Multiple Attributes¶
# Join multiple attributes
db.intersect_polygons(
"regions.geojson",
attributes=["REGION", "DISTRICT", "MANAGEMENT_UNIT"]
)
# Group by any combination
result = area(db, grp_by=["REGION", "DISTRICT"], land_type="forest")
How It Works¶
- Performs spatial join between plots and polygons
- Stores polygon attributes with plot CNs
- Joins attributes when loading PLOT table
- Attributes flow through stratification to estimators
- Can be used in
grp_bylike any other column
Combining Methods¶
Both methods can be used together:
with FIA("southeast.duckdb") as db:
db.clip_by_state(37)
db.clip_most_recent(eval_type="VOL")
# First, clip to a study area
db.clip_by_polygon("study_area.geojson")
# Then, add management unit attributes for grouping
db.intersect_polygons("management_units.shp", attributes=["UNIT_NAME"])
# Estimates are clipped AND grouped
result = tpa(db, grp_by=["UNIT_NAME"], tree_type="live")
Common Analysis Patterns¶
County-Level Estimates¶
# Forest area by county
db.intersect_polygons("nc_counties.shp", attributes=["NAME"])
result = area(db, grp_by=["NAME"], land_type="forest")
Watershed Analysis¶
# Biomass by watershed
db.clip_by_polygon("watershed_boundary.geojson")
db.intersect_polygons("sub_watersheds.geojson", attributes=["HUC12"])
result = biomass(db, grp_by=["HUC12"])
Management Unit Analysis¶
# Volume on national forest management units
db.clip_by_polygon("national_forest_boundary.shp")
db.intersect_polygons("management_areas.gpkg", attributes=["AREA_NAME", "AREA_TYPE"])
result = volume(db, grp_by=["AREA_NAME"], land_type="timber")
Regional Comparison¶
# Compare north vs south regions
db.intersect_polygons("regions.geojson", attributes=["REGION"])
result = tpa(db, grp_by=["REGION"], by_species=True)
Important Considerations¶
Coordinate Privacy¶
FIA public coordinates are fuzzed up to 1 mile for privacy protection. This means:
- Spatial precision below ~1 mile is not meaningful
- Small polygons may have few or no matching plots
- Boundary effects exist for small analysis units
Plots Outside Polygons¶
When using intersect_polygons():
- Plots outside all polygons get NULL attribute values
- These plots are still included in non-grouped estimates
- Use
grp_byto filter by attribute (NULLs excluded from groups)
Performance¶
- First spatial query loads the DuckDB spatial extension
- Subsequent queries are faster (extension cached)
- Large polygon files may take longer to process
Error Handling¶
Common Errors¶
# Missing spatial extension
from pyfia.core.exceptions import SpatialExtensionError
# Invalid file path or format
from pyfia.core.exceptions import SpatialFileError
# No spatial filter set
from pyfia.core.exceptions import NoSpatialFilterError
Example¶
from pyfia.core.exceptions import SpatialFileError
try:
db.clip_by_polygon("invalid_file.xyz")
except SpatialFileError as e:
print(f"Could not read spatial file: {e}")
API Reference¶
clip_by_polygon¶
db.clip_by_polygon(
polygon: str | Path, # Path to spatial file
) -> FIA # Returns self for chaining
intersect_polygons¶
db.intersect_polygons(
polygon: str | Path, # Path to spatial file
attributes: list[str], # Attribute columns to join
) -> FIA # Returns self for chaining
See Also¶
- Domain Filtering - Non-spatial filtering options
- Grouping Results - Grouping by FIA attributes
- FIA Database API - Full FIA class reference