Data Download
pyFIA provides functions to download FIA data directly from the USDA Forest Service FIA DataMart.
Overview

```python
from pyfia import download

# Download single state
db_path = download("GA")

# Download multiple states (merged)
db_path = download(["GA", "FL", "SC"])

# Download specific tables
db_path = download("GA", tables=["PLOT", "TREE", "COND"])
```
Main Function
download
download(states: str | list[str], dir: str | Path | None = None, common: bool = True, tables: list[str] | None = None, force: bool = False, show_progress: bool = True, use_cache: bool = True) -> Path
Download FIA data from the FIA DataMart.
This function downloads FIA data for one or more states from the USDA Forest Service FIA DataMart, similar to rFIA's getFIA() function. Data is automatically converted to DuckDB format for use with pyFIA.
| PARAMETER | DESCRIPTION |
|---|---|
| states | State abbreviations (e.g., 'GA', 'NC'). Supports multiple states: ['GA', 'FL', 'SC'] |
| dir | Directory to save downloaded data. Defaults to ~/.pyfia/data/ |
| common | If True, download only tables required for pyFIA functions. If False, download all available tables. |
| tables | Specific tables to download. Overrides |
| force | If True, re-download even if files exist locally. |
| show_progress | Show download progress bars. |
| use_cache | Use cached downloads if available. |

| RETURNS | DESCRIPTION |
|---|---|
| Path | Path to the DuckDB database file. |

| RAISES | DESCRIPTION |
|---|---|
| StateNotFoundError | If an invalid state code is provided. |
| TableNotFoundError | If a requested table is not available. |
| NetworkError | If download fails due to network issues. |
| DownloadError | For other download-related errors. |
Examples:
>>> from pyfia import download
>>>
>>> # Download Georgia data
>>> db_path = download("GA")
>>>
>>> # Download multiple states merged into one database
>>> db_path = download(["GA", "FL", "SC"])
>>>
>>> # Download only specific tables
>>> db_path = download("GA", tables=["PLOT", "TREE", "COND"])
>>>
>>> # Use with pyFIA immediately
>>> from pyfia import FIA, area
>>> with FIA(download("GA")) as db:
...     db.clip_most_recent()
...     result = area(db)
Notes
- Large states (CA, TX) may have TREE tables >1GB compressed
- First download may take several minutes depending on connection
- Downloaded data is cached locally to avoid re-downloading
Source code in src/pyfia/downloader/__init__.py
DataMart Client
DataMartClient
HTTP client for FIA DataMart downloads.
This client handles downloading CSV files from the FIA DataMart, with support for progress bars, retries, and checksum verification.

| PARAMETER | DESCRIPTION |
|---|---|
| timeout | Request timeout in seconds. |
| chunk_size | Download chunk size in bytes (default 1MB). |
| max_retries | Maximum number of retry attempts for failed downloads. |
Examples:
>>> from pathlib import Path
>>> from pyfia.downloader.client import DataMartClient
>>> client = DataMartClient()
>>> path = client.download_table("GA", "PLOT", Path("./data"))
>>> print(f"Downloaded to: {path}")
Source code in src/pyfia/downloader/client.py
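The retry and chunked-download behaviour described above can be illustrated with a minimal sketch. It does not use pyFIA's client or any network access: `download_with_retries`, `open_stream`, and `backoff_seconds` are hypothetical names, and treating `max_retries` as the number of retries after the first attempt is an assumption rather than documented behaviour.

```python
import time
from typing import Callable, Iterable


def download_with_retries(
    open_stream: Callable[[], Iterable[bytes]],
    dest_path: str,
    max_retries: int = 3,
    backoff_seconds: float = 0.0,
) -> int:
    """Write chunks from open_stream() to dest_path, retrying on OSError.

    Returns the number of bytes written by the successful attempt.
    """
    last_error = None
    # One initial attempt plus up to max_retries retries (assumed semantics).
    for attempt in range(max_retries + 1):
        try:
            written = 0
            # Reopening in "wb" mode discards any partial file from a failed attempt.
            with open(dest_path, "wb") as fh:
                for chunk in open_stream():
                    fh.write(chunk)
                    written += len(chunk)
            return written
        except OSError as exc:  # ConnectionError and timeouts are OSError subclasses
            last_error = exc
            if attempt < max_retries:
                # Exponential backoff between attempts.
                time.sleep(backoff_seconds * (2 ** attempt))
    raise RuntimeError(f"download failed after {max_retries} retries") from last_error
```

Passing a callable that opens a fresh stream on every attempt keeps the retry loop independent of the HTTP library in use.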
download_table
Download a single FIA table for a state.

| PARAMETER | DESCRIPTION |
|---|---|
| state | State abbreviation (e.g., 'GA') or 'REF' for reference tables. |
| table | Table name (e.g., 'PLOT', 'TREE'). |
| dest_dir | Directory to save the extracted CSV file. |
| show_progress | Show download progress bar. |

| RETURNS | DESCRIPTION |
|---|---|
| Path | Path to the extracted CSV file. |

| RAISES | DESCRIPTION |
|---|---|
| StateNotFoundError | If the state code is invalid. |
| TableNotFoundError | If the table is not found for the state. |
| NetworkError | If the download fails. |
Examples:
Source code in src/pyfia/downloader/client.py
download_tables
download_tables(state: str, tables: list[str] | None = None, common: bool = True, dest_dir: Path | None = None, show_progress: bool = True) -> dict[str, Path]
Download multiple FIA tables for a state.

| PARAMETER | DESCRIPTION |
|---|---|
| state | State abbreviation (e.g., 'GA') or 'REF' for reference tables. |
| tables | Specific tables to download. If None, uses common or all tables. |
| common | If tables is None, download only common tables (True) or all (False). |
| dest_dir | Directory to save files. Defaults to ~/.pyfia/data/{state}/csv/ |
| show_progress | Show download progress. |

| RETURNS | DESCRIPTION |
|---|---|
| dict | Mapping of table names to downloaded file paths. |
Examples:
>>> client = DataMartClient()
>>> paths = client.download_tables("GA", common=True)
>>> print(f"Downloaded {len(paths)} tables")
Source code in src/pyfia/downloader/client.py
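The interaction between the `tables` and `common` parameters can be sketched as a small selection function. This is an illustration of the documented rule only, not pyFIA's code: `resolve_tables` is a hypothetical helper, and the two lists are abbreviated stand-ins for COMMON_TABLES and ALL_TABLES.

```python
# Abbreviated stand-ins for the documented COMMON_TABLES and ALL_TABLES lists.
COMMON_SUBSET = ["COND", "PLOT", "TREE"]
ALL_SUBSET = ["BOUNDARY", "COND", "COUNTY", "PLOT", "TREE"]


def resolve_tables(tables=None, common=True):
    """Return the table names to download (hypothetical helper).

    An explicit `tables` list wins; otherwise `common` picks between
    the common subset and the full list.
    """
    if tables is not None:
        return list(tables)
    return list(COMMON_SUBSET if common else ALL_SUBSET)
```

For example, `resolve_tables(["PLOT"], common=False)` returns only `["PLOT"]`, because `common` is consulted only when `tables` is None.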
check_url_exists
Check if a URL exists (HEAD request).

| PARAMETER | DESCRIPTION |
|---|---|
| url | URL to check. |

| RETURNS | DESCRIPTION |
|---|---|
| bool | True if URL exists (status 200), False otherwise. |
Source code in src/pyfia/downloader/client.py
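A HEAD-based existence check of this kind can be sketched with the standard library alone. The function below is an assumption-laden illustration, not the method's actual implementation: `url_exists` and its `timeout` default are hypothetical, and pyFIA may use a different HTTP library.

```python
import urllib.error
import urllib.request


def url_exists(url: str, timeout: float = 10.0) -> bool:
    """Return True only if a HEAD request to `url` answers with status 200."""
    try:
        # Building the Request inside the try block also catches malformed URLs.
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, ValueError, OSError):
        # HTTP errors (4xx/5xx), connection failures, timeouts, and bad URLs
        # are all reported as "does not exist".
        return False
```

A HEAD request avoids transferring the file body, which matters when the target is a large compressed table.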
Download Cache
DownloadCache
Manages cached FIA data downloads with metadata tracking.
The cache stores metadata about downloaded DuckDB files including timestamps, checksums, and file locations. This allows skipping downloads for files that are already present and valid.

| PARAMETER | DESCRIPTION |
|---|---|
| cache_dir | Directory for cache storage. |
Examples:
>>> cache = DownloadCache(Path("~/.pyfia/cache"))
>>> if not cache.get_cached("GA"):
...     # Download the file
...     cache.add_to_cache("GA", downloaded_path)
Source code in src/pyfia/downloader/cache.py
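The idea of a metadata-tracking cache (timestamps, checksums, file locations) can be made concrete with a small stand-in class. This sketch is not pyFIA's DownloadCache: `MiniCache`, the `metadata.json` file name, and the JSON layout are all assumptions chosen for illustration.

```python
import hashlib
import json
import time
from pathlib import Path


class MiniCache:
    """Hypothetical stand-in for a metadata-tracking download cache."""

    def __init__(self, cache_dir):
        self.cache_dir = Path(cache_dir).expanduser()
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.meta_file = self.cache_dir / "metadata.json"  # assumed file name

    def _load(self):
        if self.meta_file.exists():
            return json.loads(self.meta_file.read_text())
        return {}

    def add_to_cache(self, key, path, checksum=None):
        path = Path(path)
        if checksum is None:
            # Compute the MD5 checksum when the caller does not supply one.
            checksum = hashlib.md5(path.read_bytes()).hexdigest()
        entries = self._load()
        entries[key] = {"path": str(path), "checksum": checksum, "added": time.time()}
        self.meta_file.write_text(json.dumps(entries))

    def get_cached(self, key, max_age_days=None):
        entry = self._load().get(key)
        # A missing entry or a file deleted from disk both count as a miss.
        if entry is None or not Path(entry["path"]).exists():
            return None
        if max_age_days is not None:
            age_days = (time.time() - entry["added"]) / 86400
            if age_days > max_age_days:
                return None
        return Path(entry["path"])
```

Checking that the file still exists on every lookup keeps the metadata from returning stale paths after a manual cleanup.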
get_cached
Get the path to a cached DuckDB file if it exists and is valid.

| PARAMETER | DESCRIPTION |
|---|---|
| state | State abbreviation or cache key (e.g., 'GA', 'MERGED_FL_GA_SC'). |
| max_age_days | Maximum age in days to consider cache valid. Defaults to None (no age limit). |

| RETURNS | DESCRIPTION |
|---|---|
| Path or None | Path to the cached DuckDB file, or None if not found/invalid. |
Source code in src/pyfia/downloader/cache.py
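The example key 'MERGED_FL_GA_SC' suggests that multi-state keys are built from sorted state codes. The sketch below reproduces that shape under that assumption; `cache_key` is a hypothetical helper, and the exact rule pyFIA uses is not shown in this documentation.

```python
def cache_key(states):
    """Build a cache key from one or more state codes (hypothetical helper)."""
    if isinstance(states, str):
        states = [states]
    # Sort and de-duplicate so that the key does not depend on argument order.
    codes = sorted({s.upper() for s in states})
    if len(codes) == 1:
        return codes[0]
    return "MERGED_" + "_".join(codes)
```

With this rule, `["GA", "FL", "SC"]` and `["SC", "GA", "FL"]` map to the same key, so a merged database would be reused regardless of the order in which states were requested.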
add_to_cache
Add a downloaded DuckDB file to the cache.

| PARAMETER | DESCRIPTION |
|---|---|
| state | State abbreviation or cache key (e.g., 'GA', 'MERGED_FL_GA_SC'). |
| path | Path to the downloaded DuckDB file. |
| checksum | MD5 checksum of the file. Calculated if not provided. |
Source code in src/pyfia/downloader/cache.py
clear_cache
clear_cache(older_than: timedelta | None = None, state: str | None = None, delete_files: bool = False) -> int
Clear cached entries.

| PARAMETER | DESCRIPTION |
|---|---|
| older_than | Only clear entries older than this. If None, clear all. |
| state | Only clear entries for this state. |
| delete_files | If True, also delete the cached files from disk. |

| RETURNS | DESCRIPTION |
|---|---|
| int | Number of entries cleared. |
Source code in src/pyfia/downloader/cache.py
get_cache_info
Get information about the cache.

| RETURNS | DESCRIPTION |
|---|---|
| dict | Cache statistics including total size, file count, etc. |
Source code in src/pyfia/downloader/cache.py
list_cached
List cached downloads.

| PARAMETER | DESCRIPTION |
|---|---|
| state | Filter by state. |

| RETURNS | DESCRIPTION |
|---|---|
| list of CachedDownload | List of cached download metadata. |
Source code in src/pyfia/downloader/cache.py
Cache Management Functions
clear_cache
clear_cache(older_than_days: int | None = None, state: str | None = None, delete_files: bool = False) -> int
Clear the download cache.

| PARAMETER | DESCRIPTION |
|---|---|
| older_than_days | Only clear entries older than this many days. |
| state | Only clear entries for this state. |
| delete_files | If True, also delete the cached files from disk. |

| RETURNS | DESCRIPTION |
|---|---|
| int | Number of cache entries cleared. |
Source code in src/pyfia/downloader/__init__.py
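The module-level function takes an age in days, while the DownloadCache method of the same name takes a timedelta. A plausible bridge between the two is a simple conversion followed by filtering, sketched below. `entries_to_clear` and the `entries` mapping of cache key to timestamp are hypothetical; they illustrate the filtering rules in the tables, not pyFIA's internals.

```python
from datetime import datetime, timedelta


def entries_to_clear(entries, older_than_days=None, state=None, now=None):
    """Select cache keys to clear (hypothetical helper).

    `entries` maps a cache key to the datetime at which it was added.
    """
    now = now or datetime.now()
    # Convert the day count into the timedelta form used by the cache method.
    older_than = None if older_than_days is None else timedelta(days=older_than_days)
    selected = []
    for key, added in entries.items():
        if state is not None and key != state:
            continue  # restricted to a single state
        if older_than is not None and now - added <= older_than:
            continue  # entry is still fresh enough to keep
        selected.append(key)
    return selected
```

With both filters left at None, every entry is selected, which matches the documented "If None, clear all" behaviour.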
cache_info
Get information about the download cache.

| RETURNS | DESCRIPTION |
|---|---|
| dict | Cache statistics including size, file count, etc. |
Source code in src/pyfia/downloader/__init__.py
Table Definitions
COMMON_TABLES
module-attribute
COMMON_TABLES: list[str] = ['COND', 'COND_DWM_CALC', 'INVASIVE_SUBPLOT_SPP', 'PLOT', 'POP_ESTN_UNIT', 'POP_EVAL', 'POP_EVAL_GRP', 'POP_EVAL_TYP', 'POP_PLOT_STRATUM_ASSGN', 'POP_STRATUM', 'SUBPLOT', 'TREE', 'TREE_GRM_COMPONENT', 'TREE_GRM_MIDPT', 'TREE_GRM_BEGIN', 'SUBP_COND_CHNG_MTRX', 'SEEDLING', 'SURVEY', 'SUBP_COND', 'P2VEG_SUBP_STRUCTURE']
ALL_TABLES
module-attribute
ALL_TABLES: list[str] = ['BOUNDARY', 'COND', 'COND_DWM_CALC', 'COUNTY', 'DWM_COARSE_WOODY_DEBRIS', 'DWM_DUFF_LITTER_FUEL', 'DWM_FINE_WOODY_DEBRIS', 'DWM_MICROPLOT_FUEL', 'DWM_RESIDUAL_PILE', 'DWM_TRANSECT_SEGMENT', 'DWM_VISIT', 'GRND_CVR', 'INVASIVE_SUBPLOT_SPP', 'LICHEN_LAB', 'LICHEN_PLOT_SUMMARY', 'LICHEN_VISIT', 'OZONE_BIOSITE_SUMMARY', 'OZONE_PLOT', 'OZONE_PLOT_SUMMARY', 'OZONE_SPECIES_SUMMARY', 'OZONE_VALIDATION', 'OZONE_VISIT', 'P2VEG_SUBPLOT_SPP', 'P2VEG_SUBP_STRUCTURE', 'PLOT', 'PLOTGEOM', 'PLOTSNAP', 'POP_ESTN_UNIT', 'POP_EVAL', 'POP_EVAL_ATTRIBUTE', 'POP_EVAL_GRP', 'POP_EVAL_TYP', 'POP_PLOT_STRATUM_ASSGN', 'POP_STRATUM', 'SEEDLING', 'SITETREE', 'SOILS_EROSION', 'SOILS_LAB', 'SOILS_SAMPLE_LOC', 'SOILS_VISIT', 'SUBPLOT', 'SUBP_COND', 'SUBP_COND_CHNG_MTRX', 'SURVEY', 'TREE', 'TREE_GRM_BEGIN', 'TREE_GRM_COMPONENT', 'TREE_GRM_ESTN', 'TREE_GRM_MIDPT', 'TREE_REGIONAL_BIOMASS', 'TREE_WOODLAND_STEMS', 'VEG_PLOT_SPECIES', 'VEG_QUADRAT', 'VEG_SUBPLOT', 'VEG_SUBPLOT_SPP', 'VEG_VISIT']
VALID_STATE_CODES
module-attribute
VALID_STATE_CODES: dict[str, str] = {'AL': 'Alabama', 'AK': 'Alaska', 'AZ': 'Arizona', 'AR': 'Arkansas', 'CA': 'California', 'CO': 'Colorado', 'CT': 'Connecticut', 'DE': 'Delaware', 'FL': 'Florida', 'GA': 'Georgia', 'HI': 'Hawaii', 'ID': 'Idaho', 'IL': 'Illinois', 'IN': 'Indiana', 'IA': 'Iowa', 'KS': 'Kansas', 'KY': 'Kentucky', 'LA': 'Louisiana', 'ME': 'Maine', 'MD': 'Maryland', 'MA': 'Massachusetts', 'MI': 'Michigan', 'MN': 'Minnesota', 'MS': 'Mississippi', 'MO': 'Missouri', 'MT': 'Montana', 'NE': 'Nebraska', 'NV': 'Nevada', 'NH': 'New Hampshire', 'NJ': 'New Jersey', 'NM': 'New Mexico', 'NY': 'New York', 'NC': 'North Carolina', 'ND': 'North Dakota', 'OH': 'Ohio', 'OK': 'Oklahoma', 'OR': 'Oregon', 'PA': 'Pennsylvania', 'RI': 'Rhode Island', 'SC': 'South Carolina', 'SD': 'South Dakota', 'TN': 'Tennessee', 'TX': 'Texas', 'UT': 'Utah', 'VT': 'Vermont', 'VA': 'Virginia', 'WA': 'Washington', 'WV': 'West Virginia', 'WI': 'Wisconsin', 'WY': 'Wyoming', 'AS': 'American Samoa', 'FM': 'Federated States of Micronesia', 'GU': 'Guam', 'MH': 'Marshall Islands', 'MP': 'Northern Mariana Islands', 'PW': 'Palau', 'PR': 'Puerto Rico', 'VI': 'Virgin Islands'}
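A mapping like this lends itself to validating user input before any download starts. The sketch below shows one way to do that; `normalize_states` is a hypothetical helper, `VALID_SUBSET` is an abbreviated stand-in for the full mapping, and it raises a plain ValueError rather than pyFIA's StateNotFoundError.

```python
# Abbreviated stand-in for the documented VALID_STATE_CODES mapping.
VALID_SUBSET = {"GA": "Georgia", "FL": "Florida", "SC": "South Carolina"}


def normalize_states(states):
    """Accept one code or a list of codes and return validated upper-case codes."""
    if isinstance(states, str):
        states = [states]
    normalized = [s.strip().upper() for s in states]
    invalid = [s for s in normalized if s not in VALID_SUBSET]
    if invalid:
        raise ValueError(f"Unknown state code(s): {', '.join(invalid)}")
    return normalized
```

Accepting both a single string and a list mirrors the `str | list[str]` type of the `states` parameter of `download`.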
Exceptions
DownloadError
Bases: Exception
Base exception for all download-related errors.

| PARAMETER | DESCRIPTION |
|---|---|
| message | Human-readable error message. |
| url | URL that caused the error. |
Source code in src/pyfia/downloader/exceptions.py
StateNotFoundError
Bases: DownloadError
Raised when an invalid state code is provided.

| PARAMETER | DESCRIPTION |
|---|---|
| state | The invalid state code. |
| valid_states | List of valid state codes for reference. |
Source code in src/pyfia/downloader/exceptions.py
TableNotFoundError
Bases: DownloadError
Raised when a requested table is not available for download.

| PARAMETER | DESCRIPTION |
|---|---|
| table | The table name that was not found. |
| state | The state for which the table was requested. |
Source code in src/pyfia/downloader/exceptions.py
NetworkError
Bases: DownloadError
Raised when a network-related download failure occurs.

| PARAMETER | DESCRIPTION |
|---|---|
| message | Description of the network error. |
| url | URL that caused the error. |
| status_code | HTTP status code if available. |
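Because every exception here derives from DownloadError, callers can catch the base class and still branch on the specific subclass. The sketch below uses local stand-in classes that mirror the documented hierarchy so that it runs without pyFIA installed. Their constructors (including which arguments are optional), the `summarize` helper, and the example URL are assumptions, not the library's definitions.

```python
# Local stand-ins mirroring the documented hierarchy; not pyFIA's own classes.
class DownloadError(Exception):
    def __init__(self, message, url=None):
        super().__init__(message)
        self.message = message
        self.url = url


class NetworkError(DownloadError):
    def __init__(self, message, url=None, status_code=None):
        super().__init__(message, url)
        self.status_code = status_code


class StateNotFoundError(DownloadError):
    def __init__(self, state, valid_states=None):
        super().__init__(f"Invalid state code: {state!r}")
        self.state = state
        self.valid_states = valid_states or []


def summarize(exc):
    """Turn any DownloadError into a short log message (hypothetical helper)."""
    if isinstance(exc, NetworkError) and exc.status_code is not None:
        return f"network failure (HTTP {exc.status_code}): {exc.message}"
    return f"download failed: {exc.message}"


try:
    # Placeholder URL on a reserved domain, used only for illustration.
    raise NetworkError("timed out", url="https://example.invalid/data.zip", status_code=503)
except DownloadError as exc:  # the base class catches every subclass
    message = summarize(exc)
```

Catching the base class last, after any more specific handlers, keeps unexpected download failures from escaping while still allowing targeted recovery for network or input errors.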