Property-Based Testing Guide for pyFIA¶
Overview¶
Property-based testing with Hypothesis helps us verify that our code satisfies certain properties across a wide range of inputs, not just specific test cases.
What is Property-Based Testing?¶
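Instead of writing one test per hand-picked input and expected output, we state a property that must hold for every input Hypothesis generates: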
from hypothesis import given, strategies as st

@given(values=st.lists(st.floats(min_value=0)))
def test_area_non_negative(values):
    assert calculate_area(values) >= 0
Hypothesis then generates the test cases automatically (100 per test by default) and shrinks any failure to a minimal example.
Key Properties We Test¶
1. Mathematical Invariants¶
- Variance is always non-negative
- CV = (SE / Estimate) × 100
- Proportions sum to ≤ 1
- Ratios preserve ordering
2. Domain Constraints¶
- Forest area ≤ Total area
- Tree counts are non-negative
- DBH values are positive
- Plot counts match expected ranges
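A constraint such as "Forest area ≤ Total area" can be sketched as follows. The record layout (`area`, `forested`) and the `forest_area` function are assumptions made for this example; they stand in for whatever estimator is actually under test.

```python
import math

from hypothesis import given, strategies as st


@st.composite
def plot_records(draw):
    """Generate hypothetical plot records: an area and a forested flag."""
    n_plots = draw(st.integers(min_value=1, max_value=20))
    return [
        {
            "area": draw(st.floats(min_value=0.0, max_value=1e4)),
            "forested": draw(st.booleans()),
        }
        for _ in range(n_plots)
    ]


def forest_area(plots):
    """Hypothetical stand-in for an area estimator."""
    return math.fsum(p["area"] for p in plots if p["forested"])


@given(plots=plot_records())
def test_forest_area_bounded_by_total(plots):
    total = math.fsum(p["area"] for p in plots)
    assert 0 <= forest_area(plots) <= total
```

`math.fsum` is used on both sides so that rounding in the summation cannot push the subset total above the full total by a unit in the last place.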
3. Statistical Properties¶
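- Estimates are unbiased: averaged over repeated samples, they converge on the true value
- Variance formulas are correct
- Stratification reduces variance
- Confidence intervals contain the true value at roughly the nominal rate (for example, about 95% of the time for a 95% interval)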
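Statistical properties hold on average rather than for every single input, so one way to check them is a seeded simulation with a tolerance band. The sketch below uses only the standard library and a normal-approximation interval; the sample size, trial count, seed, and acceptance band are illustrative assumptions, not values taken from pyFIA.

```python
import random
import statistics


def ci_covers(sample, true_mean, z=1.96):
    """Return True if the normal-approximation 95% CI contains true_mean."""
    mean = statistics.fmean(sample)
    se = statistics.stdev(sample) / len(sample) ** 0.5
    return mean - z * se <= true_mean <= mean + z * se


def coverage_rate(true_mean=50.0, sd=10.0, n=100, trials=500, seed=42):
    """Fraction of simulated samples whose CI contains the true mean."""
    rng = random.Random(seed)  # fixed seed keeps the check reproducible
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        hits += ci_covers(sample, true_mean)
    return hits / trials


def test_ci_coverage_near_nominal():
    # Allow slack around 0.95 for Monte Carlo noise over 500 trials.
    assert 0.90 <= coverage_rate() <= 0.99
```

Asserting a band instead of an exact rate keeps the test from failing on ordinary sampling noise.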
4. Data Integrity¶
- Joins on unique keys don't increase row counts
- Filters reduce or maintain counts
- Grouping preserves totals
- Missing data is handled correctly
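The filter and grouping properties can be illustrated without a DataFrame library. In this sketch the rows are plain `(species code, count)` tuples, the codes are arbitrary sample values, and `group_totals` is a hypothetical stand-in for a group-by aggregation.

```python
from collections import defaultdict

from hypothesis import given, strategies as st

# Hypothetical rows: (species code, tree count). Integers keep sums exact.
rows = st.lists(
    st.tuples(st.sampled_from(["131", "110", "802"]), st.integers(0, 1000)),
    max_size=100,
)


def group_totals(records):
    """Sum counts per species, a stand-in for a group-by aggregation."""
    totals = defaultdict(int)
    for species, count in records:
        totals[species] += count
    return dict(totals)


@given(records=rows, threshold=st.integers(0, 1000))
def test_filter_never_adds_rows(records, threshold):
    kept = [r for r in records if r[1] >= threshold]
    assert len(kept) <= len(records)


@given(records=rows)
def test_grouping_preserves_total(records):
    grand_total = sum(count for _, count in records)
    assert sum(group_totals(records).values()) == grand_total
```

Integer counts allow an exact equality check; with floating-point columns the same property would need a tolerance.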
Running Property Tests¶
Basic Usage¶
# Run all property tests
uv run pytest tests/test_property_based.py -v
# Run with more examples (slower but more thorough)
uv run pytest tests/test_property_based.py --hypothesis-profile=ci
# Run specific test
uv run pytest tests/test_property_based.py::TestEstimationProperties::test_variance_non_negative -v
Hypothesis Profiles¶
- dev: 10 examples (fast, for development)
- ci: 100 examples (for continuous integration)
- nightly: 1000 examples (thorough testing)
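Profiles like these are usually registered once in a `conftest.py`. The sketch below shows one way this could look; the file location and the `HYPOTHESIS_PROFILE` environment variable are assumptions, and the project's actual configuration may differ.

```python
# conftest.py (sketch)
import os

from hypothesis import settings

# Example counts mirror the three profiles listed above.
settings.register_profile("dev", max_examples=10)
settings.register_profile("ci", max_examples=100)
settings.register_profile("nightly", max_examples=1000)

# Fall back to the fast profile; the environment variable name is an assumption.
settings.load_profile(os.getenv("HYPOTHESIS_PROFILE", "dev"))
```

With the profiles registered, the `--hypothesis-profile=ci` option shown earlier should select the 100-example profile for that run.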
Debugging Failures¶
When a test fails, Hypothesis provides:
1. The minimal failing example
2. Steps to reproduce
3. Shrunk input that still fails
Example:
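As a minimal sketch of shrinking in action, the property below fails because of a deliberate bug in `buggy_mean`, a hypothetical helper that is not part of pyFIA. The function is named `check_...` rather than `test_...` so that pytest does not collect an example that is meant to fail.

```python
from hypothesis import given, settings, strategies as st


def buggy_mean(values):
    """Hypothetical helper with a deliberate bug: floor division."""
    return sum(values) // len(values)


@settings(database=None)  # no saved examples, so each run starts fresh
@given(values=st.lists(st.integers(0, 100), min_size=1))
def check_mean_times_n_equals_sum(values):
    # Holds only when the sum is an exact multiple of the length.
    assert buggy_mean(values) * len(values) == sum(values)
```

Calling `check_mean_times_n_equals_sum()` raises an `AssertionError`, and Hypothesis reports the falsifying example it shrank to, typically a very short list of small integers rather than the first large random input that happened to fail.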
Writing New Property Tests¶
1. Identify Properties¶
Ask: "What should always be true?"
- Output constraints (non-negative, bounded)
- Relationships (X ≤ Y, sum = total)
- Invariants (formulas, conservation laws)
2. Create Custom Strategies¶
import polars as pl
from hypothesis import strategies as st

@st.composite
def plot_data_strategy(draw):
    """Generate realistic plot data."""
    n_plots = draw(st.integers(min_value=1, max_value=100))
    return pl.DataFrame({
        "PLT_CN": [f"P{i:04d}" for i in range(n_plots)],
        # One inventory year per plot, so the list length is pinned to n_plots
        "INVYR": draw(st.lists(
            st.integers(2010, 2025),
            min_size=n_plots,
            max_size=n_plots,
        )),
    })
3. Write Property Tests¶
@given(data=plot_data_strategy())
def test_property(data):
    result = process_data(data)
    # Assert property holds
    assert property_check(result)
4. Handle Edge Cases¶
import math
from hypothesis import assume, given, strategies as st

@given(values=st.lists(st.floats()))
def test_with_edge_cases(values):
    assume(len(values) > 0)  # Skip empty lists
    assume(not any(math.isnan(v) for v in values))  # Skip NaN
    result = calculate(values)
    assert result >= 0
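An alternative to `assume()` is to constrain the strategy itself, so that Hypothesis never generates the unwanted inputs and no examples are discarded. The sketch below takes that approach; `sum_of_squares` is a hypothetical stand-in for the calculation under test, and the numeric bounds are arbitrary.

```python
from hypothesis import given, strategies as st

# Constrain generation up front instead of filtering inputs with assume().
finite_values = st.lists(
    st.floats(
        min_value=-1e6,
        max_value=1e6,
        allow_nan=False,
        allow_infinity=False,
    ),
    min_size=1,  # never generate the empty list
)


def sum_of_squares(values):
    """Hypothetical stand-in for the calculation under test."""
    return sum(v * v for v in values)


@given(values=finite_values)
def test_sum_of_squares_non_negative(values):
    assert sum_of_squares(values) >= 0
```

Reserving `assume()` for conditions that are hard to express in a strategy tends to keep tests faster and avoids health-check warnings about too many rejected inputs.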
Common Patterns¶
Testing Numerical Stability¶
@given(
    small=st.floats(min_value=1e-10, max_value=1e-5),
    large=st.floats(min_value=1e5, max_value=1e10),
)
def test_numerical_stability(small, large):
    # Should handle extreme values
    result = calculate_ratio(large, small)
    assert not math.isnan(result)
    assert not math.isinf(result)
Testing Transformations¶
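@given(df=dataframe_strategy())
def test_transformation_preserves_property(df):
    original_sum = df["value"].sum()
    transformed = apply_transformation(df)
    # Transformation should preserve sum; the relative tolerance scales
    # with magnitude, so large totals are not failed by rounding alone
    assert math.isclose(
        transformed["value"].sum(), original_sum, rel_tol=1e-9, abs_tol=1e-10
    )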
Testing Estimators¶
@given(
    # Start above zero: the relative tolerance below is empty at true_value == 0
    true_value=st.floats(min_value=1, max_value=1000),
    n_samples=st.integers(min_value=10, max_value=1000),
)
def test_estimator_unbiased(true_value, n_samples):
    estimates = []
    for _ in range(100):
        sample = generate_sample(true_value, n_samples)
        estimates.append(calculate_estimate(sample))
    # Mean of estimates should be within 10% of the true value
    assert abs(np.mean(estimates) - true_value) < true_value * 0.1
Best Practices¶
- Start Simple: Test obvious properties first
- Use Realistic Data: Create domain-specific strategies
- Test Relationships: Not just individual values
- Consider Performance: Use @settings(deadline=...) for slow tests
- Document Properties: Explain why property should hold
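For the performance point, a sketch of relaxing the per-example deadline might look like this. `slow_estimate` is a hypothetical stand-in for an expensive estimator, and the one-second deadline and 25 examples are arbitrary values chosen for illustration.

```python
from datetime import timedelta

from hypothesis import given, settings, strategies as st


def slow_estimate(values):
    """Hypothetical stand-in for an expensive estimator (here, a mean)."""
    return sum(sorted(values)) / len(values)


# Allow up to one second per example and run fewer examples overall.
@settings(deadline=timedelta(seconds=1), max_examples=25)
@given(values=st.lists(st.integers(0, 1000), min_size=1, max_size=200))
def test_slow_estimate_within_range(values):
    # A mean can never fall outside the range of its inputs.
    assert min(values) <= slow_estimate(values) <= max(values)
```

Passing `deadline=None` disables the time limit entirely, which can be useful when timing varies a lot between machines.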
Integration with CI/CD¶
# .github/workflows/test.yml
- name: Run property tests
  run: |
    uv run pytest tests/test_property_based.py \
      --hypothesis-profile=ci \
      --hypothesis-show-statistics