test-property-based
```bash
npx skills add https://github.com/dawiddutoit/custom-claude --skill test-property-based
```
Property-Based Testing with Hypothesis
Quick Start
Property-based testing automatically generates hundreds of test cases to validate invariants:
```python
from hypothesis import given, strategies as st

# Instead of writing many example tests...
# def test_sort_1(): assert sorted([3,1,2]) == [1,2,3]
# def test_sort_2(): assert sorted([]) == []
# ... (20 more examples)

# Write ONE property test that covers ALL cases
@given(st.lists(st.integers()))
def test_sort_idempotent(lst):
    """Property: Sorting twice gives same result as once."""
    once_sorted = sorted(lst)
    twice_sorted = sorted(once_sorted)
    assert once_sorted == twice_sorted
```
Hypothesis automatically generates 100+ test cases including edge cases you’d never think of: empty lists, single elements, duplicates, large lists, negative numbers, etc.
Table of Contents
- When to Use This Skill
- What This Skill Does
- Core Concepts
- Step-by-Step Workflow
- Common Property Patterns
- Integration with pytest
- Async Property Testing
- Pydantic Model Testing
- Configuration
- Supporting Files
- Expected Outcomes
- Requirements
- Red Flags to Avoid
When to Use This Skill
Explicit Triggers
Use this skill when users mention:
- “property test”
- “hypothesis test”
- “generate test cases”
- “fuzz testing”
- “invariant testing”
- “roundtrip test”
- “stateful testing”
- “edge case testing”
- “test with random data”
Implicit Triggers
Use when you observe:
- Manual writing of many similar example tests
- Testing parsing/serialization (perfect for roundtrip properties)
- Validating configuration classes (especially Pydantic models)
- Testing algorithms with mathematical properties
- Protocol message handling (IPC, API requests/responses)
- State machine behavior
Debugging Triggers
Use when:
- Edge case bugs slip through example-based tests
- Need more comprehensive input coverage
- Test suite misses corner cases
- Validating refactored code behavior
What This Skill Does
This skill guides you through:
- Installing Hypothesis – Add to project dependencies
- Writing property tests – Transform example tests into property-based tests
- Choosing strategies – Select appropriate data generators
- Creating custom strategies – Build domain-specific generators
- Async integration – Combine with pytest-asyncio
- Pydantic integration – Test Pydantic models automatically
- Configuration – Set up profiles for dev/CI/thorough testing
- Stateful testing – Test state machines and complex workflows
Philosophy: Instead of “here are 5 examples that should work”, write “here’s a property that should ALWAYS hold” and let Hypothesis find edge cases.
Core Concepts
Strategies
Strategies describe the type of data Hypothesis should generate:
```python
from hypothesis import strategies as st

# Basic types
st.integers()                              # All integers
st.integers(min_value=0, max_value=100)    # Constrained range
st.floats(allow_nan=False)                 # Floats without NaN
st.text()                                  # Unicode strings
st.text(alphabet="abc", min_size=1)        # Limited alphabet
st.binary()                                # Bytes

# Collections
st.lists(st.integers())                    # Lists of integers
st.dictionaries(st.text(), st.integers())  # Dict[str, int]
st.sets(st.text(), min_size=1)             # Non-empty sets
st.tuples(st.text(), st.integers())        # (str, int) tuples

# Special
st.one_of(st.integers(), st.text())        # Union types
st.none()                                  # None values
st.uuids()                                 # UUID objects
st.datetimes()                             # datetime objects
```
See references/strategies-reference.md for complete strategy catalog.
The @given Decorator
The @given decorator runs your test function with generated data:
```python
from hypothesis import given, strategies as st

@given(st.integers(), st.integers())
def test_addition_commutative(a, b):
    """Addition should be commutative."""
    assert a + b == b + a

@given(st.lists(st.integers()))
def test_sort_preserves_length(lst):
    """Sorting preserves list length."""
    assert len(sorted(lst)) == len(lst)
```
Default behavior: Runs 100 examples (configurable via settings).
Shrinking
When Hypothesis finds a failing test, it automatically minimizes the input:
```python
@given(st.lists(st.integers()))
def test_sum_positive(lst):
    assert sum(lst) >= 0  # Fails for negative numbers

# Hypothesis reports: lst=[-1]
# NOT lst=[-9999, -42, -1, -8888] (the random case it found)
```
This is invaluable for debugging – you get the minimal failing case, not a complex random one.
Custom Strategies
For complex domain objects, build custom strategies with @composite:
```python
import string

from hypothesis import given, strategies as st
from hypothesis.strategies import composite

@composite
def valid_emails(draw):
    """Generate valid email addresses."""
    username = draw(st.text(
        alphabet=string.ascii_letters + string.digits,
        min_size=1, max_size=20,
    ))
    domain = draw(st.text(
        alphabet=string.ascii_lowercase,
        min_size=1, max_size=15,
    ))
    tld = draw(st.sampled_from(['com', 'org', 'net', 'io']))
    return f"{username}@{domain}.{tld}"

@given(valid_emails())
def test_email_parsing(email):
    """Test parsing of valid email addresses."""
    assert '@' in email
    assert '.' in email.split('@')[1]
```
See references/patterns-catalog.md for more custom strategy patterns.
Step-by-Step Workflow
Step 1: Install Hypothesis
```bash
# Add to project dependencies
uv add --dev hypothesis

# Verify installation
python -c "import hypothesis; print(hypothesis.__version__)"
```
Step 2: Identify Properties to Test
Look for:
- Invariants – Things that should always be true
- Roundtrips – Serialize → deserialize → should equal the original
- Idempotency – Operation twice = operation once
- Commutativity – Order doesn’t matter
- Consistency – Related operations agree
Example: Testing a JSON serializer:
- Property: `parse(serialize(obj)) == obj` (roundtrip)
- Property: `serialize(obj)` returns a valid JSON string
- Property: All serialized objects are parseable
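These properties can be written directly against the standard library's `json` module. The recursive strategy below is one way to describe "any JSON-representable object"; floats are deliberately left out to sidestep NaN and infinity, which JSON handles specially:

```python
import json

from hypothesis import given, strategies as st

# JSON-representable values: scalars, plus lists/dicts of them
# (floats omitted to avoid NaN/Infinity edge cases)
json_values = st.recursive(
    st.none() | st.booleans() | st.integers() | st.text(),
    lambda children: st.lists(children) | st.dictionaries(st.text(), children),
    max_leaves=10,
)

@given(json_values)
def test_json_roundtrip(obj):
    """Property: parse(serialize(obj)) == obj."""
    assert json.loads(json.dumps(obj)) == obj

@given(json_values)
def test_serialize_returns_valid_json(obj):
    """Property: serialize(obj) returns a parseable JSON string."""
    s = json.dumps(obj)
    assert isinstance(s, str)
    json.loads(s)  # parses without error
```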
Step 3: Choose Strategies
Map your data types to Hypothesis strategies:
```
# Simple types
int            → st.integers()
str            → st.text()
bool           → st.booleans()

# Collections
List[int]      → st.lists(st.integers())
Dict[str, int] → st.dictionaries(st.text(), st.integers())
Optional[str]  → st.one_of(st.text(), st.none())

# Domain models (Pydantic)
MyModel        → builds(MyModel)
```
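When the code under test already carries type annotations, `st.from_type` can derive a strategy from the annotation directly, which shortens this mapping step:

```python
from typing import Optional

from hypothesis import given, strategies as st

# st.from_type resolves a strategy from a type annotation,
# e.g. Optional[int] behaves like st.none() | st.integers()
@given(st.from_type(Optional[int]))
def test_accepts_optional_int(value):
    assert value is None or isinstance(value, int)
```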
Step 4: Write Property Test
```python
import json

from hypothesis import given, strategies as st

@given(st.dictionaries(st.text(), st.text()))
def test_json_roundtrip(data):
    """Property: All dicts should roundtrip through JSON."""
    serialized = json.dumps(data)
    parsed = json.loads(serialized)
    assert parsed == data
```
Step 5: Run and Observe
```bash
# Run property test
pytest tests/test_properties.py -v

# Show statistics
pytest --hypothesis-show-statistics

# Reproduce specific failure
pytest --hypothesis-seed=12345
```
Step 6: Refine if Needed
If the test generates invalid inputs:
- Add constraints to the strategy
- Use `assume()` to filter (sparingly)
- Create a custom strategy with `@composite`

```python
from hypothesis import assume, given, strategies as st

@given(st.lists(st.integers()))
def test_with_filtering(lst):
    # AVOID: Too much filtering (slow)
    assume(len(lst) > 0)              # Better: st.lists(st.integers(), min_size=1)
    assume(all(x >= 0 for x in lst))  # Better: st.lists(st.integers(min_value=0))
    ...
```
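The assume-based version above can usually be rewritten with the constraints pushed into the strategy itself, so no generated examples are discarded:

```python
from hypothesis import given, strategies as st

# Constraints live in the strategy, so every generated example is valid
@given(st.lists(st.integers(min_value=0), min_size=1))
def test_without_filtering(lst):
    assert len(lst) > 0
    assert all(x >= 0 for x in lst)
```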
Common Property Patterns
1. Roundtrip Testing
Pattern: Serialize → deserialize → should equal the original
```python
@given(builds(MyModel))
def test_model_json_roundtrip(model):
    """Property: Models roundtrip through JSON."""
    json_str = model.model_dump_json()
    restored = MyModel.model_validate_json(json_str)
    assert restored == model
```
2. Invariant Testing
Pattern: Some property should always hold
```python
@given(st.lists(st.integers()))
def test_sort_ordered(lst):
    """Property: Sorted list should be in ascending order."""
    sorted_lst = sorted(lst)
    for i in range(len(sorted_lst) - 1):
        assert sorted_lst[i] <= sorted_lst[i + 1]
```
3. Idempotency Testing
Pattern: Operation twice = operation once
```python
@given(st.text())
def test_normalize_idempotent(text):
    """Property: Normalizing twice gives same result."""
    once = normalize(text)
    twice = normalize(once)
    assert once == twice
```
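As a concrete, runnable instance of this pattern, Unicode NFC normalization from the standard library is idempotent by definition:

```python
import unicodedata

from hypothesis import given, strategies as st

def normalize(text: str) -> str:
    """Normalize a string to Unicode NFC form."""
    return unicodedata.normalize("NFC", text)

@given(st.text())
def test_nfc_idempotent(text):
    once = normalize(text)
    twice = normalize(once)
    assert once == twice
```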
4. Commutativity Testing
Pattern: Order doesn’t matter
```python
@given(st.integers(), st.integers())
def test_addition_commutative(a, b):
    """Property: a + b == b + a."""
    assert a + b == b + a
```
5. Consistency Testing
Pattern: Different paths to same result should agree
```python
@given(st.lists(st.integers()))
def test_sum_consistency(lst):
    """Property: Manual sum equals built-in sum."""
    manual_sum = 0
    for x in lst:
        manual_sum += x
    assert manual_sum == sum(lst)
```
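The stateful testing mentioned earlier generalizes these patterns to whole sequences of operations: Hypothesis generates a random series of rule calls and checks invariants after each step. A minimal sketch using `RuleBasedStateMachine`, testing a toy FIFO queue against a reference model:

```python
from hypothesis import strategies as st
from hypothesis.stateful import RuleBasedStateMachine, invariant, rule

class QueueMachine(RuleBasedStateMachine):
    """Exercise a FIFO queue (a plain list here) against a reference model."""

    def __init__(self):
        super().__init__()
        self.queue = []  # system under test (stand-in)
        self.model = []  # reference model

    @rule(item=st.integers())
    def enqueue(self, item):
        self.queue.append(item)
        self.model.append(item)

    @rule()
    def dequeue(self):
        if self.model:
            assert self.queue.pop(0) == self.model.pop(0)

    @invariant()
    def lengths_agree(self):
        assert len(self.queue) == len(self.model)

# pytest collects this attribute as a regular test class
TestQueueMachine = QueueMachine.TestCase
```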
See references/patterns-catalog.md for 15+ common patterns.
Integration with pytest
Hypothesis works seamlessly with pytest:
```python
import pytest
from hypothesis import given, strategies as st

# Combine with fixtures
@pytest.fixture
def temp_config(tmp_path):
    """Fixture providing temp configuration."""
    return Config(data_dir=tmp_path)

@given(st.text())
def test_with_fixture(temp_config, text):
    """Hypothesis + fixture: temp_config from fixture, text from Hypothesis."""
    result = process_with_config(temp_config, text)
    assert result is not None
```
Important: Fixtures are called once per test function, not once per Hypothesis example (100 runs).
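Because of this mismatch, recent Hypothesis versions raise a `function_scoped_fixture` health check when a function-scoped fixture is combined with `@given`. If sharing the fixture's value across all examples is acceptable, the check can be suppressed explicitly (the fixture and test body here are illustrative stand-ins):

```python
import pytest
from hypothesis import HealthCheck, given, settings, strategies as st

@pytest.fixture
def temp_config(tmp_path):
    return {"data_dir": tmp_path}  # hypothetical stand-in for a real config object

# Acknowledge that temp_config is reused across all generated examples
@settings(suppress_health_check=[HealthCheck.function_scoped_fixture])
@given(st.text())
def test_with_fixture(temp_config, text):
    assert temp_config["data_dir"] is not None
```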
pytest Command-Line Options
```bash
# Show statistics about data generation
pytest --hypothesis-show-statistics

# Use a specific Hypothesis profile
pytest --hypothesis-profile=ci

# Set verbosity level
pytest --hypothesis-verbosity=debug

# Reproduce a specific failure
pytest --hypothesis-seed=12345
```
Async Property Testing
Basic Async Pattern
Hypothesis works with pytest-asyncio:
```python
import pytest
from hypothesis import given, strategies as st

@pytest.mark.asyncio
@given(st.text())
async def test_async_property(text):
    """Property test for async function."""
    result = await async_process(text)
    assert isinstance(result, str)
```
Critical: Decorator Order
The decorators MUST be stacked in this order:

```python
@pytest.mark.asyncio   # First (outermost)
@given(st.text())      # Second (closest to the function)
async def test_async_property(text):
    pass
```
If you get “Hypothesis doesn’t know how to run async test functions”, check decorator order.
Example: Testing Async IPC
```python
import json

import pytest
from hypothesis import given, strategies as st

@pytest.mark.asyncio
@given(
    command=st.sampled_from(["execute", "status", "cancel"]),
    prompt=st.text(),
    correlation_id=st.uuids().map(str),
)
async def test_ipc_command_roundtrip(command, prompt, correlation_id):
    """Property: All IPC commands should roundtrip through serialization."""
    request = create_command_request(
        command=command,
        prompt=prompt,
        correlation_id=correlation_id,
    )
    serialized = json.dumps(request)
    deserialized = json.loads(serialized)
    assert deserialized == request
    assert deserialized["command"] == command
```
Pydantic Model Testing
Hypothesis automatically supports Pydantic models:
```python
from hypothesis import given
from hypothesis.strategies import builds
from pydantic import BaseModel, EmailStr, PositiveFloat

class PaymentModel(BaseModel):
    amount: PositiveFloat
    email: EmailStr
    description: str

# Hypothesis automatically respects Pydantic constraints!
@given(builds(PaymentModel))
def test_payment_validation(payment):
    """Hypothesis generates valid PaymentModel instances."""
    assert payment.amount > 0
    assert '@' in payment.email
    assert isinstance(payment.description, str)
```
Overriding Specific Fields
```python
@given(builds(
    PaymentModel,
    amount=st.floats(min_value=100, max_value=1000),
    description=st.text(min_size=10, max_size=100),
))
def test_large_payments(payment):
    """Test with payments between $100-$1000."""
    assert 100 <= payment.amount <= 1000
    assert 10 <= len(payment.description) <= 100
```
Testing Configuration Models
```python
from hypothesis import given, strategies as st
from hypothesis.strategies import builds
from my_project.config import AgentConfig

@given(builds(AgentConfig))
def test_agent_config_invariants(config):
    """Any valid AgentConfig should satisfy these invariants."""
    assert config.agent_id is not None
    assert config.system_prompt is not None
    assert len(config.agent_id) > 0
```
Configuration
Profile Setup (conftest.py)
Create profiles for different environments:
```python
# tests/conftest.py
import os

from hypothesis import HealthCheck, settings

# Configure Hypothesis profiles
settings.register_profile(
    "ci",
    max_examples=200,
    deadline=1000,  # milliseconds
)
settings.register_profile(
    "dev",
    max_examples=50,
    deadline=None,
)
settings.register_profile(
    "thorough",
    max_examples=1000,
    deadline=None,
    suppress_health_check=[HealthCheck.too_slow],
)

# Activate based on environment
settings.load_profile(os.getenv("HYPOTHESIS_PROFILE", "dev"))
```
Per-Test Settings
```python
from hypothesis import given, settings, strategies as st

@settings(max_examples=1000, deadline=None)
@given(st.integers())
def test_expensive_operation(n):
    """Run 1000 examples with no time limit."""
    result = very_slow_computation(n)
    assert result >= 0
```
Configuration Options
| Option | Default | Description |
|---|---|---|
| `max_examples` | `100` | Number of test cases to generate |
| `deadline` | `200ms` | Time limit per test case |
| `suppress_health_check` | `[]` | Disable specific health-check warnings |
| `verbosity` | `normal` | Output verbosity (quiet, normal, verbose, debug) |
| `derandomize` | `False` | Use deterministic randomness |
Supporting Files
references/strategies-reference.md
Complete catalog of built-in Hypothesis strategies with examples:
- Basic types (integers, floats, text, binary)
- Collections (lists, sets, dicts, tuples)
- Special types (UUIDs, datetimes, emails)
- Combinators (one_of, builds, recursive)
- Advanced patterns (composite, shared, data)
references/patterns-catalog.md
Common property test patterns with examples:
- Roundtrip testing (serialization, encoding)
- Invariant testing (order, size, consistency)
- Idempotency testing (normalization, deduplication)
- Commutativity testing (operations, transformations)
- State machine testing (lifecycle, protocols)
templates/property-test-templates.md
Copy-paste ready templates for:
- Basic property test
- Async property test
- Pydantic model property test
- Custom strategy
- State machine test
- conftest.py Hypothesis configuration
Expected Outcomes
Successful Property Test Creation
```
✅ Property Tests Added

Module: tests/unit/test_ipc_protocol_properties.py
Properties tested:
  - JSON roundtrip for command requests
  - Correlation ID preservation
  - Valid command types

Generated examples: 100 per property
Edge cases found: 0 (all tests passed)

Test results:
  ✓ All properties hold
  ✓ 300 examples generated (3 properties × 100 each)
  ✓ No shrinking needed (no failures)

Configuration:
  Profile: dev (50 examples/property)
  Deadline: None (development)
  Time: 2.3 seconds

Confidence: High (comprehensive input coverage)
```
Property Test Finding Bug
```
⚠️ Property Violation Found

Test: test_json_roundtrip
Property: All dicts should roundtrip through JSON

Falsifying example: data={'key': float('inf')}
Error: JSON cannot serialize infinity
Shrinking: Reduced from complex dict to minimal case

Root cause: Missing validation for special float values
Fix required: Add constraint to strategy or handle inf/nan

Next steps:
1. Decide: Should code handle inf/nan or reject them?
2. Update strategy: st.floats(allow_nan=False, allow_infinity=False)
3. OR: Add validation in serializer
4. Re-run property tests to verify fix
```
Requirements
Tools needed:
- Bash (for running tests)
- Read (for examining test files)
- Grep (for finding test patterns)
- Glob (for file discovery)
- Edit/Write (for creating/modifying tests)
Dependencies:
- Python 3.8+
- pytest
- hypothesis (install with `uv add --dev hypothesis`)
- pytest-asyncio (for async tests)
Test Framework:
- pytest with Hypothesis integration
- pytest-asyncio for async property tests
Knowledge:
- Basic understanding of property-based testing concepts
- Familiarity with pytest
- Understanding of type annotations (helpful for strategies)
Red Flags to Avoid
❌ WRONG: Over-Constraining Strategies

```python
# BAD: Too specific, loses property testing benefits
@given(st.integers(min_value=42, max_value=42))
def test_specific_value(n):
    assert n == 42  # This is just an example test!
```
✅ RIGHT: Test properties that hold for all inputs

```python
@given(st.integers())
def test_absolute_value_non_negative(n):
    assert abs(n) >= 0
```
❌ WRONG: Filtering Too Much

```python
# BAD: Rejecting most generated examples
@given(st.integers())
def test_primes(n):
    assume(is_prime(n))  # Rejects 99% of inputs!
    # ... test code
```
✅ RIGHT: Use a strategy that generates valid inputs

```python
@composite
def primes(draw):
    return draw(st.sampled_from([2, 3, 5, 7, 11, 13, 17, 19, 23, 29]))

@given(primes())
def test_primes(n):
    ...  # All inputs are primes
```
❌ WRONG: Testing Implementation, Not Properties

```python
# BAD: Duplicating implementation in test
@given(st.lists(st.integers()))
def test_sum_implementation(lst):
    result = sum(lst)
    # Bad: Reimplementing sum() in test
    expected = 0
    for item in lst:
        expected += item
    assert result == expected
```
✅ RIGHT: Test properties

```python
@given(st.lists(st.integers()))
def test_sum_commutative(lst):
    assert sum(lst) == sum(reversed(lst))

@given(st.lists(st.integers()))
def test_sum_with_zero(lst):
    assert sum(lst + [0]) == sum(lst)
```
❌ WRONG: Wrong Decorator Order for Async

```python
# BAD: Will fail with "Hypothesis doesn't know how to run async"
@given(st.text())
@pytest.mark.asyncio
async def test_async_property(text):
    pass
```
✅ RIGHT: @pytest.mark.asyncio on top, @given closest to the function

```python
@pytest.mark.asyncio
@given(st.text())
async def test_async_property(text):
    pass
```
❌ WRONG: Not Using Pydantic Integration

```python
# BAD: Manually constructing Pydantic models
@given(
    amount=st.floats(min_value=0.01),
    email=st.text(),  # Not valid emails!
)
def test_payment(amount, email):
    payment = PaymentModel(amount=amount, email=email)  # Will fail validation
```
✅ RIGHT: Use builds() for Pydantic models

```python
@given(builds(PaymentModel))
def test_payment(payment):
    # Hypothesis automatically generates valid instances
    assert payment.amount > 0
```
❌ WRONG: Mixing Hypothesis with pytest.mark.parametrize

```python
# BAD: Redundant - Hypothesis already does this
@pytest.mark.parametrize("n", [1, 2, 3, 4, 5])
@given(st.integers())
def test_redundant(n, generated_int):
    # Why both? Pick one approach!
    pass
```
✅ RIGHT: Use Hypothesis for data generation

```python
@given(st.integers(min_value=1, max_value=5))
def test_small_integers(n):
    assert 1 <= n <= 5
```
Notes
Start Small:
- Pick one simple function to test
- Write one property test
- Run it, observe results
- Expand to more properties
Think in Properties, Not Examples:
- Instead of: “sort([3,1,2]) == [1,2,3]”
- Think: “sorted list should be ordered” (invariant)
- Or: “sorting twice == sorting once” (idempotency)
- Or: “sort preserves all elements” (conservation)
Hypothesis Finds Edge Cases You Miss:
- Empty collections
- Single elements
- Duplicates
- Very large/small numbers
- Unicode edge cases
- Boundary conditions
When Property Tests Fail:
- Read the minimal failing example (shrinking gives you this)
- Understand why the property doesn’t hold
- Decide: Is code wrong or property too strict?
- Fix and re-run
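After a fix, the minimal failing input can be pinned with Hypothesis's `@example` decorator so it is replayed on every run as a permanent regression case (the test and input below are illustrative):

```python
from hypothesis import example, given, strategies as st

@given(st.lists(st.integers()))
@example([-1])  # minimal failing case found by shrinking, now replayed every run
def test_sum_of_absolutes_non_negative(lst):
    assert sum(abs(x) for x in lst) >= 0
```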
Further Reading:
- Hypothesis documentation: https://hypothesis.readthedocs.io/
- Strategies reference: references/strategies-reference.md
- Pattern catalog: references/patterns-catalog.md