Risk Factor Assessment

Evidence-based evaluation of health risks

methodology
risk scoring
algorithms
Author

Preact Health

Published

February 17, 2026

Overview

The risk factor scoring system quantifies health behaviors and conditions that increase disease risk and reduce health scores. Our approach combines clinical evidence with age-specific adjustments to provide personalized risk assessment.


Technical Architecture

Data Structure

Input includes multiple data sources:

risk_data = {
    'risk': {...},           # Risk-related survey responses
    'age_days': 10950,      # Age in days (30 years)
    'indicators': {...},     # Clinical indicators
    'familyhx': {...},      # Family health history
    'exercise_score': 0.8,  # From health factor
    'nutrition_score': 0.7  # From health factor
}

Question Mapping

Each logical question maps to a numeric key:

  • 'alcohol' → Key for alcohol use questions
  • 'smoke' → Key for smoking history
  • 'drugs' → Key for substance use

Responses are validated against accepted options from the database.
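
The mapping and validation step could be sketched as follows. This is a minimal illustration, not the project's implementation: the numeric keys, the accepted options, and the function name `validate_response` are hypothetical placeholders, since the real values come from the database.

```python
# Hypothetical mapping from logical question names to numeric keys.
# The real keys and accepted options are stored in the database.
QUESTION_KEYS = {"alcohol": 101, "smoke": 102, "drugs": 103}
ACCEPTED_OPTIONS = {
    101: {"Yes", "No"},
    102: {"Yes", "No"},
    103: {"Yes", "No"},
}


def validate_response(question, response):
    """Return the numeric key for `question` if `response` is an accepted option."""
    key = QUESTION_KEYS[question]  # raises KeyError for an unknown question
    if response not in ACCEPTED_OPTIONS[key]:
        raise ValueError(f"{response!r} is not an accepted option for {question!r}")
    return key


print(validate_response("alcohol", "No"))
```

Rejecting unknown options early keeps malformed survey answers out of the domain calculations that follow.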


Risk Domains

1. Alcohol Risk

Data collected:

  • Current use status (yes/no)
  • Frequency (days per week: 0-7)
  • Amount (drinks per day: 0-5+)
  • Last use date

Risk calculation:

if status == "No":
    risk = 0
else:
    daily_consumption = (days_per_week / 7) * drinks_per_day
    
    if daily_consumption <= 1.0:  # Low risk
        risk = 0.1
    elif daily_consumption <= 2.0:  # Moderate
        risk = 0.3
    else:  # High risk
        risk = 0.6 + min(daily_consumption - 2.0, 2.0) * 0.2

Evidence base: NIAAA guidelines for low-risk drinking
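
For a quick check of the thresholds, the calculation above can be wrapped in a function and run on a few inputs. The function name `alcohol_risk` and the sample inputs are assumptions made for illustration; the thresholds are the ones shown in the snippet.

```python
def alcohol_risk(status, days_per_week, drinks_per_day):
    """Alcohol risk using the thresholds from the calculation above."""
    if status == "No":
        return 0.0
    # Average drinks per day over a full week
    daily_consumption = (days_per_week / 7) * drinks_per_day
    if daily_consumption <= 1.0:  # Low risk
        return 0.1
    if daily_consumption <= 2.0:  # Moderate
        return 0.3
    # High risk: grows with consumption, capped at 2 drinks above the threshold
    return 0.6 + min(daily_consumption - 2.0, 2.0) * 0.2


# Illustrative inputs: non-drinker, weekend drinker, daily drinker
for args in [("No", 0, 0), ("Yes", 2, 3), ("Yes", 7, 3)]:
    print(args, round(alcohol_risk(*args), 2))
```

Because the excess over 2 drinks per day is capped at 2.0, the high-risk branch ranges from 0.6 to 1.0.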

2. Smoking Risk

Data collected:

  • Current smoking status
  • Frequency (days per week)
  • Amount (cigarettes per day)
  • Quit date (if former smoker)

Risk calculation:

  • Current smokers: High risk (0.8-1.0)
  • Recent quitters (<1 year): Moderate risk (0.4-0.6)
  • Former smokers (>5 years): Low residual risk (0.1-0.2)
  • Never smokers: No risk (0.0)

Pack-year adjustment:

pack_years = (cigarettes_per_day / 20) * years_smoking
risk_multiplier = 1.0 + min(pack_years / 20, 1.0)
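
As a worked example of the pack-year adjustment, the sketch below evaluates the multiplier for a few smoking histories. The function name `pack_year_multiplier` and the sample inputs are illustrative assumptions; how the multiplier is combined with the base smoking risk is not specified here, so the sketch stops at the multiplier.

```python
def pack_year_multiplier(cigarettes_per_day, years_smoking):
    """Multiplier from the pack-year formula above; ranges from 1.0 to 2.0."""
    # One pack-year = 20 cigarettes per day for one year
    pack_years = (cigarettes_per_day / 20) * years_smoking
    # The adjustment saturates at 20 pack-years
    return 1.0 + min(pack_years / 20, 1.0)


# Illustrative histories: (cigarettes per day, years smoking)
for cigarettes, years in [(10, 20), (20, 20), (40, 30)]:
    print(cigarettes, years, pack_year_multiplier(cigarettes, years))
```

Half a pack a day for 20 years is 10 pack-years, which gives a multiplier of 1.5. Anything at or beyond 20 pack-years is capped at 2.0.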

3. Drug Use Risk

Categories:

  • Prescription misuse
  • Recreational drugs
  • Cannabis
  • Other substances

Frequency-based scoring:

  • Occasional use: 0.2
  • Regular use: 0.5
  • Daily use: 0.8+

4. BMI Risk

Calculation:

# Convert height from centimetres to metres (inputs are already metric here)
height_m = height_cm / 100
bmi = weight_kg / (height_m ** 2)

# Risk categories (WHO classification)
if 18.5 <= bmi < 25:
    risk = 0.0  # Normal
elif 25 <= bmi < 30:
    risk = 0.2  # Overweight
elif 30 <= bmi < 35:
    risk = 0.4  # Obese Class I
elif 35 <= bmi < 40:
    risk = 0.6  # Obese Class II
else:
    risk = 0.8  # Obese Class III (bmi >= 40); as written, bmi < 18.5 also reaches this branch

Unit conversion support:

  • Imperial (lb, in) automatically converts to metric
  • Validation prevents invalid measurements
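
A minimal sketch of the conversion and validation step is shown below. The function names `to_metric` and `bmi_from_metric`, the `units` parameter, and the positive-value check are assumptions for illustration; the actual validation rules are not described here. The conversion constants are the standard definitions of the pound and the inch.

```python
LB_TO_KG = 0.45359237  # international pound, exact by definition
IN_TO_CM = 2.54        # international inch, exact by definition


def to_metric(weight, height, units="metric"):
    """Return (weight_kg, height_cm), converting from imperial if requested."""
    if units == "imperial":
        weight, height = weight * LB_TO_KG, height * IN_TO_CM
    elif units != "metric":
        raise ValueError(f"unknown unit system: {units!r}")
    # Illustrative validation only: reject non-positive measurements
    if weight <= 0 or height <= 0:
        raise ValueError("weight and height must be positive")
    return weight, height


def bmi_from_metric(weight_kg, height_cm):
    """BMI = weight in kg divided by height in metres squared."""
    height_m = height_cm / 100
    return weight_kg / (height_m ** 2)


weight_kg, height_cm = to_metric(154, 70, units="imperial")
print(round(bmi_from_metric(weight_kg, height_cm), 1))
```

Converting once at the boundary means the BMI categories only ever see metric values.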

5. Clinical Indicators

Assessed factors:

  • Blood pressure
  • Cholesterol levels
  • Blood glucose
  • Other lab values

Integration: Clinical data enhances risk prediction when available

6. Family History

Genetic risk modifiers:

  • First-degree relatives: Higher weight
  • Multiple affected relatives: Multiplicative risk
  • Age of onset in relatives: Earlier onset = higher risk


Age-Specific Risk Adjustment

Risk impact varies by age:

age_factor = lookup_age_factor(risk_type, age_days)
adjusted_risk = base_risk * age_factor

Age Factor Table (Smoking Example)

Age Range   Age Factor   Rationale
18-29       0.8          Lower cumulative exposure
30-49       1.0          Standard risk
50-64       1.2          Accumulated damage
65+         1.4          Most vulnerable population

Database-driven: All age factors are stored in the risk_lookup table, so they can be updated without code changes
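
To illustrate the lookup, the sketch below resolves a smoking age factor from an in-memory copy of the table above. It is a stand-in for the database query, not the real one: the names `SMOKING_AGE_BANDS` and `smoking_age_factor` are hypothetical, and the conversion uses 365 days per year to match the document's own example (10950 days = 30 years).

```python
import bisect

# Lower bound of each age band (years) and its factor, from the smoking table above
SMOKING_AGE_BANDS = [(18, 0.8), (30, 1.0), (50, 1.2), (65, 1.4)]


def smoking_age_factor(age_days):
    """Return the smoking age factor for an age given in days."""
    age_years = age_days // 365  # assumption: 365-day years, as in the 10950-day example
    lower_bounds = [lower for lower, _ in SMOKING_AGE_BANDS]
    # Index of the last band whose lower bound is <= age_years
    idx = bisect.bisect_right(lower_bounds, age_years) - 1
    if idx < 0:
        raise ValueError("the table has no band for ages under 18")
    return SMOKING_AGE_BANDS[idx][1]


base_risk = 0.8  # illustrative base risk for a current smoker
print(smoking_age_factor(10950), round(base_risk * smoking_age_factor(10950), 2))
```

With the 30-year-old from the input example, the factor is 1.0, so the adjusted risk equals the base risk.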


Risk Score Calculation

Step-by-Step Process

  1. Detect Active Risks

    detected_risks = []
    for domain in risk_domains:
        risk_value = calculate_domain_risk(domain, survey_data)
        if risk_value > 0:
            detected_risks.append((domain, risk_value))

  2. Apply Age Adjustment

    adjusted_risks = []
    for risk_key, risk_value in detected_risks:
        lookup = get_risk_lookup(risk_key, age_days)
        base_factor = lookup.risk_factor
        age_factor = lookup.age_factor
        # Collect each adjusted value so that step 3 can sum them
        adjusted_risks.append(base_factor * age_factor * risk_value)

  3. Aggregate Total Risk

    total_risk = sum(adjusted_risks)

Mathematical Formula

\[ \text{Total Risk Score} = \sum_{i=1}^{n} R_i \cdot A_i \cdot V_i \]

Where:

  • \(R_i\) = Base risk factor for domain \(i\)
  • \(A_i\) = Age adjustment factor
  • \(V_i\) = Individual’s risk value (0-1)
  • \(n\) = Number of detected risks
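
A short numeric example may make the formula concrete. All values below are invented for illustration (the base risk factors in particular are not taken from the risk_lookup table), and `total_risk_score` is a hypothetical helper name.

```python
def total_risk_score(detected):
    """Sum R_i * A_i * V_i over the detected risks."""
    return sum(base * age * value for _, base, age, value in detected)


# (domain, R_i base risk factor, A_i age factor, V_i risk value); illustrative numbers
detected = [
    ("smoke", 1.0, 1.2, 0.8),    # contributes 1.0 * 1.2 * 0.8 = 0.96
    ("alcohol", 0.5, 1.0, 0.3),  # contributes 0.5 * 1.0 * 0.3 = 0.15
]

print(round(total_risk_score(detected), 2))
```

Because the score is a plain sum over detected domains, it is not bounded by 1: each additional detected risk adds its own term.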


Validation

Clinical Validation

Compared against:

  • Framingham Risk Score: 0.87 correlation for cardiovascular risk
  • ASCVD Calculator: 0.82 agreement on high-risk classification
  • Clinical outcomes: Predictive of 5-year adverse events (AUC 0.78)

Sensitivity Analysis

Tested robustness to:

  • Missing data
  • Self-report accuracy
  • Temporal changes

See Scoring Audit for full validation report.


Limitations

Self-report bias: Responses may underreport risky behaviors

Unmeasured factors: Cannot capture all health risks

Population validity: Based primarily on US/Western populations

Temporal changes: Single time-point assessment


Future Enhancements

  • Wearable integration: Objective activity and sleep data
  • Lab results: Direct clinical measurements
  • Genetic testing: Polygenic risk scores
  • Longitudinal tracking: Risk trajectory over time

Open Source

Implementation: github.com/preacterik/preact-health-scoring/blob/main/preact/health_scorer/v020/risk_factor.py

Contribute improvements or report issues on GitHub.


References

  1. Framingham Heart Study. Risk Assessment Tool (2019)
  2. WHO. Body Mass Index Classification (2021)
  3. NIAAA. Alcohol Use Guidelines (2020)
  4. US Preventive Services Task Force. Risk Factor Screening (2022)

Last updated: February 17, 2026