Metrics Definition
Use these definitions consistently across all reports.
1. Core Error Metrics
Let:
- pred = predicted close for target session
- actual = official actual close for that session
Compute:
- Absolute Error (AE): |pred - actual|
- Absolute Percentage Error (APE): |pred - actual| / actual * 100%
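The two formulas above can be sketched in Python as follows. This is a minimal illustration, not a prescribed implementation; the function names are hypothetical, and rejecting a non-positive actual close is an assumption added here to avoid dividing by zero.

```python
def absolute_error(pred: float, actual: float) -> float:
    """AE = |pred - actual|, in price units."""
    return abs(pred - actual)


def absolute_percentage_error(pred: float, actual: float) -> float:
    """APE = |pred - actual| / actual * 100, expressed in percent."""
    if actual <= 0:
        # Assumption (not in the definitions): a non-positive close is invalid.
        raise ValueError("actual close must be positive")
    return abs(pred - actual) / actual * 100.0


# Example: predicted close 101.0 against an actual close of 100.0.
ae = absolute_error(101.0, 100.0)
ape = absolute_percentage_error(101.0, 100.0)
print(f"AE={ae:.2f}, APE={ape:.2f}%")  # AE=1.00, APE=1.00%
```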
2. Hit Criteria
Report two hit criteria in parallel:
- Strict hit: APE <= 1%
- Loose hit: APE <= 2%
These thresholds are the default correctness criteria for predicted close price.
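As a rough sketch, both criteria can be evaluated together from a single APE value. The constant and function names below are illustrative assumptions; only the 1% and 2% thresholds come from the definitions above.

```python
STRICT_THRESHOLD_PCT = 1.0  # strict hit: APE <= 1%
LOOSE_THRESHOLD_PCT = 2.0   # loose hit: APE <= 2%


def classify_hit(ape_percent: float) -> dict:
    """Return strict and loose hit flags for one forecast's APE (in percent)."""
    return {
        "strict": ape_percent <= STRICT_THRESHOLD_PCT,
        "loose": ape_percent <= LOOSE_THRESHOLD_PCT,
    }


print(classify_hit(0.8))  # {'strict': True, 'loose': True}
print(classify_hit(1.5))  # {'strict': False, 'loose': True}
print(classify_hit(2.5))  # {'strict': False, 'loose': False}
```

Every strict hit is also a loose hit, so strict accuracy can never exceed loose accuracy over the same window.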
3. Rolling Accuracy Windows
For each window W (1d, 3d, 7d, 30d, custom):
strict_accuracy_W = strict_hits_W / n_W
loose_accuracy_W = loose_hits_W / n_W
where n_W is the number of valid forecast/actual pairs in that window.
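One possible way to compute both accuracies for a window, assuming the caller has already collected the APE values (in percent) of the valid forecast/actual pairs in that window. The function name and the returned dictionary layout are hypothetical.

```python
from typing import Sequence


def window_accuracy(ape_values: Sequence[float]) -> dict:
    """Strict and loose accuracy over the valid pairs of one window."""
    n_w = len(ape_values)
    if n_w == 0:
        # No valid pairs: report nothing rather than dividing by zero.
        return {"n": 0, "strict_accuracy": None, "loose_accuracy": None}
    strict_hits = sum(1 for ape in ape_values if ape <= 1.0)
    loose_hits = sum(1 for ape in ape_values if ape <= 2.0)
    return {
        "n": n_w,
        "strict_accuracy": strict_hits / n_w,
        "loose_accuracy": loose_hits / n_w,
    }


# Four valid pairs: two strict hits (0.5, 0.9) and three loose hits.
print(window_accuracy([0.5, 1.5, 3.0, 0.9]))
# {'n': 4, 'strict_accuracy': 0.5, 'loose_accuracy': 0.75}
```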
4. Optional Direction Accuracy
Let direction be the sign of the close-to-close return.
- A direction hit occurs when the predicted direction equals the realized direction.
direction_accuracy_W = direction_hits_W / n_W
Use only when direction labels are explicitly available.
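A minimal sketch of the direction metric, assuming direction labels are encoded as -1 (down), 0 (flat), and +1 (up) and are supplied explicitly as (predicted, realized) pairs. The encoding, the `sign` helper, and the function names are assumptions made for illustration.

```python
from typing import Optional, Sequence, Tuple


def sign(value: float) -> int:
    """Return -1, 0, or +1 according to the sign of a return."""
    return (value > 0) - (value < 0)


def direction_accuracy(pairs: Sequence[Tuple[int, int]]) -> Optional[float]:
    """direction_hits_W / n_W over (predicted, realized) direction labels."""
    n_w = len(pairs)
    if n_w == 0:
        return None  # no labelled pairs in the window
    hits = sum(1 for predicted, realized in pairs if predicted == realized)
    return hits / n_w


# A realized label can be derived from two consecutive closes.
realized = sign(102.0 - 100.0)  # +1, the close rose
print(direction_accuracy([(1, realized), (-1, 1), (0, 0), (1, -1)]))  # 0.5
```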
5. Forecast Correctness Score (Optional)
For a single forecast, you may map APE to a score:
correctness_score = max(0, 100 - 50 * APE_percent)
Examples:
- APE = 0.8% -> score 60
- APE = 1.5% -> score 25
- APE >= 2.0% -> score 0 (or near 0)
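The mapping can be checked with a few lines of Python. This is only a sketch of the formula as written, with a hypothetical function name; note that the `max` clamps the score at exactly 0 once APE reaches 2.0%.

```python
def correctness_score(ape_percent: float) -> float:
    """Map APE (in percent) to a 0-100 score: max(0, 100 - 50 * APE)."""
    return max(0.0, 100.0 - 50.0 * ape_percent)


for ape in (0.0, 0.8, 1.5, 2.0, 3.0):
    print(f"APE={ape:.1f}% -> score {correctness_score(ape):.0f}")
```

The score falls by 50 points per percentage point of APE, so the loose-hit threshold of 2% coincides with a score of 0.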
6. Sample Size and Insufficient Data Rules
- Never pad missing samples.
- If n_W = 0, output N/A for the window.
- If 0 < n_W < target_window_size, output partial result and annotate as partial.
- Always display n_W beside each window metric.
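These rules could be applied when formatting a window metric, for example as below. The function name, the output layout, and the `[partial]` marker are illustrative assumptions; only the N/A, partial-annotation, and always-show-n_W rules come from the list above.

```python
def format_window_metric(hits: int, n_w: int, target_window_size: int) -> str:
    """Render one window's accuracy with its sample size, without padding."""
    if n_w == 0:
        return "N/A (n=0)"  # no valid pairs: never invent a value
    label = f"{hits / n_w:.1%} (n={n_w})"
    if n_w < target_window_size:
        label += " [partial]"  # fewer valid pairs than the window expects
    return label


print(format_window_metric(0, 0, 7))  # N/A (n=0)
print(format_window_metric(3, 5, 7))  # 60.0% (n=5) [partial]
print(format_window_metric(6, 7, 7))  # 85.7% (n=7)
```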
7. Adjustment and Comparability Rules
- Prefer adjusted price series when corporate actions materially affect comparability.
- If non-adjusted close is used, state it explicitly.
- Keep forecast and actual on the same price basis.
8. Improvement Trend Metrics
Track whether forecast quality is improving over time:
- delta_APE_7d_vs_prev7d
  - Difference between the current 7-day average APE and the previous 7-day average APE.
- delta_strict_hit_rate_7d
  - Change in strict hit rate versus the previous 7-day block.
- trend_label
  - improving, stable, or degrading, based on combined delta signals.
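The sketch below shows one way these three values might be derived from two consecutive 7-day blocks of APE values. The definitions do not specify how the delta signals are combined, so the voting rule and the tolerances `ape_tol` and `hit_tol` are purely hypothetical choices, as is the function name.

```python
from statistics import mean

STRICT_THRESHOLD_PCT = 1.0  # strict hit: APE <= 1%


def trend_metrics(current_ape, previous_ape, ape_tol=0.1, hit_tol=0.05):
    """Compare two non-empty blocks of APE values (in percent)."""
    if not current_ape or not previous_ape:
        raise ValueError("both blocks need at least one valid pair")

    def strict_rate(block):
        return sum(1 for ape in block if ape <= STRICT_THRESHOLD_PCT) / len(block)

    delta_ape = mean(current_ape) - mean(previous_ape)
    delta_hit = strict_rate(current_ape) - strict_rate(previous_ape)

    # Hypothetical rule: each delta votes +1 (better), -1 (worse), or 0 (flat).
    ape_vote = 1 if delta_ape < -ape_tol else -1 if delta_ape > ape_tol else 0
    hit_vote = 1 if delta_hit > hit_tol else -1 if delta_hit < -hit_tol else 0
    total = ape_vote + hit_vote
    label = "improving" if total > 0 else "degrading" if total < 0 else "stable"

    return {
        "delta_APE_7d_vs_prev7d": delta_ape,
        "delta_strict_hit_rate_7d": delta_hit,
        "trend_label": label,
    }


result = trend_metrics([0.5, 0.8, 1.2], [1.5, 2.0, 2.5])
print(result["trend_label"])  # improving
```

A negative delta_APE_7d_vs_prev7d means the average error fell, which counts as an improvement; a positive delta_strict_hit_rate_7d means more forecasts landed within 1%.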
9. Reporting Format (Minimum)
Every report should include:
- Prior-session review row:
  - prev_pred_close_t1, prev_actual_close_t1, AE, APE, strict/loose hit status
- Rolling table with at least:
  - 1d, 3d, 7d, 30d, optional custom
  - strict accuracy, loose accuracy, optional direction accuracy
  - sample size n
- One-line interpretation:
  - whether model performance is improving, stable, or degrading
- Improvement block:
  - what changed from review
  - what will be adjusted in next run
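As an illustration of the prior-session review row, the fields could be assembled like this. The function name, the dictionary layout, and the `strict_hit` / `loose_hit` keys are assumptions; the field names prev_pred_close_t1 and prev_actual_close_t1 and the 1% and 2% thresholds follow the definitions in this document.

```python
def prior_session_row(prev_pred_close_t1: float, prev_actual_close_t1: float) -> dict:
    """Build the review row for the previous session's forecast."""
    if prev_actual_close_t1 <= 0:
        # Assumption: a non-positive actual close is treated as invalid.
        raise ValueError("actual close must be positive")
    ae = abs(prev_pred_close_t1 - prev_actual_close_t1)
    ape = ae / prev_actual_close_t1 * 100.0
    return {
        "prev_pred_close_t1": prev_pred_close_t1,
        "prev_actual_close_t1": prev_actual_close_t1,
        "AE": round(ae, 4),
        "APE": round(ape, 4),
        "strict_hit": ape <= 1.0,  # thresholds applied to the unrounded APE
        "loose_hit": ape <= 2.0,
    }


row = prior_session_row(101.5, 100.0)
print(row["AE"], row["APE"], row["strict_hit"], row["loose_hit"])
# 1.5 1.5 False True
```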