Method Alternatives
This page documents alternative extraction approaches evaluated during development. These runs were not used in the final dataset, but are retained here for transparency and comparison.
Final dataset: regex1-2025-06-19 (regex-gated mentions_cyber plus LLM labeling). See Validation for the metrics used in the published results.
Method Comparison
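This comparison reports overall precision, recall, F1, and accuracy for each processing mode/run, scored against the same ground truth and sentence-matching cache. Accuracy is computed as TP / (TP + FP + FN), so true negatives are not counted. GT is the ground-truth count (TP + FN) and AI is the count produced by the run (TP + FP).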
| Run | Precision | Recall | F1 | Accuracy | TP | FP | FN | GT | AI |
|---|---|---|---|---|---|---|---|---|---|
| regex_test | 64.5% | 85.3% | 73.5% | 58.1% | 151 | 83 | 26 | 177 | 234 |
| regex1-2025-06-19 | 64.4% | 84.7% | 73.2% | 57.7% | 150 | 83 | 27 | 177 | 233 |
| analysis_openai | 22.6% | 59.9% | 32.8% | 19.6% | 106 | 364 | 71 | 177 | 470 |
| analysis_gemini | 12.8% | 74.6% | 21.8% | 12.3% | 132 | 900 | 45 | 177 | 1,032 |
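The columns above can be reproduced from the TP, FP, and FN counts alone. The following is a minimal sketch, not the project's actual scoring code: the function name `score` and the choice to return 0.0 when a denominator is zero are assumptions made for illustration.

```python
def score(tp: int, fp: int, fn: int) -> dict[str, float]:
    """Compute the table's metrics as fractions from raw counts.

    Accuracy follows the definition used on this page:
    TP / (TP + FP + FN), with no true-negative term.
    """

    def ratio(numerator: int, denominator: int) -> float:
        # Assumption: report 0.0 when the denominator is zero
        # (e.g. a tag with no ground-truth items and no hits).
        return numerator / denominator if denominator else 0.0

    return {
        "precision": ratio(tp, tp + fp),        # TP / AI
        "recall": ratio(tp, tp + fn),           # TP / GT
        "f1": ratio(2 * tp, 2 * tp + fp + fn),  # harmonic mean of the two above
        "accuracy": ratio(tp, tp + fp + fn),
    }


# Counts taken from the regex1-2025-06-19 row.
row = score(tp=150, fp=83, fn=27)
print({name: f"{value:.1%}" for name, value in row.items()})
# {'precision': '64.4%', 'recall': '84.7%', 'f1': '73.2%', 'accuracy': '57.7%'}
```

Writing F1 as 2·TP / (2·TP + FP + FN) is algebraically the same as the harmonic mean of precision and recall, and it avoids a second division-by-zero case when both are zero.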
Method Comparison by Tag
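Per-tag precision, recall, F1, and accuracy for each run, limited to the four core labels (accuracy = TP / (TP + FP + FN)). The specificity tag has no ground-truth instances (GT = 0), so every prediction for it counts as a false positive and its metrics are reported as 0.0%.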
| Run | Tag | Precision | Recall | F1 | Accuracy | TP | FP | FN | GT | AI |
|---|---|---|---|---|---|---|---|---|---|---|
| regex_test | mentions_cyber | 64.8% | 85.8% | 73.8% | 58.5% | 151 | 82 | 25 | 176 | 233 |
| regex_test | mentions_board | 53.8% | 60.9% | 57.1% | 40.0% | 14 | 12 | 9 | 23 | 26 |
| regex_test | regulatory_reference | 62.5% | 50.0% | 55.6% | 38.5% | 5 | 3 | 5 | 10 | 8 |
| regex_test | specificity | 0.0% | 0.0% | 0.0% | 0.0% | 0 | 1 | 0 | 0 | 1 |
| regex1-2025-06-19 | mentions_cyber | 64.2% | 84.7% | 73.0% | 57.5% | 149 | 83 | 27 | 176 | 232 |
| regex1-2025-06-19 | mentions_board | 47.2% | 73.9% | 57.6% | 40.5% | 17 | 19 | 6 | 23 | 36 |
| regex1-2025-06-19 | regulatory_reference | 71.4% | 50.0% | 58.8% | 41.7% | 5 | 2 | 5 | 10 | 7 |
| regex1-2025-06-19 | specificity | 0.0% | 0.0% | 0.0% | 0.0% | 0 | 29 | 0 | 0 | 29 |
| analysis_openai | mentions_cyber | 25.2% | 59.7% | 35.4% | 21.5% | 105 | 312 | 71 | 176 | 417 |
| analysis_openai | mentions_board | 20.6% | 56.5% | 30.2% | 17.8% | 13 | 50 | 10 | 23 | 63 |
| analysis_openai | regulatory_reference | 16.7% | 50.0% | 25.0% | 14.3% | 5 | 25 | 5 | 10 | 30 |
| analysis_openai | specificity | 0.0% | 0.0% | 0.0% | 0.0% | 0 | 173 | 0 | 0 | 173 |
| analysis_gemini | mentions_cyber | 12.8% | 75.0% | 21.9% | 12.3% | 132 | 897 | 44 | 176 | 1,029 |
| analysis_gemini | mentions_board | 12.5% | 87.0% | 21.9% | 12.3% | 20 | 140 | 3 | 23 | 160 |
| analysis_gemini | regulatory_reference | 8.0% | 80.0% | 14.5% | 7.8% | 8 | 92 | 2 | 10 | 100 |
| analysis_gemini | specificity | 0.0% | 0.0% | 0.0% | 0.0% | 0 | 634 | 0 | 0 | 634 |