Method Alternatives

This page documents alternative extraction approaches evaluated during development. These runs were not used in the final dataset, but are retained here for transparency and comparison.

Final dataset: regex1-2025-06-19 (regex-gated mentions_cyber plus LLM labeling). See Validation for the metrics used in the published results.

Method Comparison

This comparison reports overall precision, recall, F1, and accuracy for each processing run against the same ground truth and sentence-matching cache. Accuracy is defined as TP / (TP + FP + FN). GT is the number of ground-truth positives (TP + FN); AI is the number of extracted positives (TP + FP).

| Run | Precision | Recall | F1 | Accuracy | TP | FP | FN | GT | AI |
|---|---|---|---|---|---|---|---|---|---|
| regex_test | 64.5% | 85.3% | 73.5% | 58.1% | 151 | 83 | 26 | 177 | 234 |
| regex1-2025-06-19 | 64.4% | 84.7% | 73.2% | 57.7% | 150 | 83 | 27 | 177 | 233 |
| analysis_openai | 22.6% | 59.9% | 32.8% | 19.6% | 106 | 364 | 71 | 177 | 470 |
| analysis_gemini | 12.8% | 74.6% | 21.8% | 12.3% | 132 | 900 | 45 | 177 | 1,032 |
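The metric definitions above can be sketched in a few lines. This is an illustrative reimplementation from the stated formulas, not the repository's actual evaluation code; it reproduces the regex_test row from its TP/FP/FN counts.

```python
# Metric definitions used in the tables (illustrative sketch; the
# actual evaluation code may differ in edge-case handling).
def metrics(tp: int, fp: int, fn: int) -> dict:
    """Precision, recall, F1, and accuracy = TP / (TP + FP + FN)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}

# regex_test row: TP=151, FP=83, FN=26.
m = metrics(151, 83, 26)
print({k: f"{v:.1%}" for k, v in m.items()})
# → {'precision': '64.5%', 'recall': '85.3%', 'f1': '73.5%', 'accuracy': '58.1%'}
```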

Method Comparison by Tag

Per-tag precision, recall, F1, and accuracy for each run, limited to the four core labels (accuracy = TP / (TP + FP + FN)). Note that the specificity label has no ground-truth positives in this sample (GT = 0), so all of its metrics are zero and its AI column counts only false positives.

| Run | Tag | Precision | Recall | F1 | Accuracy | TP | FP | FN | GT | AI |
|---|---|---|---|---|---|---|---|---|---|---|
| regex_test | mentions_cyber | 64.8% | 85.8% | 73.8% | 58.5% | 151 | 82 | 25 | 176 | 233 |
| regex_test | mentions_board | 53.8% | 60.9% | 57.1% | 40.0% | 14 | 12 | 9 | 23 | 26 |
| regex_test | regulatory_reference | 62.5% | 50.0% | 55.6% | 38.5% | 5 | 3 | 5 | 10 | 8 |
| regex_test | specificity | 0.0% | 0.0% | 0.0% | 0.0% | 0 | 1 | 0 | 0 | 1 |
| regex1-2025-06-19 | mentions_cyber | 64.2% | 84.7% | 73.0% | 57.5% | 149 | 83 | 27 | 176 | 232 |
| regex1-2025-06-19 | mentions_board | 47.2% | 73.9% | 57.6% | 40.5% | 17 | 19 | 6 | 23 | 36 |
| regex1-2025-06-19 | regulatory_reference | 71.4% | 50.0% | 58.8% | 41.7% | 5 | 2 | 5 | 10 | 7 |
| regex1-2025-06-19 | specificity | 0.0% | 0.0% | 0.0% | 0.0% | 0 | 29 | 0 | 0 | 29 |
| analysis_openai | mentions_cyber | 25.2% | 59.7% | 35.4% | 21.5% | 105 | 312 | 71 | 176 | 417 |
| analysis_openai | mentions_board | 20.6% | 56.5% | 30.2% | 17.8% | 13 | 50 | 10 | 23 | 63 |
| analysis_openai | regulatory_reference | 16.7% | 50.0% | 25.0% | 14.3% | 5 | 25 | 5 | 10 | 30 |
| analysis_openai | specificity | 0.0% | 0.0% | 0.0% | 0.0% | 0 | 173 | 0 | 0 | 173 |
| analysis_gemini | mentions_cyber | 12.8% | 75.0% | 21.9% | 12.3% | 132 | 897 | 44 | 176 | 1,029 |
| analysis_gemini | mentions_board | 12.5% | 87.0% | 21.9% | 12.3% | 20 | 140 | 3 | 23 | 160 |
| analysis_gemini | regulatory_reference | 8.0% | 80.0% | 14.5% | 7.8% | 8 | 92 | 2 | 10 | 100 |
| analysis_gemini | specificity | 0.0% | 0.0% | 0.0% | 0.0% | 0 | 634 | 0 | 0 | 634 |
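The per-tag counts above follow from simple set operations over matched sentence/label pairs: TP is the intersection of ground-truth and extracted pairs for a tag, and the GT and AI columns are derived totals (GT = TP + FN, AI = TP + FP). A minimal sketch, with hypothetical data structures (the actual pipeline matches sentences through its cache, which this does not model):

```python
# Per-tag confusion counts from (sentence_id, tag) pairs.
# Illustrative only; names and data layout are assumptions.
def tag_counts(gt_pairs: set, ai_pairs: set, tag: str) -> dict:
    gt = {s for s, t in gt_pairs if t == tag}   # ground-truth sentences for this tag
    ai = {s for s, t in ai_pairs if t == tag}   # extracted sentences for this tag
    tp = len(gt & ai)   # matched in both
    fp = len(ai - gt)   # extracted but not in ground truth
    fn = len(gt - ai)   # in ground truth but missed
    return {"TP": tp, "FP": fp, "FN": fn,
            "GT": tp + fn,   # total ground-truth positives
            "AI": tp + fp}   # total extracted positives

gt_pairs = {(1, "mentions_cyber"), (2, "mentions_cyber"), (3, "mentions_board")}
ai_pairs = {(1, "mentions_cyber"), (4, "mentions_cyber"), (3, "mentions_board")}
print(tag_counts(gt_pairs, ai_pairs, "mentions_cyber"))
# → {'TP': 1, 'FP': 1, 'FN': 1, 'GT': 2, 'AI': 2}
```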