Table 2 Results for YOLO models trained with data annotated by humans

From: Toward a generalizable machine learning workflow for neurodegenerative disease staging with focus on neurofibrillary tangles

| Annotator | Pre-NFT F1 (Val) | Pre-NFT F1 (Test) | Pre-NFT F1 (Emory-Holdout) | iNFT F1 (Val) | iNFT F1 (Test) | iNFT F1 (Emory-Holdout) | Macro F1 (Val) | Macro F1 (Test) | Macro F1 (Emory-Holdout) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Novice 1 | 0.49 ± 0.12 | 0.63 ± 0.10 | 0.20 ± 0.09 | 0.76 ± 0.02 | 0.76 ± 0.02 | 0.59 ± 0.05 | 0.63 ± 0.07 | 0.70 ± 0.06 | 0.39 ± 0.02 |
| Novice 2 | 0.44 ± 0.04 | 0.39 ± 0.06 | 0.21 ± 0.08 | 0.80 ± 0.01 | 0.73 ± 0.02 | 0.51 ± 0.05 | 0.62 ± 0.02 | 0.56 ± 0.02 | 0.36 ± 0.06 |
| Novice 3 | 0.67 ± 0.08 | 0.65 ± 0.03 | 0.13 ± 0.02 | 0.65 ± 0.05 | 0.74 ± 0.01 | 0.59 ± 0.01 | 0.66 ± 0.06 | 0.70 ± 0.02 | 0.36 ± 0.01 |
| Expert 1 | 0.36 ± 0.08 | 0.55 ± 0.05 | 0.21 ± 0.04 | 0.75 ± 0.06 | 0.75 ± 0.00 | 0.45 ± 0.06 | 0.55 ± 0.06 | 0.65 ± 0.03 | 0.33 ± 0.05 |
| Expert 2 | 0.41 ± 0.24 | 0.26 ± 0.13 | 0.10 ± 0.01 | 0.64 ± 0.11 | 0.46 ± 0.03 | 0.39 ± 0.10 | 0.53 ± 0.09 | 0.36 ± 0.07 | 0.25 ± 0.05 |
| Expert 3 | 0.40 ± 0.24 | 0.54 ± 0.04 | 0.31 ± 0.08 | 0.80 ± 0.05 | 0.75 ± 0.03 | 0.66 ± 0.03 | 0.60 ± 0.11 | 0.65 ± 0.03 | 0.48 ± 0.05 |
| Expert 4 | 0.49 ± 0.21 | 0.40 ± 0.02 | 0.29 ± 0.07 | 0.79 ± 0.01 | 0.71 ± 0.02 | 0.59 ± 0.07 | 0.64 ± 0.10 | 0.55 ± 0.01 | 0.44 ± 0.06 |
| Expert 5 | 0.47 ± 0.02 | 0.43 ± 0.06 | 0.17 ± 0.04 | 0.67 ± 0.04 | 0.65 ± 0.01 | 0.38 ± 0.04 | 0.57 ± 0.03 | 0.54 ± 0.03 | 0.28 ± 0.03 |

The Emory-Holdout 28-ROI dataset is the consensus-annotated dataset from a hold-out Emory cohort. The Val (validation) and Test datasets were annotated by the specific annotator and reflect how well the models learned that annotator's nuances. All values are means over the three-fold cross-validation models for each annotator, shown with standard deviations (mean ± SD).
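
The relationship between the per-class and Macro F1 columns can be checked with a short sketch. It assumes that Macro F1 is the unweighted mean of the Pre-NFT and iNFT F1 scores, which the reported values are consistent with to within rounding but which the table itself does not state. The helper names `macro_f1` and `summarize_folds` are hypothetical, the per-fold scores are invented for illustration, and the use of the sample standard deviation is also an assumption.

```python
from statistics import mean, stdev


def macro_f1(per_class_f1):
    """Unweighted mean of per-class F1 scores (assumed definition)."""
    return mean(per_class_f1)


def summarize_folds(fold_scores):
    """Return (mean, sample standard deviation) across cross-validation folds."""
    return mean(fold_scores), stdev(fold_scores)


# Reported means from the Novice 1 row: (Pre-NFT F1, iNFT F1, Macro F1).
novice_1 = {
    "Val": (0.49, 0.76, 0.63),
    "Test": (0.63, 0.76, 0.70),
    "Emory-Holdout": (0.20, 0.59, 0.39),
}

for split, (pre_nft, inft, reported_macro) in novice_1.items():
    computed = macro_f1([pre_nft, inft])
    # The inputs are already rounded to two decimals, so allow a small tolerance.
    assert abs(computed - reported_macro) <= 0.011
    print(f"{split}: computed {computed:.3f}, reported {reported_macro:.2f}")

# Hypothetical per-fold macro F1 values (NOT from the paper), for illustration only.
hypothetical_folds = [0.60, 0.70, 0.59]
fold_mean, fold_sd = summarize_folds(hypothetical_folds)
print(f"{fold_mean:.2f} ± {fold_sd:.2f}")
```

Because the mean is linear, averaging the per-fold Macro F1 scores gives the same result as taking the macro average of the per-class fold means, so the check above applies to the reported means directly. The standard deviations do not combine this way and cannot be recovered from the per-class columns.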