This document contains some extra bits of information to supplement the poster, as well as figures that just weren’t good enough to make it onto the final version of the poster.
Download zip files from the Wellcome Osteological Research Database (website currently not accessible).
Extract all files.
unzip"*.zip"
Rename files to .csv
for file in$(ls .)domv$file$file".csv"done
Importing data to R
Import all datasets containing relevant dental data into R using DuckDB.
con <-dbConnect(duckdb::duckdb(), dbdir =":memory:")dental_inventory <-dbGetQuery(con, "SELECT * FROM read_csv('<path/to/folder>/*bones_present.lst.csv', delim = '|')")dental_pathology <-dbGetQuery(con, "SELECT * FROM read_csv('<path/to/folder>/*dental_path.lst.csv', delim = '|')")
Total sample size: 1650
Filtering dataset
Filtering the dataset to include adult individuals with permanent dentition only. It was also necessary to combine SITECODE (the code for the archaeological site) and CONTEXT (the unique identifier for individuals within each site), as CONTEXT was not unique for individuals across all sites. This was done by combining SITECODE and CONTEXT: ID = SITECODE_CONTEXT.
Still not ideal sample sizes, but definitely an improvement over samples with M3s.
Demographics
Sex and age distributions included for context. Given the low samples size of the LEH-positive sample, there was no sense in doing any analysis on sex and age.
Table 2: Demographics for the LEH-positive sample (n = 16).
SEX
AGE
n
FEMALE
ADULT 26-35 YEARS
1
INTERMEDIATE
ADULT 26-35 YEARS
1
MALE
ADULT 18-25 YEARS
4
MALE
ADULT 26-35 YEARS
3
MALE
ADULT 36-45 YEARS
3
MALE
ADULT >46 YEARS
1
MALE?
ADULT 26-35 YEARS
1
MALE?
ADULT 36-45 YEARS
2
Figures, in case you don’t like tables.
Code
# needs to be proportions?hypoplasia_long |>left_join(demography) |>distinct(ID, .keep_all = T) |>mutate(leh_present =as_factor(leh_present)) |>ggplot(aes(x = SEX)) +geom_bar(position ="dodge") +# overlay lesions present samplegeom_bar(data =distinct(left_join(hypoplasia_present_long, demography), ID, .keep_all = T), aes(x = SEX), fill ="blue") +coord_cartesian(ylim =c(0,400)) # cut y-axis off at 400
Code
# needs to be proportions?hypoplasia_long |>left_join(demography) |>distinct(ID, .keep_all = T) |>mutate(leh_present =as_factor(leh_present)) |>ggplot(aes(x = AGE)) +geom_bar(position ="dodge") +# overlay lesions present samplegeom_bar(data =distinct(left_join(hypoplasia_present_long, demography), ID, .keep_all = T), aes(x = AGE), fill ="blue") +coord_cartesian(ylim =c(0,400)) # cut y-axis off at 400
Key Results
Most of the analysis was done on the 16 individuals that have complete LEH scores in all teeth (except M3s), AND have at least one LEH lesion present (LEH-positive). The nine other individuals with complete scores but without any lesions are not useful in this analysis.
(Missing) Data patterns
Adults with permanent dentition (excl. M3s) that were scored for LEH.
Code
nrow(distinct(hypoplasia_data, ID)) # number of individuals
[1] 1418
Code
na_pattern <-md.pattern(hypoplasia_data[,-1], plot = F)nrow(na_pattern) -1# number of patterns
[1] 1248
Adults with a complete permanent dentition missing that were scored for LEH.
Code
# can_scorenrow(distinct(hypoplasia_complete_wide, ID)) # number of individuals
[1] 711
Code
leh_na_pattern <-md.pattern(hypoplasia_complete_wide[,-1], plot = F)nrow(leh_na_pattern) -1# number of patterns
[1] 573
Before plotting missing data per individual, individuals with all teeth present and endentulous individuals were removed since they are not interesting (the former have no missing data and the later have only missing data). Then individuals were arranged by number of teeth,
Code
# number missing teeth per individual (to order heatmaps)missing_ordered <- dental_inventory_long |>#mutate(presence = presence) |>group_by(ID) |>filter(type !="m3") |>summarise(n_teeth =sum(as.numeric(presence))) |>filter( n_teeth !=28& n_teeth !=0# endentulous and full dentitions are not informative ) |>arrange(desc(n_teeth))dental_inventory_long <- dental_inventory_long |>mutate(presence =as_factor(presence))
and plotted, with maxillary and mandibular teeth displayed in separate plots.
Code
max_inventory_heat <- dental_inventory_long |>filter( region =="maxilla", ID %in% missing_ordered$ID ) |>mutate(ID =factor(ID, levels = missing_ordered$ID)) |>ggplot(aes(x = tooth, y = ID, fill = presence)) +geom_tile() +scale_x_discrete(position ="top", guide =guide_axis(angle =45)) +theme(#axis.text.x.top = element_text(angle = 45, hjust = 0, vjust = 0),axis.text.y =element_blank(),axis.title.x =element_blank() )man_inventory_heat <- dental_inventory_long |>filter( region =="mandible", ID %in% missing_ordered$ID ) |>mutate(ID =factor(ID, levels = missing_ordered$ID)) |>ggplot(aes(x = tooth, y = ID, fill = presence)) +geom_tile() +scale_x_discrete(position ="bottom", guide =guide_axis(angle =45)) +theme(#axis.text.x = element_text(angle = 45, hjust = 1, vjust = 1),axis.text.y =element_blank() )(max_inventory_heat / man_inventory_heat) +plot_layout(guides ="collect", axis_titles ="collect") &labs(x ="Tooth", y ="Individuals", fill ="Presence") &scale_fill_viridis_d(option ="viridis", begin =0.1)
The same was done for hypoplasia scores. If a tooth could be scored, it received a score of 1, if not, it received a score of 0.
Code
scoreable_ordered <- hypoplasia_complete_long |>group_by(ID) |>filter(type !="m3") |>summarise(n_teeth =sum(as.numeric(can_score))) |>filter( n_teeth !=28& n_teeth !=0# endentulous and full dentitions are not informative ) |>arrange(desc(n_teeth))
Then separate heatmaps were made for maxillary and mandibular teeth.
Code
max_scoreable_heat <- hypoplasia_complete_long |>filter( region =="maxilla", ID %in% scoreable_ordered$ID ) |>mutate(ID =factor(ID, levels = scoreable_ordered$ID)) |>ggplot(aes(x = tooth, y = ID, fill =as_factor(can_score))) +geom_tile() +scale_x_discrete(position ="top", guide =guide_axis(angle =45)) +theme(#axis.text.x.top = element_text(angle = 45, hjust = 0, vjust = 0),axis.text.y =element_blank(),axis.title.x =element_blank() )man_scoreable_heat <- hypoplasia_complete_long |>filter( region =="mandible", ID %in% scoreable_ordered$ID ) |>mutate(ID =factor(ID, levels = scoreable_ordered$ID)) |>ggplot(aes(x = tooth, y = ID, fill =as_factor(can_score))) +geom_tile() +scale_x_discrete(guide =guide_axis(angle =45)) +theme(#axis.text.x.top = element_text(angle = 45, hjust = 0, vjust = 0),axis.text.y =element_blank(),#axis.title.x = element_blank() )(max_scoreable_heat / man_scoreable_heat) +plot_layout(guides ="collect", axis_titles ="collect") &labs(x ="Tooth", y ="Individuals", fill ="Scoreable") &scale_fill_viridis_d(option ="viridis", begin =0.1)
LEH lesion patterns
To explore patterns of the presence and absence of LEH using the mice package, scores of 0, i.e. absence of a lesion, were converted to NA.
This allowed us to use the missing data functions on the hypoplasia scores to explore score patterns, which was done on the LEH-positive sample (n = 16).
Number of insults per tooth, tooth type, and tooth class in the LEH-positive sample. Calculated as the number of affected teeth divided by the number of teeth present in the sample.
Table 3: Ratios of LEH per tooth, tooth type, and tooth class, respectively, in the LEH-positive sample.
(a) Ratio per tooth
tooth
n
present
prop
LLC1
16
7
0.4375
URC1
16
6
0.3750
LRC1
16
6
0.3750
ULC1
16
4
0.2500
URP2
16
3
0.1875
URM1
16
3
0.1875
ULP1
16
3
0.1875
LLI1
16
3
0.1875
LRI1
16
3
0.1875
LRP1
16
3
0.1875
URI1
16
2
0.1250
URP1
16
2
0.1250
ULI1
16
2
0.1250
ULP2
16
2
0.1250
LLP2
16
2
0.1250
LRI2
16
2
0.1250
URI2
16
1
0.0625
URM2
16
1
0.0625
ULI2
16
1
0.0625
ULM2
16
1
0.0625
LLI2
16
1
0.0625
LLP1
16
1
0.0625
LRP2
16
1
0.0625
ULM1
16
0
0.0000
LLM1
16
0
0.0000
LLM2
16
0
0.0000
LRM1
16
0
0.0000
LRM2
16
0
0.0000
(b) Ratio per tooth type
type
n
present
prop
c
64
23
0.359375
i1
64
10
0.156250
pm1
64
9
0.140625
pm2
64
8
0.125000
i2
64
5
0.078125
m1
64
3
0.046875
m2
64
2
0.031250
(c) Ratio per tooth class
class
n
present
prop
canine
64
23
0.3593750
premolar
128
17
0.1328125
incisor
128
15
0.1171875
molar
128
5
0.0390625
Code
# Convert scores of 0 to NAhypoplasia_present_na <- hypoplasia_present_wide |>mutate(across(-ID, \(x) if_else(x ==0, NA, x)))score_pairs <-md.pairs(hypoplasia_present_na[,-1])
Code
score_pairs$rr |>as_tibble(rownames ="tooth")
Table 4: Counts of teeth with an LEH ‘present’ score in the LEH-positive sample.
tooth
ULC1
ULP1
ULP2
ULM1
LRI1
LRI2
LRC1
LRP1
LRP2
LRM1
LRM2
ULI2
URP1
URP2
URM1
URM2
ULM2
LLI1
LLI2
LLC1
LLP1
LLP2
LLM1
LLM2
URC1
URI1
URI2
ULI1
ULC1
4
1
0
0
1
1
2
1
0
0
0
1
1
0
1
1
0
1
0
3
0
0
0
0
3
1
1
1
ULP1
1
3
0
0
0
0
2
2
0
0
0
1
2
1
0
0
0
0
0
2
0
0
0
0
2
1
0
0
ULP2
0
0
2
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
ULM1
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
LRI1
1
0
0
0
3
2
3
1
0
0
0
0
0
1
0
0
0
2
1
3
1
0
0
0
2
0
1
1
LRI2
1
0
0
0
2
2
2
0
0
0
0
0
0
0
0
0
0
2
1
2
0
0
0
0
1
0
1
1
LRC1
2
2
0
0
3
2
6
2
0
0
0
1
2
2
0
0
0
2
1
5
1
0
0
0
5
0
1
1
LRP1
1
2
0
0
1
0
2
3
0
0
0
1
1
1
0
0
0
0
0
2
1
0
0
0
2
1
0
0
LRP2
0
0
0
0
0
0
0
0
1
0
0
0
0
1
1
0
0
0
0
1
0
1
0
0
0
0
0
0
LRM1
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
LRM2
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
ULI2
1
1
0
0
0
0
1
1
0
0
0
1
1
0
0
0
0
0
0
1
0
0
0
0
1
0
0
0
URP1
1
2
0
0
0
0
2
1
0
0
0
1
2
1
0
0
0
0
0
2
0
0
0
0
2
0
0
0
URP2
0
1
0
0
1
0
2
1
1
0
0
0
1
3
1
0
0
0
0
3
1
1
0
0
2
0
0
0
URM1
1
0
0
0
0
0
0
0
1
0
0
0
0
1
3
1
0
0
0
1
0
1
0
0
1
1
0
1
URM2
1
0
0
0
0
0
0
0
0
0
0
0
0
0
1
1
0
0
0
0
0
0
0
0
1
1
0
1
ULM2
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
1
0
0
0
0
0
0
0
0
0
0
0
LLI1
1
0
0
0
2
2
2
0
0
0
0
0
0
0
0
0
0
3
1
2
0
0
0
0
1
0
1
1
LLI2
0
0
0
0
1
1
1
0
0
0
0
0
0
0
0
0
0
1
1
1
0
0
0
0
0
0
0
1
LLC1
3
2
0
0
3
2
5
2
1
0
0
1
2
3
1
0
0
2
1
7
1
1
0
0
4
0
1
1
LLP1
0
0
0
0
1
0
1
1
0
0
0
0
0
1
0
0
0
0
0
1
1
0
0
0
1
0
0
0
LLP2
0
0
0
0
0
0
0
0
1
0
0
0
0
1
1
0
0
0
0
1
0
2
0
0
0
0
0
0
LLM1
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
LLM2
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
URC1
3
2
0
0
2
1
5
2
0
0
0
1
2
2
1
1
0
1
0
4
1
0
0
0
6
1
1
1
URI1
1
1
0
0
0
0
0
1
0
0
0
0
0
0
1
1
0
0
0
0
0
0
0
0
1
2
0
1
URI2
1
0
0
0
1
1
1
0
0
0
0
0
0
0
0
0
0
1
0
1
0
0
0
0
1
0
1
0
ULI1
1
0
0
0
1
1
1
0
0
0
0
0
0
0
1
1
0
1
1
1
0
0
0
0
1
1
0
2
Code
score_pairs$mm |>as_tibble(rownames ="tooth")
Table 5: Counts of teeth with an LEH ‘absent’ score in the LEH-positive sample.
tooth
ULC1
ULP1
ULP2
ULM1
LRI1
LRI2
LRC1
LRP1
LRP2
LRM1
LRM2
ULI2
URP1
URP2
URM1
URM2
ULM2
LLI1
LLI2
LLC1
LLP1
LLP2
LLM1
LLM2
URC1
URI1
URI2
ULI1
ULC1
12
10
10
12
10
11
8
10
11
12
12
12
11
9
10
12
11
10
11
8
11
10
12
12
9
11
12
11
ULP1
10
13
11
13
10
11
9
12
12
13
13
13
13
11
10
12
12
10
12
8
12
11
13
13
9
12
12
11
ULP2
10
11
14
14
11
12
8
11
13
14
14
13
12
11
11
13
13
11
13
7
13
12
14
14
8
12
13
12
ULM1
12
13
14
16
13
14
10
13
15
16
16
15
14
13
13
15
15
13
15
9
15
14
16
16
10
14
15
14
LRI1
10
10
11
13
13
13
10
11
12
13
13
12
11
11
10
12
12
12
13
9
13
11
13
13
9
11
13
12
LRI2
11
11
12
14
13
14
10
11
13
14
14
13
12
11
11
13
13
13
14
9
13
12
14
14
9
12
14
13
LRC1
8
9
8
10
10
10
10
9
9
10
10
10
10
9
7
9
9
9
10
8
10
8
10
10
9
8
10
9
LRP1
10
12
11
13
11
11
9
13
12
13
13
13
12
11
10
12
12
10
12
8
13
11
13
13
9
12
12
11
LRP2
11
12
13
15
12
13
9
12
15
15
15
14
13
13
13
14
14
12
14
9
14
14
15
15
9
13
14
13
LRM1
12
13
14
16
13
14
10
13
15
16
16
15
14
13
13
15
15
13
15
9
15
14
16
16
10
14
15
14
LRM2
12
13
14
16
13
14
10
13
15
16
16
15
14
13
13
15
15
13
15
9
15
14
16
16
10
14
15
14
ULI2
12
13
13
15
12
13
10
13
14
15
15
15
14
12
12
14
14
12
14
9
14
13
15
15
10
13
14
13
URP1
11
13
12
14
11
12
10
12
13
14
14
14
14
12
11
13
13
11
13
9
13
12
14
14
10
12
13
12
URP2
9
11
11
13
11
11
9
11
13
13
13
12
12
13
11
12
12
10
12
9
13
12
13
13
9
11
12
11
URM1
10
10
11
13
10
11
7
10
13
13
13
12
11
11
13
13
12
10
12
7
12
12
13
13
8
12
12
12
URM2
12
12
13
15
12
13
9
12
14
15
15
14
13
12
13
15
14
12
14
8
14
13
15
15
10
14
14
14
ULM2
11
12
13
15
12
13
9
12
14
15
15
14
13
12
12
14
15
12
14
8
14
13
15
15
9
13
14
13
LLI1
10
10
11
13
12
13
9
10
12
13
13
12
11
10
10
12
12
13
13
8
12
11
13
13
8
11
13
12
LLI2
11
12
13
15
13
14
10
12
14
15
15
14
13
12
12
14
14
13
15
9
14
13
15
15
9
13
14
14
LLC1
8
8
7
9
9
9
8
8
9
9
9
9
9
9
7
8
8
8
9
9
9
8
9
9
7
7
9
8
LLP1
11
12
13
15
13
13
10
13
14
15
15
14
13
13
12
14
14
12
14
9
15
13
15
15
10
13
14
13
LLP2
10
11
12
14
11
12
8
11
14
14
14
13
12
12
12
13
13
11
13
8
13
14
14
14
8
12
13
12
LLM1
12
13
14
16
13
14
10
13
15
16
16
15
14
13
13
15
15
13
15
9
15
14
16
16
10
14
15
14
LLM2
12
13
14
16
13
14
10
13
15
16
16
15
14
13
13
15
15
13
15
9
15
14
16
16
10
14
15
14
URC1
9
9
8
10
9
9
9
9
9
10
10
10
10
9
8
10
9
8
9
7
10
8
10
10
10
9
10
9
URI1
11
12
12
14
11
12
8
12
13
14
14
13
12
11
12
14
13
11
13
7
13
12
14
14
9
14
13
13
URI2
12
12
13
15
13
14
10
12
14
15
15
14
13
12
12
14
14
13
14
9
14
13
15
15
10
13
15
13
ULI1
11
11
12
14
12
13
9
11
13
14
14
13
12
11
12
14
13
12
14
8
13
12
14
14
9
13
13
14
(A)Symmetry in LEH insults.
Symmetry was calculated as agreement between tooth isomeres and antimeres, i.e., if either both teeth (in an antimere or isomere pair) had a lesion or had no lesion, this was considered agreement.
Code
isomere_score_agreement <- hypoplasia_present_long |># symmetry is not informative on individuals with no lesionsgroup_by(ID, side, type) |># group by side to compare isomeressummarise(n =n(), # sanity checkscore =sum(leh_bin) ) |>mutate(agree =if_else(score ==1, FALSE, TRUE) ) |>ungroup()# antimere symmetryantimere_score_agreement <- hypoplasia_present_long |>group_by(ID, region, type) |># group by region to compare antimeressummarise(n =n(), # sanity checkscore =sum(leh_bin) ) |>mutate(agree =if_else(score ==1, FALSE, TRUE) ) |>ungroup()
Plots
Code
isomere_symm_plot <- isomere_score_agreement |>ggplot(aes(x = type, y = ID, fill = agree)) +geom_tile() +labs(caption ="Isomere symmetry of lesions.")antimere_symm_plot <- antimere_score_agreement |>ggplot(aes(x = type, y = ID, fill = agree)) +geom_tile() +labs(caption ="Antimere symmetry of lesions.") +theme(axis.text.y =element_blank(),axis.ticks.y =element_blank() )isomere_symm_plot + antimere_symm_plot +plot_layout(guides ="collect", axis_titles ="collect") &labs(x ="Tooth type", fill ="Agreement") &scale_fill_manual(labels =c("No", "Yes"), values = viridisLite::viridis(2, begin =0.1, end =0.7))
References
WORD, Museum of London. (2013). www.museumoflondon.org.uk/ Collections-Research/LAARC/Centre-for-Human-Bioarchaeology/ Accessed April 2024.