Detect Overlaps

Author

Josef Fruehwald

Published

June 26, 2024

When working with a multi-speaker textgrid, you may want to know if a word or phone is overlapped by another speaker’s speech. Here, we’ll detect these overlaps and set an overlapped feature on each interval.

from aligned_textgrid import AlignedTextGrid, custom_classes, SequenceList
import numpy as np

atg = AlignedTextGrid(
    "resources/KY25A_1.TextGrid",
    entry_classes = custom_classes(["Word", "Phone"])
)

Overlap Detection

We’ll get all phones that aren’t silences (that is, whose label isn’t empty) and create a SequenceList from them. A SequenceList has convenience attributes that return arrays of the start and end times of the SequenceIntervals within it.

all_phones = SequenceList(
    *[
        phone
        for group in atg
        for phone in group.Phone
        if phone.label != ""
    ]
)

Now, we’ll loop through these phones of interest. For each phone, we’ll:

  • Set the phone’s “overlapped” feature to False. It stays False if the phone is not overlapped.
  • Check for overlaps with the formula (x_start < y_end) & (y_start < x_end).
  • Count how often the formula is true. It is always true once (when the phone is compared to itself), so the phone is overlapped only if it is true more than once.
  • Also set an “overlapper” feature, which is a list of the intervals that are doing the overlapping.

for phone in all_phones:

    phone.set_feature("overlapped", False)

    overlap = (
        (phone.start < all_phones.ends) &
        (all_phones.starts < phone.end)
    )

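    # Indices of every interval that overlaps this phone.
    # flatnonzero always yields a flat list, even for a single match.
    overlappers = np.flatnonzero(overlap).tolist()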
     
    if overlap.sum() > 1:

        self_index = all_phones.index(phone)
        overlappers.remove(self_index)

        overlapper_list = [
            all_phones[idx]
            for idx in overlappers
        ]

        phone.set_feature("overlapped", True)

        phone.set_feature(
            "overlapper",
            SequenceList(*overlapper_list)
        )

Notes on the loop, in the order the steps appear:

1. Default to intervals not being overlapped.
2. overlap is an array that holds True for each interval that overlaps our target phone, and False otherwise.
3. This returns a list of the indices where overlap is True.
4. Every interval overlaps with itself, so we check whether there is more than one overlap.
5. To build the list of overlapping intervals, first remove our target interval from the index list.
6. Create the list of overlapping phones.
7. Set the overlapped flag.
8. Keep track of which intervals are doing the overlapping.
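
The overlap test used in the loop can be checked by hand on a few invented intervals. The sketch below is plain Python rather than the library's own API: the tuples, the overlaps helper, and the find_overlappers helper are all hypothetical stand-ins for SequenceIntervals and the NumPy comparison, meant only to show how the formula behaves.

```python
# Toy intervals as (start, end, label); the values are invented for illustration.
intervals = [
    (0.0, 1.0, "A"),  # speaker 1
    (1.0, 2.0, "B"),  # speaker 1, starts exactly where A ends
    (0.5, 1.5, "C"),  # speaker 2, straddles the A/B boundary
    (3.0, 4.0, "D"),  # speaker 2, overlaps nothing
]


def overlaps(x, y):
    """Return True if intervals x and y share any stretch of time."""
    x_start, x_end, _ = x
    y_start, y_end, _ = y
    return (x_start < y_end) and (y_start < x_end)


def find_overlappers(target, candidates):
    """Labels of every candidate overlapping target, excluding target itself."""
    return [c[2] for c in candidates if c is not target and overlaps(target, c)]


for interval in intervals:
    # Count hits across the full list, self included, as the loop above does.
    n_hits = sum(overlaps(interval, other) for other in intervals)
    print(interval[2], n_hits, find_overlappers(interval, intervals))
```

Because both comparisons are strict, A and B, which only touch at 1.0, do not count as overlapping each other. D matches only itself, so its count is 1 and it would keep overlapped set to False.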

Let’s grab one of the overlapped phones.

overlapped_phones = [
    phone 
    for phone in all_phones 
    if phone.overlapped
]

one_phone = overlapped_phones[0]

We can inspect its timing and compare it to the overlappers.

print(
    f"Overlapped: {(one_phone.start, one_phone.end, one_phone.label)}"
)

print(
    f"Overlapped word: {one_phone.within.label}"
)

print(
    f"Overlapper: {one_phone.overlapper.starts, one_phone.overlapper.ends, one_phone.overlapper.labels}"
)

print(
    f"Overlapper words: {[x.within.label for x in one_phone.overlapper]}"
)
Overlapped: (10.7017, 10.7317, 'W')
Overlapped word: one
Overlapper: (array([10.7017]), array([10.7317]), ['Y'])
Overlapper words: ['yeah']
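
In this output the overlapper spans exactly the same times as the overlapped phone. If you also want to know how much of an interval is covered, one possible follow-up (not part of the walkthrough above) is to compute the shared duration from the start and end times. The shared_duration helper below is hypothetical, and the times are copied from the printed output.

```python
def shared_duration(x_start, x_end, y_start, y_end):
    """Length of time two intervals have in common (0.0 if they don't overlap)."""
    return max(0.0, min(x_end, y_end) - max(x_start, y_start))


# (start, end) times copied from the printed output above.
overlapped = (10.7017, 10.7317)
overlapper = (10.7017, 10.7317)

shared = shared_duration(*overlapped, *overlapper)

# Proportion of the overlapped phone that the overlapper covers.
proportion = shared / (overlapped[1] - overlapped[0])

print(f"Shared duration: {shared:.4f} s")
print(f"Proportion overlapped: {proportion:.2f}")
```

With real data, the same arithmetic could be applied to each pair of one_phone and an interval in one_phone.overlapper, using their start and end attributes.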

Session Info

import sys
import aligned_textgrid

print(
    (
        f"Python version: {sys.version}\n"
        f"aligned-textgrid version: {aligned_textgrid.__version__}"
    )
)
Python version: 3.11.9 (main, May  9 2024, 14:13:20) [GCC 11.4.0]
aligned-textgrid version: 0.7.4

Reuse

GPLv3