from aligned_textgrid import AlignedTextGrid
from aligned_textgrid import Word, Phone
Navigating an AlignedTextGrid
This documentation covers reading in the output from the Montreal Forced Aligner using the Word and Phone classes from aligned_textgrid, but everything will generalize to custom classes.
Reading in a TextGrid
To read in a one-speaker TextGrid, either give AlignedTextGrid() the path to the file, or a textgrid that has already been read in with praatio.textgrid.openTextgrid().
You also need to specify the sequence classes of each tier in the order they appear. For MFA output, the top tier is Word and the bottom tier is Phone, but if these were reversed, you would have to pass [Phone, Word] to entry_classes. Which class is the superset and which is the subset is encoded in the classes themselves, and is handled automatically.
one_speaker = AlignedTextGrid(
    textgrid_path = "../resources/josef-fruehwald_speaker.TextGrid",
    entry_classes = [Word, Phone]
)
With a TextGrid of two or more speakers, you can pass entry_classes either a single list of interval classes to re-use with each speaker (for example, [Word, Phone]), or an explicit list of nested classes (for example, [[Word, Phone], [Word, Phone]]).
two_speaker = AlignedTextGrid(
    textgrid_path = "../resources/KY25A_1.TextGrid",
    entry_classes = [Word, Phone]
)
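The two accepted shapes for entry_classes can be pictured with a small stand-alone sketch. The helper name expand_entry_classes and the use of strings in place of the real Word and Phone classes are assumptions made for illustration; this is not how aligned_textgrid itself is implemented.

```python
def expand_entry_classes(entry_classes, n_speakers):
    """Hypothetical helper: return one list of classes per speaker."""
    # Already nested: one inner list per tier group, so use it unchanged.
    if all(isinstance(group, (list, tuple)) for group in entry_classes):
        return [list(group) for group in entry_classes]
    # Flat list: reuse the same hierarchy for every speaker.
    return [list(entry_classes) for _ in range(n_speakers)]


# Stand-in names instead of the real Word and Phone classes.
flat = ["Word", "Phone"]
nested = [["Word", "Phone"], ["Word", "Phone"]]

print(expand_entry_classes(flat, n_speakers=2))
# [['Word', 'Phone'], ['Word', 'Phone']]
print(expand_entry_classes(flat, 2) == expand_entry_classes(nested, 2))
# True
```

Under this reading, the flat form is a shorthand that only works when every speaker shares the same tier hierarchy.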
If you have a TextGrid with a mixture of sequence hierarchies, you have to read it in with the fully nested list of classes.
from aligned_textgrid import custom_classes
Turn = custom_classes("Turn")

multi_hierarchy = AlignedTextGrid(
    textgrid_path = "../resources/KY25A_1_multi.TextGrid",
    entry_classes = [[Word, Phone], [Turn], [Word, Phone], [Turn]]
)
print(multi_hierarchy)
AlignedTextGrid with 4 groups, each with [2, 1, 2, 1] tiers. [['Word', 'Phone'], ['Turn'], ['Word', 'Phone'], ['Turn']]
Get interval at time
The “Get interval at time” functionality from Praat has been implemented for each level of TextGrid representation.
speaker_one = two_speaker[0]
speaker_one_word = speaker_one[0]
speaker_one_word.get_interval_at_time(11)
1
This is the index for the word that appears at 11 seconds.
speaker_one.get_intervals_at_time(11)
[1, 2]
These are the indices for the word and phone tiers that are at 11 seconds.
two_speaker.get_intervals_at_time(11)
[[1, 2], [39, 96]]
These are the indices for the word and phone tiers for both speakers at 11 seconds.
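As a rough illustration of what a time lookup like this involves, here is a minimal sketch in plain Python. The function names, the toy interval times, and the choice to treat intervals as start-inclusive and end-exclusive are all assumptions; the library's own implementation may differ.

```python
from bisect import bisect_right


def interval_at_time(starts, ends, time):
    """Return the index of the interval containing `time`, or None."""
    # Index of the last interval whose start is <= time.
    idx = bisect_right(starts, time) - 1
    if idx >= 0 and time < ends[idx]:
        return idx
    return None


def intervals_at_time(tiers, time):
    """Return one index per tier, in tier order."""
    return [interval_at_time(starts, ends, time) for starts, ends in tiers]


# Toy tiers as (starts, ends); the times are made up for illustration.
word_tier = ([0.0, 10.5, 11.4], [10.5, 11.4, 12.0])
phone_tier = ([0.0, 10.5, 10.9, 11.4], [10.5, 10.9, 11.4, 12.0])

print(interval_at_time(*word_tier, 11))
# 1
print(intervals_at_time([word_tier, phone_tier], 11))
# [1, 2]
```

Applying the same per-tier lookup to every speaker's tier group would give the nested list-of-lists shape shown for the whole TextGrid.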
Nested indexing
You can use the nested indices returned by .get_intervals_at_time() to get the actual sequence intervals as well.
eleven_seconds = two_speaker.get_intervals_at_time(11)
two_speaker[eleven_seconds]
[[Class Word, label: yeah, .superset_class: Top_wp, .super_instance, None, .subset_class: Phone, .subset_list: ['Y', 'AE1'],
Class Phone, label: AE1, .superset_class: Word, .super_instance: yeah, .subset_class: Bottom_wp],
[Class Word, label: after, .superset_class: Top_wp, .super_instance, None, .subset_class: Phone, .subset_list: ['AE1', 'F', 'T', 'ER0'],
Class Phone, label: F, .superset_class: Word, .super_instance: after, .subset_class: Bottom_wp]]
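The idea behind nested indexing can be sketched with ordinary lists: each inner index picks one interval from the matching tier of the matching speaker. The nested_index function and the toy tiers below are hypothetical stand-ins, with labels borrowed from the output above and indices shortened to fit the toy data.

```python
def nested_index(groups, indices):
    """Pick one item per tier, following a nested list of indices."""
    return [
        [tier[i] for tier, i in zip(group, group_indices)]
        for group, group_indices in zip(groups, indices)
    ]


# Toy data: two speakers, each with a word tier and a phone tier.
speakers = [
    [["", "yeah", ""], ["", "Y", "AE1", ""]],
    [["so", "after"], ["S", "OW1", "AE1", "F", "T", "ER0"]],
]

print(nested_index(speakers, [[1, 2], [1, 3]]))
# [['yeah', 'AE1'], ['after', 'F']]
```

The result keeps the same nesting as the index list, which is why the output of a time lookup can be passed straight back in as an index.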