from aligned_textgrid import AlignedTextGrid, custom_classes
atg = AlignedTextGrid(
    "resources/josef-fruehwald_speaker.TextGrid",
    entry_classes = custom_classes(["Word", "Phone"])
)

Phrase Creation
When working with a force-aligned TextGrid that has Word and Phone tiers, you can also add a Phrase tier.
Interleave a new Phrase tier
First, we need to interleave a new tier class above Word, copying its timing and labels.
atg.interleave_class(
    name = "Phrase",
    above = "Word",
    timing_from = "below",
    copy_labels = True
)
print(atg)

AlignedTextGrid with 1 groups named ['group_0'] each with [3] tiers. [['Phrase', 'Word', 'Phone']]
Iterate through phrase and fuse
We need to define a function that will take an existing phrase label and add on an incoming label.
def make_phrase_label(a_label, b_label):
    # Only append non-empty labels, so fused pauses add nothing to the phrase.
    if len(b_label) > 0:
        a_label = f"{a_label} {b_label}"
    return a_label

The fuse_rightwards() method fuses the following interval into the current interval and pops the following interval from the tier. Because the tier changes length as we go, we don’t want to use a for-loop.
Instead, we’ll use a while loop, which ends when we reach the last interval of the Phrase tier. Rather than fusing, we’ll move on to the next interval when either
- The current interval’s label is "" (that is, it is a pause)
- The following interval’s label is "" and it lasts 220 ms or longer.
The continue keyword under the if statements bumps us back to the top of the while loop, which will check to see if we’re at the end of the Phrase tier.
this_interval = atg[0].Phrase.first                # (1)
while this_interval is not atg[0].Phrase.last:     # (2)
    if this_interval.label == "":                  # (3)
        this_interval = this_interval.fol
        continue

    following_long_pause = (                       # (4)
        this_interval.fol.label == ""
        and
        this_interval.fol.duration >= 0.220
    )

    if following_long_pause:                       # (5)
        this_interval = this_interval.fol
        continue

    this_interval.fuse_rightwards(                 # (6)
        label_fun = make_phrase_label
    )

- (1) Manually begin at the first interval.
- (2) The value of .last is dynamically updated, so this is safe.
- (3) If we are currently in a pause interval, move to the next interval.
- (4) Get a True or False if the next interval is a pause equal to or greater than 220 ms.
- (5) If the following interval is a long pause, update this_interval to be the following interval. The previous if statement will keep bumping us along until we get to a non-pause interval.
- (6) If neither of the previous if statements was triggered, we fuse this_interval with the following interval.
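The same control flow can be sketched without the library, which may help when reasoning about why the loop terminates. This is only an illustration on a plain Python list of (label, duration) pairs, not the aligned_textgrid API: the function fuse_into_phrases, the long_pause parameter, and the toy_words data are all made up for this example, and durations are assumed to be in seconds.

```python
def make_phrase_label(a_label, b_label):
    """Append b_label to a_label, ignoring empty (pause) labels."""
    if len(b_label) > 0:
        a_label = f"{a_label} {b_label}"
    return a_label


def fuse_into_phrases(intervals, long_pause=0.220):
    """Fuse (label, duration) pairs into phrases split at long pauses.

    Hypothetical helper: mimics the while-loop logic on a plain list.
    """
    phrases = [list(pair) for pair in intervals]  # mutable copies
    i = 0
    # Stop at the last item; len(phrases) is re-evaluated on every pass,
    # so it stays correct even though we delete items inside the loop.
    while i < len(phrases) - 1:
        label, _ = phrases[i]
        fol_label, fol_dur = phrases[i + 1]
        # A pause never starts a phrase: step past it.
        if label == "":
            i += 1
            continue
        # A long following pause ends the phrase: step onto the pause.
        if fol_label == "" and fol_dur >= long_pause:
            i += 1
            continue
        # Otherwise absorb the following item and remove it from the list.
        phrases[i][0] = make_phrase_label(label, fol_label)
        phrases[i][1] += fol_dur
        del phrases[i + 1]
    return [tuple(p) for p in phrases]


# Invented toy data: "" marks a pause.
toy_words = [
    ("", 0.50), ("the", 0.10), ("rainbow", 0.40), ("", 0.05),
    ("is", 0.10), ("", 0.30), ("a", 0.05), ("division", 0.45),
]
toy_phrases = fuse_into_phrases(toy_words)
toy_labels = [label for label, _ in toy_phrases]
print(toy_labels)
```

In this toy run the 0.05 s pause is absorbed into the first phrase, while the 0.30 s pause survives as its own item and separates the two phrases. A for-loop over the same list would skip or misalign items once deletions start, which is the reason for the index-based while loop.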
We can check on the results by printing the labels of the first ten Phrase intervals.
for phrase in atg[0].Phrase[0:10]:
    print(phrase.label)

when the sunlight strikes raindrops in the air they act like a prism and formza rainbow
the rainbow is a division of white light into many beautiful colors
these take the shape of a long round arch
with its path high above and its two ends apparently beyond the horizon
there is according to legend a boiling pot of gold at one end
And just for clarity, each non-pause Word interval now sits within a Phrase interval, which we can reach through .within.
(
    atg[0].Word[1].label,
    atg[0].Word[1].within.label
)

('when',
 'when the sunlight strikes raindrops in the air they act like a prism and formza rainbow')
More ideas
We can also, for example, get a list of the duration of pauses that occur within a phrase.
import numpy as np

in_phrase_pauses = [
    interval
    for interval in atg[0].Word
    if interval.label == ""
    if interval.within.label != ""
]
pause_durs = np.array([
    interval.duration
    for interval in in_phrase_pauses
])
pause_durs

array([0.16, 0.11, 0.03, 0.03, 0.03, 0.22, 0.04, 0.04, 0.15, 0.08, 0.06,
       0.04, 0.06, 0.03, 0.03, 0.12, 0.03, 0.03, 0.05, 0.04, 0.14, 0.06,
       0.21, 0.05, 0.03, 0.03, 0.08, 0.03, 0.04, 0.06, 0.14, 0.03, 0.03,
       0.03, 0.03])
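As one possible follow-up, the durations can be summarised. The sketch below copies the rounded values printed above into a plain list so that it runs without the TextGrid, and it uses the standard-library statistics module rather than NumPy. The names rounded_pause_durs and summary are illustrative only, and because the inputs are rounded to two decimals, the summary is approximate.

```python
import statistics

# Rounded within-phrase pause durations (seconds), copied from the
# printed array; the real pause_durs values have more precision.
rounded_pause_durs = [
    0.16, 0.11, 0.03, 0.03, 0.03, 0.22, 0.04, 0.04, 0.15, 0.08, 0.06,
    0.04, 0.06, 0.03, 0.03, 0.12, 0.03, 0.03, 0.05, 0.04, 0.14, 0.06,
    0.21, 0.05, 0.03, 0.03, 0.08, 0.03, 0.04, 0.06, 0.14, 0.03, 0.03,
    0.03, 0.03,
]

summary = {
    "n": len(rounded_pause_durs),                        # number of pauses
    "total_s": round(sum(rounded_pause_durs), 2),        # total pause time
    "median_s": statistics.median(rounded_pause_durs),   # typical pause
    "max_s": max(rounded_pause_durs),                    # longest pause
}
print(summary)
```

With the real pause_durs array, the equivalent NumPy calls would be np.median(pause_durs), pause_durs.max(), and pause_durs.sum().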
Session Info
import sys
import aligned_textgrid
print(
    (
        f"Python version: {sys.version}\n"
        f"aligned-textgrid version: {aligned_textgrid.__version__}"
    )
)

Python version: 3.11.13 (main, Jun 4 2025, 04:12:12) [GCC 13.3.0]
aligned-textgrid version: 0.8.0