Label Set Parsers
There are some properties of label sets that you might want to include in your output labels. For example, the CMU dictionary encodes vowel stress like so:
| label | meaning |
|---|---|
| AY0 | unstressed /ay/ |
| AY2 | secondary stressed /ay/ |
| AY1 | primary stressed /ay/ |
A label set parser can make these properties available, so you can write a recoding rule like so:
```yaml
- rule: ay
  conditions:
    - attribute: label
      relation: contains
      set: AY
  return: ay_{stress}
```

fave_recode has a built-in parser for CMU labels called `cmu_parser`, which you can include like so:
```bash
fave_recode \
  -i data/josef-fruehwald_speaker.TextGrid \
  -s cmu2phila \
  -a cmu_parser
```

Label Set Parser Basics
A label set parser has two top-level attributes:
```yaml
parser: CMU
properties: []
```

- `parser` just names the parser.
- `properties` is a list of properties you wish to make available.
A property
A single property that parses primary stress out of the CMU label would look like this:
```yaml
name: stress
updates: stress
default: ""
rules:
  - rule: "1"
    conditions:
      - attribute: label
        relation: contains
        set: "1"
    return: "1"
```

The rule component is identical to rules for recoding.
The updates field defines the variable name you want to use to access the value “1” in our recoding rule.
Unlike a recoding rule, every segment will be given some value for “stress”, so a default value also needs to be provided.
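
Putting the pieces together, a complete parser covering all three CMU stress levels might look like the sketch below. It only combines the fields already shown on this page. The nesting of the property as an item under `properties`, and the two extra rules for "2" and "0", are assumptions made by analogy with the primary stress rule; this is not a copy of the built-in `cmu_parser`.

```yaml
# Sketch of a full label set parser (assumed layout, not the built-in cmu_parser)
parser: CMU
properties:
  - name: stress
    updates: stress
    # Labels with no stress digit (e.g. consonants) fall back to an empty string
    default: ""
    rules:
      # Primary stress, as in the single-property example
      - rule: "1"
        conditions:
          - attribute: label
            relation: contains
            set: "1"
        return: "1"
      # Secondary stress (assumed by analogy)
      - rule: "2"
        conditions:
          - attribute: label
            relation: contains
            set: "2"
        return: "2"
      # Unstressed (assumed by analogy)
      - rule: "0"
        conditions:
          - attribute: label
            relation: contains
            set: "0"
        return: "0"
```

If a parser like this behaves as described, the `ay_{stress}` recoding rule should yield `ay_1` for `AY1`, `ay_2` for `AY2`, and `ay_0` for `AY0`.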