Label Set Parsers

There are some properties of label sets that you might want to include in your output labels. For example, the CMU dictionary encodes vowel stress like so:

label	meaning
`AY0`	unstressed /ay/
`AY2`	secondary stressed /ay/
`AY1`	primary stressed /ay/

A labelset parser can make these properties available so you can write a recoding rule like so:

yaml

- rule: ay
  conditions:
    - attribute: label
      relation: contains
      set: AY
  return: ay_{stress}

fave_recode has built in parser for CMU labels called cmu_parser that you can include like so

bash

fave_recode \
 -i data/josef-fruehwald_speaker.TextGrid \
 -s cmu2phila \
 -a cmu_parser

Label Set Parser Basics

A labelset parser has two top level attributes

yaml

parser: CMU
properties: []

parser just names the parser
properties is a list of properties you wish to make available.

A property

A single property that parses primary stress out of the cmu label would look like this:

yaml

name: stress
updates: stress
default: ""
rules:
  - rule: "1"
    conditions: 
      - attribute: label
        relation: contains
        set: "1"
        return: "1"

The rule component is identical to rules for recoding.

The updates field defines the variable name you want to use to access the value “1” in our recoding rule.

Unlike a recoding rule, every segment will be given some value for “stress”, so a default value also needs to be provided.