fave-extract
Outputs
The subcommands of fave-extract
generate multiple output files by default (this can be customized). Each one will be named after the original file with a suffix. Below, each type of output file is described, followed by descriptions of the data columns in the csv files (in alphabetical order).
File Types
*_points.csv
Points files contain 1 row per vowel analyzed, with a single point measurement taken according to the measurement point heuristic used.
*_tracks.csv
Tracks files contain 1 row per measurement point per vowel analyzed. By default, there will be one measurement point every 2ms, so a 100ms vowel will have 50 rows in the data. `*_tracks.csv* files can get very large!
For analyzing the tracks data, the combination of the file_name
column and the id
will uniquely identify each individual token.
*_param.csv
and *_logparam.csv
These files contain the Discrete Cosine Transform coefficients for each analyzed vowel. *_param.csv
contains the coefficients when the DCT is applied to the formants in Hz, and *_logparam.csv
contains the coefficients when the DCT is applied to the log-transformed formants.
The DCT coefficients can be directly normalized (R package). This can be useful for
*_recoded.TextGrid
The *_recoded.TextGrid
will be a copy of the original textgrid passed to new-fave to which the recode rules have been applied.
Data Columns
The data columns are described in this searchable table, and in text below.
Alphabetic List
abs_fol_seg
Stands for ‘absolute following segment’. The segment following the measured vowel, regardless of word boundary.
abs_pre_seg
Stands for ‘absolute preceding segment’. The segment preceding the measured vowel, regardless of word boundary.
B1, B2, B3
The bandwidths of F1, F2, and F3.
context
The broad location of the measured vowel within the word.
dur
The duration of the measured vowel.
F1, F3, F3
In points and tracks files, the estimated formant values. In param and logparam files, DCT coefficients for each formant.
F1_s, F2_s, F3_s
These only appear in tracks files. The DCT smoothed formant tracks
file_name
The file stem of the analyzed file
fol_seg
The segment following the measured vowel. If the vowel is at the end of the word, this is ‘#’
fol_word
The word following the word that the measured vowel appears in.
group
The name of the word+phone tier group in the original textgrid. If tiers were just named ‘word’ and ‘phone’, this will be ‘group_0’. Otherwise, this will probably be the speaker’s name.
id
A unique id for the measured vowel that is shared across all file outputs. The numbers correspond to [the index of the tier group]-[the index of the word tier]-[the index of the word within the tier]-[the index of the vowel within the word].
label
The label of the measured vowel.
max_formant
The maximum formant setting used for this vowel
optimized
The number of optimization iterations that ran.
param
This only appears in param and logparam files. It identifies which DCT coefficient this row corresponds to.
point_heuristic
This only appears in points files. Identifies the measurement point heuristic used.
pre_seg
The segment preceding the measured vowel. If the vowel is at the beginning edge of the word, this is ‘#’.
pre_word
The word preceding the word that the measured vowel appears in.
prop_time
Time measured proportionally to the duration of the vowel. The very beginning of the vowel is time 0, and the very end is time 1.
rel_time
Time relative to the start of the vowel, in seconds. The very beginning of the vowel is time 0.
smooth_error
Ameasure of of the mismatch between the formant track smooths and the raw formant track estimates. A larger value corresponds to a larger mismatch.
speaker_num
The speaker index in the textgrid (beginning at 1)
stress
If present, the stress of the measured vowel
time
The time within the full recording, in seconds. The beginning of the recording is 0
word
The word that the measured vowel appeared in.