Normalization pattern for Komi transcription system

The transcription convention (which still needs to be described) needs a normalization model. The goal should be one or both of the following:

1) Normalize the Iźva transcription to standard Komi
2) Normalize the Iźva transcription to exact phonemic level

The intermediate result does not necessarily need to be stored anywhere, but we need it as an intermediate stage in several places. FST analysis would improve if the input were closer to standard Komi, and for the various kinds of phonetic work we discussed, a phonemic level like this could be useful, since it would make searching for particular phonemes easier.

Should this be implemented as a GT preprocessor? Would the already existing standard Komi > Molodcov conversion be enough?

## Example from *ö* ~ *e* alternation

One can probably argue that the *ö* ~ *e* distinction is not present in suffixal positions and is generally rare in non-initial syllables, but we still have cases like:

> Висер

Because of this, we can't simply turn every non-initial-syllable *e* into *ö*. Normally, changes like:

> чолэм > чолӧм

would work 100% of the time.

The question now is whether all stems containing *e* in a non-initial syllable (I don't think there are verb stems with this property) are present in the GT dictionaries, or whether they could be. Their number is finite anyway, so a special rule could perhaps be devised around them.
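A rule of this kind, with a finite exception list, could be sketched roughly as follows. This is only an illustration under several assumptions: the function name `normalize_e`, the `E_STEMS` set, and the choice of treating both Cyrillic *э* and *е* as the transcription's *e* are hypothetical, and counting vowel letters is only an approximation of syllable position. In practice the exception stems would come from the GT dictionaries rather than being hard-coded.

```python
# Vowel letters used to approximate syllable nuclei (assumption).
VOWELS = "аеиоуыэюяӧіё"
# Assumption: either letter may represent /e/ in the transcription.
E_LETTERS = "эе"
# Hypothetical exception list: stems that keep non-initial-syllable e.
E_STEMS = {"висер"}


def normalize_e(word, exceptions=E_STEMS):
    """Turn e into ӧ in non-initial syllables unless the word
    begins with a stem listed in `exceptions`."""
    lower = word.lower()
    if any(lower.startswith(stem) for stem in exceptions):
        return word

    out = []
    seen_vowel = False  # becomes True after the first-syllable vowel
    for ch in word:
        if ch.lower() in VOWELS:
            if seen_vowel and ch.lower() in E_LETTERS:
                ch = "Ӧ" if ch.isupper() else "ӧ"
            seen_vowel = True
        out.append(ch)
    return "".join(out)


print(normalize_e("чолэм"))
print(normalize_e("Висер"))
```

With the exception set in place, `чолэм` is rewritten while `Висер` is left alone; passing an empty `exceptions` set shows what the unrestricted rule would do to such stems.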

There are maybe ~15 rules like these that would bring Iźva quite close to normal standard Komi.
