Is there anybody out there? Thread poster: Matthias Brombach
| Olav Karlsen Norway Local time: 13:03 English to Norwegian + ... Flexional languages | Feb 4, 2017 |
Wojciech Matyszkiewicz wrote:
mikhailo wrote:
Just a question. How will this feature deal with inflected languages? For now, the filter in DVX does not support searching for all word forms.
Most languages have limited inflection (different singular and plural forms), but DVX does not support even that.
How could such a defective function be very useful?
But it's already there. Set the minimum score to about 70% and enable fuzzy terminology match and voila.
There might be some false positives, but as far as I know, no CAT tool does better here. Fuzzy matching is the only option for dealing with terminology in inflected languages.
Limiting lookups to, e.g., the nominative for nouns and pronouns and the infinitive for verbs can be beneficial. Lookup goes much faster because the glossary can be kept smaller. After all, the translator is supposed to have the linguistic competence to supply the missing inflection. This is part of the competence the contractor pays for. If there are repetitive set phrases, the translator can send them to the TM to save time. | | | Wojciech_ (X) Poland Local time: 13:03 English to Polish + ...
I fully agree. But some words, e.g. in Polish, are so heavily inflected that we need the option of adding synonyms to a glossary, because the root of a word may change as well.
Consider the Polish singular "wieś" (village) and plural "wsie" (villages): no pipe mark can help here, only the option to add the plural as a synonym. Adding a bare "w|" would be ridiculous.
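To make the pipe-mark problem concrete, here is a minimal Python sketch. It assumes the pipe marks the end of an invariant stem and that a glossary entry matches any word starting with that stem; the helper names `pipe_stem` and `matches` are hypothetical and do not reflect how any particular CAT tool implements the feature.

```python
import os


def pipe_stem(forms):
    """Return the longest prefix shared by all forms, followed by a pipe mark."""
    # os.path.commonprefix compares strings character by character.
    return os.path.commonprefix(list(forms)) + "|"


def matches(entry, word):
    """Assumed prefix rule: the text before the pipe must start the word."""
    stem = entry.split("|", 1)[0]
    return word.startswith(stem)


print(pipe_stem(["village", "villages"]))  # stem covers the whole singular
print(pipe_stem(["wieś", "wsie"]))         # only the first letter is shared
print(matches("w|", "woda"))               # "woda" (water) would match too
```

With a stem as short as "w", nearly every Polish word beginning with that letter would be flagged as the term, which is why a prefix rule cannot handle this noun.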
[Edited at 2017-02-04 18:19 GMT] | | | CafeTran Training (X) Netherlands Local time: 13:03
Wojciech Matyszkiewicz wrote:
There might be some false positives, but as far as I know, no CAT tool does better here. Fuzzy matching is the only option for dealing with terminology in inflected languages.
Can you give a few examples of Polish words that are recognised as belonging to the same stem, and that aren't recognised in other CAT tools that offer stemming (e.g. via Hunspell)?
Wojciech Matyszkiewicz wrote:
But some words, e.g. in Polish, are so heavily inflected that we need the option of adding synonyms to a glossary, because the root of a word may change as well.
Consider the Polish singular "wieś" (village) and plural "wsie" (villages): no pipe mark can help here, only the option to add the plural as a synonym. Adding a bare "w|" would be ridiculous.
Where would you want to add the plural as a "synonym"? At the source side? Is this possible in Déjà Vu? And, if so, it would probably be impossible to have Déjà Vu match the plural of the target term automatically.
The number of forms in Polish looks very complicated to me (as a native Dutch speaker, I have just one singular, ingenieur, and one plural, ingenieurs).
Just out of interest: Would it be possible to define 1 stem for all singular forms and 1 for the plurals? | | | Correct term matching | Feb 5, 2017 |
Accurate term matching makes for accurate quality assurance. It's not that the translator is lazy or incompetent. If the term base accepts regular expressions, one can be as detailed as one wants. If you only want the nominative, do it that way. But I want to be detailed and thorough, and I don't want incorrect flags about supposed QA errors.
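As a rough illustration of the regex idea, the Python sketch below writes one pattern for several inflected forms of the Polish noun discussed earlier in the thread. The list of forms is given for illustration and should be checked against a grammar; the name `TERM_PATTERN` is hypothetical, and how a given term base stores or applies such a pattern is an assumption.

```python
import re

# One pattern for several forms of "wieś" (village): wieś, wsi, wsią,
# wsie, wsiom, wsiami, wsiach. The word boundaries keep it from firing
# inside longer, unrelated words.
TERM_PATTERN = re.compile(
    r"\b(?:wieś|ws(?:i|ią|ie|iom|iami|iach))\b",
    re.IGNORECASE,
)

sentence = "Mieszkam na wsi, a oni lubią wsie."
print(TERM_PATTERN.findall(sentence))

# An unrelated verb that merely starts with the same letters is not flagged.
print(TERM_PATTERN.search("wsiadać"))
```

The explicit alternation is more work than a fuzzy threshold, but it only matches the forms that were listed, which is what keeps QA flags accurate.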
[Edited at 2017-02-05 11:52 GMT] | |
Wojciech_ (X) Poland Local time: 13:03 English to Polish + ...
CafeTran Training wrote:
Where would you want to add the plural as a "synonym"? At the source side? Is this possible in Déjà Vu? And, if so, it would probably be impossible to have Déjà Vu match the plural of the target term automatically.
I think each of us understands the idea of a "synonym" differently in a CAT tool. I'm talking about the option that I'm using in Trados.
SOURCE:
wieś
wsie
TARGET:
village
villages
Trados recognizes both "wieś" and "wsie". It's not just additional information (e.g. a note) that Trados wouldn't recognize. Deja Vu doesn't have this option: you can only add a note, but a note isn't recognized when the term appears.
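A minimal Python sketch of what "recognized" means here, under the assumption that every source-side synonym is indexed as its own lookup key. The structures `TERMBASE` and `INDEX` and the function `lookup` are hypothetical and are not the data model of Trados, Deja Vu, or any other tool.

```python
from typing import Optional

# One entry with two source-side forms, as in the example above.
TERMBASE = [
    {"source": ["wieś", "wsie"], "target": ["village", "villages"]},
]

# Index every listed source form, so each one triggers the same entry.
INDEX = {
    form.casefold(): entry
    for entry in TERMBASE
    for form in entry["source"]
}


def lookup(token: str) -> Optional[dict]:
    """Return the term entry for a token, or None if the form is not listed."""
    return INDEX.get(token.casefold())


for word in ["wieś", "Wsie", "wsi"]:
    hit = lookup(word)
    print(word, "->", hit["target"] if hit else "no match")
```

A form that was never entered, such as the genitive "wsi", still gets no match, so this approach is only as complete as the list of synonyms.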
That's why I'm asking for such a feature in Deja Vu. | | | CafeTran Training (X) Netherlands Local time: 13:03
Wojciech Matyszkiewicz wrote:
I think each of us understands the idea of a "synonym" differently in a CAT tool.
Yes
I'm talking about the option that I'm using in Trados.
SOURCE:
wieś
wsie
TARGET:
village
villages
Trados recognizes both "wieś" and "wsie". It's not just additional information (e.g. a note) that Trados wouldn't recognize. Deja Vu doesn't have this option: you can only add a note, but a note isn't recognized when the term appears.
That's why I'm asking for such a feature in Deja Vu.
Thanks. Now, back to your statement, please:
There might be some false positives, but as far as I know, no CAT tool does better here.
Can you give some examples please, where Déjà Vu does better in recognition of inflected terms than any other tool? | | | Wojciech_ (X) Poland Local time: 13:03 English to Polish + ...
I don't quite understand why you want me to give you any examples, since I didn't say that Deja Vu does better than any other tool. I simply stated that, AS FAR AS I KNOW, all CAT tools have more or less the same problem with inflected languages, and fuzzy matching seems like the only sensible option available now.
Pardon me if you misunderstood that.
I would also be happy to know which tool does better in this department, although, bearing in mind your nick, I can guess which one. | | | CafeTran Training (X) Netherlands Local time: 13:03 We are here to learn | Feb 6, 2017 |
Wojciech Matyszkiewicz wrote:
I simply stated that, AS FAR AS I KNOW, all CAT tools have more or less the same problem with inflected languages, and fuzzy matching seems like the only sensible option available now.
Pardon me if you misunderstood that.
I'm here to learn, and often that means that I have to ask questions. I did interpret your statement to mean that you had had positive experiences with Déjà Vu (a great tool that I used for many years, until I migrated to Mac). I was hoping to learn from your experiences.
I'd say that any rule-based stemming would result in less loss (and thus better recognition) than simple truncation of letters or a Levenshtein-distance algorithm.
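A small Python sketch of that comparison, using the Polish example from earlier in the thread. The edit-distance function is the standard dynamic-programming Levenshtein algorithm; the `LEMMAS` table is a toy stand-in for a rule-based stemmer and is not Hunspell output, and the `similarity` formula is just one common way to turn a distance into a score.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Assumed scoring: 1 minus distance divided by the longer length."""
    return 1 - levenshtein(a, b) / max(len(a), len(b))


# Toy lemma table (hypothetical): maps each word form to its dictionary form.
LEMMAS = {"wieś": "wieś", "wsie": "wieś", "wiek": "wiek"}


def same_lemma(a: str, b: str) -> bool:
    return a in LEMMAS and LEMMAS[a] == LEMMAS.get(b)


for other in ["wsie", "wiek"]:
    print(other, similarity("wieś", other), same_lemma("wieś", other))
```

On this pair, the edit-distance score rates the unrelated word "wiek" (age, century) as closer to "wieś" than the true plural "wsie", while the lemma lookup groups the two forms of the noun correctly.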
I would also be happy to know which tool does better in this department, although, bearing in mind your nick, I can guess which one.
I don't know which tool performs best. That's exactly what I was hoping to learn, since I don't work from a heavily inflected source language.
My assumption would be that Transit, OmegaT and CafeTran perform best, since they all use rule-based stemming.
BTW: My nickname doesn't mean that I don't have a personal opinion about the tool I use or that I think that it's the greatest thing after chocolate. It's just a working tool.