XTM Quality Assurance - Dictionaries
Dictionary administration

Users are also able to perform the following tasks:

  • Create a new custom dictionary. Select a language and enter the name of the dictionary.
  • Import a dictionary from disk (UTF or XML file).
  • Assign dictionaries to projects or specific tasks from a table showing all the dictionaries connected with the company.
  • Delete a dictionary or export it to disk.

List of dictionaries

The following dictionaries can currently be downloaded under GNU GPL terms and used with XTM Spellchecker:

  • Afrikaans (South Africa) - af_ZA
  • Bulgarian (Bulgaria) - bg_BG
  • Catalan (Spain) - ca_ES
  • Chichewa (Malawi) - ny_MW
  • Croatian (Croatia) - hr_HR
  • Czech (Czech Republic) - cs_CZ
  • Danish (Denmark) - da_DK
  • Dutch (Netherlands) - nl_NL
  • English (Australia) - en_AU
  • English (Canada) - en_CA
  • English (New Zealand) - en_NZ
  • English (United Kingdom) - en_GB
  • English (United States) - en_US
  • Esperanto (anywhere) - eo
  • Fijian (Fiji) - See iosn.net
  • Finnish (Finland) - fi_FI
  • French (Belgium) - fr_BE
  • French (France) - fr_FR
  • Faroese (Faroe Islands) - fo_FO
  • Galician (Spain) - gl_ES
  • German (Austria) - de_AT
  • German (Germany - default) - de_DE
  • German (Germany - with compounds, old & new spelling) - de_DE_comb
  • German (Germany - with compounds, new spelling) - de_DE_neu
  • German (Switzerland) - de_CH
  • Greek (Greece) - el_GR
  • Hebrew (Israel) - he_IL
  • Hungarian (Hungary) - hu_HU
  • Indonesian (Indonesia) - id_ID
  • Interlingua (x-register) - ia
  • Italian (Italy) - it_IT
  • Irish (Ireland) - ga_IE
  • Kinyarwanda (Rwanda) - rw_RW
  • Kiswahili (Africa) - sw_KE
  • Kurdish (Turkey) - ku_TR
  • Latvian (Latvia) - lv_LV
  • Lithuanian (Lithuania) - lt_LT
  • Malagasy (Madagascar) - mg_MG
  • Malay (Malaysia) - ms_MY
  • Maori (New Zealand) - mi_NZ
  • Norwegian Bokmaal (Norway) - nb_NO
  • Norwegian Nynorsk (Norway) - nn_NO
  • Polish (Poland) - pl_PL
  • Portuguese (Brazil) - pt_BR
  • Portuguese (Portugal) - pt_PT
  • Romanian (Romania) - ro_RO
  • Russian (Russia) - ru_RU
  • Russian ye (Russia) - ru_RU_ye
  • Russian yo (Russia) - ru_RU_yo
  • Setswana (Africa) - tn_ZA
  • Scottish Gaelic (Scotland) - gd_GB
  • Spanish (Mexico) - es_MX
  • Spanish (Spain et al.) - es_ES
  • Slovak (Slovakia) - sk_SK
  • Slovenian (Slovenia) - sl_SI
  • Swedish (Sweden) - sv_SE
  • Tagalog (Philippines) - tl_PH
  • Tetum (Indonesia) - tet_ID
  • Ukrainian (Ukraine) - uk_UA
  • Welsh (Wales) - cy_GB
  • Zulu (Africa) - zu_ZA
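
The codes in the list above appear to follow a language_REGION pattern, with an optional third part for a variant (for example de_DE_neu) and no region at all for some entries (eo, ia). The following is a minimal Python sketch of splitting such a code into its parts. The names DictionaryCode and parse_dictionary_code are hypothetical and are not part of XTM; the validation rules are assumptions inferred from the list, not a documented format.

```python
from typing import NamedTuple, Optional


class DictionaryCode(NamedTuple):
    """Parts of a dictionary code such as 'de_DE_neu' (hypothetical helper type)."""
    language: str
    region: Optional[str]
    variant: Optional[str]


def parse_dictionary_code(code: str) -> DictionaryCode:
    """Split a code like 'en_GB', 'eo' or 'ru_RU_yo' into language, region, variant.

    Assumption: the language part is lowercase letters, the region part (if any)
    is uppercase letters, and anything after the second underscore is a variant.
    """
    parts = code.strip().split("_", 2)
    language = parts[0]
    region = parts[1] if len(parts) > 1 else None
    variant = parts[2] if len(parts) > 2 else None

    if not (language.isalpha() and language.islower()):
        raise ValueError(f"unexpected language part in {code!r}")
    if region is not None and not (region.isalpha() and region.isupper()):
        raise ValueError(f"unexpected region part in {code!r}")
    return DictionaryCode(language, region, variant)


if __name__ == "__main__":
    for example in ("en_GB", "eo", "de_DE_neu", "tet_ID"):
        print(example, "->", parse_dictionary_code(example))
```

Entries that do not carry a code, such as the Fijian entry that points to iosn.net, are rejected with a ValueError under these assumptions.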

A translation memory, or TM, is a type of database used in software programs designed to aid human translators. Some software programs that use translation memories are known as translation memory managers (TMM). Translation memories are typically used in conjunction with a dedicated computer-assisted translation (CAT) tool, a word processing program, a terminology management system, a multilingual dictionary, or even raw machine translation output.

A translation memory consists of text segments in a source language and their translations into one or more target languages. These segments can be blocks, paragraphs, sentences, or phrases. Individual words are handled by terminology bases and are not within the domain of TM. Research indicates that many companies producing multilingual documentation are using translation memory systems.

A translator first supplies a source text (that is, a text to be translated) to the translation memory. Some translation memory systems search for 100% matches only; that is, they can retrieve only segments of text that match entries in the database exactly. Others employ fuzzy matching algorithms to retrieve similar segments, which are presented to the translator with the differences flagged. Typical translation memory systems search for text in the source segment only. The flexibility and robustness of the matching algorithm largely determine the performance of the translation memory, although for some applications the recall rate of exact matches can be high enough to justify the 100%-match approach.

In xml:tm, the unique identifiers are remembered during translation so that the target language document is exactly aligned with the source at the text unit level. If the source document is subsequently modified, the text units that have not changed can be transferred directly to the new target version of the document without any translator interaction. This is the concept of 'exact' or 'perfect' matching to the translation memory. xml:tm can also provide mechanisms for in-document leveraged and fuzzy matching.

  • TMX: Translation Memory Exchange format. This standard enables the interchange of translation memories between translation suppliers. TMX has been adopted by the translation community as the best way of importing and exporting translation memories. The current version is 1.4b, which allows the original source and target documents to be recreated from the TMX data.
  • TBX: Termbase Exchange format. This standard allows for the interchange of terminology data, including detailed lexical information. The framework for TBX is provided by ISO 12620, ISO 12200 and ISO Committee Draft 16642 (known as TMF, or Terminological Markup Framework). ISO 12620 provides an inventory of well-defined "data categories" with standardized names that function as data element types or as predefined values. ISO 12200 (also known as MARTIF) provides the basis for the core structure of TBX. TMF includes a structural metamodel for Terminology Markup Languages in general, regardless of which XML style of representation is used.
  • SRX: Segmentation Rules Exchange format. SRX is intended to enhance the TMX standard so that translation memory data exchanged between applications can be used more effectively. The ability to specify the segmentation rules that were used in the previous translation increases the leveraging that can be achieved.
  • GMX: GILT Metrics. GILT stands for Globalization, Internationalization, Localization, and Translation. The GILT Metrics standard comprises three parts: GMX-V for volume metrics, GMX-C for complexity metrics and GMX-Q for quality metrics. The proposed GILT Metrics standard is tasked with quantifying the workload and quality requirements for any given GILT task.
  • OLIF: Open Lexicon Interchange Format. OLIF is an open, XML-compliant standard for the exchange of terminological and lexical data. Although originally intended as a means for the exchange of lexical data between proprietary machine translation lexicons, it has evolved into a more general standard for terminology exchange.
  • XLIFF: XML Localisation Interchange File Format. It is intended to provide a single interchange file format that can be understood by any localization provider. XLIFF is the preferred way of exchanging data in XML format in the translation industry.
  • TransWS: Translation Web Services. TransWS specifies the calls needed to use Web services for the submission and retrieval of files and messages relating to localization projects. It is intended as a detailed framework for automating much of the current localization process through Web services.
  • xml:tm: This approach to translation memory is based on the concept of text memory, which comprises author memory and translation memory. xml:tm has been donated to LISA OSCAR by XML-INTL.
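
The difference between 100%-match and fuzzy-match retrieval described earlier can be illustrated with a minimal Python sketch. It is not how XTM or any particular translation memory system is implemented: the in-memory dictionary, the names Match and lookup, and the 0.75 threshold are all assumptions made for illustration, and the similarity score comes from Python's standard difflib module rather than from any TM-specific algorithm. As in the description, only the source segment is searched.

```python
from difflib import SequenceMatcher
from typing import Dict, List, NamedTuple


class Match(NamedTuple):
    """One retrieved TM entry (hypothetical type for this sketch)."""
    source: str
    target: str
    score: float  # 1.0 means an exact (100%) match


def lookup(memory: Dict[str, str], segment: str, threshold: float = 0.75) -> List[Match]:
    """Return TM entries whose source text is similar enough to `segment`.

    memory    maps source segments to their stored translations.
    threshold is an assumed cut-off; set it to 1.0 for exact matches only.
    """
    matches = []
    for source, target in memory.items():
        if source == segment:
            score = 1.0
        else:
            # Character-level similarity in the range 0.0 to 1.0.
            score = SequenceMatcher(None, source, segment).ratio()
        if score >= threshold:
            matches.append(Match(source, target, score))
    # Best candidates first, so exact matches lead the list.
    matches.sort(key=lambda m: m.score, reverse=True)
    return matches


if __name__ == "__main__":
    tm = {
        "Click the Save button.": "Cliquez sur le bouton Enregistrer.",
        "Close the window.": "Fermez la fenêtre.",
    }
    for query in ("Click the Save button.", "Click the Cancel button."):
        for m in lookup(tm, query):
            print(f"{query!r}: {m.score:.2f} {m.source!r} -> {m.target!r}")
```

A real system would normally segment the text first, index the memory instead of scanning it linearly, and flag the differing words for the translator; this sketch only shows how a threshold separates exact from fuzzy retrieval.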