Dictionaries file format
Prefixes
Let's have a look at the prefixes dictionary, gpl/pierrick/brihaye/aramorph/dictionaries/dictPrefixes:
; conjunctions
w	wa	Pref-Wa	and <pos>wa/CONJ+</pos>
f	fa	Pref-Wa	and;so <pos>fa/CONJ+</pos>
We can see that comments are introduced by ; and that each data line is split by tabs into fields, which are, in order:
- the prefix's consonantal skeleton (using Buckwalter's transliteration system)
- the prefix's vocalization (using the same system)
- the prefix's morphological category
- one or several translations for the prefix (separated by ;), optionally followed by one or several grammatical categories enclosed in <pos>...</pos>. Notice the +, which indicates that a stem is expected after this prefix.
Some fields are optional. A good example is the empty prefix:
; The first category is the null prefix (has a null gloss as well):
		Pref-0
Here only the morphological category is given.
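The field layout described above can be sketched as a small line parser. This is only an illustration, not AraMorph's actual code: the class and method names are hypothetical, and it assumes that optional fields are simply left empty between tabs (so the null prefix line starts with two tabs).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only: names are hypothetical, not part of AraMorph's API.
class DictLineSketch {
    // Grammatical categories are assumed to be wrapped in <pos>...</pos> inside the last field.
    private static final Pattern POS = Pattern.compile("<pos>(.*?)</pos>");

    /**
     * Returns null for comment or blank lines, otherwise
     * {skeleton, vocalization, morphological category, gloss, pos}.
     * Missing optional fields come back as empty strings.
     */
    static String[] parse(String line) {
        if (line.startsWith(";") || line.trim().isEmpty()) {
            return null; // comment or blank line
        }
        // A limit of -1 keeps empty fields, which matters for entries such as the null prefix.
        String[] fields = line.split("\t", -1);
        String[] entry = {"", "", "", "", ""};
        for (int i = 0; i < Math.min(fields.length, 4); i++) {
            entry[i] = fields[i].trim();
        }
        Matcher m = POS.matcher(entry[3]);
        if (m.find()) {
            entry[4] = m.group(1);                            // e.g. fa/CONJ+
            entry[3] = entry[3].substring(0, m.start()).trim(); // gloss without the <pos> part
        }
        return entry;
    }

    public static void main(String[] args) {
        String[] e = parse("f\tfa\tPref-Wa\tand;so <pos>fa/CONJ+</pos>");
        System.out.println(String.join(" | ", e));
    }
}
```

Keeping the gloss and the <pos> value in separate slots makes it easy to tell later whether the grammatical category was given explicitly.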
Suffixes
Let's now have a look at this snippet taken from the suffixes dictionary, gpl/pierrick/brihaye/aramorph/dictionaries/dictSuffixes:
; perfect verb, null suffix: banA-h, daEA-h
h	hu	PVSuff-0ah	he/it <verb> it/him <pos>+(null)/PVSUFF_SUBJ:3MS+hu/PVSUFF_DO:3MS</pos>
hmA	humA	PVSuff-0ah	he/it <verb> them (both) <pos>+(null)/PVSUFF_SUBJ:3MS+humA/PVSUFF_DO:3D</pos>
hm	hum	PVSuff-0ah	he/it <verb> them <pos>+(null)/PVSUFF_SUBJ:3MS+hum/PVSUFF_DO:3MP</pos>
hA	hA	PVSuff-0ah	he/it <verb> it/them/her <pos>+(null)/PVSUFF_SUBJ:3MS+hA/PVSUFF_DO:3FS</pos>
hn	hun~a	PVSuff-0ah	he/it <verb> them <pos>+(null)/PVSUFF_SUBJ:3MS+hun~a/PVSUFF_DO:3FP</pos>
k	ka	PVSuff-0ah	he/it <verb> you <pos>+(null)/PVSUFF_SUBJ:3MS+ka/PVSUFF_DO:2MS</pos>
k	ki	PVSuff-0ah	he/it <verb> you <pos>+(null)/PVSUFF_SUBJ:3MS+ki/PVSUFF_DO:2FS</pos>
kmA	kumA	PVSuff-0ah	he/it <verb> you (both) <pos>+(null)/PVSUFF_SUBJ:3MS+kumA/PVSUFF_DO:2D</pos>
km	kum	PVSuff-0ah	he/it <verb> you <pos>+(null)/PVSUFF_SUBJ:3MS+kum/PVSUFF_DO:2MP</pos>
kn	kun~a	PVSuff-0ah	he/it <verb> you <pos>+(null)/PVSUFF_SUBJ:3MS+kun~a/PVSUFF_DO:2FP</pos>
ny	niy	PVSuff-0ah	he/it <verb> me <pos>+(null)/PVSUFF_SUBJ:3MS+niy/PVSUFF_DO:1S</pos>
nA	nA	PVSuff-0ah	he/it <verb> us <pos>+(null)/PVSUFF_SUBJ:3MS+nA/PVSUFF_DO:1P</pos>
The principle is exactly the same, although the example is slightly more complex: here we have a sequence of two suffixes. The first is the Ø (null) suffix of the perfective third person masculine singular; the second is a pronominal direct object. Notice the first +, which marks the junction with the stem, and the following +, which marks the junction with the Ø suffix.
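To make the role of the + signs concrete, here is a minimal sketch that splits a <pos> value into its form/TAG segments. The class and method names are hypothetical, and the sketch assumes that neither + nor / ever occurs inside a transliterated form.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: names are hypothetical, not part of AraMorph's API.
class PosSegmentsSketch {
    /** Splits a <pos> value on '+' and returns {form, tag} pairs, skipping empty pieces. */
    static List<String[]> segments(String pos) {
        List<String[]> out = new ArrayList<String[]>();
        for (String piece : pos.split("\\+")) {
            if (piece.isEmpty()) {
                continue; // a leading '+' leaves an empty piece: it only marks a junction
            }
            int slash = piece.indexOf('/');
            if (slash < 0) {
                out.add(new String[] {piece, ""});
            } else {
                out.add(new String[] {piece.substring(0, slash), piece.substring(slash + 1)});
            }
        }
        return out;
    }

    /** A leading '+' means something (the stem) is expected before this affix. */
    static boolean expectsStemBefore(String pos) {
        return pos.startsWith("+");
    }

    /** A trailing '+' means something (the stem) is expected after this affix. */
    static boolean expectsStemAfter(String pos) {
        return pos.endsWith("+");
    }

    public static void main(String[] args) {
        for (String[] s : segments("+(null)/PVSUFF_SUBJ:3MS+hu/PVSUFF_DO:3MS")) {
            System.out.println(s[0] + " -> " + s[1]);
        }
    }
}
```

For the first suffix entry above, this yields two segments: the null subject suffix and the direct object pronoun hu.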
Stems
Let's have a look at this snippet taken from the stems dictionary, gpl/pierrick/brihaye/aramorph/dictionaries/dictStems:
;
;--- ktb
;; katab-u_1
ktb	katab	PV	write
ktb	kotub	IV	write
ktb	kutib	PV_Pass	be written;be fated;be destined
ktb	kotab	IV_Pass_yu	be written;be fated;be destined
;; kAtab_1
kAtb	kAtab	PV	correspond with
kAtb	kAtib	IV_yu	correspond with
;; >akotab_1
>ktb	>akotab	PV	dictate;make write
Aktb	>akotab	PV	dictate;make write
ktb	kotib	IV_yu	dictate;make write
ktb	kotab	IV_Pass_yu	be dictated
;; takAtab_1
tkAtb	takAtab	PV	correspond
tkAtb	takAtab	IV	correspond
;; {inokatab_1
<nktb	{inokatab	PV	subscribe
Anktb	{inokatab	PV	subscribe
nktb	nokatib	IV	subscribe
;; {ikotatab_1
<kttb	{ikotatab	PV	register;enroll
Akttb	{ikotatab	PV	register;enroll
kttb	kotatib	IV	register;enroll
;; {isotakotab_1
<stktb	{isotakotab	PV	make write;dictate
Astktb	{isotakotab	PV	make write;dictate
stktb	sotakotib	IV	make write;dictate
;; kitAb_1
ktAb	kitAb	Ndu	book
ktb	kutub	N	books
;; kitAboxAnap_1
ktAbxAn	kitAboxAn	NapAt	library;bookstore
ktbxAn	kutuboxAn	NapAt	library;bookstore
;; kutubiy~_1
ktby	kutubiy~	Ndu	book-related
;; kutubiy~_2
ktby	kutubiy~	Ndu	bookseller
ktby	kutubiy~	Nap	booksellers <pos>kutubiy~/NOUN</pos>
;; kut~Ab_1
ktAb	kut~Ab	N	kuttab (village school);Quran school
ktAtyb	katAtiyb	Ndip	kuttab (village schools);Quran schools
;; kutay~ib_1
ktyb	kutay~ib	NduAt	booklet
;; kitAbap_1
ktAb	kitAb	Nap	writing
;; kitAbap_2
ktAb	kitAb	Napdu	essay;piece of writing
ktAb	kitAb	NAt	writings;essays
;; kitAbiy~_1
ktAby	kitAbiy~	N-ap	writing;written <pos>kitAbiy~/ADJ</pos>
;; katiybap_1
ktyb	katiyb	Napdu	brigade;squadron;corps
ktA}b	katA}ib	Ndip	brigades;squadrons;corps
ktA}b	katA}ib	Ndip	Phalangists
;; katA}ibiy~_1
ktA}by	katA}ibiy~	Nall	brigade;corps <pos>katA}ibiy~/NOUN</pos>
ktA}by	katA}ibiy~	Nall	brigade;corps <pos>katA}ibiy~/ADJ</pos>
;; katA}ibiy~_2
ktA}by	katA}ibiy~	Nall	Phalangist <pos>katA}ibiy~/NOUN</pos>
ktA}by	katA}ibiy~	Nall	Phalangist <pos>katA}ibiy~/ADJ</pos>
;; makotab_1
mktb	makotab	Ndu	bureau;office;department
mkAtb	makAtib	Ndip	bureaus;offices
;; makotabiy~_1
mktby	makotabiy~	N-ap	office <pos>makotabiy~/ADJ</pos>
;; makotabap_1
mktb	makotab	NapAt	library;bookstore
mkAtb	makAtib	Ndip	libraries;bookstores
;; mikotAb_1
mktAb	mikotAb	Ndu	printer
;; mukAtabap_1
mkAtb	mukAtab	NapAt	correspondence
;; {ikotitAb_1
<kttAb	{ikotitAb	N/At	enrollment;registration;subscription
AkttAb	{ikotitAb	N/At	enrollment;registration;subscription
;; {isotikotAb_1
<stktAb	{isotikotAb	N/At	dictation
AstktAb	{isotikotAb	N/At	dictation
<stktAby	{isotikotAbiy~	N-ap	dictation <pos>{isotikotAbiy~/ADJ</pos>
AstktAby	{isotikotAbiy~	N-ap	dictation <pos>{isotikotAbiy~/ADJ</pos>
;; kAtib_1
kAtb	kAtib	N/ap	writer;author
kAtb	kAtib	N/ap	clerk
ktAb	kut~Ab	N	authors;writers
ktb	katab	Nap	authors;writers
;; kAtib_2
kAtb	kAtib	Nall	writing <pos>kAtib/ADJ</pos>
;; makotuwb_1
mktwb	makotuwb	N-ap	written <pos>makotuwb/ADJ</pos>
;; makotuwb_2
mktwb	makotuwb	Ndu	letter;message
mkAtyb	makAtiyb	Ndip	letters;messages
;; mukAtib_1
mkAtb	mukAtib	Nall	correspondent;reporter
;; mukotatib_1
mkttb	mukotatib	Nall	subscriber
;
The format is slightly different: lines beginning with ;; provide a lemma identifier for the entries that follow. The rest is similar.
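As a rough illustration of how the ;; lines could be used, the sketch below groups the vocalized stems under the most recent lemma identifier. The class and method names are hypothetical, and it assumes that stem entries are tab-separated with the vocalization in the second field, as in the other dictionaries.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: names are hypothetical, not part of AraMorph's API.
class StemLemmaSketch {
    /** Maps each lemma identifier to the vocalized stems listed under it, in file order. */
    static Map<String, List<String>> groupByLemma(List<String> lines) {
        Map<String, List<String>> byLemma = new LinkedHashMap<String, List<String>>();
        String lemma = null;
        for (String line : lines) {
            if (line.startsWith(";;")) {
                // A ";;" line names the lemma for all entries up to the next ";;" line.
                lemma = line.substring(2).trim();
                if (!byLemma.containsKey(lemma)) {
                    byLemma.put(lemma, new ArrayList<String>());
                }
            } else if (line.startsWith(";") || line.trim().isEmpty()) {
                continue; // ordinary comment (e.g. ";--- ktb") or blank line
            } else if (lemma != null) {
                String[] fields = line.split("\t", -1);
                if (fields.length > 1) {
                    byLemma.get(lemma).add(fields[1].trim());
                }
            }
        }
        return byLemma;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            ";--- ktb",
            ";; katab-u_1",
            "ktb\tkatab\tPV\twrite",
            "ktb\tkotub\tIV\twrite",
            ";; kAtab_1",
            "kAtb\tkAtab\tPV\tcorrespond with");
        System.out.println(groupByLemma(sample));
    }
}
```

The check for ;; has to come before the check for a single ;, since lemma lines would otherwise be skipped as ordinary comments.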
Notice that the grammatical category is often omitted, since it can be inferred from the morphological category. In some cases, however, it has to be made explicit: for example, nisbas, which are morphologically nominal, may be used as adjectives.
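The fallback described here might look like the sketch below. The mapping rules and the tag names VERB_PERFECT and VERB_IMPERFECT are assumptions chosen for illustration, not AraMorph's actual table; only the idea of preferring an explicit <pos> value over an inferred one comes from the text above.

```java
// Illustrative sketch only: the mapping and all names here are assumptions, not AraMorph's API.
class FallbackPosSketch {
    /**
     * Returns the explicit <pos> value when one is given; otherwise infers a
     * "vocalization/TAG" value from the morphological category prefix.
     */
    static String posFor(String vocalization, String category, String explicitPos) {
        if (explicitPos != null && !explicitPos.isEmpty()) {
            return explicitPos; // e.g. a nisba explicitly marked as ADJ
        }
        String tag;
        if (category.startsWith("PV")) {
            tag = "VERB_PERFECT";   // assumed tag name
        } else if (category.startsWith("IV")) {
            tag = "VERB_IMPERFECT"; // assumed tag name
        } else if (category.startsWith("N")) {
            tag = "NOUN";
        } else {
            return "";              // nothing can be inferred in this sketch
        }
        return vocalization + "/" + tag;
    }

    public static void main(String[] args) {
        System.out.println(posFor("katab", "PV", ""));
        System.out.println(posFor("kitAbiy~", "N-ap", "kitAbiy~/ADJ"));
    }
}
```

With such a rule, an entry like kitAbiy~ would be tagged as a noun by default, which is why its dictionary line carries an explicit adjectival category.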
This would help in processing the dictionaries directly in Arabic rather than through Buckwalter's transliteration system, thus taking advantage of Java's native Unicode support.
Version 2.0 of the Perl version of Aramorph uses XML dictionaries, but is unfortunately not compliant with the GPL.