[sword-devel] MDF/SFM Dictionary Import into Sword Format

Daniel Owens dhowens at pmbx.net
Sun Apr 17 19:52:21 MST 2011


I actually have a very crude Perl script to do something like that, but 
I haven't used it for a while. If no one else bites on the project, I can 
take a look at what I have. The most significant challenge I would 
anticipate is that entries might be structured in a variety of ways, so 
something customizable would be needed, which would require someone 
knowing a little bit of Perl. But what I do is mostly search and replace 
using regular expressions. It's the extent of my programming knowledge.
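
As a rough illustration of that search-and-replace approach, here is a 
minimal sketch in Python (not the Perl script mentioned above). It assumes 
one field per line, that the headword marker starts a new entry, and that 
the sense marker starts a new sense. The MARKERS table, the function names, 
and the sample data are all hypothetical; the table is the part a user would 
edit to match their own backslash codes.

```python
import re

# Hypothetical marker table: edit the keys to match your own export.
MARKERS = {"lc": "word", "sn": "sense", "ps": "pos",
           "ge": "gloss", "xv": "example", "xe": "example"}

# A field line looks like: \marker value
FIELD_RE = re.compile(r"^\\(\S+)\s*(.*)$")


def new_sense():
    return {"pos": "", "gloss": "", "examples": []}


def parse_mdf(text, markers=MARKERS):
    """Collect entries as {"word": str, "senses": [sense, ...]}."""
    entries, entry, sense = [], None, None
    for line in text.splitlines():
        match = FIELD_RE.match(line.strip())
        if not match:
            continue  # blank line or continuation text: ignored in this sketch
        role, value = markers.get(match.group(1)), match.group(2).strip()
        if role == "word":
            entry, sense = {"word": value, "senses": []}, None
            entries.append(entry)
            continue
        if entry is None or role is None:
            continue  # field before the first headword, or unknown marker
        # Start a sense on the sense marker, or when a field arrives first.
        if sense is None or (role == "sense"
                             and (sense["gloss"] or sense["examples"])):
            sense = new_sense()
            entry["senses"].append(sense)
        if role in ("pos", "gloss"):
            sense[role] = value
        elif role == "example":
            sense["examples"].append(value)
    return entries


def to_imp(entries):
    """Format entries alphabetically in the $$$key layout described below."""
    out = []
    for entry in sorted(entries, key=lambda e: e["word"].casefold()):
        out.append("$$$" + entry["word"])
        for sense in entry["senses"]:
            out.append(" ".join(p for p in (sense["pos"], sense["gloss"]) if p))
            if sense["examples"]:
                out.append("")
                out.extend(sense["examples"])
            out.append("")
    return "\n".join(out) + "\n"


# Made-up sample data, only to show the shape of the input.
SAMPLE = r"""
\lc beta
\sn 1
\ps n
\ge first gloss
\xv example one
\xe translation one
\sn 2
\ps v
\ge second gloss

\lc alpha
\sn 1
\ps n
\ge only gloss
"""

if __name__ == "__main__":
    print(to_imp(parse_mdf(SAMPLE)), end="")
```

Sorting uses plain case-insensitive string order, which will not match the 
alphabetical order of every language, so a real script would probably need 
a custom sort key.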

One other option I am looking into, since at textonline.org we are using 
WeSay to develop lexicons that ultimately will end up in SWORD, is 
creating a script or xsl file to go from LIFT XML to TEI XML. I haven't 
tried to do that yet, but you could export from FieldWorks to LIFT and 
then run the script. I don't plan to tackle this real soon, but it could 
be another option.
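
For what it is worth, the LIFT-to-TEI idea could start from something like 
the Python sketch below. It is untested against real WeSay or FieldWorks 
exports and has not been validated against the TEI schema. It assumes LIFT 
entries carry the headword in lexical-unit/form/text, the part of speech in 
a grammatical-info value attribute, and the gloss in gloss/text, and it 
assumes the SWORD importer wants entryFree elements keyed by an n attribute. 
The function names and the sample data are hypothetical.

```python
import xml.etree.ElementTree as ET


def text_of(parent, path):
    """Return the stripped text at path under parent, or "" if missing."""
    node = parent.find(path)
    return (node.text or "").strip() if node is not None else ""


def lift_to_tei(lift_xml):
    """Map LIFT entries to a bare TEI <body> of <entryFree> elements."""
    lift = ET.fromstring(lift_xml)
    body = ET.Element("body")
    for entry in lift.iter("entry"):
        headword = text_of(entry, "lexical-unit/form/text")
        if not headword:
            continue  # skip entries with no headword
        tei_entry = ET.SubElement(body, "entryFree", {"n": headword})
        ET.SubElement(tei_entry, "orth").text = headword
        for sense in entry.findall("sense"):
            tei_sense = ET.SubElement(tei_entry, "sense")
            gram = sense.find("grammatical-info")
            if gram is not None and gram.get("value"):
                ET.SubElement(tei_sense, "pos").text = gram.get("value")
            gloss = text_of(sense, "gloss/text")
            if gloss:
                ET.SubElement(tei_sense, "def").text = gloss
    return ET.tostring(body, encoding="unicode")


# Made-up sample data, only to show the assumed shape of a LIFT file.
LIFT_SAMPLE = """<lift version="0.13">
  <entry id="alpha">
    <lexical-unit><form lang="xx"><text>alpha</text></form></lexical-unit>
    <sense>
      <grammatical-info value="n"/>
      <gloss lang="en"><text>only gloss</text></gloss>
    </sense>
  </entry>
</lift>"""

if __name__ == "__main__":
    print(lift_to_tei(LIFT_SAMPLE))
```

Examples, multiple glosses per sense, and the TEI header are left out here; 
an XSL stylesheet would cover the same mapping declaratively.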

But a more experienced person who is more than just a tinkerer like me 
could probably help you out very quickly. MDF files are very simple.

Daniel

On 04/17/2011 08:11 PM, Luke Schroeder wrote:
> Does anyone with a basic knowledge of programming see any merit in the 
> following idea?  Make a script that takes basic components of a 
> dictionary and formats it alphabetically into the sword imp format.  
> The imp format being as follows:
>
> $$$word in another language
> Definition or gloss of word immediately below.
>
> The dictionary format being the standard used by Bible translators, 
> called MDF (Multi-Dictionary Formatter) and adapted in SIL's program 
> FieldWorks.  My idea of a script would be to allow the 
> user to input the backslash codes into the script for the word, sense 
> number, part of speech, definition, example sentence, example sentence 
> translated.  In our dictionary these codes would be: \lc, \sn, \ps, 
> \ge, \xv, \xe, but users of other dictionaries would want the ability 
> to adapt the backslash codes a little, since FieldWorks often gives 
> backslash codes that include the language name.  The script would then 
> output a file in imp format.  The file would look like this:
> $$$Word
> [Part of speech] [Definition] (1st sense of word)
>
> [Example Sentence]
> [Example Sentence translation]
>
> [Part of speech] [Definition] (2nd sense of word)
>
> [Example Sentence]
> [Example Sentence translation]
> etc.
>
> It seems to me like this is very basic programming.  It also seems to 
> me that if someone really got into the project they could do a lot more 
> than what I just described.  Finally I find this beneficial, because 
> most of the Bible translation teams for minority languages are working 
> on dictionaries at the same time.  For people like the group I work 
> with, the dictionary is valued more than the Bible.  It is easy to get 
> the Bibles into Sword format.  If this script were made it would be 
> easy to get dictionaries into sword format (at least in their most 
> basic form).  Having the two distributed together allows interaction 
> between both works.  It helps younger people in the language 
> understand the deeper words of their language.
>
> Any programmers out there interested?
>
> Sincerely yours,
> Luke S.


