[bt-devel] Some pictures

Matthew Talbert ransom1982 at gmail.com
Thu Feb 26 13:02:11 MST 2009


> I changed the BibleCS installation directory to C:\Program Files
> (x86)\Žibřický\The SWORD Project and updated SWORD_HOME.  My drive is
> NTFS formatted.  BibleCS still finds the modules properly (granted, it
> is located in that directory), but BibleTime now does not find any
> modules.  So the problem lies somewhere outside of the MinGW
> toolchain, since MinGW has not had anything to do with my BibleTime
> build.  I'm pretty agnostic when it comes to most UNICODE issues in
> coding, so placing the blame isn't my expertise - but VC doesn't seem
> to handle this issue any better than MinGW.
>
>>
>> Here is my reasoning:
>>
>> char *getenv(const char *);
>> int open(const char *, int, ...);
>>
>> However your toolchain implements these methods in a unicode aware world,
>> just from sheer consistency, whatever you receive from getenv, you should be
>> able to pass to open.
>>
>> I just tried BibleCS (Borland's impl) and it works fine with SWORD_PATH set
>> to d:/Žibřický
>
> Probably Borland has a self-implemented version of the runtime, but
> with people also using MinGW and VC, both of which fail to do this
> properly, it seems that the onus is on us developers to at least find
> workarounds until Microsoft decides to patch its libraries.
>
> --Greg

Troy has reported that BibleCS fails on NTFS. I incorrectly mentioned
FAT32, where it apparently works. In your scenario, Greg, I suspect it
is only working because it is using the current directory.

At issue is the mismatch between the encoding the C runtime uses for
char * strings and the encoding NTFS uses for filenames. NTFS stores
filenames exclusively as UTF-16, while the narrow C runtime functions
(getenv, open and friends) work in the current ANSI code page, so any
filename containing characters outside that code page cannot be
obtained or opened through them.

This isn't the place to discuss what Microsoft should have done.
However, after reading about this quite thoroughly, I am satisfied
that they implemented the only solution viable at the time that
preserved backwards compatibility. Keep in mind that UTF-8 was not yet
in use when MS decided to use UTF-16 for filename encodings.

From a technical standpoint, it is indeed sub-optimal that you can't
just use open across all systems. However, my understanding is that
similar problems can arise on *nix as well, specifically for users of
a *nix that used some encoding other than UTF-8 in the past and who
still have filenames in the old encoding. This is probably a
relatively rare scenario for *nix users, but the equivalent is more
common on Windows, which is why MS provides an API (the wide-character
one) that always returns the correct name.
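
As a rough sketch of what that means in code (this is not what SWORD
or BibleTime currently does): on Windows the wide-character
counterparts _wgetenv and _wopen keep the name in UTF-16 from end to
end, so it never passes through the ANSI code page. The helper name
and the EXAMPLE_FILE variable are made up for illustration only.

```c
#include <fcntl.h>
#include <stdlib.h>

#ifdef _WIN32
#include <io.h>     /* _wopen */
#include <wchar.h>  /* wchar_t, _wgetenv */
#endif

/*
 * Hypothetical helper: open, read-only, the file whose full path is
 * stored in the (made-up) environment variable EXAMPLE_FILE.
 * Returns a file descriptor, or -1 if the variable is unset or the
 * file cannot be opened.
 */
int open_file_named_by_env(void)
{
#ifdef _WIN32
    /* Wide variants: the path stays UTF-16 from the environment
     * block all the way down to the filesystem call. */
    const wchar_t *path = _wgetenv(L"EXAMPLE_FILE");
    if (path == NULL)
        return -1;
    return _wopen(path, _O_RDONLY | _O_BINARY);
#else
    /* Elsewhere the bytes returned by getenv are handed to open
     * unchanged, whatever encoding they happen to be in. */
    const char *path = getenv("EXAMPLE_FILE");
    if (path == NULL)
        return -1;
    return open(path, O_RDONLY);
#endif
}
```

The rest of the program would then have to carry wchar_t paths (or
convert at the boundary) on Windows, which is the real cost of this
approach.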

Matthew


