[sword-devel] Contributing to sword-tools repo?

Jaak Ristioja jaak at ristioja.ee
Fri Jan 15 04:28:12 MST 2016


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Hi!

I apologize for having to write this e-mail in haste. I've tried to
be concise, but I might have missed a few points. My apologies to Troy.

On 14.01.2016 22:12, Troy A. Griffitts wrote:
> This is the part I'm sure will be much less appealing and likely 
> offensive to many of you.  I do believe that the SWORD engine is
> mostly solid.  It has progressed over a period of 25 years and runs
> on a ton of platforms and is really a fairly complicated and
> optimized chunk of code.

Bit rot. Although it is mainly a C++ library, the code does not meet
modern (2000+) coding best practices.

More than a year ago, while debugging an issue I had with a remote
Sword works repository, I ended up taking a look at the relevant
libsword code and found 2-3 security vulnerabilities in the 150 lines
of loose HTML parsing code in just a couple of hours! I guess anyone
with a bit of knowledge or a simple static analyzer could find such
bugs easily.

Afaik Sword contains a lot of hand-written, error-prone parsers for
untrusted input (modules, HTML, etc.). So security-wise, IMHO the
situation is VERY, VERY BAD.

> libsword is not GREAT, but I do think it is really good and does a
> lot of stuff, from syndicated module repositories and module
> installation management, to parsing and referencing multiple
> versifications, to filtering (transforming) from and to a number of
> markup formats, encodings, encryption, features, attribute level
> entry map parsing and retrieval, compression, supporting modular
> storage and  drivers, multiple search engines, locales, and more.

Has there been any effort to make the system less complex and more
comprehensible and modular while still retaining the functionality? My
similar efforts on the BibleTime code since 2009 can be summarized by
the following simple measure (lines deleted minus lines inserted):

bibletime.git $ git log --numstat --author=Jaak --author=jotik \
  | awk '/^[0-9]/{insertions+=$1;deletions+=$2}END{print deletions-insertions}'
117675

And I'm far from finished! I guess the same can be done in Sword.
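
For anyone who wants to reproduce the measure above without awk, here
is a minimal Python sketch of the same arithmetic. The function name
net_deleted_lines and the sample input are hypothetical and only for
illustration; they are not part of BibleTime or Sword. It assumes the
usual tab-separated "added<TAB>deleted<TAB>path" lines that
git log --numstat prints, and it skips binary files (shown as "-"),
much like the /^[0-9]/ pattern does.

```python
def net_deleted_lines(numstat_output: str) -> int:
    """Return deletions minus insertions from `git log --numstat` text."""
    insertions = 0
    deletions = 0
    for line in numstat_output.splitlines():
        fields = line.split("\t")
        # Only per-file stat lines have two leading numeric fields;
        # commit headers, messages and binary entries ("-") are skipped.
        if len(fields) >= 3 and fields[0].isdigit() and fields[1].isdigit():
            insertions += int(fields[0])
            deletions += int(fields[1])
    return deletions - insertions


# Hypothetical sample resembling `git log --numstat` output.
sample = (
    "commit 0123abc\n"
    "Author: Example <example@example.org>\n"
    "\n"
    "    Remove dead code\n"
    "\n"
    "3\t10\tsrc/a.cpp\n"
    "-\t-\ticons/logo.png\n"
    "0\t25\tsrc/b.cpp\n"
)
print(net_deleted_lines(sample))  # prints 32
```

A positive result means more lines were removed than added.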

> In a complex system used by many projects, it is not easy to
> contribute to a core library.  I review all submitted changes and
> give advice for quite some time before giving rw trunk access to a
> developer.  There are probably only 7 or 8 individuals with full
> access to the entire repository.  Others have access to
> individual, less critical parts of the tree.  People have
> complained over the years that I don't accept code when submitted.
> I refuse submissions for a number of reasons.  Sometimes the code
> serves no purpose but to rewrite working code in a more en vogue
> way.  Sometimes the code introduced a 3rd party dependency that,
> while it might make things a little easier, also increases our
> reliance on a library we need to be sure continues to work on all
> the OSs and architectures we support; I lean toward doing a little
> more work if it avoids a 3rd party dependency.  Many people do 
> submit patches which are incorporated into the code base, but it
> is usually after a few rejects with suggestions, and almost always
> with lots of conversation back and forth before any change happens.
> I know this may seem to be against the free and open model of open
> source development, but I don't think it is.  Changes to core
> components of a project can be tightly managed while still giving
> entire freedom to see and use the source code.  I believe strict
> management of the libsword core has enabled it to survive and
> always progress (even if slowly) over our 25+ years of
> development.

25+ years? My respect, Troy! But if you want it to survive, it will
have to adapt to new systems, new developers and new requirements. To
adapt, it needs to be modular and flexible, and the architecture needs
to be well thought through. Afaik this is not the case.

> Huge parts of the engine are submissions by other individuals.  I
> don't want to do all the work myself.  But I do want to assure that
> the library continues to work for all the projects which use the
> library.  I feel it is my primary task as a library administrator.
> I am a firm believer with Joel on Software that one should never
> rewrite a working code base (
> http://www.joelonsoftware.com/articles/fog0000000069.html ) but
> instead make baby steps forward.  I am very pragmatic.  You'll
> often hear me ask the questions: why? what's the problem you're
> having that you're trying to solve?  what can't you do with how
> things work right now?  have you tried your patch with a working
> frontend and how has that worked out?  Changing working code is
> always a heavy negative weight in my mind and you'll need to
> justify the benefit of a change with a heavier weight.  There are
> plenty of new features which need to be written that provide real
> world, end user benefit, to make theoretical changes because it is
> how project X does it or how University Y teaches it, not a
> priority for me or worth the problems that come with any code 
> change.  I certainly want to provide new features to the users of
> our library and thus on to their end users.  I do not want to
> rewrite working code to rewrite code.

I agree. But I think we differ about what exactly is pragmatic. I
think that the phrase "If it ain't broke, don't fix it" is often looked
at from the perspective of individual software bugs. Please consider a
broader perspective:

  * code is broken if it is overly difficult to maintain/fix
  * code is broken if it cannot be comprehended

I think Sword is currently in a situation somewhat similar to that of
OpenSSL in 2014.

> I believe this is the long-standing complaint that brings people to
> say things like, the admins don't want contributions.  This is
> certainly not true.  I don't want many contributions offered
> without respecting the codebase which is already working and in
> place.  I don't want contributions from individuals who aren't
> willing to spend the time necessary to understand the complexity of
> the internals of the engine or the very difficult problem we are
> trying to solve with the engine.  It is actually a much more
> difficult problem than most people understand until they get into
> the details.  I do welcome people who want to work together,
> understand why things are the way they are right now, have a real
> world feature they would like to implement, and then work together 
> to discuss the incorporation of that feature, test it out with
> frontends in the field and then submit a collaboratively designed
> and well tested patch.

There are changes which are very difficult to implement without
API/ABI changes. Personally I'd like to see many changes of this kind,
but I don't want to start (nor do I currently have the time for) a
lengthy discussion in this thread.

Blessings,
Jaak
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iQIcBAEBCAAGBQJWmNfLAAoJEIqsii/PFAD1LeQQAJoxu2HYzVHFnZHbYq6kJv/8
ykqE8InxSWabJVoyu+megWmFs1Rf2wPEwesbHwG0uwKomVRpGj5EbNhh68fnF/N4
0NdyIdr/V91be36GQNMifHORPGTp8SjYBri+g97VCf93I1czoN2f8rh3HvoApmL8
TLUkFXWupOTLQaZTEc6dSrzS7xTgoIYjSIUbohTThZufuvhZA7Oy2nhbF106w6HT
rUwPSvt5kjkfXE8xuq1lAyRZ4Gxc/FP4PCL8pgpty8DXbS4bfcRO2wO8DbUBXCS3
agh6LEEwyvBKXAJBl4aZb+vuWU7jkJ9er5/fLC/1Loy2CA1Ii9U/w0WpV/YBiIXV
3BlvRLLJZybW4MbuQD1LlraHd7moiOxqL87e53e4h6/eFlpTw+C+L3z2CuKPbj9S
A6F6my0aOBVg+wveZhT7i2ba2dlVt0AzWituq8faUqW6nsU5Q8psMEWMucILJh/p
3Gk7asRJ5hFdr4zk49ItVLFVJe160oitNhijRNtJnFu7o4WcYGMoOmBcXox0H4y1
INjz1XxmKsfQ4NgGUF39l8fsbG0Bk7N86wZJEXwYNKG+8s0sYi1A0EbMeC8HThWt
ZeZoRB4DvGSGF6OxH5Rv23WXCoo+eC4Mt5cVtSmafU0cDQRVvIpvp8rOj7EMpUEt
5dtS6WEr/l2oX/XOdqUL
=FWmX
-----END PGP SIGNATURE-----


