[sword-devel] Architecture and DVCS - was : Re: Project "Free Scriptures" started

DM Smith dmsmith at crosswire.org
Wed Feb 26 12:54:27 MST 2014


The problem we experienced with binaries is that we had every version that we ever used checked into SVN. When you checked out from SVN you got only the latest version, so the binaries bloated the repository only on the server, not on the client.

In git, my first pass was naive: I just put all of the SVN history into it. The checkout was absolutely huge, as git gets all history. It bloated the repository on both the server and the client. With qmx's (Douglas) help, we whacked the binaries by rewriting history, and we now have a different mechanism to get the libraries.
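
To illustrate the point above, here is a minimal Python sketch of a toy model, not the actual procedure used for JSword (the post does not say which tool was used; real rewrites are done with tools such as git filter-branch). All file names and sizes are hypothetical, and the model ignores git's compression and delta packing. It shows that a full clone carries every binary version ever committed, while a latest-only checkout does not, and that removing a path from history changes every commit id.

```python
import hashlib

def commit_id(parent_id, files):
    """Content-address a commit from its parent id and its file snapshot."""
    h = hashlib.sha1()
    h.update((parent_id or "").encode())
    for path in sorted(files):
        h.update(path.encode() + b"\0" + files[path] + b"\0")
    return h.hexdigest()

def build_history(snapshots):
    """Chain snapshots into a linear history of (commit_id, files) pairs."""
    history, parent = [], None
    for files in snapshots:
        parent = commit_id(parent, files)
        history.append((parent, files))
    return history

def clone_size(history):
    """Bytes a full clone holds: every distinct blob ever committed."""
    blobs = {}
    for _, files in history:
        for data in files.values():
            blobs[hashlib.sha1(data).hexdigest()] = len(data)
    return sum(blobs.values())

def checkout_size(history):
    """Bytes in the latest snapshot only (the latest-only, SVN-style view)."""
    return sum(len(data) for data in history[-1][1].values())

def rewrite_without(history, is_unwanted):
    """Rebuild the whole history with the unwanted paths dropped from every commit."""
    snapshots = [
        {path: data for path, data in files.items() if not is_unwanted(path)}
        for _, files in history
    ]
    return build_history(snapshots)

# Hypothetical project: one source file plus a jar that is replaced in every commit.
snapshots = [
    {"src/Main.java": b"v1", "lib/dep.jar": b"A" * 1000},
    {"src/Main.java": b"v2", "lib/dep.jar": b"B" * 1000},
    {"src/Main.java": b"v3", "lib/dep.jar": b"C" * 1000},
]
history = build_history(snapshots)
rewritten = rewrite_without(history, lambda path: path.endswith(".jar"))

print("latest-only checkout:", checkout_size(history))
print("full clone before rewrite:", clone_size(history))
print("full clone after rewrite:", clone_size(rewritten))
print("all commit ids changed:",
      all(old[0] != new[0] for old, new in zip(history, rewritten)))
```

Because every commit id changes, anyone holding a clone of the old history has to re-clone or rebase onto the rewritten one, which is why such a rewrite is best done once, early.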

-- DM

On Feb 26, 2014, at 11:15 AM, Chris Burrell <chris at burrell.me.uk> wrote:

> In git it's fairly easy to rewrite the history to remove big binary files as well. Later versions of git are fine with binaries and treat these as binary files rather than trying to work out a diff.
> 
> I'd argue the same with SVN, that the use of binaries bloats the repository ;)
> 
> Chris
> 
> 
> 
> On 26 February 2014 16:08, DM Smith <dmsmith at crosswire.org> wrote:
> Rather than continuing to hijack the prior thread, I'm breaking this out into a new top-level thread. It will be easier to follow in the mailman listing.
> 
> JSword has benefited from going to git, especially with collaboration.
> 
> The vocabulary of git is interesting and perhaps off-putting to those used to SVN:
> Fork - Everyone who uses a git repository gets the whole repository from the beginning of history to the present. As the entire repository, it is really no different than the repository from which it came. This is called a fork. How it is used determines whether a project is forked into multiple separately supported streams. Fork is no longer a nasty word.
> Clone - The process by which a fork is created. It is akin to, but decidedly different from, a checkout in SVN. An SVN checkout is tied to the repository from which it came and is only the top level.
> Branch - In Git, a branch is cheap and is the recommended way of making changes whether small or large. In SVN, a branch, while a cheap operation, is typically done to create separate long-lived streams of development. 
> Merge - This melds the code changes from one branch into another using a common ancestor. It is the same operation as in SVN, at least conceptually. In git, because branches typically change only a few files and are short-lived, it is an easy rather than painful operation. In git, it is common to merge changes from the master branch into a work branch. But a merge from a work branch into the master should only happen once.
> Commit - Makes a change a permanent part of the user's fork and branch.
> Push - Put the commits in one repository into another. In SVN this is called checkin. However in SVN, this can only go to one repository. In git, it can go to any other repository that allows it.
> Pull - Get the commits from another repository into your own repository. In SVN this is called update. However in SVN, this can only come from the one repository. In Git, it can come from any other repository.
> Pull Request - This is a notification to another repository that a commit is available from your repository. This is akin to a patch.
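
The "common ancestor" in the Merge entry above can be made concrete with a small Python sketch. This is only a toy model under stated assumptions: the commit graph and its single-letter ids are hypothetical, and the function computes the best common ancestors by plain set operations, which is not how git implements it internally.

```python
# Hypothetical commit graph: each commit id maps to the ids of its parents.
parents = {
    "A": [],
    "B": ["A"],
    "C": ["B"],       # master moves on from B
    "D": ["B"],       # work branch forks from B
    "E": ["D"],
    "M": ["C", "E"],  # merge commit: work branch merged into master
}

def ancestors(commit, parents):
    """Return the commit itself plus everything reachable through parent links."""
    seen, stack = set(), [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen

def merge_bases(a, b, parents):
    """Return the common ancestors of a and b that are not ancestors
    of any other common ancestor (the 'best' common ancestors)."""
    common = ancestors(a, parents) & ancestors(b, parents)
    return {
        c for c in common
        if not any(c != d and c in ancestors(d, parents) for d in common)
    }

# Before the merge: master tip C and work tip E share B as their base.
print(merge_bases("C", "E", parents))
# After the merge: the work tip E is itself an ancestor of master tip M.
print(merge_bases("M", "E", parents))
```

A three-way merge compares each branch tip against this base, so the closer the base is to both tips (short-lived branches, few files touched), the less there is to reconcile.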
> 
> Currently, we host the master on github. I'd like the master to be on CrossWire, but I think that we'll still use github as the point of collaboration. I like that pull requests are available to all for review. Truly review, comments and all. We've got it hooked into Jenkins (via CloudBees), which allows pull requests to be tested.
> 
> It encourages collaboration because it is an egalitarian model of participation. In practice, we've locked this down to a few gate-keepers being able to
> 
> BTW, it doesn't matter to me if SWORD uses SVN or git. I can do either. Moving to git was a bit painful, as it is important to only hold text files in it. Static binaries (such as small images) are OK. In the past we held all third-party binaries. These had to be removed. We've replaced them with a pull from Maven.
> 
> Hope this is interesting/helpful.
> 
> In Him,	
> DM
> 
> 
> On Feb 26, 2014, at 8:44 AM, Chris Burrell <chris at burrell.me.uk> wrote:
> 
>> From a JSword point of view, I feel it has encouraged more frequent commits and made the changes easier to review. I certainly felt I could commit to the engine more readily, i.e. branching, and building STEP off the branch of JSword while the pull request gets reviewed/merged.
>> 
>> 
>> On 26 February 2014 13:23, Peter Von Kaehne <refdoc at gmx.net> wrote:
>> 
>> > Sent: Wednesday, 26 February 2014 at 10:50
>> > From: "Nic Carter" <niccarter at mac.com>
>> 
>> > Sorry for the top-post-reply, but here it is, so I guess I'm not all that sorry ;)
>> >
>> > The main bit of code you are referring to (parsing the HTML) is my code. There is other code that parses the return from an FTP server, which is ancient code. My code is (relatively) new, only about 3 years old? (I'm sure you can look it up?)
>> > I agree it is completely a hack. I have had no time to fix it, but TBH, when I do "fix" it I will be ripping curl out of PocketSword and using native iOS stuff and will do all downloads that way. (Currently I download various bits using the built-in SWORD methods & various bits using native iOS Obj-C methods.)
>> 
>> The main background to this of course is that HTML parsing was never meant to be in the code, but that the FTP distribution was considered as the norm. The architectural decision behind all this was that any module installation can act as the root for a distribution. No index files, no fuss, no effort. And HTTP transfer was only grudgingly accepted as a way of allowing some frontends to work.
>> 
>> Personally I think this is no longer a convincing decision - many not terribly IT-savvy people might want to distribute small collections of modules and could do this a lot more easily from some webspace than by maintaining a public FTP server.
>> 
>> But this is the background. And I think if you want to change that you need to challenge this specific view of practicality.
>> 
>> > I agree that switching to DVCS is a sane move & that sticking with SVN is like shooting yourself in the foot. However, it seems like it's never going to change, so I'm not going to fight that battle (insert comment about losing battles in order to win the war, and the "war" is producing excellent software for iOS, which I'm actually currently losing, but that has nothing to do with CrossWire and everything to do with myself and lack of time right now).
>> 
>> I would like to see a move to Git. I understand it. I can use it. It gives heaps of benefits, which are all lost when using hacks like git-svn, and there is no sane reason that using Git would lose that central repo with tight control.
>> 
>> JSword has done that move, and I think it was beneficial for all concerned.
>> 
>> Peter
>> 
>> _______________________________________________
>> sword-devel mailing list: sword-devel at crosswire.org
>> http://www.crosswire.org/mailman/listinfo/sword-devel
>> Instructions to unsubscribe/change your settings at above page
> 
> 
> 
