- Thursday, June 12, 2008
DVCS Myths
My last post on distributed version control systems (DVCS) generated some interesting discussion, both in the comments here and elsewhere on the Web. A number of the responses were interesting and thought provoking, while others were so full of FUD and misinformation that I couldn't help but wonder if they were serious. I'll admit that I was surprised by some of the negative backlash against DVCS. I have explained it to many former users of centralized systems, and it never struck me as a particularly controversial technology. I don't want to simply ignore the criticism, however. This post is an attempt to respond directly to some of the more common criticisms, and hopefully to convince some of the skeptics that even if DVCS isn't the solution for you, at least it won't set your computer on fire.
DVCS Myth #1: You must change your workflow to adopt DVCS
Many descriptions of DVCS focus on the new and interesting workflows it enables. This is indeed a key feature of distributed version control, but it tends to imply that DVCS is only useful if you really need to change your workflow.
This is entirely untrue. DVCS is flexible, and can be implemented in some very interesting and unique ways. However, it can also act just like your centralized system, and its advantages are no less significant.
At our company, for example, we switched from Subversion to Mercurial without changing our model at all, at least initially. We kept the same branch structure, used the same server, and did things in generally the same way. As our team has grown and diversified, our needs have as well, so we've leveraged some of the strengths of the DVCS model to match our workflow. The key is that DVCS works with your desired workflow rather than dictating it. If your desired workflow is similar to or identical to the "central server" model, that's a perfectly acceptable use case for applying DVCS.
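As a rough illustration, the day-to-day loop in this "centralized" use of Mercurial mirrors Subversion almost command for command. This is a minimal sketch, not our exact setup; the server URL is hypothetical:

    # One-time setup: clone the shared repository (instead of "svn checkout")
    hg clone http://hg.example.com/project
    cd project

    # The daily loop, just as in a centralized system:
    hg pull -u                     # fetch and apply the latest changes from the server
    hg commit -m "Fix the widget"  # record your edits locally
    hg push                        # publish them to the shared server

The only new concept is that commit and publish are separate steps, which is precisely the decoupling the later myths build on.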
DVCS Myth #2: Workflows enabled by DVCS are less natural than the centralized workflow
For long-time users of centralized systems, this is an understandable belief. Indeed, the workflow mandated by a centralized system may in some cases be the most natural. In these cases, DVCS offers the best implementation of the centralized workflow I've found. It's in cases where the centralized model is not the most natural workflow, however, that the unique properties of DVCS really shine.
As a specific example, DVCS has enabled me to manage changes to my home directory much more naturally than in a centralized system. I keep the contents of my home directory (dot files, elisp, etc.) under version control. I was using Subversion prior to discovering DVCS. With Subversion, I ran the server on my home development workstation, which I left powered on during the day so it was accessible from work (forcing me to pay otherwise unnecessary power costs). In addition, I paid $5 per month to my ISP for a static IP (dynamic DNS was unfortunately not an option due to the NAT configuration of my fiber-to-home service).
Despite these costs, the workflow in this setup was extremely unnatural. When I made changes on the bus, I had to leave the files in a modified state. Upon arriving at work, I would then have to open the laptop, connect to the network, and either make a bulk checkin with all of the changes or manually partition the modified files into the proper groups for changesets.
If, on the other hand, I made changes on my work computer and wanted to check them in while my home server was down (because of a network outage, or simply because I forgot to turn it on in the morning), I would have to manually generate patches from the repository (again, forcing myself to later reassemble them into logical changesets). Of course, accessing source control over the Internet is never ideal from a performance perspective, even when the server is always up. This is particularly true when using a strained corporate connection to talk to a server on an upload limited consumer line.
This was an annoying process, to say the least, and while it was a huge improvement over manually copying my home directory around, it left much room for improvement.
With DVCS, all of the annoyances of the previous model are gone. I can make commits from the bus without network access, and these commits are properly organized into the appropriate changesets as opposed to a giant single patch. I can easily pull these changes into my work computer's repository when I get to work, or I can leave the laptop in the bag and merge them another time. The changes I make on my work computer, meanwhile, need to make it back to my home machine. However, with my DVCS-powered workflow I now keep my machine turned off during the day (making DVCS the green SCM choice). I have also canceled my static IP service, saving myself $5 a month. In the absence of direct access to my home repository, I use a variety of mechanisms for sharing changes. Most commonly, I transfer changesets to my home machine via my laptop's repository. In other cases, I will export a handful of changesets and transfer them with a USB thumb drive or via email. In general, I use the most convenient option available, though I have used all three in various situations.
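For the curious, the "export a handful of changesets" option looks roughly like this in Mercurial; the filenames, revision numbers, and the --base revision are invented for illustration:

    # On the source machine: bundle everything the destination doesn't have yet
    hg bundle --base 1400 changes.hg      # 1400 = a revision both sides already share

    # ...or export specific changesets as plain-text patches (handy for email)
    hg export -o work-%R.patch 1412 1413 1415

    # On the destination machine: apply whichever form arrived
    hg unbundle changes.hg
    hg import work-*.patch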
Regardless of where the work happens or how it's transferred, merging the changesets is simple with DVCS because, well, DVCS is designed to make merging changesets simple. It's simple no matter where the changesets originated, in part because DVCS uses unique hashes to identify changesets. It also tracks the parent revision of each changeset, so it can determine cases where a merge isn't necessary at all. This, unsurprisingly, is the most common case given that I'm the only user in this scenario.
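Mechanically, bringing the bus commits into another clone is a pull followed, only when the histories have actually diverged, by a merge. A sketch, with a hypothetical local path:

    hg pull ~/laptop-repo/home     # copy in the new changesets
    hg update                      # linear history: nothing to merge, just update

    # If both repositories had new work (two heads), merge instead:
    hg merge
    hg commit -m "Merge laptop changes"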
One thing you may have noticed in the workflow description is that it's a bit ambiguous which of my computers is "the server". Previously, it was my home machine, but why? It could just as easily have been any other machine ... in fact, I probably would have been better off running the centralized server on the laptop, though that doesn't seem quite right either. The fact is, this is a workflow where "the server" is naturally ambiguous. There is no real value in designating my home machine (or any other) in this role. Thus, the centralized model for version controlling my home directory simply isn't a natural fit. The DVCS model, on the other hand, easily and naturally supports my desired workflow. There are no "hacks" required to make this work cleanly.
As an added bonus, I get free offsite backups of my home directory repository. This leads to our next myth.
DVCS Myth #3: DVCS users don't believe in backups
The idea that DVCS users don't believe in backups is surprisingly pervasive, perhaps because of the passive attitude DVCS advocates tend to have about server outages. At our company, we have the same attitude, but we also make very frequent backups of our centralized repository. Using DVCS may theoretically reduce the need for backups, but by no means does it eliminate it.
So why back up a source control server when so many clones of it already exist? It is improbable that many machines will suffer catastrophic hardware failures simultaneously, but it is not impossible. A more likely scenario might be a particularly nasty computer virus that sinks its teeth into an entire network of vulnerable machines. In any case, the probability of any or all of your clones becoming suddenly unavailable is not really the point. The bottom line is that using independent clones as canonical backups (as opposed to temporary stopgaps) is a suboptimal strategy.
Security, for example, should be considered. If you are using authorization rules to control access to specific portions of your repository, canonicalizing an arbitrary clone of the repository effectively renders those rules useless. While this would rarely be a matter of practical concern in a controlled corporate environment, it is nonetheless possible. It is worth noting that in an environment where a backup process is infeasible (for financial, political or other reasons), backing up hashes of the repository files and their revisions for post-backup verification could provide a mitigation.
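One minimal sketch of that hash-based mitigation, assuming Mercurial; the paths and the idea of running it from a scheduled job are illustrative:

    # On the canonical server, run periodically and store the output somewhere safe:
    hg log --template '{node}\n' | sort > known-changesets.txt

    # Later, before promoting a clone to canonical status, compare:
    hg -R /path/to/clone log --template '{node}\n' | sort > clone-changesets.txt
    diff known-changesets.txt clone-changesets.txt   # any difference warrants investigation

    # Mercurial can also check a repository's internal consistency:
    hg verify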
The key win of DVCS for backups, then, is that you don't really need to invest in a "hot" backup. When the server inevitably goes down, DVCS will buy you time. Lots of time. You'll essentially be running at full productivity (or very nearly so) while you rebuild your server from backup. When changesets created during the server downtime are pushed back to the restored server, the freshly restored authorization rules will be reapplied and you'll be back on track.
DVCS Myth #4: Authentication and authorization don't exist with DVCS
I touched on this a bit in the previous myth, but it's worth emphasizing. Authentication and authorization absolutely do exist in the DVCS model. They only apply where you choose to apply them, however.
Our company has a canonical source control server which applies both authentication and authorization rules. The authentication rules are specified via the Apache server configuration which provides network access to the repositories. In fact, we leveraged the exact same Apache authentication configuration we had used for our Subversion installation (this configuration allows us to leverage the user database in the company's Windows domain). For authorization, we use the more flexible options offered by Mercurial's ACL configuration. In the simplest case, we have developer-specific copies of mainline development branches which can be pulled by anyone, but only pushed (written) to by the developer who owns them. Grouping users and splitting access based on subpaths of a single repository are nearly as simple.
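For reference, the Mercurial side of such a setup amounts to a few lines of configuration. This is a hedged sketch rather than our exact config; the user names, paths, and served-repository layout are illustrative:

    # .hg/hgrc on the served repository -- hgweb's built-in push control:
    [web]
    allow_push = alice, bob        # anyone may pull; only these users may push

    # Path-based rules via the bundled acl extension:
    [extensions]
    acl =

    [hooks]
    pretxnchangegroup.acl = python:hgext.acl.hook

    [acl]
    sources = serve                # only check changesets arriving over the network

    [acl.allow]
    src/payroll/** = alice         # only alice may modify files under src/payroll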
Because the authorization rules are applied when changesets are pushed, developers working in local repositories are not denied any flexibility until they attempt to push their changes. A user in this scenario can still commit changes against the repository; they just can't push them directly. They would instead have to convince a developer who does have such permission that their changes are worthy of inclusion. Despite the flexibility, the effectiveness of the authentication and authorization rules is not compromised.
DVCS Myth #5: DVCS can be used in corporate environments, but its advantages are mostly geared towards open source projects
DVCS is indeed quite popular for open source projects, and the reasons are fairly obvious. When many disconnected developers are working on the same project, the workflow flexibility provided by DVCS becomes increasingly important. It also provides a clean mechanism for remote users without commit access to the primary repository to create new functionality within the code base.
A user working on a new experimental feature, for example, can perform the work on their local clone of the repository. Within this local copy, they can commit changesets and integrate changes from other users. Importantly, as the mainline codebase evolves, they can also merge their changes with updated upstream code in a clean and organized way. In a centralized system, they would be forced to maintain their changes as a set of patches, manually rebasing them when they pull down new changes in the upstream repository. When their experimental feature is complete, they can easily export the new work and send a compact package to the appropriate maintainer.
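That contributor life cycle, sketched in Mercurial commands; the URL and revision number are hypothetical:

    hg clone http://hg.example.org/project experimental-feature
    cd experimental-feature
    hg commit -m "First cut of the experimental feature"

    # Periodically absorb the evolving mainline:
    hg pull http://hg.example.org/project
    hg merge
    hg commit -m "Merge upstream changes"

    # When the feature is done, package just the new work for the maintainer:
    hg bundle --base 2100 feature.hg    # 2100 = the upstream revision already published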
The workflow flexibility DVCS offers is particularly valuable in an open source project with multiple maintainers. Each maintainer in this scenario is responsible for integrating community contributions to their own modules. The contributions mostly come in the form of structured patches exported by the DVCS client. The process of integrating these patches is easier and more organized with DVCS, and merging responsibilities for accepted patches can easily be split amongst the maintainers.
In corporations, on the other hand, it is rare to find groups collaborating by sharing patches. Thus, at a glance, the flexibility offered by the DVCS model might seem to be overkill. In some cases this is true; I doubt any company or organization needs every feature provided by DVCS. However, having the capability to restructure your workflow is extremely valuable, even if you don't need it yet. And the parts of DVCS you don't use certainly don't cost you anything.
Perhaps you want to prevent a group of developers from committing changes to your mainline repository until they are reviewed. Using a centralized system, the developers from this group must submit patches for review and integration with the main code base. Their commit logs are thus lost, overwritten by the reviewing developer who applies the patch. DVCS makes this significantly easier. Those responsible for reviewing the changes simply pull reviewable changesets directly from the developer, or they can pull them via a developer-specific branch on the server. If the changes pass review, they can push them along to the main repository (only they would have access to do this). This group might also collect their changes together into their own shared repository, enabling a variety of changes to be tested together.
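In Mercurial terms, the reviewer's side of this gatekeeping is only a few commands. A sketch, with hypothetical repository URLs (port 8000 is simply hg serve's default):

    # See what the developer has that the mainline doesn't, before taking anything:
    hg incoming http://dev-box:8000/project

    # Looks plausible? Pull it in and review the actual diffs locally:
    hg pull http://dev-box:8000/project
    hg log -l 5 -p                 # the most recent changesets, with their patches

    # Passed review -- and only reviewers have push access to the mainline:
    hg push http://hg.example.com/mainline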
These sorts of scenarios can be especially useful when collaborating on a single project with an external development organization. I have seen attempts to use centralized version control systems in these scenarios fail miserably. Corporate centralized servers are rarely designed to be exposed on public networks, so naturally administrators shy away from enabling remote access, except via VPN. When no shared central server is available, the inevitable result is a hackish process. In the best case this might be a process based on exchanging patches from known parents, but more commonly it involves trading full copies of the source code with all revision history lost, then performing painful manual merges. With DVCS, you simply send the whole repository once, then share the exported changesets by whatever transfer mechanism is most convenient (server access is just a bonus). Merges can again be performed by either development organization, which is especially convenient.
DVCS Myth #6: Having a server with perfect uptime invalidates the advantages of DVCS
Even if you are not on a plane, you may very well be on the bus, or at home, or at a coffee shop, or in a hotel room, or on vacation. Just because your server has perfect uptime doesn't mean you're always in position to access it.
I have personally needed source control repository access in every single one of these places (including the plane), and I cannot rely on high speed Internet access in most of them. A couple of weeks ago, while riding in on the bus, I got a call about a bug that needed fixing. I was only 10 minutes from my destination, but that was enough time to crack open my laptop and run a bisect session to uncover the changeset which introduced the bug. Upon arriving at work, I knew exactly what the problem was and I was able to fix it immediately. With more time, I could have prepared the fixed changeset on the bus, ready to transfer as soon as I arrived. DVCS not only gives you access to your repository everywhere, it offers the most performant experience possible in all of these scenarios.
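That bus ride, approximately, in commands (the revision numbers are invented):

    hg bisect --reset
    hg bisect --bad                # the current revision exhibits the bug
    hg bisect --good 1450          # a revision known to predate it

    # Mercurial checks out the midpoint; build, test, then report the result:
    hg bisect --good               # or --bad, as appropriate, repeating until it
                                   # announces the first bad revision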
The performance aspect is a relevant point when speaking about uptime. As far as I'm concerned, if my "annotate" command takes 10 seconds to run, that counts as 9.5 seconds of downtime, because with DVCS it's virtually instantaneous. A slow responding centralized server can easily cost you as much productivity as an occasionally inaccessible one in the long run.
Hiccups can be mitigated to some extent by spending a great deal of money on your source control server, the hardware between it and your workstations, and appropriately qualified staff to make it all work together. However, I've rarely seen companies willing to make the required investment (never in my own experience, and quite rarely in others). Even in those that do, it's still only a mitigation (a server running on good hardware still inevitably gets sluggish under load), and it only lasts so long. As your repository grows and your team expands, the scaling pressure increases. DVCS, on the other hand, grows with your company, and always provides optimal performance. Thus, you don't need to spend an extra $25,000 on your server hardware "for future hires".
DVCS Myth #7: DVCS encourages chaos in your development process
This seems to be the issue that, more than any other, causes the anti-DVCS crowd to load up the FUD cannons. I've not seen any evidence to suggest that DVCS encourages degradation in teams. In fact, I have seen the opposite effect. Because DVCS can be shaped to the natural workflows of your team, when implemented properly it enables teams to work more smoothly, with less communication overhead. With that said, the fear, uncertainty and doubt surrounding this issue isn't going away, so it is only fair to address what seems to be such a "hot button" issue.
For starters, let's define chaos. I think it's important to understand that "chaos" is a moving target. A happy Subversion user might see the flexibility offered in the DVCS model as potentially chaotic. Meanwhile, there are many happy Visual SourceSafe users (really, there are -- I've met some, they are nice people despite this) who find the idea of a non-exclusive locking source control system to be the very definition of chaos.
Most people who have been writing software for a long period of time in a team environment accept that having multiple developers edit the same file at the same time is not only an acceptable form of chaos, but a very necessary one. It may not be intuitive (it sounds a lot like chaos until you've realized its importance), but it's almost universally recognized within mature development organizations that the massive productivity cost of a strict locking system far outweighs the cost of merging changes when multiple developers edit the same file.
So at least on this single issue, the DVCS proponents and detractors agree with each other wholeheartedly, or at least the vast majority of them do. It is not a huge step, then, to imagine other scenarios which might seem chaotic on the surface, but in fact enable huge gains in productivity.
We can acknowledge, having established that chaos can be valuable, that DVCS allows for chaos. All systems allow for some form of it. It is up to your team to determine the appropriate level of chaos that is permitted, and to enforce the process. This is true no matter what system or process you are introducing. If a particular developer working under a centralized source control system never checks in their work, that's a process failure, not a technology problem.
Many common situations in centralized systems lead to chaos as well. To me, the fact that a user cannot check in a set of changes until they've merged in everyone else's work on the same branch is chaos. This makes it far too easy to lose work because of a "merge gone wrong". I have seen developers switch workstations to resolve merge conflicts on more than one occasion. Being unable to check in in-progress changes that you don't yet wish to share with others also leads to chaos. Developers wishing to add this layer of control with a centralized system today are forced to either do it manually (by making a copy of the in-progress work or the relevant patches in case they need to back out) or to adopt a local DVCS, as sketched below.
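The "local DVCS" option can be as simple as initializing a repository inside an existing checkout. A rough sketch with Mercurial layered over Subversion; the one subtlety is keeping each tool's metadata out of the other's way:

    cd my-svn-working-copy
    hg init
    printf '%s\n' '\.svn/' > .hgignore    # default .hgignore syntax is regexp
    hg add
    hg commit -m "Baseline imported from the svn working copy"

    # Commit locally as often as you like; when a coherent unit of work is ready:
    svn commit -m "One complete, reviewable change"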
The rapidly increasing popularity of running DVCS locally on top of centralized repositories really speaks to the need for the flexibility it offers. If you ask around, you'll find a number of different reasons why a given developer might have adopted this strategy. Nearly all of them are good arguments for DVCS in general. Some may want to version changes at a more granular level before sharing their changesets. Some might want a layered mechanism for transferring partial changesets between different environments. Others might value the ability to seamlessly create private branches for managing a particular single user workflow.
Indeed, DVCS provides significant benefits when used by a sole developer on top of a centralized server. But when enabled for an entire organization, it becomes even more powerful. For starters, all users of the system instantly gain access to the valuable features of DVCS. Even developers that don't take advantage of the more advanced DVCS features will instantly benefit from a speed improvement. More importantly, the workflow flexibility enjoyed by the individual user now extends to the entire team.
Having a source control system that supports your workflow and enables people to work together optimally is very likely to lead to less chaos in your company or organization.
DVCS Myth #8: All DVCS proponents think centralized version control systems are useless pieces of garbage, and that you're insane for using them
I think this perception is common, and triggers a defense mechanism that in many cases gets in the way of having a rational discussion of DVCS. First of all, most DVCS users used a centralized version control system before switching over. And most of them didn't choose to use diff & patch in lieu of that centralized system (with one rather notable exception).
I personally have several years of experience with CVS, Perforce and Subversion. I have actually had generally positive experiences with all of those tools, and I'd take any of them over a diff & patch based version control strategy. However, part of the reason I was able to coexist peacefully with these tools is that I bent my development processes to fit their limitations. Subversion's sub-par branching, for example, was annoying but not crippling because I avoided having lots of branches, instead choosing to unnaturally manipulate process (or even release dates). Perforce won't let you blink without server access, so I wrote a layer of proprietary code on top of p4 to manually reattribute files and generate scripts to eventually notify the server of opened-and/or-changed-but-the-server-doesn't-know-about-it files (yeah, and DVCS is chaotic). Everyone I worked with either had their own hacky solution to this problem, or they simply stopped getting work done when they didn't have server access.
As a generally content user of these centralized systems, I was curious enough about DVCS to read the occasional article touting it, but it never really hit me that it could make such a significant impact on my own development process, or the process at our company. It's difficult to see just how broken particular workflows are until they're fixed. As I began to better understand the advantages of DVCS, I started to become more aware of the annoying hacks that I was employing in an attempt to get work done under a centralized system.
DVCS Myth #9: DVCS is hard to learn
Before becoming a DVCS user, I definitely had this perception. DVCS can seem very intimidating. Typical explanations of DVCS are littered with complex workflow descriptions that are rarely familiar or intuitive to users indoctrinated in a centralized source control system mindset. This often makes DVCS seem overly complex or even irrelevant to one's needs.
To a degree, DVCS is difficult to learn. A system that allows for a great deal of flexibility is naturally more difficult to learn than a system with limited capability. However, in the context of a particular problem you need to solve, DVCS is quite easy to learn. If, for example, you decide to replace Subversion with Mercurial and continue using the same trunk / branch model, there is very little to learn in order to make the switch.
Thus, DVCS itself is not "hard to learn". It can be quite challenging, however, to determine the best possible workflow for change management at your company. Because DVCS expands your options in this area, it's easy to mistake it as "difficult". Conceptually, DVCS is really quite simple. It's the optimized application of DVCS that is challenging. If you're intimidated by it, start by using it to imitate your existing workflow, then look for gaps in the efficiency or flexibility of your workflow. Chances are, DVCS will be able to solve them.
DVCS Myth #10: DVCS is hard to use
Once a particular DVCS workflow has been established, the difficulty of day-to-day usage of the system is very similar to centralized systems with equally complex workflows. Many DVCS implementations include more granular commands than are offered by centralized systems, but it's usually simple to emulate them. Following are a few examples of common Subversion commands and their equivalent in Mercurial.
Operation | Subversion | Mercurial
Commit changes to remote server | svn ci | hg ci && hg push
Get changes from remote server | svn up | hg pull -u
Show change log | svn log | hg log
Annotate a revision | svn blame | hg annotate
Show status of changed files | svn status | hg status
Show changes in current files | svn diff | hg diff
Print a file's contents at a particular revision | svn cat -r 55 | hg cat -r 55
Cherry pick a single revision from branches/main to trunk | svn merge -r720 ../../branches/main | hg transplant -s ../../branches/main f587e

Merging all unmerged revisions from branches/main to trunk deserves its own comparison. In Subversion:

    svn log | grep -i merging
    ...
    svn merge -r640:646 ../branches/main
    svn merge -r681:682 ../branches/main
    svn merge -r689:662 ../branches/main
    svn merge -r667:669 ../branches/main
    svn merge -r676:719 ../branches/main
    svn merge -r725:730 ../branches/main
    svn merge -r734:HEAD ../branches/main

In Mercurial:

    hg pull ../branches/main

To learn this basic set of commands, given a background with a centralized system and a similar or identical workflow, would take only a couple of minutes. Fortunately, you'll buy back those minutes and many, many more each time you run these commands. It can be a bit startling at first to adjust to all of your VCS commands running so fast, but you'll cope, I promise. And if you're a Subversion or CVS user, you can stop scheduling "branch days" on your calendar.
DVCS Myth #11: DVCS is a fad
At some point, it became acceptable to discount the value of all new technology with a reference to some unrelated technological flop. DVCS is the new Betamax, apparently, simply by virtue of the fact that it's new and different. Despite these inane comparisons, the question itself is worth pondering.
For a technology to be a fad, there needs to be some initial period of excitement and adoption, followed by a relatively rapid dilution of interest. Technologies that end up in the "fad" category tend to be those that can drum up excitement with marketable promises, but either fail to deliver on those promises or miss a key element required to reach a "tipping point". Most technologies that we associate with the "fad" label were interesting enough to justify at least some initial excitement at one point in their history. Laserdisc was a failure, historically speaking, but putting video content on an optical disc and enabling interactive features doesn't seem like such a bad idea these days.
DVCS certainly meets the criteria for fad potential, at least at this point in its history. It has a strong and growing base of highly passionate users and evangelists. It's also a relatively new technology, despite having a few years of success stories in its wake. So, will DVCS continue to accelerate? Let's look at some of the "fad factors" as they apply to DVCS.
We might decide that Laserdisc was a failure because of a poor technical implementation. That is, putting video content on an optical disc was a good idea, but the discs were too large or the quality was too low to back it up. So, does DVCS have the same problem? I think there was some legitimacy to the "good idea, bad implementation" complaint as recently as a couple of years ago. There were several DVCS tools to choose from at that time, but each had significant quirks. In the meantime, however, the quality of the DVCS experience has increased dramatically. Excellent newcomers like git and Mercurial have burst onto the scene, while quirks have steadily been disappearing from their competition.
From a technological implementation perspective, the state of DVCS implementations is strong, and getting stronger. Having used Mercurial for over a year now (well before their 1.0 release), I'm amazed by how trouble free it has been. As the repository has grown and the complexity of our source control usage has increased, Mercurial has continued to be as fast and pleasant to use as it was on day 1. Perhaps it's just our good luck, but it has also been less painful to administer than any source control system I've managed in the past.
A bigger concern when evaluating whether or not something has fad potential is marketing a product for which no market exists. This is sometimes because the product is "ahead of its time", but more often it's because the benefits were oversold. The Segway comes to mind, although I'm not entirely sure it ever had the initial adoption to justify the "fad" label.
With DVCS, this argument is a bit more challenging to evaluate, because it involves some speculation. I know from personal experience that DVCS offers unique capabilities that at least some segment of the market needs. However, even if I'd been doing this for a lifetime, mine would still be a pretty microscopic sample size. Perhaps a better way of looking at it is to understand what you lose by moving to DVCS. It's very, very difficult to imagine a company or organization that can't benefit from at least one aspect of DVCS. Any organization that allows employees to work from home, as a simple example, would see improved productivity with DVCS. But what do they lose?
Obviously the answer to this question depends on the specific scenario, but even assuming that you want to keep your centralized workflow, I don't see much downside. You end up checking in merged changesets more often with DVCS, though every modern DVCS has a way of making this happen nearly as seamlessly as in a traditional centralized system. The difference between the two is basically a safety tradeoff (that is, the ability to commit your changes before merging). And the safety tradeoff is optional in many cases, for brave users who wish to merge remote changes into uncommitted files. Of course, there are other advantages to decoupling commit and merge which are not realized in this case.
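Concretely, the safety tradeoff is the difference between these two sequences; the DVCS version records your work before any merge can mangle it:

    # Centralized (Subversion): merge into your uncommitted working copy first
    svn update                     # others' changes land on top of your in-progress edits
    svn commit                     # only now is your work recorded anywhere

    # DVCS (Mercurial): your work is committed before the merge happens
    hg commit -m "My change"       # recorded locally, recoverable no matter what
    hg pull
    hg merge                       # a bad merge can simply be discarded and redone
    hg commit -m "Merge"
    hg push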
What about cost? Most DVCS software implementations are available free of charge. Compared with best of breed commercial implementations of centralized systems, this can save quite a bit of money right off the bat. Perhaps more significantly, the DVCS model minimizes the amount of money you need to spend on your server hardware. All operations that don't involve sharing changesets are done on local clones of the repository, so the server has far less work to do. Thus, your shared repository will happily run on less-than-stellar hardware without impacting most SCM use cases. Server administration is essentially identical to a centralized server, so you won't find hidden costs there either. The only relevant cost consideration, in fact, is the cost of the initial migration.
While there are no guarantees that DVCS will break through into the mainstream, it's difficult to find many compelling arguments against it. For all its pros, there just aren't many cons. The limitations that do exist today can be eliminated with modifications to the implementations, as opposed to the idea itself. There is no doubt in my mind that centralized systems will continue to exist for some time (CVS is still quite popular, and it's been years since there was a legitimate case for starting a project on it). However, it is inevitable that centralized systems will start to gain more and more DVCS functionality.
It's easy to imagine these DVCS / centralized system hybrids eventually becoming quite popular, in fact. They might operate in full "DVCS mode" for the majority of operations, but automatically consult the server for files larger than a certain threshold, or for inspecting changesets that are several years old. Or perhaps they will be able to enforce certain aspects of policy to ease the fears of those still wary of "DVCS chaos" but who desire the productivity boost it provides.
DVCS Myth #12: DVCS is the perfect solution in all cases
Having spent a fair amount of time talking about the benefits of DVCS, it's only fair to spend some time talking about cases where it might not be the optimal solution, at least in its current forms.
If your company or organization has a single centralized repository with hundreds of thousands of files or millions of revisions, it may be infeasible to store the entire repository on each client. As we discussed in our last myth, this doesn't necessarily disqualify the DVCS concept, but current DVCS implementations do not yet have features to optimize this scenario. Not all companies or organizations keep the entirety of their source code in a single repository, but it's certainly not uncommon. That said, there is no reason that future DVCS implementations (perhaps in hybrid form) shouldn't excel in this scenario.
Even in these "massive repository" cases, it is sometimes possible to restructure the repository into a collection of smaller repositories (see OpenJDK). This allows users to work optimally with a full repository clone in the area or areas of the system that are relevant to them. A downside is that changesets cannot span repositories, so this is not always ideal. In any case, this scenario applies to a very limited number of organizations (if you don't work at a very large company, it probably doesn't apply to you). Looking forward, it would take an army of developers several years to produce that much source code, and in a few years it will no longer be prohibitive to store repositories of this size on each client. If this problem doesn't affect you today, it probably never will.
Comments
- Thursday, June 12, 2008 2:54:46 PM by Jonathan Allen: All you did was repeat the marketing material for DVCS's strong points.
- Thursday, June 12, 2008 3:09:20 PM by MB: You should make the article into a PDF; it's quite long.
- Thursday, June 12, 2008 3:30:38 PM by David Goodlad: Jonathan Allen: He did mention centralized security ("authorization"). Yes, a developer is free to change any file in his local repository, but for any of those changes to make it into the hypothetical 'central' repo it has to be pushed there. With a central repo in the DVCS model you can still implement authorization and rules at that point...
For an example of a bug tracking system working nicely with git (another DVCS), check out Lighthouse (http://www.lighthouseapp.com) and Github (http://www.github.com). Github acts as a kind of 'central hub' for my projects, and I have it automatically tell Lighthouse about any changesets that I push to it. I can format my commit messages to reference specific bugs in Lighthouse, and it all works quite nicely.
- Thursday, June 12, 2008 3:49:37 PM by Root: What the hell is DVCS? ALWAYS-ALWAYS give a definition for your acronyms at least once at the head of an article. I knew what it meant, but many who might stumble across this article through one of the MANY social bookmark sites won't. Who knows, you might actually inspire someone rather than keep them guessing what the hell you're talking about.
- Thursday, June 12, 2008 4:37:18 PM by charles: Great content, thanks!
- Thursday, June 12, 2008 4:43:55 PM by Yann: Great article! I've managed to sell the day-job (a smallish startup company) on the use of Git. It's working out very well for us, despite over half of the workstations being Windows (we use Cygwin-git atm).
- Thursday, June 12, 2008 5:40:59 PM by Troy: I've never seen issues with a DVCS and centralized security. Every implementation I've seen allows for access rules and controls - it's just common sense to have access controls.
Control freaks may not like that an individual developer can check in a branch on his own workstation - centralized repositories have led to the notion that a commit is somehow sacred, requiring a lead developer to sign off on the commit, etc. Many tyrannical developers don't like the idea of anybody being able to do a "commit" without their express permission.
This actually hurts productivity. What a DVCS does is allow a developer to commit early & often, providing more checkpoints in history to roll back to. It's a security blanket for an individual developer to know his code is "checked in" somewhere, and then those changes can be reviewed and merged with the master repository with as stringent security controls as any other VCS.
- Thursday, June 12, 2008 5:44:35 PM by heyroot: @#4 check the first line
- Friday, June 13, 2008 2:45:17 AM by Greg M: We keep a huge history of art assets, large undiffable files - every minor tweak is stored as a separate copy of that large file. Although we desire some of its other advantages, DVCS can only work for us if artists can regularly and easily purge (to a certain age) their local history (since it's safely committed upstream) for selected files or directories. And of course programmers will only want the very latest version of each art asset. I'd be very interested to hear what facilities are out there in the various DVCS implementations.
- Friday, June 13, 2008 8:39:24 PM by Anonymous: All of the "pros" I've heard about DVCS seem to point back to "working remotely" - I can't fit this into the way I work, because every company I've worked for in a decade of software development and consulting gigs has had very strict requirements that the source code to their apps never be taken off authorized company resources - I couldn't work on the code on the bus or at the coffee shop even if I were so inclined. If your business requires a centralized development model, and the source code is never allowed off-site, is there value in using a DVCS? If so, what is it?
- Saturday, June 14, 2008 1:46:33 PM by David Terrell: Just comparing the DVCSes I've used to svn and CVS, they do merges and branches so much better it's not even funny. SVN, in particular, is terrible at merging between branches -- you completely lose original authorship and metadata.
- Thursday, June 26, 2008 4:55:31 AM by james lorenzen: I would love to switch straight from svn to a DVCS such as Mercurial, though coworkers might disagree. I liked your myth buster #1. I do have one question though: if you treat a DVCS like a centralized system, who is responsible for committing changes into the central server? The whole team? On large teams, do you need dedicated people responsible for submodules who do nothing but commit patches?
- Thursday, June 26, 2008 6:01:13 PM by Ryan: Re: the person who said the "pros" only amount to "working remotely" - that's really not it at all. Like the poster after you said, the real win for me is decent merge support. I've had too many years of wasted time with CVS and SVN doing things that literally take seconds in Mercurial. I'm so glad we switched to Mercurial here at work 2 years ago -- and for the record, almost nobody is using it for remote development. 99% of work happens on site.
- Saturday, June 28, 2008 9:50:01 PM by Jakub Narebski: @Jonathan Allen: there are distributed bugtrackers (Ditz, TicGit, Bugs Everywhere), and there is support for distributed version control systems in traditional bugtrackers / issue trackers (Trac, Lighthouse, Launchpad). As to ACLs: see #4
@Greg M: Git (a DVCS, used for the Linux kernel) has good support for binary files; the problem, of course, is merging. As to "purging history" (to limit the size of 'current work' repositories): artists can re-clone the main repository using a so-called "shallow clone", i.e. clone only part of the history (this has some limitations, which could perhaps be lifted if there is "an itch"). Alternatively, they can share some parts of storage using the alternates mechanism, if it is possible to have a shared network filesystem folder (it only needs to be read-only for all developers). Dana How worked hard on better support for large binary files in Git: search the git mailing list archives and read her posts.
Ad restricting access: while it is possible with distributed version control systems as well, e.g. using the ACL extension/plugin in Mercurial, or the update-paranoid example hook or the Gitosis tool for Git, Karl Fogel wrote in "Producing Open Source Software" (producingoss) that he found restricting access by tool (technically) rather than by convention and culture (socially) to be detrimental; this might be true for non-OSS development as well.
Ad fad: at least one company, BitMover, Inc., has made a proprietary distributed version control system, BitKeeper, and it exists to this day (or at least seems to). BTW, using BitKeeper for the Linux kernel, and later revoking the free license, is what spurred the creation of Git and Mercurial (and, at least indirectly, improvements in Bazaar-NG).
@David Terrell: the upcoming Subversion 1.5 will have better support for merging. It will still be a centralized SCM...
- Sunday, August 31, 2008 8:04:02 PM by G: Good article.
But to me, none of this matters to the point of argument; in 10 years the pendulum will swing back. Or be broken altogether.
Remember when PCs meant having applications and computing power all to yourself, instead of a dumb terminal? (Well, probably not the dude with *TEN WHOLE YEARS* of "programming gigs"). Guess what? Now it's all about having a thin client presentation layer with centralized application and processing allowing ease of maintenance... i.e., a dumb terminal. Sounds a bit like a mainframe?
As for working on the beach...when you hit your 40's you might realize there's more to life than work, and that blogging about work is NOT a hobby.
Then again, what am I doing but elliptically arguing about it at 42? :-P
Peace.
- Wednesday, December 26, 2012 10:53:11 PM by Quinton: One thing concerning DVCSs that I have to ask is: why choose Mercurial over Git?