
Distributed bug tracking

By Jonathan Corbet
May 14, 2008
It is fair to say that distributed source code management systems are taking over the world. There are plenty of centralized systems still in use, but it is a rare project which would choose to adopt a centralized SCM in 2008. Developers have gotten too used to the idea that they can carry the entire history of their project on their laptop, make their changes, and merge with others at their leisure.

But, while any developer can now commit changes to a project while strapped into a seat in a tin can flying over the Pacific Ocean, that developer generally cannot simultaneously work with the project's bug database. Committing changes and making bug tracker changes are activities which often go together, but bug tracking systems remain strongly in the centralized mode. Our ocean-hopping developer can commit a dozen fixes, but updating the related bug entries must wait until the plane has landed and network connectivity has been found.

There are a number of projects out there which are trying to change this situation through the creation of distributed bug tracking systems. These developments are all in a relatively early state, but their potential - and limitations - can be seen.

One of the leading projects in this area is Bugs Everywhere, which has recently moved to a new home with Chris Ball as its new maintainer. Bugs Everywhere, like the other systems investigated by your editor, tries to work with an underlying distributed source code management system to manage the creation and tracking of bug entries. In particular, Bugs Everywhere creates a new directory (called .be) in the top level of the project's directory. Bugs are stored as directories full of text files within that directory, and the whole collection is managed with the underlying SCM.

The advantages to an approach like this are clear. The bug database can now be downloaded along with the project's code itself. It can be branched along with the code; if a particular branch contains a fix for a bug, it can also contain the updated bug tracker entry. That, in turn, ensures that the current bug tracking information will be merged upstream at exactly the same time as the fix itself. Contemporary projects are characterized by large numbers of repositories and branches, each of which can contain a different set of bugs and fixes; distributing the bug database into these repositories can only help to keep the code and its bug information consistent everywhere.

There are also some disadvantages to this scheme, at least in its current form. Changes to bug entries don't become real until they are committed into the SCM. If a bug is fixed, committing the fix and the bug tracker update at the same time makes sense; in cases where one is trying to add comments to a bug as part of an ongoing conversation, the required commit is just more work to do. The fact that, in git at least, one must explicitly add any new files created by the bug tracker (which have names like 12968ab9-5344-4f08-9985-ef31153e504f/comments/97f56c43-4cf2-4569-9ef4-3e8f2d9eb1fe/body) does not help the situation.

Beyond that, tracking bugs this way creates two independent sets of metadata - the bug information itself, and whatever the developer added when committing changes. There is currently no way of tying those two metadata streams together. Then, there is the issue of merging. Bugs Everywhere appears to reflect some thought about this problem; most changes involve the creation of new, (seemingly) randomly-named files which will not create conflicts at merge time. It did not take long, however, for your editor to prove that changing the severity of a bug in two branches and merging the result creates a conflict which can only be resolved by hand-editing the bug tracker's files. Said files are plain text, but that is less comforting than one might think.
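The difference between the two cases can be shown with a toy model. What follows is a minimal sketch, not Bugs Everywhere's code and not git's merge algorithm: each branch is treated as a hypothetical {path: content} snapshot, and the file names used (a per-bug "values" file, comments stored under "comments/<uuid>/body") are assumptions loosely modeled on the path quoted above.

```python
import uuid


def merge_trees(base, ours, theirs):
    """Naive three-way merge over {path: content} snapshots (illustration only)."""
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:          # both sides agree
            result = o
        elif o == b:        # only "theirs" changed this path
            result = t
        elif t == b:        # only "ours" changed this path
            result = o
        else:               # both sides changed it differently
            conflicts.append(path)
            continue
        if result is not None:
            merged[path] = result
    return merged, conflicts


def comment_path(bug):
    # Hypothetical layout: every comment gets its own randomly named file.
    return f"{bug}/comments/{uuid.uuid4()}/body"


base = {"bug-1/values": "severity: minor\n"}

ours = dict(base)
ours["bug-1/values"] = "severity: serious\n"
ours[comment_path("bug-1")] = "Seen on my laptop too.\n"

theirs = dict(base)
theirs["bug-1/values"] = "severity: critical\n"
theirs[comment_path("bug-1")] = "Crashes on startup here.\n"

merged, conflicts = merge_trees(base, ours, theirs)
print("merged cleanly:", len(merged), "files")
print("conflicts:", conflicts)
```

In this model the two comments land in distinct files and merge without complaint, while the shared severity file, edited on both sides, is the one path left for a human to sort out.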

All of this can make distributed bug tracking look like a source of more work for developers, which is not the path to world domination. What is needed, it seems, is a combination of more advanced tools and better integration with the underlying SCM. Bugs Everywhere, by trying to work with any SCM, risks not being easily usable with any of them.

A project which is trying for closer integration is ticgit, which, as one might expect, is based on git. Ticgit takes a different approach, in that there are no files added to the project's source tree, at least not directly; instead, ticgit adds a new branch to the SCM and stores the bug information there. That allows the bug database to travel with the source (as long as one is careful to push or pull the ticgit branch!) while keeping the associated files out of the way. Ticgit operations work on the git object database directly, so there is no need for separate commit operations. On the other hand, this approach loses the ability to have a separate view of the bug database in each branch; the connection between bug fixes and bug tracker changes has been made weaker. This is something which can be fixed, and it would appear (from comments in the source) that dealing with branches is on the author's agenda.

Ticgit clearly has potential, but even closer integration would be worthwhile. Wouldn't it be nice if a git commit command would also, in a single operation, update the associated entry in the bug database? Interested developers could view a commit which is alleged to fix a bug without the need for anybody to copy commit IDs back and forth. Reverting a bugfix commit could automatically reopen the bug. And so on. In the long run, it is hard to see how a truly integrated, distributed bug tracker can be implemented independently of the source code management system.
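As a rough illustration of what such integration might look like, here is a minimal sketch. Everything in it is an assumption made for the example: the "Fixes-bug:" line in the commit message, the in-memory bug dictionary, and the function names are hypothetical, and none of this is how ticgit or git actually behaves.

```python
import re

# Hypothetical convention: a commit message line naming the bug it fixes.
FIXES = re.compile(r"^Fixes-bug:\s*(\S+)\s*$", re.MULTILINE)


def apply_commit(bugs, commit_id, message):
    """Mark every bug named in the commit message as fixed by this commit."""
    for bug_id in FIXES.findall(message):
        bugs[bug_id] = {"status": "fixed", "fixed_by": commit_id}


def apply_revert(bugs, reverted_commit_id):
    """Reopen any bug whose recorded fix was the commit being reverted."""
    for bug in bugs.values():
        if bug.get("fixed_by") == reverted_commit_id:
            bug["status"] = "open"
            bug["fixed_by"] = None


bugs = {"12968ab9": {"status": "open", "fixed_by": None}}

apply_commit(bugs, "c0ffee1", "Avoid crash on empty input\n\nFixes-bug: 12968ab9\n")
after_commit = dict(bugs["12968ab9"])

apply_revert(bugs, "c0ffee1")
after_revert = dict(bugs["12968ab9"])

print(after_commit)
print(after_revert)
```

The point of the sketch is only that the commit ID and the bug entry are tied together by the tool, so nobody has to copy identifiers back and forth by hand.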

There are some other development projects in this area, including:

  • Scmbug is a relatively advanced project which aims "to solve the integration problem once and for all." It is not truly a distributed bug tracker, though; it depends on hooks into the SCM which talk to a central server. Regardless, this project has done a significant amount of thinking about how bug trackers and source code management systems should work together.

  • DisTract is a distributed bug tracker which works through a web interface. To that end, it uses a bunch of Firefox-specific JavaScript code to run local programs, written in Haskell, which manipulate bug entries stored in a Monotone repository. Your editor confesses that he did not pull together all of the pieces needed to make this tool work.

  • DITrack is a set of Python scripts for manipulating bug information within a Subversion repository. It is meant to be distributed (and, eventually, "backend-agnostic"), but its use of Subversion limits how distributed it can be for now.

  • Ditz is a set of Ruby scripts for manipulating bug information within a source code management system; it has no knowledge of the SCM itself.

As can be seen, there is no shortage of work being done in this area, though few of these projects have achieved a high level of usability. A few of these projects have the potential to change the way development is done, though, once various integration and user interface issues are addressed.

There is one remaining problem, though, which has not been touched upon yet. A bug tracker serves as a sort of to-do list for developers, but there is more to it than that. It is also a focal point for a conversation between developers and users. Most users are unlikely to be impressed by a message like "set up a git repository and run these commands to file or comment on a bug." There is, in other words, value in a central system with a web interface which makes the issue tracking system accessible to a wider community. Any distributed bug tracking system which does not facilitate this wider conversation will, in the end, not be successful. Creating a distributed tracker which also works well for users could be the biggest challenge of them all.



Distributed bug tracking

Posted May 14, 2008 16:44 UTC (Wed) by iabervon (subscriber, #722) [Link]

I think that nobody's worked out the "what this is all really about" model for bug tracking.
It seems to me that there should only be one set of bugs, rather than one per branch, because
the same bug can affect multiple branches, and be fixed by corresponding patches to those
branches. And it doesn't matter just whether a bug has been fixed; it matters what commit(s)
fix it, and whether the version you're looking at contains any of those commits. It's only a
relatively specialized case in which you want to know the bugs that nobody has found a fix
for.

I suspect that the website for access to the bug tracker should be relatively easy in the
distributed system; it's just another client in a system with a lot of clients. (For that
matter, I still want to see a website that lets you edit documentation in a project
wiki-style, generating a branch that you can look at before and after making commits and the
maintainer can pull to accept your changes.) On the other hand, it could be nice for users to
be able to report bugs when they don't have network access, either, especially for bugs that
cause or require a lack of connectivity, so an optional offline client is an advantage there,
too. And it would be really nice to be able to do one thing and see the status of all of the
bugs you care about, across different projects from different organizations.

Distributed bug tracking

Posted May 14, 2008 17:06 UTC (Wed) by tzafrir (subscriber, #11501) [Link]

I added a comment while offline and you added a comment while offline.

When does the potential conflict between our added messages get merged?

Distributed bug tracking

Posted May 14, 2008 18:20 UTC (Wed) by iabervon (subscriber, #722) [Link]

Different people adding different comments should be trivial to merge; the system just gets
both comments in some arbitrary order, just like if two people send email to a mailing list at
the same time. The interesting case is when people try to make incompatible changes to the
same bug, but I sort of feel like there aren't really any incompatible changes if the changes
are represented in enough detail. If the model is that people make statements about bugs in an
unordered (except when they reference each other), append-only fashion, and the mutable fields
of a bug are generated from the currently-known set of these, rather than being stored as
particular values, then there shouldn't be any possibility of conflicts to resolve.

Merging

Posted May 14, 2008 18:38 UTC (Wed) by corbet (editor, #1) [Link]

In fact, the merging of comments works just fine, at least in Bugs Everywhere. Each comment is its own file, so there's no conflicts; they all just mix in together.

The problem is the "incompatible" changes - global stuff like severity, assigned-to, title, resolution status, etc. I don't think there will be any way to automatically merge changes to such parameters. What would be nice, though, is merging code which understands things at the right level rather than just forcing somebody to run an editor on its internal files to clean things up.

Merging

Posted May 14, 2008 18:51 UTC (Wed) by tzafrir (subscriber, #11501) [Link]

When you're commenting on an open bug and someone else closes it, there's no
technical conflict. But does it warrant a reopen?

Merging

Posted May 14, 2008 18:59 UTC (Wed) by iabervon (subscriber, #722) [Link]

I still think the right thing is to not do global changes exactly, but give comments the
ability to attempt to modify them, and have the effective status come from combining them,
where a comment that references another comment can override its changes, and comments that
don't reference each other have a cumulative effect. This then requires that, for each field,
there's a way of combining a bunch of values: take the maximum severity and priority, all of
the assignees, all of the titles, all of the resolutions, and so forth.

That is, somebody reports a bug. Somebody else posts a request for more info, with a flag to
close the bug as "needs info". The first person replies to the post with the info, with a flag
to remove the "needs info" resolution. But first, a third person commits a fix, with a
resolution "fixed in commit X". Since the earlier reply doesn't reference this comment, the
reopen effect doesn't apply to it, so it's left as "fixed in commit X"; then the original
person can confirm it or deny it in a reply to the "fixed" comment.

My contention is that all of the global stuff can be handled like this, and likely handled
better than actually having central values anyway.

Merging

Posted May 14, 2008 19:35 UTC (Wed) by elanthis (guest, #6227) [Link]

"take the maximum severity and priority, all of the assignees, all of the titles, all of the
resolutions, and so forth."

Maximum is not correct in most cases.  How would I lower a priority?

In general, though, I like your idea.  Each update includes a set of changes (with or without
a comment) to an earlier revision of the bug.  If any of those fields conflict with any
revisions between the target and the new one, those over-ride the new one.  Pretty
straightforward.

Merging

Posted May 14, 2008 20:22 UTC (Wed) by iabervon (subscriber, #722) [Link]

You'd lower the priority by replying to the comment(s) that raised it with your reply changing
the priority. That would remove them from the set that the maximum is taken over. The idea is
that a bug has a high priority for some reason, and the reason is given in the comment that
raises it. In order for the bug to go back to being low priority, you need to refute (or at
least have seen) the reason it was high. If it was also high priority for some other reason
you haven't looked at yet, it's still high priority until you do.

So my algorithm is: take all of the comments (including the original report) that affect
(field) that don't have any descendants that affect (field); these are the relevant comments.
Of the relevant comments, take the least upper bound of their values (least upper bound being
maximum for numbers, and union for sets). This is the effective value of the field.
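
A minimal sketch of that algorithm, for illustration only. The Comment class, its field names, and the sample thread are assumptions made up for the example; no existing tracker is claimed to store comments this way.

```python
from dataclasses import dataclass, field


@dataclass
class Comment:
    id: str
    parents: tuple = ()                           # ids of comments this one replies to
    changes: dict = field(default_factory=dict)   # field name -> proposed value


def descendants(comments, cid):
    """Return the ids of all direct and indirect replies to comment `cid`."""
    children = {}
    for c in comments:
        for p in c.parents:
            children.setdefault(p, []).append(c.id)
    seen, stack = set(), [cid]
    while stack:
        for child in children.get(stack.pop(), []):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def effective(comments, name):
    """Least upper bound over comments whose change to `name` nobody replied over."""
    by_id = {c.id: c for c in comments}
    values = [
        c.changes[name]
        for c in comments
        if name in c.changes
        and not any(name in by_id[d].changes for d in descendants(comments, c.id))
    ]
    if not values:
        return None
    if all(isinstance(v, (set, frozenset)) for v in values):
        return set().union(*values)   # union for set-valued fields
    return max(values)                # maximum for numeric fields


report = Comment("report", changes={"priority": 2})
a = Comment("a", parents=("report",), changes={"priority": 5, "assignees": {"alice"}})
b = Comment("b", parents=("report",), changes={"priority": 3, "assignees": {"bob"}})
thread = [report, a, b]

# A reply to "a" that lowers the priority removes "a" from the relevant set.
lower = Comment("c", parents=("a",), changes={"priority": 1})

print(effective(thread, "priority"))
print(effective(thread + [lower], "priority"))
print(effective(thread, "assignees"))
```

In the sample thread the priority is 5 until somebody answers comment "a"; after the reply, the remaining unanswered raise in "b" keeps it at 3 rather than dropping it to 1.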

Merging

Posted May 15, 2008 7:28 UTC (Thu) by rvfh (guest, #31018) [Link]

Or... only one person can change the priority or whatever of a given package at a given time:
the triager. Then there's no conflict to solve to start with.

If you need more people changing, ok, but they can still talk before, and agree on who is in
charge (does not need to be the person solving the bug).

Why should computer programs solve _all_ our problems when we can just use a bit of
communication?

So my algorithm is:
- bug reported (no priority or whatever)
- triager takes responsibility over it (decided in a 'live' session; if you are in a plane,
then someone else will do it, tough! At least until you land and talk to them)
- people only add comments, maybe saying: please push priority to 'severe'

Obviously, only the core maintainers' requests for priority changes are taken into account :)
But they don't do it themselves anymore. The central server is replaced in this function by a
central person (who can delegate, of course).

I obviously consider the ability of posting comments/patches much more important to distribute
than that of changing the priority of the bug :)

Merging

Posted May 15, 2008 12:07 UTC (Thu) by tzafrir (subscriber, #11501) [Link]

If you can't distribute the work over more than one person, what's the point in a 
distributed system in the first place? Only the copy of the triager counts. The other 
copies are unreliable.

Merging

Posted May 15, 2008 15:04 UTC (Thu) by rvfh (guest, #31018) [Link]

Only the 'metadata' is centralised, and again, the triager can delegate (which a centralised
server usually doesn't do ;) so you still have a better system.

IMHO, only the comments/patches really need to live with the code.

Merging

Posted May 15, 2008 21:46 UTC (Thu) by tzafrir (subscriber, #11501) [Link]

So how is it better than today? A centralized place for meta-data: the bug tracking system.

We have the metadata centralized. There are many places where you have hacks to refer between
the source and the bugs database (e.g. specially-formatted commit messages that close bugs,
trac-style formatting of commit messages).

Merging

Posted May 16, 2008 7:34 UTC (Fri) by rvfh (guest, #31018) [Link]

I got your point, but do you get mine?

What was being discussed was the merging of metadata such as priority and all the other stuff,
which could be problematic on a decentralised system (comments/patches are the easy part), and
I was suggesting to _not_ automatise and risk making mistakes (e.g. priority inversion), but
rather let someone (human being) decide and solve the conflict: the triager.

A server does not delegate, a triager can, so this is a marginal and dynamic centralisation,
rather than a full and static one, like e.g. Bugzilla.

Merging

Posted May 15, 2008 15:56 UTC (Thu) by iabervon (subscriber, #722) [Link]

For priority, I'm not particularly sure a single value even makes any sense. Surely, different
developers should see different priorities depending on whose opinions they care about. If a
bug only applies to one site, and that site has a Red Hat support contract, it's going to be a
high priority for any Red Hat employees, and for anyone who particularly wants to help Red
Hat, but a Novell engineer who's just looking for something to fix might not even have the
ability to get a candidate fix tested without going through a Red Hat contact. And if a single
bug is shared between multiple projects when it's unclear where the problem originates, each
project may have a different priority for it, depending on whether it's looking like it's in
that project or not. I think it's best to follow the git model of distribution with
centralization: the official database is only special in that lots of people happen to care
more about it. If people decide, temporarily or permanently, to care about some other
database, the transfer is seamless.

For severity, I think you really do want to have end users who hit the problem to be able to
say how bad it is for them, and revise this as they find workarounds and remove the failing
case from their workflows or are forced into the failing case more frequently. And the
severity from the point of view of people who might work on it should reflect some sort of
vote among the users.

Merging

Posted May 18, 2008 14:43 UTC (Sun) by oak (guest, #2786) [Link]

> For severity, I think you really do want to have end users who hit the 
problem to be able to say how bad it is for them

In the comment yes.  Letting them edit the fields is not going to work
though, if you want the severity to have any meaning.  Users in general 
think their problems to be blockers (for them) or at least critical and 
don't read what the severity values are supposed to mean.

Besides, for users the bug reporting should be easy, not showing them the 
extra fields makes the bug filing easier (Bugzilla may be a bit 
intimidating for first timers).

Merging

Posted May 15, 2008 13:33 UTC (Thu) by Jel (guest, #22988) [Link]

Well, one way to manage conflicts to attributes like Status would be to 
prioritise based on some combination of update time, and authority level 
of the person doing the update.  But how you manage authority/roles in a 
distributed, version-controlled bug tracker is another question.

Merging

Posted May 15, 2008 21:53 UTC (Thu) by tzafrir (subscriber, #11501) [Link]

Update time assumes you can serialize them. What if I worked in my personal branch on
something for a whole year unaware of the fact that the project has moved elsewhere?

I might have added tons of irrelevant comments and changes to bug reports.


Authority level basically assumes a centralized model.

Distributed bug tracking

Posted May 14, 2008 17:29 UTC (Wed) by joey (guest, #328) [Link]

The article touches on bug tracking systems being used for communication with users. I have
not been trying to develop a distributed BTS, but in working on a wiki backed by distributed
version control (ikiwiki), I have kind of come at the same problem from the other end, starting
with a centralised documentation and communication medium (the wiki and blog), making it
decentralised, and then also using the wiki for (rather basic) bug tracking.

If that sounds interesting, here are a few links.

http://ikiwiki.info/tips/distributed_wikis/
http://ikiwiki.info/tips/integrated_issue_tracking_with_i...


Distributed bug tracking

Posted May 14, 2008 18:01 UTC (Wed) by JoeBuck (subscriber, #2330) [Link]

One possibility for maintaining a centralized point for public reports, while still distributing the bug database for day-to-day work, is just to treat the public repository as one of a number of peers. You can still update in both directions. Users wouldn't need their own git tree; they could use a bugzilla-style web interface to interact with one particular tree.

But developers wouldn't use the same approach. If Alice has fixed a bug in her own tree, she would want to update her local bug database as well (ideally by the same checkin, so that the mod to the bug's state is an atomic transaction that goes with the mod to the code). But since the bug still exists in developer Bob's tree, the bug database is naturally in a different state. Once developer trees are merged with the public tree, the database users see would be updated.

Distributed bug tracking

Posted May 14, 2008 18:40 UTC (Wed) by mh (guest, #7058) [Link]

Most email based bug trackers, such as the Debian BTS could be considered distributed in as
much as they allow you to use them while "strapped into a seat in a tin can"

Distributed bug tracking

Posted May 14, 2008 18:45 UTC (Wed) by joey (guest, #328) [Link]

Well, not so much. You can open new bugs and reply to existing bugs, and close bugs from the
plane, but you cannot *see* bugs, unless you've performed an expensive caching operation
before getting on the plane.

And yeah, been there, wanted to do that, and had to settle for watching in-flight movies
instead.

But, I don't think that use on a plane is really the interesting use case for distributed
revision control, or BTS.

Distributed bug tracking

Posted May 14, 2008 19:04 UTC (Wed) by tzafrir (subscriber, #11501) [Link]

The Debian BTS also tracks the status of a bug in different branches.

Here is a bug that is currently open (marked as red) on two different branches:

http://bugs.debian.org/456780

This can be done in retrospect. I can mark a version (= tag) as buggy well after 
releasing it. Though there's a useful shortcut for closing a bug in a changelog of a new 
release, which is mostly an equivalent of a commit log.

There is a different use case that has not been mentioned so far:

What about this bug upstream? What about this bug in Gentoo?

Distributed bug tracking

Posted May 14, 2008 19:20 UTC (Wed) by iabervon (subscriber, #722) [Link]

One thing I'd really like to be able to do is report that using xscreensaver crashes my X
server. Does this go to Ubuntu? X? xscreensaver? ATI? DRM? Mesa? The really right thing would
be to encourage all of the above to discuss it amongst themselves, because the actual answer
may well be that it's an incompatibility between particular versions of two different
projects, and the solution is for Ubuntu to apply a patch known to one of those projects to
the version they're shipping or use a newer version of the project.

(IIRC, it turned out to be that OpenGL had issues with the vesa driver and the particular
graphics card; telling xscreensaver to just blank the screen worked, and switching to the
proprietary driver to get dual-head fixed it as a side effect)

Distributed bugs

Posted May 14, 2008 19:35 UTC (Wed) by tialaramex (subscriber, #21167) [Link]

Yes, that's actually a much more interesting problem for Free Software (well, it'd be
fascinating outside of Free Software but the issues quickly turn into a quagmire that would
drown any attempt to solve it)

Wouldn't it be great if a user could create a bug "Why doesn't my foozle work with banjo
properly?" and after a while having exhausted their own ideas they could send it to Fedora, a
Fedora developer could look at it, and after discussion with other Fedora developers, send it
to the Foozle developers, who would take one look and say "Aha, that's a Banjo problem" and
copy it over to the Banjo guys, who might say "Yeah, we fixed that in Banjo 1.4.3" and
everyone involved would see all this happening so that then the Fedora developer can add
"Banjo 1.4.3 is in testing, and you should be able to update to it in a few days", set a
key-value field and later a Fedora automated script could add "Banjo-1.4.3.rpm is now
available, it may fix your problem. Please tell us if it does."

The key thought would be that this is /one bug/ not a series of different bugs in possibly
different bug tracking systems, which are merely related. It's one guy's bug about his Foozle
not working as he expected, and it simply connects into all these different bug tracking
systems. The same theory as Git would apply, disk space is cheap, so every system can have
visibility of the bug report in full, with every previous action and they just share updates.

Other ideas while I'm scribbling: Priority fields don't belong to the bug. This isn't a P1
bug, or a low-priority bug, it's just a bug, and individuals would have their own perspective
on its priority that was independent of and needn't even be sent around with, the bug report.
Severity may or may not need to be treated the same way.

Now that's something worth building. I wonder how hard it would be to get a working prototype
and then to sell existing projects on supporting this way of working.

Distributed bugs

Posted May 15, 2008 12:59 UTC (Thu) by emk (subscriber, #1128) [Link]

Hmm. This is an excellent and interesting idea. We frequently need to shift bug reports
between projects at work: "Oops! That's not our bug. That's really a bug in $OTHER_LIBRARY."

Working out the semantics may be challenging, though. In particular, the original project
needs to keep some sort of "tracking bug" (probably just the same bug), and remember where to
pull updates from.

You'd also want tags--not in a git sense, but in del.icio.us sense ("release-1.2",
"subsystem-foo", "mystery-crasher"). And those might need to be project- or
maintainer-specific under certain circumstances.

And would there be new "resolution" values? "Waiting for upstream to fix", for example?

Distributed bugs

Posted May 17, 2008 18:47 UTC (Sat) by vonbrand (guest, #4458) [Link]

The technical problems to be solved are (comparatively) trivial (even a single bugzilla keeping track of all bugs would be enough to solve half of it or so), the hairy, so far unsolved problems are the organizational (people) problems.

Distributed bug tracking

Posted May 15, 2008 13:39 UTC (Thu) by Jel (guest, #22988) [Link]

What you're talking about is remote access and possibly offline 
preparation of bug reports.

When people say distributed in terms of version control, they're talking 
about a system that has no central repository; just lots of peers, like a 
p2p system.  In particular, they're talking about a system where each 
peer can be a completely self-sufficient branch, looking after its own
new features, bugs, etc., until it's all accepted into another (so-called 
upstream) branch and merged.

Distributed bug tracking

Posted May 23, 2008 1:53 UTC (Fri) by kevinbsmith (guest, #4778) [Link]

There seem to be (at least) two distinct clumps of benefits that might come from distributed
bug tracking: Offline work, and...something about p2p, no central authority, etc. I completely
understand the first, and it was the focus of the first few paragraphs of the article. The
second is much fuzzier for me.

I have been a proponent of DVCS for years, largely for that second reason. It can make
projects more democratic, avoiding bad gatekeeping. But it's not clear to me that bug tracking
has the same set of issues. It's pretty rare to be denied write rights to a bug database for
an open project. It's also hard for me to imagine cases where I would really want to create a
personal branch of bugs.

The other aspect that has been mixed into this conversation really has little to do with
whether a bug tracker is distributed or not: Integration. The ability to attach a commit to an
issue resolution is powerful in many ways...but could be done with CVS and any centralized bug
tracker.

Finally, at least for now, I am definitely in the camp of not wanting bug reports to be stored
inside each code branch. The bug database definitely feels like something separate, to me.

Bugs with sensitive security information

Posted May 14, 2008 21:13 UTC (Wed) by scottt (guest, #5028) [Link]

For bugs that contain sensitive security information and can not yet be disclosed the
distributed bug trackers would need a per-bug property that stops some bugs from being cloned
along with all the others.

Distributed bug tracking

Posted May 14, 2008 21:22 UTC (Wed) by aleXXX (subscriber, #2742) [Link]

Although I'm still not convinced by DVCS, the idea to have the bugs in
the same directory tree as the project sounds like a really good idea to 
me.
I mean, with regular bugtrackers like mantis, bugzilla, Trac you have to 
dig through a big database of bugs by using the right search fields.

It would be so much more productive if I could just do something like

CMake/Modules/$ ls
.....
FindQt4.cmake
... more files

# now ask for bugs:
CMake/Modules/$ buggy list FindQt4.cmake
   #1234 "It doesn't work"
   #1345 "Sometimes it fails"
   #3454 "If I do something wrong, it fails" 
CMake/Modules/$ buggy show 1234
   Title: It doesn't work
   Reporter: Karl Karlsen
   Note: If Qt4 isn't installed, FindQt4.cmake doesn't find it !
CMake/Modules/$ buggy change 1234 nochangerequired
CMake/Modules/ $ cd ..
CMake/ $ buggy list
   #1 General: "CMake works too good"
   #2 
   ...
   #1345 FindQt4.cmake: "Sometimes it fails"
   ...

Also for a huge project like KDE this would be a huge gain I could 
imagine.
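
The path-scoped listing in the session above could be sketched roughly as follows. This is purely an illustration: "buggy" is an imagined tool, and the record layout (one "path" per bug, relative to the project root) and the list_bugs name are assumptions made for the example.

```python
from pathlib import PurePosixPath

# Hypothetical bug records, each attached to a path relative to the project root.
BUGS = [
    {"id": 1, "path": ".", "title": "CMake works too good"},
    {"id": 1234, "path": "Modules/FindQt4.cmake", "title": "It doesn't work"},
    {"id": 1345, "path": "Modules/FindQt4.cmake", "title": "Sometimes it fails"},
]


def list_bugs(bugs, target):
    """Return ids of bugs attached to `target` or to anything beneath it."""
    target = PurePosixPath(target)
    hits = []
    for bug in bugs:
        path = PurePosixPath(bug["path"])
        # A bug matches if it sits exactly at the target or inside that directory.
        if path == target or target in path.parents:
            hits.append(bug["id"])
    return hits


print(list_bugs(BUGS, "Modules/FindQt4.cmake"))
print(list_bugs(BUGS, "."))
```

Asking about a single file returns only that file's bugs, while asking from the top of the tree returns everything, which is the behavior the imagined session shows.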

I also have the suspicion that this will need really tight integration 
with the VCS (which I think doesn't necessarily have to be a DVCS).

A comment about DVCS: now honestly, is "committing to a local repository" 
really worth that much ? I mean, it is still on my local harddisk. So it 
is not backed up as on a central server. Still nobody else can see the 
code, so there is no peer review yet. Still nobody else can test the 
code. Doesn't this encourage big commits later on to the official/main 
tree, which are too big to understand easily ?
I completely agree that having version control also for local files is a 
very nice thing, but IMO this is different from using a VCS for
development.

Alex

Distributed bug tracking

Posted May 14, 2008 21:53 UTC (Wed) by iabervon (subscriber, #722) [Link]

The local commit thing means that (a) you can use the version control system to checkpoint
states where you've done some good work, but it's currently broken as a whole, and you're
about to risk totally messing everything up; (b) you can commit your state before resolving
conflicts with other people's work, so you can retry resolving if you mess up in your first
attempt and lose all your work; (c) you can look at what you've committed and realize that
it's ugly, incomplete, or something like that, after you've put together exactly what you're
planning to send in but before you send it; (d) you can do a dozen related commits, each
self-contained, and as a set leading up to the last one, which adds a feature, and you can
deliver the set together; (e) before you submit a series, you can reorganize it into small,
clean commits that total the big chunk of work you've done.

My experience is that centralized systems lead to commits that each make all the changes
necessary to implement a feature, but touch a lot of different parts of the code, while
distributed systems lead to small focused commits, which are separate changes but all arrive
at once.

Distributed bug tracking

Posted May 15, 2008 3:33 UTC (Thu) by robertknight (guest, #42536) [Link]

> Still nobody else can see the 
> code, so there is no peer review yet

Linus amongst others would argue that is a major benefit of DVCS.  Sometimes a programmer will
have an idea which they want to be able to play with without telling the whole world about it
- which is what happens if you create a branch in SVN. 

Good reasons for keeping code private, at least initially:

* The idea may turn out to be impractical or stupid.  In that case you won't upset anyone
because nobody else knew about it and you won't leave 'dead-end' branches lying around in the
main repository.
* People may appreciate the ability to work on something for a while without subjecting it to
criticism until they are ready.
* There may be competition within the project or in competing projects. 
* The code might not be suitable for inclusion in the main code base - it might be a
modification of the software to meet the needs of a specific company for example.  

In addition to secrecy, there is also the case where developers want a branch to handle their
specific needs which are not useful to anyone else.


> Doesn't this encourage big commits later on to the official/main 
> tree, which are too big to understand easily ?

Not necessarily - because users can easily browse the history of the merged branch and see the
individual changes which went into the final merge.    

One big practical advantage of course is speed.  Branching, merging and viewing history can be
very slow in SVN if the server is under load.  In git/bzr etc. creating a new branch is
instant and checking it out normally takes only a few seconds.  Merging is also trivial and
again takes only a few moments in most cases and it isn't affected by what is going on at the
remote end.  Being able to get a near-instant log is also very nice.  Having to wait two
minutes for "svn log" to fetch the history of a particular file is pretty frustrating.

The Lumiera-project is thinking new

Posted May 15, 2008 6:57 UTC (Thu) by Velmont (guest, #46433) [Link]

Lumiera, the successor (in a few years, if more developers hop on board) of Cinelerra-CV, wants to be distributed. You should be able to clone the repository, and not only get the code - but also the website, all documentation and the bugs. The whole essential project inside Git.

For this to work, two extra projects have been created. First, a web interface, webgit (not released yet AFAIK), to Git, where users can edit files in Git just by using their browser. This will make small corrections and contributions really easy (and also documentation). It is mainly centered around code, but since everything about the project is supposed to be in Git - it will be easy for everyone to help a bit with what they know best.

The other is uWiki, a wiki currently backed by Git and using asciidoc files (though markup and backend are pluggable); it has been announced on freshmeat. The homepage will also be written using uWiki, so that it can easily be tracked using Git.

Lastly, the distributed bug tracking. Lumiera won't be useful for a long time, so there is some time to get that in place. Right now the wiki is being used for such things; however, once the project starts to interest people other than developers, the bug tracking problem must be solved. Nice to see others are working hard to find solutions.

The Lumiera-project is thinking new

Posted May 23, 2008 1:56 UTC (Fri) by kevinbsmith (guest, #4778) [Link]

Advocacy tip: Don't assume readers know what the heck your project does. Nor the older project
that yours is replacing. I had to do a search to reveal that Cinelerra-CV is a video editing
system. That's 2 minutes of my life that I'll never get back.

Distributed bug tracking

Posted May 15, 2008 17:27 UTC (Thu) by markshuttle (guest, #22379) [Link]

A project like Ubuntu, which wants to exchange code directly with upstreams and also with
Debian and other distros, really feels the need for some solution to this problem.

Truly distributed bug tracking (where the bug list follows the code everywhere) is very
exciting, and may be the long term solution. In the interim, you can address it by just
tracking the state of the bug in a few different places. Canonical has been funding work on
Bugzilla, Trac and other bug trackers to make it easier to talk to them programmatically, so
that we can keep Ubuntu developers up to date automatically.

We have a "centralised view of distributed bug status" in Launchpad, which helps us keep track
of the status of an issue upstream, in Debian, or in other distros. For example, check out
these bugs:

 https://bugs.launchpad.net/moblin-applets/+bug/209870
 https://bugs.launchpad.net/ubuntu/+source/nfs-utils/+bug/...
 https://bugs.launchpad.net/ubuntu/+source/tuxmath/+bug/22...
 https://bugs.launchpad.net/ubuntu/+source/linux-source-2....
 https://bugs.launchpad.net/ubuntu/+source/warsow/+bug/131582

In each case, you can see how the bug is linked to reports in other bug trackers, and then the
status is updated automatically. As a small consequence, you can subscribe to any bug report
on any bug tracker (of the supported types) via LP.

A centralised view isn't the ultimate solution, but it works for us right now and quite a few
other projects - upstreams and distributions - are using it too.

Mark

Distributed bug tracking

Posted May 15, 2008 17:41 UTC (Thu) by tzafrir (subscriber, #11501) [Link]

How common is the language between different bug trackers?

If the status on Debian changes from "critical" to "grave", what does it mean with 
Ubuntu?

If a bug is moved from libtoolchain0.3-1 to libtoolchain-dev on Debian, what does it 
mean for Fedora?

What about the various reasons to close a bug? "duplicate", "fixed" and "invalid" are 
different things.

Distributed bug tracking

Posted May 15, 2008 18:31 UTC (Thu) by markshuttle (guest, #22379) [Link]


The bug trackers are all in different languages, and we have been working with contractors who
are upstream on them to have a reasonably standardised network service API to talk to them.
Other folks will of course be able to use that too, which is great.

In Launchpad, the status in each place a bug is tracked is separate. So Debian can consider a
bug "medium" while Ubuntu can consider it "important". We try to map all the different kinds
of status in different bug trackers to something sensible in Launchpad.
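
The per-tracker status handling described above could be sketched roughly as follows, in Python. This is not Launchpad's code or API: the tracker names, the status words, the mapping tables, and the LinkedBug class are all hypothetical. It only illustrates the idea of keeping each tracker's status separate and mapping it to a shared vocabulary for display.

```python
from dataclasses import dataclass, field

# Made-up vocabularies for two imaginary trackers; real trackers use
# their own words, and a real mapping would be much larger.
STATUS_TABLES = {
    "tracker-a": {
        "unconfirmed": "open",
        "closed-fixed": "fixed",
        "closed-duplicate": "duplicate",
        "closed-invalid": "invalid",
    },
    "tracker-b": {
        "open": "open",
        "done": "fixed",
        "wontfix": "invalid",
    },
}


def normalize(tracker, remote_status):
    """Map a tracker-specific status to the shared vocabulary, or "unknown"."""
    return STATUS_TABLES.get(tracker, {}).get(remote_status, "unknown")


@dataclass
class LinkedBug:
    """One issue, with a separate raw status for each place it is tracked."""
    title: str
    remote_status: dict = field(default_factory=dict)

    def update(self, tracker, status):
        # Each tracker's status is stored independently of the others.
        self.remote_status[tracker] = status

    def view(self):
        # The centralised view shows every tracker's status, normalized.
        return {t: normalize(t, s) for t, s in self.remote_status.items()}


if __name__ == "__main__":
    bug = LinkedBug("Example issue")
    bug.update("tracker-a", "closed-fixed")
    bug.update("tracker-b", "open")
    print(bug.view())
```

Falling back to "unknown" instead of guessing keeps a status word the table has never seen from being silently misreported.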

Distributed bug tracking

Posted May 27, 2008 9:12 UTC (Tue) by kragil (guest, #34373) [Link]

Hello Mark,

I think I heard that Ubuntu wants to store most sources in BZR. Will BZR support distributed
bug tracking in the future or will it be something like Bugs Everywhere?

Cheers

Distributed bug tracking

Posted May 16, 2008 3:16 UTC (Fri) by localhost (guest, #15238) [Link]

Another relatively new project in this same space is Fossil SCM, written by D. Richard Hipp
of SQLite.

http://www.fossil-scm.org/index.html

A couple of its design goals are: "Support disconnected, distributed development" and
"Integrated bug tracking and wiki"

I have been trying it out lately, and so far it's been pretty nice.

Fossil

Posted May 23, 2008 2:25 UTC (Fri) by kevinbsmith (guest, #4778) [Link]

The design looks pretty reasonable. The "autosync" feature sounds a lot like "bind" in bzr.
The home page mentions bug tracking, but I couldn't find any details other than a technical
description of the ticket data format. It's either hidden well, or a future feature.

Too bad it's written in C. Bias on my part, but I would much rather see an app like this
written in a higher-level language, using C where necessary as a speed optimization or to
create a highly reusable library. 

The author is proud that the whole system is a single executable file with few dependencies.
That's nice, but for me, the benefits of faster development, easier maintenance, and
crash-safety are more compelling. If the author only knows C, I understand the choice. It's
still unfortunate. Might be fun to implement a clone in another language.


Bugs Everywhere and Git

Posted May 17, 2008 6:36 UTC (Sat) by bignose (subscriber, #40) [Link]

> That fact that, in git at least, one must explicitly add any new files created by the bug
> tracker [...] does not help the situation.

As of today, the development version of Bugs Everywhere has direct support for Git (adding and
removing the Bugs Everywhere files, etc.) just as for Bazaar.

Distributed bug tracking

Posted Jul 8, 2008 5:04 UTC (Tue) by joey (guest, #328) [Link]

I've been thinking about this on and off since this article was posted. I have some
interesting ideas (like using a microformat to embed distributed bug info) that I'd love to
get batted around by people, but this thread doesn't seem the place.

Assuming anyone's still following this thread... Any interest in forming a group to discuss and
work on problems of distributed bug tracking? If so, email me at joey@kitenet.net and I'll set up
a list or other forum.

Distributed bug tracking

Posted Jul 8, 2008 23:23 UTC (Tue) by joey (guest, #328) [Link]

Update: Several of us have started a project and mailing list to discuss distributed bug
tracking further. The web site is http://dist-bugs.kitenet.net/


Copyright © 2008, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds