Desktop Team Triage Policy

Matěj Cepl, 2007-10-18

The purpose of this document is to stabilize the current policy for handling bugs on the triaging side of the pipeline. I won't deal here with MODIFIED, ON_QA, and similar bug states (if needed, this document can certainly be extended to cover the whole pipeline), so basically the moment a bug is marked as ASSIGNED, it falls out of the scope of this document.

The structure of this document follows a bug from the moment it enters our bugzilla until either it is ASSIGNED to a particular developer or it is CLOSED for some reason other than being resolved.

NEW

A reporter files a new bug against a particular component, and the bug is automatically assigned to the default owner of that component.

Practically, a bug in this state has not been triaged yet, and the people responsible for triaging bugs should deal with it. In the ideal state (i.e., when there are enough triagers available), developers shouldn't touch such bugs at all. Of course, this doesn't mean that developers cannot assign a bug to themselves and take it out of the dirty hands of bug triagers -- the whole point of bug triage is to make bugs digestible by developers, and when a developer decides that he knows about a bug and wants to take it, there is no reason in the world to stop him from doing so. The only thing to be emphasized is that when a developer takes over a bug, he should switch the status of the bug to ASSIGNED, and then he is on his own.

Because bug 234261 has been fixed (yay!), we can finally say that a bug stays NEW as long as it has not been triaged, and only after triage is it (manually) switched to ASSIGNED.

Groups

Bugs against the components maintained by the desktop team are grouped into these super-components (except for Gnome bugs -- I need the maintainers' input on how they would like to group their components):

XGL bugs
components with the name matching this regular expression:

xorg|X11|compiz|chkfontpath|imake|libdmx|libdrm|libfontenc|libFS|libICE|libSM\
|libwnck|libxkbfile|mesa|pyxf86config|system-config-display|xkeyboard-config\
|xrestop|xsri

Gecko & co. bugs
components with the name matching this regular expression:

devhelp|epiphany.*|fedora-bookmarks|firefox|galeon|gecko-sharp2|htmlview|\
mozilla|seamonkey|thunderbird|yelp

Evolution bugs
components with the name matching this regular expression:

evolution.*|gnome-pilot.*|gtkhtml[23]|libsoup

OpenOffice bugs
components with the name matching this regular expression:

^(agg|fribidi|hunspell.*|icu|libgsf|libwmf|libwpd|openoffice.org|planner)

Needless to say, these super-components are not written in stone; they are just examples (and records of the current practice) and could (and should) be changed any time developers feel like changing them.
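To illustrate how the expressions above map a component name to a super-component, here is a minimal sketch in Python. It is not part of any existing tooling. The function name super_component and the short group labels are made up for this example, and the sketch assumes that the expressions are meant to match anywhere in the component name unless they are anchored with ^ (as the OpenOffice one is). The continuation lines of the expressions are joined.

```python
import re

# Patterns copied from the lists above, with the line continuations joined.
# The group labels are illustrative only. Groups are tried in this order,
# which matters only if a name could match more than one expression.
SUPER_COMPONENTS = [
    ("XGL",
     r"xorg|X11|compiz|chkfontpath|imake|libdmx|libdrm|libfontenc|libFS"
     r"|libICE|libSM|libwnck|libxkbfile|mesa|pyxf86config"
     r"|system-config-display|xkeyboard-config|xrestop|xsri"),
    ("Gecko & co.",
     r"devhelp|epiphany.*|fedora-bookmarks|firefox|galeon|gecko-sharp2"
     r"|htmlview|mozilla|seamonkey|thunderbird|yelp"),
    ("Evolution",
     r"evolution.*|gnome-pilot.*|gtkhtml[23]|libsoup"),
    ("OpenOffice",
     r"^(agg|fribidi|hunspell.*|icu|libgsf|libwmf|libwpd"
     r"|openoffice.org|planner)"),
]


def super_component(component):
    """Return the label of the first group whose pattern matches, or None."""
    for group, pattern in SUPER_COMPONENTS:
        # Assumption: unanchored search, so "xorg" also matches
        # names such as "xorg-x11-drv-ati".
        if re.search(pattern, component):
            return group
    return None
```

If the expressions were instead meant to match the whole component name, re.fullmatch would be the call to use, and several of the patterns would need a trailing .* added.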

Virtual owners

Related to these super-components is the use of virtual owners. The idea is that instead of using real email lists (e.g., xgl-maint@redhat.com), we could assign bugs from some group of components to a virtual owner (e.g., the virtual email address xgl-owner@redhat.com), and anybody interested in the particular components would use the "Users to watch" field in the User preferences in bugzilla to watch all emails sent to the particular group. It is not a big deal to create new virtual users, and watching virtual owners is easier than fiddling with email lists (the former can be done by each user, the latter has to be done by bugzilla administrators).

Prior to bug 234261 being fixed, the method of using email addresses like xgl-maint@redhat.com (which I won't describe here in detail) was required because we needed a way to easily determine whether an X or OpenGL bug had been triaged or not. The policy was that bugs assigned to xgl-maint were either not yet triaged or in the process of being triaged, and bugs assigned to developers had been fully triaged. Now that bug 234261 has been fixed, the holding area email lists are redundant.

The *-maint@redhat.com addresses are currently used differently by the Xorg group and by caillon with Martin Stránský. In the Xorg group, xgl-maint@redhat.com is used as a placeholder for bugs which have not yet been triaged and assigned to a particular developer; when such bugs are switched to the ASSIGNED state, they are always assigned to one human being (possibly with some other people on the Cc: list). By contrast, I was asked by caillon not to change the assignee of ASSIGNED (bugzilla terminology is confusing) bugs from gecko-maint@redhat.com, because he and Martin Stránský will divide the work between them later. Of course, there is nothing wrong with either approach as long as it works for developers, whom the whole bug triaging and related stuff should serve in the first place.

FIXME: I would like to also expand this section with another paragraph at the end that explains how we might use virtual owners. You give some hints that we might use them as super-components and also as more fine grained components (e.g., xorg-video-drivers-owner). I would like to see these expanded into a full strawman proposal. For example, you could suggest that we create a hierarchy of virtual owners that would allow people to monitor from the highest level super-components all the way down to the individual packages. The idea here is to get people to start thinking about _how_ we will use virtual owners. It will probably not be the final way we implement the strategy, but at least it will be a starting point to help people start thinking in the right direction.

However, this could be an inspiration for other people. I am thinking, for example, about xorg-video-drivers-owner@redhat.com shared by (at least) ajax and airlied. [FIXME: 2B decided by developers]

Possible transitions from NEW state

Dark magic of NEEDINFOing

Although I claimed that NEEDINFO is not a valid transition state, that by no means implies that NEEDINFO is unimportant or that it shouldn't be used. On the contrary, for the bug triager it is the most important state that a bug under her control can be in.

The main reason why I don't want to talk about NEEDINFO as another transition state is to emphasize that a bug in the NEEDINFO state doesn't leave the bug triagers' control -- it should be kept under control, and if it is not alive enough, it should be closed (more about that later).

Putting a bug into NEEDINFO gets it into something I would call The Iron Grip of NEEDINFO. The point is that one of the bug triager's regular tasks is to run daily the "old NEEDINFO" query, which finds all bugs in the NEEDINFO status in the (currently) Xorg, Gecko, and Evolution super-components which have not been changed in the last 30 days. The triager's duty is then to do something about each of these bugs:

The point is obviously that something MUST change in the bug, so that the bug won't show up in the same query the next day. Of course, if the reporter makes a reasonable case for not being able to answer immediately, the bug triager should be free to allow him more time, but always in one-month installments, pinging him at least once a month. By keeping one month as a rigorous schedule for looking at every NEEDINFO bug, we can be sure that no bug slips out of control.
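The selection rule behind the old NEEDINFO query can be sketched as follows. This is only an illustration in Python over an in-memory list. It does not use the real bugzilla query interface, and the Bug record, its field names, the group labels, and the function old_needinfo are all hypothetical. It also assumes that "not changed in the last 30 days" means the last change is strictly older than 30 days.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Bug:
    """Hypothetical minimal bug record; not the real bugzilla schema."""
    bug_id: int
    status: str
    super_component: str
    last_changed: date


# Super-components currently covered by the query (labels are illustrative).
WATCHED_GROUPS = {"Xorg", "Gecko", "Evolution"}


def old_needinfo(bugs, today, max_idle_days=30):
    """Return NEEDINFO bugs in watched groups idle for more than max_idle_days."""
    cutoff = today - timedelta(days=max_idle_days)
    return [
        bug for bug in bugs
        if bug.status == "NEEDINFO"
        and bug.super_component in WATCHED_GROUPS
        and bug.last_changed < cutoff  # last change is older than the cutoff
    ]
```

Because any change to a bug (a comment, a ping, a status change) resets last_changed, a bug that the triager has dealt with today drops out of the result until another 30 days pass.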

The question is how many times a bug can show up in the query (i.e., how many months it can be saved from being closed). The answer is very vague. Usually, the bug gets at least one comment which explicitly says that if an answer is not provided within a month, the bug will be closed as CLOSED/INSUFFICIENT_DATA, but the judgement of when the bowl breaks is not rigid. If there are clear signs that the bug is useless and the answer won't come or will be useless (e.g., a year-old bug without a comment in the last ten months, in a component which was drastically changed in the last year), then probably not even a reminder needs to be sent. However, bugs in a RHEL-like distribution with grave impact may get multiple reminders, and direct email/IRC can be tried, before patience runs out.

This iron grip of NEEDINFO is so useful in eliminating inconvenient bugs that the basic strategy for dealing with a bug that is problematic for any reason is to convert it into the NEEDINFO state, and then it will be dealt with automatically. The flip side of this is that every developer should be aware that a bug in the NEEDINFO state is on a slippery slope towards being closed within a limited amount of time. (Of course, developers' decisions are sacred, so if needed, a comment such as "Please, don't close ever." could be added to avoid closing.)

A controversial issue is closing RHEL-like bugs for inactivity. Certainly the general rule holds that RHEL bugs should never be closed due to inactivity. However, there are many bugs in bugzilla assigned to the RHEL product which are not really RHEL bugs. Many of them were reported by Red Hat employees while testing Beta versions of RHEL. Of course, if a bug hits our internal reporter, then it will probably hit our customers as well, so such bugs should get a high level of scrutiny. Other members of this group are bugs which are actually CentOS bugs (although for the sake of both distributions, reporters should be asked to file a bug in the CentOS bugzilla).

Nevertheless, I believe that even such a bug doesn't have to be treated with the same level of rigidity as a bug which comes through IssueTracker. If the internal beta testers are unresponsive to NEEDINFO, and the best effort to communicate with them has been made (that includes emails and pinging them on IRC; but remember that Red Hat is not only the engineering team, and not everybody with @redhat.com in her email address uses IRC), then the bug should be closed without any mercy. ([FIXME: 2B discussed; Kevin doesn't agree with this.])

Related to that, but much less important, is the question of whether the reporter of a bug against the RHEL product has a valid entitlement to official Red Hat support, and so whether the bug should be filed in Issue Tracker instead. The standard message for redirecting the reporter to IT is:

For official Red Hat Enterprise Linux support, please log into the Red Hat support website at http://www.redhat.com/support and file a support ticket, or alternatively contact Red Hat Global Support Services at 1-888-RED-HAT1 to speak directly with a support associate and escalate an issue.

The point of such a message is that although certainly nobody will remove a RHEL bug from bugzilla only because the reporter hasn't filed it in IssueTracker, such a bug will get lower priority and is susceptible to WONTFIX or UPSTREAMing. Of course, it should be noted (even to the reporters themselves) that their bug won't be closed only because they use CentOS, but it will be treated with the same priority as Fedora bugs.

One important aspect of NEEDINFO is that it is easy to create standardized responses. There are many standard comments for switching a bug to the NEEDINFO state (the original versions of many of these comments come from X/GL Team Bug Policies). This is the list:

Bugs retention policy

(so far this policy applies only to Xorg, Evolution, and Gecko & co. bugs)

Although we may like to keep bugs open until they are solved, Red Hat operates with limited resources (which is especially true of the desktop team), and so we have to limit the number of bugs which are currently open (and where we implicitly promise that some development will be done). Therefore old bugs have to be moved out of the way.