Desktop Team Triage Policy

Matěj Cepl, 2007-10-18

The purpose of this document is to stabilize the current policy for handling bugs on the triaging side of the pipeline. I won't deal here with MODIFIED, ON_QA, and similar bug states (if needed, this document can certainly be extended to cover the whole pipeline), so the moment a bug is marked as ASSIGNED, it falls out of the scope of this document.

The structure of this document follows a bug from its entry into our bugzilla to either the moment the bug is ASSIGNED to a particular developer or the moment it is CLOSED for some reason other than being resolved.

NEW

A reporter files a new bug against a particular component, and the bug is automatically assigned to the default owner of that component.

In practice, a bug in this state has not been triaged, and the people responsible for triaging bugs should deal with it. In the ideal state (i.e., when enough triagers are available), developers shouldn't have to touch such bugs at all. Of course, this doesn't mean that developers cannot assign a bug to themselves and take it out of the dirty hands of bug triagers -- the whole point of bug triage is to make bugs digestible for developers, and when a developer decides that he knows about a bug and wants to take it, there is no reason in the world to stop him from doing so. The only thing to emphasize is that when a developer takes over a bug, he should switch its status to ASSIGNED, and then he is on his own.

Because bug 234261 has been fixed (yay!), we can finally say that a bug is NEW as long as it has not been triaged, and only then is it (manually) switched to ASSIGNED.

Groups

Bugs against the components maintained by the desktop team are grouped into these super-components (except for Gnome bugs -- I need the maintainers' input on how they would like to group their components):

XGL bugs: components with the name matching this regular expression:
xorg|X11|compiz|chkfontpath|imake|libdmx|libdrm|libfontenc|libFS|libICE|libSM|libwnck|libxkbfile|mesa|pyxf86config|system-config-display|xkeyboard-config|xrestop|xsri
Gecko & co. bugs: components with the name matching this regular expression:
devhelp|epiphany.*|fedora-bookmarks|firefox|galeon|gecko-sharp2|htmlview|mozilla|seamonkey|thunderbird|yelp
Evolution bugs: components with the name matching this regular expression:
evolution.*|gnome-pilot.*|gtkhtml[23]|libsoup
OpenOffice bugs: components with the name matching this regular expression:
^(agg|fribidi|hunspell.*|icu|libgsf|libwmf|libwpd|openoffice.org|planner)
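
As an illustration of how these patterns might be applied, here is a minimal Python sketch that maps a component name to its super-component. The patterns are copied from the list above; the function name super_component, the short group labels, the use of re.search (a match anywhere in the name), and the first-match-wins ordering are all assumptions of this sketch, not part of the policy.

```python
import re

# Patterns copied from the lists above. Using re.search (match anywhere in
# the name) and letting the first matching group win are assumptions.
SUPER_COMPONENTS = [
    ("XGL", r"xorg|X11|compiz|chkfontpath|imake|libdmx|libdrm|libfontenc|libFS"
            r"|libICE|libSM|libwnck|libxkbfile|mesa|pyxf86config"
            r"|system-config-display|xkeyboard-config|xrestop|xsri"),
    ("Gecko", r"devhelp|epiphany.*|fedora-bookmarks|firefox|galeon|gecko-sharp2"
              r"|htmlview|mozilla|seamonkey|thunderbird|yelp"),
    ("Evolution", r"evolution.*|gnome-pilot.*|gtkhtml[23]|libsoup"),
    ("OpenOffice", r"^(agg|fribidi|hunspell.*|icu|libgsf|libwmf|libwpd"
                   r"|openoffice.org|planner)"),
]


def super_component(component):
    """Return the label of the first group whose pattern matches, else None."""
    for group, pattern in SUPER_COMPONENTS:
        if re.search(pattern, component):
            return group
    return None


print(super_component("xorg-x11-drv-ati"))  # XGL
print(super_component("thunderbird"))       # Gecko
print(super_component("kernel"))            # None
```

Keeping the groups in a plain list makes it easy to reorder or extend them whenever developers decide to change the grouping.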

Needless to say, these super-components are not set in stone; they are just examples (and a record of the current practice) and can (and should) be changed whenever developers feel like changing them.

Virtual owners

Related to these super-components is the use of virtual owners. The idea is that instead of using real email lists (e.g., xgl-maint@redhat.com), we could assign bugs from a group of components to a virtual owner (e.g., the virtual email address xgl-owner@redhat.com), and anybody interested in those components would use the "Users to watch" field in the user preferences in bugzilla to watch all emails to that group. It is not a big deal to create new virtual users, and watching virtual owners is easier than fiddling with email lists (the former can be done by each user; the latter has to be done by bugzilla administrators).

Prior to bug 234261 being fixed, the method of using email addresses like xgl-maint@redhat.com (which I won't describe here in detail) was required because we needed a way to easily determine whether an X or OpenGL bug had been triaged or not. The policy was that bugs assigned to xgl-maint were either not yet triaged or in the process of being triaged, and bugs assigned to developers had been fully triaged. Now that bug 234261 has been fixed, the holding-area email lists are redundant.

There is currently a non-uniform policy on the use of *-maint@redhat.com addresses between the Xorg group and caillon with Martin Stránský. In the Xorg group, xgl-maint@redhat.com is used as a placeholder for bugs which have not yet been triaged and assigned to a particular developer; when such bugs are switched to the ASSIGNED state, they are always assigned to one human being (possibly with some people on the Cc: list). By contrast, I was asked by caillon not to change the assignee of ASSIGNED (bugzilla terminology is confusing) bugs from gecko-maint@redhat.com, because he and Martin Stránský will divide the work between them later. Of course, there is nothing wrong with either approach as long as it works for the developers, whom the whole bug triaging and related stuff should serve in the first place.

FIXME: I would like to also expand this section with another paragraph at the end that explains how we might use virtual owners. You give some hints that we might use them as super-components and also as more fine grained components (e.g., xorg-video-drivers-owner). I would like to see these expanded into a full strawman proposal. For example, you could suggest that we create a hierarchy of virtual owners that would allow people to monitor from the highest level super-components all the way down to the individual packages. The idea here is to get people to start thinking about _how_ we will use virtual owners. It will probably not be the final way we implement the strategy, but at least it will be a starting point to help people start thinking in the right direction.

However, this could be an inspiration for other people. I am thinking, for example, of xorg-video-drivers-owner@redhat.com shared by (at least) ajax and airlied. [FIXME: 2B decided by developers]

Possible transitions from NEW state

ASSIGNED -- the bug has been triaged, a way to reproduce it has been found and described, and the comments/attachments to the bug contain all the information which the reporter is able to provide. For hardware-related bugs (e.g., most xorg bugs, unfortunately), the reproduction requirement usually cannot be satisfied (unless by chance a triager has the same hardware as the reporter).
Generally speaking, an ASSIGNED bug means that the bug triager won't touch it anymore, with the exception of old NEEDINFO bugs (see below).
NEEDINFO -- no! This should not be considered a valid transition from any state, IMHO, but always just a temporary state after which the bug returns to its previous base state. See below for more discussion of NEEDINFO and the standardized comments for setting it.
CLOSED/INSUFFICIENT_DATA -- despite what the training materials for Bugzilla say, this is not automatic (thankfully) and has to be set manually for each bug (or using mass changes in bugzilla queries). The two main situations in which it is used are when the reporter loses interest in the bug (because of our delays in replying, which should happen less and less) and never replies to our questions, or when they lose access to a piece of hardware needed to reproduce the bug.
Currently the process is that a special query shows all bugs which have been in the NEEDINFO state for a month, and the bugmaster goes through this list daily (or at least every other day) and manually decides what to do with each bug (send one more notice, close as INSUFFICIENT_DATA, or switch to ASSIGNED because all the information was actually provided and the reporter only forgot to check the proper checkbox).

When a bug is to be closed with this resolution, a standard comment is added:

Since there are insufficient details provided in this report for us to investigate the issue further, and we have not received feedback to the information we have requested above, we will assume the problem was not reproducible, or has been fixed in one of the updates we have released for the reporter's distribution.

Users who have experienced this problem are encouraged to upgrade to the latest update of their distribution, and if this issue turns out to still be reproducible in the latest update, please reopen this bug with additional information.

Closing as INSUFFICIENT_DATA.
CLOSED/CANTFIX -- most often used for binary-only drivers in Xorg, where this comment is added (for the less frequent ATI driver reports, I add only the first paragraph):

Thanks for the report. We are sorry that we cannot help you with your problem, but we are not able to support binary-only drivers. If you are able to reproduce this issue using only open source software, please reopen this bug with the additional information, but in the meantime I have no choice but to close this bug as CANTFIX (because we really cannot fix it).

For users who are experiencing problems installing, configuring, or using the unsupported 3rd party proprietary nvidia video driver, Nvidia provides indirect customer support via an online web based support forum. Nvidia monitors these web forums for commonly reported problems and passes them on to Nvidia engineers for investigation. Once they've isolated a particular problem, it is often fixed in a future video driver update.

The NVNews Nvidia Linux driver forum is located at:

http://www.nvnews.net/vbulletin/forumdisplay.php?s=&forumid=14

Once you have reported this issue in the Nvidia web forums, others who may have experienced the particular problem may be able to assist. If there is a real bug occurring, Nvidia will be able to determine this, and will likely resolve the issue in a future driver update for the operating system releases that they officially support.

While Red Hat does not support the proprietary nvidia driver, users requiring technical support may also find the various X.Org, XFree86, and Red Hat mailing lists helpful in finding assistance:

X.Org mailing lists:

http://www.freedesktop.org/XOrg/XorgMailingLists

XFree86 mailing lists:

http://www.xfree86.org/sos/lists.html

Red Hat mailing lists:

https://listman.redhat.com/mailman/listinfo
CLOSED/CURRENTRELEASE (or CLOSED/RAWHIDE) -- used for Fedora bugs (RHEL bugs have to go through QA etc.) when the bug is fixed in an updated release of the package or in Rawhide. It is also the response when a reporter, after getting NEEDINFO with a request to reproduce the bug on an updated system, is not able to reproduce it. There is no real standard for the closing comment, if any comment is needed at all (in most situations it is not); usually something in this style is enough (this one is more for the reporter's inability to reproduce a bug from an older distribution):

This issue has already been resolved in the latest version of the package (NAME-OF-THE-PACKAGE). Please reopen if you are able to reproduce this with a currently supported distribution.
CLOSED/NOTABUG -- one use of this resolution is obvious: the reported issue is not a bug at all; the reporter just made a mistake in using the program, or some other factor caused the problem (faulty hardware, configuration, etc.).
The other use of this resolution is trickier. It arises when the triager, despite all best efforts, is unable to reproduce the bug. Sometimes the reason for this is obvious and a more appropriate resolution can be used (e.g., CLOSED/CURRENTRELEASE), but quite often it isn't. Then this resolution is appropriate. (Or should it be CLOSED/WORKSFORME?)

Dark magic of NEEDINFOing

Although I claimed that NEEDINFO is not a valid transition state, this by no means implies that NEEDINFO is unimportant or that it shouldn't be used. On the contrary, for the bug triager it is the most important state that a bug under her control can be in.

The main reason why I don't want to talk about NEEDINFO as another transition state is to emphasize that a bug in the NEEDINFO state doesn't leave the bug triagers' control -- it should be kept under control, and if it is not alive enough, it should be closed (more about that later).

Putting a bug into NEEDINFO gets it into something I would call The Iron Grip of NEEDINFO. One of the bug triager's regular tasks is to run daily an "old NEEDINFO" query, which finds all bugs in the NEEDINFO status in the (currently) Xorg, Gecko, and Evolution super-components which have not changed in the last 30 days. His duty is then to do something about each of these bugs:

close as CLOSED/INSUFFICIENT_DATA,
ASSIGN to a developer (e.g., it happens that the NEEDINFO state was not cleared even though the information was provided), or, if neither of these is appropriate,
do something else: make a comment, any comment, at least one threatening closure.

The point is obviously that something MUST change in the bug, so that the bug won't show up in the same query the next day. Of course, if the reporter makes a reasonable case for not being able to answer immediately, the bug triager should be free to allow him more time, but always in one-month installments, pinging him at least every month. By keeping one month as a rigorous schedule for looking at every NEEDINFO bug, we can be sure that no bug slips out of control.
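
The selection logic of the old NEEDINFO query can be sketched in a few lines of Python. This is only an illustration working on an in-memory list, not the actual saved Bugzilla query: the Bug record, its field names, the group labels, and the old_needinfo function are hypothetical names introduced for this sketch.

```python
from datetime import date, timedelta
from typing import NamedTuple


class Bug(NamedTuple):
    """Hypothetical, simplified bug record (not the Bugzilla schema)."""
    bug_id: int
    status: str
    group: str          # super-component label
    last_changed: date  # date of the last change of any kind


# Super-components currently covered by the query, per the text above.
WATCHED_GROUPS = {"Xorg", "Gecko", "Evolution"}


def old_needinfo(bugs, today, max_age_days=30):
    """Return NEEDINFO bugs in watched groups unchanged for max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    return [
        bug for bug in bugs
        if bug.status == "NEEDINFO"
        and bug.group in WATCHED_GROUPS
        and bug.last_changed <= cutoff
    ]


bugs = [
    Bug(1, "NEEDINFO", "Xorg", date(2007, 9, 1)),         # stale: listed
    Bug(2, "NEEDINFO", "Xorg", date(2007, 10, 10)),       # changed recently
    Bug(3, "ASSIGNED", "Gecko", date(2007, 1, 1)),        # not NEEDINFO
    Bug(4, "NEEDINFO", "OpenOffice", date(2007, 1, 1)),   # group not watched
]
print([bug.bug_id for bug in old_needinfo(bugs, date(2007, 10, 18))])  # [1]
```

Because any change (even a single comment) moves last_changed forward, a bug the triager has acted on drops out of the result until another 30 days pass, which is exactly the one-month rhythm described above.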

The question is how many times a bug can show up in the query (i.e., how many months it can be saved from being closed). The answer is very vague. Usually, the bug gets at least one comment which explicitly says that if no answer is provided within a month, it will be closed as CLOSED/INSUFFICIENT_DATA, but the judgement of when enough is enough is not rigid. If there are clear signs that the bug is useless and that the answer won't come or will be useless (e.g., a year-old bug without a comment in the last ten months, in a component which has changed drastically in the last year), then probably not even a reminder needs to be sent. However, bugs in RHEL-like distributions with grave impact may get multiple reminders, and direct email/IRC can be tried, before patience runs out.

This iron grip of NEEDINFO is so useful in eliminating inconvenient bugs that the basic strategy for dealing with a bug that is problematic for any reason is to convert it into the NEEDINFO state, after which it will be dealt with automatically. The flip side is that every developer should be aware that a bug in the NEEDINFO state is on a slippery slope towards being closed within a limited amount of time. (Of course, developers' decisions are sacred, so if needed, a comment such as "Please, don't close ever." can be added to avoid closing.)

A controversial issue is closing RHEL-like bugs for inactivity. Certainly the general rule holds that RHEL bugs should never be closed due to inactivity. However, there are many bugs in bugzilla assigned to the RHEL product which are not really RHEL bugs. Many of them were reported by Red Hat employees while testing beta versions of RHEL. Of course, if a bug hits our internal reporter, then it will probably hit our customers as well, so such bugs should get a high level of scrutiny. Other members of this group are bugs which are actually CentOS bugs (although for the sake of both distributions, reporters should be asked to file a bug in the CentOS bugzilla).

Nevertheless, I believe that even such a bug doesn't have to be treated with the same level of rigidity as a bug which comes through IssueTracker. If the internal beta testers are unresponsive to NEEDINFO, and the best efforts to communicate with them have been made (that includes emails and pinging them on IRC; but remember that Red Hat is not only the engineering team and not everybody with @redhat.com in her email address uses IRC), then the bug should be closed without any mercy. ([FIXME: 2B discussed; Kevin doesn't agree with this.])

Related to that, but much less important, is the question of whether the reporter of a bug against the RHEL product has a valid entitlement to official Red Hat support, and so whether the bug should be filed in Issue Tracker instead. The standard message for redirecting the reporter to IT is:

For official Red Hat Enterprise Linux support, please log into the Red Hat support website at http://www.redhat.com/support and file a support ticket, or alternatively contact Red Hat Global Support Services at 1-888-RED-HAT1 to speak directly with a support associate and escalate an issue.

The point of such a message is that although certainly nobody will remove a RHEL bug from bugzilla only because the reporter hasn't filed it in IssueTracker, such bugs will get lower priority and are susceptible to WONTFIX or UPSTREAMing. Of course, it should be noted (even to the reporters themselves) that their bug won't be closed only because they use CentOS, but it will be treated with the same priority as Fedora bugs.

One important aspect of NEEDINFO is that it is easy to create standardized responses. There are many standard comments for switching a bug to the NEEDINFO state (the original versions of many of these comments come from X/GL Team Bug Policies). This is the list:

Xorg asking for more information:

Thanks for the bug report. We have reviewed the information you have provided above, and there is some additional information we require that will be helpful in our diagnosis of this issue.

Please attach your X server config file (/etc/X11/xorg.conf) and X server log file (/var/log/Xorg.*.log) to the bug report as individual uncompressed file attachments using the bugzilla file attachment link below.

Could you please also try to run without any /etc/X11/xorg.conf whatsoever and let X11 autodetect your display and video card? Attach to this bug /var/log/Xorg.0.log from this attempt as well, please.

We will review this issue again once you've had a chance to attach this information.

Thanks in advance.
Trying to force activity from reporter (when the situation is not ready for CLOSED/INSUFFICIENT_DATA yet):

Reporter, could you please reply to the previous question? If you won't reply in one month, I will have to close this bug as INSUFFICIENT_DATA. Thank you.
There is also a message asking the reporter to attempt to reproduce the bug on the current updates of her distribution (I have also developed a tendency to add to all NEEDINFOs a threat of possible closure after a predetermined period of time, especially when I suspect a high probability of no answer):

Since this bugzilla report was filed, there have been several major updates, which may have resolved this issue. Users who have experienced this problem are encouraged to upgrade their system to the latest version of their distribution available.

Please, if you experience this problem on the up-to-date system, let us know in a comment for this bug, or let us know whether the upgraded system works for you.

If you won't be able to reply in one month, I will have to close this bug as INSUFFICIENT_DATA. Thank you.
Special suicide NEEDINFO for asking reporter to upstream bug (of course, http://bugzilla.mozilla.org needs to be replaced by particular bug database -- http://bugzilla.gnome.org or http://bugzilla.freedesktop.org):

If this issue turns out to still be reproducible in the latest updates for this Fedora Core release, please file a bug report in the upstream bugzilla located at http://bugzilla.mozilla.org in the particular component.

Once you've filed your bug report to the upstream bugzilla, if you paste the new bug URL here, Red Hat will continue to track the issue in the centralized upstream bug tracker, and will review any bug fixes that become available for consideration in future updates.

Setting status to NEEDINFO, and awaiting upstream bug report URL for tracking.

Thanks in advance.

When this NEEDINFO request is fulfilled, the number of the upstream bug has to be inserted in the External Bugzilla References section of the bug page, and the bug is closed as CLOSED/UPSTREAM with this comment:

We believe it is more appropriate to let this bug be resolved upstream. Red Hat will continue to track the issue in the centralized upstream bug tracker, and will review any bug fixes that become available for consideration in future updates.

Thank you for the bug report.

Bugs retention policy

(currently this policy so far applies only to Xorg, Evolution, and Gecko & co. bugs)

Although we would like to keep bugs open until they are solved, Red Hat operates with limited resources (which is especially true of the desktop team), and so we have to limit the number of bugs which are currently open (and for which we implicitly promise that some development will be done). Therefore, old bugs have to be moved out of the way.

Fedora-test bugs have to be managed -- when the distribution being tested is released, all bugs against the *-test[1234] products should be either closed or moved to the released distribution itself. So, when F10 is released, F10-test* bugs should be either closed or moved to F10. When the next distribution is released (e.g., F11 in our case), all remaining test bugs will be CLOSED with a "please reproduce and reopen if still applicable" comment:

Fedora Core 5 and Fedora Core 6 are, as we're sure you've noticed, no longer test releases. We're cleaning up the bug database and making sure important bug reports filed against these test releases don't get lost. It would be helpful if you could test this issue with a released version of Fedora or with the latest development / test release. Thanks for your help and for your patience.

[This is a bulk message for all open FC5/FC6 test release bugs. I'm adding myself to the CC list for each bug, so I'll see any comments you make after this and do my best to make sure every issue gets proper attention.]

Therefore, bugs which are just placeholders for future action, notes about things to investigate, etc., should never be filed against a *test* version, and probably shouldn't be filed against any version other than Rawhide.
The same applies to obsolete bugs. After Fedora Legacy shut down, the policy is that a distribution is declared as obsolete one month after release of n+2 distribution (e.g., one month after release of Fedora 7 for Fedora Core 5 bugs). When this happens, the following comment (with changed version numbers, of course) is added to all applicable bugs:

Fedora Core 5 is no longer supported, could you please reproduce this with the updated version of the currently supported distribution (Fedora Core 6, or Fedora 7, or Rawhide)? If this issue turns out to still be reproducible, please let us know in this bug report. If after a month's time we have not heard back from you, we will have to close this bug as CANTFIX.

Setting status to NEEDINFO, and awaiting information from the reporter.

Thanks in advance.

After a month of waiting (so reporters actually get two months before obsolete bugs are liquidated), all such bugs which are still open are closed with a message like this:

We haven't got any reply to the last question about reproducibility of the bug with Fedora Core 6, Fedora 7, or Fedora devel. Mass closing this bug, so if you have new information that would help us fix this bug, please reopen it with the additional information.
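
The timing of the obsolescence rule in this section (a distribution becomes obsolete one month after the release of distribution n+2) can be sketched as follows. The function names, the integer version numbering, and the decision to clamp "one month later" to the last day of a shorter month are assumptions of this sketch; the release date used in the example is for illustration only and should be checked before being relied on.

```python
import calendar
from datetime import date


def add_one_month(day):
    """Return the same day of the next month, clamped to that month's end."""
    year = day.year + (1 if day.month == 12 else 0)
    month = 1 if day.month == 12 else day.month + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day.day, last_day))


def obsolete_on(version, release_dates):
    """Date on which `version` is declared obsolete, or None if n+2 is unreleased.

    release_dates maps an integer version number to its release date.
    """
    successor = version + 2
    if successor not in release_dates:
        return None
    return add_one_month(release_dates[successor])


# Illustrative release date for Fedora 7 (an assumption of this example).
release_dates = {7: date(2007, 5, 31)}
print(obsolete_on(5, release_dates))  # 2007-06-30
print(obsolete_on(6, release_dates))  # None (version 8 not in the table)
```

Adding the second month of waiting for the NEEDINFO reply is then just another call to add_one_month on the result.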