Problem Management: Process Definitions
This article answers common questions about the Problem Management module and process.
NOTE: The Action Checklist for Unplanned Service Disruptions and Outages (z.umn.edu/outage) is the definitive process for handling service disruptions.
What are the goals of Problem Management?
- Identify and remove underlying causes of Incidents.
- Incident and Problem prevention.
- Improve organizational efficiency by ensuring that Problems are prioritized correctly according to impact, urgency, and severity.
- Greater service availability by eliminating recurring Incidents.
- Incidents are contained before they impact other systems.
- Elimination of incidents before they impact services through proactive problem management.
- Prevention of known errors recurring or occurring elsewhere across the system.
- Improved First Call Resolution rate.
How does Problem Management differ from Incident Management?
- The purpose of Incident Management is to restore normal service as quickly as possible and minimize adverse impacts on business operations. Incident Management is used to manage any event that disrupts or has the potential to disrupt any IT service and associated processes.
- The purpose of Problem Management is to eliminate the root cause of Incidents, prevent them from recurring or happening in the first place, and to minimize the impact of Incidents that cannot be prevented. Problem Management includes activities to diagnose and discover the resolution to the underlying cause of Incidents, ensure that the resolution is implemented (often through Change Management), and eliminate errors before they result in Incidents.
- One of the outcomes of the problem management process is a known error record.
- A known error is a problem that has been successfully diagnosed and for which either a workaround or a permanent resolution has been identified. Known errors should be documented in the knowledge base as articles so that a resolution is captured and shared across the organization and the user community. That way, if end users encounter the issue in the future they can self-solve it or the Service Desk can easily provide a solution.
- Aside from end users, anyone in ServiceNow may create a problem record. Exactly who should create a problem depends somewhat on how the problem was detected (see above) and the nature of the problem. Typically, creating a problem falls to a functional team member (Tier 2 or 3), service director, or service desk manager.
- Problems may be identified in a number of ways, but one of the most common is by tracking multiple incidents to a single underlying cause. A number of Incident records may be related to a single problem record and managed much more effectively. Several features in ITSM Problem Management help communicate workarounds, publish knowledge base articles, initiate change management actions, and complete root cause analysis. Whereas Incidents are more often concerned with alleviating symptoms, problems deal directly with the true cause of a disruption.
- Problems may also be discovered before any incidents have been logged. A security vulnerability that has yet to be exploited is one example.
- Senior managers and service directors are set up to receive automatic notifications any time a critical- or high-priority problem is created. Users may subscribe to these and other notifications by clicking Self Service > My Profile > Notification Preferences and following these instructions.
- When a problem is assigned to a group, members of that group will automatically be notified by email.
- Aside from the actions you take to discover the root cause of the problem and resolve it, you should document your findings in the Work Notes field and, as the nature or scope of the disruption becomes clearer, the Description.
- If you discover a workaround that might allow users to continue using the affected service, enter the steps into the Workaround field and use the Communicate Workaround link to distribute that information to end users (see below). At this time, it may be appropriate to resolve Incidents associated with the problem.
- Conduct root cause analysis: What are the underlying factors that caused the disruption and how could/will they be avoided in the future? This is perhaps the most important part of the problem management process, since the information may help users avoid the problem in the future. Creating a knowledge article is instrumental in making such information accessible to both the Service Desk and end users.
- Once you've resolved the issue, documented the root cause and resolution, and drafted a knowledge article, the last step is to close the problem record. Clicking "Close Problem" will not only close the problem record, but will resolve all open related incidents as well.
- Short Description: Ideally, it should contain the name of the affected service and a very brief description of the problem. The information in this field will be included in all problem email and SMS notifications (see above), hence the need for brevity.
- Description: A clear, concise description of the problem written from the end user's perspective. The information contained in the field will appear in knowledge base articles (KBAs) and requests for change (RFCs) generated from this problem record, so it should not be written haphazardly. If the problem was discovered via the Incident Management process, look to the related incidents for symptomatic information and summarize it in this field.
- Work Notes: As with incidents, this field updates the Activity Log with troubleshooting actions and information from the individuals investigating the problem.
- Workaround: This field should describe temporary solutions or actions that end users can take to accomplish whatever tasks are being inhibited by the service disruption. There may be more than one viable workaround over the course of investigating a problem, and a history of past workarounds is displayed on the form. Workaround text will often be sent to the end users and can appear in KBAs generated from a problem record, so bear that in mind when drafting a workaround. The Communicate Workaround link will send the latest Workaround to each caller of any incidents related to the problem record.
- Resolution: This is the solution or fix to the disruption. Sometimes the resolution is to do nothing, or only to mitigate the problem, if a solution is too costly or cannot be implemented by the organization. The resolution text will be distributed to the end users, so it should be written with them in mind: be thorough, yet still clear and concise. Overly technical language should be saved for the Root Cause (see below). This field must be completed before a problem can be closed.
- Root Cause: The root cause of the problem is the underlying cause of the disruption, so this field should contain a technical overview of the cause with enough detail to give readers a basic understanding of what caused the disruption. This field must be completed before a problem can be closed.
- RFC: A request for change (RFC) may be initiated to resolve a problem and can be created or opened from the problem form.
- Incidents: Any number of open or resolved incidents may be related to a problem record. As noted above, Workarounds can be easily sent to each incident's caller, and when a problem record is closed, all related incidents are resolved and the problem Resolution is automatically sent to each caller.
- Knowledge: One goal of the problem management process is to identify known errors and record them as knowledge base articles (KBAs) in the knowledge base. A KBA can be created using either or both of the following links on the problem form:
- Post Knowledge (available to anyone): This creates a new KBA in the draft state under the "Known Error" topic that includes both the Description text and the latest Workaround (both fields are required to use this feature). Articles created in this manner follow the standard knowledge review process.
- Post News (knowledge editors only): This feature creates a new KBA in the published state under the "News" topic, which means it will appear on the Self Service > Knowledge page under the News dashboard widget. Description and Workaround text will be included automatically in the article, so those fields must be completed before this feature may be used.
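
The closing and workaround rules described in the field definitions above can be easier to follow as a small model. The sketch below is illustrative only: it is plain Python, not ServiceNow code or the ServiceNow API, and the class, field, and method names (Problem, Incident, communicate_workaround, close) are hypothetical. It assumes only what this article states: the latest Workaround goes to the caller of each related incident, Resolution and Root Cause must be filled in before a problem can be closed, and closing a problem resolves its open related incidents and sends the Resolution to each caller.

```python
from dataclasses import dataclass, field


@dataclass
class Incident:
    """Hypothetical stand-in for an incident related to a problem."""
    caller: str
    state: str = "open"  # "open" or "resolved"


@dataclass
class Problem:
    """Hypothetical stand-in for a problem record (not a ServiceNow class)."""
    short_description: str
    description: str = ""
    workarounds: list[str] = field(default_factory=list)  # history, newest last
    resolution: str = ""
    root_cause: str = ""
    incidents: list[Incident] = field(default_factory=list)
    state: str = "open"

    def communicate_workaround(self) -> list[tuple[str, str]]:
        """Return (caller, latest workaround) for every related incident."""
        if not self.workarounds:
            raise ValueError("No workaround has been recorded")
        latest = self.workarounds[-1]
        return [(inc.caller, latest) for inc in self.incidents]

    def close(self) -> list[tuple[str, str]]:
        """Close the problem and resolve its open related incidents.

        Returns (caller, resolution) for every related incident, modeling
        the Resolution text that is sent to each caller.
        """
        required = (("Resolution", self.resolution),
                    ("Root Cause", self.root_cause))
        missing = [name for name, value in required if not value.strip()]
        if missing:
            raise ValueError("Cannot close; missing: " + ", ".join(missing))
        for inc in self.incidents:
            if inc.state == "open":
                inc.state = "resolved"
        self.state = "closed"
        return [(inc.caller, self.resolution) for inc in self.incidents]
```

In this model, calling close() on a problem whose Resolution or Root Cause is empty raises an error and leaves the related incidents untouched, which mirrors the requirement that both fields be completed before a problem can be closed.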