Common Sense Software Engineering – Part III; Risk Analysis

Steve Naidamast

Sep 30, 2015

CPOL

The following piece describes a process for performing “Risk Analysis”, also known as “Risk Management”. What the reader will find is that contrary to popular development paradigms, true software engineering practices require quite a bit of upfront analysis for new project development as the prior piece on “Requirements Analysis” demonstrated.

In the frenzy of the so-called “new development environments”, many technical managers and professional developers have attempted, and are still attempting, to find techniques that will allow them to avoid such in-depth processes and yet still create quality software deliverables. No matter how much marketing, PR, and other technical propaganda is thrown at the issue, without quality analysis, quality will never be part of the end result.

On another note, the reader will see that this work, like the other articles in this series, relies on Steve McConnell and other software engineering analysts of years past. Steve McConnell’s 1996 classic, “Rapid Development”, has to this day never been refuted and is in fact still being corroborated by subsequent studies in this arena. As a result, many quality business technical personnel still consider it the “Bible” of software engineering.

>>>

One of the most important aspects of the management of any software project, large or small, is the management of risk. “Risk”, in terms of project management, is any situation that either prevents a project from reaching a successful and timely conclusion or delays a well-run project on its way to that same conclusion.

Overwhelmingly, IT project management ignores this crucial aspect of software development. As a result, development schedules slip, and the same managers then put pressure on their staff to make up for lost time that should have been planned for in the first place.

Every software project is faced with risks of all types and severities. In fact, any endeavor is faced with similar potential obstacles to completion. If you attempt to climb a mountain you risk breaking a limb and, on the more treacherous climbs, losing your life.

Even simply commuting or driving to work in the morning presents commuters with any number of potential risks from simple schedule delays to derailment. Though such situations tend to have low levels of occurrence they do in fact happen.

Every software project, when initiated, begins with one primary risk to timely completion: that of technical support, both for setting up development resources when needed and for moving the completed project to production. Though many companies have extensive technical support infrastructures, their response to development requests is always weighed against the needs of the production environment, which makes such requests difficult to schedule around. Production support is somewhat better. However, like development support, it too depends on who is assigned to your task and on their experience level.

This does not mean that every project will experience technical support delays, but they can happen and must be planned for accordingly.

Planning for project risk usually falls into one of five categories with the most credible and most valuable finding itself relegated to the “bottom of the heap”:

Crisis Management

“Firefighting”; address risks only after they have become problems

Fix on Failure

Detect and react to risks quickly, but only after they have occurred

Risk Mitigation

Plan ahead of time to provide resources to cover risks if they occur, but do nothing to eliminate them initially

Prevention

Implement and execute a plan as part of the software project to identify risks and prevent them from becoming problems

Elimination of Root Causes

Identify and eliminate factors that make it possible for risks to exist at all

Planning for the elimination of potential risk factors is by far the best cost- and time-saving process a software project manager can build into his or her planning of a project. To pass over this most valuable approach in favor of any other level of risk management is simply to increase the “gambling” quotient against a project. Here, a project manager begins to enter the “wishful thinking” syndrome, gambling that no risk factor will crop up that cannot be handled properly.

Risk management, outside of schedule estimation, is the most complex issue facing software project managers. Risk management, also known as “risk assessment”, requires quite a bit of upfront effort before any technical aspects of a project begin. As such, it is an effort that most managers would prefer to avoid, and the still high number of project failures in the industry demonstrates this.

This effort can be broken down into the components shown in the following chart:

(Rapid Development, Steve McConnell – 1996)

The most common project risks also fall into the “classic mistakes” category and can be summed up in the following list:

Feature creep
Requirements or developer gold-plating
Shortchanged quality
Overly optimistic schedules
Inadequate design
Silver-bullet syndrome
Research oriented development
Weak personnel
Contractor failure
Friction between developers and users/customers

Risks to a successful project life-cycle come in all sizes and shapes, making software development one of the most risk-prone technical endeavors. Some observers who have studied the field believe that gambling in a Las Vegas casino offers better odds of success than those currently on offer for the successful completion of a project.

As a result, any project that has no risk management planning is all but guaranteed to fail, and those that do not fail have simply succeeded by sheer luck. However, it is surprising to find out just how many project managers who have succeeded with a few small projects use them as their yardstick when planning more complex and difficult projects.

It is just as surprising to see technicians come out of very difficult development environments convinced that arduous schedules and monumental stress are part and parcel of successful development, when such environments are instead simply examples of poor planning, poor scheduling practices, and poor management.

Though not listed in the preceding group, stress is actually a terrible risk to quality software development and has been found to be a major factor in high rates of defects; sometimes up to 40 percent of errors are due to the stress levels placed on developers.

Even so, many managers and even developers have come to believe that stress is simply part and parcel of the software industry, so much so that it is not even considered a risk factor when planning projects. Since management likes to associate high stress levels with a reduction in costs and development time, many technicians have come to believe that without such working conditions projects cannot be completed on time. However, just the opposite has been found: stress increases both the cost and the time it takes to complete a given project, for a variety of reasons. For example, high levels of stress reduce developer creativity, lower morale, lower the ability to concentrate, and simply fatigue technicians to the point where they cannot produce optimally (see chapter 9.1, “Overly Optimistic Scheduling”, in “Rapid Development”, Steve McConnell – 1996).

Most such stress comes from schedules determined by management without any understanding of what they are asking. Again we can turn to the trading rooms of financial organizations, where stress levels are often so enormous that technicians typically leave after around 12 to 18 months. The resulting high turnover drives up costs for such companies and their project development dramatically, yet few give it any thought, and some in fact simply plan it into their budgets.

Despite the reputation for quality of the technicians who work for American financial organizations, those organizations that run development under high levels of stress do not turn out high quality products because they simply can’t.

Excessive or irrational schedules are probably the single most destructive influence in all of software (Jones – 1991, 1994).

To get an idea just how many risk factors are involved in software development beyond the few already mentioned, please take a look at the detailed risk-factor chart that follows …

Potential project risks that will affect schedule…

Risk Type	Risk Detail
Schedule Creation	Schedule, resources, and project definition have all been dictated by the customer or upper management and are not in balance.
	Schedule is optimistic, “best case” (rather than realistic, “expected case”)
	Schedule omits necessary tasks
	Schedule was based on the use of specific team members, but those team members were not available.
	Cannot build a project of the size specified in the time allocated.
	“Product” of the project is larger than estimated (in lines of code, function points or percentage of a previous project’s size).
	Effort is greater than estimated.
	Re-estimation in response to schedule slips is overly optimistic or ignores project history.
	Excessive schedule pressure reduces productivity.
	Target date is moved up with no corresponding adjustments to the project scope or available resources.
	A delay in one task causes cascading delays in dependent tasks.
	Unfamiliar areas of the product take more time than expected to design and implement.
Organization & Management	Project lacks an effective top-management sponsor.
	Project languishes too long in fuzzy front end.
	Layoffs and cutbacks reduce team’s capacity.
	Management or marketing insists on technical decisions that lengthen the schedule.
	Inefficient team structure reduces productivity.
	Management review/decision cycle is slower than expected.
	Budget cuts upset project plans.
	Management makes decisions that reduce the project team’s motivation.
	Nontechnical third-party tasks take longer than expected (budget approval, equipment purchase approval, legal reviews, security clearances, etc).
	Planning is too poor to support the desired project speed.
	Project plans are abandoned under pressure, resulting in chaotic, inefficient development.
	Management places more emphasis on heroics than accurate status reporting, which undercuts its ability to detect and correct problems.
Development Environment	Facilities are not available on time.
	Facilities are available but inadequate (e.g., no phones, network wiring, furniture, office supplies, etc.).
	Facilities are crowded, noisy or disruptive.
	Development tools are not in place by the desired time.
	Development tools do not work as expected; developers need time to create workarounds or to switch to new tools.
	Development tools are not chosen based on their technical merits and do not provide the planned productivity.
	Learning curve for new development tool is longer or steeper than expected.
End-Users	End-user insists on new requirements.
	End-user ultimately finds product to be unsatisfactory, requiring redesign and rework.
	End-user does not buy into the project and consequently does not provide needed support.
	End-user input is not solicited, so product ultimately fails to meet user expectations and must be reworked.
Customer	Customer insists on new requirements.
	Customer review/decision cycles for plans, prototypes and specifications are slower than expected.
	Customer will not participate in review cycles for plans, prototypes and specifications or is incapable of doing so – resulting in unstable requirements and time-consuming changes.
	Customer communication time (e.g., time to answer requirements-clarification questions) is slower than expected.
	Customer insists on technical decisions that lengthen the schedule.
	Customer micro-manages the development process, resulting in slower progress than planned.
	Customer-furnished components are a poor match for the product under development, resulting in extra design and integration work.
	Customer-furnished components are poor quality, resulting in extra testing, design and integration work and in extra customer-relationship management.
	Customer-mandated support tools and environments are incompatible, have poor performance or have inadequate functionality, resulting in reduced productivity.
	Customer will not accept the software as delivered even though it meets all specifications.
	Customer has expectations for development speed that developers cannot meet.
Contractors	Contractor does not deliver components when promised.
	Contractor delivers components of unacceptably low quality, and time must be added to improve quality.
	Contractor does not buy into the project and consequently does not provide the level of performance needed.
Requirements	Requirements have been baselined but continue to change.
	Requirements are poorly defined and further definition expands the scope of the project.
	Additional requirements are added.
	Vaguely specified areas of the product are more time-consuming than expected.
Product	Error-prone modules require more testing, design and implementation work than expected.
	Unacceptably low quality requires more testing, design and implementation work to correct than expected.
	Pushing the computer science state-of-the-art in one or more areas lengthens the schedule unpredictably.
	Development of the wrong software functions requires redesign and implementation.
	Development of the wrong software interface results in redesign and implementation
	Development of extra software functions that are not required (gold-plating) extends the schedule
	Meeting product’s size or speed constraints requires more time than expected, including time for redesign and reimplementation
	Strict requirements for compatibility with existing system require more testing, design and implementation than expected
	Requirements for interfacing with other systems, other complex systems, or other systems that are not under the team’s control result in unforeseen design, implementation and testing
	Requirement to operate under multiple operating systems takes longer to satisfy than expected
	Operation in an unfamiliar or unproved software environment causes unforeseen problems
	Operation in an unfamiliar or unproved hardware environment causes unforeseen problems
	Development of a kind of component that is brand new to the organization takes longer than expected
	Dependency on a technology that is still under development lengthens the schedule
External Environment	Product depends on government regulations, which change unexpectedly
	Product depends on draft technical standards, which change unexpectedly
Personnel	Hiring takes longer than expected
	Task prerequisites (e.g. training, completion of other projects, acquisition of work permit) cannot be completed on time
	Poor relationships between developers and management slow decision making and follow through
	Team members do not buy into the project and consequently do not provide the level of performance needed
	Low motivation and morale reduce productivity
	Lack of needed specialization increases defects and rework
	Personnel need extra time to learn unfamiliar software tools or environment
	Personnel need extra time to learn unfamiliar hardware environment
	Personnel need extra time to learn unfamiliar programming language
	Contract personnel leave before project is complete
	Permanent employees leave before project is complete
	New development personnel are added late in the project, and additional training and communications overhead reduces existing team members’ effectiveness
	Team members do not work together efficiently
	Conflicts between team members result in poor communication, poor designs, interface errors and extra rework
	Problem team members are not removed from the team, damaging overall team motivation
	The personnel most qualified to work on the project are not available for the project
	The personnel most qualified to work on the project are available for the project but are not used for political or other reasons
	Personnel with critical skills needed for the project can not be found
	Key personnel are available only part time
	Not enough personnel are available for the project
	People’s assignments do not match their strengths
	Personnel work slower than expected
	Sabotage by project management results in inefficient scheduling and ineffective planning
	Sabotage by technical personnel results in lost work or poor quality and requires rework
Design & Implementation	Overly simple design fails to address major issues and leads to redesign and reimplementation
	Overly complicated design requires unnecessary and unproductive implementation overhead
	Poor design leads to redesign and reimplementation
	Use of unfamiliar methodology results in extra training time and in rework to fix first time misuses of the methodology
	Product is implemented in a low-level language (e.g. assembler) and productivity is lower than expected
	Necessary functionality cannot be implemented using the selected code or class libraries; developers must switch to new libraries or custom-build the necessary functionality
	Code or class libraries have poor quality, causing extra testing, defect correction, and rework
	Schedule savings from productivity enhancing tools are overestimated
	Components developed separately cannot be integrated easily, requiring redesign and rework.
Process	Amount of paperwork results in slower progress than expected
	Inaccurate progress tracking results in not knowing the project is behind schedule until late in the project
	Upstream quality assurance activities are shortchanged, resulting in time-consuming rework downstream
	Inaccurate quality tracking results in not knowing about quality problems that affect the schedule until late in the project
	Too little formality (lack of adherence to software policies and standards) results in miscommunications, quality problems and rework
	Too much formality (bureaucratic adherence to software policies and standards) results in unnecessary, time-consuming overhead
	Management-level progress reporting takes more developer time than expected
	Half-hearted risk management fails to detect major project risks
	Software project risk management takes more time than expected.

Estimating Risk Exposure

Estimating risk exposure is a rather subjective form of analysis but nonetheless must be performed in order to be able to understand the severity of risk factors to a project. However, even subjectivity can be made more accurate by the use of basic risk exposure analysis.

Risk exposure analysis consists of two estimates: the size of the potential loss (in time) from an identified risk factor, and the probability that the loss will actually occur. A simple formula is standard in the industry for calculating risk exposure:

(% probability of loss) * (size of loss in weeks) = risk exposure factor

For example, if we were to use any development project at one major financial organization that the author worked at a number of years ago as a baseline, we would have to assume that there is upwards of a 75% probability of some impact on a project by delays from technical support. If we assume a conservative estimate of a loss in time of approximately 4 weeks, then the risk exposure would be as follows:

0.75 * 4 = 3 (weeks)

The result is a risk exposure factor of a possible loss of 3 weeks. Since we are only estimating a 75% probability of this occurring we are then not expecting to lose the complete 4 weeks in time.
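
The calculation above can be expressed as a small function. What follows is a minimal Python sketch rather than anything taken from the original sources; the function name `risk_exposure` and its parameter names are illustrative assumptions, and the probability is passed as a fraction (0.75 for 75%).

```python
def risk_exposure(probability_of_loss: float, size_of_loss_weeks: float) -> float:
    """Return the risk exposure in weeks: probability of loss times size of loss.

    `probability_of_loss` is a fraction between 0 and 1 (0.75 means 75%).
    The name and signature are hypothetical, chosen only for illustration.
    """
    if not 0.0 <= probability_of_loss <= 1.0:
        raise ValueError("probability_of_loss must be between 0 and 1")
    if size_of_loss_weeks < 0:
        raise ValueError("size_of_loss_weeks must not be negative")
    return probability_of_loss * size_of_loss_weeks


# The technical-support example from the text: 75% probability, 4-week loss.
print(risk_exposure(0.75, 4))  # 3.0
```

The validation checks are a design choice of this sketch: a probability outside the 0 to 1 range usually means a percentage was passed by mistake.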

Estimating the size of a loss is much easier than doing the same for the probability of a loss. As a result, there are several possible activities that can be performed to more accurately determine this part of the equation.

Have the person most familiar with the system as well as the development environment and political infrastructure estimate the probability of each risk and then convene a risk-estimate review.
Use the Delphi approach where each project team-member estimates each risk-factor individually. Then convene risk-estimate reviews to discuss and determine the most likely probability of each risk-factor until the entire team is satisfied with a final analysis.
Use simple “betting analogies” with personally significant amounts of money. For example, “If the facilities are ready on time you would win $125.00; if they are not you would lose $100.00.” The risk probability would then be the dollar amount on the downside divided by the total dollar amount at stake ($100.00 / ($100.00 + $125.00) = 0.44). (paraphrased from “Rapid Development”, Steve McConnell – 1996)
Use “adjective calibration” where each team member would select a risk level in terms of a verbal scale of phrases from “highly likely” through “highly unlikely”. Then convert the verbal assessments to quantitative assessments (Boehm – 1989).
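
Two of the techniques above reduce to simple arithmetic, which can be sketched in Python as follows. This is an illustration under stated assumptions, not a published method: the function names are hypothetical, and the numeric values in `ADJECTIVE_SCALE` are invented for the example and are not Boehm's calibration figures.

```python
def betting_probability(downside_stake: float, upside_stake: float) -> float:
    """Probability of loss implied by a bet: downside divided by total at stake."""
    if downside_stake < 0 or upside_stake < 0:
        raise ValueError("stakes must not be negative")
    total_at_stake = downside_stake + upside_stake
    if total_at_stake == 0:
        raise ValueError("at least one stake must be greater than zero")
    return downside_stake / total_at_stake


# Hypothetical verbal scale. These numbers are assumptions made for this
# sketch; a real team would agree on its own calibration.
ADJECTIVE_SCALE = {
    "highly likely": 0.90,
    "likely": 0.70,
    "even chance": 0.50,
    "unlikely": 0.30,
    "highly unlikely": 0.10,
}


def calibrated_probability(phrases: list[str]) -> float:
    """Average the numeric equivalents of the team's verbal assessments."""
    if not phrases:
        raise ValueError("at least one assessment is required")
    values = [ADJECTIVE_SCALE[phrase.lower()] for phrase in phrases]
    return sum(values) / len(values)


# The facilities bet from the text: lose $100.00 or win $125.00.
print(round(betting_probability(100.00, 125.00), 2))  # 0.44
```

Averaging the verbal assessments is only one way to combine them; a team could just as reasonably take the median or discuss outliers in a review, as the Delphi approach suggests.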

These methodologies may look rather unscientific. However, at the outset such estimation has no hard data to draw on, so any such process can yield only a “best estimate” at the time it is performed. Estimating the effort and time for a complete project life-cycle is theoretically impossible at initiation unless there is a good amount of metrics available from previous projects that allow a new project to be compared with the measurements of a similar project that has already been completed.

Once a list of risk factors has been determined along with their loss probabilities, loss sizes in weeks, and risk exposure (RE) values in weeks, a listing sorted by RE should be laid out for planning purposes. A sample of such a listing can be found below…

Example of a Prioritized Risk-Assessment Table

Risk	Probability of Loss	Size of Loss (wks)	Risk Exposure (wks)
Addition of unknown features	35%	8	2.8
Overly optimistic schedule	50%	5	2.5
Inadequate design – redesign required	15%	15	2.25
New programming tools do not produce promised savings	30%	5	1.5
Additional requirements	5%	20	1.0
New graphics sub-system unstable	25%	4	1.0
Project approval takes longer than expected	25%	4	1.0
Late delivery of vital support components	10-20%	4	0.4-0.8
Facilities not ready on time	10%	2	0.2
Management-level reporting takes more developer time than expected	10%	1	0.1

Setting up the table in the above sorted manner produces a listing of risks ordered by exposure, with the greatest exposure at the top of the list. If the top five risks in the above table were planned for successfully, roughly 10 weeks of schedule overruns (2.8 + 2.5 + 2.25 + 1.5 + 1.0 = 10.05) could be eliminated.
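
Building and sorting such a table is easy to automate. The Python sketch below recomputes each exposure from the probability and loss columns of the sample table and sorts the rows by exposure. The `Risk` class and the helper names are hypothetical, and the 10-20% range for late delivery is represented by its 15% midpoint, which is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    """One row of the risk-assessment table (hypothetical structure)."""
    name: str
    probability_pct: float  # probability of loss, in percent
    loss_weeks: float       # size of loss, in weeks

    @property
    def exposure_weeks(self) -> float:
        # Risk exposure = probability of loss * size of loss.
        return self.probability_pct * self.loss_weeks / 100


# Rows from the sample table, deliberately listed out of order.
RISKS = [
    Risk("Facilities not ready on time", 10, 2),
    Risk("Overly optimistic schedule", 50, 5),
    Risk("Additional requirements", 5, 20),
    Risk("Addition of unknown features", 35, 8),
    Risk("New graphics sub-system unstable", 25, 4),
    Risk("Inadequate design, redesign required", 15, 15),
    Risk("Project approval takes longer than expected", 25, 4),
    Risk("Management-level reporting takes more developer time", 10, 1),
    # Assumption: the 10-20% range is replaced by its midpoint.
    Risk("Late delivery of vital support components", 15, 4),
    Risk("New programming tools do not produce promised savings", 30, 5),
]


def prioritize(risks: list[Risk]) -> list[Risk]:
    """Return the risks sorted by exposure, largest first."""
    return sorted(risks, key=lambda risk: risk.exposure_weeks, reverse=True)


def top_exposure(risks: list[Risk], count: int) -> float:
    """Sum the exposure, in weeks, of the `count` highest-exposure risks."""
    return sum(risk.exposure_weeks for risk in prioritize(risks)[:count])


for risk in prioritize(RISKS):
    print(f"{risk.name:<55} {risk.probability_pct:>3.0f}% "
          f"{risk.loss_weeks:>3.0f} {risk.exposure_weeks:>5.2f}")
```

Calling `top_exposure(RISKS, 5)` gives the combined exposure of the five highest-priority risks, which is the figure a project manager would weigh against the cost of planning for them.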

The table above also identifies those risks that can be considered low priority because of their low exposure. Documenting the information in this way also eliminates time wasted on planning for low-exposure risks.

It should be noted that “risk assessment”, as the table above demonstrates, is only a subjective estimation of the possibilities that can affect the outcome of a project. Because such an assessment is performed in a subjective manner, its accuracy is completely dependent on the quality of the input given to it. The more effort both the project manager and the team give to this process, the more likely the resulting assessments will be within the assigned time ranges, thus providing a clearer understanding of how much effort could be saved by proper planning for such project delays. However, it cannot be stressed enough that no matter how well thought out and performed, project risk assessment will always remain a “best estimate” scenario.

The question that arises once such a process has been completed is how one goes about managing such risk factors. If we apply risk resolution to the top 10 risk factors that could affect any software development project (the “classic mistakes” listed earlier), the common approaches and recommendations for controlling them can be found in the table below.

Means of Controlling The Most Common Schedule Risks

Risk	Means of Control
Feature creep	Use customer oriented approaches; Use incremental development practices; Control the feature-set; Design for change
Requirements gold-plating or developer gold-plating	Scrub requirements; Timebox development; Feature-set control; Use of staged-delivery; Use throw-away prototyping; Design to schedule
Shortchanged quality	Allow time for QA activities and use QA fundamentals
Overly optimistic schedules	Use multiple estimation practices, multiple estimators, and automated estimation tools; Use principled negotiation; Design to schedule; Use incremental development practices
Inadequate design	Use an explicit design activity and schedule enough time for it; Hold design inspections
Silver-bullet syndrome	Be skeptical of product claims; Set up a software tools group
Research oriented development	Don’t try to do research and maximize development speed at the same time; Use risk-oriented life-cycle
Weak personnel	Staff project with top talent; Recruit and schedule team members long before project begins; Training; Teambuilding
Contractor failure	Check references; Assess contractor’s ability before hiring; Actively manage relationship with the contractor
Friction between developers and customers	Use customer-oriented practices