
1 Introduction

Information Technology (IT) service providers compete to win high-value IT service contracts [1, 2]. Typically, clients ask for proposals that provide the pricing of particular services. Service providers then prepare the deal pricing, submit their proposals, and enter a bidding process in which they try to win the contract.

The services included in these deals are often complex, high-value, and very hard to quote [3]. Examples of these services include account management, storage systems, databases, and migrating the client's infrastructure to the cloud.

Traditional practical approaches price deals via what we call the “bottom up” approach. This approach involves estimating the cost of individual activities at a granular level that together form a total cost for each individual IT service. A markup/gross profit margin is then added, either to each service separately or to the sum of the costs of all services, in order to reach the overall price of the deal. Once this price is reached, solution designers usually assess the competitiveness of the solution by comparing it to historical deals and market data. Akkiraju et al. [3] provide a methodology for such an assessment.

Note that services in this type of business are usually hierarchical. The highest level of each service is decomposed into lower levels. For instance, end user is a common high level service that is usually further decomposed into hardware for end users, end user refresh (which refers to users who would get a refresh/replacement for their assets), etc. Hardware for end users can in turn be decomposed into different hardware devices, and so on. For the bottom up approach to work, a detailed solution with all levels of all provided services needs to be prepared and costed. This can be time consuming, and solution designers might not know all the detailed requirements at all service levels in the early stages of the bidding process. Thus, there is a need for an alternative “agile” approach that estimates prices of these complex IT deals with minimum information based on a typical scope of such services, e.g., using only the highest level of the services to be offered in the deal. “Agile” here refers to needing fewer inputs and producing prices very fast.

In this paper, we provide such an approach in a framework that consists of two parts. In the first part, we develop a calculation logic algorithm that comes up with the costing and pricing of the high level services included in a deal based on both market and historical data. In the second part, we construct a methodology for predicting the relative win probabilities of these different price points. Using this “top-down” approach, solution designers can very quickly price complex deals and assess the relative chances of winning them at different price points. Figure 1 illustrates a comparison between the two approaches.

Fig. 1. Description of our proposed top down approach versus the existing bottom up approach

Additionally, service providers traditionally use market data collected from market consultancy companies to benchmark pricing generated from bottom up solutioning. In the first part of our framework, not only do we automate this process in an algorithmic fashion by estimating the cost and price directly from such market data, but we also show statistically that mining historical data in our top down approach might be more realistic for costing services than purely using market benchmark data. As opposed to market data, historical data consists of stored data on past deals of the same service provider.

Therefore, the contribution of this paper is three-fold. We develop an approach that enables solution designers both to determine price points of complex deals and to assess the relative chances of winning these deals. Furthermore, we introduce the notion of costing complex IT deals based on historical data and show that it can cost some services more accurately than the more traditional method of using market data.

The rest of this paper is organized as follows: in Sect. 2, we review related work. We present our approach with its two parts in Sect. 3 and then illustrate some results in Sect. 4. Section 5 ends the paper with conclusions and some recommendations for future work.

2 Related Work

The term “service” is often used to identify online services (e.g., web services) in the area of service-oriented computing. We note that “service” in this paper is broader than online services in the sense that our “services” identify and cover various constructs of IT services, including labor cost, management cost such as building customer relations and monitoring customer satisfaction, and other operational costs of human activities provided by IT solution vendors/providers. Gamma et al. [4] describe the Information Technology Infrastructure Library (ITIL), which provides a service design and catalog approach to organizing service solutions. The service solutions that we study in this paper are organized using a particular taxonomy that follows the approach of our previous work in Akkiraju et al. [4], where one solution (deal) consists of a structure of hierarchical services. The top level is the highest level of the services, and each service at that level is further decomposed into lower levels.

Pricing services has long been discussed in the literature of marketing and business management. Researchers have developed different pricing methods based on several pricing objectives. Avlonitis et al. [5, 6] summarized three categories of pricing methods linked with pricing objectives based on interviews of 170 service sector companies: (1) cost based methods, where a profit margin is added to the cost; (2) competition based methods, where pricing follows the market's average prices or is set relative to competitors' prices; and (3) demand based pricing, in which the price is set to satisfy customers' needs. Their results showed that the cost based method is the most adopted approach in their sample. However, they also showed that when the pricing objectives are associated with both the competitors and the customers, the likelihood of adopting a market related pricing method increases. In [7], Tawalbeh presented an empirical study of a mobile service provider in which pricing was traditionally driven by a cost based method. The study concluded that service providers should focus more on market oriented pricing when the provider's pricing objectives are profit, market share, and sales maximization.

Several previous papers illustrated methods for pricing particular IT services. For instance, Basu et al. [8] developed a managerial guideline for pricing cloud offerings. Their approach models the utility of customers as a function of parameters that have positive and negative effects on that utility. However, that paper (and its references) focuses on a single service rather than the case we study here of deals composed of several complex IT services.

In the area of service sciences, several papers have been published investigating the assessment of “winnability” of IT services deals. The main relevant papers in this area are our previous works of Greenia et al. [1] and Megahed et al. [2]. In the former, the authors developed a predictive analytics model for predicting the winnability of in-flight deals, that is, deals on which the service provider has already submitted the bid. In the latter, a similar predictive analytics model was developed, but its focus was on predicting winnability of deals at an earlier stage, i.e., before submitting the bid. The main conclusion in both works is that the bidding price is not the only factor affecting the chances of winning IT services deals. Other attributes, such as competition, the type of client, and the client's geographical location, have a statistically significant effect on the prediction of winnability. We incorporate these findings in the second part of our approach for pricing such IT deals. In the next section, we develop our approach and explain its different pieces.

3 Methodology

As indicated in the introduction, our top down approach consists of two parts. In the first part, we provide a calculation logic that uses both historical data and market data in order to estimate the service costs and prices of a deal using information about the highest level services included in such a deal. Then, because the bidding price is not the only factor affecting the chances of winning a deal [1, 2], we adopt a predictive analytics model in the second part of our approach to come up with the relative probability of winning the deal corresponding to each price point. Below, we first start with some definitions and then describe the two parts of our approach.

3.1 Definitions

We first differentiate between two categories of services that are included in any deal: (a) regular services (referred to simply as services below), whose cost is independent of other services. Regular services are also the services that have baselines/units. Examples of regular services are databases and end user, where the baseline of each is the number of databases and the number of end users, respectively; and (b) common services, whose cost depends on the different regular services included in the deal. An example of a common service is account management, whose cost is determined by the costs/amounts of all regular services included in the deal, since each of these services needs some account management.

Let D be the set of deals (historical and market data deals). We then define any deal \( d \in D \) by the tuple of sets (Meta Information, Services, Common Services) where Meta Information is the set of the meta-data of the deal, namely:

$$ Meta \, Information = \left\{ {Deal \, Outcome, \, Contract \, Year, \, Geography, \, Industry} \right\} $$

where the Deal Outcome is either won or lost. Losing a deal might be because another competitor won it, because the client decided not to pursue it anymore, or because the service provider decided not to continue bidding on it. Contract year is the calendar year in which delivery of the services will begin. Geography and industry are the geographical location and industry of the client, respectively. The Meta Information can also be extended to include additional attributes of the deal. Remember that we are modeling this whole problem with respect to the service provider.

Services is the set of regular services, \( Services = \{ Service_{1} , \ldots , Service_{i} , \ldots , Service_{M} \} \), where M is the cardinality of the set Services. Similarly, Common Services is the set of common services, \( Common\,Services = \{ Common\,Service_{1} , \ldots , Common\,Service_{j} , \ldots , Common\,Service_{N} \} \), where N is the cardinality of the set Common Services.

Following the definitions of regular and common services, we define any regular service \( Service_{i} \in Services \), where \( i \in \{ 1, \ldots ,M\} \), by the tuple (Baseline, Cost, Price). The Baseline of a regular service \( Service_{i} \) is the unit/measure of the amount of the service provided by the IT service provider to a client; refer to the two examples of databases and end user above. Cost is the total cost of \( Service_{i} \) and Price is the price of \( Service_{i} \). Any common service \( Common\,Service_{j} \in Common\,Services \) is defined by the tuple (Percentage of Total Cost, Cost, Price), where Percentage of Total Cost is the cost of that common service as a percentage of the total deal cost, Cost is its total cost, and Price is its total price. Note that the total deal cost/price is the sum of the costs/prices of all regular and common services. Also, note that the cost is what the service provider pays to provide the service (cost of labor, hardware, etc.), while the price is what is included in the bidding price, i.e., the cost with some profit margin (gross profit) added to it.

Let us specify any scenario S as a new deal for which a solution designer needs to estimate its target price/cost. We assume that the following are given for such a scenario: the elements of the sets \( Services_{s} \) and \( Common\,Services_{s} \), the values of the Baselines for each \( Service_{i} \in Services_{s} \), and the elements of the set \( Meta\,Information_{s} \), where each set is given the index s to indicate that it relates to scenario S. Our target is to estimate the Cost and Price of each element of the sets \( Services_{s} \) and \( Common\,Services_{s} \), and thus the total cost and price of scenario S. The next subsection details our methodology for calculating these.
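To make these definitions concrete, the following is a minimal sketch of how the entities above could be represented as data structures. All class and field names are illustrative and are not part of the original formulation; Python is used purely for exposition.

```python
# Illustrative data structures for the definitions above; names are hypothetical.
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class MetaInformation:
    deal_outcome: str      # "won" or "lost"
    contract_year: int     # calendar year in which delivery begins
    geography: str         # client's geographical location
    industry: str          # client's industry


@dataclass
class RegularService:
    baseline: float                # amount of the service (e.g., number of end users)
    cost: Optional[float] = None   # total cost of the service
    price: Optional[float] = None  # cost plus gross profit


@dataclass
class CommonService:
    pct_of_total_cost: Optional[float] = None  # cost as a fraction of total deal cost
    cost: Optional[float] = None
    price: Optional[float] = None


@dataclass
class Deal:
    meta: MetaInformation
    services: Dict[str, RegularService] = field(default_factory=dict)
    common_services: Dict[str, CommonService] = field(default_factory=dict)


# A scenario is simply a new deal whose baselines and meta data are known,
# but whose costs and prices are yet to be estimated.
Scenario = Deal
```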

3.2 Peer Selection and Calculation Logic

Peer Selection. The first step in our calculation logic is to select peer deals. That is, we load historical and market benchmark data from the IT service provider's databases and carry out the deal selection in two stages. In the first stage, we use the \( Meta\,Information_{s} \) (Deal Outcome, Contract Year, Geography, and Industry) of the scenario to filter in the deals that have matching values for the respective fields. The reason is that each of these fields is a characteristic of the deal and affects the cost of delivering each of its services. For instance, a service delivered in Asia is likely to have a different delivery cost compared to a service delivered in North America. Similarly, delivering a service in 2015 is likely to happen at a different cost compared to delivering the same service in 2016. Then, for each service in the set Services or Common Services, we filter out the peer deals that do not include that service. Thus, we have a different set of peers for each service of our scenario. The second stage of deal selection is to order these deals according to criteria that we explain below.

Since there are two types of data sources, historical and market benchmark, peer selection is done separately for each source so that the costs of a scenario's services can be computed from two perspectives. These are the two different price points, referred to above, that our approach calculates.

We also ensure that a minimum required number of peer historical deals exists in the database for each service of a scenario; if not, we report that no data is found for the historical data perspective, so as not to report inaccurate results. A solution designer specifies the minimum threshold for the required number of peers for each scenario. However, we do not specify such a minimum number of peers for market data since, usually, there are only a few market data/standard deals for each service-Meta Data combination.

Sorting of Selected Peers for a Regular Service. \( \forall Service_{i} \in Services_{s} \), the sorting criterion we adopt for the set of peers selected for that service is baseline proximity. \( \forall i \in Services_{s} \), let \( Baseline\,Proximity_{dsi} \) be the baseline proximity between deal \( d \in D \) and scenario S for service \( i \in Services_{s} \). We define \( Baseline\,Proximity_{dsi} \) as follows:

$$ Baseline\,Proximity_{dsi} = \left| {Baseline\ of\ Service_{i}\ in\ deal\ d - Baseline\ of\ Service_{i}\ in\ scenario\ s} \right| $$

That is, we assume that a deal and a scenario are similar with respect to a service if the difference between their baselines is small. This assumption is justified because unit costs (which we use below from peers to calculate the costs of our scenario) are typically similar for deals with similar/close baselines. Baselines define the complexity of the service, and the variation of the unit costs for the same service across different deals is related to the quantity (baseline) of that service in each deal, because service providers can usually achieve some kind of quantity discount on unit costs for larger quantities. There is no set function that relates such a quantity discount to the baselines, and the baseline proximity accounts for all of that. Therefore, the outcome of deal selection at the service level is a set of similar deals, ordered by their proximity values.

Sorting of Selected Peers for a Common Service. We sort peer deals for common services according to a different proximity. Let that proximity be \( Common\,Service\,Proximity_{dsj} \) (the proximity between deal \( d \in D \) and scenario S for common service \( j \in Common\,Services_{s} \)). Since common services do not have baselines, and since they are related to the overall cost of regular services, we base this proximity on the total cost of regular services. That is:

$$ \begin{aligned} Common\,Service\,Proximity_{dsj} = & \left| {Sum\ of\ costs\ of\ regular\ services\ for\ deal\ d} \right. \\ & \left. { - Sum\ of\ costs\ of\ regular\ services\ for\ scenario\ s} \right| \end{aligned} $$

We note that in order to calculate the above proximity, the costs of the regular services of our scenario must already have been calculated, as shown in the next calculation step.

Lastly, we set a maximum threshold T on the set of peer deals. Typically for market data, the threshold is 1, while for historical data, this can be set by the solution designer. We then use the top T peers in each ordered set of peers for each service to do our calculations below.
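As an illustration of the peer selection step for a regular service, the sketch below shows how the two stages (meta data filtering and proximity ordering) and the top-T cut could be implemented. It builds on the illustrative Deal/Scenario structures sketched in Sect. 3.1; the function name and the parameters top_t (our T) and min_peers (the solution designer's minimum threshold) are ours, not part of the original formulation.

```python
# Minimal sketch of peer selection for one regular service (names are illustrative).
from __future__ import annotations

from typing import List


def select_peers(deals: List[Deal], scenario: Scenario, service: str,
                 top_t: int, min_peers: int = 1) -> List[Deal]:
    """Return the top-T peer deals for a regular service of the scenario."""
    # Stage 1: keep deals whose Meta Information matches the scenario's
    # and which actually contain the service in question.
    peers = [d for d in deals
             if d.meta == scenario.meta and service in d.services]
    if len(peers) < min_peers:
        raise ValueError(f"no/insufficient peer data for service '{service}'")

    # Stage 2: order the peers by baseline proximity to the scenario.
    target_baseline = scenario.services[service].baseline
    peers.sort(key=lambda d: abs(d.services[service].baseline - target_baseline))

    # Keep only the top T peers for the cost calculation below.
    return peers[:top_t]
```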

Calculation Logic. We now show how we calculate the costs of each service in the two sets of regular and common services, for both the historical data and market data perspectives. Note that the cost calculation for each service is performed for each year of the total number of contract years of a scenario.

Cost Calculation for Regular Services of a Scenario. For each regular service \( i \in Services_{s} \), we first compute the unit cost of that service in each of its peer deals by dividing the cost of that service by its baseline. Then, we retrieve the l-th percentile of these unit costs. Typically, one would use the median, but the solution designer can choose any value for l. The rationale behind using the percentile is to allow the user to adjust for the complexity of the service if it is not captured by the chosen peers, i.e., if the unit costs of the selected peers are too diverse, the user can input a percentile related to the complexity of the service that he/she might know and that we cannot capture automatically/algorithmically. Let us call the resulting unit cost for service i: \( Unit\text{-}cost_{i} \).

Let Baseline i be the baseline of service i of our scenario. Therefore, the cost of the regular service \( i \in Services_{s} \) for our scenario S, Cost s,i , is computed as:

$$ Cost_{s, \, i} = Unit {\text{-}} cost_{i} * \, Baseline_{i} $$
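The sketch below illustrates this calculation, continuing the hypothetical structures and the select_peers sketch above; numpy's percentile function stands in for the l-th percentile chosen by the solution designer.

```python
# Sketch of the regular-service cost estimate: l-th percentile of peer unit
# costs times the scenario's baseline (all names are illustrative).
from __future__ import annotations

from typing import List

import numpy as np


def regular_service_cost(peers: List[Deal], scenario: Scenario,
                         service: str, percentile: float = 50.0) -> float:
    """Estimate Cost_{s,i} = Unit-cost_i * Baseline_i for one regular service."""
    unit_costs = [d.services[service].cost / d.services[service].baseline
                  for d in peers]
    unit_cost_i = np.percentile(unit_costs, percentile)  # median by default
    return unit_cost_i * scenario.services[service].baseline
```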

Cost Calculation for Common Services of a Scenario. Since the costs of common services are related to the costs of regular services, for each common service \( j \in Common\,Services_{s} \), \( Cost_{s,j} \) (the cost of service j for our scenario S) is computed as follows:

For each service \( j \in Common\,Services_{s} \), we calculate the percentage of the cost of that service relative to the overall cost of the deal, for each peer deal in the ordered list of peer deals for that service. Then, we again apply a chosen percentile to these percentages to get the percentage of that service relative to the total cost of our scenario S. We call the resulting percentage value of a common service \( j \in Common\,Services_{s} \): \( P_{s,j} \).

Now, the total cost of all services in our scenario S is

$$ SUM_{s,all} = \mathop \sum \limits_{{j \in Common \,Services_{s} }} Cost_{s,j} + SUM_{s,reg} $$

where \( SUM_{s,all} \) is the total cost of the scenario (the sum of the costs of all services, both regular and common) and \( SUM_{s,reg} \) is the sum of the costs of the regular services. We then have, for each \( j \in Common\,Services_{s} \) in our scenario S:

$$ Cost_{s,j} = SUM_{s,all} *P_{s,j} $$

We thus transform the above set of linear equations to a standard format as:

$$ \begin{aligned} \left( {P_{s,1} - 1} \right)Cost_{s,1} + P_{s,1} \, Cost_{s,2} + \ldots + P_{s,1} \, Cost_{s,J} &= - P_{s,1} \, SUM_{s,reg} \\ P_{s,2} \, Cost_{s,1} + \left( {P_{s,2} - 1} \right)Cost_{s,2} + \ldots + P_{s,2} \, Cost_{s,J} &= - P_{s,2} \, SUM_{s,reg} \\ &\vdots \\ P_{s,J} \, Cost_{s,1} + P_{s,J} \, Cost_{s,2} + \ldots + \left( {P_{s,J} - 1} \right)Cost_{s,J} &= - P_{s,J} \, SUM_{s,reg} \end{aligned} $$

where J is the cardinality of the set \( Common\,Services_{s} \). Using Cramer's rule [9], we solve the above equations to compute the cost of each common service per year.
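For illustration, this system can be assembled and solved numerically as sketched below. We use numpy's generic linear solver rather than Cramer's rule, which gives the same solution for a non-singular system; the example values are hypothetical.

```python
# Sketch of the common-service cost computation for scenario S.
import numpy as np


def common_service_costs(p: np.ndarray, sum_reg: float) -> np.ndarray:
    """Solve for Cost_{s,j}, given the percentages P_{s,j} and SUM_{s,reg}."""
    j = len(p)
    # Coefficient matrix: row j holds P_{s,j} everywhere, with P_{s,j} - 1 on the diagonal.
    a = np.tile(p.reshape(-1, 1), (1, j))
    np.fill_diagonal(a, p - 1.0)
    b = -p * sum_reg  # right-hand side: -P_{s,j} * SUM_{s,reg}
    return np.linalg.solve(a, b)


# Hypothetical example: two common services at 10% and 5% of total deal cost,
# with regular services costing 1,000,000 in total.
costs = common_service_costs(np.array([0.10, 0.05]), 1_000_000.0)
```

In this hypothetical example, the two common service costs come out to roughly 117,647 and 58,824, consistent with a total deal cost of about 1,176,471.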

Since the only difference in calculation between historical data and market data is that for market data we typically have a maximum of one (or a few) market deals, we do not apply the percentiles for calculating unit costs (for regular services) and percentages (for common services) in the market data calculations when that maximum threshold is 1. Other than that, the market data calculations are identical to the historical data calculations.

In Sect. 4, we show the usefulness of using historical data, in addition to the more traditional use of market data, through some numerical experiments.

Now, adding up the costs of both regular and common services, we reach the estimated cost of a deal for each of the historical data and market data cases. Then, by adding chosen gross profit (GP) margins to the cost, we get different price points. As can be seen from the details in the previous subsections, our overall approach uses a minimal amount of input from the user and generates prices very fast, and is thus “agile” as required by modern business practices in this type of industry.

To assess the relative chance of winning the deal corresponding to each price point, we use a win prediction model discussed in the next subsection.

3.3 Win Prediction

We use the predictive analytics model developed in our earlier work [2]. The model is based on the well-known naïve Bayes classifier; we refer the reader to [10, 11] for an explanation of the naïve Bayesian model. The factors included, which were shown to be significant, are some of the deal attributes in addition to some derived parameters. Besides the bidding price, we summarize the other significant factors used in the model as follows:

Complexity of the Deal. Complexity is determined based on the number and effort of delivery of the offered services to the client.

Global Versus Local. Deals are global if the services will be delivered to multiple countries. Local deals are ones in which services are delivered to one or two countries of close proximity (e.g., Australia and New Zealand).

Key Services Delivery Executive. Deals are sometimes assigned to a delivery executive responsible for the delivery of services after contract signing. The parameter here is whether a delivery executive is assigned early on for the deal or not.

Third Party Advisor. A third party advisor is used by some clients. The parameter here is whether the client has such advisor or not.

Contract Length. The number of years of the deal delivery.

Client-Market Segmentation. Clients are classified based on size, market audience, and market potential.

Number of Competitors. This is a count of the number of other service providers competing to win the same deal.

Competitor Classification. Competitors are classified according to whether they provide cloud, software, or network services, and whether they are niche players or consultants.

The model is fairly accurate, producing an average accuracy of 86 % and 93 % on training and testing data, respectively. The idea here is that multiple copies of the deal that we are trying to price are entered as testing data into the model. All copies share the same meta data/attributes except for the bidding price. Each copy has a different price point out of the ones we calculated above (as well as any user-chosen price). Note also that since the GPs are arbitrary, multiple GPs can be applied to the calculated costs and more copies of the same deal can be added. The predictive model then outputs a ranked list of these copies and provides a relative winning probability score for each price point. In this way, we can quantitatively/analytically assess different bidding price points, given that price is not the only factor affecting winnability, and obtain a chart like the one in Fig. 2, which shows different pricing options (cost + GP) with their corresponding relative winning probabilities. Figure 3 gives an overview of the architecture of our overall approach.
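As a rough illustration of this scoring step (not the exact model or feature engineering of [2]), the sketch below trains a Gaussian naïve Bayes classifier on a few hypothetical, already-encoded deal records and ranks several copies of a scenario that differ only in the bidding price; all feature encodings and numbers are made up.

```python
# Illustrative sketch of ranking price points with a naive Bayes classifier.
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Hypothetical historical deals encoded as
# [bid price, complexity level, global flag, number of competitors].
X_train = np.array([
    [5.0e6, 2, 1, 3],
    [8.0e6, 3, 1, 5],
    [3.5e6, 1, 0, 2],
    [6.0e6, 2, 0, 4],
])
y_train = np.array([1, 0, 1, 0])  # 1 = won, 0 = lost

model = GaussianNB().fit(X_train, y_train)

# Copies of the scenario that differ only in the bidding price:
# the estimated cost plus different gross profit margins.
cost = 4.0e6
gps = [0.05, 0.10, 0.20, 0.30]
copies = np.array([[cost * (1 + gp), 2, 1, 3] for gp in gps])

win_scores = model.predict_proba(copies)[:, 1]  # relative win score per copy
for gp, score in sorted(zip(gps, win_scores), key=lambda t: -t[1]):
    print(f"GP {gp:.0%}: price {cost * (1 + gp):,.0f}, relative win score {score:.2f}")
```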

4 Numerical Results

In the bottom up approach, accurate costs of the services in an IT deal are evaluated at the lowest levels and summed up to obtain the costs of services at the higher levels. To come up with fast estimates of the costs of services in the early stages of the bidding process, solution designers traditionally use market data. In the first part of our approach, described in the previous section, we proposed mining historical deal data besides market data. In this section, we conduct some experiments to show that doing so might be beneficial, i.e., might result in costs that are closer to the more accurate actual costs obtained using the detailed traditional bottom up approach.

Fig. 2. Different price points with their corresponding values and relative winning probabilities

Fig. 3. Architecture/Overview of our overall approach

For our experiments, we selected at random 39 deals with complete cost cases from a repository of real industry historical deals of an IT service provider. For each deal, we used the baselines at the highest level of the services included in the deal and performed our calculation algorithm described in Sect. 3.2 to get market data and historical data costs. Then, we compared these costs with the actual costs in the sample data. The metric we used is the relative absolute difference between the calculated value and the actual value. Thus, for historical data and market data, respectively, the error is:

$$ Error_{historical~data} = \frac{{\left| {Actual~Cost - Cost~Calculated~Using~Historical~Data} \right|}}{Actual~Cost} $$
$$ Error_{Market~data} = \frac{{\left| {Actual~Cost - Cost~Calculated~Using~Market~Data} \right|}}{Actual~Cost} $$

Note that we do the comparison at the cost level since prices are calculated by adding arbitrary GPs. Note also that the accuracy of the cost estimation of a new deal does not imply a higher probability of winning the deal, since accurate costs do not imply competitive costs/prices, and even competitive costs/prices are not the sole predictive factor for winnability, as discussed in the previous section.

We calculated the error of each of the 13 services for the 39 sample points, for both market data and historical data. Then, for each service, we performed a paired t-test to test the following hypothesis:

$$ H_{o} :\mu_{D} = 0 $$
$$ H_{1} :\mu_{D} < 0 $$

where

$$ \mu_{D} = \mu_{historical~data~error} - \mu_{market~data~error} $$

Here, \( \mu_{D} \) is the difference between the mean of the historical data error (denoted \( \mu_{historical~data~error} \)) and that of the market data error (denoted \( \mu_{market~data~error} \)). For this hypothesis test, we follow the notation, assumptions, and details in the texts of Montgomery et al. [10] and Walpole et al. [11]. The test is justified since the calculations for market data and for historical data were done independently for each service, using the same historical complete cost cases/deals. After checking the assumptions of the test, we calculated the p-value for each service. Table 1 illustrates the results of the tests.
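For illustration, the error computation and the one-sided paired t-test for a single service could be carried out as sketched below, with small hypothetical cost vectors standing in for the 39 real sample deals.

```python
# Sketch of the per-service relative error computation and paired t-test.
import numpy as np
from scipy import stats

# Hypothetical actual and estimated costs for one service across sample deals.
actual = np.array([100., 250., 180., 320., 90.])
hist_est = np.array([105., 240., 185., 310., 95.])    # from historical data
market_est = np.array([130., 200., 210., 280., 120.])  # from market data

err_hist = np.abs(actual - hist_est) / actual
err_market = np.abs(actual - market_est) / actual

# H0: mu_D = 0 vs. H1: mu_D < 0, where D = err_hist - err_market.
t_stat, p_value = stats.ttest_rel(err_hist, err_market, alternative="less")
print(f"t = {t_stat:.3f}, one-sided p-value = {p_value:.4f}")
```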

One can see from the results that there is statistical evidence that using historical data yields more accurate costs than using market data for some of the services. This illustrates the usefulness of incorporating historical data mining into our approach. We next state the conclusions and directions for future work.

Table 1. P-value results for the paired t-test of each service

5 Conclusion and Future Work

In this paper, we provided an approach that not only gives a quick, agile estimate of the costs and prices of complex information technology services deals with minimal input, but also assesses the relative probabilities of winning such deals at each price estimate. Our approach consists of two phases. In the first phase, we use both historical and market data to estimate the costs. In addition, we showed experimental results based on industry data illustrating that using historical data is more accurate in estimating the costs of many services compared to using market data (the more traditional business approach). In the second phase of our approach, we incorporated our price estimates into a predictive analytics model to come up with relative winning probabilities corresponding to each price point. Providing this output helps solution designers and business executives decide on the final bidding price they would like to pursue.

There are several directions for future research on this work. Instead of estimating the costs/prices based on the highest level of the services, if the solution designer knows a little more detail (e.g., the baselines of the second highest level of the offered services in the deal), then one can estimate the costs/prices based on historical and market data at that level. One challenge is that not all chosen peer deals would include all of these services at that second level. Thus, some machine learning approach might be needed to impute these missing values. Another direction for future research would be comparing the accuracy of cost estimation based on the top level of services (as we do in this paper) with that based on the second level.