PASS Approximation: A Framework for Analyzing and Designing Heuristics

Algorithmica

Abstract

We introduce a new framework for designing and analyzing algorithms. Our framework applies best to problems that are inapproximable according to the standard worst-case analysis. We circumvent such negative results by designing guarantees for classes of instances, parameterized according to properties of the optimal solution. Given our parameterized approximation, called PArametrized by the Signature of the Solution (PASS) approximation, we design algorithms with optimal approximation ratios for problems with additive and submodular objective functions such as the capacitated maximum facility location problems. We consider two types of algorithms for these problems. For greedy algorithms, our framework provides a justification for preferring a certain natural greedy rule over some alternative greedy rules that have been used in similar contexts. For LP-based algorithms, we show that the natural LP relaxation for these problems is not optimal in our framework. We design a new LP relaxation and show that this LP relaxation coupled with a new randomized rounding technique is optimal in our framework.

In passing, we note that our results strictly improve over previous results of Kleinberg et al. (J. ACM 51(2):263–280, 2004) concerning the approximation ratio of the greedy algorithm.

Notes

  1. Technically, in [15] a different parameter μ is considered, which in our terminology is \(\mu= \frac{1}{\alpha} - 1\). It is straightforward to translate results expressed in terms of μ to results expressed in terms of α and vice versa.

  2. A better notation might be to write \(r_i(S)\) instead of \(r_i\), but we use \(r_i\) for brevity.

  3. Note that the function f may be negative, and therefore the result of Feige et al. [10] does not apply.

  4. We note this relaxation is not a relaxation in the usual sense, because the value of the objective function of the LP is a lower bound on the value of an optimal solution, rather than an upper bound.

  5. The greedy algorithm specified prior to Theorem 2.3 in [15] does not specify which facility to open next, as long as its marginal revenue is larger than its cost. However, the proof of Theorem 2.4 in [15] relies on a greedy-margin rule, without stating this explicitly.

  6. Note that the function f may be negative, and therefore the result of Feige et al. [10] does not apply.

References

  1. Ageev, A., Sviridenko, M.: An 0.828-approximation algorithm for uncapacitated facility location problem. Discrete Appl. Math. 93, 289–296 (1999)

  2. Blum, A., Spencer, J.: Coloring random and semi-random k-colorable graphs. J. Algorithms 19(2), 204–234 (1995)

  3. Cornuejols, G., Nemhauser, G., Wolsey, L.: Locations of bank accounts to optimize float: an analytic study of exact and approximate algorithms. Manag. Sci. 23, 789–810 (1977)

  4. Feige, U.: A threshold of ln n for approximating set cover. J. ACM 45(4), 634–652 (1998)

  5. Feige, U.: On maximizing welfare when utility functions are subadditive. In: Proceedings of the 38th Annual ACM Symposium on Theory of Computing (STOC), pp. 41–50 (2006)

  6. Feige, U., Immorlica, N., Mirrokni, V., Nazerzadeh, H.: A combinatorial allocation mechanism with penalties for banner advertising. In: Proceedings of the 17th International World Wide Web Conference (WWW), pp. 169–178 (2008)

  7. Feige, U., Immorlica, N., Mirrokni, V., Nazerzadeh, H.: PASS approximation: a framework for analyzing and designing heuristics. In: Proceedings of 19th International Workshop Approximation Algorithms for Combinatorial Optimization (APPROX), pp. 111–124 (2009)

  8. Feige, U., Kilian, J.: Heuristics for semirandom graph problems. J. Comput. Syst. Sci. 63, 639–671 (2001)

  9. Feige, U., Lovász, L., Tetali, P.: Approximating min-sum set cover. In: Proceedings of 5th International Workshop Approximation Algorithms for Combinatorial Optimization (APPROX), pp. 94–107 (2002)

  10. Feige, U., Mirrokni, V., Vondrák, J.: Maximizing non-monotone submodular functions. In: Proceedings of the 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pp. 461–471 (2007)

  11. Feige, U., Vondrák, J.: Approximation algorithms for allocation problems: improving the factor of 1-1/e. In: Proceedings of 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pp. 667–676 (2006)

  12. Grötschel, M., Lovász, L., Schrijver, A.: Geometric Algorithms and Combinatorial Optimization (Algorithms and Combinatorics). Springer, Berlin (1988)

  13. Håstad, J.: Clique is hard to approximate within \(n^{1-\varepsilon}\). Acta Math. 182, 105–142 (1999)

  14. Kempe, D., Kleinberg, J., Tardos, E.: Maximizing the spread of influence through a social network. In: Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pp. 137–146 (2003)

  15. Kleinberg, J.M., Papadimitriou, C.H., Raghavan, P.: Segmentation problems. J. ACM 51(2), 263–280 (2004)

  16. Leskovec, J., Krause, A., Guestrin, C., Faloutsos, C., VanBriesen, J., Glance, N.S.: Cost-effective outbreak detection in networks. In: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pp. 420–429 (2007)

  17. Mossel, E., Roch, S.: On the submodularity of influence in social networks. In: Proceedings of the 39th ACM Symposium on Theory of Computing (STOC), pp. 128–134 (2007)

  18. Spielman, D., Teng, S.: Smoothed analysis: why the simplex algorithm usually takes polynomial time. J. ACM 51, 385–463 (2004)

  19. Zuckerman, D.: Linear degree extractors and the inapproximability of max clique and chromatic number. Theory Comput. 3(1), 103–128 (2007)

  20. Zwick, U.: Outward rotations: a tool for rounding solutions of semidefinite programming relaxations, with applications to MAX CUT and other problems. In: Proceedings of the 31st ACM Symposium on Theory of Computing (STOC), pp. 679–687 (1999)

Acknowledgements

The work of the first author was supported in part by The Israel Science Foundation (grant No. 873/08). We thank Adam Meyerson and the anonymous referees who commented on previous versions of this manuscript for their useful suggestions.

Author information

Corresponding author

Correspondence to Hamid Nazerzadeh.

Additional information

An extended abstract of this work has appeared in [7].

U. Feige's work was performed in part at Microsoft Research.

Appendix A: Special Cases of SMFL

In this section we list some important special cases of submodular maximum facility location (SMFL). For completeness, we first restate the definition of SMFL.

Problem A.1

(Submodular Maximum Facility Location (SMFL))

Consider a set \(N\) of \(n\) facilities and a set function \(f: 2^N \to \mathbb{R}_+\). For any subset \(S \subseteq N\), \(f(S) = R(S) - c(S)\), where \(R\) is a non-negative non-decreasing submodular set function corresponding to the revenue, and \(c(S) = \sum_{i \in S} c_i\) is a linear cost function. As a result, the set function \(f\) is a non-monotone submodular function, and the goal is to find a subset \(S\) that maximizes \(f(S)\) (see Note 6). We assume that a value oracle for the revenue function \(R\) and a description of the cost \(c\) (of polynomial size) are given.

The most important special case, and the one we use to illustrate our results, is the maximum facility location (MFL) in which the revenue is defined by a matching.

Problem A.2

(Maximum Facility Location (MFL))

A set \(\mathcal{F}\) of \(m\) facilities is given. For every facility \(i\), there is an opening cost \(c_i\). There is also a set \(\mathcal{J}\) of \(n\) clients. The revenue of connecting client \(j\) to facility \(i\) is \(u_{ij} \ge 0\) (this may be interpreted as a client revenue minus a connection cost). Every client can connect to at most one open facility (or none). The goal in MFL is to open some facilities and connect clients to them so as to maximize the total revenue from the connected clients minus the total cost of the opened facilities.
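As a rough illustration of the MFL objective, the following Python sketch evaluates a fixed set of open facilities and then finds the best set by exhaustive search. The names `mfl_value` and `best_subset` and the toy data are assumptions made for illustration and do not appear in the paper; the exhaustive search is exponential and is meant only for tiny instances.

```python
from itertools import combinations


def mfl_value(open_facilities, u, cost, clients):
    """MFL objective for a fixed open set (no capacities).

    Since all u[(i, j)] >= 0, each client simply connects to the open
    facility that pays most, or to none if no facility is open.
    """
    revenue = sum(
        max((u[(i, j)] for i in open_facilities), default=0)
        for j in clients
    )
    return revenue - sum(cost[i] for i in open_facilities)


def best_subset(facilities, u, cost, clients):
    """Exhaustive search over all subsets of facilities (tiny instances only)."""
    subsets = (
        set(combo)
        for size in range(len(facilities) + 1)
        for combo in combinations(facilities, size)
    )
    return max(subsets, key=lambda s: mfl_value(s, u, cost, clients))


# Hypothetical toy instance: two facilities, two clients.
u = {(1, 1): 4, (1, 2): 1, (2, 1): 2, (2, 2): 3}
cost = {1: 2, 2: 3}
clients = [1, 2]

print(mfl_value({1}, u, cost, clients))       # 3
print(best_subset([1, 2], u, cost, clients))  # {1}
```

In this toy instance, opening both facilities raises the revenue to 7 but costs 5, so the single facility 1 (value 3) is the better choice.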

For comparison with previous work [15], we also discuss the following problem, which the authors of [15] call the variable catalog segmentation problem.

Problem A.3

(Catalog Segmentation Problem)

A company has a collection of products and a collection of potential clients. Clients have various levels of interest associated with each type of product. The company wishes to produce several types of catalogs, each type containing a subset of the products (the number of products in a catalog may be limited by considerations such as weight), and mail to every potential client at most one catalog (presumably, of a type that would be of interest to the client). Assuming that producing a catalog type has unit cost, and that for each type \(i\) and client \(j\) there is an expected revenue of \(u_{ij}\) from mailing a catalog of type \(i\) to client \(j\), which catalogs should the company produce in order to maximize its expected profit (expected benefit minus production cost)? If all potential types of catalogs can be listed beforehand and all values \(u_{ij}\) are known, then this is a special case of MFL, with the catalogs serving as facilities. (In [15] it is assumed that all types of catalogs cost the same to produce, and we follow this assumption in our presentation. More generally, we may associate a cost \(c_i\) with producing the catalog of type \(i\), and then the problem becomes equivalent to MFL.)

The MFL problem is also referred to as uncapacitated facility location (see [1] for example). More generally, each facility may be limited in the number of clients it can serve. This problem is called the capacitated maximum facility location problem (CMFL).

Problem A.4

(Capacitated Maximum Facility Location (CMFL))

This problem is identical to MFL except that each facility \(i\) additionally has a capacity constraint \(k_i\), meaning that in a feasible solution, the number of clients connected to facility \(i\) is at most \(k_i\).

Although this is not entirely obvious, CMFL is a special case of SMFL, as we show in Sect. A.1.

CMFL has many important applications, including, for example, the allocation of banner advertisements in online advertising.

Problem A.5

(Banner Ad Allocation)

In an instance of this problem, there is a set \(\mathcal{F}\) of \(m\) advertisers, and a set \(\mathcal{J}\) of \(n\) ad opportunities (or ads, for short). Each advertiser \(i\in \mathcal{F}\) is interested in a subset of ads \(T_{i}\subseteq \mathcal{J}\), and is associated with a bid (willingness to pay) value \(b_i\) and a desired number of ads \(d_i\). We should find a subset of advertisers and assign ads to them. We are also given a penalty parameter \(\gamma\), which acts as follows: if we assign a set \(X_i \subseteq T_i\) to advertiser \(i\), where \(|X_i| \le d_i\), the net profit that we collect from advertiser \(i\) is \(|X_i| b_i - \gamma b_i (d_i - |X_i|) = (1+\gamma) b_i |X_i| - \gamma b_i d_i\). Our goal is to accept and serve a set of advertisers \(\mathcal{W}\), and assign ads to them to maximize the total revenue.

The banner ad allocation problem is a special case of CMFL. Each bid \(i\) can be thought of as a facility of capacity \(d_i\), opening cost \(\gamma b_i d_i\), and revenue \((1+\gamma) b_i\) per item (and 0 for items that the advertiser is not interested in).
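The reduction can be checked numerically. The sketch below maps one advertiser to a CMFL facility and confirms that the facility's revenue minus opening cost equals the advertiser's net profit for every number of served ads. The function names and the sample numbers are hypothetical; exact rational arithmetic (`fractions.Fraction`) is used only to avoid floating-point noise.

```python
from fractions import Fraction


def net_profit(bid, demand, gamma, served):
    """Banner-ad profit: payment for served ads minus the shortfall penalty."""
    assert 0 <= served <= demand
    return served * bid - gamma * bid * (demand - served)


def as_cmfl_facility(bid, demand, gamma):
    """Facility parameters used in the reduction described above."""
    return {
        "capacity": demand,
        "opening_cost": gamma * bid * demand,
        "revenue_per_ad": (1 + gamma) * bid,
    }


def cmfl_profit(facility, served):
    """Revenue from served clients minus the opening cost of the facility."""
    assert 0 <= served <= facility["capacity"]
    return facility["revenue_per_ad"] * served - facility["opening_cost"]


# Hypothetical advertiser: bid 10, wants 5 ads, penalty parameter 1/2.
bid, demand, gamma = 10, 5, Fraction(1, 2)
facility = as_cmfl_facility(bid, demand, gamma)
for served in range(demand + 1):
    assert net_profit(bid, demand, gamma, served) == cmfl_profit(facility, served)

print(net_profit(bid, demand, gamma, 3))  # 20
```

Serving 3 of the 5 requested ads yields 30 in payments minus a penalty of 10, which matches 3 ads at 15 each minus the opening cost of 25.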

Another special case of SMFL, which is not a special case of CMFL, is the following problem.

Problem A.6

(Influence Maximization)

The goal in this problem is to choose an initial set of people such that their adoption of a new product or technology spreads over a social network. For various random influence dynamics, the expected revenue obtained from the spread of influence has been proved to be a non-decreasing submodular function of the set of initial people [14, 17]. Assuming that there exists a cost c i for motivating person i to adopt the new product, the problem is defined as finding a set of initial people such that the revenue obtained from spread of influence from these people minus the cost incurred to motivate these people is maximized.

Other special cases of SMFL include a variety of optimization problems such as set buying and optimal sensor installation for outbreak detection [16].

A.1 Submodularity of CMFL

In this section we prove the following proposition.

Proposition A.7

The revenue \(R(\cdot)\) of CMFL is a submodular function. Namely, for every two sets \(S\) and \(T\) of facilities, \(R(S)+R(T) \ge R(S\cup T)+R(S\cap T)\). Equivalently, for every facility \(i\) and every two sets of facilities \(S\) and \(T\) with \(S\subseteq T\), \(M(i|S) \ge M(i|T)\).

Before proving Proposition A.7, let us clear up a possible misunderstanding regarding its statement. The marginal revenue of a facility does not refer only to the revenue from clients served by the facility. It also takes into account the change in revenue at other facilities. The following simple example illustrates this distinction.

Example

There are two clients, \(a_1\) and \(a_2\), and three facilities 1, 2, and 3. We set the revenues as follows: \(u_{11}=2\), \(u_{12}=0\), \(u_{21}=3\), \(u_{22}=2\), \(u_{31}=0\), \(u_{32}=3\). All capacities are 1. Let \(S=\{1\}\) and \(T=\{1,3\}\). Then \(M(2|S)=2\) (client \(a_2\) will be allocated to facility 2), and \(M(2|T)=1\) (client \(a_1\) transfers from facility 1 to facility 2), even though facility 2 shows a revenue of 3 in this latter case.
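The two marginal revenues in this example can be reproduced by brute force. The sketch below enumerates all legal assignments for a given set of open facilities; the helper names `revenue` and `marginal` are illustrative assumptions, and the enumeration is only practical for instances of this size.

```python
from itertools import product

# Revenues u[(facility, client)] from the example; all capacities are 1.
U = {(1, 1): 2, (1, 2): 0, (2, 1): 3, (2, 2): 2, (3, 1): 0, (3, 2): 3}
CLIENTS = [1, 2]
CAPACITY = {1: 1, 2: 1, 3: 1}


def revenue(open_facilities):
    """R(S): value of the best legal assignment, found by brute force."""
    options = [None] + sorted(open_facilities)  # None = client left unserved
    best = 0
    for choice in product(options, repeat=len(CLIENTS)):
        served = [i for i in choice if i is not None]
        if any(served.count(i) > CAPACITY[i] for i in set(served)):
            continue  # some facility would exceed its capacity
        total = sum(U[(i, j)] for i, j in zip(choice, CLIENTS) if i is not None)
        best = max(best, total)
    return best


def marginal(i, S):
    """M(i|S) = R(S + {i}) - R(S)."""
    return revenue(set(S) | {i}) - revenue(S)


print(marginal(2, {1}))     # 2
print(marginal(2, {1, 3}))  # 1
```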

If there are no capacity constraints, the proof of Proposition A.7 is straightforward. Given a set of open facilities, every client connects to the facility that offers maximum revenue to that client. As more facilities open, the revenue of each client is nondecreasing. Hence the possible benefit of a new facility cannot increase if more facilities are open.
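The uncapacitated argument can be made explicit as follows. This is a sketch in the notation above; the shorthand \(v_j(S)\) for the revenue of client \(j\) when the set \(S\) is open is introduced here only for illustration.

\[
v_j(S) = \max_{i\in S} u_{ij} \quad (\text{with } v_j(\emptyset)=0), \qquad R(S) = \sum_{j} v_j(S),
\]
\[
M(i|S) = R(S\cup\{i\}) - R(S) = \sum_{j} \max\{0,\; u_{ij} - v_j(S)\}.
\]

For \(S\subseteq T\) we have \(v_j(S)\le v_j(T)\) for every client \(j\), so each term of the sum is no larger for \(T\) than for \(S\), and hence \(M(i|S)\ge M(i|T)\).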

When there are capacity constraints, the situation becomes more complicated. Clients can no longer greedily choose which facility to connect to. Instead, a matching problem needs to be solved. As a consequence, the revenue of a client might decrease when a new facility is opened. This is illustrated in the following example.

Example

There are two clients \(a_1\) and \(a_2\) and two facilities 1 and 2, each of capacity 1. The revenues are \(u_{11}=3\), \(u_{12}=2\), \(u_{21}=2\), \(u_{22}=0\). Opening facility 1, the revenue of \(a_1\) is 3. Then, opening facility 2, \(a_1\) is transferred and its revenue decreases to 2, whereas the revenue of \(a_2\) increases from 0 to 2.
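This example can also be checked by enumerating assignments. The sketch below returns the optimal assignment itself so that per-client revenues can be compared before and after facility 2 opens; the helper names `best_assignment` and `client_revenue` are assumptions made for illustration.

```python
from itertools import product

# Revenues u[(facility, client)] from the example; both capacities are 1.
U = {(1, 1): 3, (1, 2): 2, (2, 1): 2, (2, 2): 0}
CLIENTS = [1, 2]
CAPACITY = {1: 1, 2: 1}


def best_assignment(open_facilities):
    """Return (total revenue, {client: facility or None}) by brute force."""
    options = [None] + sorted(open_facilities)  # None = client left unserved
    best_total, best_choice = 0, (None,) * len(CLIENTS)
    for choice in product(options, repeat=len(CLIENTS)):
        served = [i for i in choice if i is not None]
        if any(served.count(i) > CAPACITY[i] for i in set(served)):
            continue  # capacity violated
        total = sum(U[(i, j)] for i, j in zip(choice, CLIENTS) if i is not None)
        if total > best_total:
            best_total, best_choice = total, choice
    return best_total, dict(zip(CLIENTS, best_choice))


def client_revenue(assignment, j):
    """Revenue collected from client j under the given assignment."""
    i = assignment[j]
    return 0 if i is None else U[(i, j)]


total_before, before = best_assignment({1})
total_after, after = best_assignment({1, 2})
print(client_revenue(before, 1), client_revenue(after, 1))  # 3 2
print(client_revenue(before, 2), client_revenue(after, 2))  # 0 2
```

The total revenue still rises from 3 to 4 when facility 2 opens, even though the revenue of \(a_1\) drops.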

Proof of Proposition A.7

Let \(S\) and \(T\) be two arbitrary sets of facilities. Consider the following bipartite multigraph (parallel edges are allowed). The set \(V_1\) of left-hand-side vertices contains one vertex for every facility in \(S\cup T\). The set \(V_2\) of right-hand-side vertices contains one vertex for every client. The graph contains two types of edges, white and black. The white edges correspond to an optimal (or arbitrary; this does not matter for the proof) legal assignment of clients to the facilities in \(S\cup T\) (each client is assigned to at most one facility, and the number of clients served by a facility does not exceed the facility's capacity). Thus a white edge connects facility \(i\) to client \(j\) iff \(i\) serves client \(j\). The black edges correspond in a similar way to an optimal (or arbitrary) legal assignment of clients to facilities in \(S\cap T\). Hence every right-hand-side vertex is incident with at most one white edge and at most one black edge, and every left-hand-side vertex of capacity \(k_i\) is incident with at most \(k_i\) edges of each color. To prove submodularity of the revenue function, it suffices to show that all edges can be recolored red and blue in a way corresponding to legal assignments of clients to facilities in \(S\) and in \(T\), respectively. Namely, every client is incident with at most one red edge and at most one blue edge, every facility in \(S\setminus T\) is incident only with red edges, every facility in \(T\setminus S\) is incident only with blue edges, and for every facility \(i\in S\cap T\), neither the number of red edges nor the number of blue edges incident with it exceeds its capacity \(k_i\). Indeed, recoloring preserves the multiset of edges, so the red and blue assignments together collect exactly the revenue of the white and black assignments; when the latter are optimal, this gives \(R(S)+R(T) \ge R(S\cup T)+R(S\cap T)\).

We now show an algorithm that obtains a legal red/blue coloring from a legal white/black coloring. Observe that all edges incident with \(S\setminus T\) are initially colored white, and likewise for edges incident with \(T\setminus S\). The algorithm proceeds in steps. In every step, the algorithm takes a maximal alternating white/black path or an alternating white/black cycle and colors all its edges red/blue in an alternating way. We show that this can be done in a way that eventually gives a legal red/blue coloring.

Given an alternating white/black cycle, color its white edges red and its black edges blue. All clients on the cycle receive one red edge and one blue edge and hence become legally colored. All facilities on the cycle must belong to \(S\cap T\) and hence are allowed to have both red and blue edges. Moreover, a cycle gives each facility on it the same number of red and blue edges, and hence capacity constraints cannot be exceeded in any color.

Given a maximal alternating white/black path (maximal in the sense that it cannot be extended in either direction), check whether one of its endpoints lies in \(S\setminus T\). If yes, the edge incident with it must have been colored white. Color all white edges red and all black edges blue. Else (if no endpoint lies in \(S\setminus T\)), color all white edges blue and all black edges red. All clients on the path receive at most one red edge and one blue edge, and all their edges are exhausted (because the path is maximal and initially they had at most one white edge and at most one black edge). Hence they are legally colored. Vertices in \(S\setminus T\) can appear only as endpoints of the path (as they are not incident with black edges). Hence they are legally colored. To see that the same applies to vertices in \(T\setminus S\), one needs to consider the case that (at least) one of the endpoints lies in \(T\setminus S\). The only source of trouble in this case would be if the other endpoint lay in \(S\setminus T\) (because then all white edges are colored red, and this is illegal for vertices in \(T\setminus S\)). However, a simple parity argument shows that this cannot happen without having a black edge enter a vertex in \(T\setminus S\), and there are no such black edges. Last, for vertices in \(S\cap T\) we need to check that capacity constraints are not exceeded. Note that as long as such a vertex has both white and black edges incident with it, it receives the same number of red and blue edges (because paths are maximal). Hence if \(W,B\) denote the numbers of white and black edges it started with, and \(r,b\) denote the numbers of red and blue edges it ended with (hence \(W+B=r+b\)), we have that \(\min[r,b]\ge\min[W,B]\), implying that \(\max[r,b]\le\max[W,B]\), and no capacity constraint can be exceeded. □
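The recoloring procedure from the proof can be sketched in Python. Everything here is an illustrative assumption rather than code from the paper: the names `recolor` and `is_legal`, the edge representation as `(facility, client)` pairs, and in particular the choice to pair white with black edges at every vertex up front instead of extracting maximal paths one at a time. The resulting components play the role of the alternating paths and cycles.

```python
from collections import defaultdict


def recolor(white, black, S, T):
    """Recolor white edges (assignment to S | T) and black edges
    (assignment to S & T) into red (for S) and blue (for T)."""
    edges = [(f, c, "w") for f, c in white] + [(f, c, "b") for f, c in black]

    # At every vertex, pair white edges with black edges; leftovers are path ends.
    at_vertex = defaultdict(lambda: {"w": [], "b": []})
    for idx, (f, c, col) in enumerate(edges):
        at_vertex[("f", f)][col].append(idx)
        at_vertex[("c", c)][col].append(idx)
    partner = {}  # (edge index, side) -> paired edge index; side is "f" or "c"
    for (side, _), cols in at_vertex.items():
        for w, b in zip(cols["w"], cols["b"]):
            partner[(w, side)] = b
            partner[(b, side)] = w

    red, blue, seen = [], [], set()
    for start in range(len(edges)):
        if start in seen:
            continue
        # Collect the alternating path or cycle containing this edge.
        comp, stack = [], [start]
        while stack:
            e = stack.pop()
            if e in seen:
                continue
            seen.add(e)
            comp.append(e)
            stack.extend(partner[(e, s)] for s in ("f", "c") if (e, s) in partner)
        is_cycle = all((e, s) in partner for e in comp for s in ("f", "c"))
        end_facilities = {edges[e][0] for e in comp if (e, "f") not in partner}
        # Cycles, and paths with an endpoint in S \ T, send white to red.
        white_to_red = is_cycle or bool(end_facilities & (S - T))
        for e in comp:
            f, c, col = edges[e]
            (red if (col == "w") == white_to_red else blue).append((f, c))
    return red, blue


def is_legal(assignment, allowed, capacity):
    """Each client served at most once, only by allowed facilities, within capacity."""
    load, clients = defaultdict(int), set()
    for f, c in assignment:
        if f not in allowed or c in clients:
            return False
        clients.add(c)
        load[f] += 1
    return all(load[f] <= capacity[f] for f in load)


# Instance from the first example above, with S = {1, 2} and T = {1, 3}.
S, T = {1, 2}, {1, 3}
capacity = {1: 1, 2: 1, 3: 1}
white = [(2, 1), (3, 2)]  # an optimal assignment for S | T = {1, 2, 3}
black = [(1, 1)]          # an optimal assignment for S & T = {1}
red, blue = recolor(white, black, S, T)
print(is_legal(red, S, capacity), is_legal(blue, T, capacity))  # True True
```

Because the red and blue edges together are exactly the white and black edges, the two legal assignments collect the same total revenue, which is the inequality of Proposition A.7 on this instance.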

About this article

Cite this article

Feige, U., Immorlica, N., Mirrokni, V.S. et al. PASS Approximation: A Framework for Analyzing and Designing Heuristics. Algorithmica 66, 450–478 (2013). https://doi.org/10.1007/s00453-012-9646-2

  • DOI: https://doi.org/10.1007/s00453-012-9646-2
