Abstract
Given a set P of n points on the real line and a (potentially infinite) family of functions \({\mathcal{F}}\), we investigate the problem of finding a small (weighted) subset \({\mathcal{S}} \subseteq P\) such that, for any \(f \in {\mathcal{F}}\), f(P) is a (1±ε)-approximation to \(f({\mathcal{S}})\). Here, \(f(Q) = \sum_{q \in Q} w(q) f(q)\) denotes the weighted discrete integral of f over the point set Q, where w(q) is the weight assigned to the point q.
We study this problem and provide tight bounds on the size of \({\mathcal{S}}\) for several families of functions. As an application, we present some coreset constructions for clustering.
The latest version of this paper is available online [1].
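To make the definition concrete, the following is a minimal sketch in Python of the weighted discrete integral f(Q) and of a check of the (1±ε) condition. It is an illustration only, not the construction from the paper. The names `weighted_integral` and `is_coreset`, the choice of functions f_c(x) = |x − c|, and the naive "every tenth point with weight 10" subset are all assumptions introduced here. The condition is read as (1 − ε) f(S) ≤ f(P) ≤ (1 + ε) f(S) for nonnegative f, and it is tested only on a finite sample of the family, so passing the check does not prove the guarantee for an infinite family.

```python
from typing import Callable, Iterable, List, Tuple

# A weighted point set is a list of (point, weight) pairs on the real line.
WeightedSet = List[Tuple[float, float]]


def weighted_integral(f: Callable[[float], float], Q: WeightedSet) -> float:
    """Weighted discrete integral f(Q) = sum over q in Q of w(q) * f(q)."""
    return sum(w * f(q) for q, w in Q)


def is_coreset(P: WeightedSet, S: WeightedSet,
               family: Iterable[Callable[[float], float]],
               eps: float) -> bool:
    """Check (1 - eps) * f(S) <= f(P) <= (1 + eps) * f(S) for each sampled f.

    Assumes every f is nonnegative on the points involved.
    """
    for f in family:
        f_P = weighted_integral(f, P)
        f_S = weighted_integral(f, S)
        if not ((1 - eps) * f_S <= f_P <= (1 + eps) * f_S):
            return False
    return True


# Hypothetical input: the points 1..100, each with weight 1.
P: WeightedSet = [(float(x), 1.0) for x in range(1, 101)]

# Naive candidate subset (an assumption, not the paper's method):
# the points 5, 15, ..., 95, each carrying the weight of 10 original points.
S: WeightedSet = [(float(x), 10.0) for x in range(5, 100, 10)]

# Finite sample of the family f_c(x) = |x - c| (distance to a center c).
centers = [0.0, 25.0, 50.0, 100.0]
family = [lambda x, c=c: abs(x - c) for c in centers]

if __name__ == "__main__":
    print(is_coreset(P, S, family, eps=0.05))
    print(is_coreset(P, S, family, eps=0.001))
```

The subset here has 10 points instead of 100. Whether such a simple subset suffices depends entirely on the family of functions; the bounds studied in the paper address how small \({\mathcal{S}}\) can be in general.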
References
Har-Peled, S.: Coresets for discrete integration and clustering (2006), Available from: http://www.uiuc.edu/~sariel/papers/06/integrate
Matoušek, J.: Geometric Discrepancy. Springer, Heidelberg (1999)
Har-Peled, S., Kushal, A.: Smaller coresets for k-median and k-means clustering. In: Proc. 21st Annu. ACM Sympos. Comput. Geom., pp. 126–134 (2005)
Har-Peled, S., Mazumdar, S.: Coresets for k-means and k-median clustering and their applications. In: Proc. 36th Annu. ACM Sympos. Theory Comput., pp. 291–300 (2004)
Chen, K.: On k-median clustering in high dimensions. In: Proc. 17th ACM-SIAM Sympos. Discrete Algorithms, pp. 1177–1185 (2006)
Agarwal, P.K., Har-Peled, S., Varadarajan, K.: Geometric approximation via coresets. In: Goodman, J.E., Pach, J., Welzl, E. (eds.) Combinatorial and Computational Geometry. Math. Sci. Research Inst. Pub., Cambridge (2005)
Feldman, D., Fiat, A., Sharir, M.: Coresets for weighted facilities and their applications (manuscript 2006)
Alon, N., Dar, S., Parnas, M., Ron, D.: Testing of clustering. In: Proc. 41st Annu. IEEE Sympos. Found. Comput. Sci., pp. 240–250 (2000)
Agarwal, P.K., Procopiuc, C.M.: Approximation algorithms for projective clustering. In: Proc. 11th ACM-SIAM Sympos. Discrete Algorithms, pp. 538–547 (2000)
Bădoiu, M., Clarkson, K.: Smaller coresets for balls. In: Proc. 14th ACM-SIAM Sympos. Discrete Algorithms, pp. 801–802 (2003)
Eppstein, D.: Fast hierarchical clustering and other applications of dynamic closest pairs. In: Proc. 9th ACM-SIAM Sympos. Discrete Algorithms, pp. 619–628 (1998)
Feder, T., Greene, D.H.: Optimal algorithms for approximate clustering. In: Proc. 20th Annu. ACM Sympos. Theory Comput., pp. 434–444 (1988)
Gonzalez, T.: Clustering to minimize the maximum intercluster distance. Theoret. Comput. Sci. 38, 293–306 (1985)
Indyk, P.: A sublinear time approximation scheme for clustering in metric spaces. In: Proc. 40th Annu. IEEE Sympos. Found. Comput. Sci., pp. 100–110 (1999)
Mishra, N., Oblinger, D., Pitt, L.: Sublinear time approximate clustering. In: Proc. 12th ACM-SIAM Sympos. Discrete Algorithms, pp. 439–447 (2001)
Ostrovsky, R., Rabani, Y.: Polynomial time approximation schemes for geometric k-clustering. In: Proc. 41st Symp. Foundations of Computer Science, pp. 349–358. IEEE, Los Alamitos (2000)
Kanungo, T., Mount, D.M., Netanyahu, N.S., Piatko, C.D., Silverman, R., Wu, A.Y.: A local search approximation algorithm for k-means clustering. Comput. Geom. Theory Appl. 28, 89–112 (2004)
Inaba, M., Katoh, N., Imai, H.: Applications of weighted voronoi diagrams and randomization to variance-based k-clustering. In: Proc. 10th Annu. ACM Sympos. Comput. Geom., pp. 332–339 (1994)
Har-Peled, S.: How to get close to the median shape. In: Proc. 22nd Annu. ACM Sympos. Comput. Geom. (to appear, 2006), Available from: http://www.uiuc.edu/~sariel/papers/05/l1_fitting/
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Har-Peled, S. (2006). Coresets for Discrete Integration and Clustering. In: Arun-Kumar, S., Garg, N. (eds) FSTTCS 2006: Foundations of Software Technology and Theoretical Computer Science. FSTTCS 2006. Lecture Notes in Computer Science, vol 4337. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11944836_6
DOI: https://doi.org/10.1007/11944836_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49994-7
Online ISBN: 978-3-540-49995-4
eBook Packages: Computer Science, Computer Science (R0)