The analysis of social networks

Health Services and Outcomes Research Methodology

Abstract

Many questions about the social organization of medicine and health services involve interdependencies among social actors that may be depicted by networks of relationships. Social network studies have been pursued for some time in social science disciplines, where numerous descriptive methods for analyzing them have been proposed. More recently, interest in the analysis of social network data has grown among statisticians, who have developed more elaborate models and methods for fitting them to network data. This article reviews fundamentals of, and recent innovations in, social network analysis using a physician influence network as an example. After introducing forms of network data, basic network statistics, and common descriptive measures, it describes two distinct types of statistical models for network data: individual-outcome models in which networks enter the construction of explanatory variables, and relational models in which the network itself is a multivariate dependent variable. Complexities in estimating both types of models arise due to the complex correlation structures among outcome measures.


Notes

  1. To aid readers in applying these methods, we provide some references to network software throughout, but our coverage of software is not comprehensive. Huisman and Van Duijn (2005) review software resources available earlier in this decade.

  2. The extent to which distances in a graphical representation correspond to the data on which they rest—dyadic measurements of social distance or proximity—depends on the objective function that serves as a fitting criterion when the plot is constructed. The most widely-used “nonmetric” multidimensional scaling algorithm requires an ordinal correspondence between data and plotted distances; “metric” scaling uses a stronger (linear) criterion. Objective functions used by many spring-embedder methods involve a “node repulsion” term that simplifies the visual representation by discouraging co-location of vertices within a plot, but simultaneously limits the extent to which data and plotted distances correspond. Moreover, a low (ordinarily 2)-dimensional Cartesian plot may do more or less well in representing data on the relationships among N actors, which may in principle be (N − 1)-dimensional.

  3. Note that some or all of the intermediaries along these geodesic paths may be physicians 11–33.

  4. Recall that the undirected physician network is identical to that shown in Fig. 1, except that ties lack directionality.

  5. We calculated centrality scores using the software package UCINET 6 (Borgatti, Everett, and Freeman, 2002).

  6. Eigenvector centrality can in principle be calculated for a nonsymmetric matrix, but the routine in UCINET 6 handles only the symmetric case.

  7. Because two actors have outdegrees of 0, the associated rows of W sum to 0 as opposed to 1. Therefore, although these actors contribute to the estimation of β and σ², they do not directly contribute any information about the autocorrelation parameters α and ρ. We retained these actors in the analysis because they were cited by other physicians as influencing them, so removing them would omit information about how other actors were influenced.

  8. Computations were performed using the StOCNET software package (Boer et al. 2006).

  9. Under p1, the estimate of a receiver parameter is −∞ for actors with indegree 0; likewise, the estimate of a sender parameter is −∞ when the corresponding outdegree is 0.

  10. The p2 model is closely related to a social relations model developed by Kenny and La Voie (1984) for quantitative network variables.

  11. The large number of terms in κ(θ) complicates the estimation of ERGMs. There are \( 2^{N(N-1)} \) possible directed binary-valued networks; for example, with N = 10, the number of possible networks, and hence of terms in κ(θ), is \( 1.238 \times 10^{27} \).

  12. For example, an actor with degree 3 contributes 1 3-star, 3 2-stars, and 3 1-stars; 1-stars are equivalent to individual edges.

  13. The set of k-star statistics is equivalent to the set of degree statistics (the number of nodes of degree k, k = 1, 2, 3, …) in that a bijection exists between the two sets of statistics (Snijders et al. 2006).

  14. An analogous “sender covariate” statistic allows the density effect to depend on an attribute of the sender (i).

  15. \( \mathbf{y}_{ij}^{+} \) is the realization of the complement network with \( y_{ij} = 1 \), while \( \mathbf{y}_{ij}^{-} \) is the realization of the complement network with \( y_{ij} = 0 \).

  16. An equivalent statistic based on the degree distribution itself is known as the “geometrically weighted degree” statistic; see Hunter and Handcock (2006).

  17. No mutuality term is included, since this is redundant with the edges term in an undirected network. Constraining the value of ρ when fitting the model with the GWESP term is often helpful; attaining adequate convergence is more difficult when it is estimated as a free parameter. We found that setting ρ = 1.2 served well here; the likelihood surface is relatively flat, so that using a value between 1.0 and 1.5 did not affect inferences about other parameters. Note, however, ρ was estimated at 0.93 when we left it as a free parameter in the third model in Table 9.

  18. Although the missing values are replaced with non-missing values during model fitting, the statistics measuring model fit are only evaluated using actors with non-missing values throughout the corresponding interval of time. Thus, standard imputation is not performed.
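
The counts quoted in notes 11 and 12 can be checked with a few lines of R. This is a minimal sketch rather than part of the original analysis; the variable names (N, n_networks, deg, stars) are illustrative assumptions.

```r
# Note 11: each of the N*(N-1) ordered pairs of actors either has a tie or not,
# so the number of directed binary networks is 2^(N*(N-1)).
N <- 10
n_networks <- 2^(N * (N - 1))
print(n_networks)

# Note 12: an actor of degree d contributes choose(d, k) k-stars,
# one for every way of picking k of its d ties.
deg <- 3
stars <- choose(deg, 1:deg)
names(stars) <- paste0(1:deg, "-star")
print(stars)
```

The double-precision value of n_networks is exact here because it is a power of two, but the count grows so quickly with N that direct enumeration of κ(θ) is only feasible for very small networks.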

References

  • Anderson, C., Wasserman, S., Crouch, B.: A p* primer: logit models for social networks. Soc. Networks 21(1), 37–66 (1999). doi:10.1016/S0378-8733(98)00012-4

  • Anselin, L.: Spatial Econometrics: Methods and Models. Kluwer Academic Publishers, Dordrecht, The Netherlands (1988)

  • Anselin, L.: Some robust approaches to testing and estimation in spatial econometrics. Reg. Sci. Urban Econ. 20(2), 141–163 (1990). doi:10.1016/0166-0462(90)90001-J

  • Banerjee, S., Carlin, B., Gelfand, A.: Hierarchical Modeling and Analysis for Spatial Data. Chapman and Hall, Boca Raton, FL (2004)

  • Barabási, A.-L.: Linked: The New Science of Networks. Perseus, New York (2002)

  • Bartholomew, D., Steele, F., Moustaki, I., Galbraith, J.: The Analysis and Interpretation of Multivariate Data for Social Scientists. Chapman and Hall, New York (2002)

  • Batagelj, V., Mrvar, A.: Pajek: analysis and visualization of large networks. In: Jünger, M., Mutzel, P. (eds.) Graph Drawing Software, pp. 77–103. Springer, New York (2003)

  • Beauchamp, M.: An improved index of centrality. Behav. Sci. 10, 161–163 (1965). doi:10.1002/bs.3830100205

  • Behrman, J., Kohler, H.-P., Watkins, S.: Social networks and changes in contraceptive use over time: evidence from a longitudinal study in rural Kenya. Demography 39, 713–738 (2002). doi:10.1353/dem.2002.0033

  • Berkman, L., Glass, T.: Social integration, social networks, social support, and health. In: Berkman, L., Kawachi, I. (eds.) Social Epidemiology, pp. 137–173. Oxford University Press, New York (2000)

  • Berkman, L., Syme, S.: Social networks, host resistance, and mortality: a nine-year follow-up study of Alameda County residents. Am. J. Epidemiol. 109, 186–204 (1979)

  • Besag, J.: Spatial interaction and statistical-analysis of lattice systems. J. Roy. Stat. Soc. B Met. 36(2), 192–236 (1974)

  • Besag, J.: Statistical analysis of non-lattice data. J. Inst. Statisticians 24, 179–196 (1975)

  • Best, N., Cowles, M., Vines, K.: Convergence Diagnosis and Output Analysis Software for Gibbs Sampling Output. MRC Biostatistics Unit, Institute of Public Health, Robinson Way, Cambridge CB2 2SR, UK (1995)

  • Boer, P., Huisman, M., Snijders, T., Steglich, M., Wicher, L., Zeggelink, E.: StOCNET User’s Manual, Version 1.7. ICS, Groningen, NL (2006)

  • Bonacich, P.: Power and centrality: a family of measures. Am. J. Sociol. 92, 1170–1182 (1987). doi:10.1086/228631

  • Borgatti, S.: NetDraw: Graph Visualization Software. Analytical Technologies, Lexington, KY (2008)

  • Borgatti, S., Everett, M., Freeman, L.: UCINET 6 for Windows: Software for Social Network Analysis. Analytical Technologies, Lexington, KY (2002)

  • Burt, R.: Structural Holes: The Social Structure of Competition. Harvard University Press, Cambridge, MA (1992)

  • Burt, R., Doreian, P.: Testing a structural model of perception: conformity and deviance with respect to journal norms in elite sociological methodology. Qual. Quant. 16, 109–150 (1982). doi:10.1007/BF00166880

  • Butts, C.: sna: Tools for Social Network Analysis (release 1.5) (2007)

  • Christakis, N.: Social networks and collateral health effects. BMJ 329(7459), 184–185 (2004). doi:10.1136/bmj.329.7459.184

  • Christakis, N., Fowler, J.: The spread of obesity in a large social network over 32 years. N. Engl. J. Med. 357, 370–379 (2007). doi:10.1056/NEJMsa066082

  • Christakis, N., Fowler, J.: The collective dynamics of smoking in a large social network. N. Engl. J. Med. 358, 2249–2258 (2008). doi:10.1056/NEJMsa0706154

  • Coleman, J., Katz, E., Menzel, H.: Medical Innovation: A Diffusion Study. Bobbs-Merrill, Indianapolis (1966)

  • Doreian, P.: Linear-models with spatially distributed data-spatial disturbances or spatial effects. Sociol. Method. Res. 9(1), 29–60 (1980). doi:10.1177/004912418000900102

  • Doreian, P.: Estimating linear models with spatially distributed data. In: Leinhardt, S. (ed.) Sociological Methodology, pp. 359–388. Jossey-Bass, San Francisco (1981)

  • Doreian, P.: Network autocorrelation models: problems and prospects. In: Griffith, D.A. (ed.) Spatial Statistics: Past, Present, Future, pp. 369–389. Michigan Document Services, Ann Arbor (1989)

  • Doreian, P., Stokman, F.: Evolution of social networks: processes and principles. In: Doreian, P., Stokman, F. (eds.) Evolution of Social Networks, pp. 233–250. Gordon and Breach Publishers, Amsterdam (1997)

  • Dow, M.: A biparametric approach to network autocorrelation. Sociol. Method. Res. 13, 201–217 (1984). doi:10.1177/0049124184013002002

  • Erdös, P., Rényi, A.: On random graphs. Pub. Math. 6, 290–297 (1959)

  • Fienberg, S., Wasserman, S.: Categorical data analysis of single sociometric relations. In: Leinhardt, S. (ed.) Sociological Methodology, pp. 156–192. Jossey-Bass, San Francisco (1981)

  • Frank, O.: Statistical Inference in Graphs. FOA Repro, Stockholm (1971)

  • Frank, O.: Sampling and estimation in large social networks. Soc. Networks 1, 91–101 (1978)

  • Frank, O.: A survey of statistical methods for graph analysis. In: Leinhardt, S. (ed.) Sociological Methodology, pp. 110–155. Jossey-Bass, San Francisco (1981)

  • Frank, O.: Random sampling and social networks: a survey of various approaches. Math. Informatique Sci. Hum. 26, 19–33 (1988)

  • Frank, O., Strauss, D.: Markov graphs. J. Am. Stat. Assoc. 81(395), 832–842 (1986). doi:10.2307/2289017

  • Freeman, L.: Centrality in social networks, I. Conceptual clarification. Soc. Networks 1, 215–239 (1979). doi:10.1016/0378-8733(78)90021-7

  • Freeman, L.: Social networks and the structure experiment. In: Freeman, L., White, D., Romney, A. (eds.) Research Methods in Social Network Analysis, pp. 11–40. George Mason University Press, Fairfax, VA (1989)

  • Freeman, L.: The Development of Social Network Analysis: A Study in the Sociology of Science. Empirical Press, Vancouver, BC (2004)

  • Friedkin, N.: Social networks in structural equations models. Soc. Psychol. Q. 53, 316–328 (1990). doi:10.2307/2786737

  • Friedkin, N., Cook, K.: Peer group influence. Sociol. Method. Res. 19(1), 122–143 (1990). doi:10.1177/0049124190019001006

  • Fruchterman, T., Reingold, E.: Graph drawing by force-directed placement. Software Pract. Exper. 21(11), 1129–1164 (1991). doi:10.1002/spe.4380211102

  • Geyer, C., Thompson, E.: Constrained Monte Carlo maximum likelihood for dependent data. J. Roy. Stat. Soc. B Met. 54(3), 657–699 (1992)

  • Gill, P., Swartz, T.: Bayesian analysis of directed graphs data with applications to social networks. J. Roy. Stat. Soc. C-App. Stat. 53, 249–260 (2004)

  • Goodreau, S.: Advances in exponential random graph (p*) models applied to a large social network. Soc. Networks 29, 231–248 (2007). doi:10.1016/j.socnet.2006.08.001

  • Haines, V., Hurlbert, J.: Network range and health. J. Health Soc. Behav. 33, 254–266 (1992). doi:10.2307/2137355

  • Handcock, M.: Assessing Degeneracy in Statistical Models of Social Networks. Center for Statistics and Social Sciences, University of Washington, Seattle (2003)

  • Handcock, M., Hunter, D., Butts, C., Goodreau, S., Morris, M.: Statnet: Software tools for the Statistical Modeling of Network Data (release Version 2.1). Center for Statistics and Social Sciences, University of Washington, Seattle, WA; Project home page at http://statnetproject.org; Software available at http://CRAN.R-project.org/package=statnet (2003)

  • Handcock, M., Raftery, A., Tantrum, J.: Model-based clustering for social networks. J. R. Stat. Soc. [Ser A] 170(2), 301–354 (2007). doi:10.1111/j.1467-985X.2007.00471.x

  • Harville, D.: Matrix Algebra from a Statistician’s Perspective. Springer-Verlag Inc, New York (1997)

  • Hoff, P.: Bilinear mixed-effects models for dyadic data. J. Am. Stat. Assoc. 100, 286–295 (2005). doi:10.1198/016214504000001015

  • Hoff, P., Raftery, A., Handcock, M.: Latent space approaches to social network analysis. J. Am. Stat. Assoc. 97, 1090–1098 (2002). doi:10.1198/016214502388618906

  • Holland, P., Leinhardt, S.: Local structure in social networks. In: Heise, D. (ed.) Sociological Methodology, pp. 1–45. Jossey-Bass, San Francisco (1976)

  • Holland, P., Leinhardt, S.: A dynamic model for social networks. J. Math. Sociol. 5, 5–20 (1977)

  • Holland, P., Leinhardt, S.: An exponential family of probability-distributions for directed-graphs. J. Am. Stat. Assoc. 76(373), 33–50 (1981). doi:10.2307/2287037

  • Holland, P., Laskey, K., Leinhardt, S.: Stochastic blockmodels: first steps. Soc. Networks 5, 109–137 (1983). doi:10.1016/0378-8733(83)90021-7

  • House, J., Kahn, R.: Measures and concepts of social support. In: Cohen, S., Syme, S. (eds.) Social Support and Health, pp. 83–108. Academic Press, New York (1985)

  • Huisman, M., Van Duijn, M.: Software for statistical analysis of social networks. In: Van Dijkum, C., Blasius, J., Kleijer, H., Van Hilten, B. (eds.) The Sixth International Conference on Logic and Methodology, Amsterdam, The Netherlands (2004)

  • Huisman, M., Snijders, T.: Statistical analysis of longitudinal network data with changing composition. Sociol. Method. Res. 32, 253–287 (2003). doi:10.1177/0049124103256096

  • Huisman, M., Van Duijn, M.: Software for social networks analysis. In: Carrington, P.J., Scott, J., Wasserman, S. (eds.) Models and Methods in Social Network Analysis, pp. 270–316. Cambridge University Press, Cambridge (2005)

  • Hunter, D.: Curved exponential family models for social networks. Soc. Networks 29, 216–230 (2007). doi:10.1016/j.socnet.2006.08.005

  • Hunter, D., Goodreau, S., Handcock, M.: Goodness of fit of social network models. J. Am. Stat. Assoc. 103, 248–258 (2008). doi:10.1198/016214507000000446

  • Hunter, D., Handcock, M.: Inference in curved exponential family models for networks. J. Comput. Graph. Stat. 15, 565–583 (2006)

  • Katz, L., Powell, J.: Probability distributions of random variables associated with a structure of the sample space of sociometric investigations. Ann. Math. Stat. 28, 442–448 (1957). doi:10.1214/aoms/1177706972

  • Keating, N., Ayanian, J., Cleary, P., Marsden, P.: Factors affecting influential discussions among physicians: a social network analysis of a primary care practice. J. Gen. Intern. Med. 22(6), 794–798 (2007). doi:10.1007/s11606-007-0190-8

  • Kenny, D., La Voie, L.: The social relations model. In: Berkowitz, L. (ed.) Advances in Experimental Social Psychology, pp. 142–182. Academic Press, New York (1984)

  • Klovdahl, A.: Social networks and the spread of infectious diseases. Soc. Sci. Med. 21, 1203–1216 (1985). doi:10.1016/0277-9536(85)90269-2

  • Land, K., Deane, G.: On the large-sample estimation of regression models with spatial or network effects terms: a two-stage least-squares approach. In: Marsden, P.V. (ed.) Sociological Methodology, pp. 221–248. Basil Blackwell, Ltd., Oxford (1992)

  • Laumann, E., Youm, Y.: Racial/ethnic group differences in the prevalence of sexually transmitted diseases in the United States: a network explanation. Sex. Transm. Dis. 26, 250–261 (1999). doi:10.1097/00007435-199905000-00003

  • Laumann, E., Marsden, P., Prensky, D.: The boundary specification problem in network analysis. In: Burt, R., Minor, M. (eds.) Applied Network Analysis: A Methodological Introduction, pp. 18–34. Sage Publications, Beverly Hills, CA (1983)

  • Laumann, E., Mahay, J., Paik, A., Youm, Y.: Network data collection and its relevance for the analysis of STDs: the NHSLS and CHSLS. In: Morris, M. (ed.) Network Epidemiology: A Handbook for Survey Design and Data Collection, pp. 27–41. Oxford University Press, New York (2004)

  • Leenders, R.: Modeling social influence through network autocorrelation: constructing the weight matrix. Soc. Networks 24(1), 21–47 (2002). doi:10.1016/S0378-8733(01)00049-1

  • Marsden, P.: Core discussion networks of Americans. Am. Sociol. Rev. 52(1), 122–131 (1987). doi:10.2307/2095397

  • Marsden, P.: Network data and measurement. Annu. Rev. Sociol. 16, 435–463 (1990). doi:10.1146/annurev.so.16.080190.002251

  • Marsden, P.: Egocentric and sociocentric measures of network centrality. Soc. Networks 24, 407–422 (2002). doi:10.1016/S0378-8733(02)00016-3

  • Marsden, P.: Network methods in social epidemiology. In: Oakes, J.M., Kaufman, J.S. (eds.) Methods in Social Epidemiology, pp. 267–286. Jossey-Bass, San Francisco (2006)

  • McGrath, C., Blythe, J., Krackhardt, D.: The effect of spatial arrangement on judgments and errors in interpreting graphs. Soc. Networks 19(3), 223–242 (1997). doi:10.1016/S0378-8733(96)00299-7

  • McPherson, M., Smith-Lovin, L., Cook, J.: Birds of a feather: homophily in social networks. Annu. Rev. Sociol. 27, 415–444 (2001). doi:10.1146/annurev.soc.27.1.415

  • Miguel, E., Kremer, M.: Networks, Social Learning, and Technology Adoption: The Case of Deworming Drugs in Kenya. Poverty Action Laboratory (2003)

  • Morris, M., Handcock, M., Miller, W., Ford, C., Schmitz, J., Hobbs, M., Cohen, M., Harris, K., Udry, J.: Prevalence of HIV infection among young adults in the U.S.: results from the add health study. Am. J. Public Health 96(6), 1091–1097 (2006). doi:10.2105/AJPH.2004.054759

  • Nowicki, K., Snijders, T.A.B.: Estimation and prediction for stochastic blockstructures. J. Am. Stat. Assoc. 96, 1077–1087 (2001). doi:10.1198/016214501753208735

  • Pattison, P., Wasserman, S.: Logit models and logistic regressions for social networks: II. Multivariate relations. Br. J. Math. Stat. Psychol. 52(Pt 2), 169–193 (1999). doi:10.1348/000711099159053

  • Robins, G., Pattison, P., Wasserman, S.: Logit models and logistic regressions for social networks: III. Valued relations. Psychometrika 64(3), 371–394 (1999). doi:10.1007/BF02294302

  • Robins, G., Pattison, P., Woolcock, J.: Small and other worlds: global network structures from local processes. Am. J. Sociol. 110(4), 894–936 (2005). doi:10.1086/427322

  • Robins, G., Pattison, P., Kalish, Y., Lusher, D.: An introduction to exponential random graph (p*) models for social networks. Soc. Networks 29(2), 173–191 (2007). doi:10.1016/j.socnet.2006.08.002

  • Rothenberg, R., Potterat, J., Woodhouse, D., Muth, S., Darrow, W., Klovdahl, A.: Social network dynamics and HIV transmission. AIDS 12, 1529–1536 (1998). doi:10.1097/00002030-199812000-00016

  • Salganik, M., Heckathorn, D.: Sampling and estimation in hidden populations using respondent-driven sampling. Sociol. Methodol. 34, 193–239 (2004). doi:10.1111/j.0081-1750.2004.00152.x

  • Snijders, T.: The degree variance: an index of graph heterogeneity. Soc. Networks 3, 163–174 (1981). doi:10.1016/0378-8733(81)90014-9

  • Snijders, T.: Enumeration and simulation methods for 0–1 matrices with given marginals. Psychometrika 56(3), 397–417 (1991). doi:10.1007/BF02294482

  • Snijders, T.: Stochastic actor-oriented models for network change. J. Math. Sociol. 21, 149–172 (1996)

  • Snijders, T.: The statistical evaluation of social network dynamics. In: Sobel, M.E., Becker, M.P. (eds.) Sociological Methodology, pp. 361–395. Basil Blackwell, Boston (2001)

  • Snijders, T.: Models for longitudinal social network data. In: Carrington, P., Scott, J., Wasserman, S. (eds.) Models and Methods in Social Network Analysis, pp. 215–247. Cambridge University Press, Cambridge (2005)

  • Snijders, T.: Markov Chain Monte Carlo estimation of exponential random graph models. J. Soc. Struct. 3(2) (2002). Available at: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.20.5323

  • Snijders, T., Pattison, P., Robins, G., Handcock, M.: New specifications for exponential random graph models. In: Stolzenberg, R. (ed.) Sociological Methodology, pp. 99–153. Blackwell, Boston, MA (2006)

  • Snijders, T., Steglich, C., Schweinberger, M., Huisman, M.: Manual for SIENA Version 3.2. University of Groningen, Groningen, The Netherlands (2007)

  • Strauss, D., Ikeda, M.: Pseudolikelihood estimation for social networks. J. Am. Stat. Assoc. 85, 204–212 (1990). doi:10.2307/2289546

  • Sudman, S., Kalton, G.: New developments in the sampling of special populations. Annu. Rev. Sociol. 12, 401–429 (1986). doi:10.1146/annurev.so.12.080186.002153

  • Thompson, S.: Adaptive web sampling. Biometrics 62(4), 1224–1234 (2006)

  • Travers, J., Milgram, S.: An experimental study of the small world problem. Sociometry 32(4), 425–443 (1969). doi:10.2307/2786545

  • Unger, J., Chen, X.: The role of social networks and media receptivity in predicting age of smoking initiation: a proportional hazards model of risk and protective factors. Addict. Behav. 24, 371–381 (1999). doi:10.1016/S0306-4603(98)00102-6

  • Valente, T., Watkins, S., Jato, M., van der Straten, A., Tsitol, L.: Social network associations with contraceptive use among Cameroonian women in voluntary associations. Soc. Sci. Med. 45, 1837–1843 (1997). doi:10.1016/S0277-9536(96)00385-1

  • Van Duijn, M., van Busschbach, J., Snijders, T.: Multilevel analysis of personal networks as dependent variables. Soc. Networks 21, 187–209 (1999). doi:10.1016/S0378-8733(99)00009-X

  • Van Duijn, M., Snijders, T., Zijlstra, B.: P2: a random effects model with covariates for directed graphs. Stat. Neerl. 58(2), 234–254 (2004). doi:10.1046/j.0039-0402.2003.00258.x

  • Waller, L., Gotway, C.: Applied Spatial Statistics for Public Health Data. Wiley Interscience, Hoboken, NJ (2004)

  • Wang, P., Robins, G., Pattison, P.: PNet: Program for the Simulation and Estimation of P* Exponential Random Graph Models. Department of Psychology, University of Melbourne (2008)

  • Wang, W., Wong, G.: Stochastic blockmodels for directed graphs. J. Am. Stat. Assoc. 82, 8–19 (1987). doi:10.2307/2289119

  • Wasserman, S.: A stochastic model for directed graphs with transition rates determined by reciprocity. In: Schuessler, K.F. (ed.) Sociological Methodology, pp. 392–412. Jossey-Bass, San Francisco (1979)

  • Wasserman, S.: Analyzing social networks as stochastic processes. J. Am. Stat. Assoc. 75, 280–294 (1980). doi:10.2307/2287447

  • Wasserman, S., Faust, K.: Social Network Analysis: Methods and Applications. Cambridge University Press, Cambridge (1994)

  • Wasserman, S., Pattison, P.: Logit models and logistic regressions for social networks: I. An introduction to Markov graphs and p*. Psychometrika 61, 401–425 (1996). doi:10.1007/BF02294547

  • Wellman, B., Frank, K.: Network capital in a multilevel world: getting support from personal communities. In: Lin, K., Cook, K., Burt, R. (eds.) Social Capital: Theory and Research, pp. 233–273. Aldine de Gruyter, New York (2001)

  • White, D., Harary, F.: The cohesiveness of blocks in social networks: node connectivity and conditional density. In: Becker, M.P. (ed.) Sociological Methodology, pp. 140–148. Blackwell, Boston (2001)

  • Wolfram, S.: A New Kind of Science. Wolfram Media (2002)

  • Wong, G.: Bayesian models for directed graphs. J. Am. Stat. Assoc. 82, 140–148 (1987)

  • Zeggelink, E.: Dynamics of structure—an individual oriented approach. Soc. Networks 16(4), 295–333 (1994). doi:10.1016/0378-8733(94)90014-0

  • Zijlstra, B., Van Duijn, M., Snijders, T.: The multilevel p2 model: a random effects model for the analysis of multiple social networks. Methodology 2(1), 42–47 (2006)


Acknowledgements

We thank Nancy Keating for allowing us to re-analyze the data from the physician influence network. Research for the paper was supported by NIH grants R01 AG024448-02, P01 AG031093, and Robert Wood Johnson Foundation Award #58729.

Author information

Corresponding author

Correspondence to A. James O’Malley.

Appendices

Appendix A: R-code for fitting Individual Outcome Model

## Functions used in the analysis ##
geodist <- function(adj) {
  # Derive the geodesic distance between the actors
  nr <- nrow(adj) # number of actors
  dist <- matrix(0,nr,nr)
  matpow <- diag(1,nr)
  for (k in 1:nr) {
    matpow <- matpow %*% adj # entry (i,j) > 0 if a walk of length k exists
    ind <- ifelse(dist>0,1,0) # pairs that already have a distance
    dist <- dist*ind+k*ifelse(matpow>0,1,0)*(1-ind) # record k for newly reached pairs
  }
  ind <- ifelse(dist>0,1,0)
  dist <- dist*ind-1*(1-ind) # -1 indicates not connected
  diag(dist) <- 0 # Always 0 distance on the diagonal!
  return(dist)
}
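
The matrix-power idea behind geodist can be checked on a toy graph. The sketch below is a self-contained restatement, not the authors' code: the function name geodist_toy and the four-actor network are made up for illustration.

```r
# Hypothetical restatement of the matrix-power approach: the geodesic distance
# from i to j is the smallest k for which (adj^k)[i, j] > 0.
geodist_toy <- function(adj) {
  n <- nrow(adj)
  dist <- matrix(0, n, n)
  matpow <- diag(1, n)
  for (k in 1:n) {
    matpow <- matpow %*% adj              # entry (i, j) counts walks of length k
    newly <- (matpow > 0) & (dist == 0)   # pairs reached for the first time
    dist[newly] <- k
  }
  dist[dist == 0] <- -1                   # pairs that are never reached
  diag(dist) <- 0
  dist
}

# Directed path 1 -> 2 -> 3, plus an isolated actor 4 (illustrative data)
adj <- matrix(0, 4, 4)
adj[1, 2] <- 1
adj[2, 3] <- 1
d <- geodist_toy(adj)
print(d)
```

Because the distances are directed, actor 3 is reachable from actor 1 but not the reverse, so the resulting matrix is asymmetric.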

like.auto <- function(al,y,x,w) {
  # Evaluate log-likelihood function of outcome autocorrelation model
  n <- nrow(x)
  icov <- diag(1,n) - al*w
  z <- icov %*% y
  icov <- t(icov) %*% icov
  be <- solve(t(x) %*% x) %*% (t(x) %*% z) # coefficients of exogenous variables in IV model
  ri <- z - x %*% be # residuals
  std2 <- as.vector(t(ri) %*% ri)/n # variance of outcome
  icov <- icov/std2
  return(log(det(icov)) - (t(ri) %*% ri)/std2)
}
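
Since like.auto evaluates the likelihood at a given value of the autocorrelation parameter, with β and σ² profiled out, one plausible way to obtain the estimate is a one-dimensional search. The sketch below assumes simulated data and uses base R's optimize(); the function prof_ll, the simulated network, and the true parameter values are all illustrative assumptions, and the appendix does not state which optimizer the authors used.

```r
set.seed(1)
n <- 40
# Hypothetical random directed network, row-standardized
adj <- matrix(rbinom(n * n, 1, 0.15), n, n)
diag(adj) <- 0
rs <- rowSums(adj)
w <- adj / ifelse(rs > 0, rs, 1)      # rows with no ties stay all zero
x <- cbind(1, rnorm(n))
alpha_true <- 0.4
beta_true <- c(1, 2)
# y = alpha*W*y + X*beta + e  is equivalent to  y = (I - alpha*W)^{-1} (X*beta + e)
y <- solve(diag(n) - alpha_true * w, x %*% beta_true + rnorm(n))

prof_ll <- function(al, y, x, w) {
  n <- nrow(x)
  a <- diag(n) - al * w
  z <- a %*% y
  ri <- z - x %*% solve(crossprod(x), crossprod(x, z))  # OLS residuals of z on x
  s2 <- sum(ri^2) / n
  # same quantity as like.auto up to an additive constant, using the
  # log-determinant of (I - al*W) directly for numerical stability
  2 * as.numeric(determinant(a, logarithm = TRUE)$modulus) - n * log(s2)
}

fit <- optimize(prof_ll, interval = c(-0.99, 0.99),
                y = y, x = x, w = w, maximum = TRUE)
print(fit$maximum)
```

The search interval is kept inside (−1, 1) because a row-standardized W has spectral radius at most 1, so I − αW is guaranteed to be invertible there.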

parSE.auto <- function(al,y,x,w) {
  # Standard errors of outcome autocorrelation model
  n <- nrow(x)
  p <- ncol(x)
  icov1 <- diag(1,n) - al*w
  icov <- t(icov1) %*% icov1
  z <- icov1 %*% y
  be.be <- (t(x) %*% x)
  be <- solve(be.be) %*% (t(x) %*% z) # coefficients of x
  ri <- z - x %*% be # residuals
  std2 <- as.vector(t(ri) %*% ri)/n # variance of outcome
  G <- w %*% solve(icov1)
  Gxb <- G %*% x %*% be
  # The following derivation is based on Harville (1997, p. 309).
  cov <- solve(icov)
  wsym <- t(w)+w
  wsqr <- t(w) %*% w
  dicov.al <- 2*al*wsqr-wsym
  dicov.alal <- 2*wsqr
  t.alal <- cov %*% dicov.alal
  t.al <- cov %*% dicov.al
  detterm <- sum(diag(t.alal - (t.al %*% t.al)))/2
  # Information matrix - extension of expression on page 370 of
  # Doreian (1981) to asymmetric W
  be.be <- be.be/std2
  std2.std2 <- n/(2*(std2^2))
  al.be <- (t(Gxb) %*% x)/std2
  al.std2 <- sum(diag(G))/std2
  al.al <- sum(diag(t(G) %*% G))+(t(Gxb) %*% Gxb)/std2 - detterm
  Infm <- matrix(0,p+2,p+2)
  Infm[1:p,1:p] <- be.be
  Infm[p+2,1:p] <- al.be
  Infm[1:p,p+2] <- t(al.be)
  Infm[p+1,p+1] <- std2.std2
  Infm[p+1,p+2] <- al.std2
  Infm[p+2,p+1] <- al.std2
  Infm[p+2,p+2] <- al.al
  # Estimates and covariance matrix
  parms <- c(be=be, std2=std2, al=al)
  covar <- solve(Infm)
  SE <- sqrt(diag(covar))
  names(SE) <- names(parms) # label standard errors to match the estimates
  tval <- parms/SE
  pval <- 2*(1-pt(abs(tval),n-p))
  return(list(estimates=parms,SE=SE,tval=tval,pval=pval))
}

like.sar <- function(rho,y,x,w) {
  # Evaluate log-likelihood function
  n <- nrow(x)
  icov <- diag(1,n) - rho*w
  icov <- t(icov) %*% icov
  be <- solve(t(x) %*% icov %*% x) %*% (t(x) %*% icov %*% y) # coefficients of exogenous variables in IV model
  ri <- y - x %*% be # residuals
  std2 <- as.vector(t(ri) %*% icov %*% ri)/n # variance of outcome
  icov <- icov/std2
  return(log(det(icov)) - t(ri) %*% icov %*% ri)
}
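
The log-determinant in like.sar is only defined when I − ρW is nonsingular, which fails exactly when 1/ρ equals a real nonzero eigenvalue of W. The following sketch, on a made-up four-actor weight matrix, shows one way to locate those singular points before choosing a search range for ρ; the variable names are illustrative assumptions and this check does not appear in the original appendix.

```r
# Toy directed network (illustrative data), row-standardized;
# actor 4 has outdegree 0, so its row of W stays all zero.
adj <- matrix(c(0, 1, 1, 0,
                1, 0, 0, 0,
                0, 1, 0, 1,
                0, 0, 0, 0), nrow = 4, byrow = TRUE)
rs <- rowSums(adj)
w <- adj / ifelse(rs > 0, rs, 1)

ev <- eigen(w, only.values = TRUE)$values
real_ev <- Re(ev[abs(Im(ev)) < 1e-8])   # only real eigenvalues give real singular points
# I - rho*W is singular exactly when rho = 1/lambda for a nonzero eigenvalue lambda
singular_rho <- 1 / real_ev[abs(real_ev) > 1e-8]
print(singular_rho)

# A nonnegative W with row sums at most 1 has spectral radius at most 1,
# so every singular point lies outside the open interval (-1, 1).
safe <- max(Mod(ev)) <= 1 + 1e-8
print(safe)
```

For an asymmetric W, eigen() may return complex eigenvalues, which is why the sketch filters on the imaginary part before inverting.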

  • parSE.sar <- function(rho,y,x,w) {

  • n <- nrow(x)

  • p <- ncol(x)

  • icov1 <- diag(1,n) - rho*w

  • icov <- t(icov1) %*% icov1

  • be.be <- (t(x) %*% icov %*% x)

  • be <- solve(be.be) %*% (t(x) %*% icov %*% y) #coefficients of x

  • ri <- y - x %*% be #residuals

  • std2 <- as.vector(t(ri) %*% icov %*% ri)/n #variance of outcome

  • G <- w %*% solve(icov1)

  • #The following computation is based on Harville ( 1997 , p. 309).

  • cov <- solve(icov)

  • wsym <- t(w)+w

  • wsqr <- t(w) %*% w

  • dicov.rho <- 2*rho*wsqr-wsym

  • dicov.rhorho <- 2*wsqr

  • t.rhorho <- cov %*% dicov.rhorho

  • t.rho <- cov %*% dicov.rho

  • detterm <- sum(diag(t.rhorho - (t.rho %*% t.rho)))/2

  • #Information matrix - extension of expression on page 366 of

  • #Waller and Gotway ( 2004 ) to asymmetric W

  • be.be <- be.be/std2

  • std2.std2 <- n/(2*(std2^2))

  • rho.std2 <- sum(diag(G))/std2

  • rho.rho <- sum(diag(t(G) %*% G)) - detterm

  • Infm <- matrix(0,p+2,p+2)

  • Infm[1:p,1:p] <- be.be

  • Infm[p+1,p+1] <- std2.std2

  • Infm[p+1,p+2] <- rho.std2

  • Infm[p+2,p+1] <- rho.std2

  • Infm[p+2,p+2] <- rho.rho

  • #Estimates and covariance matrix

  • parms <- c(be=be, std2=std2, rho=rho)

  • covar <- solve(Infm)

  • SE <- sqrt(diag(covar))

  • names(SE) <- names(parms)

  • tval=parms/SE

  • pval=2*(1-pt(abs(tval),n-p))

  • return(list(estimates=parms,SE=SE,tval=tval,pval=pval))

  • }
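
The likelihood functions above evaluate the log-determinant term as log(det(icov)), which can underflow or overflow when the network is large. The following is a minimal sketch, not part of the original appendix, of an alternative that uses base R's determinant() with logarithm = TRUE. The toy weight matrix, the value of rho, and all variable names are illustrative assumptions.

```r
# Sketch: three equivalent ways to obtain log det((I - rho*W)' (I - rho*W)).
# The weight matrix below is simulated purely for illustration.
set.seed(1)
n <- 6
w <- matrix(rbinom(n * n, 1, 0.4), n, n)
diag(w) <- 0                        # no self-ties
rs <- rowSums(w)
w <- w / ifelse(rs > 0, rs, 1)      # row-standardize; all-zero rows stay zero

rho <- 0.3                          # assumed autocorrelation value
a <- diag(n) - rho * w
icov <- crossprod(a)                # same as t(a) %*% a

# Direct form, as in the listing above
ld_direct <- log(det(icov))

# Log-scale form: avoids forming det() on the raw scale
ld_stable <- as.numeric(determinant(icov, logarithm = TRUE)$modulus)

# Factor form: log det(A'A) = 2 * log|det(A)|
ld_factor <- 2 * as.numeric(determinant(a, logarithm = TRUE)$modulus)

print(c(direct = ld_direct, stable = ld_stable, factor = ld_factor))
```

For a 33-node network the direct form is adequate; the log-scale forms matter mainly when n is in the hundreds or more.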

  • ## Code for loading data and fitting model ##

  • source("NetAnalFns.r")

  • #Load network

  • nr <- 33

  • adjdata <- scan("your_network_file.txt")

  • adjdata <- matrix(adjdata,nrow=nr,byrow=T)

  • adjdata[19,19] <- 0 #Tidy up data (self-connected nodes not possible)

  • iddata <- seq(1,33,1)

  • #Load node covariate data

  • covdata <- scan("your_covariate_file.txt")

  • covdata <- matrix(covdata,nrow=nr,byrow=T)

  • nodecov <- list(male=covdata[,2], whexpert=covdata[,3], pctwom=covdata[,4],

  • numsess=covdata[,5], practice=covdata[,6])

  • #Directed network

  • adjdir <- ifelse(adjdata>0,1,0)

  • #Standardize adjacency matrix so that row sums=1

  • numalters <- apply(adjdir,1,sum)

  • scale=as.vector(numalters^(-1))

  • noalters <- ifelse(is.infinite(scale)==1,1,0)

  • scale[noalters*seq(1,nr)] <- 0

  • on=matrix(1,ncol=nr,nrow=1)

  • wtadjdir <- adjdir * (scale %*% on)

  • #Compute mean value of dependent variables for directly connected

  • # actors - i.e., based on the adjacency matrix

  • hrtalt <- wtadjdir %*% as.vector(covdata$sumhrt)

  • regdata <- data.frame(covdata,hrtalt=hrtalt,noalters=noalters,numalters=numalters)

  • #Compute scaled geodesic distances having row sums equal to 1

  • gdist <- geodist(adjdir)

  • igdist <- gdist^(-1)

  • diag(igdist) <- 0 #Always 0 distance on the diagonal!

  • ind <- ifelse(igdist<0,1,0) #Egos with no influential conversations

  • igdist <- igdist*(1-ind)

  • sumigeo <- apply(igdist,1,sum)

  • scale <- as.vector(sumigeo^(-1))

  • noalters <- ifelse(is.infinite(scale)==1,1,0)

  • scale[noalters*seq(1,nr)] <- 0 #Influence set equal to 0 if no influential conversations

  • wtigeo <- igdist * (scale %*% on)

  • #Augment analysis dataset with additional variables

  • hrtgeo <- wtigeo %*% as.vector(covdata$sumhrt)

  • regdata <- data.frame(regdata,hrtgeo=hrtgeo)

  • on <- as.vector(rep(1,nr))

  • x <- as.matrix(cbind(on,regdata[,c("male","pctwom","numalters")]))

  • #Use maximum likelihood to obtain MLEs for autoregressive outcomes and network autocorrelation (SAR) models

  • # - uses optim optimization function in R.

  • strtval <- 0

  • #Fit autoregressive outcomes model

  • #Use scaled adjacency matrix as weight matrix

  • mle <- optim(par=strtval, fn=like.auto, gr=NULL, method="BFGS",

  • control=list(fnscale=-1, trace=6, maxit=100, reltol=1e-16),

  • hessian=TRUE, y=regdata$sumhrt,x=x,w=wtadjdir)

  • #Compute estimates of all parameters and standard errors

  • Auto.adj <- parSE.auto(mle$par,y=regdata$sumhrt,x=x,w=wtadjdir)

  • #Use scaled geodesic matrix as weight matrix

  • mle <- optim(par=strtval, fn=like.auto, gr=NULL, method="BFGS",

  • control=list(fnscale=-1, trace=6, maxit=100, reltol=1e-16),

  • hessian=TRUE, y=regdata$sumhrt,x=x,w=wtigeo)

  • #Compute estimates of all parameters and standard errors

  • Auto.geo <- parSE.auto(mle$par,y=regdata$sumhrt,x=x,w=wtigeo)

  • #Fit corresponding SAR models

  • #Use scaled adjacency matrix as weight matrix

  • mle <- optim(par=strtval, fn=like.sar, gr=NULL, method="BFGS",

  • control=list(fnscale=-1, trace=6, maxit=100, reltol=1e-16),

  • hessian=TRUE, y=regdata$sumhrt,x=x,w=wtadjdir)

  • #Compute estimates of all parameters and standard errors

  • SAR.adj <- parSE.sar(mle$par,y=regdata$sumhrt,x=x,w=wtadjdir)

  • #Use scaled geodesic matrix as weight matrix

  • mle <- optim(par=strtval, fn=like.sar, gr=NULL, method="BFGS",

  • control=list(fnscale=-1, trace=6, maxit=100, reltol=1e-16),

  • hessian=TRUE, y=regdata$sumhrt,x=x,w=wtigeo)

  • #Compute estimates of all parameters and standard errors

  • SAR.geo <- parSE.sar(mle$par,y=regdata$sumhrt,x=x,w=wtigeo)

  • #Print out all results

  • print(data.frame(Auto.adj))

  • print(data.frame(Auto.geo))

  • print(data.frame(SAR.adj))

  • print(data.frame(SAR.geo))

  • ## Alternative code using the lnam function in Carter Butts's sna package ##

  • library(sna)

  • library(numDeriv)

  • lnam1.adj <- lnam(regdata$sumhrt,x,wtadjdir)

  • lnam1.geo <- lnam(regdata$sumhrt,x,wtigeo)

  • lnam2.adj <- lnam(regdata$sumhrt,x,NULL,wtadjdir)

  • lnam2.geo <- lnam(regdata$sumhrt,x,NULL,wtigeo)

  • #Print out all results

  • summary(lnam1.adj)

  • summary(lnam1.geo)

  • summary(lnam2.adj)

  • summary(lnam2.geo)
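
The row-standardization step in the script above (binarize the adjacency matrix, divide each row by its number of alters, and set the weights of isolates to zero) can be sketched compactly in base R. This is an illustrative sketch under stated assumptions: the helper name row_standardize, the 4-node toy matrix, and the outcome vector y are hypothetical and are not the physician data.

```r
# Sketch: row-standardize a directed adjacency matrix so that each row sums
# to 1, leaving rows of actors with no alters equal to 0.
row_standardize <- function(adj) {
  adj <- (adj > 0) * 1                  # binarize, keeping matrix shape
  diag(adj) <- 0                        # self-ties are not allowed
  deg <- rowSums(adj)                   # number of alters per ego
  scale <- ifelse(deg > 0, 1 / deg, 0)  # avoid 1/0 for isolates
  list(w = adj * scale,                 # vector recycling scales row i by scale[i]
       numalters = deg,
       noalters = as.integer(deg == 0))
}

# Hypothetical toy network: actor 3 names no alters
adj <- matrix(c(0, 1, 1, 0,
                1, 0, 0, 0,
                0, 0, 0, 0,
                2, 1, 1, 0), nrow = 4, byrow = TRUE)
y <- c(10, 20, 30, 40)                  # hypothetical outcome values

rs <- row_standardize(adj)
peer_mean <- as.vector(rs$w %*% y)      # mean outcome among each ego's alters
print(peer_mean)                        # 25 10 0 20
```

The resulting matrix rs$w plays the role of wtadjdir in the script, and peer_mean plays the role of hrtalt.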

Appendix B: R-code for Relational Data Model

  • #Install software on R: Can comment out after first use.

  • install.packages("statnet")

  • install.packages("coda")

  • #Attach libraries each time statnet is used

  • library(statnet)

  • library(coda)

  • #Load network

  • nr <- 33

  • adjdata <- scan("your_network_file.txt")

  • adjdata <- matrix(adjdata,nrow=nr,byrow=T)

  • adjdata[19,19] <- 0 #Tidy up data (self-connected nodes not possible)

  • iddata <- seq(1,33,1)

  • #Load node covariate data

  • covdata <- scan("your_covariate_file.txt")

  • covdata <- matrix(covdata,nrow=nr,byrow=T)

  • nodecov <- list(male=covdata[,2], whexpert=covdata[,3], pctwom=covdata[,4],

  • numsess=covdata[,5], practice=covdata[,6])

  • #Make directed network

  • pnet <- network(adjdata,directed=TRUE,matrixtype="adjacency",

  • vertex.attr=nodecov,

  • vertex.attrnames=c("male","whexpert","pctwom","numsess","practice",

  • "bcma","bima","bpp","wnhlth","numcat","pctcat"))

  • #Plot directed network

  • plot(pnet,mode="fruchtermanreingold",displaylabels=T)

  • #Fit models to directed network

  • model1a <- ergm(pnet~edges)

  • model1b <- ergm(pnet~edges+mutual)

  • model1c <- ergm(pnet~edges+mutual+ttriad, theta0="MPLE", MPLEonly=TRUE)

  • model1d <- ergm(pnet~edges+mutual +receivercov("whexpert") +

  • receivercov("pctwom")+receivercov("numsess"))

  • #Evaluate Goodness of fit with respect to distance and degrees

  • dist.gof <- gof(model1d~distance, nsim=10, verbose=T)

  • plot(dist.gof)

  • ideg.gof <- gof(model1d~idegree, nsim=10, verbose=T)

  • plot(ideg.gof)

  • odeg.gof <- gof(model1d~odegree, nsim=10, verbose=T)

  • plot(odeg.gof)

  • #Define network using mutual ties.

  • adjmut <- ifelse(adjdata>0,1,0)+ifelse(t(adjdata)>0,1,0)

  • adjmut <- ifelse(adjmut>0,1,0); #One possible definition of mutual tie

  • #Make undirected network

  • pnetmut <- network(adjmut,directed=FALSE,matrixtype="adjacency",

  • vertex.attr=nodecov,

  • vertex.attrnames=c("male","whexpert","pctwom","numsess","practice",

  • "bcma","bima","bpp","wnhlth","numcat","pctcat"))

  • #Plot undirected network

  • plot(pnetmut,mode="fruchtermanreingold",displaylabels=T)

  • #Fit models to undirected network

  • model2a <- ergm(pnetmut~edges)

  • model2b <- ergm(pnetmut~edges+gwesp(1.2,fixed=T),

  • burnin=10000, MCMCsamplesize=10000, interval=100, maxit=3)

  • model2c <- ergm(pnetmut~edges+gwesp(1.2,fixed=F)+kstar(2:3),

  • burnin=10000, MCMCsamplesize=10000, interval=100, maxit=3)

  • model2d <- ergm(pnetmut~edges+gwesp(1.2,fixed=T)+

  • nodematch("male",diff=F)+nodematch("practice",diff=F),

  • burnin=10000, MCMCsamplesize=10000, interval=100, maxit=3)

  • model2e <- ergm(pnetmut~edges+gwesp(1.2,fixed=T) +

  • nodematch("male",diff=F)+nodematch("practice",diff=T),

  • burnin=10000, MCMCsamplesize=10000, interval=100, maxit=3)

  • #Test goodness of fit with respect to espartners statistic

  • esppart.gof <- gof(model2e~espartners,nsim=10,verbose=T)

  • plot(esppart.gof)
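
Models containing only dyad-independent terms, such as edges alone or edges plus nodematch, reduce to logistic regression on dyads, so they can be checked without MCMC. The following base-R sketch illustrates this on a simulated undirected network. The data, the variable names, and the coefficient values used to simulate ties are assumptions for illustration only. Terms such as gwesp induce dependence among dyads and cannot be handled this way.

```r
# Sketch: dyad-independent exponential random graph models fitted as
# logistic regressions on the dyads of a simulated undirected network.
set.seed(42)
n <- 12
male <- rep(c(0, 1), length.out = n)        # hypothetical node attribute

# One row per unordered pair (i < j)
idx <- which(upper.tri(matrix(0, n, n)), arr.ind = TRUE)
same <- as.integer(male[idx[, 1]] == male[idx[, 2]])

# Simulate ties independently; -1 and 1 are assumed "true" coefficients
tie <- rbinom(nrow(idx), 1, plogis(-1 + 1 * same))

# Edges-only model: the estimate is the logit of the network density
fit_edges <- glm(tie ~ 1, family = binomial)
dens <- mean(tie)
print(c(glm = unname(coef(fit_edges)), logit_density = qlogis(dens)))

# Edges + homophily on "male" (analogous to nodematch with diff=F)
fit_match <- glm(tie ~ same, family = binomial)
print(coef(fit_match))
```

Comparing such estimates with the corresponding ergm output is a quick sanity check before moving to dyad-dependent specifications.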


About this article

Cite this article

O’Malley, A.J., Marsden, P.V. The analysis of social networks. Health Serv Outcomes Res Method 8, 222–269 (2008). https://doi.org/10.1007/s10742-008-0041-z
