Information Dynamics and Aspects of Musical Perception

The Structure of Style

Abstract

Musical experience has often been suggested to be related to the forming of expectations and their fulfillment or denial. In terms of information theory, expectancies and predictions serve to reduce uncertainty about the future and might be used to efficiently represent and “compress” data. In this chapter we present an information-theoretic model of musical listening based on the idea that expectations arising from past musical material frame our appraisal of what comes next, and that this process eventually results in the creation of emotions or feelings. Using a notion of “information rate”, we measure the amount of information shared between past and present in the musical signal on different time scales, using statistics of sound spectral features. Several musical pieces are analyzed in terms of short- and long-term information rate dynamics and compared to analyses of musical form and its structural functions. The findings suggest that a relation exists between information dynamics and musical structure, one that eventually shapes the human listening experience and feelings such as “wow” and “aha”.

Notes

  1. This is known as relative entropy, or Kullback-Leibler distance.

  2. This derivation summarizes a derivation that appeared in [5] and corrects a sign error in it.

  3. If a single observation carries a lot of information about the model, then \(I(x_n,\theta) \approx H(\theta)\) and model-IR becomes \(H(\theta) - E[D(\theta||\theta^*)]\).

  4. Note that, due to the indexing convention chosen for the transition matrix, the Markov process operates by left-side matrix multiplication. The stationary vector is then a left (row) eigenvector with an eigenvalue equal to one.

  5. The smoothing was done using a linear-phase low-pass filter with a frequency cutoff at 0.3 of the window advance rate.

  6. http://www.classicalarchives.com/artist/4028.html

  7. The dramatic power of a silence is of course well known to performers, who create suspense by delaying a continuation. What is new here is the fact that this effect is captured by SR in terms of the introduction of new spectral content.

  8. The partition function gives a measure of the spread of configurations of the different signal types.
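The convention in Note 4 can be illustrated with a minimal sketch. It assumes that entry `P[i][j]` of the transition matrix is the probability of moving from state `i` to state `j`, so that rows sum to one and a distribution evolves as a row vector multiplied from the left. That indexing, the two-state matrix, and the function name `stationary_distribution` are illustrative assumptions, not taken from the chapter.

```python
def stationary_distribution(P, iterations=1000):
    """Approximate the left (row) eigenvector pi with pi P = pi by power iteration.

    Assumes P[i][j] = probability of moving from state i to state j,
    so every row of P sums to one (an illustrative convention).
    """
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(iterations):
        # Row vector times matrix: new_pi[j] = sum_i pi[i] * P[i][j]
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


# Hypothetical two-state chain used only for illustration.
P = [[0.9, 0.1],
     [0.5, 0.5]]
pi = stationary_distribution(P)
print(pi)
```

For this example the balance condition 0.1·π₀ = 0.5·π₁ gives π = (5/6, 1/6), which the iteration approaches because the second eigenvalue of the matrix has magnitude below one.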

References

  1. Berns G (2005) Satisfaction: the science of finding true fulfillment. Henry Holt, New York

  2. Burns K (2006) Bayesian beauty: on the art of eve and the act of enjoyment. In: Proceedings of the AAAI06 workshop on computational aesthetics

  3. Casey MA (2001) MPEG-7 sound recognition tools. IEEE Trans Circuits Syst Video Technol 11(6):737–747

  4. Csikszentmihalyi M (1990) Flow: the psychology of optimal experience. Harper & Row, New York, NY

  5. Dubnov S (2008) Unified view of prediction and repetition structure in audio signals with application to interest point detection. IEEE Trans Audio Speech Lang Process 16(2):327–337

  6. Dubnov S (2006) Analysis of musical structure in audio and MIDI using information rate. In: Proceedings of the international computer music conference, New Orleans, LA

  7. Dubnov S (2006) Spectral anticipations. Comput Music J 30(2):63–83

  8. Foote J, Cooper M (2001) Visualizing musical structure and rhythm via self-similarity. In: Proceedings of the international computer music conference, pp 419–422

  9. Fraisse P (1957) Psychologie du temps [Psychology of time]. Presses Universitaires de France, Paris

  10. Goffman E (1974) Frame analysis: an essay on the organization of experience. Harvard University Press, Cambridge, MA

  11. Huron D (2006) Sweet anticipation: music and the psychology of expectation. MIT Press, Cambridge, MA

  12. Kohs EB (1976) Musical form: studies in analysis and synthesis. Houghton Mifflin, Boston, MA

  13. Lewicki MS (2002) Efficient coding of natural sounds. Nature Neurosci 5(4):356–363

  14. McAdams S (1989) Psychological constraints on form-bearing dimensions in music. Contemp Music Rev 4:181–198

  15. Moreau N, Kim H, Sikora T (2004) Audio classification based on MPEG-7 spectral basis representations. IEEE Trans Circuits Syst Video Technol 14(5):716–725

  16. Narmour E (1990) The analysis and cognition of basic melodic structures: the implication-realization model. University of Chicago Press, Chicago, IL

  17. Nemenman I, Bialek W, Tishby N (2001) Predictability, complexity and learning. Neural Comput 13:2409–2463

  18. Oppenheim AV, Schafer RW (1989) Discrete-time signal processing. Prentice Hall, Upper Saddle River, NJ

  19. Reynolds R (2005) Mind models. Routledge, New York, NY

  20. Reynolds R (2005) Form and method: composing music. Routledge, New York, NY

  21. Reynolds R, Dubnov S, McAdams S (2006) Structural and affective aspects of music from statistical audio signal analysis. J Am Soc Inf Sci Technol 57(11):1526–1536

  22. Stein L (1962) Structure and style: the study and analysis of musical forms. Summy-Birchard, Evanston, IL

  23. Tversky A, Kahneman D (1981) The framing of decisions and the psychology of choice. Science 211:453–458

  24. Zhang H-J, Lu L, Wenyin L (2004) Audio textures: theory and applications. IEEE Trans Speech Audio Process 12(2):156–167

Author information

Corresponding author

Correspondence to Shlomo Dubnov.

Appendix

The probability of the observations (data) depends on a parameter θ that describes the distribution, and on the probability of occurrence of the parameter itself:

$$P(x_1,\ldots,x_n) = \int P(x_1,\ldots,x_n| \theta) P (\theta) d\theta. \tag{7.7}$$
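As a minimal numerical sketch of (7.7), the integral over θ can be replaced by a sum over a grid of parameter values. The Bernoulli observation model, the uniform prior, the grid size, and the function names below are all illustrative assumptions and are not taken from the chapter.

```python
def bernoulli_likelihood(xs, theta):
    """P(x_1, ..., x_n | theta) for i.i.d. binary observations (assumed model)."""
    k = sum(xs)  # number of ones in the sequence
    return theta ** k * (1.0 - theta) ** (len(xs) - k)


def marginal_likelihood(xs, thetas, prior):
    """Discrete stand-in for the integral in (7.7): sum of P(x | theta) P(theta)."""
    return sum(bernoulli_likelihood(xs, t) * p for t, p in zip(thetas, prior))


grid = 1000
thetas = [(i + 0.5) / grid for i in range(grid)]  # midpoints of [0, 1]
prior = [1.0 / grid] * grid                       # uniform prior weights (assumption)
xs = [1, 0, 1, 1]                                 # hypothetical observations

evidence = marginal_likelihood(xs, thetas, prior)
print(evidence)
```

With a uniform prior and three ones out of four observations, the exact integral is the Beta function B(4, 2) = 1/20, so the grid sum should land close to 0.05.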

Considering an approximation of the probability around an empirical distribution θ [17],

$$\begin{array}{lll} P(x_1^n) & =& P(x_1^n | \theta) \int \frac{P\left(x_1^n | \alpha \right)}{P\left(x_1^n | \theta\right)} P(\alpha) d\alpha \\ &=& P(x_1^n | \theta) \int \exp\left[- \log \frac{P\left(x_1^n | \theta \right)}{P\left(x_1^n | \alpha \right)} \right]P(\alpha) d \alpha \\ &\approx& P(x_1^n | \theta) \int e^{-n D(\theta ||\alpha)}P(\alpha) d\alpha,\end{array} \tag{7.8}$$

the entropy of a block of samples can be written in terms of the conditional entropy given the model θ and the logarithm of the partition function (see Note 8) \(Z_n(\theta) = \int P(\alpha) e^{-nD(\theta||\alpha)} d\alpha\),

$$\begin{array}{lll} H(x_1^n) & = & -\int P(\theta)\int P\left(x_1^n | \theta\right) \log P\left(x_1^n | \theta \right) dx_1^n d \theta \\ & & - \int P(\theta) \int P(x_1^n | \theta) \log \left[\int e^{-nD(\theta||\alpha)} P(\alpha) d \alpha\right] dx_1^n d \theta \\ & = & H(x_1^n | \theta) - E_\theta[\log Z_n(\theta)],\end{array} \tag{7.9}$$

where we used the fact that \(\int P(x_1^n | \theta) dx_1^n = 1\) independent of θ. With the entropy of a single observation expressed in terms of conditional entropy and mutual information,

$$H(x_n) = H_\theta(x_n | \theta) + I(x_n,\theta), \tag{7.10}$$

we express IR in terms of data, model and configuration factors

$$\begin{array}{lll} \rho(x_1^n) & = & H(x_n) + H(x_1^{n-1}) - H(x_1^n) \\ & = & \rho_{\theta}(x_1^n) + I(x_n,\theta) + E_\theta\left[\log \frac{Z_{n}(\theta)}{Z_{n-1}(\theta)}\right].\end{array} \tag{7.11}$$
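The first line of (7.11) can be evaluated directly for a small discrete process. The sketch below assumes a stationary first-order Markov chain over a few states, started in its stationary distribution, and computes block entropies by brute-force enumeration. The chain, the use of bits, and the function names are illustrative assumptions; the sketch does not perform the model-based decomposition in the second line.

```python
import math
from itertools import product


def block_entropy(pi, P, n):
    """H(x_1^n) in bits for a Markov chain with initial distribution pi
    and transition matrix P[i][j] = P(next = j | current = i)."""
    if n == 0:
        return 0.0
    entropy = 0.0
    for seq in product(range(len(pi)), repeat=n):
        p = pi[seq[0]]
        for a, b in zip(seq, seq[1:]):
            p *= P[a][b]
        if p > 0.0:
            entropy -= p * math.log2(p)
    return entropy


def information_rate(pi, P, n):
    """rho(x_1^n) = H(x_n) + H(x_1^{n-1}) - H(x_1^n).
    Stationarity is assumed, so H(x_n) equals the single-symbol entropy."""
    return block_entropy(pi, P, 1) + block_entropy(pi, P, n - 1) - block_entropy(pi, P, n)


# Hypothetical chains used only for illustration.
sticky_P = [[0.9, 0.1], [0.5, 0.5]]
sticky_pi = [5.0 / 6.0, 1.0 / 6.0]   # stationary distribution of sticky_P
coin_P = [[0.5, 0.5], [0.5, 0.5]]    # i.i.d. fair coin: the past tells nothing
coin_pi = [0.5, 0.5]

rho_sticky = information_rate(sticky_pi, sticky_P, 3)
rho_coin = information_rate(coin_pi, coin_P, 3)
print(rho_sticky, rho_coin)
```

For a memoryless source the past carries no information about the present, so the information rate is zero, while any chain whose rows differ yields a positive value.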

Assuming that the space of models comprises several peaks centered around distinct parameter values, the partition function \(Z_n(\theta)\) can be written through Laplace’s method of saddle point approximation in terms of a function proportional to its argument at an extremal value \(\theta = \theta^*\). This allows writing the log-ratio term on the right-hand side of (7.11) as

$$\log \frac{Z_{n}(\theta)}{Z_{n-1}(\theta)} \approx -D(\theta||\theta^*), \tag{7.12}$$

resulting in the equation for model-based IR

$$\rho(x_1^n) \approx \rho_{\theta}(x_1^n)+ I(x_n,\theta) - E_\theta[D(\theta||\theta^*)]. \tag{7.13}$$
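The limit behind (7.12) can be checked numerically in a simplified setting. The sketch below replaces the peaked prior by point masses at a few parameter values, which is a cruder assumption than the Laplace approximation described above, and uses Bernoulli distributions so that the divergence has a closed form. The peak locations, weights, and function names are hypothetical.

```python
import math


def kl_bernoulli(p, q):
    """D(p || q) in nats between two Bernoulli distributions."""
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))


def log_partition(theta, peaks, weights, n):
    """log Z_n(theta) when the prior P(alpha) is a sum of point masses (assumption)."""
    return math.log(sum(w * math.exp(-n * kl_bernoulli(theta, a))
                        for a, w in zip(peaks, weights)))


theta = 0.3
peaks = [0.25, 0.7]    # hypothetical model peaks
weights = [0.5, 0.5]   # hypothetical prior mass on each peak
n = 200

log_ratio = log_partition(theta, peaks, weights, n) - log_partition(theta, peaks, weights, n - 1)
d_star = min(kl_bernoulli(theta, a) for a in peaks)  # divergence to the closest peak
print(log_ratio, -d_star)
```

For large n the sum is dominated by the peak closest to θ in the divergence sense, so the log ratio approaches the negative divergence to that peak, in line with (7.12) when θ* is taken to be that peak.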

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Dubnov, S. (2010). Information Dynamics and Aspects of Musical Perception. In: Argamon, S., Burns, K., Dubnov, S. (eds) The Structure of Style. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12337-5_7

  • DOI: https://doi.org/10.1007/978-3-642-12337-5_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-12336-8

  • Online ISBN: 978-3-642-12337-5

  • eBook Packages: Computer Science, Computer Science (R0)
