Skip to main content

Credit Assignment

  • Reference work entry
  • First Online:
Encyclopedia of Machine Learning and Data Mining
  • 280 Accesses

Synonyms

Structural credit assignment; Temporal credit assignment

Definition

When a learning system employs a complex decision process, it must assign credit or blame for the outcomes to each of its decisions. Where it is not possible to directly attribute an individual outcome to each decision, it is necessary to apportion credit and blame between each of the combinations of decisions that contributed to the outcome. We distinguish two cases in the credit assignment problem. Temporal credit assignment refers to the assignment of credit for outcomes to actions. Structural credit assignment refers to the assignment of credit for actions to internal decisions. The first subproblem involves determining when the actions that deserve credit were taken and the second involves assigning credit to the internal structure of actions (Sutton 1984).

Motivation

Consider the problem of learning to balance a pole that is hinged on a cart (Michie and Chambers 1968; Anderson and Miller 1991). The cart...

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 699.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Hardcover Book
USD 949.99
Price excludes VAT (USA)
  • Durable hardcover edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Recommended Reading

  • Albus JS (1975) A new approach to manipulator control: the cerebellar model articulation controller (CMAC). J Dyn Syst Measur Control Trans ASME 97(3):220–227

    Article  MATH  Google Scholar 

  • Anderson CW, Miller WT (1991) A set of challenging control problems. In: Miller W, Sutton RS, Werbos PJ (eds) Neural networks for control. MIT Press, Cambridge

    Google Scholar 

  • Atkeson C, Schaal S, Moore A (1997) Locally weighted learning. AI Rev 11:11–73

    Google Scholar 

  • Banerjee B, Liu Y, Youngblood GM (eds) (2006) Proceedings of the ICML workshop on “structural knowledge transfer for machine learning, Pittsburgh

    Google Scholar 

  • Barto A, Sutton R, Anderson C (1983) Neuron-like adaptive elements that can solve difficult learning control problems. IEEE Trans Syst Man Cybern SMC-13:834–846

    Article  Google Scholar 

  • Benson S, Nilsson NJ (1995) Reacting, planning and learning in an autonomous agent. In: Furukawa K, Michie D, Muggleton S (eds) Machine intelligence, vol 14. Oxford University Press, Oxford

    Google Scholar 

  • Bertsekas DP, Tsitsiklis J (1996) Neuro-dynamic programming. Athena Scientific, Nashua

    MATH  Google Scholar 

  • Caruana R (1997) Multitask learning. Mach Learn 28:41–75

    Article  Google Scholar 

  • Dejong G, Mooney R (1986) Explanation-based learning: an alternative view. Mach Learn 1:145–176

    Google Scholar 

  • Goldberg DE (1989) Genetic algorithms in search, optimization and machine learning. Addison-Wesley Longman Publishing, Boston

    MATH  Google Scholar 

  • Grefenstette JJ (1988) Credit assignment in rule discovery systems based on genetic algorithms. Mach Learn 3(2–3):225–245

    Google Scholar 

  • Hinton G, Rumelhart D, Williams R (1985) Learning internal representation by back-propagating errors. In: Rumelhart D, McClelland J, Group TPR (eds) Parallel distributed computing: explorations in the microstructure of cognition, vol 1. MIT Press, Cambridge, pp 31–362

    Google Scholar 

  • Holland J (1986) Escaping brittleness: the possibilities of general-purpose learning algorithms applied to parallel rule-based systems. In: Michalski RS, Carbonell JG, Mitchell TM (eds) Machine learning: an artificial intelligence approach, vol 2. Morgan Kaufmann, Los Altos

    Google Scholar 

  • Laird JE, Newell A, Rosenbloom PS (1987) SOAR: an architecture for general intelligence. Artif Intell 33(1):1–64

    Article  Google Scholar 

  • Mahadevan S (2009) Learning representation and control in Markov decision processes: new frontiers. Found Trends Mach Learn 1(4):403–565

    Article  MATH  Google Scholar 

  • Michie D, Chambers R (1968) Boxes: an experiment in adaptive control. In: Dale E, Michie D (eds) Machine intelligence, vol 2. Oliver and Boyd, Edinburgh

    Google Scholar 

  • Minsky M (1961) Steps towards artificial intelligence. Proc IRE 49:8–30

    Article  MathSciNet  Google Scholar 

  • Mitchell TM, Keller RM, Kedar-Cabelli ST (1986) Explanation based generalisation: a unifying view. Mach Learn 1:47–80

    Google Scholar 

  • Mitchell TM, Utgoff PE, Banerji RB (1983) Learning by experimentation: acquiring and refining problem-solving heuristics. In: Michalski R, Carbonell J, Mitchell T (eds) Machine kearning: an artificial intelligence approach. Tioga, Palo Alto

    Google Scholar 

  • Moore AW (1990) Efficient memory-based learning for robot control. Ph.D. thesis, UCAM-CL-TR-209, Computer Laboratory, University of Cambridge, Cambridge

    Google Scholar 

  • Niculescu-mizil A, Caruana R (2007) Inductive transfer for Bayesian network structure learning. In: Proceedings of the 11th international conference on AI and statistics (AISTATS 2007), San Juan

    Google Scholar 

  • Reid MD (2004) Improving rule evaluation using multitask learning. In: Proceedings of the 14th international conference on inductive logic programming, Porto, pp 252–269

    Google Scholar 

  • Reid MD (2007) DEFT guessing: using inductive transfer to improve rule evaluation from limited data. Ph.D. thesis, School of Computer Science and Engineering, The University of New South Wales, Sydney

    Google Scholar 

  • Rosenblatt F (1962) Principles of neurodynamics: perceptrons and the theory of Brain mechanics. Spartan Books, Washington, DC

    MATH  Google Scholar 

  • Samuel A (1959) Some studies in machine learning using the game of checkers. IBM J Res Develop 3(3):210–229

    Article  MathSciNet  Google Scholar 

  • Silver D, Bakir G, Bennett K, Caruana R, Pontil M, Russell S et al (2005) NIPS workshop on “inductive transfer: 10 years later”, Whistler

    Google Scholar 

  • Sutton R (1984) Temporal credit assignment in reinforcement learning. Ph.D. thesis, Department of Computer and Information Science, University of Massachusetts, Amherst

    Google Scholar 

  • Sutton R, Barto A (1998) Reinforcement learning: an introduction. MIT Press, Cambridge

    Google Scholar 

  • Taylor ME, Stone P (2009) Transfer learning for reinforcement learning domains: a survey. J Mach Learn Res 10:1633–1685

    MathSciNet  MATH  Google Scholar 

  • Wang X, Simon HA, Lehman JF, Fisher DH (1996) Learning planning operators by observation and practice. In: Proceedings of the second international conference on AI planning systems (AIPS-94), Chicago, pp 335–340

    Google Scholar 

  • Watkins C (1989) Learning with delayed rewards. Ph.D. thesis, Psychology Department, University of Cambridge, Cambridge

    Google Scholar 

  • Watkins C, Dayan P (1992) Q-learning. Mach Learn 8(3–4):279–292

    MATH  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2017 Springer Science+Business Media New York

About this entry

Cite this entry

Sammut, C. (2017). Credit Assignment. In: Sammut, C., Webb, G.I. (eds) Encyclopedia of Machine Learning and Data Mining. Springer, Boston, MA. https://doi.org/10.1007/978-1-4899-7687-1_185

Download citation

Publish with us

Policies and ethics