
MDP for Query-Based Wireless Sensor Networks

Markov Decision Processes in Practice

Part of the book series: International Series in Operations Research & Management Science ((ISOR,volume 248))


Abstract

Increased sensor availability and growing interest in sensor monitoring have led to a significant increase in the number of sensor networks deployed over the last decade. Simultaneously, the amount of sensed data and the number of queries requesting this data have grown significantly. The challenge is to respond to queries in a timely manner and with relevant data, without resorting to hardware upgrades or duplication. In this chapter we focus on the trade-off between the response time of queries and the freshness of the data provided. Query response time is a key Quality of Service metric for sensor networks, especially for real-time applications. Data freshness ensures that queries are answered with relevant data that closely characterize the monitored area. To model the trade-off between the two metrics, we propose a continuous-time Markov decision process with a drift, which assigns queries for processing either to the sensor network, where queries wait to be processed, or to a central database, which provides stored and possibly outdated data. To compute an optimal query assignment policy, we formulate a discrete-time, discrete-state Markov decision process that is shown to be stochastically equivalent to the initial continuous-time process. This approach provides theoretical support for the design and implementation of WSN applications, while ensuring close-to-optimal performance of the system.


References

  1. E.B. Dynkin, Markov Processes, vol. 1 (Academic, New York, 1965)

  2. I.I. Gikhman, A.V. Skorokhod, The Theory of Stochastic Processes: II, vol. 232 (Springer, New York, 2004)

  3. A. Hordijk, R. Schassberger, Weak convergence for generalized semi-Markov processes. Stoch. Process. Appl. 12(3), 271–291 (1982)

  4. A. Hordijk, F.A. van der Duyn Schouten, Discretization and weak convergence in Markov decision drift processes. Math. Oper. Res. 9(1), 112–141 (1984)

  5. A. Jensen, Markoff chains as an aid in the study of Markoff processes. Scand. Actuar. J. 1953(sup1), 87–91 (1953)

  6. M. Mitici, M. Onderwater, M. de Graaf, J.-K. van Ommeren, N. van Dijk, J. Goseling, R.J. Boucherie, Optimal query assignment for wireless sensor networks. AEU-Int. J. Electron. Commun. 69(8), 1102–1112 (2015)

  7. A.R. Odoni, On finding the maximal gain for Markov decision processes. Oper. Res. 17(5), 857–860 (1969)

  8. M.L. Puterman, Markov Decision Processes: Discrete Stochastic Dynamic Programming (Wiley, New York, 1994)

  9. N. van Dijk, On a simple proof of uniformization for continuous and discrete-state continuous-time Markov chains. Adv. Appl. Probab. 22(3), 749–750 (1990)

  10. N. van Dijk, A. Hordijk, Time-discretization for controlled Markov processes. I. General approximation results. Kybernetika 32(1), 1–16 (1996)


Author information


Correspondence to Mihaela Mitici.


Appendices

1.1 Proof of Theorem 1

Proof (Theorem 20.1).

Let h(x) be a measurable function on some state space E of a Markov process. Let P(v, x, Ξ) be a transition function expressing the probability that a process started in state x is in the set Ξ at time v. Let \(T_{v}h(x) =\int _{E}P(v,x,dy)h(y)\) denote the shift operator on the space E. Then the operator

$$\displaystyle{\mathcal{H}h(x) =\lim _{v\rightarrow 0}\frac{T_{v}h(x) - h(x)} {v} }$$

is called the infinitesimal generator of the Markov process. The quantity \(\mathcal{H}h(x)\) can be interpreted as the mean infinitesimal rate of change of the process starting in state x. Moreover, the infinitesimal generator uniquely defines a Markov process [1, Chap. 1]. Therefore, it is sufficient to show that the infinitesimal generators of the exponentially uniformized Markov decision process and of the original continuous-time Markov decision process with a drift are identical.
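For intuition (a standard illustration, not part of the chapter's model), consider a pure-jump Markov chain on a countable state space with bounded transition rates q(x, y). In that case the limit above reduces to the familiar rate form

$$\displaystyle{\mathcal{H}h(x) =\sum _{y\neq x}q(x,y)\,[h(y) - h(x)],}$$

which is exactly the structure recovered for \(\mathcal{H}^{a}f\) at the end of this proof.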

In our setting, we consider the state x = (i, j, t). Before addressing the infinitesimal generator of the exponentially uniformized Markov process defined in Sect. 20.4, we first define the transition probability measure under action a ∈ A. Let \(\mathbf{P}_{\varDelta t}^{a}\) denote the transition probability measure over a time interval of length Δt > 0, given that at the last jump the system is in state (i, j, t) and that, following action a, the next jump occurs within the interval Δt and moves the system to a new state.

We implicitly assume that a policy π, prescribing an action a upon a query arrival when the system is in state (i, j, t), is right continuous. Since the set of decisions is finite and discrete, for any state (i, j, t) and fixed policy π there exists a Δt > 0 such that:

$$\displaystyle{\pi (i,j,t + u) =\pi (i,j,t) = a,\text{ for all }u \leq \varDelta t.}$$

Let \(f: \mathbb{N} \times \mathbb{N} \times \mathbb{R} \rightarrow \mathbb{R}\) be an arbitrary real-valued function, differentiable in t. Then, by conditioning on the exponential jump epoch with parameter χ, we obtain for such an arbitrary f,

$$\displaystyle\begin{array}{rcl} \mathbf{P}_{\varDelta t}^{a}f(i,j,t)& =& \:e^{-\varDelta t\cdot \chi }f(i,j,t +\varDelta t) {}\\ & & +\int _{0}^{\varDelta t}\chi e^{-u\chi }\sum \limits _{(i',j')}\mathbf{P}^{a}[(i,j,t),(i',j',t + u)]f(i',j',t + u)\,du + o(\varDelta t)^{2} {}\\ & =& f(i,j,t +\varDelta t) -\varDelta t\chi f(i,j,t +\varDelta t) {}\\ & & +\varDelta t\chi \sum \limits _{(i',j')\neq (i,j)}q^{a}[(i,j,t),(i',j',t)]\chi ^{-1}f(i',j',t +\varDelta t) {}\\ & & +\varDelta t\chi [1 - q^{a}(i,j)\chi ^{-1}]f(i,j,t +\varDelta t) + o(\varDelta t)^{2} {}\\ & =& f(i,j,t +\varDelta t) +\varDelta t\sum \limits _{(i',j')\neq (i,j)}q^{a}[(i,j,t),(i',j',t)]\,[f(i',j',t +\varDelta t) {}\\ & & -f(i,j,t +\varDelta t)] + o(\varDelta t)^{2}, {}\\ \end{array}$$

where we have used that q^a[(i, j, t), (i′, j′, t)] = q^a[(i, j, t + u), (i′, j′, t + u)] for any (i′, j′) ≠ (i, j) and arbitrary u ≤ Δt. The term o(Δt)² collects the probability of at least two jumps in the interval and the second-order term of the Taylor expansion of \(e^{-\varDelta t\chi }\).
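To see why the at-least-two-jumps event is negligible (a standard computation, spelled out here for completeness), note that the exponential jump epochs form a Poisson stream with parameter χ, so the probability of two or more jumps in an interval of length Δt is

$$\displaystyle{1 - e^{-\chi \varDelta t} -\chi \varDelta t\,e^{-\chi \varDelta t} = \frac{(\chi \varDelta t)^{2}}{2} + O((\varDelta t)^{3}),}$$

which vanishes after dividing by Δt and letting Δt → 0.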

Hence, by subtracting f(i, j, t), dividing by Δ t and letting Δ t → 0, we obtain,

$$\displaystyle\begin{array}{rcl} \frac{\mathbf{P}_{\varDelta t}^{a}f(i,j,t) - f(i,j,t)}{\varDelta t} & =& [f(i,j,t +\varDelta t) - f(i,j,t)]/\varDelta t {}\\ & & +\sum \limits _{(i',j')\neq (i,j)}q^{a}[(i,j,t),(i',j',t)]\,[f(i',j',t +\varDelta t) - f(i,j,t +\varDelta t)] + o(\varDelta t) {}\\ & \rightarrow & \frac{d}{dt}f(i,j,t) +\sum \limits _{(i',j')\neq (i,j)}q^{a}[(i,j,t),(i',j',t)]\,[f(i',j',t) - f(i,j,t)] {}\\ & =& \mathcal{H}^{a}f(i,j,t), {}\\ \end{array}$$

which is the generator in (20.2).

Since the exponentially uniformized Markov decision process (defined in Sect. 20.4) and the continuous-time Markov decision process with a drift (defined in Sect. 20.3) share the same generators [1], the two processes are stochastically equivalent.
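The practical upshot of this equivalence is that the optimal query assignment policy can be computed on the uniformized discrete-time chain with standard dynamic programming. The sketch below is illustrative only and not the chapter's implementation: it assumes a generic finite-state continuous-time MDP described by hypothetical per-action generator matrices Q_by_action and cost-rate vectors cost_rate_by_action, uniformizes with a constant B at least as large as the largest total outflow rate (cf. Jensen [5] and van Dijk [9]), and runs average-cost relative value iteration (cf. Puterman [8], Odoni [7]).

```python
import numpy as np

def uniformize(Q, B):
    """Jensen's uniformization: P = I + Q / B is a stochastic matrix whenever
    Q is a proper generator (rows sum to 0) and B >= max_i |Q[i, i]|."""
    return np.eye(Q.shape[0]) + Q / B

def relative_value_iteration(Q_by_action, cost_rate_by_action, B,
                             n_iter=100_000, tol=1e-10):
    """Average-cost relative value iteration on the uniformized discrete-time MDP.

    Q_by_action:         dict mapping action -> (n x n) generator matrix
    cost_rate_by_action: dict mapping action -> length-n vector of cost rates
    B:                   uniformization constant, B >= max total outflow rate
    Returns (policy, average cost per unit time, relative value function).
    """
    actions = list(Q_by_action)
    n = next(iter(Q_by_action.values())).shape[0]
    P = {a: uniformize(np.asarray(Q_by_action[a], float), B) for a in actions}
    c = {a: np.asarray(cost_rate_by_action[a], float) / B for a in actions}  # cost per step
    V = np.zeros(n)
    g_step = 0.0
    for _ in range(n_iter):
        candidates = np.stack([c[a] + P[a] @ V for a in actions])  # shape (|A|, n)
        V_new = candidates.min(axis=0)
        diff = V_new - V
        g_step = diff[0]             # per-step average cost estimate (reference state 0)
        V = V_new - V_new[0]         # renormalize so the iterates stay bounded
        if diff.max() - diff.min() < tol:   # span criterion: converged
            break
    candidates = np.stack([c[a] + P[a] @ V for a in actions])
    policy = [actions[k] for k in candidates.argmin(axis=0)]
    return policy, g_step * B, V     # multiply by B: per-step cost -> cost per unit time

# Toy example (purely illustrative, not the chapter's model): two states, two
# actions 'W' (send the query to the sensor network) and 'DB' (answer it from
# the central database), with made-up transition rates and cost rates.
Q = {
    "W":  np.array([[-2.0, 2.0], [1.0, -1.0]]),
    "DB": np.array([[-0.5, 0.5], [3.0, -3.0]]),
}
cost = {"W": [4.0, 1.0], "DB": [2.0, 5.0]}
policy, g, V = relative_value_iteration(Q, cost, B=4.0)  # B >= max_i |Q[i, i]| = 3
print("optimal action per state:", policy, " average cost:", round(g, 4))
```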

1.2 Notation

S: State space

(i, j, N): A state, given a discrete state space

C^a(i, j, N): Expected one-step cost rate in state (i, j, N), under action a

A: Set of actions available in state (i, j, N)

a: Action when in state (i, j, N), given a discrete state space

W, DB: Stationary policies

P^W, P^DB: One-step transition probability distribution/matrix under policy W, DB

P^a[(i, j, N), (i, j, N)′]: Transition probability into state (i, j, N)′, from state (i, j, N), under action a

q^a[(i, j, N), (i, j, N)′]: Transition rate from state (i, j, N) into (i, j, N)′, under action a

V_n^W(i, j, N), V_n^DB(i, j, N): Value function under policy W, DB of the expected cumulative cost over n steps

V_n(i, j, N): Optimal value function of the expected cumulative cost over n steps, starting in state (i, j, N)

g*: Optimal average expected cost function

B: Uniformization parameter

\(\mathcal{H}\): Infinitesimal generator of the Markov decision process


Copyright information

© 2017 Springer International Publishing AG

About this chapter

Cite this chapter

Mitici, M. (2017). MDP for Query-Based Wireless Sensor Networks. In: Boucherie, R., van Dijk, N. (eds) Markov Decision Processes in Practice. International Series in Operations Research & Management Science, vol 248. Springer, Cham. https://doi.org/10.1007/978-3-319-47766-4_20
