Pascal and Francis Bibliographic Databases


Search results

Your search: kw.\*:("Policy iteration")


Results 1 to 25 of 40
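The query term refers to the classic dynamic-programming method that alternates exact policy evaluation with greedy policy improvement. As a quick orientation for the results below, here is a minimal sketch on a hypothetical two-state, two-action MDP; the transition matrices, rewards, and discount factor are invented for illustration and are not taken from any listed paper.

```python
# Minimal policy iteration on a toy 2-state, 2-action MDP.
# All transition probabilities, rewards, and the discount factor
# are invented for illustration only.
import numpy as np

n_states, n_actions, gamma = 2, 2, 0.9

# P[a][s, s'] = transition probability under action a; R[s, a] = expected reward.
P = [np.array([[0.8, 0.2], [0.3, 0.7]]),   # action 0
     np.array([[0.1, 0.9], [0.6, 0.4]])]   # action 1
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])

policy = np.zeros(n_states, dtype=int)      # start with action 0 everywhere
while True:
    # Policy evaluation: solve (I - gamma * P_pi) V = R_pi exactly.
    P_pi = np.array([P[policy[s]][s] for s in range(n_states)])
    R_pi = R[np.arange(n_states), policy]
    V = np.linalg.solve(np.eye(n_states) - gamma * P_pi, R_pi)

    # Policy improvement: act greedily with respect to V.
    Q = np.array([[R[s, a] + gamma * P[a][s] @ V for a in range(n_actions)]
                  for s in range(n_states)])
    new_policy = Q.argmax(axis=1)
    if np.array_equal(new_policy, policy):  # policy stable -> optimal
        break
    policy = new_policy

print(policy, np.round(V, 3))
```

For this toy instance the loop stabilizes after one improvement step at the policy that takes action 0 in state 0 and action 1 in state 1; termination at a stable policy is what the "policy improvement" papers in this result list generalize to constrained, continuous-time, and multichain settings.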


A policy-improvement type algorithm for solving zero-sum two-person stochastic games of perfect information. RAGHAVAN, T. E. S; SYED, Zamir. Mathematical programming. 2003, Vol 95, Num 3, pp 513-532, issn 0025-5610, 20 p. [Article]

Adaptive policy-iteration and policy-value-iteration for discounted Markov decision processes. HÜBNER, G; SCHÄL, M. ZOR. Zeitschrift für Operations-Research. 1991, Vol 35, Num 6, pp 491-503, issn 0340-9422. [Article]

Policy iteration and Newton-Raphson methods for Markov decision processes under average cost criterion. OHNISHI, M. Computers & mathematics with applications (1987). 1992, Vol 24, Num 1-2, pp 147-155, issn 0898-1221. [Conference Paper]

Finding optimal memoryless policies of POMDPs under the expected average reward criterion. YANJIE LI; BAOQUN YIN; HONGSHENG XI et al. European journal of operational research. 2011, Vol 211, Num 3, pp 556-567, issn 0377-2217, 12 p. [Article]

Bias optimality for multichain continuous-time Markov decision processes. XIANPING GUO; XINYUAN SONG; JUNYU ZHANG et al. Operations research letters. 2009, Vol 37, Num 5, pp 317-321, issn 0167-6377, 5 p. [Article]

Review of a Markov decision algorithm for optimal inspections and revisions in a maintenance system with partial information. WIJNMALEN, D. J. D; HONTELEZ, J. A. M. European journal of operational research. 1992, Vol 62, Num 1, pp 96-104, issn 0377-2217. [Article]

Iterative computation of noncooperative equilibria in nonzero-sum differential games with weakly coupled players. SRIKANT, R; BASAR, T. Journal of optimization theory and applications. 1991, Vol 71, Num 1, pp 137-168, issn 0022-3239. [Conference Paper]

Reduced complexity dynamic programming based on policy iteration. BAYARD, D. S. Journal of mathematical analysis and applications. 1992, Vol 170, Num 1, pp 75-103, issn 0022-247X. [Article]

A policy improvement method for constrained average Markov decision processes. HYEONG SOO CHANG. Operations research letters. 2007, Vol 35, Num 4, pp 434-438, issn 0167-6377, 5 p. [Article]

Incremental value iteration for time-aggregated Markov decision processes. TAO SUN; QIANCHUAN ZHAO; LUH, Peter B et al. IEEE transactions on automatic control. 2007, Vol 52, Num 11, pp 2177-2182, issn 0018-9286, 6 p. [Article]

Policy iteration for customer-average performance optimization of closed queueing systems. LI XIA; XI CHEN; CAO, Xi-Ren et al. Automatica (Oxford). 2009, Vol 45, Num 7, pp 1639-1648, issn 0005-1098, 10 p. [Article]

Optimal and near-optimal policies for lost sales inventory models with at most one replenishment order outstanding. HILL, Roger M; JOHANSEN, Søren Glud. European journal of operational research. 2006, Vol 169, Num 1, pp 111-132, issn 0377-2217, 22 p. [Article]

Neural network approach to continuous-time direct adaptive optimal control for partially unknown nonlinear systems. VRABIE, Draguna; LEWIS, Frank. Neural networks. 2009, Vol 22, Num 3, pp 237-246, issn 0893-6080, 10 p. [Conference Paper]

An analysis of transient Markov decision processes. JAMES, Huw W; COLLINS, E. J. Journal of applied probability. 2006, Vol 43, Num 3, pp 603-621, issn 0021-9002, 19 p. [Article]

Dynamic shortest paths in acyclic networks with Markovian arc costs. PSARAFTIS, H. N; TSITSIKLIS, J. N. Operations research. 1993, Vol 41, Num 1, pp 91-101, issn 0030-364X. [Article]

Reinforcement Q-learning for optimal tracking control of linear discrete-time systems with unknown dynamics. KIUMARSI, Bahare; LEWIS, Frank L; MODARES, Hamidreza et al. Automatica (Oxford). 2014, Vol 50, Num 4, pp 1167-1175, issn 0005-1098, 9 p. [Article]

Risk-averse dynamic programming for Markov decision processes. RUSZCZYNSKI, Andrzej. Mathematical programming (Print). 2010, Vol 125, Num 2, pp 235-261, issn 0025-5610, 27 p. [Conference Paper]

Policy set iteration for Markov decision processes. HYEONG SOO CHANG. Automatica (Oxford). 2013, Vol 49, Num 12, pp 3687-3689, issn 0005-1098, 3 p. [Article]

Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path. ANTOS, Andras; SZEPESVARI, Csaba; MUNOS, Remi et al. Machine learning. 2008, Vol 71, Num 1, pp 89-129, issn 0885-6125, 41 p. [Article]

Optimization of a special case of continuous-time Markov decision processes with compact action set. TANG HAO; ZHOU LEI; TAMIO, Arai et al. European journal of operational research. 2008, Vol 187, Num 1, pp 113-119, issn 0377-2217, 7 p. [Article]

A unified approach to Markov decision problems and performance sensitivity analysis with discounted and average criteria: multichain cases. CAO, Xi-Ren; XIANPING GUO. Automatica (Oxford). 2004, Vol 40, Num 10, pp 1749-1759, issn 0005-1098, 11 p. [Article]

Continuous-time Markov decision processes with nth-bias optimality criteria. JUNYU ZHANG; CAO, Xi-Ren. Automatica (Oxford). 2009, Vol 45, Num 7, pp 1628-1638, issn 0005-1098, 11 p. [Article]

Average optimality for continuous-time Markov decision processes with a policy iteration approach. QUANXIN ZHU. Journal of mathematical analysis and applications. 2008, Vol 339, Num 1, pp 691-704, issn 0022-247X, 14 p. [Article]

A mean-variance optimization problem for discounted Markov decision processes. XIANPING GUO; LIUER YE; GEORGE YIN et al. European journal of operational research. 2012, Vol 220, Num 2, pp 423-429, issn 0377-2217, 7 p. [Article]

A policy improvement method in constrained stochastic dynamic programming. HYEONG SOO CHANG. IEEE transactions on automatic control. 2006, Vol 51, Num 9, pp 1523-1526, issn 0018-9286, 4 p. [Article]
