马尔可夫决策过程

  • Web: Markov decision process; POMDP; MDP; MDPs
  1. 针对异构网络环境下移动用户的业务需求特点,提出将传统用户偏好提取技术与马尔可夫决策过程建模方法相结合,创建用户偏好评估模型。解决动态判决环境下基于不完整信息的智能判决问题。

    To address the service requirements of mobile users in heterogeneous network environments, we propose a user preference evaluation model that combines traditional preference extraction techniques with Markov decision process (MDP) modeling, solving the problem of intelligent decision-making in dynamic settings with incomplete information.

  2. 首先建立任务调度问题的目标模型,在分析Q学习算法的基础上,给出调度问题的马尔可夫决策过程描述;

    The paper first establishes an objective model of the task scheduling problem and then, based on an analysis of the Q-learning algorithm, gives a Markov decision process description of the scheduling problem.

  3. 在Markov性能势基础上讨论了一种基于强化学习的马尔可夫决策过程(MDP)优化方法。

    We discuss a reinforcement-learning-based optimization method for Markov decision processes (MDPs) built on Markov performance potentials.

  4. 马尔可夫决策过程基于TD(0)学习和性能势的NDP优化

    NDP Optimization of Markov Decision Processes Based on TD(0) Learning and Performance Potentials

  5. 该文提出了一个基于半马尔可夫决策过程理论的最优准入控制策略来支持有服务质量要求的多类业务的无线CDMA网络。

    This paper presents an optimal call admission control (CAC) scheme based on semi-Markov decision process (SMDP) theory to support multiple service classes with QoS requirements in wireless CDMA networks.

  6. 基于马尔可夫决策过程和DT-Golog的动态工作流集成

    Dynamic Workflow Composition Using MDP and DT-Golog

  7. 部分可观测马尔可夫决策过程(POMDP)是一种用于制定序列决策的经典模型。在该模型中,智能体做出动作所产生的效果是不确定的,对环境状态信息的观测也是不完整的。

    The partially observable Markov decision process (POMDP) is a classical model for sequential decision-making in which the effects of the agent's actions are uncertain and observations of the environment state are incomplete.

  8. 马尔可夫决策过程在防空系统目标分配中的应用

    An Application of the Markov Decision Process to Target Assignment in Air Defense Systems

  9. 以上研究结果均可适用于连续时间马尔可夫决策过程(CTMDP)。

    All of the above results also apply to continuous-time Markov decision processes (CTMDPs).

  10. 马尔可夫决策过程在视情维修中的应用

    Application of the Markov Decision Process in Condition-Based Maintenance

  11. 论文首先介绍了马尔可夫决策过程的基本概念和再励学习的框架。

    The paper first introduces the basic concepts of the Markov decision process and the framework of reinforcement learning.

  12. 一种基于马尔可夫决策过程的认知无线电网络传输调度方案

    A Transmission Scheduling Scheme Based on the Markov Decision Process for Cognitive Radio Networks

  13. 基于马尔可夫决策过程的炮兵群动态火力分配方法

    A Dynamic Fire Allocation Method for Artillery Groups Based on the Markov Decision Process

  14. 马尔可夫决策过程自适应决策的进展

    Advances in Adaptive Decision-Making for Markov Decision Processes

  15. 本文研究了基于准马尔可夫决策过程方法的多业务最优呼叫接纳控制问题。

    Using the semi-Markov decision process (SMDP) approach, this paper studies the optimal call admission control problem for multiple services.

  16. 应用多代理马尔可夫决策过程,建立了一种新的多管理者网络故障监控机制,并给出了该机制下基于强化学习的轮询策略。

    Using multi-agent Markov decision processes, a new multi-manager network fault monitoring mechanism is established, and a reinforcement-learning-based polling policy under this mechanism is given.

  17. 本文把这样一个优化问题构造为马尔可夫决策过程,并提出了用动态规划解决该问题的方法。

    The paper formulates this optimization problem as a Markov decision process (MDP) and proposes a dynamic programming method to solve it.

  18. 根据马尔可夫决策过程理论,实现了道路行程时间的实时估计。

    According to discrete-time Markov decision process (DTMDP) theory, the dynamic travel time on the signalized arterial over the time horizon was then obtained.

  19. 根据连续时间马尔可夫决策过程的平均准则,给出了一种特殊的马尔可夫决策过程(受控排队系统)平均最优以及约束最优的新条件。

    Using the embedded Markov chain, optimal stationary policies are studied for controlled M/G/1 queueing systems under the infinite-horizon average-cost criterion.

  20. 在确定油气管道维修过程中的腐蚀状态划分以后,分析了油气管道的维修措施及相应费用,并采用策略改进算法对马尔可夫决策过程进行求解。

    After the corrosion states in oil and gas pipeline maintenance are classified, the maintenance measures and their corresponding costs are analyzed, and the Markov decision process is solved by applying the policy improvement algorithm.

  21. 运用马尔可夫决策过程理论,给出了不同维修方式转移概率的表达式,解决了预防性维修策略的组合优化问题。

    Using Markov decision process theory, expressions for the transition probabilities of different preventive maintenance (PM) actions were derived, and the combinatorial optimization problem of the multi-component PM strategy was solved.

  22. 给出了基于约束马尔可夫决策过程的网络生存性定义,提出了一个多层网络生存性研究框架和一种新的网络生存性设计方法。

    A definition of network survivability based on constrained Markov decision processes is given, and a multi-layer network survivability framework and a new network survivability design method are proposed.

  23. 马尔可夫决策过程是确定性动态规划和马尔可夫过程结合的产物,是研究随机环境下多阶段决策过程优化问题的理论工具。

    The Markov decision process, a combination of deterministic dynamic programming and Markov processes, is a theoretical tool for studying the optimization of multistage decision processes in stochastic environments.

  24. 传统的强化学习算法应用到大状态、动作空间和任务复杂的马尔可夫决策过程问题时,存在收敛速度慢,训练时间长等问题。

    When traditional reinforcement learning algorithms are applied to MDPs with large state and action spaces and complex tasks, they run into the curse of dimensionality, which results in slow convergence and long training times.

  25. 在有限马尔可夫决策过程的线性规划求解方法以及神经网络算法的基础上提出了运用神经网络求解有限马尔可夫决策问题的方法,并通过算例验证了该方法的有效性。

    Building on the linear programming method for solving finite Markov decision processes and on neural network algorithms, a neural-network method for solving finite Markov decision problems is proposed, and its effectiveness is verified with examples.