Adaptive Dynamic Programming for Control: Algorithms and Stability

By Huaguang Zhang, Derong Liu, Yanhong Luo, Ding Wang

There are many methods of robust controller design for nonlinear systems. In seeking to go beyond the minimum requirement of stability, Adaptive Dynamic Programming in Discrete Time approaches the challenging topic of optimal control for nonlinear systems using the tools of adaptive dynamic programming (ADP). The range of systems treated is extensive; affine, switched, singularly perturbed and time-delay nonlinear systems are discussed, as are the uses of neural networks and techniques of value and policy iteration. The text features three main aspects of ADP in which the methods proposed for stabilization and for tracking and games benefit from the incorporation of optimal control methods:
• infinite-horizon control, for which the difficulty of solving partial differential Hamilton–Jacobi–Bellman equations directly is overcome, and proof provided that the iterative value function updating sequence converges to the infimum of all the value functions obtained by admissible control law sequences;
• finite-horizon control, implemented in discrete-time nonlinear systems, showing the reader how to obtain suboptimal control solutions within a fixed number of control steps and with results more easily applied in real systems than those usually obtained from infinite-horizon control;
• nonlinear games, for which a pair of mixed optimal policies are derived for solving games both when the saddle point does not exist, and, when it does, avoiding the existence conditions of the saddle point.
Non-zero-sum games are studied in the context of a single network scheme in which policies are obtained guaranteeing system stability and minimizing the individual performance function yielding a Nash equilibrium.
In order to make the coverage suitable for the student as well as the expert reader, Adaptive Dynamic Programming in Discrete Time:
• establishes the fundamental theory involved clearly, with each chapter devoted to a clearly identifiable control paradigm;
• demonstrates convergence proofs of the ADP algorithms to deepen understanding of the derivation of stability and convergence with the iterative computational methods used; and
• shows how ADP methods can be put to use both in simulation and in real applications.
This text will be of considerable interest to researchers interested in optimal control and its applications in operations research, applied mathematics, computational intelligence and engineering. Graduate students working in control and operations research will also find the ideas presented here to be a source of powerful methods for furthering their research.



Similar system theory books

Interval methods for circuit analysis

The aim of the present book is to acquaint the reader with some applications of interval analysis in electric circuit theory. More specifically, interval models and resulting interval methods for circuit analysis are presented in detail for the following topics: linear electric circuit tolerance analysis (steady-state as well as transient analysis); linear circuit stability and nonlinear circuit analysis (including both resistive and dynamic circuits).

Statistical Mechanics of Complex Networks (Lecture Notes in Physics)

Networks provide a useful model and graphic image for the description of a wide variety of web-like structures in the physical and man-made realms, e.g. protein networks, food webs and the Internet. The contributions gathered in the present volume provide both an introduction to, and an overview of, the multifaceted phenomenology of complex networks.

Asymptotic Theory of Nonlinear Regression

Let us assume that an observation Xi is a random variable (r.v.) with values in (ℝ¹, 𝔅¹) and distribution Pi (ℝ¹ is the real line, and 𝔅¹ is the σ-algebra of its Borel subsets). Let us also assume that the unknown distribution Pi belongs to a certain parametric family {Pi(θ), θ ∈ Θ}. We call the triple ℰi = {ℝ¹, 𝔅¹, Pi(θ), θ ∈ Θ} a statistical experiment generated by the observation Xi.

From Stochastic Calculus to Mathematical Finance: The Shiryaev Festschrift

Dedicated to the eminent Russian mathematician Albert Shiryaev on the occasion of his seventieth birthday, the Festschrift is a collection of papers, including several surveys, written by his former students, co-authors and colleagues. These reflect the wide range of scientific interests of the teacher and his Moscow school.

Additional info for Adaptive Dynamic Programming for Control: Algorithms and Stability

Sample text

With λi(x(k)) = ∂Vi(x(k))/∂x(k), we conclude that the corresponding costate function sequence {λi} is also convergent, with λi → λ* as i → ∞. Since the costate function is convergent, we can conclude that the corresponding control law sequence {vi} converges to the optimal control law u* as i → ∞. … (2.46), which does not require the computation of ∂Vi(x(k + 1))/∂x(k + 1). … (2.10) there is an integral term 2 ∫₀^{vi(x(k))} ϕ⁻ᵀ(Ū⁻¹s) Ū R ds to compute at each iteration step, which is not an easy task.
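The integral term 2 ∫₀^{v} ϕ⁻ᵀ(Ū⁻¹s) Ū R ds mentioned in the excerpt can be evaluated numerically once ϕ and the bound Ū are fixed. The following scalar sketch assumes ϕ = tanh and ū = R = 1 (common choices in the bounded-control ADP literature, but assumptions here, not the book's setup) and checks a trapezoid quadrature against the closed-form antiderivative:

```python
import numpy as np
from math import atanh, log

# Scalar sketch of the integral term with phi = tanh and ubar = R = 1
# (assumed values; the book treats the general vector case).
ubar, R = 1.0, 1.0

def W_quad(v, n=100_001):
    # Trapezoid quadrature of W(v) = 2 * int_0^v atanh(s/ubar) * ubar * R ds.
    s = np.linspace(0.0, v, n)
    y = 2.0 * np.arctanh(s / ubar) * ubar * R
    return (s[1] - s[0]) * (y.sum() - 0.5 * (y[0] + y[-1]))

def W_closed(v):
    # Closed form via the antiderivative of atanh:
    # int atanh(s/ubar) ds = s*atanh(s/ubar) + (ubar/2)*log(1 - (s/ubar)^2).
    return 2.0 * ubar * R * (v * atanh(v / ubar)
                             + (ubar / 2) * log(1 - (v / ubar) ** 2))

print(abs(W_quad(0.5) - W_closed(0.5)) < 1e-6)  # → True
```

In the vector case no such closed form is generally available, which is why the excerpt calls the per-iteration evaluation of this term "not an easy task".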

… (2.50) with λ0(·) = 0. Then the costate function sequence {λi} and the control law sequence {vi} are convergent as i → ∞. The optimal value λ* is defined as the limit of the costate function λi when vi approaches the optimal control u*. … J*(x(k)) = inf_{u(k)} {xᵀ(k)Q x(k) + W(u(k)) + J*(x(k + 1))}. … Vi → J* as i → ∞.
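The convergence claim in this excerpt, Vi → J*, can be illustrated numerically. The sketch below is not the book's algorithm: it runs plain value iteration V_{i+1}(x) = min_u {xᵀQx + W(u) + V_i(x(k+1))} with a quadratic W(u) = uᵀRu on a hypothetical scalar linear system x(k+1) = 0.8x(k) + u(k) over a state/control grid, and checks that the iterates approach the known LQR value J*(x) = p·x²:

```python
import numpy as np

# Hypothetical scalar system x(k+1) = a*x(k) + u(k) with quadratic cost
# x^T Q x + u^T R u; an illustrative sketch, not the book's algorithm.
a, Q, R = 0.8, 1.0, 1.0
xs = np.linspace(-1.0, 1.0, 201)   # state grid
us = np.linspace(-1.0, 1.0, 201)   # control grid

X, U = np.meshgrid(xs, us, indexing="ij")
Xnext = np.clip(a * X + U, xs[0], xs[-1])   # successor states, kept on grid
stage = Q * X**2 + R * U**2                 # stage cost for every (x, u) pair

V = np.zeros_like(xs)                       # V_0(x) = 0
for i in range(200):
    # V_{i+1}(x) = min_u { stage cost + V_i(x(k+1)) }
    V_new = np.min(stage + np.interp(Xnext, xs, V), axis=1)
    converged = np.max(np.abs(V_new - V)) < 1e-10
    V = V_new
    if converged:
        break

# Reference: for scalar LQR, J*(x) = p*x^2 with p the fixed point of
# p = Q + a^2 * R * p / (R + p) (scalar discrete-time Riccati equation).
p = 1.0
for _ in range(1000):
    p = Q + a**2 * R * p / (R + p)

print(round(p, 3))                          # → 1.37
print(abs(V[-1] - p * xs[-1]**2) < 1e-2)    # iterates match J* at x = 1
```

Here the updating sequence converges monotonically from V_0 = 0 toward J*, matching the infimum characterization in the excerpt.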

… (2.50). Define the error function for the critic network as ec(i+1)(k) = λ̂i+1(x(k)) − λi+1(x(k)). (2.58) The objective function to be minimized for the critic network is Ec(i+1)(k) = (1/2) ec(i+1)ᵀ(k) ec(i+1)(k). (2.60) … where αc > 0 is the learning rate of the critic network, and j is the inner-loop iteration step for updating the weight parameters. In the action network, the state x(k) is used as the input of the network and the output can be formulated as v̂i(x(k)) = waiᵀ φ(x(k)).
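The critic update implied in this excerpt — drive e_c = λ̂ − λ to zero by gradient descent on E_c = ½ e_cᵀe_c with learning rate α_c over inner-loop steps j — can be sketched with a linear-in-features critic. Everything concrete below (the feature map φ, the target costate λ(x) = 2x, the learning rate, the sample set) is an assumption for illustration, not the book's network:

```python
import numpy as np

rng = np.random.default_rng(0)

def phi(x):
    # Hypothetical polynomial feature map; the book's networks differ.
    return np.array([x, x**2, x**3])

def target(x):
    # Illustrative target costate, e.g. lambda(x) = dV/dx = 2x for V(x) = x^2.
    return 2.0 * x

wc = np.zeros(3)       # critic weights, lambda_hat(x) = wc^T phi(x)
alpha_c = 0.05         # learning rate alpha_c > 0
xs = rng.uniform(-1.0, 1.0, 200)

for j in range(2000):  # inner-loop iteration step j
    for x in xs:
        e_c = wc @ phi(x) - target(x)   # e_c = lambda_hat - lambda
        # E_c = 0.5 * e_c^2 is minimized by one gradient step:
        wc -= alpha_c * e_c * phi(x)    # w <- w - alpha_c * dE_c/dw
    if max(abs(wc @ phi(x) - target(x)) for x in xs) < 1e-6:
        break

print(np.round(wc, 3))  # recovers the coefficient 2 on the linear feature
```

The action network output v̂i(x(k)) = waiᵀφ(x(k)) mentioned at the end of the excerpt would be trained the same way, with the iterated control law as its target.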
