-
Notifications
You must be signed in to change notification settings - Fork 0
/
ch-mpc.tex
361 lines (329 loc) · 17.2 KB
/
ch-mpc.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
%-------------------------------------------------------------------------------
\chapter{Formulation of Model Predictive Control Problem}
\label{ch.MPC}
%-------------------------------------------------------------------------------
Model predictive control \cite{MacMPC} is a mature and widely used control
paradigm. Its name points to two important characteristics: this strategy
is based on the knowledge of the model of the underlying process, and the behavior
of the system is predicted over some preview horizon. The goal of \ac{MPC} is to
choose a sequence of control inputs for the system over prediction horizon with
respect to given objective function and constraints. This goal can be informally
interpreted as choosing such control inputs, that do not lead to undesirable
effects in the future. The problem is resolved periodically in order to
accommodate for the current situation, the preview window is usually shifted in
time on each iteration of \ac{MPC}.
Generation of \ac{CoM} and \ac{ZMP} trajectories using an \ac{MPC} scheme was
proposed in \cite{LIPM-MPC}. This chapter reviews some extensions of the scheme
introduced in \cite{WieberMPC, dimitrov2011sparse}.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Unconstrained Model Predictive Control Formulation}
The original \ac{MPC} problem defined in \cite{LIPM-MPC} has finite horizon, a
quadratic objective function, and is based on a discrete system~\eqref{eq.system_orig}.
The objective function penalizes distance from predefined reference points for \ac{ZMP},
the change in state and control. Since this formulation has no inequality constraints
an explicit control law ({\bf linear quadratic regulator}) was obtained.
Consider the unconstrained \ac{MPC} based on the system~\eqref{eq.system_orig}
where the objective function is defined equivalently to the objective function
used in \cite{WieberMPC}.
\begin{equation}\label{eq.opt_simple}
\begin{split}
\minimize{\dddot{\mbm{c}}_0 \dots \dddot{\mbm{c}}_{N-1}; \hat{\mbm{c}}_1 \dots \hat{\mbm{c}}_{N}}
& \frac{1}{2} \sum_{k=1}^{N} \norm{ \mbm{C} \hat{\mbm{c}}_{k} - \mbm{z}_k^{ref} }^2_{\mbm{Q}}
+ \frac{1}{2} \sum_{k=0}^{N-1} \norm{ \dddot{\mbm{c}}_k }^2_{\mbm{P}}\\
\subjectto & \hat{\mbm{c}}_{k+1} = \mbm{A}\hat{\mbm{c}}_{k}+\mbm{B}\dddot{\mbm{c}}_{k},\\
\end{split}
\end{equation}
where
$$
\mbm{C} =
\begin{bmatrix}
1 & 0 & -h & 0 & 0 & 0\\
0 & 0 & 0 & 1 & 0 & -h\\
\end{bmatrix}.
$$
$N$ is the number of sampling intervals in the preview window, matrices $\mbm{Q}$
and $\mbm{P}$ contain gains, $\mbm{z}_k^{ref}$ are reference positions for \ac{ZMP}
at discrete sampling time $k$. The objective function does not penalize the change
in the state $\hat{\mbm{c}}_{k}$, and penalizes the absolute values of control
inputs instead of change between subsequent control inputs.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Introducing the Inequality Constraints}\label{sec.ds_constraints}
The authors of \cite{WieberMPC} impose constraints on the position of \ac{ZMP}
in order to keep it within the support area and demonstrate that a robot can cope
with stronger disturbances in this case. Computation of explicit control law for
\ac{MPC} with inequality constraints \cite{BemporadConstrLQR} is not possible here,
since the constraints change on each iteration.
Therefore, the application of MPC requires the solution of a quadratic programming
problem. The quadratic programming is discussed in \cref{ch.QP}. This section is
focused on the formulation of \ac{MPC} for \ac{CoM}/\ac{ZMP} trajectory generation
with inequality constraints on \ac{ZMP} positions. This \ac{MPC} is referred to in
the thesis as \ac{MPCWMG}.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Double Support Constraints}
Note that the support area differs depending on the type of support. \ac{SS} can
be easily represented by a rectangle, while \ac{DS} in a general case cannot.
Therefore, we need to find a suitable representation of \ac{DS} constraints.
The support area in \ac{DS} is the convex hull of two adjacent \ac{SS}. When the
\ac{SS} constraints are rectangular, their convex hull is a polygon as shown in
\cref{fig.ds_hull}. In this case the constraints for \ac{SS} and \ac{DS} must
be handled differently. We introduce an approximation of this convex hull by
a sequence of rectangles, as demonstrated in \cref{fig.ds_approx}. This approach
allows to define constraints uniformly, furthermore, it makes possible an
extension of the \ac{MPCWMG} to perform footstep repositioning to compensate for
disturbances \cite{dimitrov2011walking}.
\begin{figure}[ht]
\begin{minipage}[b]{0.4\linewidth}
\centerline{%
\includegraphics[scale=0.4]{Figures/ds_convex_hull.eps}}
\caption[Double support]{A double support represented by a convex hull of two single supports}
\label{fig.ds_hull}
\end{minipage}
\hfill
\begin{minipage}[b]{0.4\linewidth}
\centerline{%
\includegraphics[scale=0.4]{Figures/ds_approx.eps}}
\caption[Approximate double support]{A double support approximated by a sequence of rectangles}
\label{fig.ds_approx}
\end{minipage}
\end{figure}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Definition of the Inequality Constraints}
A rectangle can be defined with respect to some reference frame $\mathbb{F}$ by
four positive distances
$\mbm{d} = \begin{bmatrix} u^x & u^y & -l^x & -l^y \end{bmatrix}^T$
from the origin of $\mathbb{F}$ to the edges of the rectangle. The position and
orientation of the rectangle are given by the displacement $\mbm{r}$ and angle
of rotation $\varphi$ of $\mathbb{F}$ with respect to the global reference frame.
An example is depicted in \cref{fig.constr}.
\begin{figure}[ht]
\psfragscanon
\psfrag{r}{$\mbm{r}$}
\psfrag{a}{$\varphi$}
\psfrag{h}{$u^x$}
\psfrag{j}{$u^y$}
\psfrag{k}{$-l^x$}
\psfrag{l}{$-l^y$}
\centerline{%
\includegraphics[scale=0.5]{Figures/rect_constraints.eps}}
\caption[Support rectangle]{Definition of a support rectangle with respect to the global reference frame}
\label{fig.constr}
\end{figure}
All points $\mbm{p}$ lying within the rectangle satisfy
$$
\mbm{D}\mbm{R}^T\left(\mbm{p} - \mbm{r} \right) \leq \mbm{d},
$$
where matrix $\mbm{D}$ and rotation matrix $\mbm{R}$ are defined as
$$
\mbm{D} =
\begin{bmatrix}
\mbm{I}\\
-\mbm{I}
\end{bmatrix}, \quad
%
\mbm{R} =
\begin{bmatrix}
\cos\varphi & -\sin\varphi \\
\sin\varphi & \cos\varphi
\end{bmatrix}.
$$
Thus the constraints for the $k$-th sampling period in the preview window are
\begin{equation}
\mbm{D}\mbm{R}^T_k \mbm{C} \hat{\mbm{c}}_k \leq \mbm{d}_{k} + \mbm{D}\mbm{R}^T_k \mbm{r}_k,
\end{equation}
where
$$
\mbm{C} =
\begin{bmatrix}
1 & 0 & -h & 0 & 0 & 0\\
0 & 0 & 0 & 1 & 0 & -h\\
\end{bmatrix}.
$$
is the output matrix of the system~\ref{eq.system_orig} defined in \cref{sec.3dlipm}.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Time-variant Constrained MPC}\label{sec.timevarsys}
We can exclude the dynamics of the system~\eqref{eq.system_orig} from the inequality
constraints by replacing the position of the \ac{CoM} by the position of the \ac{ZMP}
so that
\begin{equation}\label{eq.substitution1}
\tilde{\mbm{c}} =
\begin{bmatrix} z^x & \dot{c}^x & \ddot{c}^x & z^y & \dot{c}^y & \ddot{c}^y \end{bmatrix}^T,
\end{equation}
to obtain
\begin{equation}\label{eq.constraints}
\mbm{D}\mbm{R}^T_k \mbm{C}_p \tilde{\mbm{c}}_k \leq \mbm{d}_{k} + \mbm{D}\mbm{R}^T_k \mbm{r}_k,
\end{equation}
where
$$
\mbm{C}_p =
\begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 \\
\end{bmatrix},
$$
The system must be changed accordingly
\begin{equation}\label{eq.system_sub1}
\tilde{\mbm{c}}_{k+1} = \tilde{\mbm{A}}_k\tilde{\mbm{c}}_{k} + \tilde{\mbm{B}}_k\dddot{\mbm{c}}_{k},
\end{equation}
where the state transition and control matrices are defined as
$$
\tilde{\mbm{A}}_k =
\begin{bmatrix}
1 & T_k & \frac{T_k^{2}}{2}-\Delta h_k & 0 & 0 & 0\\
0 & 1 & T_k & 0 & 0 & 0\\
0 & 0 & 1 & 0 & 0 & 0\\
0 & 0 & 0 & 1 & T_k & \frac{T_k^{2}}{2}-\Delta h_k\\
0 & 0 & 0 & 0 & 1 & T_k\\
0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix},
$$
$$
\tilde{\mbm{B}}_k =
\begin{bmatrix}
\frac{T_k^{3}}{6} - h_k T_k & 0\\
\frac{T_k^{2}}{2} & 0\\
T_k & 0\\
0 & \frac{T_k^{3}}{6} - h_k T_k\\
0 & \frac{T_k^{2}}{2}\\
0 & T_k
\end{bmatrix}.
$$
These matrices can be defined as time-invariant, but their parameterization gives us more options
for tuning. Variation of the \ac{CoM} height during walk, which can be realized through the
appropriate changes of $h_k$, may have a positive effect on the quality of the gait. Also,
the solution of the \ac{QP} may be obtained faster, if the sampling time $T_k$ varies in
the preview window. Note that the duration of the sampling period in an \ac{MPC} is determined
by the control sampling time, and all but the first computed control inputs are usually
discarded. Hence, longer sampling periods in the end of the preview window can be used to
decrease $N$, which directly affects the time required for solution, without decreasing the
length of this preview window \cite{dimitrov2008implementation}.
Now we rewrite the optimization problem~\eqref{eq.opt_simple} to reflect the changes
in the model
\begin{equation}\label{eq.opt}
\begin{split}
\minimize{\dddot{\mbm{c}}_0 \dots \dddot{\mbm{c}}_{N-1}; \tilde{\mbm{c}}_1 \dots \tilde{\mbm{c}}_{N}}
& \sum_{k=1}^{N} \norm{ \tilde{\mbm{c}}_k - \mbm{C}_p^T \mbm{z}_k^{ref} }^2_{\tilde{\mbm{Q}}}
+ \sum_{k=0}^{N-1} \norm{ \dddot{\mbm{c}}_k }^2_{\mbm{P}}\\
\subjectto & \tilde{\mbm{c}}_{k+1} = \tilde{\mbm{A}}_k\tilde{\mbm{c}}_{k} + \tilde{\mbm{B}}_k\dddot{\mbm{c}}_{k}\\
& \mbm{D}\mbm{R}^T_k \mbm{C}_p \tilde{\mbm{c}}_k \leq \mbm{d}_{k} + \mbm{D}\mbm{R}^T_k \mbm{r}_k,
\end{split}
\end{equation}
where the gain matrices are
$$
\tilde{\mbm{Q}} =
\begin{bmatrix}
\frac{\alpha_g}{2} & 0 & 0 & 0 & 0 & 0\\
0 & \frac{\beta_g}{2} & 0 & 0 & 0 & 0\\
0 & 0 & \frac{\gamma_g}{2} & 0 & 0 & 0\\
0 & 0 & 0 & \frac{\alpha_g}{2} & 0 & 0 \\
0 & 0 & 0 & 0 & \frac{\beta_g}{2} & 0 \\
0 & 0 & 0 & 0 & 0 & \frac{\gamma_g}{2} \\
\end{bmatrix}, \quad
\mbm{P} =
\begin{bmatrix}
\frac{\eta_g}{2} & 0 \\
0 & \frac{\eta_g}{2} \\
\end{bmatrix}.
$$
The first term in the objective function is
\begin{equation*}
\begin{split}
\sum_{k=1}^{N} \norm{ \tilde{\mbm{c}}_k - \mbm{C}_p^T \mbm{z}_k^{ref} }^2_{\tilde{\mbm{Q}}} &=
\sum_{k=1}^{N} \left(
\norm{ \tilde{\mbm{c}}_k }^2_{\tilde{\mbm{Q}}}
+ \norm{ \mbm{C}_p^T \mbm{z}_k^{ref} }^2_{\tilde{\mbm{Q}}}
- 2 \tilde{\mbm{c}}_k^T \tilde{\mbm{Q}} \mbm{C}_p^T \mbm{z}_k^{ref}
\right)\\
&= \sum_{k=1}^{N} \left(
\norm{ \tilde{\mbm{c}}_k }^2_{\tilde{\mbm{Q}}}
+ \norm{ \mbm{C}_p^T \mbm{z}_k^{ref} }^2_{\tilde{\mbm{Q}}}
- \alpha_g (\mbm{C}_p^T \mbm{z}_k^{ref})^T \tilde{\mbm{c}}_k
\right)\\
&= \sum_{k=1}^{N} \left(
\norm{ \tilde{\mbm{c}}_k }^2_{\tilde{\mbm{Q}}}
+ \norm{ \mbm{C}_p^T \mbm{z}_k^{ref} }^2_{\tilde{\mbm{Q}}}
+ \mbm{q}_k^T \tilde{\mbm{c}}_k
\right),
\end{split}
\end{equation*}
where $\mbm{q}_k = - \alpha_g (\mbm{C}_p^T \mbm{z}_k^{ref})$.
The terms $\norm{ \mbm{C}_p^T \mbm{z}_k^{ref} }^2_{\tilde{\mbm{Q}}}$ are constant
and can be dropped. Thus we obtain the following formulation \ac{MPCWMG}
\begin{equation}\label{eq.opt_final}
\begin{split}
\minimize{\dddot{\mbm{c}}_0 \dots \dddot{\mbm{c}}_{N-1}; \tilde{\mbm{c}}_1 \dots \tilde{\mbm{c}}_{N}}
& \sum_{k=1}^{N} \norm{ \tilde{\mbm{c}}_k }^2_{\tilde{\mbm{Q}}}
+ \sum_{k=0}^{N-1} \norm{ \dddot{\mbm{c}}_k }^2_{\mbm{P}}
+ \sum_{k=1}^{N} \mbm{q}_k^T \tilde{\mbm{c}}_k\\
\subjectto & \tilde{\mbm{c}}_{k+1} = \tilde{\mbm{A}}_k\tilde{\mbm{c}}_{k} + \tilde{\mbm{B}}_k\dddot{\mbm{c}}_{k}\\
& \mbm{D}\mbm{R}^T_k \mbm{C}_p \tilde{\mbm{c}}_k \leq \mbm{d}_{k} + \mbm{D}\mbm{R}^T_k \mbm{r}_k.
\end{split}
\end{equation}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Generation of an Initial Guess}\label{sec.init_guess}
The \ac{MPC} problem~\eqref{eq.opt_final} is always feasible. This is a useful
property for generation of an initial feasible point.
At each iteration of \ac{MPCWMG} the current state of the system~\eqref{eq.system_sub1}
is assumed to be known. It is also possible to find a set of $N$ points satisfying the
inequality constraints~\eqref{eq.constraints} for each sampling period. This
set is a feasible profile for \ac{ZMP}, the next step is to generate control inputs
to follow this profile and determine the unknown state variables using a simple
iterative procedure.
The $k$-th control inputs can be computed based on the current state and the next
\ac{ZMP} position by multiplying both sides of \cref{eq.system_sub1} by $\mbm{C}_p$
$$
\dddot{\mbm{c}}_{k} = (\mbm{C}_p\tilde{\mbm{B}}_k)^{-1} (\mbm{z}_{k+1}
- \mbm{C}_p \tilde{\mbm{A}}_k\tilde{\mbm{c}}_{k}).
$$
Matrix $\mbm{C}_p\tilde{\mbm{B}}_k$ is not invertible when $c^z_k = T_k^2 / 6$.
If the sampling period is less than $0.1$ second, then the height of \ac{CoM}
of a robot must be lower than approximately $2$ millimeters in order to satisfy
this equality. Since the sampling period used in simulations and experiments is
always below $0.1$ second, it is safe to assume that the matrix is always invertible.
Having the $k$-th control inputs and the current state it is possible to find
velocity and acceleration of the next state using \cref{eq.system_sub1}.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Stability of the MPC scheme}\label{sec.mpc_stability}
When an \ac{MPC} with finite horizon is used, its stability must be considered.
It is possible to ensure stability using appropriate modifications of the problem,
for a comprehensive review refer to \cite{MPCstab}. Stability of the \ac{MPC}
formulations presented so far is not guaranteed. Moreover, it is difficult to
judge on the stability of the \ac{MPCWMG}, since the problem is altered on each
iteration due to the change of the set of the inequality constraints.
Let us neglect the changes of the constraints for the moment and consider the
\ac{QP} problem, which is solved on a particular iteration. Note that the discrete
systems~\eqref{eq.system_orig} and~\eqref{eq.system_sub1} are unstable, since the
state transition matrices have repeated eigenvalues equal to $1$ and the dimesion
of the respective Jordan blocks is greater than $1$. Consequently, it is necessary to
impose terminal constraints in order to ensure stability \cite{MuskeLinearMPC}.
\ac{QP} with terminal constraints may become infeasible, and generation of an initial
guess (\cref{sec.init_guess}) may become more computationally expensive. The stable
\ac{MPCWMG} may improve the ability of a robot to cope with external disturbances
and may require shorter preview window. However, the effect on the quality of the
gait is unknown. Furthermore, the stability of \ac{MPCWMG} does not imply, that the
generated \ac{CoM} trajectory can be executed by the controlled robot. Hence, the
stability of the \ac{MPCWMG} scheme requires special consideration and experiments,
and it was decided to leave it out of the scope of the thesis.
In order to avoid stability issues a ``sufficiently long'' preview window is
used. In \cite{LIPM-MPC}, where \ac{MPC} problem is not constrained, the explicit
control law was computed, and the preview gain was shown to be decreasing to a very
small value in about $2$ seconds. The built-in walking module for Nao uses preview
window of $0.8$ second \cite{NaoWalk}. In a similar way, the necessary preview window
length for the developed walking module was found experimentally (it is about $1.6$
second, see \cref{ch.results}). Since this time is related to the speed of walk,
the number of footsteps is likely to be a better measure of a preview horizon. In our
implementation $1.6$ second includes three \ac{SS}, or one full oscillation of the
\ac{ZMP} trajectory.
% {\bf robust} \ac{MPC} and {\bf stochastic} \ac{MPC}
%orbital stability
%constraints on z vs c cddot
%\cite{RaoIPMPC}