Difference between revisions of "Manual"

From openpipeflow.org
Jump to: navigation, search
m (Compiling, starting and ending a run)
m
 
(28 intermediate revisions by one user not shown)
Line 1: Line 1:
* This manual covers the '''core''' code, setup procedures and the mathematical model.   
+
If something needs updating or you have suggestions, please communicate them ([[Main_Page#Author]]) and/or request an openpipeflow login (top right corner)This website uses Mediawiki and is easy to edit.
  
* The addition of extra outputs, post-processing, and just about anything else should be done with a '''utility'''.  See [[Utilities]].
 
  
<!-- LATEX PREAMBLE -->
+
'''Overview of the solver'''
$
+
* [[File:TheOpenpipeflowSolver.pdf]] - An overview of the code, its context, and how to cite Openpipeflow.
\newcommand{\bnabla}{\vec{\nabla}}
+
\newcommand{\Rey}{Re}
+
\def\vechat#1{\hat{\vec{#1}}}
+
\def\mat#1{#1}
+
$
+
  
= Getting started =
+
'''Equations, properties, methods, etc.''':
 +
* [[Equations and parameters]] - Non-dimensionalisation, Navier-Stokes, Reynolds numbers.
 +
* [[The PPE formulation]] - Pressure-Poisson Equation, and the influence matrix technique.
 +
* [[Table of unit conversions]] - scale factors between code and 'lab' units.
 +
* [[Differential operators in cylindrical coordinates]] - grad, div, curl, Laplacian, etc.
 +
* [[Symmetries of pipe flow]] - discrete and continuous symmetries.
 +
* [[Method of slices]] - symmetry reduction, elimination of physically irrelevant spatial shifts.
 +
* Newton-Krylov method - see non-problem specific code below.
  
== Overview of files ==
+
'''Using the simulation code''':
 +
* [[Getting started]] - overview of files, setup, starting and ending a job.
 +
* [[Tutorial]] - setup a job, basic monitoring and visualisation of outputs.
 +
* [[Core implementation]] - Discretisation, Predictor-Corrector, Code Structure, Parallelisation.
 +
* [[Parallel i/o]] - a brief note on parallel data access.
 +
* [[Utilities]] - pre/post-processing, runtime processing and manipulations, Newton solver for pipe flow.
  
* <tt>Makefile</tt> will require modification for your compiler and libraries (see [[#Compiling_libraries]]). Sample commands for other compilers can be found near the top of the file.
+
'''Non-problem specific codes'''
* <tt>parallel.h</tt> Ensure <tt>_Np</tt> is set to <tt>1</tt> if you do not have MPI. This file contains macros for parallelisation that are only invoked if the number of processes <tt>Np</tt> is greater than 1.
+
* These are designed for integration with any pre-existing code.
* <tt>program/parameters.f90</tt>, where you’ll find parameters. See [[#Parameters]]
+
* [[File:arnoldi.f|arnoldi.f]] - Krylov-subspace method for calculating eigenvalues of a matrix.
* <tt>utils/</tt> contains a number of utilities. Almost anything can be done in a util, both post-processing and analysing data at runtime. There should be no need to alter the core code. See [[#Making_utils]]
+
* [[File:GMRESm.f90|GMRESm.f90]] - Krylov-subspace method for solving the linear system Ax=b for x.
* <tt>Matlab/</tt>, a few scripts. See <tt>Matlab/Readme.txt</tt> .
+
* [[File:NewtonHook.f90|NewtonHook.f90]] - Newton-Krylov-Hookstep method for finding nonlinear solutions x of F(x)=0.
 
+
== Parameters ==
+
 
+
<tt>./parallel.h</tt>:<br />
+
<tt> _Np</tt>
+
 
+
Number of processors (set 1 for serial)<br />
+
<tt>./program/parameters.f90</tt>:<br />
+
 
+
 
+
<table>
+
<tr class="odd">
+
<td align="left"><tt>i_N</tt></td>
+
<td align="left">Number of radial points $n\in[1,N]$</td>
+
</tr>
+
<tr class="even">
+
<td align="left"><tt>i_K</tt></td>
+
<td align="left">Maximum k (axial), $k\in(-K,\,K)$</td>
+
</tr>
+
<tr class="odd">
+
<td align="left"><tt>i_M</tt></td>
+
<td align="left">Maximum m (azimuthal), $m\in[0,\,M)$</td>
+
</tr>
+
<tr class="even">
+
<td align="left"><tt>i_Mp</tt></td>
+
<td align="left">Azimuthal periodicity, i.e. $m=0,M_p,2M_p,\dots,(M-1)M_p$</td>
+
</tr>
+
<tr class="odd">
+
<td align="left"></td>
+
<td align="left">(set =1 for no symmetry assumption)</td>
+
</tr>
+
<tr class="even">
+
<td align="left"><tt>d_Re</tt></td>
+
<td align="left">Reynolds number $Re$ or $Rm_m$</td>
+
</tr>
+
<tr class="odd">
+
<td align="left"><tt>d_alpha</tt></td>
+
<td align="left">Axial wavenumber $\alpha=2\pi/L_z$</td>
+
</tr>
+
<tr class="even">
+
<td align="left"><tt>b_const_flux</tt></td>
+
<td align="left">Enforce constant flux $U_b=\frac1{2}$.</td>
+
</tr>
+
<tr class="odd">
+
<td align="left"><tt>i_save_rate1</tt></td>
+
<td align="left">Save frequency for snapshot data files</td>
+
</tr>
+
<tr class="even">
+
<td align="left"><tt>i_save_rate2</tt></td>
+
<td align="left">Save frequency for time-series data</td>
+
</tr>
+
<tr class="odd">
+
<td align="left"><tt>i_maxtstep</tt></td>
+
<td align="left">Maximum number of timesteps (no limit if set $<0$)</td>
+
</tr>
+
<tr class="even">
+
<td align="left"><tt>d_cpuhours</tt></td>
+
<td align="left">Maximum number of cpu hours</td>
+
</tr>
+
<tr class="odd">
+
<td align="left"><tt>d_time</tt></td>
+
<td align="left">Start time (taken from <tt>state.cdf.in</tt> if set $<0$)</td>
+
</tr>
+
<tr class="even">
+
<td align="left"><tt>d_timestep</tt></td>
+
<td align="left">Fixed timestep (typically 0.01 or dynamically controlled if set $<0$)</td>
+
</tr>
+
<tr class="odd">
+
<td align="left"><tt>d_dterr</tt></td>
+
<td align="left">Maximum corrector norm, $\|f_{corr}\|$</td>
+
</tr>
+
<tr class="even">
+
<td align="left"><tt>d_courant</tt></td>
+
<td align="left">Courant number $\mathrm{C}$</td>
+
</tr>
+
<tr class="odd">
+
<td align="left"><tt>d_implicit</tt></td>
+
<td align="left">Implicitness $c$</td>
+
</tr>
+
</table>
+
 
+
== Input files ==
+
 
+
State files are stored in the NetCDF data format and can be transferred across different architectures safely. The program <tt>main.out</tt> runs with the compiled parameters (see <tt>main.info</tt>) but will load states of other truncations. Copy any state file e.g. state0018.cdf.dat to state.cdf.in It will interpolate if necessary.
+
 
+
ll
+
 
+
<tt>state.cdf.in</tt>:<br />
+
$t$ &amp; – '''Start time'''. Overridden by <tt>d_time</tt> if <tt>d_time</tt>$\ge$0<br />
+
$\Delta t$ &amp; – '''Timestep'''. Ignored, see parameter <tt>d_timestep</tt>.<br />
+
$N, M_p, r_n$ &amp; – '''Number of radial points, azimuthal periodicity,'''<br />
+
&amp; $\quad$ '''radial values''' of the input state.<br />
+
&amp; – If the radial points differ the fields are<br />
+
&amp; $\quad$ interpolated onto the new points.<br />
+
$u_r,\,u_\theta,\, u_z$ &amp; – '''Field vectors'''.<br />
+
&amp; – If $K\ne\,$<tt>K</tt> or $M\ne\,$<tt>M</tt>, harmonics are truncated or<br />
+
&amp; $\quad$ zeros appended.
+
 
+
== Output ==
+
 
+
=== Snapshot data ===
+
 
+
Data saved every <tt>i_save_rate1</tt> timesteps:
+
 
+
<pre>  state????.cdf.dat
+
  vel_spectrum.dat</pre>
+
All output is sent to the current directory, and state files are numbered <tt>0000</tt>, <tt>0001</tt>, <tt>0002</tt>,…. Each file can be renamed <tt>state.cdf.in</tt> should a restart be necessary. To list times $t$ for each saved state file,<br />
+
<tt>  grep state OUT</tt><br />
+
The spectrum files are overwritten each save as they are retrievable from the state data. To verify sufficient truncation, a quick profile of the energy spectrum can be plotted with<br />
+
<tt>  gnuplot&gt; set log</tt><br />
+
<tt>  gnuplot&gt; plot 'vel_spectrum.dat' w lp</tt><br />
+
 
+
 
+
=== Time-series data ===
+
 
+
Data saved every <tt>i_save_rate2</tt> timesteps:
+
 
+
<table>
+
<tr class="odd">
+
<td align="left"><tt>tim_step.dat</tt></td>
+
<td align="left">$t$, $\Delta t$, $\Delta t_{\|f\|}$, $\Delta t_{CFL}$</td>
+
<td align="left">current and limiting step sizes</td>
+
</tr>
+
<tr class="even">
+
<td align="left"><tt>vel_energy.dat</tt></td>
+
<td align="left">$t$, $E$</td>
+
<td align="left">energies and viscous dissipation</td>
+
</tr>
+
<tr class="odd">
+
<td align="left"><tt>vel_friction.dat</tt></td>
+
<td align="left">$t$, $U_b$ or $\beta$, $\langle u_z(r=0)\rangle_z$, $u_\tau$</td>
+
<td align="left">bulk velocites, pressure, friction vel.</td>
+
</tr>
+
</table>
+
 
+
 
+
== Compiling libraries ==
+
 
+
Take a look at Makefile. The compiler and flags will probably need editing. There are suggested flags for many compilers at the top of this file.
+
 
+
Building the main code is successful if running <tt>make</tt> produces no errors, but this requires the necessary libraries to be present - LAPACK, netCDF and FFTW. Often it is necessary to build LAPACK and NetCDF with the compiler and compile flags that will be used for the main simulation code.
+
 
+
The default procedure for building a package (FFTW3, netCDF) is
+
 
+
<pre>tar -xvvzf package.tar.gz
+
cd package/
+
[set environment variables if necessary]
+
./configure --prefix=&lt;path&gt;
+
make
+
make install</pre>
+
'''FFTW3'''. This usually requires no special treatment. Install with your package manager or build with the default settings.
+
 
+
– '''LAPACK'''. If using gfortran you might be able to use the binary version supplied for your linux distribution. Otherwise, edit the file make.inc that comes with LAPACK, setting the fortran compiler and flags to those you plan to use. Type ‘make’. Once finished, copy the following binaries into your library path (see Makefile LIBS -L$<$path$>$/lib/)<br />
+
<tt>cp lapack.a &lt;path&gt;/lib/liblapack.a</tt><br />
+
<tt>cp blas.a &lt;path&gt;/lib/libblas.a</tt>
+
 
+
– '''netCDF'''. If using gfortran you might be able to use the supplied binary version for your linux distribution. Otherwise, typical environment variables required to build netCDF are<br />
+
<tt>CXX=&quot;&quot;</tt><br />
+
<tt>FC=/opt/intel/fc/10.1.018/bin/ifort</tt><br />
+
<tt>FFLAGS=&quot;-O3 -mcmodel=medium&quot;</tt><br />
+
<tt>export CXX FC FFLAGS</tt><br />
+
After the ‘make install’, ensure that files <tt>netcdf.mod</tt> and <tt>typesizes.mod</tt> appear in your include path (see Makefile COMPFLAGS -I$<$path$>$/include/).  Several versions can be found at
+
<tt>http://www.unidata.ucar.edu/downloads/netcdf/current/index.jsp</tt>. Version '''4.1.3''' is relatively straight forward to install, the above flags should be sufficient.
+
 
+
Installation is trickier for more recent versions [2013-12-11 currently netcdf-4.3.0.tar.gz]. First<br />
+
<tt>./configure --disable-netcdf-4 --prefix-&lt;path&gt;</tt><br />
+
which disables HDF5 support (not currently required in code). Fortran is no longer bundled, so get netcdf-fortran-4.2.tar.gz or a more recent version from here<br />
+
<tt>http://www.unidata.ucar.edu/downloads/netcdf/index.jsp</tt><br />
+
Build with the above flags, and in addition<br />
+
<tt>CPPFLAGS=-I&lt;path&gt;/include</tt><br />
+
<tt>export CPPFLAGS</tt><br />
+
Add the link flag <tt>-lnetcdff</tt> before <tt>-lnetcdf</tt> in the <tt>Makefile</tt>.
+
 
+
== Compiling, starting and ending a run ==
+
 
+
* (See also the [[Tutorial]].)
+
 
+
* For serial use set <tt>_Np</tt> to <tt>1</tt> in <tt>parallel.h</tt> (no MPI required).  The compiler in <tt>Makefile</tt> might need altering if switching between parallel and serial use, or call <tt>mpirun -np 1 ...</tt>.
+
 
+
* Set the '''parameters''' in <tt>program/parameters.f90</tt>.
+
 
+
* Compile
+
<pre>&gt; make
+
&gt; make install  </pre>
+
The second command creates the directory <tt>install/</tt> and a text file [[main.info]], which is a record of settings at compile time.
+
 
+
* Next an '''initial condition''' <tt>state.cdf.in</tt> is needed. '''NOTE''': Any output state, e.g. state0012.cdf.dat can be copied to <tt>state.cdf.in</tt> to be used as an initial condition. If resolutions do not match, they are '''automatically interpolated or truncated'''.
+
<pre>&gt; mv install ~/runs/job0001
+
&gt; cd ~/runs/job0001/
+
&gt; cp .../state0012.cdf.dat state.cdf.in</pre>
+
 
+
* '''To start''' the run (good to first double-check in [[main.info]] that the executable was compiled with correct parameters)
+
<pre>&gt; nohup ./main.out &gt; OUT 2&gt; OUT.err &amp;</pre>
+
After a few moments, press enter again to see if main.out stopped prematurely.  If it has stopped there will be a message e.g. '[1]+ Done nohup...'; check <tt>OUT.err</tt> or <tt>OUT</tt> for clues why.
+
 
+
* '''To end''' the run
+
<pre>&gt; rm RUNNING</pre>
+
This will terminate the job cleanly.
+
 
+
* '''NOTE''': I generate almost all initial conditions by taking a state from a run with similar parameters. If there is a mismatch in <tt>i_Mp</tt>, use the utility [[changeMp.f90]].
+
 
+
== Monitoring a run ==
+
 
+
Immediately after starting a job, it’s a good idea to check for any warnings
+
 
+
<pre>&gt; less OUT</pre>
+
To find out number of timesteps completed, or for possible diagnosis of an early exit,
+
 
+
<pre>&gt; tail OUT</pre>
+
The code outputs timeseries data and snapshot data, the latter has a 4-digit number e.g. <tt>state0012.cdf.dat</tt>.
+
 
+
To see when the in the run each state was saved,
+
 
+
<pre>&gt; grep state OUT | less  [OR]
+
&gt; head -n 1 vel_spec* | less</pre>
+
I often monitor progress with <tt>tail vel_energy.dat</tt> or
+
 
+
<pre>&gt; gnuplot
+
&gt; plot 'vel_energy.dat' w l</pre>
+
Use <tt>rm RUNNING</tt> to end the job.
+
 
+
== Making utils ==
+
 
+
The core of the code in <tt>program/</tt> rarely needs to be changed. Almost anything can be done by creating a utility instead. There are many examples in <tt>utils/</tt> and further information on the [[Utilities]] page.
+
 
+
In <tt>Makefile</tt>, set <tt>UTIL = utilname</tt> (omitting the <tt>.f90</tt> extension), then type <tt>'make util'</tt> which builds <tt>utilname.out</tt>.
+
 
+
It is good practice to do a <tt>'make install'</tt> to generate a [[main.info]] file to keep alongside the executable.
+
 
+
= Spatial representation =
+
 
+
== Finite differences — $r$ ==
+
 
+
A function $p(r)$ is represented by the values $p_n=p(r_n)$, where $p(r)$ is evaluated on the $N$ radial points $r_n$.
+
 
+
=== Differentiation ===
+
 
+
Taylor expansions about central point $x_0$, $k$ neighbouring points each side
+
 
+
$\label{eq:taylorexp}
+
\begin{array}{lclrrrr}
+
  f(x_{-k}) &=& f(x_0)
+
  &+& (x_{-k}-x_0)\,f'(x_0) &+& \frac{(x_{-k}-x_0)^2}{2!}\,f''(x_0)+\dots
+
  \\ & \vdots & \\
+
  f(x_k) &=& f(x_0)
+
  &+& (x_k-x_0)\,f'(x_0) &+& \frac{(x_k-x_0)^2}{2!}\,f''(x_0)+\dots
+
\end{array}$
+
 
+
The expansions \ref{eq:taylorexp} can be written
+
 
+
$\vec{f}=\mat{A}\,\vec{df}, \quad
+
  \vec{df}=\mat{A}^{-1}\,\vec{f}, \quad
+
  \vec{f}=\left[\begin{array}{c}f(x_{-k})\\
+
          \vdots\\ f(x_k)\end{array}\right], \quad
+
  \vec{df}=\left[\begin{array}{c}
+
          f(x_0)\\ f'(x_0)\\ f''(x_0)\\ \vdots \end{array}\right]$
+
 
+
Derivatives are calculated using weights from the appropriate row of $\mat{A}^{-1}$.
+
 
+
=== Integration ===
+
 
+
Integrating \ref{eq:taylorexp} the indefinite integral may be approximated about $x_0$
+
 
+
$\begin{eqnarray*}
+
    \int f(x) \, {\mathrm d}x & = &
+
    \begin{array}{c}(x-x_0)\,f(x_0) + \frac{(x-x_0)^2}{2!}\,f'(x_0) +
+
    \frac{(x-x_0)^3}{3!}\,f''(x_0) + \dots \end{array}
+
    \\
+
    & = &
+
    [\begin{array}{lll}(x-x_0) & \frac{(x-x_0)^2}{2!} &
+
    \dots \end{array}] \,\, \mat{A}^{-1}\vec{f}\end{eqnarray*}$
+
 
+
The total integral is approximated by
+
 
+
$  \frac1{2} \sum_n \int_{x_{n-1}}^{x_{n+1}} f(x) \, {\mathrm d}x$
+
 
+
=== Matrix representation of linear operators ===
+
 
+
Linear equations can be written in the matrix form
+
 
+
$\mat{L} \,\vec{p} = \mat{M}\, \vec{q} + \vec{s}.$
+
 
+
$\mat{L}$ and $\mat{M}$ are $N\times N$ matrices representing linear operators. If $\vec{q}$ and $\vec{s}$ are known then the right-hand side is simple to evaluate. Consider the equation
+
 
+
$\label{eq:matmod}
+
  \mat{L}\, \vec{p} = \vec{q},$
+
 
+
where $\vec{p}$ is unknown. If the matrix operators are formed by linear combinations of the weights in [[#Differentiation]], using only $k$ neighbouring points to each side, they are banded. Boundary conditions are formed in the same way and are placed in the end rows. The equation is solved by forward-backward substitution, using the banded LU-factorisation of $\mat{L}$.
+
 
+
figure=LU.eps, scale=0.65
+
 
+
== Fourier representaion — $\theta$, $z$ ==
+
 
+
=== Expansion of variables ===
+
 
+
$A(\theta,z) =
+
      \sum_{km} A_{km} {\mathrm e}^{{\mathrm i} (\alpha kz + m_0 m\theta)}$
+
 
+
$A$ real implies $A_{km} = A_{-k,-m}^*$ (coefficients are conjugate-symmetric). It is sufficient to keep only $m\ge 0$, and for $m=0$ keep $k\ge 0$.
+
 
+
=== Orthogonality ===
+
 
+
Exponentials
+
 
+
$\int_0^{2\pi} {\mathrm e}^{{\mathrm i}(m+n)} \, {\mathrm d}\theta
+
      = 2\pi \, \delta_{m,-n}\, ,
+
      \qquad
+
      \int_0^{\frac{2\pi}{\alpha}} {\mathrm e}^{{\mathrm i}\alpha(k+j)} \, {\mathrm d}z
+
      = \frac{2\pi}{\alpha} \, \delta_{k,-j}$
+
 
+
=== Energy integral ===
+
 
+
$E \, = \,
+
      {\textstyle \frac1{2}} \int \vec{A} \cdot \vec{A} \, {\mathrm d}V
+
      \, = \,
+
      \frac{2\pi^2}{\alpha} \, \sum_{km}
+
      \int |\vec{A}_{km}|^2 \, r\,{\mathrm d}r$
+
 
+
$E = \sum_{m=0}^{\infty} E_m,
+
  \qquad
+
      E_m =
+
      \left\{
+
        \begin{array}{ll}
+
            \displaystyle
+
            \frac{2\pi^2}{\alpha}\int\vec{A}_{00}^2 \, r \, {\mathrm d}r
+
            + \frac{4\pi^2}{\alpha} \sum_{k>0}
+
            \int|\vec{A}_{k0}^2| \, r \, {\mathrm d}r,
+
            & m=0, \\[15pt]
+
            \displaystyle
+
            \frac{4\pi^2}{\alpha} \sum_k
+
            \int |\vec{A}_{km}|^2 \, r \, {\mathrm d}r,
+
            & m>0 .
+
        \end{array}
+
      \right.$
+
 
+
=== Volume integral ===
+
 
+
$E \, = \,
+
      {\textstyle \frac1{2}} \int A \, {\mathrm d}V
+
      \, = \,
+
      \frac{4\pi^2}{\alpha} \, \int A_{00} \, r\,{\mathrm d}r$
+
 
+
== Enforced conditions at the axis ==
+
 
+
The geometry enforces conditions on each variable at the axis when expanded over Fourier modes in $\theta$. Given a vector A, each mode for $A_z$ is even in $r$ if $m$ is even, and is odd if $m$ is odd. Each mode for $A_r$ and $A_\theta$ is even in $r$ if $m$ is odd, and is odd if $m$ is even.
+
 
+
= Model =
+
 
+
== Governing equations ==
+
 
+
=== Non-dimensionalisation / scales ===
+
 
+
* Length $R$ (radius=$D/2$).
+
* Velocity $2\, U_b$, equivalent to $U_{cl}$ of laminar Hagen-Poiseuille flow ('''c'''entre'''l'''ine, '''b'''ulk=mean).
+
* The non-dimensional radius is 1, laminar centreline speed is 1, and bulk speed is 1/2.
+
* Note that 1 'advection-time' unit $D/U_b$ is equivalent to 4 code time units $R/2U_b$.
+
 
+
=== Dimensionless parameters ===
+
 
+
Reynolds number, fixed flux, $Re_m = 2 U_b R / \nu$<br />
+
Reynolds number, fixed pressure, $Re = U_{cl} R / \nu$<br />
+
$1+\beta = Re / Re_m$<br />
+
$Re_\tau=u_\tau R/\nu = (2\,Re_m\,(1+\beta))^\frac{1}{2}=(2\,Re)^\frac{1}{2}$
+
 
+
=== Evolution equations ===
+
 
+
Fixed flux,
+
 
+
$ (\partial_{t} + \vec{u}\cdot\bnabla) \vec{u}
+
  = -\bnabla \hat{p} + \frac{4}{\Rey_m}\,(1+\beta)\vechat{z}
+
    + \frac{1}{\Rey_m}\bnabla^2 \vec{u} $
+
 
+
Fixed pressure
+
 
+
$ (\partial_{t} + \vec{u}\cdot\bnabla) \vec{u}
+
  = -\bnabla \hat{p} + \frac{4}{\Rey}\vechat{z}
+
    + \frac{1}{\Rey}\bnabla^2 \vec{u} $
+
 
+
Let $\vec{u}=W(r)\vechat{z}+\vec{u}'$. Using the scaling above, $W(r) = 1-r^2$. Then
+
 
+
$ (\partial_{t} - \frac{1}{\Rey_m}\bnabla^2)\,\vec{u}'
+
  = \vec{u}' \wedge (\bnabla \wedge\vec{u}')
+
  - \frac{\mathrm{d}W}{\mathrm{d}r}\,u'_z \vechat{z}
+
  - W\,\partial_{z}\vec{u}' + \frac{4\,\beta}{\Rey_m}\vechat{z} - \bnabla\hat{p}' \, . $
+
 
+
== Decoupling the equations ==
+
 
+
The equations for $u_r$ and $u_\theta$ are coupled in the Laplacian. They can be separated in a Fourier decompositon by considering
+
 
+
$u_\pm = u_r \pm \mathrm{i} \, u_\theta,$
+
 
+
for which the $\pm$ are considered respectively. Original variables are easily recovered
+
 
+
$u_r = \frac{1}{2} ( u_+ + u_-),
+
  \qquad
+
  u_\theta = -\,\frac{\mathrm{i}}{2}(u_+ - u_- ) .$
+
 
+
Governing equations are then
+
 
+
$\begin{eqnarray*}
+
  (\partial_{t} - \nabla^2_\pm)\, u_\pm
+
  & = &  N_\pm - (\bnabla p)_\pm , \\
+
  (\partial_{t} - \nabla^2 )\, u_z
+
  & = &  N_z - (\bnabla p)_z ,\end{eqnarray*}$
+
 
+
where
+
 
+
$\nabla^2_\pm = \nabla^2 - \frac{1}{r^2}
+
  \pm \frac{\mathrm{i}}{r^2}\partial_{\theta}$
+
 
+
 
+
== PPE-formulation with correct boundary conditions ==
+
 
+
Write the time-discretised Navier–Stokes equations in the form
+
 
+
$\label{eq:NSdisc}
+
        \left\{\begin{array}{rcl}
+
    \mat{X}\, \vec{u}^{q+1} & = & \mat{Y}\, \vec{u}^q
+
                + \vec{N}^{q+\frac{1}{2}} - \bnabla p \, , \\
+
                \nabla^2 p & = & \bnabla\cdot(\mat{Y}\, \vec{u}^q
+
                + \vec{N}^{q+\frac{1}{2}}) ,
+
        \end{array}\right.$
+
 
+
where $q$ denotes time $t_q$, which is sixth order in $r$ for $\vec{u}^{q+1}$ and second order for $p$, where the solenoidal condition is not explicitly imposed. Symmetry conditions provide the conditions at the axis. The difficulty is in imposing the remaining four — this system should be inverted, in principle, simultaneously for $p$ and $\vec{u}^{q+1}$ with boundary conditions $\vec{u}^{q+1}=\vec{0}$ and $\bnabla\cdot\vec{u}^{q+1}=0$ on $r=R$ (Rempfer 2006). In practice it would be preferable to invert for $p$ first then for $\vec{u}^{q+1}$, but the boundary conditions to not involve $p$ directly.
+
 
+
Note that the $\mat{Y}\,\vec{u}^q$ term has been included in the right-hand side of the pressure-Poisson equation, the divergence of which should be small. Assume that pressure boundary condition is known: the right-hand side of the Navier–Stokes equation is then projected onto the space of solenoidal functions though $p$ and hence after inversion, $\vec{u}^{q+1}$ will be solenoidal.
+
 
+
Consider the ‘bulk’ solution, $\{\bar{\vec{u}},\bar{p}\}$, obtained from solution of the following:
+
 
+
$\label{eq:NSbulk}
+
        \left\{\begin{array}{rcl}
+
    \mat{X}\, \bar{\vec{u}} & = & \mat{Y}\, \vec{u}^q
+
                + \vec{N}^{q+\frac{1}{2}} - \bnabla \bar{p} \, , \\
+
                \nabla^2 \bar{p} & = & \bnabla\cdot(\mat{Y}\, \vec{u}^q
+
                + \vec{N}^{q+\frac{1}{2}}) ,
+
        \end{array}\right.$
+
 
+
with boundary conditions $\bar{\vec{u}}=\vec{0}$ and $\partial_{r}\bar{p}=0$. Introduce the following systems:
+
 
+
$\label{eq:NSp0}
+
        \left\{\begin{array}{rcl}
+
          \vec{u}' & = & -\bnabla p' \, , \\
+
                \nabla^2 p' & = & 0,
+
        \end{array}\right.$
+
 
+
with boundary condition $\partial_{r}p'=1$ on $r=R$, and
+
 
+
$\label{eq:NSu0}
+
        \left\{\begin{array}{rcl}
+
    \mat{X}\, \vec{u}' & = & \vec{0},
+
        \end{array}\right.$
+
 
+
with boundary conditions $u'_+=1$, $u'_-=1$, $u'_z=\mathrm{i}$ on $r=R$. The system \ref{eq:NSp0} provides a linearly independent function $\vec{u}'_j$ that may be added to $\bar{\vec{u}}$ without affecting the right-hand side in \ref{eq:NSbulk}, but altering (to correct) the boundary condition applied. (Note that $\bnabla^2(\bnabla p')=\bnabla(\nabla^2 p')-\bnabla \wedge\bnabla \wedge\bnabla p'=0$.) Similarly the system \ref{eq:NSu0} provides a further three functions. The superposition
+
 
+
$\label{eq:usuperpos}
+
  \vec{u}^{q+1} = \bar{\vec{u}} + \sum_{j=1}^4 a_j\, \vec{u}'_j \,$
+
 
+
may be formed in order to satisfy the four boundary conditions, $\vec{u}^{q+1}=\vec{0}$ and $\bnabla\cdot\vec{u}^{q+1}=0$ on $r=R$. Substituting \ref{eq:usuperpos} into the boundary conditions, they may be written
+
 
+
$\mat{A}\,\vec{a} = -\vec{g}(\bar{\vec{u}}) ,$
+
 
+
where $\mat{A}=\mat{A}(\vec{g}(\vec{u}'))$ is a 4$\times$4 matrix. The appropriate coefficients required to satisfy the boundary conditions are thereby recovered from solution of this small system for $\vec{a}$. The error in the boundary conditions $g_j(\vec{u}^{q+1})$ using the influence-matrix technique is at the level of the machine epsilon, typically order 1e-18.
+
 
+
The functions $u'_j(r)$, the matrix $\mat{A}$ and its inverse may all be precomputed. The boundary conditions for $\vec{u}'$ have been chosen so that that $u'_\pm$ are pure real, $u'_z$ is pure imaginary, and $\mat{A}$ is real. For each timestep, this application of the influence matrix technique requires only evaluation of the deviation from the boundary condition, multiplication by a 4$\times$4 real matrix, and the addition of only two functions to each component of $\vec{u}$, each either pure real or pure imaginary. Compared to the evaluation of nonlinear terms, the computational overhead is negligible.
+
 
+
== Predictor-corrector ==
+
 
+
The model equation for each Fourier mode is
+
 
+
$\label{eq:harmmod}
+
  (\partial_{t} - \nabla^2) f = N,$
+
 
+
where nonlinear terms have been evaluated on each radial point by the transform method, and is solved as in [[#Matrix_representation_of_linear_operators]]. The predictor at time $t_q$, with Euler nonlinear terms and implicitness $c$
+
 
+
$\frac{f_1^{q+1}-f^q}{\Delta t}
+
  - \left( c\nabla^2 f_1^{q+1} + (1-c)\nabla^2 f^q \right)
+
  = N^q ,$
+
 
+
$\left( \frac1{\Delta t} - c\nabla^2 \right) f_1^{q+1}
+
  = \left( \frac1{\Delta t} + (1-c)\nabla^2 \right) f^q + N^q .$
+
 
+
Corrector iterations are
+
 
+
$\left( \frac1{\Delta t} - c\nabla^2 \right) f_{j+1}^{q+1}
+
  = \left( \frac1{\Delta t} + (1-c)\nabla^2 \right) f^q
+
  + c\,N_j^{q+1} + (1-c)\, N^q ,$
+
 
+
or equivalently, for the correction $f_{corr} = f_{j+1}^{q+1} - f_j^{q+1}$
+
 
+
$\left( \frac1{\Delta t} - c\nabla^2 \right) f_{corr}
+
  = c\,N_j^{q+1} - c\, N_{j-1}^{q+1} ,$
+
 
+
where $j=1,2,\dots$ and $N_0^{q+1}=N^q$. The size of the correction $\|f_{corr}\|$ must reduce at each iteration. For $c=\frac1{2}$ the scheme is second order such that $\|f_{corr}\| \sim \Delta t^2$.
+
 
+
== Timestep control ==
+
 
+
$\Delta t = \mathrm{C} \, \min(\Delta \, / \, |\vec{v}| ) ,
+
  \qquad
+
  0<\mathrm{C}<1,$
+
 
+
where $\mathrm{C}$ is the Courant number.
+
 
+
The timestep should also be small enough such that the corrector norm $\|f_{corr}\|$ is satisfactorily small. This may be important for integrating initial transients, but ought not to be the limiting factor in general.
+
 
+
= The code =
+
 
+
== Naming conventions ==
+
 
+
=== Parameters ===
+
 
+
Parameters defined within <tt>parameters.f90</tt> are given a prefix indicating the data type. For example:
+
 
+
<table>
+
<tr class="odd">
+
<td align="left">$N$</td>
+
<td align="left"><tt>i_N</tt></td>
+
<td align="left">integer</td>
+
</tr>
+
<tr class="even">
+
<td align="left">$\Delta t$</td>
+
<td align="left"><tt>d_timestep</tt></td>
+
<td align="left">double</td>
+
</tr>
+
<tr class="odd">
+
<td align="left">fixed flux?</td>
+
<td align="left"><tt>b_fixed_flux</tt></td>
+
<td align="left">boolean</td>
+
</tr>
+
</table>
+
 
+
Note that the naming convention frees the variable name for dummy variables, e.g.:
+
 
+
<pre>  integer :: n
+
  do n = 1, i_N
+
      ...</pre>
+
=== Public variables ===
+
 
+
By default variables declared within a module are accessible to any other module or subroutine that uses the module. To ensure that it is clear where a public variable is declared, variables are given a prefix. Examples:
+
 
+
<table>
+
<tr class="odd">
+
<td align="left">$\Delta t$</td>
+
<td align="left"><tt>tim_dt</tt></td>
+
<td align="left"><tt>timestep.f90</tt></td>
+
</tr>
+
<tr class="even">
+
<td align="left">$u_r$</td>
+
<td align="left"><tt>vel_r</tt></td>
+
<td align="left"><tt>velocity.f90</tt></td>
+
</tr>
+
</table>
+
 
+
== Ordering the Fourier modes ==
+
 
+
$A = \sum_{km} A_{km} {\mathrm e}^{{\mathrm i}(\alpha kz + m_0 m\theta)},$
+
 
+
Coefficients are stored as in the following loop, where
+
rather than two indices <tt>k</tt> and <tt>m</tt>, the single index <tt>nh</tt> labels the modes:
+
 
+
<pre>  nh = -1
+
  do m = 0, M-1
+
      m_ = m*Mp
+
      do k = -(K-1), (K-1)
+
        if(m==0 .and. k&lt;0) cycle
+
        nh = nh+1
+
        ...</pre>
+
NOTE: this loop, and its parallel equivalent, are predefinded macros in <tt>parallel.h</tt>.
+
 
+
== Data types ==
+
 
+
The data types below consist of logical data groups to facilitate the passing of data between functions. For clarity, in the following text they are defined as though we were working on a single processor. See [[#Modifications_for_the_parallel_implementation|Parallel]] regarding subtle differences in the definitions due to parallelisation.
+
 
+
=== coll, spec — storage of coefficients ===
+
 
+
The collocated data <tt>type (coll)</tt> is the principle type used for mode-independent operations.
+
 
+
<pre>  type coll
+
      double precision    :: Re(i_N, 0:i_H1)
+
      double precision    :: Im(i_N, 0:i_H1)
+
  end type coll</pre>
+
The value <tt>Re(n,nh)</tt> is the real part of the coefficient for the <tt>nh</tt>$^{\mathrm{th}}$ harmonic at $r_n$. <tt>H1</tt>$=24$ for the example in [[#Ordering_of_the_Fourier_modes]] and <tt>N</tt> is the number of radial points.
+
 
+
The spectral <tt>type (spec)</tt> is the data type used for operations independent at each radial point. It is principally a transitory data format between the <tt>coll</tt> and <tt>phys</tt> types.
+
 
+
<pre>  type spec
+
      double precision    :: Re(0:i_H1, i_N)
+
      double precision    :: Im(0:i_H1, i_N)
+
  end type spec</pre>
+
Conversions between the <tt>coll</tt> and <tt>spec</tt> types involve only a transpose, <tt>var_coll2spec()</tt> and <tt>var_spec2coll()</tt>.
+
 
+
=== phys — real space data ===
+
 
+
The <tt>type (phys)</tt> is used for data in real space.
+
 
+
<pre>  type phys
+
      double precision    :: Re(0:i_Z-1, 0:i_Th-1, i_N)
+
  end type phys</pre>
+
The element <tt>Re(k,m,n)</tt> refers to the value at $z_k$, $\theta_m$ and $r_n$.
+
 
+
=== mesh, lumesh — storage of banded matrices ===
+
 
+
The finite-difference stencil has finite width ([[#Finite_differences_.E2.80.94_.24r.24|Finite_Differences]]) and so matrix operations involve matrices that are banded. The <tt>type (mesh)</tt>, defined in <tt>meshs.f90</tt>, is used for matrices on the radial mesh involving $kl$ points to either side of each node.
+
 
+
<pre>  type mesh
+
      double precision :: M(2*i_KL+1, i_N)
+
  end type mesh</pre>
+
For a matrix <tt>A</tt>, the element <tt>M(KL+1+n-j, j) = A(n,j)</tt>. See the man page for the lapack routine <tt>dgbtrf</tt>. The LU-factorisation of a banded matrix is also banded:
+
 
+
<pre>  type lumesh
+
      integer          :: ipiv(i_N)
+
      double precision :: M(3*i_KL+1, i_N)
+
  end type lumesh</pre>
+
The element <tt>M(2*KL+1+n-j, j) = A(n,j)</tt>. Pivoting information is stored along with the matrix.
+
 
+
=== rdom — the radial domain ===
+
 
+
The <tt>type (rdom)</tt> contains information the radial domain. The public variable <tt>mes_D</tt> is declared in <tt>meshs.f90</tt>. Data includes the number of radial points <tt>N</tt>; powers of $r$, $r_n^p$=<tt>r(n,p)</tt>; weights for integration $r{\mathrm d}r$; matrices for taking the $p^{\mathrm{th}}$ derivative <tt>dr(p)</tt>;…
+
 
+
<pre>  type rdom
+
      integer          :: N
+
      double precision :: r(i_N,i_rpowmin:i_rpowmax)
+
      double precision :: intrdr(i_N)
+
      type (mesh)      :: dr(i_KL)
+
  end type rdom</pre>
+
 
+
== Modifications for the parallel implementation ==
+
 
+
The code has been parallelised using the Message Passing Interface (MPI). Sections of the code with special MPI function calls are separated from the serial sections by preprocessor directives. The number of processors is given by the parameter <tt>_Np</tt> set in <tt>parallel.h</tt>. The rank of the local processor is given by <tt>mpi_rnk</tt> declared in <tt>mpi.f90</tt>, and <tt>mpi_sze</tt>$=$<tt>_Np</tt>.
+
 
+
Data type changes for the model equation (see equations \ref{eq:matmod}, \ref{eq:harmmod}) are as follows:
+
 
+
figure=model.eps, scale=0.75
+
 
+
Data is split over the spherical harmonics for the linear parts of the code — evaluating curls, gradients and matrix inversions for the timestepping. These linear operations do not couple modes. Here all radial points for a particular mode are located on the same processor, separate modes may be located on separate processors. In the definition of <tt>type (coll)</tt> the parameter <tt>i_H1</tt> is replaced by <tt>i_pH1 = (_Np+i_H1)/_Np-1</tt>. The variable <tt>var_H</tt> has elements to determine which harmonics are on which processor. The elements <tt>pH0</tt> and <tt>pH1</tt> are the zeroth index and one-minus the number of harmonics on the local processor. The corresponding values for all processors are stored in the arrays <tt>pH0_(rank)</tt> and <tt>pH1_(rank)</tt>. For a variable of <tt>type (coll)</tt>, valid values for the harmonic index in <tt>Re(n,nh)</tt> are <tt>nh</tt>=<tt>0..pH1</tt>, which refer to the indices ([[#Ordering_of_the_Fourier_modes]]) $nh$=<tt>pH0..pH0+pH1</tt>.
+
 
+
Data is split radially when calculating Fourier transforms and when evaluating products in real space. In the definitions of <tt>type (spec)</tt> and <tt>type (phys)</tt> the parameter <tt>i_N</tt> is replaced by <tt>i_pN = (_Np+i_N-1)/_Np</tt>. The location of the radial subdomain on the local processor is given by additional elements of the variable <tt>mes_D</tt>. The elements <tt>pNi</tt> and <tt>pN</tt> are the location of the first (inner) radial point and the number of points. The values for all processors are stored in the arrays <tt>pNi_(rank)</tt> and <tt>pN_(rank)</tt>. For a variable of <tt>type (spec)</tt>, valid values for the radial index in <tt>Re(nh,n)</tt> are <tt>n</tt>=<tt>1..pN</tt> which refer to the indices of $r_n$, where $n$=<tt>pNi..pNi+pN-1</tt>.
+
 
+
The bulk of communication between processors occurs during the data transposes in the functions <tt>var_coll2spec()</tt> and <tt>var_spec2coll()</tt>. For <tt>_Np</tt> processors the data is transfered in <tt>_Np</tt>$-1$ steps. At each step each processor sends and receives a block of data, the source and destination are selected systematically as set in the following loop:
+
 
+
<pre>  do step = 1, mpi_sze-1
+
      dest  = modulo(mpi_rnk+step, mpi_sze)
+
      src  = modulo(mpi_rnk-step+mpi_sze, mpi_sze)
+
      ...</pre>
+
figure=transp.eps, scale=0.65
+
 
+
The sketch is for the case <tt>_Np</tt>$=4$. The size of each block is approximately <tt>i_pN</tt>$\times($<tt>i_pH1</tt>$+1)$, and the data within each block must also be transposed.
+
 
+
 
+
 
+
= Differential operators in cylindrical–polar coordinates =
+
 
+
[app:diffops]
+
 
+
$(\bnabla f)_r = \partial_{r}f,
+
      \quad
+
      (\bnabla f)_\theta = \frac1{r}\, \partial_{\theta} f,
+
      \quad
+
      (\bnabla f)_z = \partial_{z} f .$
+
 
+
$\nabla^2 f = (\frac1{r}\,partial_{r} + \partial_{rr} ) f
+
      +\frac1{r^2}\,\partial_{\theta\theta} f + \partial_{zz} f .$
+
 
+
$(\nabla^2 \vec{A})_r  = 
+
      \nabla^2 A_r - \frac{2}{r^2}\,\partial_{\theta}A_\theta
+
      - \frac{A_r}{r^2} ,$
+
 
+
$(\nabla^2 \vec{A})_\theta  =     
+
      \nabla^2 A_\theta + \frac{2}{r^2}\,\partial_{\theta}A_r
+
      - \frac{A_\theta}{r^2} ,$
+
 
+
$(\nabla^2 \vec{A})_z  = 
+
      \nabla^2 A_z .$
+
 
+
$\bnabla \cdot \vec{A} = (\frac1{r} + \partial_{r}) A_r
+
      + \frac1{r}\,\partial_{\theta}A_\theta + \partial_{z} A_z .$
+
 
+
$(\bnabla \wedge \vec{A})_r  =
+
      \frac1{r}\,\partial_{\theta} A_z - \partial_{z} A_\theta ,$
+
 
+
$(\bnabla \wedge \vec{A})_\theta  =
+
      \partial_{z} A_r - \partial_{r} A_z ,$
+
 
+
$(\bnabla \wedge \vec{A})_z  = 
+
      (\frac1{r}+\partial_{r}) A_\theta - \frac1{r}\,\partial_{\theta} A_r .$
+
 
+
$(\vec{A}\cdot\bnabla\,\vec{B})_r  =
+
      A_r\,\partial_{r}B_r + \frac{A_\theta}{r}\,\partial_{\theta}B_r
+
      + A_z\,\partial_{z}B_r - \frac{A_\theta B_\theta}{r} ,$
+
 
+
$(\vec{A}\cdot\bnabla\,\vec{B})_\theta  =     
+
      A_r\,\partial_{r}B_\theta + \frac{A_\theta}{r}\,\partial_{\theta}B_\theta
+
      + A_z\,\partial_{z}B_\theta + \frac{A_\theta B_r}{r} ,$
+
 
+
$(\vec{A}\cdot\bnabla\,\vec{B})_z  =     
+
      A_r\,\partial_{r}B_z + \frac{A_\theta}{r}\,\partial_{\theta}B_z
+
      + A_z\,\partial_{z}B_z .$
+
 
+
Toroidal-poloidal decomposition, axial:
+
 
+
$\begin{eqnarray*}
+
      &
+
      \vec{A} = \bnabla \wedge(\vechat{z}\psi) + \bnabla \wedge\bnabla \wedge(\vechat{z}\phi) ,
+
      & \\ &
+
      \psi=\psi(r,\theta,z), \quad
+
      \phi=\phi(r,\theta,z).
+
      \nonumber &
+
  \end{eqnarray*}$
+
 
+
$\nabla^2_h f =
+
      (\frac1{r}\,\partial_{r}+\partial_{rr}) f + \frac1{r^2}\,\partial_{\theta\theta} f .$
+
 
+
$\vec{A}_r  =
+
      \frac1{r}\,\partial_{\theta} \psi + \partial_{rz} \phi ,$
+
 
+
$\vec{A}_\theta  =
+
      -\partial_{r} \psi + \frac1{r}\,\partial_{\theta z} \phi ,$
+
 
+
$\vec{A}_z  =     
+
      - \nabla^2_h \phi .$
+
 
+
Toroidal-poloidal decomposition, radial:
+
 
+
$\begin{eqnarray*}
+
      &
+
      \vec{A} = \psi_0\,\vechat{\theta} + \phi_0\,\vechat{z}
+
      + \bnabla \wedge(\vec{r}\psi) + \bnabla \wedge\bnabla \wedge(\vec{r}\phi) ,
+
      & \\ &
+
      \psi_0=\psi_0(r), \quad
+
      \phi_0=\phi_0(r), \quad
+
      \psi=\psi(r,\theta,z), \quad
+
      \phi=\phi(r,\theta,z).
+
      \nonumber &
+
  \end{eqnarray*}$
+
 
+
$\nabla^2_c f = \frac1{r^2}\,\partial_{\theta\theta} f + \partial_{zz} f .$
+
 
+
$\vec{A}_r  =
+
      -r \nabla^2_c \phi ,$
+
 
+
$\vec{A}_\theta  =
+
      \psi_0 + r \partial_{z} \psi + \partial_{r\theta} \phi ,$
+
 
+
$\vec{A}_z  =     
+
      \phi_0 - \partial_{\theta} \psi + (2+r\partial_{r})\partial_{z} \phi .$
+

Latest revision as of 08:21, 20 January 2018

If something needs updating or you have suggestions, please communicate them (Main_Page#Author) and/or request an openpipeflow login (top right corner). This website uses Mediawiki and is easy to edit.


Overview of the solver

Equations, properties, methods, etc.:

Using the simulation code:

  • Getting started - overview of files, setup, starting and ending a job.
  • Tutorial - setup a job, basic monitoring and visualisation of outputs.
  • Core implementation - Discretisation, Predictor-Corrector, Code Structure, Parallelisation.
  • Parallel i/o - a brief note on parallel data access.
  • Utilities - pre/post-processing, runtime processing and manipulations, Newton solver for pipe flow.

Non-problem specific codes

  • These are designed for integration with any pre-existing code.
  • File:Arnoldi.f - Krylov-subspace method for calculating eigenvalues of a matrix.
  • File:GMRESm.f90 - Krylov-subspace method for solving the linear system Ax=b for x.
  • File:NewtonHook.f90 - Newton-Krylov-Hookstep method for finding nonlinear solutions x of F(x)=0.