Postprocessing

Once a case has been setup correctly so it can be run without errors, you may want to modify the postprocessing to of the simulation output. By default, NekRS will output a basic set of data according to frequency set in the checkpointInterval of the Parameter File (.par) which can subsequently be viewed through a visualization tool such as Paraview or Visit. However, additional data or derived values can be extracted by setting up User Defined outputs using the UDF_ExecuteStep function of the .udf file.

Checkpointing & Visualization

Standard NekRS field data output files have the form <case>0.f<n>, where <case> is the case name and <n> is a five-digit number indicating the number of the output file (each output file represents a single time step that is output according to the settings for checkpointControl and checkpointInterval in the .par file). These output files are in Nek5000 format that has a line of ascii header followed by binary data. It requires a metadata file <case>.nek5000 to be viewable in Paraview or VisIt. This should be automatically generated by NekRS, but can also be manually created using the nrsvis script.

Adding Custom Fields

In the .par file, and for each COMMON field, one can use checkpointing = true to toggle on (or off) and decide whether dumping a specific field into checkpoint files. Combined with solver = none, the extra fields can be stored into auxiliary scalar fields. This is equivalent to simply attach the pointer to a field name. For instance, the following line will assign the pointer of density to the slot of scalar01.

nrs->addUserCheckpointField("scalar01", std::vector<deviceMemory<dfloat>>{nrs->o_rho});

Adding Custom Output File

A more flexible approach is to dump all desired fields into a separate file. This is handled by the iofld class (src/core/io/iofld.hpp) and the following example shows the common APIs:

// UDF global variable
std::unique_ptr<iofld> iofld;

// UDF_Setup
iofld = iofldFactory::create("nek");
iofld->open(mesh, iofld::mode::write, "properties");
iofld->writeAttribute("uniform", "true");
iofld->writeAttribute("polynomialOrder", std::to_string(mesh->N + 2));

iofld->addVariable("scalar00", std::vector<deviceMemory<dfloat>>{nrs->o_rho});
iofld->addVariable("velocityscalar00", std::vector<deviceMemory<dfloat>>{nrs->o_NLT});

// UDF_ExecuteStep
iofld->addVariable("time", time);
iofld->process();
if (nrs->lastStep) iofld->close();

Here, engineType = "nek" is used, which requires the variable names matching the common field names. The alternative is to use engineType = "adios" which provides a more general output format, documented in the ADIOS2 Format below.

create(const std::string& engineType = "");: Empty enginType will read platform->options.getArgs("CHECKPOINT ENGINE");
polynomialOrder: Can be a different polynomial order.
uniform: Interpolation data from GLL points to uniform grid.

Element filter: Only dump the elements in the list. For example, the below code snippet coped from turbPipe filters out the elements which has zmax larger than zRecycLayer.

auto elementFilter = [&]()
{
  std::vector<int> elementFilter;
  for(int e = 0; e < mesh->Nelements; e++) {
     auto zmaxLocal = std::numeric_limits<dfloat>::lowest();
     for(int i = 0; i < mesh->Np; i++) zmaxLocal = std::max(z[i + e * mesh->Np], zmaxLocal);
     if (zmaxLocal > zRecycLayer) elementFilter.push_back(e);
  }
  return elementFilter;
}();
iofld->writeElementFilter(elementFilter);

Compute Derived Quantity

Additional control of the simulation to compute additional/derived quantities or output custom fields can be achieved by utilising the UDF_ExecuteStep function of the .udf file. Here we demonstrate how this can be used to compute a derived quantity and output custom fields.

Tip

One can use if (nrs->checkpointStep) to only compute the quantities when it’s dumping a checkpoint File.

Built-in Operators

Without further specifying, the mesh is predefined as auto mesh = nrs->mesh;

platform->linAlg class has functions for commonly used vector computations. Here is an example to print min and max for each fields.

// Print min/max
auto printMinMax = [&](std::string tag, const occa::memory& o_u)
{
  if (o_u.isInitialized()) {
    const auto Nlocal = o_u.length();
    const auto umin = platform->linAlg->min(Nlocal, o_u, platform->comm.mpiComm);
    const auto umax = platform->linAlg->max(Nlocal, o_u, platform->comm.mpiComm);
    if (platform->comm.mpiRank == 0) {
      printf("chk min/max %d %2.4e %6s %13.6e %13.6e\n", tstep, time, tag.c_str(), umin, umax);
    }
  }
};
printMinMax("UX", nrs->o_U.slice(0 * nrs->fieldOffset, meshV->Nlocal));
printMinMax("UY", nrs->o_U.slice(1 * nrs->fieldOffset, meshV->Nlocal));
printMinMax("UZ", nrs->o_U.slice(2 * nrs->fieldOffset, meshV->Nlocal));
printMinMax("PR", nrs->o_P.slice(0 * nrs->fieldOffset, meshV->Nlocal));
printMinMax("T", cds->o_S.slice(cds->fieldOffsetScan[0], cds->mesh[0]->Nlocal));

mesh->surfaceAreaMultiply (in src/mesh/meshSurface.cpp) multiply the weights from surface integral on given boundary ids. Here is an example of getting the surface area.

// Set up boundaryID
std::vector<int> bidWall = {3};

auto nbid = bIDList.size();
auto o_bid = platform->o_memPool.reserve<int>(nbid);
o_bid.copyFrom(bIDList.data(), nbid);

// compute surface area
auto o_one = platform->o_memPool.reserve<dfloat>(fieldOffset);
platform->linAlg->fill(mesh->Nlocal, 1.0, o_one);
nusseltArea = mesh->surfaceAreaMultiplyIntegrate(nbid, o_bid, o_one).at(0);

Volume integral can be computed as the inner product between and the diagonal lumpped mass matrix mesh->o_LMM or also named mesh->o_Jw for Jacobian times quadrature weights. Here is an example of computing the averaged $v_z$ from the turbPipe example.

auto o_UZ = nrs->o_U + 2 * nrs->fieldOffset;
const dfloat ubar =
    platform->linAlg->innerProd(mesh->Nlocal, o_UZ, mesh->o_Jw, platform->comm.mpiComm) / mesh->volume;
if (platform->comm.mpiRank == 0) {
  printf(" uBulk: %g\n", ubar);
}
auto uzbar = platform->linAlg->innerProd(mesh->Nlocal, o_UZ, mesh->o_Jw, platform->comm.mpiComm) / mesh->volume;

Derivatives: The opSEM class (in src/core/opSEM.hpp) has various Del operators including gradient, divergence, curl and Laplacian with the combination of weak formulation or weighted strong formulation in terms of flux.

// compute unassembled inv mass matrix
auto o_invJw = platform->device.malloc<dfloat>(fieldOffset);
o_invJw.copyFrom(mesh->o_LMM, mesh->Nlocal);
platform->linAlg->ady(mesh->Nlocal, 1.0, o_invJw);

auto o_grad = platform->o_memPool.reserve<dfloat>(3*fieldOffset);
opSEM::strongGrad(mesh, fieldOffset, nrs->cds->o_S, o_grad); // w_i grad o_S
platform->linAlg->axmyMany(mesh->Nlocal, 3, fieldOffset, 0, 1.0, o_invJw, o_grad);

See also vecGradY in examples/gabls1/gabls.oudf for a customized kernel.

Q-criterion cane be computed by the function nrs->Qcriterion(o_qcriterion); See turbPipe and tgv examples.
Aero force, viscous and pressure induced drag force can be computed by nrs->aeroForces. See the example in examples/ktauChannel/ci.inc.

Nusselt number: With the functionalities above, the Nusselt number, $\frac{1}{|S|}\int_S \nabla T \cdot \vec{n} dS$, can be computed by the following

dfloat compute_nusselt(mesh_t *mesh, const int fieldOffset, std::vector<int> bIDList,
                 occa::memory o_Temp, const double time, const int tstep) {

  static auto firstTime = true;
  static dfloat nusseltArea = 0.0;
  static occa::memory o_invJw;

  auto nbid = bIDList.size();
  if (nbid == 0) return 0.0;
  auto o_bid = platform->o_memPool.reserve<int>(nbid);
  o_bid.copyFrom(bIDList.data(), nbid);

  if (firstTime) {
    o_invJw = platform->device.malloc<dfloat>(fieldOffset);
    o_invJw.copyFrom(mesh->o_LMM, mesh->Nlocal);
    platform->linAlg->ady(mesh->Nlocal, 1.0, o_invJw);

    // surface area
    auto o_one = platform->o_memPool.reserve<dfloat>(fieldOffset);
    platform->linAlg->fill(mesh->Nlocal, 1.0, o_one);
    nusseltArea = mesh->surfaceAreaMultiplyIntegrate(nbid, o_bid, o_one).at(0);
    firstTime = false;
  }

  // dTdn
  auto o_grad = platform->o_memPool.reserve<dfloat>(3*fieldOffset);
  opSEM::strongGrad(mesh, fieldOffset, o_Temp, o_grad);
  platform->linAlg->axmyMany(mesh->Nlocal, 3, fieldOffset, 0, 1.0, o_invJw, o_grad);

  auto flux = mesh->surfaceAreaNormalMultiplyIntegrate(
                                      fieldOffset,
                                      nbid,
                                      o_bid,
                                      o_grad).at(0);

  dfloat nu = flux / nusseltArea;
  if (platform->comm.mpiRank == 0) {
    printf("%9d %11.4e %11.4e %11.4e %11.4e nusselt(rs)\n",
           tstep, time, flux, nusseltArea, nu);
  }

  return nu;
}

Runtime Averaging

When running a high fidelity case with DNS or LES turbulence models, it is often necessary to time-average the solution fields to extract meaningful quantities. This may sometimes even be useful for a URANS case as well. Similar to Nek5000’s avg_all, NekRS has tavg plugin (../src/plugins/tavg.hpp) that can compute $E[u]:=\int_0^{time} u\ dt = \sum_i \Delta t\ u(i\Delta t)$ where $u$ here is an arbitrary scalar field.

To set it up, one needs pass the list of tavgFields to the API tavg::setup. Each entity in tavgFields is a vector of pointers that can have one to four scalar fields. For example, {o_w, o_w, o_temp} will compute the average of E[ o_w[i]*o_w[i]*o_temp[i] ]. The follow code is copied from the gabls1 example that will start the integral from time=10 and dump a checkpoint file with prefix tavg and storing the duration of the integral into time. The averaging period is also reset once the file is dump so each tavg checkpoint file stores the time averaged fields in each non-overlapped window.

#include "tavg.hpp"

void UDF_Setup()
{
  std::vector<std::vector<deviceMemory<dfloat>>> tavgFields;

  deviceMemory<dfloat> o_u(nrs->o_U.slice(0 * nrs->fieldOffset , nrs->fieldOffset));
  deviceMemory<dfloat> o_w(nrs->o_U.slice(2 * nrs->fieldOffset , nrs->fieldOffset));
  deviceMemory<dfloat> o_temp(nrs->cds->o_S.slice(0 * nrs->cds->fieldOffset[0], nrs->cds->fieldOffset[0]));

  tavgFields.push_back({o_u}); // E[u]
  tavgFields.push_back({o_w}); // E[v]
  tavgFields.push_back({o_temp}); // E[T]

  tavgFields.push_back({o_u, o_temp}); // E[uT]
  tavgFields.push_back({o_w, o_temp}); // E[wT]
  tavgFields.push_back({o_w, o_w, o_temp}); // E[w^2T]

  tavg::setup(nrs->fieldOffset, tavgFields);
}

void UDF_ExecuteStep(double time, int tstep)
{
  if (nrs->timeStepConverged && time>=10.0) tavg::run(time); // sum
  if (nrs->checkpointStep) tavg::outfld(mesh); // dump tavg file
}

Often, one might want to compute other quantities on the top of time-averaged variables. The storage can be accessed via tavg::o_avg(). For example, the code below copy the first three tavgFields into o_work.

o_work.copyFrom(tavg::o_avg(), 3 * nrs->fieldOffset);

Sample Points & Particle Tracking

Legacy Support (userchk)

ADIOS2 Format

Starting from v24, the iofld class is introduced to handle reading and writing files. This includes the ADIOS2 support and its BPFile format version 5 (BP5), controlled by checkpointEngine = adios.

By default, NekRS compiles the adios2 as a 3rd party library ENABLE_ADIOS=on.

The checkpoint file will be a folder end with .bp postfix, such as turbPipe.bp/. The ADIOS2 command line utilities will also be installed under $NEKRS_HOME/bin/. Those provides easy ways to inspect and manipulate the data. The data structure will be converted to vtk like format and can be opened by the ParaView’s ADIOS2VTXReader.

Here are some sample usages to inspect the file:

Check metadata with $NEKRS_HOME/bin/bpls turbPipe.bp/. In this case, there are 5 timesteps.

uint64_t  connectivity      [2]*{1344560, 9}
uint64_t  globalElementIds  [2]*{3920}
uint32_t  numOfCells        scalar
uint32_t  numOfPoints       scalar
uint32_t  polynomialOrder   scalar
float     pressure          5*[2]*{2007040}
double    time              5*scalar
uint32_t  types             scalar
float     velocity          5*[2]*{2007040, 3}
float     vertices          [2]*{2007040, 3}

Dump specific variables with -d. For example, $NEKRS_HOME/bin/bpls turbPipe.bp/ time polynomialOrder numOfCells numOfPoints -d

  uint32_t  numOfCells        scalar
1344560

  uint32_t  numOfPoints       scalar
2007040

  uint32_t  polynomialOrder   scalar
7

  double    time              5*scalar
    (0)    0.003 0.006 0.0135 0.0255 0.0375

By default, vtkCellType types=12 is used, which converts each element to 343=N*N*N cells. In this example, 3920 elements and 7-th degree polynomials means a total number of $1344560$ cells, which is defined as the points id stored in connectivity whos coordinates are stored in vertices. All scalar fields and vectors fields are then represented on those numOfPoints=2007040 points.

On HPC, one might want to use export ADIOS2_INSTALL_DIR=<path-to-adios2> .. ENABLE_ADIOS FIXME: move to installation

ADIOS2 format is more flexible in the sense that it allows user to name the variables freely as an additional field. Here is an example of dumping a file contains velocity vector and a scalar field of q_criterion.

Global variables

std::unique_ptr<iofld> iofld;
deviceMemory<dfloat> o_qcriterion;

UDF_Setup

// UDF_Setup
o_qcriterion.resize(mesh->Nlocal); // q croterion

iofld = iofldFactory::create("adios");
iofld->open(mesh, iofld::mode::write, "test");
iofld->writeAttribute("uniform", "true");
iofld->writeAttribute("polynomialOrder", std::to_string(15));

{ // velocity
  std::vector<occa::memory> o_iofldU;
  o_iofldU.push_back(nrs->o_U.slice(0 * nrs->fieldOffset, nrs->mesh->Nlocal));
  o_iofldU.push_back(nrs->o_U.slice(1 * nrs->fieldOffset, nrs->mesh->Nlocal));
  o_iofldU.push_back(nrs->o_U.slice(2 * nrs->fieldOffset, nrs->mesh->Nlocal));
  iofld->addVariable("velocity", o_iofldU);
}

{ // scalars
  iofld->addVariable("q_criterion", std::vector<deviceMemory<dfloat>>{o_qcriterion});
}

UDF_ExecuteStep

// UDF_ExecuteStep
if (nrs->checkpointStep) {
  nrs->Qcriterion(nrs->o_U, o_qcriterion);
  iofld->addVariable("time", time);
  iofld->addVariable("timeStep", tstep);
  iofld->process();
}

if (nrs->lastStep) iofld->close();