Refer to: http://www.sam.math.ethz.ch/alsvid-uq

Version history:

ALSVID-UQ-3.0:
- non-isotropic materials for wave equation
- 3d plotting routines
- plotting scripts reorganized
- data loading optimized for 'cut_slice' option: slicing in 3d is now _very_ fast (except for slices at 'z' axis)
- cutting 1d slice out of 3d simulation data
- topography:kl,kl2d removed (legacy)
- material-kl.h/cpp added to organize KL-generation functions
- imbalance estimation
- compiler option '-mcmodel=medium' removed from cluster configuration files
- run_samples() routine restructured (some routines moved to variance.cpp and kurtosis.cpp files)
- model:mhd_cloudshock3d and model:mhd_magnetic-blast added
- downsampling and variance routines moved to separate source files
- variance_fix routines moved out from saver.cpp
- support for different compilers: pgi, intel, cray, gnu
- CURRENT_LEVEL renamed to MESH_LEVEL
- level_init() renamed to mesh_level_init()
- SAMPLES_LEVEL introduced; samples_level_init() introduced
- parameter 'l' removed from many routines
- volumes.cpp and numtypes.cpp created for refactoring of previously used header files only
- operator overloading implemented between volumes with and without ghost cells
- all parallel communication related to statistics is now done using volumes with no ghost cells, increasing the scaling and performance
- subscript operator [][][] for volumes improved
- deterministic runs optimized: no extra allocation of volumes, saving data on-the-fly
- NMT is now computed dynamically
- added error: "MULTIX can NOT be larger than number of cells on the coarsest mesh in x-direction"
- balancer information added to infolines
- multi_combine_mlmc() generalized for cases with very small coarsest mesh resolution and intensive DDM on finest mesh, i.e. when MULTI[X-Z][L] > N[X-Z][0]
- 'calc_all_vars()' generalized for all equations using 'calc_xtra()'
- variance computation fix in equation:sw
- option 'detailed' implemented for 'plotStrongScaling()' routine
- movie generation into multi-page PDF implemented (for use with LaTeX package 'animate')
- back-end and helper Python scripts were moved to 'scripts/'
- errorplot.py script also estimates constant and convergence rate in error convergence plots
- bug in quadratures for 'initial_data()' fixed
- bugs in material:logn-fft(-per) (including material-fft.h/cpp files) fixed
- bug 'HAVE_DIR(Z) > 1' in 'multi_bc()' routine was fixed
- 'comm_level' communicators removed, since they cause MPI error "Too many communicators"; instead, 'comm_level_ranks' is used directly with MPI_COMM_WORLD
- D[X-Z][0-L] variable added to store mesh widths
- balancer:static can now handle a number of samples that is _not_ a multiple of the number of workers (samplers)
- histogram plotting implemented
- 'predict_imbalance()' removed
- SUB_STEP_HOOK removed, Powell source term computation moved to corresponding solver_*.cpp files
- 'SAMPLES' section added to infolines
- 'samples.py' stores used number of samples for each sample level
- for multi:mpi, 'multix' was the fastest index and 'multiz' was the slowest; this caused problems for FFTW-MPI; now 'multix' is the slowest index, 'multiz' is the fastest
- 'stats:mean-var-functional' implemented
- 'coefficients_init_ghost_cells()' added to set bc of XtraVars at time t=0, just before starting FVM simulation.
- LINKER_LIBS variable added
- USE_ESTIMATED_LOADS renamed to ESTIMATE_LOADS
- HIST_DATA related code removed
- LOADS_FILE parameter added, plotting routines now use this one
- LOADS_FILE and SAMPLES_FILE are appended with run name
- if 'USE_ESTIMATED_LOADS=1', then all computed speeds are output to a Python file 'loads.dat'
- all histogram-related code removed
- by default, 1d and 2d plots are now _additionally_ saved into the EPS file for potential future need in some journals
- 'model_set_chaos_size()' and 'model_set_flux_size()' and 'topography_set_chaos_size()' added everywhere via 'set_flux_size()' routine for each equation
- 'bc_fix()' removed! 'coefficients_init_ghost_cells()' now does the same job
- COMPUTE_VARIANCE option removed completely; the equivalent is the constant STATS_COUNT == 2, which is enforced for legacy simulations if COMPUTE_VARIANCE was set to '1'
- WRITE_LEVEL_DATAFILES now also supported with multi:mpi
- samples are computed starting from the finest mesh level and proceeding to the coarser ones, to fix the measurements of MPI idling time
- all MPI_Group_create() routines removed (due to poor scaling) and replaced with MPI_Comm_split() routines
- bugfix in parallel gathering of LOADS
- LOADS are now computed on ALL samplers of MULTI_LEVEL == L, _adaptively_ (parallelization configuration depends on mesh resolution)
- balancer_report() is now compatible with adaptive parallelization configuration
- ENABLE_SLICED_PARALLELIZATION implemented to support parallel FFT for balancer:adaptive
- if parallelization configuration is such that there are more samplers than samples, a warning is issued
- routines run_samples, append_samples, etc. rewritten
- routines rng_reinit(), chaos_reinit() introduced
- all -O3 were changed to -O2
- sample numbers at each level are reported also in multi:mpi
- material:logn-fft-per-mpi implemented and tested
- multi_level_init() is also called before data output, such that samplers_root and domain_root are set correctly
- ENTROPIC_RECONSTRUCTION is not replaced by PRE_RECONSTRUCTION_HOOK and POST_RECONSTRUCTION_HOOK and is defined _only_ in the required solver:* alternatives
- implemented stats_copy() in all stats
- condition 2s <= d+1 is again strictly enforced
- timers.h/cpp implemented, old timing routines deleted
- a warning is issued if FFT is not parallelized on all cores
- CHECK_VOLUME_INDICES implemented: checks that the index bounds are respected, but results in poor vectorization. Use for debug only. Default is set to 0.
- balancer:adaptive modified for DISTRIBUTE_IN_DIFFERENCES=1
- CHAOS_TYPES implemented
- default parallelization configuration is written to runtime.py; later it is used in plot_imbalance() and displayed in infolines
- in the generation of infolines, if MULTI_CORES > 0, other MULTI* variables are ignored and hence NOT displayed
- USE_PRIME_SEEDS now has default 0
- in errorplot.py, "MPI efficiency" renamed to "Parallel efficiency"
- combine.py for MULTI_CORES > 0 implemented
- combine.py for histograms implemented
- CHAOS_TYPE added to all *_set_chaos_size() routines
- CHAOS_TYPE added to all model_init(), random flux, random material and random topography routines
- in local configuration files for CSCS machines, '--mem' changed to '--mem-per-cpu'
- operator overloading for volume +=/-=/*=//= T implemented (*= needed for covariance computations)
- "vertical" tex table writer implemented
- multi_level_init() split into multi_level_init_setup(int mesh_level) and multi_level_init()
- material:logn-fft-per-int implemented
- two-layer FFT long coefficient for wave equation added: material:logn-fft-per-int
- balancer_write_datafiles() added: writes balancer information to disk, BEFORE starting computations (can be used for analysis of failed runs)
- stats:mean-covar added
- make.py option '-s, --simulate' added: simulates 'make' (no actual compilation or running)
- plotting routines have 'engine' argument, specifying the desired compute engine, which can be generated using the respective 'local_visualization' file
- balancer:adaptive is PARALLEL
- plot_imbalance() implemented: uses data from the file 'balancer.dat'
- adaptive load computation implemented, where loads on different meshes are computed with different parallelization configurations
- ZERO_FLUX and WAVE_REFLECT boundary conditions added
- support for heterogeneous boundary conditions on each boundary added
- parallelization setup now depends also on the MESH_LEVEL; this way the specified MIN_CELLS_PER_CORE is enforced
- WRITE_LEVEL_DATAFILES [default: 0] switch added. If set to 1, separate output files for each mesh level and type are generated
- SMOOTH_FLUX option added
- READ_RUN_NUMBER_FROM_FILE introduced; (RUN_NUMBER == -1) no longer works
- VISCOUS option added for equation:burgers, adjusted by VISCOCITY variable
- balancer:adaptive based on generalization of "greedy" algorithm is the only adaptive balancer available
- USE_OPTIMAL_NUMBER_OF_SAMPLES added [default: 0]: removes the log-term in error convergence estimates
- stats:mean-var-unstable added
- UPSCALE variable removed; the TYPE variable now does exactly the same job
- constraint 2s <= d+1 is no longer enforced, just a warning for performance is displayed
- consts USE_ESTIMATED_LOADS and SORT_SAMPLES added for balancer
- ML = -1 replaced by READ_SAMPLES_FROM_FILE
- DOWNSAMPLE removed!!!
- STATIC_LOAD_BALANCER_TYPE removed
- LEVEL_BEGIN and LEVEL_END removed
- balancer alternative added:
  - balancer:static - old static load balancer
  - balancer:adaptive - balancer based on COMBINED knowledge about a-priori WORK ESTIMATES (like in static) and COMPUTATION of TIME STEP SIZE (based on CFL)
- COMPILER_DIRECTIVES variable added
- '-lfftw3' compiler directive removed from all configs
- DOWNSAMPLING option removed
- SNAPSHOT global variable introduced
- TYPE global variable introduced
- WEIGHTS array implemented
- 'stats' alternative was introduced
- local configuration files for Rosa and Todi, for ICC, PGI, GNU, CCE compilers added
- local visualization configuration files for Todi and Rosa using Eiger added
- NUMERICAL_STABILIZATION removed; it does not do anything anyway...
- support for COMPUTE_IN_DIFFERENCES=1 added
- AlltoAll communication is used at the end of logn-fft-mpi field generation
- chaos.h/cpp rewritten to support variable amount of random numbers per sample for each sample_level
- MULTI_CORES variable introduced: if >0, then MULTI* are ignored and parallelization configuration is computed on the fly.
- '-w' option now accepts float values: '-w HH.MM' means HH hours and MM minutes (MM is bounded by 59)
- USE_PRIME_SEEDS option implemented [default: 1]
- ENSURE_PARALLEL_REPRODUCIBILITY [default: 1] implemented
- for ENSURE_PARALLEL_REPRODUCIBILITY = 0, chaos now has GLOBAL (computed on all domain cores) and LOCAL (computed on specific subdomain cores) random variables
- Szudzik's pairing is used for computing seed based on multiple parameters
- 'level' and 'type' args added to all plotting routines
- CLIP, HIGHLIGHT and ENGINE arguments added to all 3d plotting routines

ALSVID-UQ-2.0:
- for deterministic runs (STATS=0), mean_U is not initialized; saver is used directly on U and Xtra
- With ML=-1 the program reads the ML.txt file where the M[l] for every level is given. This allows several runs with different numbers of samples without recompiling.
- Increase of the number of primes used for the initialization of the random number generator.
- new equation:wave, with all corresponding fluxes, materials, models and solvers
- material:const, logn-kl, logu-kl, logn-fft (highly optimized logn-field generation using FFT)
- random fluxes were introduced for all equations; some equations have only default deterministic flux, called by: 'flux:det'
- filtering out default alternatives ("none"-type) when generating infolines in plots
- stats:mean-var, threshold-sw - for evaluation of various statistical events, rather than just estimates of mean and variance
- \sigma_K estimation is provided in plots to make sure parameter K is large enough
- implementation of Buckley-Leverett (oil recovery) equations, i.e. equation:bl, and random perturbed flux for them
- bugfix for serial job submission on clusters (with job schedulers)
- "gap problem" in the plots fixed
- space:o2central - provides 2nd order accurate reconstruction without flux limiters (used for smooth problems)
- calc_time_step() moved to equation.cpp; calc_max_speed() is now sufficient for equation_*.cpp and flux_*.cpp files
- HISTOGRAMS added, parallel routines also written
- most of the routines from mlmc_main.cpp were transferred to mlmc.cpp and the appropriate header file mlmc.h with comments
- saver.h split properly into saver.h and saver.cpp, comments added
- alternative load balancing algorithm LOAD_BALANCING_TYPE=2. If not specified otherwise, then, automatically:
  - LOAD_BALANCING_TYPE=2 is chosen for DOWNSAMPLE=0
  - LOAD_BALANCING_TYPE=1 is chosen for DOWNSAMPLE=1
- DOWNSAMPLE=1 and LOAD_BALANCING_TYPE=2 are incompatible!
  (SAVED samples need to be TRANSFERRED between cores, which is INEFFICIENT and NOT IMPLEMENTED)
- DOWNSAMPLE=0 and LOAD_BALANCING_TYPE=1 is fine, but results in LOSS OF EFFICIENCY when finest levels require most of the CPU time
- SKIP_IO_WTIME renamed to RAW_SIMULATION_WTIME
- NXl_FULL, NYl_FULL, NZl_FULL introduced
- FFT_MATERIAL=1 generalized to BUFFERED_MATERIAL=1
- '%' operator on typedef 'int' was found to return negative values; this operator was replaced with 'mod', which is defined in 'numtypes.h'
- generation of the boundary conditions file 'bc.cpp' moved from 'make.py' to 'make_bc.py'
- generation of the boundary conditions for shallow water equations implemented through additional function 'bc_fix()' in 'equation.h'
- routine names 'get_b()' and 'get_c()' replaced by more consistent names 'b' and 'c'
- runtimes are printed in HH:MM:SS format to std::cout
- eno.* moved to space_o2eno-util.*
- ec_util.* moved to solver_ec-util.*
- hll_util.h moved to solver_hll-util.h; hll_util-*.cpp moved to solver_hll-util-*.cpp
- solver-wave:fdm-central, space:o1fdm, space:o3norec, time:o3 - new alternatives for verifying dispersive effects in wave equations with highly oscillatory coefficients
- '-a' directive for the make.py script fixed
- exception with short explanation is displayed when incompatible options are requested
- flux, solver and model are now equation-dependent
- equation:burgers extended to multi-d
- model:burgers_square-circle added
- numtypes.h split into numtypes.h and volumes.h
- structures_*.h simplified: now they inherit operator overloading from Base structures, defined in structures-base.h
- default values for '-walltime', '-memory' and '-ptile' can be specified in 'local.py'
- script option '-v' added: displays full compiler output, parameters and job submission
- EXP_LOG renamed to NUMERICAL_STABILIZATION
- additional solver:fdm-central and space:o1fdm, space:o2central, space:o3norec for wave equation
- new variables INITIAL_DATA_OFFSET,
  INITIAL_DATA_SCALING
- topography:wl,wl2d removed
- "none" alternatives removed
- default 'material' changed to 'const'
- default 'topography' changed to 'flat'
- higher order quadrature added
- model_init() modified to use quadrature of the required order
- init_cell() replaced by initial_data()
- additional parameter SMOOTH_SOLUTION
- additional material:riemann
- additional flux:viscosity and flux:permeability
- model-wave:shock renamed to model-wave:riemann
- equation:oil renamed to bl
- equation:wave-fe removed
- material:kl(2d) renamed to logn-kl(2d)
- additional material:logu-kl
- 2s > d+1 allowed only in multi:single mode
- primitive black and white plotting implemented
- insanely large number of other small bugfixes
- CPU clock timers removed for multi:mpi mode
- MPI efficiency is now also measured separately for DDM, MC, MLMC and load balancing
- RAW_SIMULATION_WTIME parameter removed
- 'mpi_scaling' renamed to 'mpi_efficiency'
- 'nodes' renamed to 'cores'
- bugfix in number of open files for 'combine.py' (now a warning is displayed with the maximal allowed number of MULTI[XYZ], depending on the file system)

ALSVID-UQ-1.6:
- entropy stable 2nd order reconstruction 'minmod' was renamed to 'ecminmod'
- entropy stable 2nd order reconstruction 'tecno' added
- entropy stable solvers 'es-rusanov' and 'es-roe' were optimized
- local configuration file for profiling with CrayPat on Cray XE/XT architectures: palu-craypat-local.py
- isotropically tensorized bottom topography 'hr2d' added
- hierarchical hat basis representations in 1D (hr) and 2D (hr2d) have exponentially decaying coefficients

ALSVID-UQ-1.5:
- bugfixes in shallow water routines
- executables for cluster job submission are stored under DATA_PATH

ALSVID-UQ-1.4:
- additional bottom topographies: hierarchical hat basis representations in 1D (hr) and 2D (hr2d)
- implementation of anti-aliasing for 'hr' and 'hr2d' bottom topographies via default option ANTI_ALIASING=1
- WRITE_DATAFILES option
- custom DATA_PATH option
- more efficient parallel random number generation (avoids scattering, broadcasting, etc.)
- default load balancing type is set to LOAD_BALANCING_TYPE=1
- alternative random number generators: well512a, well19937a, mt19937, lcg
- SKIP_IO_WTIME option
- optimization for computation of bottom topography in wavelet series representation (for shallow water equations)
- optimization for simulations with time-independent bottom topography
- bugfix for the default choice of the bottom topography
- lots of other small bugfixes

ALSVID-UQ-1.3:
- cosmetic fixes in error and scaling plotting scripts
- alternative load balancing algorithm LOAD_BALANCING_TYPE=2. If not specified otherwise, then, automatically:
  - LOAD_BALANCING_TYPE=2 is chosen for DOWNSAMPLE=0
  - LOAD_BALANCING_TYPE=1 is chosen for DOWNSAMPLE=1
- additional bottom topographies: Haar wavelet representations in 1D (wl) and 2D (wl2d)
- infolines in plots include information on the cluster on which the simulation was run
- strong linear scaling verified up to 1023 cores with 97% efficiency

ALSVID-UQ-1.2:
- extensions and optimization of conservation laws: Wave Equation
- ESF: height positivity preserving Roe-type energy stable (ES) solver for Shallow Water equations
- 3D visualization tools via MayaVi
- additional bottom topographies: Karhunen-Loeve expansions in 1D (kl) and 2D (kl2d)
- modularization of topographies
- port to Cray XE6 (Palu) on CSCS
- Shallow Water models: steadystate, dambreak, lake-at-rest (+ 2D versions steadystate2d, dambreak2d)

ALSVID-UQ-1.1:
- extensions and optimization of conservation laws: Shallow Water, Burgers, Euler and Linear Advection
- normal distribution
- slicing of 2D plots to investigate smoothness of 2D solutions
- Shallow Water Equations; models: dambreak, tsunami; topographies: flat, bump, bump2d, shore (+ their random versions); solvers: HLL, EC, ES.
- sample offsetting (SAMPLES_OFFSET) for fault tolerance analysis

ALSVID-UQ-1.0:
- random initial data with any number of uniform stochastic drivers
- parallelisation over MLMC levels, MC samples and domain decomposition
- possibility to observe time evolution in rendered movies
- lots of stochastic 1D and 2D models
- WELL512a random number generator
- "downsampling" version of MLMC also available