PRACE Preparatory Access – 12th cut-off evaluation in March 2013

Find below the results of the 12th cut-off evaluation of March 2013 for the PRACE Preparatory Access Call.

Type A – Code scalability testing

Project name: Dimerization of the beta-2-Adrenergic Receptor Protein in Different Membrane Environments Studied Through Multiscale Molecular Modeling

Project leader: Alexander Lyubartsev, Stockholm University, SE
Collaborators: Joakim Jambeck, Stockholm University, SE
Research field: Chemistry and Materials
Resource awarded: 50.000 core hours on HERMIT@HLRS and 100.000 core hours on JUQUEEN, GAUSS@Jülich, Germany



Abstract: In the human genome, the largest membrane protein family is the G protein-coupled receptors (GPCRs), with roughly one thousand members. These proteins govern cell signaling and are therefore major targets for therapeutic agents. A recent X-ray crystallography study found specific binding of cholesterol to Beta2-AR, a well-characterized GPCR. As it has been speculated that Beta2-AR forms oligomers in biological membranes, the closely bound cholesterol is believed to promote these oligomerization processes by increasing the stability of the protein in terms of kinetics and energetics. The actual mechanism for this oligomerization is under debate and the role of cholesterol is not fully understood. By employing large-scale computer simulations, namely molecular dynamics simulations, on different length and time scales, we aim to describe these processes at atomistic resolution. By employing enhanced sampling methods, the thermodynamics of the dimerization of Beta2-AR can be characterized quantitatively. These calculations will be supplemented by coarse-grained simulations, which will allow larger systems to be studied over longer time scales. As biological membranes have a vast number of components, these calculations have to be performed in bilayers of various compositions in order to be more biologically relevant. Finally, the impact of cholesterol on the dimerization of Beta2-AR will be thoroughly investigated with recently developed accurate potentials. The simulations will include system sizes ranging from hundreds of thousands of atoms to millions of particles and cover processes from the nanosecond to the sub-millisecond time scale. Due to the vast computational resources required for this type of simulation, the HPC allocations provided by PRACE are essential.

Project name: 3-Global instability of three-dimensional confined flows

Project leader: Alexander Gelfgat, Tel-Aviv University, IL
Collaborators: Helena Vitoshkin, Tel-Aviv University, IL
Research field: Engineering and Energy
Resource awarded: 50.000 core-hours on CURIE FN, GENCI@CEA, France and 50.000 core-hours on HERMIT, GAUSS@HLRS, Germany



Abstract: We have two independent codes that are capable of performing three-dimensional stability analysis of fully three-dimensional base flows, i.e., of solving the so-called 3-Global stability problem. One code is based on the well-known parallel direct linear system solver MUMPS. Its scalability properties are well studied; however, for 3D problems we are restricted by the available memory. We hope that PRACE facilities will help us to overcome this restriction, at least partially. This will allow us to solve several model problems (stability of flow in a lid-driven cube, stability of convection in a laterally heated cube), which will also provide necessary benchmark data for the rather wide scientific community interested in these problems. Within the preparatory project we want to examine the memory restrictions, so that our production application can be better planned. Our second code is not memory demanding; however, its scalability on a large number of CPUs is yet to be studied. The method uses a pressure-velocity coupled formulation and a semi-analytical inverse of the Stokes operator. The method and the parallelization algorithm are described in arXiv:1107.2461v1. The purpose of the preparatory project is to study scalability when the number of CPUs is larger than 100.

Project name: QM/MM metadynamics studies about the TEM-1 inhibition by avibactam

Project leader: Jacopo Sgrignani, ICRM-CNR, IT
Collaborators: Giovanni Grazioso, University of Milano, IT
Research field: Medicine and Life Sciences
Resource awarded: 50.000 core-hours on MareNostrum@BSC, Spain and 50.000 core-hours on HERMIT, GAUSS@HLRS, Germany



Abstract: Beta-lactamases (BLs) are enzymes present in some bacteria that inactivate beta-lactam antibiotics through hydrolytic reactions; to date they are the major cause of antibiotic resistance. Recently avibactam, a non-beta-lactam BL inhibitor, reached phase III clinical trials, opening new perspectives for the development of new antibiotic agents (Zhanel et al. Drugs 2013, in press; Shlaes DM., Ann N Y Acad Sci. 2013 Jan;1277(1):105-14). In our project we propose to use QM/MM metadynamics simulations to investigate the free energy surface associated with the formation of the covalent complex between avibactam and a well-characterized BL, TEM-1 (Ehmann et al. Proc Natl Acad Sci U S A. 2012 Jul 17;109(29):11663-8). Metadynamics is a computational technique that enhances the sampling of MD simulations, thereby accelerating rare events such as chemical reactions. After a metadynamics simulation, the free energy surface can be derived as the negative of the bias potential. Considering the rapid appearance of new BLs able to inactivate our “last resort” antibiotics, the carbapenems, the data and the computational protocol developed in our project will be useful for the design of new inhibitors.
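
The statement that the free energy surface is the negative of the bias potential can be illustrated with a small sketch. It assumes plain (non-tempered) metadynamics on a single collective variable, with the bias built as a sum of Gaussian hills; the hill centres, height, width and function names below are hypothetical and are not taken from the project.

```python
import math


def bias_potential(s, centers, height, width):
    """Sum of Gaussian hills deposited at the visited collective-variable values."""
    return sum(
        height * math.exp(-((s - c) ** 2) / (2.0 * width ** 2)) for c in centers
    )


def free_energy_estimate(grid, centers, height, width):
    """Estimate F(s) as -V_bias(s), shifted so that the lowest value is zero."""
    f = [-bias_potential(s, centers, height, width) for s in grid]
    f_min = min(f)
    return [value - f_min for value in f]


# Hypothetical hill centres: the region around s = 1.0 was visited most often.
centers = [0.9, 1.0, 1.0, 1.1, 2.0]
grid = [i * 0.1 for i in range(31)]  # s from 0.0 to 3.0

fes = free_energy_estimate(grid, centers, height=1.0, width=0.1)
deepest = grid[fes.index(min(fes))]
print(f"estimated free-energy minimum near s = {deepest:.1f}")
```

The region where the most bias had to be deposited shows up as the deepest basin of the estimate, which is defined only up to an additive constant.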

Project name: Simulating Cosmic Reionization

Project leader: Andreas Pawlik, Max Planck Society, GE
Research field: Astrophysics
Collaborators: Joop Schaye, Milan Raicevic, Alireza Rahmati, Leiden University, NL
Resource awarded: 50.000 core-hours on CURIE TN, GENCI@CEA, France



Abstract: The first stars and galaxies formed just a few hundred million years after the Big Bang. Their ionizing radiation is thought to have transformed the universe during an epoch called reionization, changing the initially cold and neutral cosmic gas into the hot and ionized intergalactic medium that we observe today. Reionization is a key epoch in the history of our universe and crucially shaped the formation and evolution of galaxies, including our own galaxy, the Milky Way. We kindly request a Type A Preparatory Access supercomputer time allocation to support the preparation of cosmological radiation-hydrodynamical simulations of galaxy formation and reionization that we will propose to perform in the framework of an upcoming Tier-0 PRACE project. The simulations will make use of TRAPHIC (Pawlik & Schaye 2008, MNRAS, 389, 651), an innovative and well-tested MPI parallel ionizing radiative transfer (RT) method. The unique design of TRAPHIC is inspired by well-tested state-of-the-art RT methods but improves substantially on existing approaches, mainly because it is spatially adaptive and has a computational cost independent of the number of radiation sources. For comparison, most current RT simulations of reionization are carried out on non-adaptive uniform computational meshes, averaging out much of the dynamic range that characterizes structure formation in the universe. Moreover, nearly all current simulations have a cost that increases with the number of sources, limiting investigations to relatively small problems. TRAPHIC is radiation-hydrodynamically coupled to GADGET (Springel 2005, MNRAS 364, 1105), the most widely employed massively MPI parallel galaxy formation code. This enables us to go beyond most current works, which perform the RT in post-processing on static density fields, and to investigate the radiative feedback from reionization on the assembly and evolution of galaxies. We will use the requested Preparatory Access allocation to investigate the strong and weak scaling of TRAPHIC/GADGET. The results will directly inform our application for computation time in an upcoming Tier-0 PRACE simulation call to carry out very large, high-resolution radiation-hydrodynamical simulations of galaxy formation and reionization.
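
For readers unfamiliar with the two scaling tests mentioned above, the following minimal sketch shows how strong and weak scaling efficiency are commonly computed from measured run times. It uses the usual textbook definitions; the function names and all timings are hypothetical and are not measurements of TRAPHIC or GADGET.

```python
def strong_scaling_efficiency(t_ref, n_ref, t_n, n):
    """Fixed total problem size: the ideal run time falls as n_ref / n."""
    return (t_ref * n_ref) / (t_n * n)


def weak_scaling_efficiency(t_ref, t_n):
    """Fixed problem size per core: the ideal run time stays constant."""
    return t_ref / t_n


# Hypothetical timings in seconds, for illustration only.
strong = strong_scaling_efficiency(t_ref=1000.0, n_ref=64, t_n=150.0, n=512)
weak = weak_scaling_efficiency(t_ref=100.0, t_n=125.0)
print(f"strong scaling efficiency: {strong:.2f}")
print(f"weak scaling efficiency:   {weak:.2f}")
```

An efficiency of 1.0 corresponds to ideal scaling in both cases; values below 1.0 indicate communication or load-balancing overhead.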


Project leader: Natalio Mingo, CEA, FR
Research field: Chemistry and Materials
Collaborators: Jesus Carrete, CEA, FR
Resource awarded: 50.000 core-hours on CURIE FN and 50.000 core-hours on CURIE TN, GENCI@CEA, France



Abstract: The thermal conductivity of materials plays a fundamental role in many applications. Heat dissipation in microelectronics, for example, has been identified as the major bottleneck towards further miniaturization of electronic components. Finding ultra-low thermal conductivity semiconductor compounds is also a primary goal of current research on thermoelectric energy conversion, which is one of the most promising avenues for thermal energy scavenging. In the search for better compounds with low thermal conductivities, trial and error experimentation is costly, time consuming, and likely to miss out many potentially good material compositions. This is where high-throughput ab initio calculations can make a big difference. The same idea, when generalized to all properties of materials, is behind the creation of databases such as, managed by Prof. S. Curtarolo and coworkers, with which the present project is affiliated. However, when compared to other properties of materials that have been treated with high-throughput approaches, the computational cost of rigorously computing the thermal conductivity of even one compound is huge. The main limiting factor in the process is the calculation of the set of second- and third-order interatomic force constants needed as input, which can total 150,000 CPU hours even for a binary compound. We have developed and tested a software package (HiThruPack) able to run such calculations of the lattice thermal conductivity tensor of crystalline material compounds and alloys, end-to-end, in an automatic fashion and without any adjustable parameters. This package combines automatic usage of VASP as a calculation backend, custom code for harnessing the symmetries of the compound ensuring a minimum number of VASP invocations, a cutting-edge solver of the Boltzmann transport equation for phonons (wlsBTE) and a significant body of automation code to streamline the whole process. 
This makes it possible to perform systematic searches for new materials with specific thermal transport properties, rapidly exploring the chemical phase space to identify best material candidates. This in turn shall save precious experimental resources and accelerate the pace of novel materials discovery. The purpose of this project is to serve as a preparatory step towards calculating the thermal conductivity of all hitherto synthesized binary crystalline compounds with an entry in the ICSD database using this package, taking ab initio thermal conductivity calculations to a scale never attempted before and eventually delivering a major contribution to the field. Our goal in this preparatory phase is to successfully deploy it in this new environment and test its scalability so as to fulfill the requirements of a regular access proposal later on.

Project name: OSC

Project leader: Marco Stenta, Syngenta, CH
Collaborators: Torsten Luksch, Daniel Emery, Olivier Jacob, Syngenta, CH
Research field: Medicine and Life Sciences
Resource awarded: 100.000 core hours on FERMI@CINECA, Italy, 50.000 GPU-hours on CURIE Hybrid and 50.000 core-hours on CURIE TN, GENCI@CEA, France



Abstract: In many organisms, sterols mediate essential functions as integral parts of cellular membranes, as key messengers in many regulatory and signaling processes, or both. The synthetic pathways leading, in different organisms, to different sterol molecules have been subjected to extensive investigations aimed at finding sterol biosynthesis inhibitors (SBIs) to be used as anticholesteremic, antifungal, and anti-trypanosomatid drugs. The potential of SBIs has been recognized by both the pharmaceutical and agrochemical industries, as confirmed by the number of marketed compounds whose mode of action (MoA) involves sterol biosynthesis. In particular, more than 40 agrochemical SBI fungicides have successfully reached the market, thus constituting, for over two decades, the most important group of specific fungicides. The biochemical basis of this success is the fact that fungi have specific sterols that differ from those present in plants and animals, giving the chance to develop selective inhibitors. The list of sterol biosynthesis enzymes targeted by marketed SBIs includes: squalene epoxidase (erg1), Delta8->Delta7 isomerase (erg2), C14 demethylase (erg11/cyp51), delta14-reductase (erg24), and 3-keto reductase (erg27). Substantial efforts have been made by both pharmaceutical and agrochemical companies to expand this list and validate new SBI targets. Roche Pharmaceuticals, in particular, identified the enzyme oxidosqualene cyclase (erg7, OSC) as a possible SBI target for new anticholesteremic drugs that could complement the widely used statins. Although potent OSC inhibitors have been developed using suicide inhibitors, mutagenesis studies, and homology modeling, no active ingredient has reached the market so far, possibly owing to the toxicity observed in mammals for this particular class of compounds.
Despite these toxicity issues, OSC is nevertheless an extremely promising target for the development of antifungal inhibitors, owing to metabolic differences between mammals, plants and pathogenic fungi. OSC catalyzes the regio-/stereo-controlled cyclization of 2,3-oxidosqualene to the product lanosterol. The X-ray structures of OSC in complex with the potent Roche inhibitor Ro 48-8071 and with lanosterol provided deep insight into the binding mode, and paved the way for mechanistic investigations aimed at understanding the reaction mechanism and the catalytic role of active site residues. The objective of this research program is to clarify some unknown aspects of OSC functioning. In particular, substrate uptake and product egress from the active site constitute an interesting research area, since mutagenesis investigations highlighted a gating mechanism as a contributing cause of substrate specificity in OSC. Given the particular orientation of the active site channel, the membrane is thought to play a role in the uptake of both natural substrate and inhibitor molecules. Taken together, this new evidence could drive the rational design of stronger inhibitors with improved binding kinetics. In addition, the specific investigation of the OSC homologues from pathogenic fungi could help improve selectivity and thus address toxicity as early as potency. Extensive molecular dynamics simulations will be carried out to study the dynamics of proteins in their native membrane environments and to gain deeper insight into the kinetic and thermodynamic aspects of both mammalian and fungal OSC inhibition.

Project name: Esteban

Project leader: Thierry MONTAGU, CEA, FR
Research field: Medicine and Life Sciences
Collaborators: Eric Barat, Thomas Dautremer, CEA, FR
Resource awarded: 50.000 core-hours on CURIE TN, GENCI@CEA, France



Abstract: Molecular imaging plays an increasingly significant role in medicine, particularly positron emission tomography (PET). The major constraint of PET is the injection of a radioactive tracer into the patient. In recent years, the increase in population exposure to ionizing radiation, caused by the growing number of radiological examinations, has become a major concern of national and international radiation protection bodies. Reducing the exposure from these examinations is becoming a priority for manufacturers of medical imaging devices. In PET, this amounts to a reduction of the radioactive dose injected into the patient. In order to offset the degradation of diagnostic image quality induced by the dose reduction, it is necessary to increase the sensitivity of PET systems. This can be achieved by improving the processing in PET image reconstruction software. To this end, a novel statistical method for PET image reconstruction was developed by the project partners. This method is based on a Bayesian nonparametric approach that makes it possible to obtain images, along with their associated uncertainty, for very low injected doses without affecting the quality of image analysis. This new reconstruction technique was initially evaluated on simulated data. The software was then interfaced with two clinical PET scanners used by one of the project’s partners, to assess reconstruction performance on real data. The Esteban project aims to provide a proof of concept for this Bayesian nonparametric reconstruction technique by evaluating its clinical impact in low-dose situations for a large number of studies and by assessing the usefulness of quantitative uncertainty estimation for diagnosis.

Type B – Code development and optimization by the applicant (without PRACE support)

Project name: Scaling test for the PINOCCHIO code

Project leader: Pierluigi Monaco, Università di Trieste, IT
Research field: Astrophysics
Collaborators: Emiliano Sefusatti, Abdus Salam International Center for Theoretical Physics, IT; Tom Theuns, University of Durham, UK
Resource awarded: 250.000 core hours on FERMI@CINECA, Italy



Abstract: We propose to port the cosmological code PINpointing Orbit-Crossing Collapsed HIerarchical Objects (PINOCCHIO, Monaco et al. 2002, MNRAS 331, 587) to the FERMI BG/Q machine. We aim to assess the feasibility of using this code for computing covariance matrices for statistical measures (power spectrum, two-point correlation function and higher moments) of the large-scale distribution of simulated galaxy catalogs. Given a cosmological model of the LambdaCDM class, the PINOCCHIO code generates catalogs of dark matter halos in a cosmological volume, similarly to N-body simulations but in a time shorter by orders of magnitude (one run like the Millennium simulation, which cost 350,000 cpu hours, takes 35 min on 360 cores of PLX@CINECA). It is based on Lagrangian Perturbation Theory (LPT) to predict the evolution of particles under gravity and uses Fast Fourier Transforms (FFT). The produced catalogs provide positions, velocities, masses and merging histories for halos, and their statistics (mass function, power spectrum, auto-correlation function) reproduce within 5% those of a simulation run on the same initial conditions. This tool is ideal for studies of large-scale structure, especially in problems like estimates of covariance matrices, where thousands of realizations of very large volumes are needed. We want to optimize this code on a Blue Gene/Q machine like FERMI@CINECA so that it is dominated by FFTs and scales like N log N at least up to 16384 cores. Based on our scaling on PLX@CINECA and assuming a x5 scaling on FERMI, we expect a 5400^3 run on 16384 cores to require 10,000 cpu-h (~35 min). We then ask for 30,000 cpu-h.
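
The core-hour figures quoted in this abstract can be cross-checked with simple arithmetic. The sketch below only multiplies core counts by wall-clock times taken from the text; the helper name is hypothetical.

```python
def core_hours(cores, minutes):
    """Core-hours consumed by a run on `cores` cores lasting `minutes` minutes."""
    return cores * minutes / 60.0


# PINOCCHIO run on PLX@CINECA quoted in the abstract: 35 min on 360 cores.
plx_run = core_hours(360, 35)

# Cost of the N-body Millennium simulation quoted in the abstract.
millennium = 350_000
ratio = millennium / plx_run

# Expected FERMI run quoted in the abstract: 35 min on 16384 cores.
fermi_run = core_hours(16384, 35)

print(f"PLX run:   {plx_run:.0f} core-hours")
print(f"N-body / PINOCCHIO cost ratio: about {ratio:.0f}")
print(f"FERMI run: {fermi_run:.0f} core-hours")
```

The FERMI estimate comes out at roughly 9,600 core-hours, consistent with the rounded figure of 10,000 cpu-h given in the abstract.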

Project name: Aghora_hpc

Project leader: Vincent Couaillier, ONERA, FR
Research field: Mathematics and Computer Science
Collaborators: Martin Emeric, Renac Florent, de la Llave Plata Marta, Chapelier Jean-Baptiste, ONERA, FR
Resource awarded: 200.000 core-hours on CURIE TN, GENCI@CEA, France



Abstract: Over the years, the development of new and increasingly powerful CFD simulation tools has helped manufacturers in the aerospace industry gain a greater understanding of the operating performance of their products. This has allowed them to progress through the design life cycle in a more timely and cost-effective manner by supplementing or replacing experimental testing with CFD computations. The industrial demand for CFD predictions at an ever-increasing level of detail is the driving force for the development of highly accurate simulation techniques able to predict not only overall flow features, but also local values of the quantities of interest. This will allow engineers to expand the range of flow conditions to which CFD can be applied. Today, most industrial CFD codes are based on second-order finite volume methods, which appear not to be sufficiently accurate to reach these goals. With the aim of overcoming the limitations of second-order approaches, Onera has launched the research project Aghora. The main goal is to develop a new demonstrator able to integrate efficient high-order schemes based on Discontinuous Galerkin methods. The new solver will provide advanced HPM adaptation techniques (H for grid, P for accuracy of shape function, M for model) and will be able to run on parallel architectures. The project AGHORA (Algorithm Generation for High-Order Resolution in Aerodynamics), started in January 2012, involves four departments of Onera (DSNA, DAAP, DADS, DTIM) and is managed by the team NUMF (NUmerical Methods for Fluid dynamics) in the CFD & Aeroacoustics Department (DSNA). This prototype focuses on the solution of complex flow systems, including turbulent flows (RANS, LES, DNS), multi-phase flows and multi-species flows. The software demonstrator Aghora will serve as a prototype for the next generation of CFD simulation tools. The aim is to develop innovative tools able to handle:
- Turbulent flow simulation with accurate error estimation and related HP-adaptive techniques
- Multi-scale representation of complex flows with adaptive modeling and hierarchical shape functions
- Accurate geometry representation (high-order curved meshes and iso-geometric approach)
However, these methods require the solution of very large discrete systems. This leads to long execution times and high memory requirements. Designing efficient algorithms for modern multi-core architectures and developing highly scalable parallel strategies are therefore necessary to tackle such challenges. The current version of the code has already shown satisfactory weak scalability efficiency on 2048 Intel Westmere X5675 cores with a pure MPI strategy (non-blocking and synchronous communications). A hybrid MPI/OpenMP parallel strategy is under development. We would like to test the weak scalability efficiency at very large scale using the pure MPI version, as well as variants of the hybrid version. In addition, we would like to investigate the impact of several factors on performance (MPI libraries, distributions of processes and threads over the cores, MPI thread support levels, polynomial order of the method, etc.). We expect to identify possible bottlenecks arising from either non-optimized algorithms or MPI subroutines, in order to enhance the efficiency of the code.

Project name: Scaling MESONH to next gen IBM, Cray & GPU architecture for PETASCALE simulation

Project leader: ESCOBAR MUNOZ Juan, CNRS, FR
Research field: Earth Sciences and Environment
Resource awarded: 100.000 GPU-hours on CURIE Hybrid, GENCI@CEA, France; 50.000 core-hours on HERMIT, GAUSS@HLRS and 250.000 core-hours on JUQUEEN, GAUSS@Jülich, Germany



Abstract: MesoNH is the non-hydrostatic mesoscale atmospheric model of the French research community. It has been jointly developed by the Laboratoire d’Aérologie (UMR 5560 UPS/CNRS) and by CNRM-GAME (URA 1357 CNRS/Météo-France). MesoNH is now running in production mode on Tier-0 computers on up to 8K cores. The goal of this project is to prepare MesoNH for the next-generation architectures (CRAY, NVIDIA GPU and IBM BG/Q) in order to gain one order of magnitude in the scalability of the code in production runs. Development will concentrate first on continuing the port of MESONH to GPUs with the PGI/OpenACC toolkit on multi-node clusters. The second goal of the project is to test the scalability of new algorithms for the pressure solver equation (such as multigrid) to replace the present quasi-spectral algorithm.

Project name: Dispersive interactions in two-dimensional materials from the Random Phase Approximation

Project leader: Thomas Olsen, Technical University of Denmark, DK
Collaborators: Thygesen Kristian, Jacobsen Karsten, Technical University of Denmark; Enkovaara Jussi, Aalto University School of Science FI; Yan Jun, Bligaard Thomas, O’Grady Christopher, SLAC National Accelerator Laboratory USA
Research field: Chemistry and Materials



Abstract: The purpose of the project is to utilize GPU resources to perform fast ab initio quantum mechanical calculations within the Random Phase Approximation. The differential equation describing systems of interacting electrons is well known, but completely intractable if more than a few electrons are involved. For solid state systems, large molecules, and nanostructured materials it is thus crucial to apply an approximate method, which will then allow one to obtain physical quantities such as geometrical structures, total energies and optical properties. Presently, the dominant method for such calculations is Density Functional Theory (DFT), which is a self-consistent approach allowing one to treat systems containing thousands of atoms. However, DFT with the usual semi-local approximations is notoriously bad at describing systems where both dispersive and covalent interactions are important. For example, semi-local DFT fails to describe the adsorption of two-dimensional structures such as graphene and hexagonal Boron Nitride on metallic and semiconducting substrates. Due to the rapidly growing interest in these materials for applications to nanoscale electronics and optoelectronics, there is a pressing need to develop and optimize methods going beyond DFT. Presently, one of the most promising post-DFT methods is the Random Phase Approximation (RPA). During the past decade there has been an increasing interest in RPA for applications to van der Waals bonded systems, and it is now well established that the method accurately describes electronic structure problems where dispersive interactions play an important role (T. Olsen et al. PRL 107, 156401). However, the computational load of RPA calculations is significantly larger than that of DFT, and so far the method has only been applied to small structures containing at most 20-30 atoms. For ab initio simulations of nanoscale electronics devices it will be important to apply the RPA method to much bigger systems.
The major bottleneck of RPA calculations is the large number of matrix operations needed to set up the response function describing the electronic system. Preliminary results indicate that a GPU version of the code will speed up RPA calculations by a factor of 20-50 compared to the CPU implementation. Furthermore, the RPA calculations are efficiently parallelized, and with a fully working GPU version of the code it will become possible to simulate large electronic systems that have previously been out of reach due to the huge computational load.
Resource awarded: 100.000 GPU-hours on Curie Hybrid GENCI@CEA, France

Project name: Asynchronous large scale linear and non linear system solving

Project leader: Raphael Couturier, University of Franche Comte, FR
Collaborators: Charr Jean-Claude, University of Franche Comte, FR
Research field: Mathematics and Computer Science
Resource awarded: 100.000 GPU-hours on CURIE Hybrid, GENCI@CEA, France; 250.000 core-hours on JUQUEEN, GAUSS@Jülich, Germany



Abstract: Asynchronous iterative methods may be used to solve large-scale linear and nonlinear systems. These methods may be more scalable than synchronous ones because no synchronization is required. We have extensive experience in designing large-scale asynchronous iterative solvers, and in this project we would like to investigate two new approaches: 1) Solving nonlinear Initial Value Problems (IVP) with Waveform Relaxation (WR) methods over GPU clusters. We have already shown that these methods are interesting over distributed systems because they allow many time steps to be computed without synchronization. 2) Solving linear systems using multisplitting methods, which are two-stage methods. The inner method is a synchronous one, for example GMRES on relatively small blocks, and the outer method is an asynchronous one between all the synchronous blocks. For this second application, we are interested in using only CPUs.
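
The two-stage structure described in the second approach can be sketched as follows. This is an illustration under stated assumptions, not the authors' solver: the outer loop here is synchronous (a plain block-Jacobi sweep) rather than asynchronous, a few Jacobi sweeps stand in for GMRES as the inner method, and the test matrix, block layout and function names are hypothetical.

```python
def inner_jacobi(A, rhs, x, block, sweeps):
    """Inner stage: approximately solve one diagonal block with Jacobi sweeps."""
    for _ in range(sweeps):
        updated = {
            i: (rhs[i] - sum(A[i][j] * x[j] for j in block if j != i)) / A[i][i]
            for i in block
        }
        for i, value in updated.items():
            x[i] = value


def two_stage_solve(A, b, blocks, outer_iters=50, inner_sweeps=5):
    """Outer stage: block-Jacobi iteration over the blocks (synchronous here)."""
    n = len(b)
    x = [0.0] * n
    for _ in range(outer_iters):
        x_old = x[:]  # every block reads the same previous outer iterate
        x_new = x[:]
        for block in blocks:
            inside = set(block)
            # Move the coupling to the other blocks onto the right-hand side.
            rhs = {
                i: b[i] - sum(A[i][j] * x_old[j] for j in range(n) if j not in inside)
                for i in block
            }
            inner_jacobi(A, rhs, x_new, block, inner_sweeps)
        x = x_new
    return x


# Hypothetical test problem: diagonally dominant tridiagonal matrix.
n = 6
A = [
    [4.0 if i == j else (-1.0 if abs(i - j) == 1 else 0.0) for j in range(n)]
    for i in range(n)
]
b = [sum(row) for row in A]  # chosen so that the exact solution is all ones
blocks = [[0, 1, 2], [3, 4, 5]]

x = two_stage_solve(A, b, blocks)
max_error = max(abs(value - 1.0) for value in x)
print(f"max error: {max_error:.2e}")
```

In an asynchronous variant, each block would update from whatever neighbour values are currently available instead of waiting for the shared `x_old` snapshot, which is what removes the synchronization point.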

Project name: Numerical highway to the Earth’s core

Project leader: Nathanael Schaeffer, CNRS, FR
Research field: Earth Sciences and Environment
Resource awarded: 200.000 core-hours on CURIE FN/TN and 100.000 GPU-hours on CURIE Hybrid, GENCI@CEA, France



Abstract: Our previous project has shown that we can compute a geodynamo simulation (that is, a direct numerical simulation of the flow of liquid metal inside the Earth’s core coupled to the self-generated magnetic field) on short time-scales with parameters much more realistic than the current state-of-the-art full geodynamo simulations. Current geodynamo simulations operate in a regime that does not allow Alfven waves to propagate and play a significant role, while it is thought that they are important in the Earth’s core. This is because the Alfven number A (the fluid velocity divided by the Alfven wave speed) is of order 1, while in the Earth’s core it is about A=0.01. To improve on this situation, we use the scaling laws proposed by Christensen and Aubert (2006) and Aubert et al. (2009) in order to rescale the velocity and magnetic fields as well as the non-dimensional control parameters, and restart a new simulation with parameters closer to those of the Earth’s core. Contrary to a simulation started with random or arbitrary initial conditions, this allows us to minimize transients and start the simulation in a state very close to the statistical equilibrium. Therefore, we do not need to compute too many time-steps before reaching a significant solution. This gives us a numerical highway to approach the state of the Earth’s core. Thanks to a previous preparatory call (2010PA1039), we have been able to take one step in the right direction, namely to compute a fully resolved simulation at a viscosity 10 times lower than the original one, going from an Ekman number of 1e-5 to 1e-6 (while the Earth is at 1e-15), leading to an Alfven number reduced to about 0.5. This is not enough yet, and we need one or two more steps to obtain truly interesting results. In order to do so, we need to further optimize our numerical code XSHELLS.

Project name: In-Stent Restenosis

Project leader: Alfons Hoekstra, Universiteit van Amsterdam, NL
Collaborators: Bona Casas Carles, Borgdorff Joris, Universiteit van Amsterdam, NL
Research field: Medicine and Life Sciences
Resource awarded: 250.000 core-hours on SuperMUC, GAUSS@LRZ, Germany



Abstract: In-stent restenosis is a cardiovascular disease which occurs in between 10 and 30% of the patients that undergo a cardiovascular intervention, in which a stent is deployed to fix an already-present vascular occlusion (stenosis). As a reaction to the injury caused by the stent, there is an excessive growth of smooth muscle cells that can provoke a re-occlusion/stenosis of the vessel in the area where the stent was deployed (therefore the name, in-stent restenosis). Many clinical and biological factors involved in the progression of restenotic lesions have been studied in detail over the past few years but the mystery behind the pathophysiological mechanisms of this disease is still unresolved. We have already been addressing scientific questions like the correlation between the degree of injury of the vessel and the amount of restenosis present by simulating small portions of an idealised artery by coupling models of blood flow, an agent based model of the arterial wall and an anisotropic diffusion code to simulate drug-eluting stents. Drugs from the drug-eluting stents seem to reduce the degree of restenosis. We also include a model for thrombus formation due to the modifications in the flow profile that the stent introduces. The aim of this project is to be able to simulate realistic three-dimensional geometries extracted from medical imaging, for which the scalability of the code needs to be improved

Project name: Linear-scaling DFT with 100,000 atoms in the SIESTA code: application to experimental processes at oxide perovskite surfaces

Project leader: Fabiano Corsetti, CIC nanoGUNE Consolider, SP
Collaborators: Aguado Puente Pablo, Artacho Emilio, CIC nanoGUNE Consolider, SP
Research field: Chemistry and Materials
Resource awarded: 250.000 core hours on FERMI@CINECA, Italy; 100.000 core-hours on MareNostrum@BSC, Spain; 200.000 core-hours on CURIE TN, GENCI@CEA, France; 50.000 core-hours on HERMIT, GAUSS@HLRS, 250.000 core-hours on JUQUEEN, GAUSS@Jülich and 250.000 core-hours on SuperMUC, GAUSS@LRZ, Germany



Abstract: First principles simulations using density-functional theory (DFT) have become an invaluable tool for understanding and predicting the properties of a wide range of systems and materials, from potential nanotechnology devices to biological molecules. We aim to make use of current state-of-the-art high performance computing facilities to improve the scalability and reach of the SIESTA DFT code, in order to allow for fully quantum-mechanical simulations of hundreds of thousands of atoms on thousands of cores. To achieve this, we plan to implement a re-engineered linear-scaling DFT algorithm based on the orbital minimization method that is stable, efficient, and scalable. We will then apply our method to the study of complex experimental processes at oxide perovskite surfaces, requiring the simulation of length and time scales that have, up until today, only been achievable by means of empirical potentials.

Project name: Radiative Hydrodynamics with GPUs

Project leader: Dominique Aubert, University of Strasbourg, FR
Collaborators: Ocvirk Pierre, University of Strasbourg, FR
Research field: Astrophysics
Resource awarded: 100.000 GPU on CURIE Hybrid, GENCI@CEA, France



Abstract: The project relies on two cosmological simulation codes that can handle the physics at play during the Epoch of Reionization in the Universe: gravitation, hydrodynamics and radiative transfer. The first code is RAMSES-ATON, which couples the CPU code RAMSES (for gravity+hydro, written by R. Teyssier) to the GPU code ATON (for radiation, written by D. Aubert) on a fixed Cartesian grid. Thanks to a successful INCITE application, this code is about to run on 8000 GPUs on TITAN (the hybrid section will be made available in May 2013 at the earliest). The second code is Quartz, part of the ANR-funded project EMMA, which handles all three physical processes on an AMR grid and where calculations are accelerated by GPUs. This code is in full development and aims at providing a significant GPU acceleration for AMR-structured data by carefully choosing the layout of the data and its access patterns. The preparatory access time will be dedicated to two tasks:
- first, it will be used to tune the RAMSES-ATON code in preparation for the upcoming 8000-GPU run on TITAN. In particular, some physical ingredients (star formation parameters, clumping factors, feedback process) must be calibrated on reduced versions of the full run.
- second, it will be used to develop the Quartz code. Specifically, we aim to determine what kind of data layout and access pattern would maximize GPU performance and to check that it does not conflict with MPI inter-CPU performance. We would also add subgrid processes such as star formation, which could be done by comparing Quartz and RAMSES-ATON.

Project name: Parallel Mesh Generation with Netgen

Project leader: Can Ozturan, Bogazici University, TR
Collaborators: Yilmaz Yusuf, Bogazici University, TR
Research field: Engineering and Energy
Resource awarded: 200.000 core-hours on CURIE FN/TN, GENCI@CEA, France



Abstract: The main objective of this project is the generation of multi-billion element unstructured meshes on complex CAD geometries. To achieve this, a parallel tetrahedral mesh generator is developed based on existing sequential mesh generation software. The Netgen mesh generator is used as the sequential mesh generation software because of its availability as LGPL open source software and its wide user base. Parallel mesh generation routines are developed using the MPI libraries and the C++ language. The parallel mesh generation algorithms developed proceed by first decomposing the whole geometry into a number of sub-geometries sequentially on a master node and then meshing each sub-geometry in parallel on multiple processors. Three methods are implemented. The first decomposes the CAD geometry into sub-geometries which are sent to other processors for volume mesh generation. The second and third methods are refinement-based methods that also make use of the CAD geometry information. The second refines volume meshes whereas the third refines surface meshes. To facilitate mesh data structure migration and repartitioning, a scalable migration algorithm that uses the “owner updates” rule is also developed. This work will concentrate in particular on the third method, which is based on surface mesh refinements, to generate multi-billion element meshes.

Project name: Optimizing JADIM Code for parallel computation of bubbly flow

Project leader: Thomas Bonometti, Institut de Mecanique des Fluides de Toulouse, FR
Collaborators: Pedrono Annaig, Institut de Mecanique des Fluides de Toulouse, FR
Research field: Engineering and Energy
Resource awarded: 200.000 core-hours on CURIE TN, GENCI@CEA, France



Abstract: JADIM is a research code developed at IMFT (Institut de Mecanique des Fluides de Toulouse, Toulouse, France). This numerical tool solves the 3D unsteady incompressible Navier-Stokes equations on curvilinear orthogonal meshes. The code is based on a finite volume approach using a classical projection method to enforce incompressibility. The present approach allows the computation of a wide variety of one- and multi-phase flows including:
- turbulent flows through Large-Eddy Simulation (LES, dynamic mixed subgrid-scale model),
- heat transfer,
- dispersed two-phase flows (two-way coupled Lagrangian particles tracking),
- two and three-phase flows with deformation, break-up and coalescence (Volume Of Fluid approach),
- fluid-solid interaction (Immersed Boundary Method).
Here we are interested in the computation of dense bubbly flows. The JADIM code is parallelized by domain decomposition using a message passing interface (MPI) library. At present, the JADIM code has been running on O(100^3) cores for grid sizes up to O(10^8) grid points. Our goal is to increase the size of the computation up to O(10^9) grid points in order to capture all the temporal and spatial scales of the flow. The aim of the present project is to optimize the code. In particular, the optimization will focus on (1) decreasing the amount of memory per core, (2) improving the parallel efficiency of the sparse linear solver, and (3) reducing the number of blocking MPI communications.

Project name: Nested DOUAR: Coupling high and low resolution finite element models to solve 3D geologic problems

Project leader: David Whipp, University of Helsinki, FI
Research field: Earth Sciences and Environment
Resource awarded: 200.000 core-hours on CURIE FN, GENCI@CEA, France



Abstract: DOUAR (Braun, J., Thieulot, C., Fullsack, P., DeKool, M., Beaumont, C., & Huismans, R. (2008). DOUAR: A new three-dimensional creeping flow numerical model for the solution of geological problems. Physics of the Earth and Planetary Interiors, 171(1-4), 76–91. doi:10.1016/j.pepi.2008.05.003) is a parallel octree-based viscous-plastic creeping flow finite-element program used to simulate deformation of Earth materials from crustal to upper mantle scale (10 – >1000 km). One significant challenge of working at this scale in 3D is running experiments at the model resolution required to adequately resolve shear zones in the frictional plastic materials typically used to simulate the Earth’s crust. For example, numerical experiments designed to simulate deformation in the Himalayan mountain system typically require a planform spatial extent of 1500 km, limiting the element size in DOUAR to no less than 6 km and poorly resolving deformation on the major faults bounding and within the system. One solution to this resolution limitation is a nested modeling approach, in which a higher-resolution 3D model is embedded within a larger-scale model with a lower resolution. By properly linking the two models operating at different scales, local regions within the nested model can reach the appropriate mesh resolution, ensuring that deformation within the model accurately represents the simulated geologic system. The intent of this project is to further develop an early version of Nested DOUAR, building on a functional, but feature-poor version of DOUAR that includes the option for insertion of zones of higher mesh resolution. The present nested version of DOUAR links both the velocity solution and the geometry of Lagrangian surfaces used to define material interfaces in the large-scale and nested models. However, several material properties, such as total rock strain, and the temperature calculations are not shared between the two model resolutions. These values are both quite important in the long-term deformation of rocks in the lithosphere, and thus they must be properly shared between the two model scales. In addition, the large-scale and nested models are currently controlled by separate input files, which could lead to user-end problems if, for example, the two files do not contain correct and consistent definitions of the material properties in each model. The project plan is to (1) complete the implementation of sharing of the full set of material properties and the thermal calculation between the two models, (2) modify the existing interface to the input file to ensure all information required by the large-scale model and the nested model is stored in a single input file, and (3) rigorously test the code at low and high resolution both to ensure the code is operating as expected and to profile the nested code performance.

Project name: OpenFOAM AMR capability for industrial large scale computation of the multiphase flow of future automotive component.

Project leader: Jerome HELIE, Continental Automotive France SAS
Collaborators: Nicolas Lamarque, Chesnel Jeremy, Continental Automotive France SAS; Eng Khan Muihammad, Ecole Centrale Lyon; Lu NaiXian, CORIA CNRS, FR
Research field: Engineering and Energy
Resource awarded: 200.000 core-hours on CURIE FN/TN, GENCI@CEA, France



Abstract: European industry faces a major challenge from global warming and from new and future European legislation on engine pollutants (Euro6, Euro7, new test cycles). As electromobility will still represent a minority over the next decade, the efficiency of reciprocating combustion engines has to be improved. A key component for improving engine efficiency is the fuel atomizer, where complex turbulent multiphase flow takes place. At present, simulation on a reduced number of processors is fully integrated into prototype development. Highly parallel computations can, in the next year, help address the scientific and technical bottlenecks in this field, especially the simulation of real industrial atomizer geometries and operating conditions, with the resulting primary liquid atomization. This requires gas/liquid interface resolution and a Large Eddy Simulation approach. An interesting compromise between academic solvers and commercial software is offered by open-source development platforms such as OpenFOAM, coded in C++. A large research and applied development community is currently collaborating to develop this platform. State-of-the-art physical models have already been implemented. To apply it successfully to real injector cases, the challenge is to extend OpenFOAM to massively parallel runs (more than 1000 processors) and simultaneously to reduce the overall number of discretization points needed to solve the non-linear filtered Navier-Stokes equations and the associated submodels. The approach chosen here is Automatic Mesh Refinement, where the mesh is dynamically refined locally to resolve the required small-scale structures, such as cavitation sites or liquid ligament generation. This promising method requires a development effort to make it compatible with parallelization, which is the target of this preliminary PRACE project.

Project name: HPMC – High-Performance Monte Carlo for nuclear reactors

Project leader: VICTOR HUGO SANCHEZ ESPINOZA, Karlsruhe Institute of Technology, DE
Collaborators: Hoogenboom Eduard, Delft Nuclear Consultancy, NL; Dufek Jan, KTH Royal Institute of Technology, SE; Leppanen Jaakko, VTT Technical Research Centre of Finland, FI
Research field: Engineering and Energy
Resource awarded: 250.000 core-hours on JUQUEEN, GAUSS@Jülich, Germany



Abstract: The HPMC project is a European Union-supported project for the development of High-Performance (HP) computing of neutron transport in nuclear reactor cores with the stochastic Monte Carlo (MC) method. Monte Carlo methods have the major advantage over deterministic methods of exactly representing the complicated reactor core geometry, with over 70,000 separate fuel rods, and of using a continuous-energy representation of all nuclear data for neutron interactions. The main drawback is the long computation time needed for statistically accurate results when a detailed power distribution over the reactor core is required. The Monte Carlo method is inherently well suited to parallel execution, and the major general-purpose Monte Carlo codes used in the project, MCNP5 (LANL, USA) and SERPENT2 (VTT, Finland), are designed for both MPI and OpenMP and their combined use for parallelisation. Calculations on a parallel computer show reasonable scalability up to, say, 32 processor cores. However, it also became clear that scaling up to large numbers of computer nodes and processor cores while preserving good speedup factors is not straightforward and will also depend on the computer architecture. The Preparatory Access application is intended to determine the scalability of the Monte Carlo codes used on a top-performance supercomputer, using up to ten to a hundred thousand processor cores, in order to establish whether detailed full-core reactor calculations will be possible in an acceptable time. Moreover, the code will be optimised for parallel execution to achieve maximum performance.

Type C – Code development with support from experts from PRACE

Project name: Enabling Xnavis (URANS solver for fluid-dynamics) for massively parallel simulations of wind farms.

Project leader: Riccardo Broglia, CNR-Insean, IT
Collaborators: Zaghi Stefano, Dubbioso Giulio, Durante Danilo, Muscari Roberto, CNR-Insean, IT; Mascio Andrea, CNR-IAC, IT
Research field: Engineering and Energy
Resource awarded: 250.000 core hours on FERMI@CINECA, Italy



Abstract: Renewable energy has become increasingly prominent over the last decades owing to growing energy demand and the need to reduce the impact of pollution. In this context, large farms of wind turbines play a crucial role in increasing the amount of green and renewable energy produced. Conceiving the next generation of such devices requires an accurate analysis of the physical phenomena involved. Due to the remarkable complexity of the problem, the theoretical and empirical approaches currently employed are not adequate to provide detailed physical insights. Numerical simulations based on physical models with a minimal set of assumptions may lead to a more reliable description. When dealing with wind turbines, several aspects need to be taken into account, from environmental data to aerodynamic properties and shape optimization. Solution of the Reynolds-averaged Navier-Stokes equations is the cutting-edge choice for such analysis. However, setting up numerical codes suitable for exploiting High Performance Computing resources may be challenging. In this project, we plan to extend the capabilities of the Xnavis software, a well-tested and validated parallel flow solver used by the research group of CNR-INSEAN. The baseline code features a hybrid MPI/OpenMP parallelization, proven to scale when running on the order of hundreds of cores (i.e. Tier-1 platforms). However, some issues arise when trying to use this code with the current massively parallel HPC facilities provided in the Tier-0 PRACE context. First of all, it is mandatory to demonstrate an efficient speed-up on up to thousands of processors. The following step will concern memory minimization when huge meshes with moving grids are used. Another important aspect is related to the pre- and post-processing phases, which need to be optimized and, possibly, parallelized. The last one concerns the implementation of MPI-I/O procedures in order to reduce the number of generated files. The support of PRACE experts would be very helpful during the code development needed to enable the code for massively parallel architectures. We expect a closer collaboration on the assessment of scalability and on guidance regarding the implementation of MPI procedures and memory management.

Project name: Scalability analysis, OpenMP hybridization and I/O optimization of a code for Direct Numerical Simulation of a real wing

Project leader: Matteo Bernardini
Research field: Engineering and Energy
Resource awarded: 250.000 core hours on FERMI@CINECA, Italy



Abstract: The project aims at extending the capabilities of an existing flow solver for Direct Numerical Simulation (DNS) of turbulent flows. Starting from the scalability analysis of the MPI baseline code, the main goal of the project is to devise an OpenMP/MPI hybridization capable of exploiting the full potential of the current architectures provided in the PRACE framework. DNS is a natural candidate for physical investigation of turbulent flows because in a direct computation all scales of motion are resolved, thus avoiding the need for any turbulence model (as required for Large Eddy Simulations or Reynolds-Averaged Navier-Stokes computations). The main drawback of DNS is the extensive computational resources required, since a very small grid spacing is necessary to obtain an accurate resolution of the near-wall small scales. Moreover, the computational resources increase dramatically with Reynolds number, following an approximately cubic power law. Current High Performance Computing systems now offer the possibility of performing Petascale computations, but suitable code design has to be planned to get the highest performance by matching the underlying hardware features. For instance, the Blue Gene/Q architecture (available in PRACE) features a complex multi-node arrangement with 16 computing cores per node and 4 hardware threads each. In this view, a possible strategy to improve the performance is to implement an MPI/OpenMP hybridization to reduce the number of MPI processes and minimize the communication time. Starting from the baseline code, which already has a full MPI parallelization based on Cartesian topologies, a hybrid MPI/OpenMP port will be attempted and the resulting performance will be analyzed and compared to the pure MPI version using common tools (e.g. Scalasca). A second, minor task of the project is to optimize the MPI-I/O, also adding a direct output suitable for open-source visualization software running in parallel (e.g. ParaView).
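As a rough illustration of the hybridization trade-off described in this abstract, the sketch below enumerates the ways one Blue Gene/Q node (16 cores with 4 hardware threads each, as stated above) could be split between MPI ranks and OpenMP threads, and estimates the relative DNS cost under the approximately cubic Reynolds-number scaling. The function names and the assumption that every hardware thread is used are illustrative only and are not taken from the project's code.

```python
def hybrid_layouts(cores_per_node=16, hw_threads_per_core=4):
    """Return (MPI ranks per node, OpenMP threads per rank) pairs.

    Assumption: all hardware threads of the node are used, so
    ranks * threads == cores_per_node * hw_threads_per_core.
    """
    total = cores_per_node * hw_threads_per_core
    return [(ranks, total // ranks)
            for ranks in range(1, total + 1)
            if total % ranks == 0]


def dns_cost_ratio(re_new, re_old, exponent=3.0):
    """Relative cost of a DNS at re_new versus re_old.

    Assumption: cost grows as Re**exponent, with exponent ~ 3
    (the 'approximately cubic power law' mentioned in the abstract).
    """
    return (re_new / re_old) ** exponent


if __name__ == "__main__":
    for ranks, threads in hybrid_layouts():
        print(f"{ranks:2d} MPI ranks x {threads:2d} OpenMP threads per node")
    print("cost factor for doubling Re:", dns_cost_ratio(2.0, 1.0))
```

Fewer ranks per node mean fewer MPI processes taking part in communication, at the price of relying more heavily on thread-level parallelism inside each rank; which layout performs best is exactly what a scalability study of this kind would have to measure.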

Project name: Next generation pan-European coupled Climate-Ocean Model – Phase 1 (ECOM-I)

Project leader: Jun She, Danish Meteorological Institute, DK
Research field: Earth Sciences and Environment
Resource awarded: 250.000 core hours on FERMI@CINECA, Italy and 200.000 core-hours on CURIE TN, GENCI@CEA, France



Abstract: ECOM (Next generation pan-European coupled Climate-Ocean Model) aims to build up and optimize the computing performance of a coupled climate-ocean model for both operational marine forecasting and regional climate modeling at pan-European scale. This will provide a modeling basis for the next generation Copernicus marine and climate service (2015-2020) and the IPCC AR6 evaluation. ECOM is divided into two phases: Phase 1 (this proposal, ECOM-I) will focus on optimizing a pan-European ocean model for operational forecasting based on HBM (HIROMB-BOOS Model). Phase 2 (to be submitted to PRACE in 3 months) will focus on a coupled pan-European climate-ocean model for climate modeling based on HIRHAM-HBM. HBM has been developed as a Baltic-North Sea forecast model over the last decade, and in recent years has been optimized with e.g. OpenMP and MPI support. Recently, the model has been extended to cover the entire European Seas (Baltic, North Sea, NW Shelf, SW Shelf, Mediterranean Sea and Black Sea). HBM is now the only model which has two-way nested coverage of the entire European Seas. With this setup, however, the model needs to be further optimized in order to reach operational forecasting standards. The goal of ECOM-I is to make HBM fast enough to meet the computing standards of an operational forecasting model, i.e., a 10-day forecast within 2 hours of wall-clock time. In ECOM-I, DMI will carry out more pan-European test runs, identify and propose solutions to the bottlenecks in the system and work together with PRACE experts to optimize the code scalability and performance. Major optimizations will be made to MPI communication, vectorization and the I/O module.

Project name: Increasing the QUANTUM ESPRESSO capabilities II: towards the TDDFT simulation of metallic nanoparticles

Project leader: Arrigo Calzolari, CNR-NANO Istituto Nanoscienze, IT
Collaborators: Corni Stefano, CNR-NANO Istituto Nanoscienze, IT
Research field: Chemistry and Materials
Resource awarded: 50.000 core-hours on HERMIT, GAUSS@HLRS, Germany



Abstract: Plasmonics is the branch of physics that studies phenomena involving quanta of collective electronic excitations for a given physical system. The plasmonic properties of bulk metals have been known for a long time and are well described by the semiclassical Drude-Lorentz model, which describes the intraband absorption due to free carriers and the interband absorption due to bound carriers in terms of the complex frequency-dependent dielectric function. At the nanoscale, the dielectric constant becomes, in general, a function that depends on the size and on the shape of the system. The excitations and the optical properties of these nano-systems do not have as well defined a character as in the bulk, and it is more complicated to characterize them. A different approach must be used to describe the excitations of these systems, whose characteristics are intermediate between those of the molecular and bulk phases. In particular, the way the excitation properties of molecules transform into those of nanoparticles and then into those of the macroscopic bulk is far from being understood, and it is a field of great current interest. In principle, the plasmonic properties of metal nanoparticles can be studied with recently developed ab initio methods, such as Time-Dependent Density Functional Theory (TDDFT), which has proved to be very efficient and gives reliable results for molecular systems. In practice, TDDFT calculations for nanoparticles of even a few nm in size represent a formidable computational challenge, not accessible with most standard codes (e.g. g09). A promising alternative is given by a recent implementation of TDDFT (the turboTDDFT code), based on a Lanczos approach to the linearized quantum Liouville equation, which allows for the calculation of extended portions of the spectrum in systems comprising several hundred atoms. In this project we plan to implement novel strategies for reducing the memory requirements and improving the weak scalability of turboTDDFT, which is a planewave pseudopotential TDDFT code included in the QUANTUM ESPRESSO (QE) package. The final goal is to obtain a net improvement of the code capabilities and to be able to study the plasmonic properties of metal nanoparticles (Ag, Au) and their dependence on the size of the system under test. This project is a preliminary but necessary step towards the simulation of hybrid systems composed of metal nanoparticles and molecular antennas coupled by plasmonic interactions. This would be a major breakthrough in the ab initio simulation of nanomaterials for the realization of optoelectronic devices and biomedical sensors. The complete characterization of this system will be deferred to a subsequent Tier-0 PRACE production project.
We have previous experience (PRACE preparatory access pa0699) in the optimization of other codes (pw.x, cp.x) of the QE suite, which have the same implementation scheme and share part of the low-level routines. This will be pivotal for the realization of the project. We also have experience in running the code on CINECA machines (e.g. SP6, FERMI, PLX); here we would like to test the performance of the code on a different architecture, such as the HERMIT Cray, in order to enhance the transferability of the code.

Project name: Scalability of gyrofluid components within a multi-scale framework

Project leader: Bruce Scott, Max-Planck-Institute for Plasma Physics, Euratom Association, DE
Collaborators: Hoenen Olivier, Coster David, Max-Planck-Institute for Plasma Physics, Euratom Association, DE; Strand Par, Chalmers University of Technology, SE
Research field: Engineering and Energy
Resource awarded: 250.000 core-hours on JUQUEEN, GAUSS@Jülich and 250.000 core-hours on SuperMUC, GAUSS@LRZ, Germany



Abstract: The EU, together with six partners representing more than half of the world’s population, is building the next generation nuclear fusion device, ITER, in Cadarache (France). This tokamak is in an experimental stage due to issues related to the plasma behaviour which are not fully understood. In particular, magnetically confined plasmas are subject to small scale instabilities which develop into a turbulent state, which generates transport that degrades the confinement quality. Thus, studying and understanding the connection between micro-turbulence (which occurs at sub-microsecond time and sub-millimeter space scales) and the global confinement time of the plasma (seconds and meters) is an important milestone on the road to fusion energy. Such a study requires tackling some very difficult computational problems. The PRACE report entitled “The Scientific Case for HPC in Europe 2012-2020” identifies three key computational challenges for magnetically confined fusion experiments (MCF). Among these is multiscale simulation involving slow transport and fast turbulent timescales. We are interested in such multiscale simulations, as part of our work within the FP7 EU-funded MAPPER (Multiscale Applications on European e-Infrastructures) project. This project defines a formal background for multiscale modelling and provides efficient tools for distributed multiscale computing. In MAPPER’s approach, a multiscale system is approximated by coupling several single-scale submodels. The main fusion scenario being addressed within the MAPPER project is the coupling of at least three different submodels, namely a 1D transport solver, a 2D equilibrium solver and a 3D gyrofluid turbulence code. All of these codes have been developed within the framework of the EFDA Integrated Tokamak Modelling task force (ITM), which makes extensive use of generic data structures, allowing a highly modular approach.
In this project, we propose to improve the parallel scalability of GEM, a 3D gyrofluid code. GEM can be used with different geometries depending on the targeted use case, and has been shown to scale well when the computational domain is distributed over two of its three dimensions. Such a distribution allows grids of sufficient size to describe small scale devices. In order to enable simulation of very large tokamaks (such as ITER), the third dimension has to be parallelized and weak scaling has to be achieved for significantly larger grids.

Project name: Direct numerical simulation of a high-Reynolds-number homogeneous shear turbulence

Project leader: Javier Jimenez, Universidad Politecnica Madrid, SP
Collaborators: DONG SIWEI, Universidad Politecnica Madrid, SP
Research field: Fundamental Physics
Resource awarded: 250.000 core-hours on JUQUEEN, GAUSS@Jülich, Germany



Abstract: Turbulence is often induced by shear, and one of the most fundamental problems in fluid dynamics is to reveal the interaction between the mean flow and the kinetic energy of the turbulent fluctuations. The simplest flow in which to investigate this chaotic interaction is the so-called homogeneous shear turbulence (HST), which has a constant velocity gradient (shear) in one direction but whose statistics are not a function of space (homogeneity). It is known that HST has velocity ‘streaks’ like those often observed in wall-bounded turbulent flows, and some features similar to the logarithmic layer in those flows. The logarithmic layer has been investigated for a long time, recently using simulations, and the multi-scale interactions among its eddies are of great interest. However, because of the non-linear nature of those interactions, the mechanisms by which large-scale motions are generated and later collapse into smaller eddies are not well understood. The key is to achieve a high-enough Reynolds number to include a healthy range of scales. The current state of the art in simulations of isotropic turbulence, which do not include shear, is Re_lambda=600-800, while the largest available simulations of wall-bounded flows (Re_tau=2000) reach Re_lambda=130. Since the cost of wall-bounded simulations increases roughly as Re_lambda^8, increasing the range of scales in real wall-bounded flows is out of reach for present computational resources, while HST is only slightly more expensive than the isotropic case. The purpose of this project is to investigate the turbulent structures of the logarithmic layer, and of other shear flows, using data from direct numerical simulations of HST. That would be the subject of a later project request which we expect to be of the order of 40 Mcpuh, based on BG/Q. We have run preliminary tests to determine the necessary box sizes and Reynolds numbers, but the simulation code, which is a modification of a previous one for turbulent channels, needs to be ported to large machines. Our group has experience in large-scale computation. The channel code has been used extensively by several groups on up to 2K cores, with good scaling, and a previous modification for boundary layers has run successfully in production on Jugene (Entrainment Effects in Rough-wall Boundary Layers; 2011040595) and Intrepid (Incite) on 32K cores. The present modification has only been tested on small clusters, on up to 256 cores. The required changes are similar to those of the previous codes. The data from the final computation will be compared with those of turbulent channels and boundary layers at high Reynolds numbers, and the results will be made available to the community.
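To give a feel for the scaling argument in this abstract, the short sketch below evaluates how much more expensive a wall-bounded simulation would become if it had to reach the Re_lambda of current isotropic simulations, assuming the rough Re_lambda^8 cost law quoted above. The function name and the choice of 600 as the target value are illustrative assumptions; the figures 130 and 600-800 come from the abstract.

```python
def relative_cost(re_target, re_reference, exponent=8):
    """Cost multiplier for raising Re_lambda from re_reference to re_target.

    Assumption: cost scales roughly as Re_lambda**exponent, with
    exponent = 8 for wall-bounded flows as quoted in the abstract.
    """
    return (re_target / re_reference) ** exponent


if __name__ == "__main__":
    # Wall-bounded flows currently reach Re_lambda = 130; isotropic
    # simulations reach 600-800. Take the lower end as the target.
    factor = relative_cost(600, 130)
    print(f"estimated cost multiplier: {factor:.1e}")
```

Under these assumptions the multiplier is of the order of 10^5, which is why the abstract turns to HST rather than a larger wall-bounded run.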

Project name: Explicit solvent Molecular Dynamics Simulation of ribosome unit with Gromacs

Project leader: Leandar Litov, University of Sofia “St. Kliment Ohridski”, BU
Collaborators: Apostolov Rossen, KTH Royal Institute of Technology, SE
Research field: Medicine and Life Sciences
Resource awarded: 250.000 core-hours on JUQUEEN, GAUSS@Jülich, Germany and 250.000 core hours on FERMI@CINECA, Italy



Abstract: The aim of the project is to test the performance scalability of Gromacs 4.6.1 on IBM Blue Gene/Q and to optimize for production runs a simulation setup for explicit solvent molecular dynamics simulations of huge macromolecular ribosome systems containing several million atoms. We may also test different optimizations of the Gromacs routines that calculate atomic forces. The behavior of Gromacs 4.6 molecular dynamics simulations involving millions of atoms running on Blue Gene/Q has not yet been investigated in depth, and such a study is of great importance for the research community which has access to HPC platforms like FERMI at CINECA and JUQUEEN at Jülich Supercomputing Centre. The performance of the Gromacs Blue Gene/Q kernels will be investigated and the code will be optimized in collaboration with the Gromacs development team.

Project name: Massively Parallel Multiple Sequence Alignment Method Based on Artificial Bee Colony

Project leader: Plamenka Borovska, Technical University of Sofia, BU
Collaborators: Gancheva Veska, Landzhev Nikolay, Technical University of Sofia, BU
Research field: Medicine and Life Sciences
Resource awarded: 250.000 core-hours on JUQUEEN, GAUSS@Jülich, Germany



Abstract: Biological sequence processing is essential for bioinformatics and life science. This scientific area requires powerful computing resources for exploring large sets of biological data. Parallel implementations of methods and algorithms for the analysis of biological data using high-performance computing are important for accelerating research and reducing the investment required. Multiple sequence alignment is a basic method in DNA and protein analysis. The project aims to carry out scientific experiments in the area of bioinformatics on the basis of parallel computer simulations. The aim of the project is to optimize and investigate the parallel performance and efficiency of an innovative parallel algorithm, MSA_BG, for multiple alignment of biological sequences, which is highly scalable and locality aware. The MSA_BG algorithm is iterative and is based on the concept of Artificial Bee Colony metaheuristics and the concept of algorithmic and architectural spaces correlation. The case study is discovering the evolution of the influenza virus and searching for similarity between RNA segments of various influenza A virus strains, utilizing all 8 available segments of the influenza A virus, on the basis of a parallel hybrid program implementation of the MSA_BG multiple sequence alignment method.