PRACE Preparatory Access – 25th cut-off evaluation, June 2016

Find below the results of the 25th cut-off evaluation of June 2016 for the PRACE Preparatory Access.

Projects from the following research areas:

 

Hybrid liquids from ZIFs: how do MOFs melt ?

Project Name: Hybrid liquids from ZIFs: how do MOFs melt ?
Project leader: Dr François-Xavier Coudert
Research field: Chemical Sciences and Materials
Resource awarded: 50000 core hours on MareNostrum
Description

Metal-Organic Frameworks (MOFs) constitute a fast-growing class of materials aimed at many different applications. They are mostly studied as crystalline materials for applications such as gas storage via adsorption or heterogeneous catalysis. Nonetheless, amorphous MOFs (aMOFs) are also of great interest. For instance, amorphization could be used to trap a harmful guest, detected in a crystalline MOF, inside an amorphous non-porous structure. Time-controlled drug delivery could also be achieved by amorphization of MOF drug-delivery vehicles. While traditional silica glass is hard to functionalize, MOF amorphization is of great interest for producing functional luminescent or optically active glass-like materials. Furthermore, whereas crystalline MOFs are often criticized for their weak mechanical stability, aMOFs exhibit much better mechanical properties, which is important for industrial applications. In collaboration with experimental teams, we study the temperature-induced amorphization of some Zeolitic Imidazolate Frameworks (ZIFs), especially ZIF-4. Experimentally, it is possible to obtain a glass-like material from crystalline ZIF-4 by melting and quenching. However, the liquid phase is very difficult to characterize experimentally, as is the melting process itself, because of the thermal conditions and the disordered nature of this phase. Our project focuses on this melting process at the molecular level, via ab initio molecular dynamics simulations. This level of description is necessary in order to study the melting process: force-field-based approaches developed for MOFs are still poor at describing the coordination chemistry inherent to metal-organic networks, and are often unable to describe the covalent bond breaking and reforming expected to happen here. The simulation of such systems, which contain several hundred atoms per cell, can only be performed on parallel architectures.
As we carry out simulations in the canonical ensemble, at constant density, we will need to perform them on several systems with different densities. We will then analyze the results in terms of effective normal modes to identify the vibrational modes responsible for the melting process. By computing pair distribution functions, we will also be able to compare the phase obtained by simulation with the liquid phase obtained experimentally. The aim of this study is to gain a deeper understanding of the melting process in MOFs at the molecular scale, providing information on the local structure of the liquid phase. Indeed, different crystalline MOFs behave differently upon melting as far as the local atomic environment is concerned.
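
The pair-distribution-function comparison mentioned above can be sketched in a few lines. The sketch below is a generic illustration (the function name, cubic periodic box and ideal-gas normalization are our assumptions), not the project's actual analysis code:

```python
import numpy as np

def pair_distribution(positions, box, r_max, n_bins=100):
    """Radial pair distribution function g(r) for a cubic periodic box.

    positions : (N, 3) array of atomic coordinates
    box       : cubic box edge length (r_max should stay below box/2)
    """
    n = len(positions)
    rho = n / box**3                      # number density
    edges = np.linspace(0.0, r_max, n_bins + 1)
    hist = np.zeros(n_bins)
    for i in range(n - 1):
        d = positions[i + 1:] - positions[i]
        d -= box * np.round(d / box)      # minimum-image convention
        r = np.linalg.norm(d, axis=1)
        hist += np.histogram(r, bins=edges)[0]
    # normalize by the ideal-gas count expected in each spherical shell
    shell = 4.0 / 3.0 * np.pi * (edges[1:]**3 - edges[:-1]**3)
    g = 2.0 * hist / (n * rho * shell)    # factor 2: each pair counted once
    centers = 0.5 * (edges[1:] + edges[:-1])
    return centers, g
```

For an ideal (uncorrelated) gas, g(r) fluctuates around 1; a liquid quenched from ZIF-4 would instead show the short-range coordination peaks that can be compared with experiment.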


Analysis of FLEDS Software Package Scalability

Project Name: Analysis of FLEDS Software Package Scalability
Project leader: Dr Annarita Viggiano
Research field: Engineering
Resource awarded: 5000 core hours on Marconi – Broadwell
Description

The aim of this project is the analysis of the computational performance of the FLEDS (Flow – Large Eddy and Direct Simulation) software package on the new Marconi (Broadwell) system. FLEDS is an in-house code, fully parallelized with the MPI libraries, which solves the conservation equations of compressible, multi-component, reacting, premixed and non-premixed mixtures of thermally perfect gases using advanced numerical schemes. FLEDS has a wide range of applications, from the study of fundamental phenomena using a Direct Numerical Simulation approach to the study of the performance and emissions of propulsion systems with advanced combustion strategies and conventional and renewable fuels using Large Eddy Simulation. The results of this analysis will be used to apply to a PRACE Project Access Call for Proposals.


Scalability of pseudo-spectral code for turbulence simulation with particle tracking using a 1d-pencil decomposition

Project Name: Scalability of pseudo-spectral code for turbulence simulation with particle tracking using a 1d-pencil decomposition
Project leader: Prof. Carlos Silva
Research field: Engineering
Resource awarded: 5000 core hours on MareNostrum, 100000 core hours on Hazel Hen, 100000 core hours on SuperMUC
Description

The project aims at assessing and improving the scalability of a new code for the simulation of isotropic turbulence with particle transport using direct numerical simulations (DNS). Recent work published by the applicant in several journals (e.g. Journal of Fluid Mechanics, vol. 685, 2011, vol. 695, 2012, vol. 760, 2014, and Physics of Fluids, vol. 25, 2013, vol. 26, 2014, vol. 27, 2016) used the same code in temporal simulations of both turbulent incompressible jets and isotropic turbulence. The code was recently converted to MPI using the promising 1D ‘pencil’ decomposition through the 2DECOMP library (www.2decomp.org), which allows the use of thousands of processors. Moreover, the effects of polymer additives have been included with the FENE-P model. Recent simulations have shown a considerable improvement of the code’s capabilities compared to the earlier version used in the papers cited above. More recently, the computation of millions of trajectories of point particles (tracers) has been added to the code, and the new version has been tested successfully on our home clusters. However, this new feature still needs to be tested on a large cluster with thousands of processors, where the 1D ‘pencil’ decomposition is known to deliver its maximum performance. The applicant has experience in running similar tests on the Lonestar and Ranger machines of the Texas Advanced Computing Center (TACC), and recently a similar code was tested on several supercomputer architectures of the PRACE infrastructure, e.g. Cray, IBM, Intel Xeon and Xeon Phi. The aim is to run the new code in simple, very short simulations using a large number of cores in order to carry out the scalability assessment and to improve its scalability for future large-scale simulations.
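
The transpose logic behind a 1D ‘pencil’ (slab) decomposition can be illustrated serially. The sketch below mimics the two FFT stages and the global transpose that an MPI code would realize with an all-to-all; it is our illustration, not the applicant’s code nor the 2DECOMP API:

```python
import numpy as np

def slab_fft3d(field, n_procs):
    """3D FFT via a 1D (slab) decomposition, simulated serially.

    Each of the n_procs "ranks" owns a contiguous slab of x-planes,
    transforms the two locally complete axes, then a global transpose
    (an MPI_Alltoall in a real code) redistributes the data so that
    the remaining axis becomes local.
    """
    nx = field.shape[0]
    assert nx % n_procs == 0, "grid must divide evenly among ranks"
    slabs = np.split(field, n_procs, axis=0)          # distribute over x
    # stage 1: FFT over the two axes local to each slab (y and z)
    slabs = [np.fft.fftn(s, axes=(1, 2)) for s in slabs]
    # global transpose: redistribute over y so that x becomes local
    full = np.concatenate(slabs, axis=0)
    slabs = np.split(full, n_procs, axis=1)
    # stage 2: FFT over the now-local x axis
    slabs = [np.fft.fft(s, axis=0) for s in slabs]
    return np.concatenate(slabs, axis=1)
```

The result matches a direct 3D transform; in the parallel code, only the transpose step requires communication, which is why scalability hinges on the all-to-all.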


Reproducibly Finding Performance Deficits in MPI Libraries (2)

Project Name: Reproducibly Finding Performance Deficits in MPI Libraries (2)
Project leader: Dr Sascha Hunold
Research field: Mathematics and Computer Sciences
Resource awarded: 100000 core hours on Hazel Hen, 100000 core hours on SuperMUC
Description

Our research focuses on benchmarking MPI libraries, in particular MPI collective communication operations. These collective operations are fundamental building blocks of large-scale applications on current supercomputers. Thus, the performance of these libraries is of great interest to scientists as well as to system providers. We have developed a set of MPI benchmarks that enable us to verify self-consistent performance guidelines. A violation of a performance guideline directly pinpoints a performance degradation for a specific function and message size. The benchmark has been tested on several parallel machines, but only on one larger machine that is part of the TOP500 list. We would therefore like to know whether our benchmarking approach leads to insightful results on other supercomputers. The experimental results will allow us to compare the efficiency of different MPI libraries installed on current supercomputers. Our results will also enable MPI developers and system administrators to test and tune individual functions.
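
The guideline-checking idea can be sketched as follows: a collective should not be slower than a composition of other operations that realizes the same data movement. The guideline table, function names and tolerance below are hypothetical placeholders, not the authors’ actual benchmark:

```python
# Hypothetical self-consistent performance guidelines: each collective
# should not be slower than a "mock-up" built from other operations.
GUIDELINES = [
    ("allgather", ("gather", "bcast")),
    ("allreduce", ("reduce", "bcast")),
    ("scatter",   ("bcast",)),
]

def find_violations(timings, tolerance=1.10):
    """Return (function, size) pairs whose measured time exceeds the
    composed mock-up time by more than the given tolerance.

    timings: dict mapping (function_name, message_size) -> seconds
    """
    sizes = {s for (_, s) in timings}
    violations = []
    for func, parts in GUIDELINES:
        for size in sorted(sizes):
            try:
                t_func = timings[(func, size)]
                t_mock = sum(timings[(p, size)] for p in parts)
            except KeyError:
                continue                  # no measurement at this size
            if t_func > tolerance * t_mock:
                violations.append((func, size))
    return violations
```

Each reported pair pinpoints a function and message size where the installed library leaves performance on the table.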


Investigation of Activation Mechanism of SpCas9 with Molecular Dynamics Simulations

Project Name: Investigation of Activation Mechanism of SpCas9 with Molecular Dynamics Simulations
Project leader: Asst. Prof Sefer Baday
Research field: Biochemistry, Bioinformatics and Life sciences
Resource awarded: 50000 core hours on Marconi – Broadwell, 50000 core hours on MareNostrum, 5000 cores MareNostrum Hybrid Nodes, 100000 core hours on Hazel Hen, 100000 core hours on Juqueen, 100000 core hours on SuperMUC
Description

The Cas9 protein (CRISPR-Associated Protein 9), which is part of type II CRISPR (clustered regularly interspaced short palindromic repeats) bacterial immune systems, has become a very popular tool for engineering the genome of many different organisms. The finding that Cas9 can be easily programmed to target new sites by altering its guide RNA sequence has accelerated and revolutionized genome-editing studies. The CRISPR-Cas9 system allows us to perform many complicated applications, such as the removal of disease genes, modification of existing genes and insertion of new genes, with high precision and a much easier methodology. Thus, the CRISPR-Cas9 system offers a plethora of possible applications in many different areas including biomedical research, therapeutics and biotechnology. Cas9 is activated by an RNA molecule that is specific to the target DNA sequence and is used as the “guide” (also named sgRNA, small guide RNA) for the function of Cas9. Upon binding of sgRNA, Cas9 changes its conformation and recruits the DNA molecule to be edited. In order for Cas9 to function properly, a specific nucleotide sequence (the 5’-NGG-3’ protospacer adjacent motif, called the PAM sequence) is required on the target DNA. Cas9 from Streptococcus pyogenes, SpCas9, has been the most widely studied protein among CRISPR/Cas9 systems. Fortunately, crystal structures of SpCas9 in complex with sgRNA and target DNA have been solved recently. Yet, despite the availability of structural information at various states of SpCas9, the dynamics of SpCas9 are poorly understood. In this project, we aim at elucidating the activation mechanism of SpCas9. We will investigate sgRNA recognition and the conformational changes occurring upon sgRNA binding. We plan to carry out microsecond-long classical and accelerated molecular dynamics simulations of SpCas9 systems in order to acquire insights into the SpCas9 activation mechanisms.
At this stage of the project, we will assess the feasibility and scalability of classical and accelerated MD simulations of SpCas9.


Instabilities of disk galaxies in highly scalable cosmological simulations

Project Name: Instabilities of disk galaxies in highly scalable cosmological simulations
Project leader: Dr. Massimo Dotti
Research field: Universe Sciences
Resource awarded: 100000 core hours on Hazel Hen
Description

To date, simulations of galaxy formation have either explored large cosmological scales at a limited (kpc) resolution (e.g. Schaye et al. 2014 and Vogelsberger et al. 2014), or focused on smaller zoom-in simulations with a spatial resolution as high as ~100 pc (e.g. Guedes et al. 2011, Bonoli et al. 2016). The second family of runs can resolve internal structures. Among these, bars have recently been proposed as one of the actors shaping galactic dynamics and the star formation history of field galaxies in the intermediate- to low-redshift Universe (e.g. Gavazzi et al. 2015, Cheung et al. 2015; see the scientific case section for a thorough motivation). Because of the low number of resolved objects, zoom-in simulations have little power in constraining which physical processes are the most common and significant in shaping galaxy evolution as a function of galactic mass and merger history. On the other hand, processes associated with internal structures such as bars are not resolved in currently published large-scale cosmological runs. In order to overcome these drawbacks, we plan to realize a large-scale cosmological simulation with multiple refined zones, centered on field Milky-Way-like galaxies. Such an audacious attempt has become feasible only recently thanks to the new publicly available code ChaNGa (Menon et al. 2015), whose Charm++-based implementation allows for unprecedented scalability. As detailed in Menon et al. (2015), ChaNGa has been tested only on pure dark matter (DM) simulations of up to 24G particles, showing almost linear scaling up to 512k cores, and on strongly clustered datasets sampled with only 52M particles that scaled close to linearly up to 8k cores. As published ChaNGa scalability tests are available only for pure DM simulations, we plan to start testing the code on an already evolved and highly clustered DM+baryon distribution. We require our cosmological box to be large enough to contain tens of Milky-Way-like field galaxies.
In order to test the code in the most computationally demanding phase of the future production run (close to redshift 0), we will use as initial conditions two boxes containing 8 and 27 replicas of the z~0.1 snapshot from ErisBH (Bonoli et al. 2016). The largest run will evolve a comoving box of 270 Mpc on a side, for a total of ~10 billion particles. In the zoomed region we will achieve a gravitational resolution of 120 physical pc and a mass resolution of 98k, 20k, and 6k solar masses for DM, gas, and stars, respectively. We stress that these initial configurations (1) will allow us to obtain a conservative upper limit on the computational time of the production run, which will then be initialized from scratch at a second stage, and (2) will allow us to run the test for about 1 Gyr. Such a timespan is longer than that associated with the sub-grid physics prescriptions, but at the same time will allow us to save more than 10 Gyr of evolution of our cosmological box without losing any scaling-related information.


Determining drug binding sites in adeB multidrug exporter of Acinetobacter baumannii

Project Name: Determining drug binding sites in adeB multidrug exporter of Acinetobacter baumannii
Project leader: Asst. Prof Sefer Baday
Research field: Biochemistry, Bioinformatics and Life sciences
Resource awarded: 50000 core hours on Marconi – Broadwell, 50000 core hours on MareNostrum, 5000 cores MareNostrum Hybrid Nodes, 100000 core hours on Hazel Hen, 100000 core hours on Juqueen, 100000 core hours on SuperMUC
Description

Acinetobacter baumannii is an opportunistic bacterial pathogen primarily associated with hospital-acquired infections of the lower respiratory tract in ventilator-assisted patients, urinary tract and bloodstream infections, and sepsis. Recently, the resistance of A. baumannii to antibiotics has escalated and become a very important issue. After reports of A. baumannii strains resistant to all known antibiotics, A. baumannii became a very dangerous threat to patients in intensive care units. Recent studies have shown that the majority of this antibiotic resistance stems from the up-regulation of the AdeABC efflux pump. Bacterial efflux pumps remove antibiotics from cells so that they do not harm the bacterium. An important strategy to tackle efflux-pump-associated antibiotic resistance is to develop efflux pump inhibitors that can be co-administered with antibiotics. In order to design compounds that compete with antibiotics, the binding sites of antibiotics in the AdeABC efflux pump must be known. In this project, we aim at determining antibiotic binding sites in the AdeABC efflux pump to enable the development of novel inhibitors that could potentially fight antibiotic resistance.


Scalability tests for first-principles point-defect energetics of 3R-CuCrO2 by means of VASP

Project Name: Scalability tests for first-principles point-defect energetics of 3R-CuCrO2 by means of VASP
Project leader: Dr Manuel Perez Jigato
Research field: Chemical Sciences and Materials
Resource awarded: 100000 core hours on Hazel Hen
Description

The first-principles supercell method is one of the main approaches for investigating point defects in solids, providing access to finite-size effects by means of calculations of the point defect embedded in supercells of varying size. In order to carry out absolute convergence studies of the formation energy of the neutral copper vacancy in 3R-CuCrO2, supercells of varying size and lattice type are to be computed, our goal being to investigate the gap between a 360-atom supercell and a 2880-atom supercell under the PBE0 hybrid functional. Small-scale scalability tests for total-energy PBE0 calculations are presented in table 1 for different compounds, and in figures 1, 2 and 3 for 3R-CuCrO2 on Intel Xeon processors (12 cores per node). Linear behaviour is exhibited up to 144 cores. In order to carry out large-scale scalability tests, we intend to investigate the supercells of 360 atoms and 2880 atoms starting at 1000 cores and above.
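
For context, the finite-size study boils down to a formation-energy formula plus an extrapolation in inverse supercell size. The sketch below (function names, the chemical-potential convention and all numbers are our illustrative assumptions, not VASP output) shows the bookkeeping:

```python
import numpy as np

def formation_energy(e_defect, e_bulk, mu_cu):
    """Neutral-vacancy formation energy for one supercell:
    E_f = E(cell with vacancy) - E(pristine cell) + mu(removed Cu atom).
    """
    return e_defect - e_bulk + mu_cu

def extrapolate(sizes, energies):
    """Linear fit of E_f against 1/N (N = atoms per supercell); the
    intercept estimates the infinite-supercell (dilute-defect) limit.
    """
    slope, intercept = np.polyfit(1.0 / np.asarray(sizes, float),
                                  np.asarray(energies, float), 1)
    return intercept
```

With PBE0 total energies for the 360- and 2880-atom cells, the same fit quantifies how much finite-size error remains in the smaller cell.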


Type B: Code development and optimization by the applicant (without PRACE support) (11)

Porting and Scaling for Applications of E-CAM Community

Project Name: Porting and Scaling for Applications of E-CAM Community
Project leader: Dr Alan O
Research field: …
Resource awarded: 100000 core hours on Marconi – Broadwell, 250000 core hours on Hazel Hen, 250000 core hours on Juqueen, 250000 core hours on SuperMUC
Description

E-CAM is an e-infrastructure for software, training and consultancy in simulation and modelling. It is one of eight Centres of Excellence (CoEs) for computing applications within Horizon 2020. Part of the services that E-CAM will provide to its user community is a continuous integration infrastructure. A component of this infrastructure will be the use of JUBE as the regression-test infrastructure, used both for verification of correctness and for performance at scale. The goal of this project is to take a sample set of user applications and add them to this infrastructure, porting them to PRACE environments and performing scaling tests on PRACE infrastructures. The resulting benchmark metadata can then be used by the respective community as a reference point in any application development they engage in. This consultancy process will allow the users of the sample set to apply for production PRACE resources at scale. The initial application set will include QuantumEspresso, DL_POLY, Wannier90 and CPMD.


Improving the Efficiency of Parallel Sparse Linear System Solvers

Project Name: Improving the Efficiency of Parallel Sparse Linear System Solvers
Project leader: Prof. Cevdet Aykanat
Research field: Mathematics and Computer Sciences
Resource awarded:  100000 core hours on Marconi – Broadwell, 100000 core hours on MareNostrum, 250000 core hours on Hazel Hen, 250000 core hours on Juqueen, 250000 core hours on SuperMUC
Description

The objective is to increase the performance and scalability of parallel sparse linear system solvers. We propose a new parallel algorithm to compute the minimum-norm solution of sparse underdetermined linear systems. Our algorithm exploits the structure of the coefficient matrix and divides it into sub-matrices according to the number of processors in the parallel environment. Each processor solves its small linear system individually and finds a partial result. To calculate the global solution, we then have to solve an underdetermined linear system stemming from the columns of the coefficient matrix that overlap between two cascaded blocks. After solving this smaller linear system and using its solution to combine the partial results, the global solution is obtained.
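
For reference, the quantity every processor is working towards has a closed form: for a full-row-rank A with more columns than rows, the minimum 2-norm solution of Ax = b is x* = Aᵀ(AAᵀ)⁻¹b. A dense NumPy sketch of this baseline (our illustration; the proposed algorithm instead partitions A into overlapping column blocks and solves in parallel):

```python
import numpy as np

def min_norm_solution(A, b):
    """Minimum 2-norm solution of the underdetermined system A x = b,
    assuming A (m x n, m < n) has full row rank:
        x* = A^T (A A^T)^{-1} b
    """
    return A.T @ np.linalg.solve(A @ A.T, b)
```

The result coincides with the pseudoinverse solution pinv(A) @ b and serves as a correctness reference for any block-parallel variant.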


UEABS: PRACE’s Unified European Applications Benchmark Suite

Project Name: UEABS: PRACE’s Unified European Applications Benchmark Suite
Project leader: Mr. Walter Lioen
Research field: Mathematics and Computer Sciences
Resource awarded: 100000 core hours on Marconi – Broadwell, 100000 core hours on MareNostrum, 20000 core hours on MareNostrum Hybrid Nodes, 250000 core hours on Hazel Hen, 250000 core hours on Juqueen, 250000 core hours on SuperMUC
Description

This project will support the activities of PRACE-4IP Task 7.3.A in porting, testing and running scalability tests on the application codes from an upcoming new release of the Unified European Application Benchmark Suite.

top

Optimizing Multiple Communication Cost Metrics for Scalable Tensor Decomposition

Project Name: Optimizing Multiple Communication Cost Metrics for Scalable Tensor Decomposition
Project leader: Prof. Dr. Cevdet Aykanat
Research field: Mathematics and Computer Sciences
Resource awarded: 100000 core hours on Marconi – Broadwell, 20000 core hours on MareNostrum Hybrid Nodes, 250000 core hours on Hazel Hen, 250000 core hours on SuperMUC
Description

Tensors are generalizations of matrices to three or more dimensions, i.e., representations of multi-dimensional data. They arise in many scientific and engineering domains such as numerical analysis, neuroscience, computer vision, data mining, and knowledge-base analytics. For example, electronic health records constitute a three-dimensional sparse tensor with dimensions patient-procedure-diagnosis. The latent features of a given tensor are generally analyzed by decomposing the tensor into smaller pieces. The most widely used technique for tensor decomposition is called Candecomp/Parafac (CP). It decomposes a given n-dimensional tensor into n factor matrices so that the sum of the outer products of the columns of those matrices approximates the given tensor. CP decomposition is generally computed via the Alternating Least Squares (ALS) method, which is iterative and computationally expensive. At each iteration, the expensive Matricized Tensor Times Khatri-Rao Product (MTTKRP) operation, in which the nonzeros of the tensor are multiplied with factor matrices, is performed n times for an n-dimensional tensor. Even for sparse tensors, computing ALS sequentially is costly in terms of both memory and computation, so efficient and scalable parallel ALS methods for distributed-memory systems are required. In this project, we aim to optimize the communication costs of a parallelization of the ALS method on sparse tensors for distributed-memory systems. There is a gap in the literature for minimizing these costs in terms of different metrics, such as latency-based ones. We have devised a combinatorial model for minimizing those metrics to scale the ALS method to thousands of processors. Our method permutes each dimension of the tensor in such a way that each processor owns a subtensor with an approximately equal number of nonzeros, the number of factor rows sent by each processor is minimized and balanced, and the total number of messages communicated is reduced.
The latency-based metric of minimizing the total number of messages will become more beneficial for scalability with thousands of processors, because the average amount of computation and communication volume per processor will shrink, in contrast to the increasing number of messages per processor. We wish to conduct experiments showing the validity of our model for scaling the ALS method on different architectures. We plan to compare our model against DMS [1], which distributes the tensor data to the processors without permuting it to optimize any communication cost metric via a combinatorial model. We also plan to compare our model against the tensor partitioning algorithms proposed in [2], which optimize only the total communication volume of processors.
[1] S. Smith and G. Karypis, “A medium-grained algorithm for distributed sparse tensor factorization”, IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2016.
[2] O. Kaya and B. Uçar, “Scalable sparse tensor decompositions in distributed memory systems”, Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, ACM, 2015.
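
As a concrete reference point, a serial, dense-factor sketch of CP-ALS for a 3-way tensor is shown below. Names and the SVD-based initialization are ours; a production code would execute the MTTKRP line in parallel on a distributed sparse tensor, which is exactly where the communication costs discussed above arise:

```python
import numpy as np

def khatri_rao(B, C):
    """Column-wise Kronecker (Khatri-Rao) product of B (J x R) and C (K x R)."""
    J, R = B.shape
    K, _ = C.shape
    return (B[:, None, :] * C[None, :, :]).reshape(J * K, R)

def unfold(X, mode):
    """Mode-n matricization of a 3-way tensor (C-order over the other modes)."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def cp_als(X, rank, n_iter=100):
    """Rank-R CP decomposition of a 3-way tensor via Alternating Least Squares.

    Factors are initialized from the leading singular vectors of each
    unfolding; each sweep solves three linear least-squares problems.
    """
    A = [np.linalg.svd(unfold(X, n), full_matrices=False)[0][:, :rank]
         for n in range(3)]
    for _ in range(n_iter):
        for n in range(3):
            i, j = [m for m in range(3) if m != n]
            kr = khatri_rao(A[i], A[j])
            gram = (A[i].T @ A[i]) * (A[j].T @ A[j])   # Hadamard of Grams
            # unfold(X, n) @ kr is the MTTKRP kernel that dominates at scale
            A[n] = unfold(X, n) @ kr @ np.linalg.pinv(gram)
    return A
```

For a tensor of exact rank R, the reconstruction from the returned factors converges to the input; on real sparse data, the same sweep structure applies with sparse MTTKRP kernels.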


Asynchronous Runtime System for TiDA

Project Name: Asynchronous Runtime System for TiDA
Project leader: Asst. Prof. Didem Unat
Research field: Mathematics and Computer Sciences
Resource awarded: 250000 core hours on Hazel Hen, 250000 core hours on SuperMUC
Description

This project focuses on the development of an asynchronous runtime system and its scheduler for TiDA, a tile-based programming model. We have previously developed the TiDA library, which provides management of data locality. TiDA currently uses bulk-synchronous communication, which is not suitable for highly parallel multicore chips because the communication overhead poses a serious limitation to an application’s scalability. This project aims to develop a runtime system to scale TiDA applications on large-scale systems. The runtime system will be important for thread creation, work partitioning and efficient use of hardware resources, and will thus provide scalability through asynchronous communication in the message-passing layer.


Improving Communication Layer of a Computational Simulation of Multiphase Flows

Project Name: Improving Communication Layer of a Computational Simulation of Multiphase Flows
Project leader: Asst. Prof. Didem Unat
Research field: Engineering
Resource awarded: 250000 core hours on Hazel Hen, 250000 core hours on SuperMUC
Description

This project targets scalable parallelization of an Eulerian-Lagrangian method, namely the 3D front-tracking method, for simulating multiphase flows. Currently, the Eulerian and Lagrangian grids are represented by separate MPI processes. In this work, we plan to combine the tasks of these two types of processes in a single process and exploit task parallelism to increase performance. In addition, due to the uneven distribution of the Lagrangian grids over the Eulerian domain, there is load imbalance between processors. We will therefore develop load-balancing strategies to improve the performance of the simulation.


Alternative communication patterns for large-scale irregular applications

Project Name: Alternative communication patterns for large-scale irregular applications
Project leader: Prof. Cevdet Aykanat
Research field: Mathematics and Computer Sciences
Resource awarded: 100000 core hours on MareNostrum, 250000 core hours on Hazel Hen, 250000 core hours on Juqueen, 250000 core hours on SuperMUC
Description

Sparse irregular applications that rely on large-scale computations require communication and synchronization overheads to be reduced for scalability. Such applications often exhibit an irregular communication pattern and are usually realized with point-to-point communication primitives. The importance of the different components of the overall communication cost can be disproportionate due to the irregularity and sparseness inherent in the application. In such conditions, the best strategy for communication should favor the metric that is most crucial for performance, and a general method that attributes the same importance to all metrics is likely to suffer. This makes alternative communication patterns tailored for a specific communication cost metric appealing while realizing point-to-point communications. Our aim is to develop techniques centered around alternative communication patterns in order to achieve a trade-off between the important communication cost metrics that determine parallel performance. For this purpose, we plan to investigate a store-and-forward method that realizes point-to-point communication operations in multiple stages in order to establish bounds on the latency overhead. We wish to examine empirically the effect of increasing the number of stages on the scalability and characteristics of the application. We believe that considering multiple communication cost metrics is the key to achieving scalability at very high core counts on modern systems.
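
The store-and-forward idea can be illustrated with a two-stage all-to-all on a √p × √p process grid: each item is forwarded along the sender’s row and then down the destination’s column, bounding the per-process message count by 2(√p − 1) instead of p − 1, at the price of forwarding extra volume. The simulation below (function name and the square-grid assumption are ours) is a generic sketch, not the proposed method:

```python
import math

def staged_all_to_all(payloads):
    """Two-stage store-and-forward all-to-all, simulated serially.

    payloads[i][j] = item that process i wants delivered to process j.
    Returns (delivered, max_messages_per_process), where
    delivered[j][i] is the item j received from i.
    """
    p = len(payloads)
    q = math.isqrt(p)
    assert q * q == p, "this sketch assumes a square number of processes"
    sent = [set() for _ in range(p)]          # distinct messages per process
    buffers = [[] for _ in range(p)]
    # stage 1: forward along the sender's row to the destination's column
    for src in range(p):
        for dst, item in payloads[src].items():
            inter = (src // q) * q + (dst % q)
            if inter != src:
                sent[src].add(("row", inter))
            buffers[inter].append((src, dst, item))
    # stage 2: forward down the column to the final destination
    delivered = [dict() for _ in range(p)]
    for inter in range(p):
        for src, dst, item in buffers[inter]:
            if dst != inter:
                sent[inter].add(("col", dst))
            delivered[dst][src] = item
    return delivered, max(len(s) for s in sent)
```

Adding more stages pushes the per-process message count down further (towards a logarithmic bound) while increasing the forwarded volume, which is precisely the latency/volume trade-off the project targets.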


Simulations of the quantum effects in low-dimensional magnetism

Project Name: Simulations of the quantum effects in low-dimensional magnetism
Project leader: Prof. Grzegorz Kamieniarz
Research field: Fundamental Physics
Resource awarded: 100000 core hours on Marconi-Broadwell, 100000 core hours on MareNostrum, 250000 core hours on Hazel Hen, 250000 core hours on SuperMUC
Description

The Heisenberg model is widely applicable in many areas of quantum physics and chemistry, in particular in the physics of magnetism and molecular magnetism. The model is mapped onto the corresponding square matrix defined in a Hilbert space, which is then diagonalized numerically in order to analyze the physical properties of a given system. The Hilbert-space dimension blows up exponentially with the system size, and this is the challenge encountered in applying spin models to real quantum magnets. Molecule-based metallic clusters and chains behave like individual quantum nanomagnets, displaying quantum phenomena on a macroscopic scale. In view of the potential applications of such materials in magnetic storage devices, in envisaged quantum computer processors and in low-temperature refrigerants, the accurate simulation of these complex objects becomes a key issue. Magneto-structural correlations, the role and mechanism of magnetic anisotropy, the intrinsic quantum effects following from the geometrical frustration induced by the topological arrangement of spins or by particular interactions, as well as spin transport, count among the new challenges for computer simulations. The simulations planned in the project address quantum phenomenological models, which are the most reliable theoretical representatives of the molecular-based nanomagnets investigated recently, with their reliability assessed from the fundamental microscopic point of view by well-established first-principles electronic structure calculations. Exploiting a deterministic exact diagonalization technique, the model calculations will be performed without any uncontrolled approximations and will be numerically accurate, which is crucial for the calculation of inelastic neutron scattering (INS) spectra or of ballistic transport in linear Heisenberg antiferromagnets.
The chromium-based rings, which are outstanding materials for quantum information processing and low-temperature cooling, will be the principal objects of investigation. The real challenges arise for molecules that contain more than eight Cr S=3/2 ions and/or are doped with magnetic Ni or Cu ions; nevertheless, the exact energy spectra, S-mixing, the total-spin oscillations essential for quantum coherence, and the frustration phenomena important for magnetic refrigeration will all be obtained.
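
The exponential wall of exact diagonalization is easy to demonstrate on a toy spin-1/2 Heisenberg chain. The dense sketch below is our illustration (far simpler than the symmetry-adapted, parallel codes such S=3/2 rings require, since the Hilbert space here is only 2^N-dimensional):

```python
import numpy as np

def heisenberg_spectrum(n_sites, J=1.0, periodic=True):
    """Dense exact diagonalization of the spin-1/2 Heisenberg chain
        H = J * sum_i S_i . S_{i+1}
    built from spin operators via Kronecker products. The Hilbert-space
    dimension 2**n_sites is the exponential wall mentioned above.
    """
    sx = np.array([[0, 1], [1, 0]]) / 2
    sy = np.array([[0, -1j], [1j, 0]]) / 2
    sz = np.array([[1, 0], [0, -1]]) / 2
    eye = np.eye(2)

    def site_op(op, i):
        # operator 'op' acting on site i, identity elsewhere
        out = np.array([[1.0]])
        for k in range(n_sites):
            out = np.kron(out, op if k == i else eye)
        return out

    dim = 2 ** n_sites
    H = np.zeros((dim, dim), dtype=complex)
    bonds = range(n_sites if periodic else n_sites - 1)
    for i in bonds:
        j = (i + 1) % n_sites
        for op in (sx, sy, sz):
            H += J * site_op(op, i) @ site_op(op, j)
    return np.linalg.eigvalsh(H)          # ascending real eigenvalues
```

Two sites give the textbook singlet-triplet splitting, and the N=4 ring reproduces the known ground-state energy E0 = -2J; beyond a few dozen sites the matrix no longer fits in memory, which is why the project needs massively parallel resources.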


DISPATCH: A new exa-scale framework for star and planet formation

Project Name: DISPATCH: A new exa-scale framework for star and planet formation
Project leader: Dr Jon Ramsey
Research field: Universe Sciences
Resource awarded: 100000 core hours on Marconi-Broadwell, 250000 core hours on Hazel Hen
Description

A star forms when a dense core of dust and gas, a small part of a much larger complex known as a molecular cloud, collapses under gravity. Due to the conservation of angular momentum, a fraction of the core material forms a rotating protoplanetary disk around the new star. The new star accretes material from the disk, aided by a magnetic outflow that carries away the disk’s angular momentum. All the while, the disk is influenced by its birth environment via continually infalling material that replenishes it, mainly along discrete accretion filaments. Dust in the disk settles to the midplane over time via gravity, becomes concentrated, and interacts with itself and the surrounding disk gas. Through processes that are far from being well understood, the dust grows to pebble (i.e., mm- to cm-) size, accumulates into planetesimals, and eventually forms new, or is accreted onto, planetary embryos. In particular, how efficiently planetary embryos grow under realistic conditions remains uncertain. We propose to simulate the hydrodynamical accretion of pebbles onto embedded planetary embryos with warm, hydrostatic atmospheres. By using the results of RAMSES ab initio zoom-in simulations of star and disk formation (Nordlund et al. 2014, IAU Symposium 299, p.131; Kuffmeier et al. 2016, arXiv:1605.05008, accepted for publication in the Astrophysical Journal) as realistic initial conditions, and by resolving the pressure scale height of the protoplanetary atmosphere, we are in the unique position to not only model pebble accretion to unprecedented realism, but also the interaction between the protoplanetary disk and protoplanetary atmosphere, and even convection within the atmosphere itself. We have developed a new adaptive mesh refinement code for simulating star and planet formation, DISPATCH, which specifically targets exa-scale computing, and which will be used to tackle this ambitious project. 
The current generation of astrophysical adaptive mesh refinement codes simply do not have the raw performance or scaling properties required to make this proposed project a reality, so a new strategy is needed. We have already had some performance success with DISPATCH: it updates cells at roughly 1 microsecond per cell per update and scales nearly linearly up to 12288 cores in idealised MHD experiments. To make the proposed project a reality, however, we need to test the performance of the code under realistic, production-like settings. In particular, we need to know how the code performs at 10k cores when the particle integrator and the recently implemented radiative transfer module are activated (in addition to MHD). If we can obtain scaling data under these conditions, we will be in a significantly better position to apply for the Tier-0 PRACE resources needed to bring the project to successful completion.
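Near-linear scaling of the kind quoted above is commonly quantified via Amdahl's law and parallel efficiency. The sketch below is a back-of-the-envelope illustration in plain Python, not DISPATCH code, and the 1e-5 serial fraction is a purely hypothetical value chosen for the example.

```python
# Illustrative strong-scaling arithmetic (not DISPATCH output):
# estimate speedup and parallel efficiency from an assumed serial fraction.

def amdahl_speedup(cores, serial_fraction):
    """Ideal speedup on `cores` cores with a fixed serial fraction (Amdahl's law)."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)

def parallel_efficiency(cores, serial_fraction):
    """Speedup divided by core count; 1.0 means perfectly linear scaling."""
    return amdahl_speedup(cores, serial_fraction) / cores

# With a hypothetical serial fraction of 1e-5, efficiency at 12288 cores
# stays close to linear but measurably below 1.0:
for p in (1024, 12288):
    print(p, parallel_efficiency(p, 1e-5))
```

Measured efficiencies below such an estimate typically point to communication or load-imbalance overheads, which is exactly what production-like test runs are meant to expose.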


Fast practical numerical linear algebra

Project Name: Fast practical numerical linear algebra
Project leader: Dr. Oded Schwartz
Research field: Mathematics and Computer Sciences
Resource awarded: 250000 core hours on Hazel Hen, 250000 core hours on Juqueen, 250000 core hours on SuperMUC
Description

For high-performance computing (HPC), the major bottleneck is the cost of communication between processors and within the memory hierarchy. These costs take orders of magnitude more time (and energy) than arithmetic computations, and judging by hardware trends, their share of the total cost is expected to increase further. Hence the need for communication-minimizing algorithms. Ideally, we would like to obtain lower bounds on the amount of communication required for fundamental problems and to design communication-optimal algorithms, i.e., algorithms attaining those bounds. We have obtained several such lower bounds and optimal algorithms within dense and sparse numerical linear algebra. In this project we intend to implement, tune, and benchmark some of these algorithms.


Performance Analysis of GROMACS using various Tools

Project Name: Performance Analysis of GROMACS using various Tools
Project leader: Mr. Thomas Ponweiser
Research field: Chemical Sciences and Materials
Resource awarded: 100000 core hours on MareNostrum, 250000 core hours on Hazel Hen, 250000 core hours on SuperMUC
Description

In this project, carried out in the frame of PRACE-4IP, Task 7.2, the performance of the widely used molecular dynamics simulation code GROMACS is analysed using various profiling tools. The outcomes of this project, published in the form of a PRACE white paper, are threefold. Firstly, our profiling results will give in-depth insight into the current performance hotspots of GROMACS, which may be of high interest to users and developers of GROMACS. Secondly, best practices for the use of the employed profiling tools are collected; these are not limited to GROMACS and may thus be helpful to a relatively wide user community. Lastly, the project provides feedback to the profiling tool developers, potentially leading to further improvements of their tools.


Type C: Code development with support from experts from PRACE (0)
