PRACE Preparatory Access – 32nd cut-off evaluation in March 2018

Find below the results of the 32nd cut-off evaluation of 1 March 2018 for the PRACE Preparatory Access.

Type A: Code scalability testing (15)
SA3VIS

Project Name: SA3VIS
Project leader: Dr Geoffroy Chaussonnet
Research field: Engineering
Resource awarded: 100 000 core hours on Hazel Hen
Description

The present project proposes to investigate the air-assisted atomization of non-Newtonian bio-slurries at high ambient pressure, in the context of renewable energy production based on the gasification of biomass. Knowledge of the air-assisted atomization of a non-Newtonian liquid would enable the development of a new atomizer design that takes advantage of the non-Newtonian behaviour of the slurry. Additionally, the results of the simulation will be compared to an experiment conducted at the Karlsruhe Institute of Technology. Primary breakup involves length and time scales spanning several orders of magnitude that interact with each other. Therefore, in order to gain detailed insight into this phenomenon and to identify the interactions across the different scales, a numerical domain that is both large and highly refined is required. These particular needs lead to highly computationally demanding simulations, which are only feasible on HPC clusters. The selected numerical method is the Smoothed Particle Hydrodynamics (SPH) method. It is a fully Lagrangian method, in which the interpolation points, called particles, carry the physical properties (mass, momentum, energy) and are advected at the fluid velocity. Thus, contrary to mesh-based methods, the advection term is not prone to numerical diffusion. Another advantage of this method is that the definition and convection of the gas/liquid interface are naturally described by the motion of gas/liquid particles. The method was developed in the late seventies for astrophysics, then adapted to multiphase flow in the nineties and applied to primary atomization in 2010. Since then, it has been successfully applied to different types of breakup configurations. However, recent investigations highlighted the need for larger simulations that provide a fine resolution together with a large domain. The combination of a higher resolution and a larger domain would make it possible to resolve the whole multiscale atomization cascade, from the primary instability to the spray droplets. At the Institute of Turbomachinery (ITS) of the Karlsruhe Institute of Technology (KIT), the SPH method is implemented in an in-house code called super_sph. The development of this code started in 2010, and it has always been designed with HPC in mind. In the course of smaller HPC programmes (Tier-2), super_sph showed the capacity to handle large numerical domains (more than a billion particles) in parallel simulations on up to 4000 cores. Even though super_sph has shown promising scaling trends, its performance has not been tested at the production scale required for Tier-0, i.e. more than twenty thousand CPU cores.
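
For readers unfamiliar with the method, the core idea the description relies on is that every field quantity is interpolated as a kernel-weighted sum over neighbouring particles. A minimal illustrative sketch in Python/NumPy (not taken from super_sph, which is not public; the Gaussian kernel and the 2D setting are assumptions made for brevity):

    import numpy as np

    def w_gaussian(r, h):
        # Gaussian smoothing kernel, normalised in 2D; h is the smoothing length
        return np.exp(-(r / h) ** 2) / (np.pi * h ** 2)

    def sph_density(positions, masses, h):
        # classic SPH summation: rho_i = sum_j m_j * W(|x_i - x_j|, h)
        diff = positions[:, None, :] - positions[None, :, :]  # pairwise offsets
        r = np.linalg.norm(diff, axis=-1)                     # pairwise distances
        return (masses[None, :] * w_gaussian(r, h)).sum(axis=1)

    rng = np.random.default_rng(0)
    pos = rng.random((200, 2))                       # toy particle cloud
    rho = sph_density(pos, np.full(200, 1.0 / 200), h=0.1)

Production SPH codes restrict the sum to particles inside the kernel support via neighbour lists, which is what makes the method amenable to the massively parallel runs described above.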


TFSWTFBFS – Thermal fluctuations in solid walls due to turbulent flow in backward facing step geometry

Project Name: TFSWTFBFS – Thermal fluctuations in solid walls due to turbulent flow in backward facing step geometry
Project leader: Prof Dr Iztok Tiselj
Research field: Engineering
Resource awarded: 50 000 core hours on SuperMUC
Description

The goal of this project is improved turbulent heat transfer prediction in liquid metal cooled systems. This is one of the objectives of the Euratom SESAME (thermal hydraulics Simulations and Experiments for the Safety Assessment of MEtal cooled reactors) project, which is focused on liquid metal coolants relevant for fast breeder nuclear reactors. The SESAME project supports the development of European reactors (ASTRID, ALFRED, MYRRHA, SEALER). Due to its fundamental and generic nature, its developments will also be relevant for the safety assessment of contemporary light water reactors. Our contribution is direct numerical simulation (DNS) of heat transfer in the backward facing step (BFS) geometry. Our results will be used as a comparison base for simulations with other methods, which will in turn be used as tools in the design phase. The 3D BFS geometry can be visualized as a channel in which one of the walls has the shape of a step. Liquid sodium at around 150°C flows from the narrower part to the wider part of the domain. Two solid walls are simulated within the geometry: the step wall and the wall after the step, which is in contact with the step wall. The latter wall has internal heat sources. Thermal fluctuations generated by the turbulent fluid flow penetrate into the solid walls. An experiment with liquid sodium in the same geometry and with the same heaters is planned within the SESAME project at the Karlsruhe Institute of Technology.


Spin dynamics in magnetic topological insulators: implications for quantum and classical devices

Project Name: Spin dynamics in magnetic topological insulators: implications for quantum and classical devices
Project leader: Prof Antonio Costa
Research field: Fundamental Physics
Resource awarded: 50 000 core hours on MareNostrum
Description

This project aims at exploring the potential of magnetic topological insulators as building blocks for classical communication devices of nanoscopic dimensions or for quantum communication devices. The first part of the project will assess their possible usefulness as quantum and classical data buses. The second part will deal with their properties as nano emitters of electromagnetic radiation in the terahertz frequency range. Topological insulators have unique electronic properties stemming from their peculiar electronic structure: a set of topologically protected edge states and an insulating bulk. The edge states can be used as practically lossless charge channels. They also have unique spin transport properties, such as a ground-state spin current. The combination of topological insulators with magnetic adatoms or clusters can be used to store, transmit and manipulate information, both classically and quantum mechanically. Carefully chosen magnetic adatoms may provide the needed coupling with the protected edge states without significantly disturbing their transport properties. They can also serve as qubits or classical bits themselves: the large spin-orbit coupling present in the topological insulator induces a large magnetocrystalline anisotropy in the magnetic adatom, making its magnetization direction stable at finite temperatures. The same strong anisotropy renders each magnetic adatom a nano oscillator with a natural oscillation frequency in the terahertz range. If a number of magnetic adatoms are deposited onto a topological insulator, the edge states will couple their magnetizations and can eventually lead to synchronized or phase-locked nano oscillators. These are, in principle, useful resources for building medium- and long-range miniaturized communication devices. By formulating realistic models for the electronic structure of magnetic topological insulators we will be able to evaluate the energy dissipation rate of the relevant processes quantitatively. The study of the thermodynamics of spin currents and spin excitations will allow us to devise energy-efficient data buses and nano emitters. It will also shed light on the fundamental features of spin relaxation, which is currently a subject of intense debate. Our results should be useful for guiding experimental attempts at building energy-efficient devices.


Machine Learning Optimisation for Drag Reduction in a Turbulent Boundary Layer

Project Name: Machine Learning Optimisation for Drag Reduction in a Turbulent Boundary Layer
Project leader: Dr Sylvain Laizet
Research field: Engineering
Resource awarded: 100 000 core hours on Hazel Hen
Description

In this proposal for Preparatory Access, the scalability and performance of the Computational Fluid Dynamics solver Incompact3d (www.incompact3d.com) are investigated on the supercomputer Hazel Hen, hosted by GCS at HLRS (Germany). This high-order flow solver is able to perform turbulence-resolving simulations based on the incompressible Navier-Stokes equations. It is a finite-difference solver based on a powerful 2D domain decomposition (http://www.2decomp.org/), which has led to excellent parallel efficiency, and it is used routinely for production runs on several thousands of cores in France, Brazil and the UK. For this project, we would like to investigate the behaviour of Incompact3d on one of the most powerful HPC systems in the world. The objective of the project is to perform simulations of a turbulent boundary layer in order to achieve drag reduction using a machine learning approach. The ability to create net energy savings by reducing the skin-friction drag of turbulent air flows by more than 70% – levels of drag reduction more routinely associated with the well-known phenomenon of Maximum Drag Reduction – would be of global significance, with far-reaching consequences across the fields of science and engineering. This ambitious goal can be achieved by developing and implementing a novel machine learning paradigm based on Bayesian optimisation – a global optimisation technique that requires several simulations to be performed at the same time, and which has yet to be exploited in the field of fluid dynamics.
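
To make the optimisation paradigm concrete, a toy Bayesian optimisation loop (Gaussian process surrogate plus an expected-improvement acquisition) might look as follows. This is an illustrative sketch only: the one-dimensional objective is a cheap stand-in for what would, in the project, be a full turbulence-resolving simulation of the controlled boundary layer.

    import numpy as np
    from scipy.stats import norm
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import Matern

    def objective(x):
        # stand-in for "drag reduction achieved at control setting x"
        return (-np.sin(3 * x) - x ** 2 + 0.7 * x).ravel()

    def expected_improvement(x_cand, gp, y_best):
        mu, sigma = gp.predict(x_cand, return_std=True)
        sigma = np.maximum(sigma, 1e-9)
        z = (mu - y_best) / sigma
        return (mu - y_best) * norm.cdf(z) + sigma * norm.pdf(z)

    X = np.array([[-0.9], [1.1]])        # two initial control settings
    y = objective(X)
    for _ in range(10):                  # each step would be one new simulation
        gp = GaussianProcessRegressor(Matern(nu=2.5), normalize_y=True).fit(X, y)
        cand = np.linspace(-2, 2, 400).reshape(-1, 1)
        x_next = cand[[np.argmax(expected_improvement(cand, gp, y.max()))]]
        X, y = np.vstack([X, x_next]), np.append(y, objective(x_next))
    print("best control setting found:", X[np.argmax(y)])

Because the acquisition function can propose several candidate settings at once, batches of simulations can run concurrently, which is what makes the approach attractive on a large HPC system.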


What is an asperity?

Project Name: What is an asperity?
Project leader: Dr Ing Tobias Brink
Research field: Engineering
Resource awarded: 50 000 core hours on MareNostrum
Description

While wear processes—the loss of material when different components come into contact—are present in almost all engineering applications, they are not well understood on the microscopic scale. Currently, we know that the phenomena of friction and wear are governed by the details of the microscopic contact between surfaces. In the literature, models often use the concept of contacting asperities to describe the behavior of macroscopic contact, while assuming a simplified shape for these asperities. However, contact simulations of realistic surfaces and recent experiments show that the actual contact area has a complex shape. While previous atomistic simulations in our group [1,2] gave new physical insight into the single-asperity case, we found in further preliminary simulations that closely spaced asperities are not worn off individually, but can act together to form single debris particles. The implication for modeling wear is that, depending on their distance and geometry, these asperities should potentially not be treated individually at all. A complete picture of how this interaction occurs in different materials could bring us one step closer to a “first principles” prediction of wear from just a surface geometry, a description of the involved materials, and the loading conditions. Realizing this, though, depends on more realistic simulations, ideally across a range of material models. [1] Aghababaei, Warner, and Molinari, Nat. Commun. 7, 11816 (2016). [2] Aghababaei, Warner, and Molinari, PNAS 114, 7935 (2017).


Scaling test of SHEFFlow for DNS simulation of flow control

Project Name: Scaling test of SHEFFlow for DNS simulation of flow control
Project leader: Professor Ning Qin
Research field: Engineering
Resource awarded: 100 000 core hours on MARCONI-KNL
Description

This research is part of the EU H2020 project DRAGY, which aims to tackle the problem of turbulent drag reduction through the investigation of active/passive flow-control techniques. This study involves direct numerical simulation (DNS) of zero-mass jets to achieve drag reduction: the Navier-Stokes equations are solved directly in order to obtain the detailed turbulent flow structures accurately and to better understand the flow physics involved. This technique requires much finer meshes than those used in industry, where the Reynolds-averaged Navier-Stokes equations are solved, making the simulations very CPU-demanding due to the very high turbulent flow resolution required. Currently, our in-house research code, SHEFFlow, has been tested employing up to 512 CPUs using OpenMPI. The parallel efficiency on our limited university resources has proven to be good, but these resources severely limit our capability for higher resolution. The objective of this project is to develop and test the scalability of SHEFFlow so that, in the next PRACE call, larger meshes with finer resolution can be run, capturing the turbulent flow physics and its control much more accurately.
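
Scalability campaigns of this kind are typically reduced to strong-scaling speed-up and parallel-efficiency figures relative to a baseline run. A small helper of the sort one might use for the timing data (illustrative only, not part of SHEFFlow; the timings below are hypothetical):

    def strong_scaling(cores, times, ref=0):
        # speed-up and parallel efficiency relative to the run at index `ref`
        p0, t0 = cores[ref], times[ref]
        speedup = [t0 / t for t in times]
        efficiency = [t0 * p0 / (t * p) for p, t in zip(cores, times)]
        return speedup, efficiency

    cores = [512, 1024, 2048, 4096]        # hypothetical test points
    times = [100.0, 52.0, 28.0, 16.5]      # hypothetical wall-clock times (s)
    for p, s, e in zip(cores, *strong_scaling(cores, times)):
        print(f"{p:5d} cores: speed-up {s:5.2f}, efficiency {e:6.1%}")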


Dark matter core formation: dark matter self-interactions versus Supernovae feedback

Project Name: Dark matter core formation: dark matter self-interactions versus Supernovae feedback
Project leader: Prof Jesús Zavala Franco
Research field: Universe Sciences
Resource awarded: 50 000 core hours on Curie
Description

The goal of the project is to perform scalability tests for N-body simulations of dark matter haloes with a population of star particles as tracers. In addition to gravity, the simulations include one of two types of additional physical processes inducing the formation of a central dark matter core: (i) self-interactions between the dark matter particles, or (ii) an effective rapid removal of mass from the halo centre (mimicking a supernova-driven galactic wind). These two mechanisms have been proposed to reduce the inner dark matter densities at ~kpc scales, as seemingly required by observations. Our purpose is to study the response of the population of star tracers to both mechanisms, which are different in nature (adiabatic vs. impulsive). This is part of a PhD thesis project (Jan D. Burger) supervised by the project leader. Our team has worked extensively on self-interacting dark matter algorithms, and we have been testing the implementation of the supernova-driven core formation scenario in small-scale simulations (1 million and 10 million particles). We are requesting computing resources to perform scalability tests of our simulations to reach 100 million particles, in preparation for larger simulations that will explore the orbital space of a large sample of tracer stars to identify differences in the orbital evolution due to the two different mechanisms of core formation.
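
For context, a common Monte Carlo treatment of dark matter self-interactions in N-body codes (standard in the SIDM literature; not necessarily this team's implementation) scatters nearby particle pairs with a probability built from the cross-section per unit mass, the relative velocity, the neighbour's mass and a kernel weight, and redistributes the relative velocity isotropically for elastic collisions:

    import numpy as np

    rng = np.random.default_rng(1)

    def sidm_scatter(v_i, v_j, sigma_per_m, m_j, w_ij, dt):
        # pair scatters with probability P = (sigma/m) * m_j * |v_rel| * W_ij * dt
        v_rel = v_i - v_j
        p = sigma_per_m * m_j * np.linalg.norm(v_rel) * w_ij * dt
        if rng.random() < p:
            u = rng.standard_normal(3)
            u /= np.linalg.norm(u)          # isotropic post-scatter direction
            v_cm = 0.5 * (v_i + v_j)        # equal-mass particles assumed
            dv = 0.5 * np.linalg.norm(v_rel) * u
            return v_cm + dv, v_cm - dv, True
        return v_i, v_j, False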


Scalability evaluation of Finite Volume codes for flow simulations in the transportation sector

Project Name: Scalability evaluation of Finite Volume codes for flow simulations in the transportation sector
Project leader: Dr Simone Sebben
Research field: Engineering
Resource awarded: 50 000 core hours on MareNostrum
Description

The aim of this project is to assess the scalability of two different finite volume codes used for external and internal aerodynamics and aeroacoustics computations within the transportation sector. Both codes employ high-fidelity turbulence modelling, such as Detached Eddy Simulation (DES), and advanced numerical schemes, such as Time Spectral methods, in order to investigate the aerodynamic and acoustic performance of road transportation vehicles and aircraft propulsion devices for fully detailed geometries at real-life Reynolds numbers. Furthermore, the project should allow the team to gather the information needed for a future PRACE Project Access application.


Role of Glycans in Epidermal Growth Factor Receptor Regulation by Glycolipids

Project Name: Role of Glycans in Epidermal Growth Factor Receptor Regulation by Glycolipids
Project leader: Prof Ilpo Vattulainen
Research field: Biochemistry, Bioinformatics and Life sciences
Resource awarded: 50 000 core hours on Curie, 100 000 core hours on Hazel Hen, 100 000 core hours on Piz Daint
Description

Over the past decade, protein glycosylation has attracted ever-growing attention in a variety of research areas, from biomedicine to biotechnology. Due to the strong causative link between protein glycosylation patterns and the development of severe disorders, the study of protein glycans is of utmost importance. The epidermal growth factor receptor (EGFR) is a glycosylated transmembrane receptor that plays a pivotal role in a vast majority of cellular processes and is involved in the development of several forms of cancer. Recently, EGFR glycans have been sequenced, and we have started to understand the involvement of protein glycans in EGFR functioning and cancer development. In particular, it was recently demonstrated that EGFR activity is mediated by association with the membrane glycolipid GM3 through carbohydrate-to-carbohydrate interactions. However, the exact molecular mechanism of GM3 interaction with differentially glycosylated EGFR remains unexplored. Here we are going to elucidate the mechanisms behind the regulation of EGFR activity by the GM3 glycolipid using extensive all-atom molecular dynamics (MD) simulations with the GROMACS package. Considering the size of the fully glycosylated EGFR dimer, which requires a large membrane for its accommodation, the systems need to go through proper scaling tests to maximize the efficiency of the calculations. This project is aimed at benchmarking and fine-tuning GROMACS performance for the above-described solvated EGFR systems.
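
A GROMACS scaling study of this kind is often scripted as a loop over rank counts that reruns a short, fixed-length mdrun and parses the ns/day figure from the log. A sketch under stated assumptions (the .tpr file name and the core counts are hypothetical; the mdrun options used are standard GROMACS flags):

    import re
    import subprocess

    TPR = "egfr_gm3.tpr"   # assumed pre-built run input for the solvated system

    def bench(n_ranks, n_threads):
        run = f"bench_{n_ranks}x{n_threads}"
        subprocess.run(
            ["mpirun", "-np", str(n_ranks), "gmx_mpi", "mdrun",
             "-s", TPR, "-deffnm", run,
             "-ntomp", str(n_threads),   # OpenMP threads per MPI rank
             "-nsteps", "20000",         # short, fixed-length benchmark
             "-resethway",               # discard start-up cost from the timing
             "-noconfout"],              # skip writing the final configuration
            check=True)
        log = open(run + ".log").read()
        return float(re.search(r"Performance:\s+(\S+)", log).group(1))  # ns/day

    for ranks in (48, 96, 192, 384):
        print(ranks, "MPI ranks:", bench(ranks, 2), "ns/day")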


DENtICLE – DirEct NumerIcal simulation of Channel flow with ribLEts

Project Name: DENtICLE – DirEct NumerIcal simulation of Channel flow with ribLEts
Project leader: Dr Davide Modesti
Research field: Engineering
Resource awarded: 100 000 core hours on MARCONI-Broadwell
Description

The aim of the present project is to perform profiling and scalability tests of CTI Cliff, an unstructured finite volume solver for the incompressible Navier-Stokes equations via Direct Numerical Simulation (DNS), in preparation for the upcoming PRACE calls. The solver will be used to carry out DNS of channel flow with riblets, small streamwise-aligned grooves inspired by shark skin that are known to reduce drag.


ETHOS: From the linear to the non-linear power spectrum in models with a primordial cutoff in the power spectrum

Project Name: ETHOS: From the linear to the non-linear power spectrum in models with a primordial cutoff in the power spectrum
Project leader: Prof Jesús Zavala Franco
Research field: Universe Sciences
Resource awarded: 50 000 core hours on MareNostrum
Description

The goal of the project is to perform scalability tests for N-body simulations of cosmological structure formation with a primordial cutoff in the power spectrum. This cutoff can be produced either by the free streaming of dark matter particles (known as Warm Dark Matter, WDM) or by interactions between dark matter particles and relativistic particles in the early Universe (known as interacting Dark Matter, iDM). Dark matter models with such a cutoff offer attractive solutions to the outstanding challenges faced by the standard Cold Dark Matter (CDM) model. Our scientific goal is to establish a mapping between the parameters of the particle physics models connected to WDM and iDM and the parameters that characterize the non-linear evolution of the power spectrum down to the scales relevant for galaxy formation. To accomplish this goal, we need to perform N-body cosmological simulations of the evolution of dark matter structures from initial conditions computed in the linear regime into the highly non-linear regime. This requires a suite of simulations of two different types: (i) large-scale uniform boxes, to have a fair representation of the density field in a large cosmic volume and obtain the correct power spectrum at large scales; (ii) zoom simulations of smaller volumes, to achieve higher spatial resolution and sample the power spectrum down to the scales relevant for galaxy formation. This is part of a PhD thesis project (Sebastian Bohr) supervised by the project leader. Our team and collaborators have worked extensively with N-body simulations within the standard CDM model and WDM (the project leader has over a decade of experience in the field), and we have more recently had a leading role in iDM. For this particular project, we have performed low-resolution simulations aimed at testing the matching between the uniform boxes (which provide the power spectrum at large scales) and the zoom boxes (which provide the power spectrum at smaller scales). We have performed several of these simulations in boxes ranging from 5 Mpc to 25 Mpc, with particle numbers going from 128^3 to 512^3. At this stage of the project, we are requesting computing resources to perform scalability tests of our simulations to reach at least 1024^3 particles in uniform and zoom boxes. We need at least one of each, which, given our estimates, would take 100,000 CPU hours each, running on 264 cores (3 GB of memory per core). We anticipate using at least 2 TB of storage.
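
For reference, a primordial cutoff of the WDM type is commonly imposed on a CDM linear spectrum through the thermal-relic fitting function of Viel et al. (2005), T(k) = [1 + (alpha*k)^(2*nu)]^(-5/nu) with nu ≈ 1.12, so that P_WDM(k) = T(k)^2 * P_CDM(k). A sketch (the placeholder spectrum and the value of alpha are illustrative only; iDM cutoffs generally require model-specific transfer functions):

    import numpy as np

    def wdm_power(k, p_cdm, alpha, nu=1.12):
        # Viel et al. (2005) thermal-relic fit; alpha sets the cutoff scale
        t = (1.0 + (alpha * k) ** (2 * nu)) ** (-5.0 / nu)
        return t ** 2 * p_cdm

    k = np.logspace(-2, 2, 200)                 # wavenumbers in h/Mpc
    p_cdm = k ** -1.5                           # placeholder shape, not a real P(k)
    p_wdm = wdm_power(k, p_cdm, alpha=0.05)     # alpha chosen only for illustration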


Scalability for the EllipSys3D solver for wind energy purpose

Project Name: Scalability for the EllipSys3D solver for wind energy purpose
Project leader: Prof Niels Nørmark Sørensen
Research field: Engineering
Resource awarded: 50 000 core hours on MareNostrum
Description

The project aims at demonstrating the parallel scaling of the EllipSys3D code for a wide range of wind energy applications, covering Large Eddy Simulations (LES) of Atmospheric Boundary Layer flows over complex terrain and DES and RANS simulations of resolved rotor flows, with the aim of paving the way for the large-scale simulations needed in the future. Activities within national and international projects push the size of the required simulations beyond what is realistic on in-house resources. Specifically, the need for LES of off-shore wind parks and wind parks in complex terrain will require large domains with high spatial resolution and long simulation times for improved statistics. Secondly, aero-elastic simulations with CFD-based aerodynamics and resolved inflow turbulence also require high spatial and temporal resolution and extended time periods to obtain sufficient statistics. The EllipSys3D code is an in-house general-purpose flow solver developed over the last 25+ years at the Technical University of Denmark and the former Risø National Laboratory. The code is based on the pressure correction approach and has been parallelized using MPI for distributed-memory machines.


WindFoam

Project Name: WindFoam
Project leader: Dr Nikolaos Lampropoulos
Research field: Engineering
Resource awarded: 50 000 core hours on SuperMUC
Description

The project WindFoam aims at assessing the accuracy and scalability of the open-source CFD tool OpenFOAM® v4.1 applied to flow simulations around large wind turbine rotors. The motivation lies in the physical complexity of such flows, as well as in the use of computational meshes which are quite demanding for this tool. In a nutshell, the topics to be addressed in this project are the following: 1) Flows around large-scale wind turbines can become three-dimensional, exhibiting dynamic stall and compressibility phenomena. Since this software was initially developed for low-speed applications, its performance will be investigated. 2) The computational meshes around such large structures are composed of skewed, high-aspect-ratio cells, with aspect ratios frequently exceeding 20,000. Since OpenFOAM was developed for low-speed flows on Cartesian meshes, we will also investigate the numerical stability of the algorithm, which is expected to be compromised. 3) The use of two zones in the computational mesh (a rotating one comprising the wind turbine rotor and a stationary one extending to the far-field boundaries) is also to be proven seamless in parallel processing. 4) The scalability of the code up to 1000 cores is also to be estimated. Due to the size of the meshes (dozens of millions of cells) and the small time steps during time marching, parallel processing is mandatory for such flows. The efficiency of the parallelization, in terms of accuracy and wall-clock execution time, is also to be verified. By the end of this project we aim to provide guidelines on the following issues: 1) Set-up of the OpenFOAM input files. The method to be used (the application pimpleDyMFoam) simulates the unsteady flow with two zones in the mesh (rotating & stationary). 2) Characteristics of computational meshes that run smoothly with OpenFOAM. The degrees of skewness and volume anisotropy can be quantified from the geometry of the cells (max included angle, aspect & area ratio, etc.; see the sketch after this description). Different meshing strategies can be adopted by making use of the commercial tool PointWise®. Moreover, the accuracy and the parallelization efficiency of the software will be evaluated. The benchmarking of the results will be done using the database of the AVATAR project (www.eera-avatar.eu/) for up-scaled wind turbine rotors.
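
As a pointer for item 2 of the guidelines above, cell-quality metrics such as the maximum included angle and the edge aspect ratio follow directly from the cell geometry; OpenFOAM's own checkMesh utility reports analogous quantities. A toy computation for a single quadrilateral face (illustrative only):

    import numpy as np

    def quad_quality(pts):
        # max included angle (degrees) and edge aspect ratio of a quad face,
        # given its 4 corner points in order
        pts = np.asarray(pts, dtype=float)
        angles, edges = [], []
        for i in range(4):
            u = pts[i - 1] - pts[i]
            v = pts[(i + 1) % 4] - pts[i]
            c = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
            angles.append(np.degrees(np.arccos(np.clip(c, -1.0, 1.0))))
            edges.append(np.linalg.norm(v))
        return max(angles), max(edges) / min(edges)

    # a stretched boundary-layer quad: edge aspect ratio 1000
    print(quad_quality([(0, 0, 0), (1, 0, 0), (1, 0.001, 0), (0, 0.001, 0)]))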


Scalability Tests of a Higher Order Finite Element Parallel Code for High Frequency Electromagnetics

Project Name: Scalability Tests of a Higher Order Finite Element Parallel Code for High Frequency Electromagnetics
Project leader: Prof Luis E. Garcia-Castillo
Research field: Engineering
Resource awarded: 50 000 core hours on MareNostrum
Description

An in-house higher-order finite element code for high-frequency electromagnetic problems (named HOFEM – higher order finite element method) is to be tested at large scale. The code is general purpose, suitable for modelling the whole range of electromagnetic wave propagation phenomena, e.g., those present in the analysis of waveguides, microwave passive devices, antennas, Radar Cross Section prediction and so on. It makes use of higher-order curl-conforming finite elements and, among other features, a special technique for mesh truncation in open-domain problems (scattering and radiation phenomena). The code is written using modern Fortran constructs (mainly Fortran 2003) and uses hybrid MPI+OpenMP programming. So far, HOFEM has mainly been tested on small and mid-size clusters of up to a few hundred cores. Therefore, the main goal of the present project is to test the code on bigger problems using a larger number of cores, in order to assess its scalability and detect possible bottlenecks, whose fixes could then be conveniently retrofitted into the existing code and improve future developments.


Nanoparticles for Catalysis Computational Modelling – NCCCM

Project Name: Nanoparticles for Catalysis Computational Modelling – NCCCM
Project leader: Prof Joseph Kioseoglou
Research field: Chemical Sciences and Materials
Resource awarded: 50 000 core hours on MareNostrum
Description

Size-selected nanoclusters exhibit remarkable physical and chemical properties, which diverge significantly from the properties of bulk materials. These properties make them ideal for applications such as novel electronic and photonic materials, sensors, or binding sites for biomolecules. Recent advances in cluster sources, which increased the nanocluster yield by orders of magnitude, have enabled the use of noble-metal nanoclusters for a new generation of precision catalysts. The properties of nanoclusters depend greatly on their size, shape and binding to the support; therefore, modelling approaches at the atomistic scale are required to properly describe their behaviour. Ab initio methods are quantum mechanical methods that can provide full information on the properties of a system, but they have seen limited application in studies of nanocluster systems, mainly owing to the large size of the models and the consequent computational cost. This project aims to make use of the most novel ab initio approaches, offering high accuracy and full exploitation of modern massively parallel high-performance computers, to attack open questions in nanocluster science, such as the relation of the fundamental properties of elemental and alloy nanoclusters to their size, shape and composition; the interactions between nanocluster and support and their effects on the system’s behaviour; and finally the catalytic activity of nanocluster-based catalysts for reactions relevant to fine chemicals and clean energy. The project is expected to have an impact on both fundamental nanocluster science and the large-scale industrial manufacturing of nanocluster catalysts.


Type B: Code development and optimization by the applicant (without PRACE support) (2)

Scaling up and load-balancing of a cellular flow with fully resolved particles

Project Name: Scaling up and load-balancing of a cellular flow with fully resolved particles
Project leader: Prof Alfons Hoekstra
Research field: Biochemistry, Bioinformatics and Life sciences
Resource awarded: 100 000 core hours on MareNostrum
Description

Blood is a complex suspension of various components suspended in plasma. Erythrocytes are the major component and determine blood rheology. Platelets form the link between transport dynamics and several vital biochemical processes. Their collective behaviour can provide an explanation for the most fundamental transport phenomena in blood, such as the non-Newtonian viscosity, the margination of platelets, the Fåhræus effect, the appearance of a cell-free layer, and the scaling of the shear-induced diffusion of RBCs. We developed HemoCell (www.hemocell.eu), a high-performance computational framework with validated cell-material models. HemoCell works very well for micro-circulatory flows (vessels of up to 200-300 micrometres in diameter). The aim of the current proposal is to scale the framework up to the millimetre scale, to bridge the gap with macroscopic simulations.
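
Because fully resolved cells cluster unevenly across subdomains as flow structures develop, load balance is usually tracked with a simple imbalance metric (maximum over mean work per rank), which a load-balancing step then tries to drive towards zero. A toy illustration (not HemoCell code; the counts are hypothetical):

    import numpy as np

    def load_imbalance(work_per_rank):
        # fractional imbalance: max/mean - 1 (0 means perfectly balanced)
        w = np.asarray(work_per_rank, dtype=float)
        return w.max() / w.mean() - 1.0

    cells_per_subdomain = [812, 794, 1210, 640]   # resolved cells owned per rank
    print(f"imbalance: {load_imbalance(cells_per_subdomain):.1%}")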


Development and optimization of High Performance Computing code for Robust Topology Optimization

Project Name: Development and optimization of High Performance Computing code for Robust Topology Optimization
Project leader: Prof David Herrero-Pérez
Research field: Engineering
Resource awarded: 200 000 core hours on Curie
Description

The current project aims at developing the tools required for the topology optimization of large-scale structures under uncertainty. Topology optimization provides engineering designers with a powerful tool to find innovative, high-performance conceptual designs at the early stages of the design process, and the technique has been successfully applied to improve the design of complex industrial problems. However, deterministic conditions are usually assumed, ignoring the different sources of uncertainty that may notably affect the performance of the optimal design under real-world engineering conditions. Uncertainty is therefore incorporated into the topology optimization formulation to obtain optimal designs under realistic conditions. Topology optimization of large-scale continuum structures under uncertainty is a computational challenge in itself, due to the use of large Finite Element (FE) models and uncertainty propagation methods: the former address the ever-increasing complexity of more and more realistic models, whereas the latter are required to estimate the statistical metrics of the formulation. The computational burden of the problem requires the use of parallel computing (and the use of a supercomputer) in order to address large-scale topology optimization problems under real-world engineering conditions. Nevertheless, the computational requirements can be reduced by adopting efficient techniques, such as parallel Adaptive Mesh Refinement (AMR) methods. Such techniques are rewarding for topology optimization problems because refinement is only needed close to the boundary of the structural design. This permits a coarse discretization inside (fully solid) and outside (fully void) the domain and a fine discretization close to the boundary of the design, which meaningfully reduces the computational cost of the analysis using FE models. The project also adopts sparse grid stochastic collocation methods to calculate the statistical metrics of the problem; these are embarrassingly parallel techniques, which facilitates the proper exploitation of large distributed systems and ensures the scalability of the calculation of the statistical metrics. The code developed for the project incorporates the ingredients mentioned above: sparse grid stochastic collocation methods to calculate the statistical metrics of the problem, an Adaptive Mesh Refinement (AMR) approach for the calculations at the stochastic nodes, and the partition of the domain using domain decomposition to make use of parallel computation on distributed-memory systems. The current stage of the project requires testing the scalability i) at the domain decomposition level and ii) at the stochastic collocation point level. This will permit the detection of bugs and bottlenecks and the improvement of the current implementation. This work will result in two main scientific outcomes:
+ The topology optimization of large-scale structures under real-world (uncertain) engineering conditions.
+ The testing and improvement of new parallel algorithms for AMR on unstructured meshes and for uncertainty quantification methods when used on HPC platforms with thousands of cores.
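
The embarrassingly parallel structure of stochastic collocation mentioned above amounts to distributing independent deterministic FE solves over the collocation nodes and recombining the results as a weighted quadrature sum. A minimal sketch with mpi4py (the one-dimensional Gauss-Hermite rule and the stub FE solve are assumptions standing in for the project's sparse grid and AMR solver):

    import numpy as np
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()

    nodes, weights = np.polynomial.hermite_e.hermegauss(33)
    weights = weights / weights.sum()    # normalise to a probability measure

    def fe_compliance(xi):
        # stub for one deterministic FE analysis at stochastic point xi
        return 1.0 + 0.1 * xi + 0.05 * xi ** 2

    # each rank evaluates its share of the nodes independently
    local = sum(w * fe_compliance(x)
                for x, w in zip(nodes[rank::size], weights[rank::size]))
    mean = comm.allreduce(local, op=MPI.SUM)   # quadrature = weighted sum
    if rank == 0:
        print("estimated mean compliance:", mean)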


 

Type C: Code development with support from experts from PRACE (0)

 

Type D: Optimisation work on a PRACE Tier-1 (1)

Improving Scalability of Sparse Tensor Decomposition via Parallel Graph Partitioning

Project Name: Improving Scalability of Sparse Tensor Decomposition via Parallel Graph Partitioning
Project leader: Prof Cevdet Aykanat
Research field: Mathematics and Computer Sciences
Resource awarded: Tier-1 – pending
Description

CANDECOMP/PARAFAC Decomposition (CPD) is the most commonly used method for analyzing multiway data represented by tensors. The applications in which CP decomposition arises include chemometrics [1], signal processing [2], neuroscience [3, 4, 5] and cyber traffic analysis [6]. CPD is usually computed with an iterative algorithm called CPD-ALS, which is based on the alternating least squares method. CPD-ALS exhibits large memory and runtime overheads, hence an efficient parallelization scheme for distributed-memory systems is essential. Distributed-memory SPLATT [7] is a successful parallel CPD-ALS tool for sparse tensors. It deploys a Cartesian block partitioning of the input tensor and is reported to achieve better scalability than earlier alternatives such as HyperTensor [8]. The success of distributed-memory SPLATT is attributed to the fact that Cartesian tensor partitioning provides upper bounds on the total communication volume and the total message count. However, the partitioning in this tool does not take advantage of the sparsity pattern of the input tensor, which could be utilized to actually minimize the total communication volume. Another limitation is that the tool does not accept an externally computed Cartesian partition as input. In this project, we plan to adapt and optimize distributed-memory SPLATT to be used with a parallel graph partitioner, using ParMETIS [9] as the partitioner. We plan to investigate models/methods for the minimization of the total communication volume within the parallel CPD-ALS code, and also to consider other performance metrics such as the maximum communication volume and the maximum computational load of the processors.

[1] K. R. Murphy, C. A. Stedmon, D. Graeber, and R. Bro, Fluorescence spectroscopy and multi-way techniques. PARAFAC, Analytical Methods, 5 (2013), p. 6557.
[2] N. D. Sidiropoulos, L. De Lathauwer, X. Fu, K. Huang, E. E. Papalexakis, and C. Faloutsos, Tensor decomposition for signal processing and machine learning, 2016, arXiv:1607.01668.
[3] E. Acar, C. A. Bingol, H. Bingol, R. Bro, and B. Yener, Multiway analysis of epilepsy tensors, Bioinformatics, 23 (2007), pp. i10–i18.
[4] F. Cong, Q.-H. Lin, L.-D. Kuang, X.-F. Gong, P. Astikainen, and T. Ristaniemi, Tensor decomposition of EEG signals: A brief review, Journal of Neuroscience Methods, 248 (2015), pp. 59–69.
[5] I. Davidson, S. Gilpin, O. Carmichael, and P. Walker, Network discovery via constrained tensor analysis of fMRI data, in KDD’13: Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, 2013, pp. 194–202.
[6] K. Maruhashi, F. Guo, and C. Faloutsos, MultiAspectForensics: Pattern mining on large-scale heterogeneous networks with tensor analysis, in 2011 International Conference on Advances in Social Networks Analysis and Mining (ASONAM), IEEE, 2011, pp. 203–210.
[7] S. Smith and G. Karypis, A medium-grained algorithm for sparse tensor factorization, in Parallel and Distributed Processing Symposium, IEEE, 2016, pp. 902–911.
[8] O. Kaya and B. Uçar, Scalable sparse tensor decompositions in distributed memory systems, in Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, ACM, 2015, p. 77.
[9] G. Karypis, ParMETIS – Parallel Graph Partitioning and Fill-reducing Matrix Ordering, http://glaros.dtc.umn.edu/gkhome/metis/parmetis/overview.
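
To make the CPD-ALS iteration concrete, a minimal dense three-way version is sketched below; each factor update is a matricized-tensor-times-Khatri-Rao product (MTTKRP) followed by a small linear solve. Distributed tools such as SPLATT keep the tensor sparse and parallelize exactly these MTTKRP steps, which is where the partitioning studied in this project matters. The code is illustrative only, not the project's:

    import numpy as np

    def khatri_rao(a, b):
        # column-wise Khatri-Rao product of two factor matrices
        return (a[:, None, :] * b[None, :, :]).reshape(-1, a.shape[1])

    def cp_als(X, r, n_iter=50):
        # dense 3-way CPD-ALS: X is approximated by rank-r factors A, B, C
        I, J, K = X.shape
        rng = np.random.default_rng(0)
        A, B, C = (rng.standard_normal((n, r)) for n in (I, J, K))
        X0 = X.reshape(I, -1)                      # mode-1 unfolding
        X1 = np.moveaxis(X, 1, 0).reshape(J, -1)   # mode-2 unfolding
        X2 = np.moveaxis(X, 2, 0).reshape(K, -1)   # mode-3 unfolding
        for _ in range(n_iter):
            A = X0 @ khatri_rao(B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
            B = X1 @ khatri_rao(A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
            C = X2 @ khatri_rao(A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
        return A, B, C

    A, B, C = cp_als(np.random.rand(30, 40, 20), r=5)   # toy usage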
