PRACEdays16 Posters

PRACEdays16 Best Poster Award:
Computational design of hydrogen storage materials

  • PhD, 2004
    Physical Chemistry Institute, University of Kiel (Germany)
  • MSc, 2001
    Chemistry Department, Istanbul Technical University
  • BSc, 1999
    Chemistry Department, Istanbul Technical University
  • 2005-2007
    Postdoc, Theoretical Organic Chemistry Dept., University of Duisburg-Essen
  • 2008-2010
    Postdoc, Physics Department, Technical University of Denmark
  • 2010-2013
    Assistant Professor, Informatics Institute, Istanbul Technical University
  • 2013-
    Associate Professor, Informatics Institute, Istanbul Technical University

Research Interests

  • Computational Design of Energy Materials
  • Hydrogen Storage
  • Intermolecular Interactions
  • Computational Modelling
  • Global Optimization
  • Force-Field Development

Hydrogen is one of the most promising alternatives for the replacement of fossil fuels. One of the major bottlenecks preventing its widespread commercialization for on-board applications is finding a suitable storage medium. Metal borohydrides are one of the classes of solid materials studied intensively for hydrogen storage due to their high theoretical hydrogen capacities. However, their high thermodynamic stability is one of the major problems limiting their usage. The required high decomposition temperature can be lowered by the inclusion of ammonia; the resulting complexes, containing both borohydride and ammine groups, are called Ammine Metal Borohydrides (AMBs). However, some AMBs release ammonia uncontrollably during dehydrogenation. This can be solved by the inclusion of a second metal atom, leading to dual-cation AMBs with the general formula M1M2(BH4)x(NH3)y, x=3-5 and y=2-6. Until now, only a few synthesized dual-cation AMBs have been reported in the literature. We therefore conducted a computational screening study to find new AMBs with the desired properties. M1 was selected as an alkali metal (Li, Na or K) and M2 was assumed to be one of the following species: Mg, Ca, Ni, Mn, Sr, Zn, Al, Y, Sc, Ti, Zr and Co. Since very limited information is available on the crystal structures of AMBs, template structures were first located using a crystal structure prediction algorithm called CASPESA. These structures were subsequently relaxed at the DFT level, and the AMBs were evaluated with the help of alloying and decomposition reactions. The results obtained so far indicate that AMBs already reported in the literature, such as LiMg(BH4)3(NH3)2 and NaZn(BH4)3(NH3)2, fall in the desired region. Moreover, many new AMBs also look quite promising.
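
The composition space described above is small enough to enumerate directly. A minimal sketch (hypothetical illustration, not part of the CASPESA workflow) of the candidate list:

```python
from itertools import product

# Hypothetical sketch of the screening's composition space; the real study
# couples this enumeration to CASPESA structure prediction and DFT relaxation.
M1 = ["Li", "Na", "K"]
M2 = ["Mg", "Ca", "Ni", "Mn", "Sr", "Zn", "Al", "Y", "Sc", "Ti", "Zr", "Co"]

def candidate_ambs():
    """Enumerate dual-cation AMB compositions M1M2(BH4)x(NH3)y."""
    for m1, m2, x, y in product(M1, M2, range(3, 6), range(2, 7)):
        yield f"{m1}{m2}(BH4){x}(NH3){y}"

compositions = list(candidate_ambs())
print(len(compositions))  # 3 x 12 x 3 x 5 = 540 candidate compositions
```

Even this modest composition space multiplies quickly once each entry requires structure prediction and DFT relaxation, which is what motivates the screening approach.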

This work is a good example of how supercomputers can be utilized to design new materials. In this case the target is an energy material, but the scope of the design approach can easily be broadened, e.g. to batteries or gas-sensing materials. Such computational efforts offer a fast and economical strategy, far less expensive in time than experiment, in both the chemical and physical sciences.

The current computational study includes thousands of crystal structure optimizations. The algorithm used for this purpose, CASPESA, is able to run in parallel, which significantly lowers the required computational time. Quantum mechanical computation is one of the areas where nothing can be done without supercomputers; these highly time-consuming calculations were carried out with the Quantum ESPRESSO suite. In such a screening study the number of computations is unfortunately enormous, and when this is coupled with the complexity of the quantum mechanical calculations, it is clear that such work requires highly efficient supercomputing power.
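
Because the thousands of structure optimizations are independent of one another, they parallelize trivially. A schematic sketch of such a dispatch, with a stand-in relax() function in place of the real CASPESA/Quantum ESPRESSO runs (everything here is illustrative, not the actual workflow):

```python
from concurrent.futures import ThreadPoolExecutor

def relax(structure):
    """Stand-in for one structure optimization; each real call would launch
    a Quantum ESPRESSO run on a template proposed by CASPESA (hypothetical)."""
    return {"structure": structure, "energy": -float(len(structure))}

# Placeholder template names; the actual screening handles thousands.
structures = [f"template-{i}" for i in range(16)]

# The optimizations are mutually independent, hence embarrassingly parallel.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(relax, structures))

best = min(results, key=lambda r: r["energy"])
print(best["structure"], best["energy"])
```

On a cluster the worker pool would be replaced by a batch scheduler or MPI ranks, but the dispatch pattern is the same.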

Large-scale Ultrasound Simulations with Local Fourier Basis Decomposition

Jiri Jaros is currently a Marie Curie Fellow and an assistant professor at the Faculty of Information Technology, Brno University of Technology. He received his MSc and PhD in Computer Science from the Brno University of Technology in 2003 and 2010 respectively. He worked for two years as a postdoctoral researcher at the Australian National University in the Computer Systems group under the supervision of Prof Alistair Rendell, and 6 months as a postdoctoral researcher in the Centre for Computational Science, University College London under Prof Peter Coveney. His research interests include high performance computing, scientific computation, parallel programming, many-core accelerator and GPU architecture and programming. He is an active developer of the k-Wave project responsible for large-scale code development, validation and optimisation. He has published more than 50 scientific papers, and is a co-author of an open-source acoustics toolbox for MATLAB called k-Wave.

The simulation of ultrasound wave propagation through biological tissue has a wide range of practical applications including planning therapeutic ultrasound treatments of various brain disorders such as brain tumours, essential tremor, and Parkinson’s disease. The major challenge is to ensure the ultrasound focus is accurately placed at the desired target within the brain because the skull can significantly distort it. Performing accurate ultrasound simulations, however, requires the simulation code to be able to exploit thousands of processor cores and work with TBs of data while delivering the output within 24 hours.

We have recently developed an efficient full-wave ultrasound model (the parallel k-Wave toolbox) that can solve realistic problems within a week using a pseudospectral model and global slab domain decomposition (GDD). Unfortunately, GDD limits scaling to the number of 2D slabs, which is usually below 2048. Moreover, since the method relies on the fast 3D Fourier transform, the all-to-all communications concealed in matrix transpositions significantly deteriorate performance.

This work presents a novel decomposition method called Local Fourier basis decomposition (LDD). This approach eliminates the need for all-to-all communications by replacing them with local nearest-neighbour communication patterns. The LDD partitions the 3D domain into a grid of subdomains coated with a halo region of a defined thickness. The gradients (FFTs) are calculated on local data only. To ensure the ultrasound wave can propagate over subdomain interfaces, the halo regions are periodically exchanged with the direct neighbours. To make the propagation over the interfaces smooth, a custom bell function was developed. The reduced communication overhead leads to better scaling despite the fact that more work (calculation of the halo region) must be done.
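
The exchange pattern can be illustrated in one dimension. A toy numpy sketch (our own illustration, not k-Wave code) in which each subdomain refreshes its halo from its periodic neighbours' interiors:

```python
import numpy as np

def exchange_halos(subdomains, h):
    """Refresh each subdomain's halo from its periodic nearest neighbours.

    Each subdomain stores [left halo | interior | right halo]; only the
    interior is owned locally. (1D sketch of the 3D scheme; in the real
    code this is an MPI neighbour exchange, not a shared list.)
    """
    n = len(subdomains)
    for i, sub in enumerate(subdomains):
        left = subdomains[(i - 1) % n]
        right = subdomains[(i + 1) % n]
        sub[:h] = left[-2 * h:-h]   # left neighbour's rightmost interior cells
        sub[-h:] = right[h:2 * h]   # right neighbour's leftmost interior cells

# Toy domain of 16 cells split into 4 subdomains with a halo of thickness 2.
h, chunk = 2, 4
field = np.arange(16, dtype=float)
subs = [np.empty(chunk + 2 * h) for _ in range(4)]
for i, sub in enumerate(subs):
    sub[h:-h] = field[i * chunk:(i + 1) * chunk]  # fill interiors

exchange_halos(subs, h)
print(subs[0])  # halo now mirrors the periodic neighbours' edge cells
```

After the exchange, each subdomain holds enough neighbouring data to compute its local FFT-based gradients; only these thin halo strips travel over the network, instead of the full transposed slabs of GDD.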

When developing the LDD, we addressed the communication overhead by (1) reducing the number of communication phases per time-step from 14 down to 7, (2) replacing global all-to-all communications with local nearest-neighbour patterns, and (3) overlapping the halo exchange with computation, which is impossible under the global domain decomposition. The main drawback of LDD is the reduced accuracy arising from the fact that the gradients are not calculated over the whole domain. The level of numerical error can be controlled by the thickness of the halo region. By experimental validation and numerical optimization, a thickness of 16 grid points was determined to be sufficient.

The performance and scaling were investigated on realistic ultrasound simulations with various spatial resolutions between 512³ and 2048³ grid points. We used SuperMUC’s thin nodes and scaled the calculation from 8 cores (one socket) to 8192 cores (512 nodes). While GDD’s scaling is quite limited and shows performance fluctuations, the LDD scales nicely up to 8192 cores for all domain sizes. Moreover, the scaling curves for LDD are smoother and steeper, yielding much higher efficiency. Three types of LDDs (a pure-MPI version, and hybrid OpenMP/MPI versions with a single process per socket and per node) were implemented. The hybrid versions further reduce the communication overhead and the relative size of the halo region. This leads to superior performance of both hybrid versions, which can outperform the pure-MPI and GDD versions on the same number of cores by factors of 1.5 and 4, respectively.

In conclusion, the local Fourier basis allowed us to employ up to 16 times more compute cores. The time per simulation timestep was reduced by a factor of 8.55 in the best case. Since a typical ultrasound simulation needs a week to finish on 1024 cores, this decomposition can finally get us below 24 hours, which is necessary for clinical trials.
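
The quoted figures are mutually consistent; a quick back-of-the-envelope check (a sketch; the exact run times depend on the domain size and machine):

```python
# Back-of-the-envelope check of the scaling figures quoted above.
week_hours = 7 * 24
speedup = 8.55       # best-case reduction in time per simulation timestep
extra_cores = 16     # LDD employs up to 16x more cores than GDD

runtime = week_hours / speedup      # a week-long run after the speedup
efficiency = speedup / extra_cores  # payoff per additional core

print(f"{runtime:.1f} h")   # ~19.6 h, under the 24 h clinical limit
print(f"{efficiency:.0%}")  # ~53% efficiency on the additional cores
```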

ANTAREX : AutoTuning and Adaptivity appRoach for Energy efficient eXascale HPC systems

Cristina Silvano is an Associate Professor (with tenure) of Computer Engineering at the Politecnico di Milano. She received her MS degree (Laurea) in Electrical Engineering from Politecnico di Milano in 1987. From 1987 to 1996, she was Senior Design Engineer at the R&D Labs of Group Bull in Pregnana Milanese (Italy) and Visiting Engineer at Bull R&D Labs in Billerica (US) (1988-1989) and at the IBM Somerset Design Center, Austin (US) (1993-1994). She received her Ph.D. in Computer Engineering from the University of Brescia in 1999. She was Assistant Professor of Computer Science at the University of Milano (2000-2002) and has been Associate Professor at the Politecnico di Milano since 2002. Her primary research interests focus on computer architectures and electronic design automation, with particular emphasis on power-aware design for embedded systems, design space exploration and runtime resource management for many-core architectures. Her research has been funded by several national and international projects. In particular, she was Principal Investigator of several industrially funded research projects in collaboration with STMicroelectronics. She was Project Coordinator of the FP7 European Projects MULTICUBE (2008-2010) and 2PARMA (2010-2013). She is currently Project Coordinator of the H2020-FET-HPC ANTAREX European project on autotuning and adaptivity for energy-efficient exascale High Performance Computing systems.

The ANTAREX research project started on 1 September 2015 and is funded by the H2020 Future and Emerging Technologies programme on High Performance Computing (HPC). The main goal of the ANTAREX project is to provide a breakthrough approach to map, runtime-manage and autotune applications for green and heterogeneous HPC systems up to the Exascale level. One key innovation of the proposed approach is a separation of concerns, in which self-adaptivity and energy-efficiency strategies are specified alongside application functionality, supported by a Domain Specific Language (DSL) inspired by aspect-oriented programming concepts for heterogeneous systems. The new DSL will be introduced to express the adaptivity/energy/performance strategies and to enforce application autotuning and resource and power management at runtime. The goal is to support the parallelism, scalability and adaptability of a dynamic workload by exploiting the full system capabilities (including energy management) of emerging large-scale and extreme-scale systems, while reducing the Total Cost of Ownership (TCO) for companies and public organizations.
The project involves CINECA, the Italian Tier-0 Supercomputing Centre and IT4Innovations, the Czech Tier-1 Supercomputing Center. The Consortium also includes three top-ranked academic partners (ETH Zurich, University of Porto, and INRIA), one of the Italian leading biopharmaceutical companies (Dompé) and the top European navigation software company (Sygic).
The ANTAREX project is driven by two use cases chosen to address the self-adaptivity and scalability characteristics of two highly relevant HPC application scenarios: a biopharmaceutical HPC application for accelerating drug discovery, deployed on the 1.2 PetaFlops heterogeneous NeXtScale Intel-based IBM system at CINECA; and a self-adaptive navigation system to be used in smart cities, deployed on the server side on a heterogeneous Intel-based 1.46 PetaFlops class system provided by IT4Innovations. The key ANTAREX innovations will be designed and engineered from the beginning to scale up to the Exascale level. Performance metrics extracted from the two use cases will be modeled to extrapolate the results towards Exascale systems. These use cases were selected for their significance in emerging application trends and thus for their direct economic exploitability and relevant social impact.
This poster will present the project, its main objectives, some of the technologies being researched and developed, and will emphasize its expected social impact.

Micro-robots and bacteria swimming in a supercomputer

Francisco Alarcón recently completed his PhD at the University of Barcelona (UB) under the supervision of Prof. Ignacio Pagonabarraga, where they developed and applied a numerical scheme to study the collective behaviour of microorganisms. They carried out systematic studies to understand how the hydrodynamic signature affects the emergence of collective motion. During his PhD, Francisco learned the fluid dynamics simulation method called Lattice-Boltzmann and was able to carry out large-scale simulations using high performance computing. He is still working with Prof. Pagonabarraga on numerical simulations of micro-swimmers, and is opening another research line on modelling the mechanical behaviour of filled rubber in collaboration with the SKF Engineering & Research Centre (ERC). His research interests include computational modelling, mesoscopic simulation methods, high-performance computing, fluid dynamics, active matter, polymers, biophysics, and statistical physics in general.

Bacteria, and microorganisms in general, are distributed in various aquatic environments in nature and industry. Despite their tiny size, microorganisms play a vital role in a wide variety of phenomena. For example, plankton, which plays a fundamental role in the ocean ecosystem, develops massive blooms which form the basis of the marine food web, regulate carbon in the atmosphere, and are responsible for half of the photosynthesis that takes place on our planet. Phytoplankton makes life on Earth possible. Algae bioreactors used to produce biomass constitute another relevant example.

In the above examples, the behavior of such suspensions is tightly coupled to the transport of chemical substances, momentum, and energy. In particular, to understand microorganism motility and transport, we should take into account that their collective behaviour emerges from their dynamic self-organization. Such collective behaviour has inspired researchers to deepen the understanding of the physics of motility in order to engineer complex emergent behaviours in model systems that promise advances in technological applications. With numerical simulations we are able to uncover the fundamental role that hydrodynamic coupling through the embedding solvent plays in the collective behaviour of model systems of self-propelled microswimmers. Such a fundamental understanding will help to identify new routes to design micro-robots that can imitate microorganisms.

Recently, we have shown that micro-swimmers in 3D can generate a coordinated response, such as a tendency to swim along each other, or create giant density fluctuations induced by the emergence of a percolating dynamic cluster. We have found that the key factor producing these collective motions is the hydrodynamic signature of the micro-swimmers.
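
Giant density fluctuations are commonly diagnosed by comparing the standard deviation of particle counts in sub-boxes against the √N behaviour of an equilibrium suspension. A minimal sketch of the baseline measurement (our own illustration, not the Lattice-Boltzmann code used in this work):

```python
import numpy as np

rng = np.random.default_rng(0)

def number_fluctuations(positions, box, cells):
    """Std. dev. and mean of particle counts in square sub-boxes.

    For an equilibrium (uniform) suspension dN ~ <N>**0.5; active
    suspensions with giant density fluctuations show dN ~ <N>**a, a > 0.5.
    """
    counts, _, _ = np.histogram2d(
        positions[:, 0], positions[:, 1],
        bins=cells, range=[[0, box], [0, box]])
    return counts.std(), counts.mean()

# Uniform 2D "passive" suspension as the equilibrium baseline.
pos = rng.uniform(0, 100.0, size=(100_000, 2))
dN, N = number_fluctuations(pos, 100.0, 10)
print(dN / np.sqrt(N))  # close to 1 for a uniform suspension
```

For an active suspension the same ratio grows with the sub-box size, which is the signature of giant density fluctuations.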

Since the set-up of many experiments is a suspension in which particles move in a quasi-2D geometry, we developed a systematic numerical study in which the experimental parameters are simulated. We present here numerical simulations of interacting micro-swimmers constrained to move in a slab. The results prove that our simulations can reproduce the living clusters obtained by experimentalists for both active colloids and bacteria. We also show that spherical swimmers trapped in a plane, but embedded in an unconstrained fluid, can move along the interface and rotate freely in all directions.

In order to identify the intrinsic nature of the measured emergent coordinated behaviour, and rule out finite-size effects, we have had to carry out a systematic finite-size analysis, reaching large simulation boxes. The intrinsic non-equilibrium nature of these systems, and their tendency to develop long-range correlations, require the use of large-scale suspensions involving millions of swimmers, where we were able to explore the evolution of the suspensions over long time windows to avoid the dynamic slowing down that we have identified in active suspensions. For such studies it is important to have high performance computing; in our case we had access to the MareNostrum Supercomputer at the Barcelona Supercomputing Center (BSC), and also to resources through the Partnership for Advanced Computing in Europe (PRACE).

Local and global simulation of accretion discs

The process of accretion, through which matter gradually falls onto a central object, is crucial for many astrophysical phenomena. Accretion typically occurs in the form of a disc, and we find accretion discs both in the nuclei of galaxies, around supermassive black holes, and around stars at the beginning and at the end of their life. As accretion proceeds, the matter falling onto the central body releases its gravitational energy, which is transformed into heat and into the observed radiation. Accretion discs can power some of the most energetic phenomena in the universe, and understanding how they work is very important for the comprehension of different astrophysical problems, like how stars are formed or what happens in the central cores of galaxies.

In order to have accretion, however, angular momentum has to be extracted from the matter in the disc, otherwise it would continue to orbit forever around the central object. For this reason, understanding the process of angular momentum extraction is fundamental for determining the mass accretion rate and the disc luminosity. Friction between adjacent rings of the disc, moving at different angular speeds, can provide a way to extract angular momentum, but this process is much too slow and the resulting luminosities are well below those we observe. Friction must then be enhanced by turbulent motions and, in order to develop turbulence, the disc must be magnetized; in fact, non-magnetized discs are known to be stable. On the other hand, turbulence has to regenerate the magnetic field it needs to develop, otherwise it dies and accretion stops, unless an external magnetic field is provided. The analysis of these complex processes can be done only through three-dimensional magnetohydrodynamic simulations, which can only be performed on supercomputers of the class of PRACE systems.

In our project we followed the rotation of a small portion of the disc at very high resolution, in order to capture the turbulence at very small scales, and we also studied the dynamics of the global disc. First, looking at a small patch of the disc, we studied how the heat generated inside the disc is carried out to the surface to be radiated, and we discovered that this is accomplished by convection. Furthermore, the presence of convection helps turbulent motions to regenerate the magnetic field by a dynamo process and thus enhances the overall efficiency of the accretion process. The global disc simulations show a similar convective behavior and how it changes with radial distance from the central body. This project has been successful in adding a new piece to the comprehension of how the interplay between turbulence, magnetic field generation and energy transport by convection makes accretion work in the most diverse astrophysical environments.

Born on 29/01/1954. Laurea in Physics, University of Torino, 1977. Research Astronomer at Turin Observatory from 1981 to 1995, Associate Astronomer at Turin Observatory from 1995 to the present. Visiting scientist at the Harvard-Smithsonian Center for Astrophysics (1984-1985, 1990-1991) and at the University of Chicago (periodic visits). His main research interests are Plasma Astrophysics and Computational Astrophysics, with particular reference to astrophysical jets and accretion discs.

The Performance Optimisation and Productivity Centre of Excellence

As science continues to push the boundaries, so too must performance of the applications used.

The efficient analysis of ever larger scientific problems can only be achieved through well-optimised HPC codes. With the growing complexity of parallel computers, users may not be aware of the detailed issues affecting the performance of their applications, and therefore will not be fully exploiting their HPC resources. The Performance Optimisation and Productivity Centre of Excellence in Computing Applications (POP) is funded by the European Union’s Horizon 2020 programme to address this situation. It aims to boost the productivity of European researchers and industry by discovering inefficiencies in existing HPC applications and suggesting feasible improvements. POP is a collaboration between Barcelona Supercomputing Center, High Performance Computing Center Stuttgart, Jülich Supercomputing Centre, Numerical Algorithms Group Ltd, RWTH Aachen and TERATEC.

POP offers a service which brings together world-class HPC expertise from a number of European partners to combine excellent academic resources with a practical, hands-on approach. Even better, its services are delivered free of charge to organisations based within the European Union!

In this talk we describe some of the work undertaken on the project so far. This includes some of the potential improvements identified and the corresponding potential savings, as well as feedback from the end users. POP aims to work on 150 codes in its first 2.5 years of operation and deliver 3M Euros of savings directly to customers through the improved performance of parallel codes. This will enable European organisations to carry out better and faster research, whilst also reducing their carbon footprint by focusing on efficiency and obviating the need for investment in additional hardware.

Sally Bridgwater is an HPC Applications Analyst at NAG Ltd, working on the Performance Optimisation and Productivity Centre of Excellence. She received an MSci in Mathematics and Physics from the University of Bristol in 2010 and then her PhD in Computational Physics from the University of Warwick in 2015. Her main interests lie in molecular simulation with a particular focus on Monte Carlo algorithms, along with a wider enthusiasm for the use of HPC in science.

Towards an HPC tool for simulation of 3D CSEM surveys: an edge-based approach

Octavio Castillo Reyes holds a bachelor’s degree in Computer Systems Engineering from Xalapa Institute of Technology, Mexico, and an M.Sc. in Networks and Telecommunications from Atenas Veracruzana University, Mexico. He has previously worked as a lecturer at the University of Veracruz, particularly in the Master in Telematic Engineering and the Bachelor in Administrative Computer Systems.

His scientific interests range over the broad fields of telecommunications, wireless sensor networks, computational methods, the finite element method, multiprocessor architectures, memory systems, and performance and workload characterization.

In 2015, he won the JCC2015-BSC “Your Thesis in Three Minutes” prize as part of the 2nd CONACyT-Catalonia Cooperation Days.

Currently, he is a CONACyT fellow conducting doctoral studies in Computer Architecture at the Polytechnic University of Catalonia (Barcelona, Spain). He develops his research in the Department of Computer Applications in Science and Engineering (CASE) at the Barcelona Supercomputing Center – National Supercomputing Center (BSC), under the direction of Professor José María Cela Espin and researchers Josep de la Puente and Vladimir Puzyrev from the Repsol-BSC research group. His doctoral thesis concerns the edge-based finite element method for the solution of electromagnetic problems in geophysics and its coupling to high performance computing (HPC) architectures.

This work is about the integration of three concepts: the Controlled-Source Electromagnetic Method (CSEM) for geophysics, the Edge-based Finite Element Method (EFEM) and High Performance Computing.

Nowadays CSEM, the first concept, is one of the most important techniques for reducing ambiguities in data interpretation in hydrocarbon exploration. In the standard configuration, the sub-seafloor structure is explored by emitting low-frequency signals from a high-powered electric dipole source towed close to the seafloor. By studying the received signal, subsurface structures can be detected at scales of a few tens of meters, down to depths of several kilometers.
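
The depth sensitivity quoted above follows from the electromagnetic skin depth of a diffusive low-frequency field. A quick sketch with typical values (assumed for illustration, not taken from the poster):

```python
import math

MU0 = 4e-7 * math.pi  # vacuum magnetic permeability (H/m)

def skin_depth(resistivity_ohm_m, freq_hz):
    """EM skin depth: the depth over which a diffusive low-frequency field
    decays by 1/e, delta = sqrt(2 * rho / (omega * mu0))."""
    omega = 2 * math.pi * freq_hz
    return math.sqrt(2 * resistivity_ohm_m / (omega * MU0))

# Illustrative values: a 0.25 Hz source in conductive seawater vs. a
# resistive hydrocarbon-bearing layer.
print(f"{skin_depth(0.3, 0.25):.0f} m")   # seawater (~0.3 ohm.m): ~550 m
print(f"{skin_depth(50.0, 0.25):.0f} m")  # resistive layer: several km
```

The strong resistivity contrast is what makes low-frequency CSEM sensitive to hydrocarbon reservoirs at kilometre depths.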

EFEM, the second concept, has become very popular for simulating electromagnetic field problems. The main advantages of EFEM formulations are the elimination of spurious solutions and a substantial reduction of the computational modeling cost. However, the state of the art is marked by a relative scarcity of robust HPC codes based on this approach, which may be attributed to its theoretical and implementation barriers.

The third concept plays an important role because, in real scenarios, 3D CSEM modeling can easily overwhelm single-core and modest multicore computing resources.

Building on these ideas, this work describes the most relevant aspects to consider in order to implement an HPC code that simulates CSEM surveys with EFEM. After investigating recent trends in parallel computing techniques to mitigate the computational burden associated with the modeling, we obtained a modular, flexible and simple software stack. The result is an implementation that allows users to specify edge-based variational forms in H(curl) for the simulation of electromagnetic fields in CSEM surveys.

Simulation results are validated through convergence tests and comparison with previously published results. The code’s performance is studied through scalability tests and the analysis of hardware counters.
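
A convergence test of this kind typically estimates the observed order of accuracy from the errors on successively refined meshes. A generic sketch with hypothetical error values (lowest-order edge elements are nominally first-order accurate, so the error should roughly halve per refinement):

```python
import math

def convergence_rate(e_coarse, e_fine, refinement=2.0):
    """Observed order of accuracy from errors on two meshes:
    p = log(e_coarse / e_fine) / log(refinement)."""
    return math.log(e_coarse / e_fine) / math.log(refinement)

# Hypothetical errors from successive mesh halvings (not measured data).
errors = [1.0e-2, 5.1e-3, 2.6e-3]
rates = [convergence_rate(errors[i], errors[i + 1]) for i in range(2)]
print([round(p, 2) for p in rates])  # both close to 1, as expected
```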

Effect of laser imperfections on laser wakefield acceleration and betatron source

Julien Ferri is a 3rd-year PhD student at the Commissariat à l’Energie Atomique (CEA) in France. He obtained a master’s degree in engineering and applied mathematics from the ENSTA (Paris) in 2013.

He is now supervised by Agustin Lifschitz at the Laboratoire d’Optique Appliquée (LOA) and Xavier Davoine (CEA). The topic of the thesis is the X-ray sources that can be generated through laser wakefield acceleration of electrons: Compton emission and particularly the Betatron X-ray source. The work performed in this thesis is purely numerical, with intensive use of the particle-in-cell code CALDER.

Laser Wakefield Acceleration (LWFA) of electrons was first proposed in 1979 by Tajima and Dawson. It consists of sending an ultra-short and intense laser pulse into a low-density gas jet. The extremely high intensity of the laser pulse completely ionizes the gas, turning it into a plasma, and creates a high-amplitude plasma wave, the so-called wakefield, which can trap and accelerate an electron beam. Extremely high accelerating fields are obtained in the wakefield, conferring a huge advantage on this scheme over conventional electron accelerators: the same electron energies are achieved in a distance about 1000 times shorter, so the size of the facilities is reduced by the same factor.
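
The factor of ~1000 follows directly from the ratio of accelerating gradients; a sketch with typical textbook values (assumed for illustration, not from the poster):

```python
# Rough gradient comparison behind the "1000x shorter" claim.
E_rf = 50e6      # conventional RF accelerator gradient, V/m (~50 MV/m)
E_lwfa = 50e9    # laser-wakefield accelerating field, V/m (~50 GV/m)

target_energy_eV = 1e9  # accelerate electrons to 1 GeV

d_rf = target_energy_eV / E_rf      # metres in a conventional machine
d_lwfa = target_energy_eV / E_lwfa  # metres in a plasma wakefield

print(f"{d_rf:.0f} m vs {d_lwfa * 100:.0f} cm -> {d_rf / d_lwfa:.0f}x shorter")
```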

Of particular interest are the X-ray sources that can be generated with electrons obtained by LWFA. Amongst them, the betatron source offers the advantage of a very simple experimental implementation, as it is based on the natural transverse wiggling of the electrons during their acceleration. In addition, these sources benefit from the advantages of the accelerated electron beam: very short duration (a few fs = 10⁻¹⁵ s), small source size (μm) and high photon energy (tens of keV), which make them suitable for applications in domains that use X-ray imaging, such as medicine, high-energy physics and industry.

The stability and characteristics of betatron X-ray sources have greatly improved in the last ten years. This was partly made possible by the extensive use of numerical simulations, which help to understand the underlying physics. This numerical work is realised with particle-in-cell (PIC) codes, which can be very computationally demanding. First, because LWFA physics requires 3D simulations for a good reproduction of the phenomena involved. Second, because we numerically have to deal with very different scales: the laser wavelength (μm) differs by several orders of magnitude from the plasma size (~mm/cm), but has to be accurately discretised.
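
The scale disparity translates directly into grid size; an illustrative estimate with assumed typical numbers (not figures from the poster):

```python
# Why LWFA PIC runs are demanding: the micron-scale laser wavelength must be
# resolved over a mm-to-cm plasma (illustrative numbers, assumed).
wavelength = 0.8e-6   # Ti:sapphire laser wavelength, m
plasma_len = 1.0e-2   # gas-jet / plasma length, m (~1 cm)
cells_per_wl = 25     # longitudinal cells needed per laser wavelength

n_longitudinal = plasma_len / wavelength * cells_per_wl
print(f"{n_longitudinal:.2e} cells along the propagation axis alone")
```

Multiplying by the two transverse dimensions pushes a full 3D run to billions of cells, which is why these simulations need supercomputers.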

In this work, based on the comparison between experimental data and 3D PIC simulations, we investigate the influence of realistic laser beam imperfections on LWFA. Simulations with a realistic beam show that the performance of electron acceleration and betatron X-ray emission can be profoundly degraded compared with an ideal laser beam of the same energy: the X-ray photon number is reduced by one order of magnitude. This clearly highlights the limitations of using an ideal beam in simulations: taking the experimental laser imperfections into account leads to better quantitative agreement with experimental data, and fine effects or trends can be reproduced. Moreover, it shows that improving the laser beam quality in experiments is of prime importance.

Reactivity of Metal Oxide Nanocluster Modified TiO2: Oxygen Vacancy Formation and CO2 Interaction

Marco Fronzi is working as a postdoctoral researcher in the Materials Modelling for Devices Group at Tyndall National Institute, University College Cork (Ireland).

He received his Laurea in Physics and his PhD in Computational Material Science at Tor Vergata University in Rome (Italy).

During the course of his career he has held research positions with a range of institutions across the world, including Osaka University and the National Institute for Materials Science in Japan, the University of Sydney in Australia, and the University of Tor Vergata and the Neuroscience Institute of the National Research Council in Italy. The main focus of his research is the application of theoretical-computational methodologies (e.g. Density Functional Theory calculations, and ab-initio and classical Molecular Dynamics simulations) to understand and predict the properties of surfaces, interfaces and the bulk of materials of technological interest, and to analyse their catalytic surface properties.

The results of his work have been published as scholarly articles in prestigious international journals (among them Phys. Chem. Chem. Phys., J. Chem. Phys. and Phys. Rev. B) and presented at international conferences (e.g. the American Physical Society Meeting, the American Vacuum Society Meeting, the Australian Institute of Physics Congress and The Physical Society of Japan Meeting).

Marco Fronzi (Presenter) and Michael Nolan

Tyndall National Institute, UCC, Lee Maltings, Dyke Parade, Cork, Ireland

TiO2 photocatalysts, which use sunlight to generate chemically active electrons and holes that transform water or CO2 into hydrogen or hydrocarbon fuels, have two key challenges:
(1) to shift the TiO2 band gap to the visible region, allowing solar energy to be used and
(2) enhancing charge separation after photoexcitation.
We discuss our modelling work, using density functional theory, on a new mechanism for band gap modification in TiO2: surface modification of TiO2 with metal oxide nanoclusters. Modifying TiO2 with transition metal oxide nanoclusters induces visible light activity, which is achieved by introducing nanocluster-derived electronic states above the original TiO2 valence band edge, shifting the VB edge to higher energy. A model of the photoexcited state confirms the band gap reduction, which is controlled by the coverage of transition metal oxide nanoclusters. A range of metal oxide nanoclusters including Ce3O5, Zr3O6 and Sn4O4 have been investigated and the mechanisms of band gap modification elucidated. Simple rules for modifying TiO2 to induce visible light absorption are presented. We show that our models can predict the fate of photoexcited holes and electrons and that the presence of low-coordinated atoms is crucial. We also investigated the interaction with CO2 in order to understand the electron transfer mechanism and CO2 activation.

In performing these studies, high performance computing using local resources, the ICHEC Stokes and Fionn infrastructure and the PRACE infrastructure (through DECI projects) was absolutely crucial. The simulations involve large atomistic models of TiO2-based systems, the screening of many potential structures and compositions, and the associated analysis and post-processing. The combined HPC ecosystem of our local, national and European HPC resources was essential both in allowing us to carry out these calculations in the first place (capability) and in providing the necessary throughput (capacity).

Two of the most significant societal grand challenges we face are in energy supply (due to the decline in fossil fuel resources and increased fuel usage) and in CO2 emissions (due to increased fuel usage, with extreme consequences). The photocatalyst materials we have developed mark a significant advance in this field, providing a new pathway to photocatalyst development using widely available and safe materials. Their ability to operate under visible light and to be sufficiently reactive to oxidise water and convert waste CO2 to useful molecules has been demonstrated, making these photocatalysts cutting-edge materials for addressing the energy supply and emissions challenges we face in the near future.

Processing and visualization of medical images from CT scans

Milan Jaroš
+420 597 329 583
Year of birth 1980
Nationality Czech

  • 2014–Present
    IT4Innovations national supercomputing center, Ostrava (Czech Republic)
    Visualization and Virtual Reality, Bioinformatics
  • 2011–2014
    KVADOS, a.s., Ostrava (Czech Republic)
    Developing applications for mobile devices
  • 2009–2011
    KVADOS Mobile Solutions s.r.o., Ostrava (Czech Republic)
Developing applications for mobile devices
  • 2007–2009
    Vigour Delta spol. s r.o., Praha (Czech Republic)
Web development for banking institutions
  • 2006–2007
    Ing. SOFTWARE DLUBAL s.r.o., Praha (Czech Republic)
Development of programs for the calculation of beam structures
  • 2013–Present
    VŠB-Technická Univerzita Ostrava, Ostrava-Poruba (Czech Republic)
  • 2000–2006
    VŠB-Technická Univerzita Ostrava, Fakulta elektrotechniky a informatiky, Ostrava-Poruba (Czech Republic)

Strakos, P., Jaros, M., Karasek, T., Kozubek, T., Vavra, P., Jonszta, T. (2015), ‘Advanced image processing methods for automatic liver segmentation’, in COUPLED PROBLEMS 2015: Proceedings of the 6th Int. Conf. on Coupled Problems in Science and Engineering, ed. Onate E., Papadrakakis M., Schrefler B.A., pp. 125-136.

Strakos, P., Jaros, M., Karasek, T., Riha, L., Jarosova, M., Kozubek, T., Vavra, P., Jonszta, T. (2015), ‘Parallelization of the Image Segmentation Algorithm for Intel Xeon Phi with Application in Medical Imaging’, in Proceedings of the 4th Int. Conf. on Parallel, Distributed, Grid and Cloud Computing for Engineering, ed. Iványi P., Topping B.H.V., Civil-Comp Press, Stirlingshire, UK, Paper 7. doi:10.4203/ccp.107.7.

Strakos, P., Jaros, M., Karasek, T., Kozubek, T., Vavra, P., Jonszta, T. (2015), ‘Review of the Software Used for 3D Volumetric Reconstruction of the Liver’, World Academy of Science, Engineering and Technology, International Science Index 98, International Journal of Computer, Control, Quantum and Information Engineering, 9(2), pp. 422-426.

Jaroš, M., Říha, L., Strakoš, P., Karásek, T., Vašatová, A., Jarošová, M., Kozubek, T. (2015), ‘Acceleration of Blender Cycles Path-Tracing Engine using Intel Many Integrated Core Architecture’, in CISIM 2015: Proceedings of the 14th IFIP TC 8 International Conference, ed. Saeed K., Homenda W.

In medicine, technical advances are common and bring an enormous increase in processed data. Computed tomography (CT) is one particular area of medicine where large amounts of data appear. CT uses X-rays to create images of the human body in sequential slices.

With the development of this technique, new directions in diagnostic medicine were found. It can validate therapeutic effectiveness in cancer diseases or help in surgical planning. To help medical doctors with treatment, 3D models of organs are very useful. To create the 3D models from CT images, image segmentation is required.

Image segmentation is a digital image processing technique which automatically splits a digital image into regions in which all pixels share the same properties. Results of the image segmentation can then be used in the 3D model reconstruction of organs. Image segmentation can be a very time-consuming process; therefore, ways to reduce the computational time are being investigated.

One of the most obvious solutions for reducing the computational time is parallelization of image segmentation algorithms. In this poster, parallelization of the k-means algorithm for image segmentation is presented. For the parallel implementation, the Intel Xeon Phi coprocessor with the Many Integrated Core (MIC) architecture has been selected. To demonstrate the parallel capabilities of the k-means algorithm, segmentation of CT images covering the abdominal part of the body was performed. The results of this work will be used for the development of a software application for the automatic three-dimensional reconstruction of human organs and tissues.
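The k-means segmentation step can be sketched with a minimal serial implementation on pixel intensities (an illustration only, not the poster's parallel MIC code; the synthetic image and cluster count are arbitrary):

```python
import numpy as np

def kmeans_segment(image, k=4, iters=20, seed=0):
    """Cluster pixel intensities into k classes (Lloyd's algorithm)."""
    rng = np.random.default_rng(seed)
    pixels = image.reshape(-1, 1).astype(float)
    # Initialize centroids from k distinct random pixels
    centroids = pixels[rng.choice(len(pixels), k, replace=False)]
    for _ in range(iters):
        # Assignment step: each pixel goes to its nearest centroid
        labels = np.argmin(np.abs(pixels - centroids.T), axis=1)
        # Update step: each centroid becomes the mean of its pixels
        for c in range(k):
            if np.any(labels == c):
                centroids[c] = pixels[labels == c].mean()
    return labels.reshape(image.shape)

# Synthetic "CT slice": two tissue intensities plus noise
img = np.zeros((64, 64))
img[16:48, 16:48] = 100.0
img += np.random.default_rng(1).normal(0, 5, img.shape)
seg = kmeans_segment(img, k=2)
```

In the parallel setting, the assignment step is embarrassingly parallel over pixels, which is what makes the algorithm attractive for many-core coprocessors.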

ECOSCALE: Reconfigurable Acceleration and Runtime System towards Exascale

Dr. Iakovos Mavroidis is a member of the Telecommunication Systems Institute in Greece and a Visiting Professor in Computer Science at the University of Crete. From 1991 to 1995, he was awarded two gold and two bronze medals in national competitions and two honorable mentions in International Olympiads in Mathematics and Computer Science. He received his M.Sc. degree in Electrical Engineering and Computer Science from the University of California at Berkeley in 2001. He worked at Sun Microsystems in 2000, designing a network interface card. From 2001 to 2002 he was with MIPS Technologies, where he was responsible for the design of the Load Store Unit and Memory Management Unit of the R20K microprocessor. From 2004 to 2006 he was with Ellemedia Technologies, designing a network processor as a senior engineer. From 2007 to 2010 he was with Virtual Trip as Manager of the Integrated Systems Group. From 2004 to 2006, and from 2012 until now, he has also been a Visiting Instructor in Computer Science at the University of Crete, Greece. In 2011, he received his Ph.D. degree from the Department of Electronic and Computer Engineering at the Technical University of Crete in Greece. He was the author and Technical Manager of two EU projects (FASTCUDA, VPlanet), participated in several other EU projects (EUROSERVER, ExaNeSt, ExaNode, HiPEAC, DeSyRe, ENCORE, OSMOSIS, TEXT, FASTER, HEAP, NPMADE, LYDIA) and is coordinating two EU projects (ECOSCALE, RAPID).

In order to reach exascale performance, current HPC systems need to be improved. Simple hardware scaling is not a feasible solution due to the increasing utility costs and power consumption limitations. Apart from improvements in implementation technology, what is needed is to refine the HPC application development flow as well as the system architecture of future HPC systems. ECOSCALE tackles these challenges by proposing a scalable programming environment and architecture, aiming to substantially reduce energy consumption as well as data traffic and latency. ECOSCALE introduces a novel heterogeneous energy-efficient hierarchical architecture, as well as a hybrid many-core+OpenCL programming environment and runtime system. The ECOSCALE approach, including the architecture, the programming model and the runtime system, is hierarchical and is expected to scale well by partitioning the physical system into multiple independent Workers (i.e. compute nodes). Workers are interconnected in a tree-like fashion and define a contiguous global address space that can be viewed either as a set of partitions in a Partitioned Global Address Space (PGAS), or as a set of nodes hierarchically interconnected via an MPI protocol. To further increase energy efficiency, as well as to provide resilience, the Workers employ reconfigurable accelerators mapped into the virtual address space utilizing a dual stage System Memory Management Unit with coherent memory access. The architecture supports shared partitioned reconfigurable resources accessed by any Worker in a PGAS partition, as well as automated hardware synthesis of these resources from an OpenCL-based programming model.

ESPRESO – solver for petascale systems

Lubomír Říha

Research scientist at IT4Innovations. His research interests include: efficient parallelization and acceleration of various scientific applications on multi- and many-core architectures (such as GPUs and Intel Xeon Phi) using different parallelization techniques; multi-core and GPU acceleration of fast indexing and search in multidimensional databases; development and optimization of application-level communication protocols (communication-efficient work distribution on heterogeneous clusters with multiple GPU accelerators per node and InfiniBand interconnect; communication hiding and avoiding techniques for FETI solvers); and architecture optimization for specific application workloads.

Work Experience:

Research Scientist, HPC Enabling Expert (06/2013 – present)

Acceleration of iterative sparse linear solvers using heterogeneous accelerators. Implementation of massively parallel Hybrid FETI solvers

Visiting Research Scientist (09/2014 – 12/2014)

Parallelization of dimension reduction techniques for hyperspectral remote sensing and cloud detection algorithms for Intel Xeon Phi and Nvidia Kepler GPU many-core architectures.

Visiting Research Scientist (04/2014 – 07/2014)

Initial stage of the development of the ExaScale PaRallel FETI Solver (ESPRESO), a highly efficient parallel solver which contains several FETI-based algorithms, including the new Hybrid Total FETI method suitable for the world's largest parallel machines.

Research Scientist and System Administrator (01/2011 – 06/2013)

High Performance Computing Laboratory (HPCL) The George Washington University, Department of Electrical and Computer Engineering, Washington, DC

Research and development in the area of high performance scientific, engineering and business analytics applications.

Research Assistant and System Administrator (01/2008 – 01/2011)

Design and implementation of real time image processing algorithms for: (a) computer cluster (using MPI), (b) FPGA (using VHDL, Matlab), (c) CELL (using C and CELL BE SDK) and (d) GPU (using C, C# and NVIDIA CUDA). Experience in development of Linux embedded systems for Sensor Networks. System administration of Apple G5 based HPC cluster XSEED.

ESPRESO sparse linear solver: weak scaling up to 5.8 billion unknowns; superlinear strong scaling up to 4913 nodes of CSCS Piz Daint, solving a 2.6 billion unknown problem using the pipelined conjugate gradient method; GPU acceleration using the Schur complement, with a speedup of 3-4x over the CPU.

The ExaScale PaRallel FETI SOlver (ESPRESO) is a highly parallel implementation of Hybrid FETI. It is designed to provide the parallelism required for future exascale machines. It has been successfully tested on Europe's largest supercomputer, CSCS Piz Daint, running on 4913 of its 5272 nodes.

The performance is demonstrated on a synthetic 3D linear elasticity cube and on real-world benchmarks (for smaller problems). Weak scalability tests evaluate ESPRESO's performance on extremely large problems, running from 1 to 2197 nodes of Piz Daint with 2.7 million unknowns per node. The results show that it can solve problems with 5.8 billion unknowns using 2197 compute nodes, with flattening scaling characteristics.

A strong scaling test evaluates a 2.6 billion unknown problem running on 1000 to 4913 nodes. ESPRESO uses pipelined Conjugate Gradient as its iterative solver instead of the standard version. This method improves strong scaling performance by hiding the global communication in the solver (the global dot products) behind the SpMV operation. Its efficiency is shown by the single-iteration time, which exhibits superlinear scaling up to 4913 nodes. The entire solver runtime, which includes the number of iterations, also exhibits linear scaling up to 4913 nodes.
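The restructuring that makes this overlap possible can be sketched with a minimal serial, unpreconditioned pipelined CG in the Ghysels-Vanroose form, shown here only to illustrate where the two dot products and the SpMV sit in each iteration (this is not ESPRESO code; the test matrix is an arbitrary small SPD system):

```python
import numpy as np

def pipelined_cg(A, b, tol=1e-10, maxit=500):
    """Unpreconditioned pipelined CG (Ghysels-Vanroose recurrence).

    Mathematically equivalent to standard CG, but the two global
    reductions (gamma, delta) can be overlapped with the
    matrix-vector product q = A @ w in a parallel implementation.
    """
    x = np.zeros_like(b)
    r = b - A @ x
    w = A @ r
    z = s = p = None
    alpha = gamma_old = 0.0
    for i in range(maxit):
        gamma = r @ r        # global reduction 1
        delta = w @ r        # global reduction 2
        q = A @ w            # SpMV, overlapped with the reductions
        if gamma < tol ** 2:
            break
        if i == 0:
            alpha = gamma / delta
            z, s, p = q, w, r
        else:
            beta = gamma / gamma_old
            alpha = gamma / (delta - beta * gamma / alpha)
            z = q + beta * z
            s = w + beta * s
            p = r + beta * p
        x = x + alpha * p    # all vector updates are local
        r = r - alpha * s
        w = w - alpha * z
        gamma_old = gamma
    return x

# Small well-conditioned SPD test system
rng = np.random.default_rng(2)
n = 30
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)
b = rng.standard_normal(n)
x = pipelined_cg(A, b)
```

Because the recurrences keep the auxiliary vectors (s = A p, z = A A p, ...) up to date, both reductions are computed before the SpMV of the same iteration, which is what allows a non-blocking all-reduce to proceed in its shadow on a distributed machine.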

ESPRESO also supports Intel Xeon Phi and GPU accelerators. This version is designed to fully utilize the SIMD units of modern many-core accelerators. To use an accelerator efficiently, the solver uses a dense Schur complement instead of the sparse Cholesky factors of the stiffness matrices. This allows ESPRESO to use the dense BLAS GEMV function instead of the forward and backward substitution of a direct solver. The speedup of the solver runtime delivered by one Tesla K20m is 3-4x over two 8-core Sandy Bridge Xeons (E5-2470).

Large-scale atomistic Molecular Dynamics simulation study of polymer-graphene nanocomposites

Emmanuel Skountzos received his Diploma in Chemical Engineering from the University of Patras in 2010 and his Master's in Chemical Engineering from the Department of Chemical Engineering of the same University in 2013. Since then, he has been pursuing his PhD thesis at the Laboratory of Statistical Thermodynamics and Macromolecules (LSTM), headed by Prof. Vlasis G. Mavrantzas, at the same Department. He has been working on the simulation of several graphene-based polymer nanocomposites. More specifically, he investigates the role of the functionalization of the polymer chains in the fine dispersibility of the graphene sheets in a nanocomposite melt. Prior to this, he successfully simulated the mechanical properties of a glassy polymer and its nanocomposites, which led to a recent publication (E.N. Skountzos, A. Anastassiou, V.G. Mavrantzas, D.N. Theodorou, “Determination of the Mechanical Properties of a Poly(methyl methacrylate) Nanocomposite with Functionalized Graphene Sheets through Detailed Atomistic Simulations”, Macromolecules 2014, 47, 8072). His up-to-date research on the problem of graphene agglomeration in a polymer matrix led to an even more recent publication (K.D. Papadimitriou, E.N. Skountzos, S.S. Gkermpoura, I. Polyzos, V.G. Mavrantzas, C. Galiotis, C. Tsitsilianis, “Molecular modeling combined with advanced chemistry for the rational design of efficient graphene dispersing agents”, ACS Macro Letters 2016, 5, 24). He has been part of several awarded allocation projects through the periodic PRACE Calls (HPC-Europa, PRACE and LinkSCEEM), which were submitted under the supervision of the scientific director (Prof. Vlasis Mavrantzas).

Graphene is widely used nowadays for the fabrication of novel polymer nanocomposite structures because of its extraordinary combination of properties that render it an extremely promising material as a nanofiller. Graphene sheets (GS), however, tend to self-assemble in the polymer matrix and agglomerate into multilayer graphitic structures due to strong π-π stacking interactions. To overcome this problem, functionalization of GS through covalent or non-covalent modification has been suggested and found to improve its dispersion in solvents and processability. An alternative approach is to use pyrene-functional polymers. Pyrene derivatives are strongly adsorbed on GS through robust π-π stacking interactions, thereby leading to highly uniform and stable dispersions without inflicting any damage to the graphitic surface.

The scope of the current project is: a) to exploit the power of detailed, large-scale molecular dynamics (MD) simulations in order to examine the capability of pyrene-functionalized α,ω-poly(methyl methacrylate) (py-PMMA-py) to serve as a dispersing agent that can prevent GS agglomeration, and b) to use computer deformation experiments to calculate the mechanical properties of the resulting GS-based py-PMMA-py nanocomposites. Here, we will rely on the methodology first proposed by Theodorou-Suter and applied in our previous study, leading to the calculation of the elastic constants of syndiotactic PMMA at small wt.% loadings of functionalized GS with remarkable success.

The molecular weight (MW) of the py-PMMA-py chains addressed in the current study varied from 1,000 to 30,000 g mol-1, while the dimensions of the GS were on the order of a few tens of nanometers. Depending on the temperature and pressure conditions specified, equilibrating the resulting nanocomposite structures required tracking their evolution for times up to a few hundred nanoseconds (up to microseconds in some cases). Moreover, for the simulation results to be free of any finite system size effects, very large simulation cells were employed (on the order of a few hundred Angstroms in each direction) containing a few hundred thousand atomistic units. These conditions rendered necessary the use of high performance supercomputing centers in order to track system dynamics and extract reliable information on the system response to the applied mechanical deformation (shear, compression, dilatation, etc.).

Graphene is anticipated to have potential applications in several research fields: to build semiconductors beyond the limits of silicon-based technology, to design higher performance solar cells, to fabricate LCD screens and photon sensors, etc. Yet, graphene is still at the development stage, and its commercialization pathway remains to be determined. For that reason, the European Union decided to fund, with a budget of 1 billion euros over the forthcoming 10 years (the EU's biggest ever research initiative, the “Graphene Flagship”), several academic and industrial research groups to take graphene from the realm of academic laboratories into European society and generate economic growth, new jobs and new opportunities.

Apart from governmental funding, several large companies (Samsung, Apple, Nokia, IBM and Sony) are involved in graphene research, funding their Research & Development departments with several hundreds of millions of dollars. Samsung is far and away the world's leader in graphene patents (210 patents), followed by IBM, SanDisk and Apple with 64, 36 and 35 patents, respectively. This great scientific and commercial interest in graphene proves that this research field is still very active even 10 years after graphene's discovery.

Permon toolbox

Education: Master's Degree in Applied Mathematics, VSB-TU Ostrava

Position: PhD student, Computational Science, IT4Innovations, VSB-TU Ostrava

Areas of interest: parallel programming, FETI methods, quadratic programming

The Permon toolbox is a set of tools which make use of theoretical results in discretization techniques, quadratic programming (QP) algorithms and domain decomposition methods (DDM). Permon is based on the PETSc library and uses its coding style. The most essential modules are PermonQP and PermonFLLOP. PermonQP is a module providing a base for the solution of QP problems. It includes data structures, transformations, algorithms and supporting functions for QP. PermonQP allows solving unconstrained or equality-constrained QP problems; inequality-constrained QP problems can be solved using the PermonIneq module. PermonFLLOP is an extension of PermonQP. It implements the FETI non-overlapping DDM and provides support for the assembly of the FETI-specific objects. The algebraic part of the FETI methods is implemented as a special QP transform combining several QP transforms from PermonQP. A newly emerging interface with the FEM software Elmer will enable the solution of real problems of mechanics and other applications.
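As a minimal illustration of the kind of problem PermonQP targets, an equality-constrained QP can be solved directly through its KKT saddle-point system (a toy numpy sketch with arbitrary data; PermonQP itself operates on PETSc objects and uses specialized, scalable algorithms):

```python
import numpy as np

def solve_eq_qp(A, b, B, c):
    """Minimize 1/2 x^T A x - b^T x subject to B x = c.

    Solved via the KKT saddle-point system:
        [ A  B^T ] [x]   [b]
        [ B   0  ] [y] = [c]
    Requires A symmetric positive definite and B of full row rank.
    """
    n, m = A.shape[0], B.shape[0]
    kkt = np.block([[A, B.T], [B, np.zeros((m, m))]])
    rhs = np.concatenate([b, c])
    sol = np.linalg.solve(kkt, rhs)
    return sol[:n], sol[n:]  # primal x, Lagrange multipliers y

A = np.array([[4.0, 1.0], [1.0, 3.0]])  # SPD Hessian
b = np.array([1.0, 2.0])
B = np.array([[1.0, 1.0]])              # one constraint: x0 + x1 = 1
c = np.array([1.0])
x, lam = solve_eq_qp(A, b, B, c)
```

In FETI-type methods the equality constraints play the role of the interface gluing conditions between subdomains, and the multipliers are the quantities the dual solver iterates on.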

Modelling arbitrarily shaped laser pulses in electromagnetic codes

Illia Thiele is a PhD student at the Centre Lasers Intenses et Applications in Bordeaux (France). He obtained his master's degree in physics at the Friedrich Schiller University in Jena (Germany) in 2014, after working on “Investigation of nonlinear effects in plasmonic nanostructures by finite difference time domain simulations”. His current work in Bordeaux focuses on the modeling of ultrashort nonlinear pulse propagation in gases, with emphasis on terahertz generation and strongly focused laser beams. Of particular interest for this work are particle-in-cell simulations including various physical mechanisms such as ionization and collisions.

Electromagnetic codes are useful tools to study various problems in microwave engineering, plasma physics, optics and other branches of natural science. Such codes solve Maxwell’s equations coupled to constitutive equations describing the matter. In studies of laser matter interaction, external electromagnetic waves (the ”laser”) have to enter the computational domain in order to interact with the matter.

In the case of particle-in-cell codes for investigations of laser plasma interaction, it is common practice to prescribe external electric and magnetic fields at the numerical box boundaries. Very often, the paraxial approximation is used to calculate the required fields at the boundaries. However, the paraxial approximation is valid only if the angular spectrum of the laser pulse is sufficiently narrow; thus, it is not possible to use this approximation for strongly focused pulses. For several beam types, e.g. Gaussian, higher-order approximations have been presented, but they are rather complicated and therefore not easy to implement. Moreover, for more exotic beam shapes, like vector beams or even sampled experimental profiles, it may be impossible to find an explicit analytical solution.

We propose a simple and efficient algorithm for a Maxwell-consistent calculation of the electromagnetic fields at the boundaries of the computational domain. We call these laser boundary conditions (LBCs). Our algorithm can describe any kind of laser pulse, in particular tightly focused, arbitrarily shaped and polarized ones. The presented algorithm can be parallelized in a straightforward manner and may be used with simulation tools employing domain decomposition, such as particle-in-cell codes. Using domain decomposition, where each processing unit describes a particular part of the computational domain, as the principle of parallelization requires high-performance computers.

Using such machines, we successfully employed our approach to simulate a tightly focused Gaussian pulse. An accurate handling of the laser injection turns out to be crucial: electron density profiles from field ionization of neutral argon atoms are shown to depend strongly on the LBCs. Consequently, the LBCs may have a significant impact on features like back-reflected radiation or energy deposition in the medium. Furthermore, our algorithm offers a simple way to simulate more complex pulse configurations or even sampled experimental beam profiles. Such “structured light” has recently received a lot of interest from various communities. As an example, we demonstrated a longitudinal needle beam, which may be interesting for, among other applications, laser-based material processing or particle acceleration studies. Thus, we believe that our approach will be useful for a larger community working on electromagnetic simulation codes.

Topological constraints in polymer rings

Dimitrios Tsalikis received his Diploma in Chemical Engineering from the University of Patras in 2004 and his Ph.D. (titled “Computational study of structural relaxation and plastic deformation of glassy polymers”) from the National Technical University of Athens in 2009 under the advisement of Prof. D.N. Theodorou. In 2011 he joined the research team of Prof. Vlasis Mavrantzas in Patras as a post-doctoral fellow. He is the author of 9 papers, practically all on the multi-scale modeling and simulation of polymers in their glassy or molten state. In the last few years, he has been developing methodologies for understanding topological interactions (threading events) in melts of non-concatenated polymer rings, starting from atomistic trajectories accumulated in the course of very long, large-scale molecular dynamics simulations in full atomistic detail, using polyethylene oxide (PEO) as a model system. He has already published convincing evidence for strong threading events in these systems (either pure or contaminated with linear counterparts) that can explain the long delays observed in the stress relaxation modulus of highly purified polymer rings compared to analytical theories.

Dimitrios Tsalikis has had unique experience with high performance computing since 2007, being an active user of Tier-1 and Tier-0 HPC systems available to the scientific community via the HPC-Europa, PRACE and LinkSCEEM projects. He also has considerable experience with programming on Graphics Processing Units (GPUs), which he uses to accelerate the atomistic Monte Carlo (MC) and kinetic Monte Carlo (kMC) algorithms that he is developing as part of his participation in many other projects in the team of Prof. Vlasis Mavrantzas, such as the determination of the effective diffusivity of water in polymer matrices containing carbon nanotubes and the simulation of silicon films grown by plasma-enhanced chemical vapor deposition.

Conformational dynamics and topological analysis for polymer rings via atomistic molecular-dynamics simulations and comparison with experimental data

Dimitrios G. Tsalikis,1 Vlasis G. Mavrantzas1,2

1 Department of Chemical Engineering, University of Patras&FORTH/ICE-HT, Patras, GR 26504, Greece

2 Particle Technology Laboratory, Department of Mechanical and Process Engineering, ETH-Z, CH-8092 Zurich, Switzerland

Due to their chain-like structure and uncrossability, a number of microscopic topological constraints, known as entanglements, are generated in high-molar-mass polymers and dominate their dynamical and rheological properties. For linear or branched polymer architectures, these topological interactions are well understood today thanks to the tube model, an effective medium theory proposed independently by de Gennes [1] and Doi-Edwards [2], built on the concept of the primitive path (PP). According to the tube model, entanglements constrain the lateral motion of the chains, which is thus restricted within a curvilinear tube-like region encompassing the chain. The primitive path is the shortest path along the chain contour which avoids crossings with other chains and has the same topology as the chain itself. For a linear chain with two free ends, the PP diffuses (i.e., reptates) backward and forward along the tube axis. An arm of a branched polymer has one free end which fluctuates, attempting to reach the other, immobile end.

There exists, however, a class of polymers, the so-called ring polymers, which lack chain ends and whose dynamics cannot be described by the conceptual framework of the tube model. Due to the absence of chain ends and their particular loop topology, polymer rings exhibit dynamic and viscoelastic properties that cannot be explained by reptation theory. Even for the simplest case of unlinked (i.e., non-concatenated) ring polymers, issues related to chain configuration, molecular shape, local and terminal dynamics and stress relaxation are today only partially understood, although several advances have been made over the years. The key idea is that entangled rings contract into a folded form in order to satisfy the constraint that they should never link with neighboring rings (the so-called lattice animal picture). Several extensions and improvements of this picture have been proposed recently, all however agreeing on the contraction and the presence of local double folds or loops [6-9].

Polymer rings serve, however, as a model system for understanding the dynamics of fundamental ring structures in nature, such as mitochondrial and plasmid DNA, and this explains the great interest in their properties (static, dynamic, and viscoelastic) in the last few years. From an experimental point of view, a major difficulty to overcome in the measurements is the production of highly purified and monodisperse polymer rings in sufficient quantities, since even a small contamination of the melt by linear chains can have a dramatic effect on the dynamics and molecular rheology of the melt.

To help understand the nature and role of microscopic topological interactions in polymer rings, we have thus resorted to computer simulations and to the design of a simulation strategy involving three steps: a) execution of very long molecular dynamics simulations of model ring polymer structures using a well-studied system, polyethylene oxide (PEO); b) topological analysis of the accumulated trajectories to define local contact points; and c) detailed geometric analysis using vector calculus to identify threading events, classify them into weak and strong, and compute their dynamics from birth to death.

Key to the success of the entire methodology is the simulation of model PEO structures of molecular weights spanning the regime of MWs addressed experimentally. This means very large simulation cells and thus the use of supercomputing time. For example, we ran simulations with systems containing up to 200,000 atomistic units. We also need to study the flow properties of these systems, which, due to Lees-Edwards boundary conditions, necessitates simulation cells that, in the flow direction, are as long as the fully extended size of the chains, i.e., on the order of 100 nm. This increases even further the number of particles in the simulation cell, up to 500,000.

So far, we have already carried out test simulations with short-to-moderately long PEO melts (MW = 5k, 10k, and 20k) and the results are impressive. We have found outstanding agreement with experimental data for the ring center-of-mass self-diffusion coefficient and the normalized single-chain dynamic structure factor from small angle neutron scattering (SANS), neutron spin echo (NSE), and pulse-field gradient NMR (PFG-NMR) measurements [3]. Furthermore, we have quantified ring-ring threading in the simulated ring PEO melts, which reveals a variety of topological interactions corresponding to single and multiple penetrations that can last up to several times the average ring polymer orientational relaxation time [4,5]. And we have shown that these interactions can explain, at least in part, the appearance of slow relaxation modes observed experimentally in entangled rings [6].

[1] P.G. de Gennes, J. Chem. Phys. 55, 572 (1971).

[2] M. Doi and S.F. Edwards, The Theory of Polymer Dynamics (Clarendon Press, 1986).

[3] D G. Tsalikis, T. Koukoulas, V.G. Mavrantzas, D. Vlassopoulos, W. Pyckhout-Hintzen, and D. Richter, Macromolecules, in preparation (2015).

[4] D.G. Tsalikis and V. G. Mavrantzas, ACS Macro Lett. 3, 763 (2014).

[5] D.G. Tsalikis, V.G. Mavrantzas, D. Vlassopoulos, Phys. Rev. Lett., submitted (2015).

[6] M. Kapnistos, M. Lang, D. Vlassopoulos, W. Pyckhout-Hintzen, D. Richter, D. Cho, T. Chang, and M. Rubinstein, Nature Mater. 7, 997 (2008).

Optimized Atomistic Monte Carlo and Molecular Dynamics Algorithms for simulating self-assembly in soft matter

Flora Tsourtou received her Diploma in Chemical Engineering from the University of Patras in 2010 and her Master's in Chemical Engineering from the same University in 2013. Since then, she has been a PhD student in the same Department under the advisement of Prof. Vlasis G. Mavrantzas. She has been working on the design and efficient implementation of state-of-the-art Monte Carlo algorithms for the simulation of self-assembly and chain self-organization in soft matter systems, with emphasis on polymers. In her Master's thesis, she simulated the bulk phase self-assembly of semifluorinated alkanes, which resulted in a scientific publication (F.D. Tsourtou, O. Alexiadis, V.G. Mavrantzas, V. Kolonias, E. Housos, “Atomistic Monte Carlo and Molecular Dynamics simulation of the bulk phase self-assembly of semifluorinated alkanes”, Danckwerts Special Issue, Chemical Engineering Science 2015, 121, 32-50). In her PhD, she is extending that work to two more challenging problems:
1. the prediction of nano-scale morphology in polymer semiconductors based on thiophene and alkyl-thiophene as a function of temperature, and
2. the prediction of the secondary structure in polypeptides such as poly-L-alanine in vacuum, solution and melt.

The Monte Carlo codes she is developing are optimized to run on Graphics Processing Units (GPUs); she therefore has strong experience with parallel programming using MPI, OpenMP, and GPUs. The Monte Carlo and Molecular Dynamics codes that she develops and uses in the course of her PhD thesis research run on several supercomputing environments available to the scientific community via the HPC-Europa, PRACE and LinkSCEEM projects. Access to these networks and high-performance machines is made available through proposals submitted by the Scientific Director (Prof. Vlasis Mavrantzas) to periodic PRACE Calls.

An outstanding issue in the field of molecular simulations is the so-called problem of long relaxation times: the longest time that can be accessed today in a brute-force molecular dynamics (MD) simulation, even on the most powerful supercomputing systems, is still a few microseconds, i.e., orders of magnitude shorter than the time scales (seconds or minutes) associated with the development of morphology at nano- or meso-scales in soft matter systems. As a result, and despite the fact that the equilibrium structure of many of these systems is rather well known thanks to advanced experimental techniques, their molecular modeling remains challenging and currently prevents the use of computer simulations as a design tool for new materials with modulated or tailored properties.

To overcome this problem and study challenging issues of tremendous scientific and technological interest, such as self-organization and assembly, while avoiding coarse-graining, one can resort to a non-dynamic method and unleash the computational power of the Monte Carlo (MC) technique. We thus design novel atomistic MC algorithms which retain atomistic detail, are based on advanced moves that enhance the rate at which the system explores new states, and have the potential to predict self-organization at intermediate length scales remarkably accurately. We use these algorithms to predict self-assembly and chain self-organization in three classes of materials: semifluorinated alkanes (SFAs), polypeptides, and polymer semiconductors. We chose these three families for the following reasons:
(a) several sets of experimental data exist on their nanoscale structure with which we can compare the results of our simulation algorithms;
(b) the three systems have many common features at the molecular level, so they can be treated similarly from an algorithm point of view;
(c) the successful simulation of their self-assembling properties will have important implications: for example, the simulation of polypeptide folding will open the way to the direct simulation of novel materials (e.g., peptide-based hydrogels) for applications in biomedical and tissue engineering. SFAs, on the other hand, are used as drug delivery systems, while polymer semiconductors constitute the active elements of optoelectronic devices such as organic photovoltaic solar cells.

To study morphology in all three systems, it is important to use large simulation cells with dimensions on the order of tens to hundreds of nanometers, which brings up the issue of parallelizing the corresponding codes. To cope with the large CPU-time requirements accompanying the use of such super-cells in our simulations, we identify the Monte Carlo subroutines with the largest demand on computational resources and run them in parallel by utilizing multithreading on NVIDIA graphics processing units (GPUs). This improves code performance by almost one order of magnitude, with some spectacular results concerning the predicted morphologies. In addition, we resort to other parallelization schemes such as parallel tempering which, combined with MPI and CUDA, allows us to push the simulations down to low temperatures where the above materials become glassy or semicrystalline. In my poster, I will start by reviewing the state of the art in the field and then focus on our own work. I will present the basic design aspects of the new algorithms and their parallelization on GPUs and CPUs, and I will highlight the results obtained so far and how they compare against experimental data and other simulation findings based on simplified coarse-grained models. I will also present how our Monte Carlo simulations can be interfaced with large-scale atomistic MD simulations to further study segmental and chain relaxation mechanisms in these highly complicated systems.
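The parallel-tempering scheme mentioned above can be illustrated with a minimal sketch (ours, not the group's production code): several replicas of a toy system run Metropolis Monte Carlo at different temperatures, and neighbouring replicas periodically attempt to swap configurations with the standard acceptance probability min(1, exp[(1/T_i − 1/T_j)(E_i − E_j)]). The one-dimensional double-well potential is a hypothetical stand-in for a real polymer force field.

```python
import math
import random

def energy(x):
    # Toy 1D double-well potential standing in for a polymer force field
    return (x**2 - 1.0)**2

def metropolis_step(x, T, step=0.5):
    # Standard Metropolis move at temperature T
    x_new = x + random.uniform(-step, step)
    dE = energy(x_new) - energy(x)
    if dE <= 0 or random.random() < math.exp(-dE / T):
        return x_new
    return x

def parallel_tempering(temps, n_sweeps=2000, seed=1):
    # One replica per temperature; hot replicas cross barriers easily
    # and feed decorrelated configurations down to the cold ones.
    random.seed(seed)
    xs = [1.0 for _ in temps]
    for _ in range(n_sweeps):
        xs = [metropolis_step(x, T) for x, T in zip(xs, temps)]
        # Attempt a swap between a random pair of neighbouring temperatures
        i = random.randrange(len(temps) - 1)
        d = (1.0 / temps[i] - 1.0 / temps[i + 1]) * (energy(xs[i]) - energy(xs[i + 1]))
        if d >= 0 or random.random() < math.exp(d):
            xs[i], xs[i + 1] = xs[i + 1], xs[i]
    return xs
```

A call such as `parallel_tempering([0.1, 0.5, 2.0])` returns the final configuration of each replica; the swap criterion preserves detailed balance for the joint distribution of all replicas.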

Parallel Encryption Method for Big Data

Sena Efsun Cebeci received her B.S. degrees in Computer Engineering and Computer Science from Bahcesehir University in 2008. She earned her M.S. degree in Computer Science from Oakland University in 2010. She is currently pursuing her Ph.D. in Applied Informatics at Istanbul Technical University. Her research interests include cybersecurity and cryptography, energy efficiency in peer-to-peer networks, and distributed systems.

In the past few decades, the information technology revolution has affected almost all aspects of human life and created a growing demand for secure systems. Providing security via encryption has become essential for information technology systems in this respect. Because of its applicability to Big Data, we consider the parallel encryption problem and propose a novel method that provides secure and parallel data encryption for Big Data. Current encryption methods using the Cipher Block Chaining (CBC) mode of operation assure security and privacy but have drawbacks in terms of parallelization and robustness. Conversely, the well-known symmetric key encryption methods are not feasible with the Electronic Code Book (ECB) mode of operation on account of known-ciphertext attacks. In this research, we present an encryption method that can be used with the efficient ECB mode of operation to construct a system that is resistant to possible attacks and stores Big Data in a highly protected manner.
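The parallelization bottleneck behind CBC can be seen in a toy sketch (purely illustrative, not the authors' method: the "cipher" is a plain XOR, and counter mode stands in for any scheme whose blocks are mutually independent). In CBC each ciphertext block is an input to the next one, so encryption is inherently sequential, whereas independent blocks map trivially onto a thread pool.

```python
from multiprocessing.dummy import Pool  # a thread pool suffices to show independence

BLOCK = 4

def toy_encrypt(block, key):
    # Stand-in block "cipher": XOR with the key (insecure, illustration only)
    return bytes(b ^ k for b, k in zip(block, key))

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def cbc_encrypt(blocks, key, iv):
    # Each ciphertext block feeds into the next one: inherently sequential
    out, prev = [], iv
    for p in blocks:
        prev = toy_encrypt(xor(p, prev), key)
        out.append(prev)
    return out

def ctr_encrypt(blocks, key):
    # Counter mode: every block needs only its own index, so all blocks
    # can be encrypted concurrently (here: a 4-thread pool)
    def one(item):
        i, p = item
        keystream = toy_encrypt(i.to_bytes(BLOCK, "big"), key)
        return xor(p, keystream)
    with Pool(4) as pool:
        return pool.map(one, list(enumerate(blocks)))
```

Note that counter mode is its own inverse with this keystream construction, so applying `ctr_encrypt` twice restores the plaintext.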

Reduction of greenhouse effect: Direct Numerical Simulation of geological CO2 sequestration

Marco De Paoli received his master’s degree in Mechanical Engineering from the University of Udine in 2013, where he collaborated with professors A. Soldati, C. Marchioli and F. Zonta. The subject of his thesis was a model for Lagrangian particle tracking in LES flow fields. He was a visiting student at the Institut de Mécanique des Fluides de Toulouse (IMFT) in 2013. Since 2014 he has been a PhD student in Environmental and Energy Engineering Science at the University of Udine, and his interests have turned toward HPC systems. He attended a Summer School on Parallel Computing (Rome, 2014) and an International Summer School on High Performance Computing (Toronto, 2015). He is currently working on flows in porous media, using pseudo-spectral methods and massively parallelized tools. From March to September 2016 he will be a visiting student at the University of Vienna.

Carbon dioxide (CO2) emission from the combustion of fossil fuels has increased sharply in recent decades, with dramatic environmental consequences. Such emissions have led to a corresponding rise in the atmospheric concentration of CO2, likely responsible for the concomitant increase in the average global temperature (greenhouse effect). One option to reduce the CO2 concentration is CO2 storage in large, porous geological formations located kilometers beneath the Earth’s surface. These formations are saline aquifers, characterized by the presence of brine (highly salted water). At these depths, where CO2 exists in a supercritical (liquid-like) state, the fundamental aspect of the dissolution mechanism is that CO2, which is lighter than brine when pure, is partially soluble in brine (3% by weight) and forms a heavier solution that flows downward (driven by the density difference). It is crucial to carefully evaluate the vertical mass flux of CO2, because this is the macroscopic parameter that drives CO2 dissolution in deep aquifers. Prediction of this downward flow is made complex by a characteristic instability: the heavier layer of CO2-rich brine becomes unstable and gives rise to the formation of fingers, which make the process of CO2 sinking much more efficient but also much harder to predict: the problem becomes multiscale, with correspondingly huge computational costs required for accurate solutions. Due to the high resolutions required, these predictions are possible only using High Performance Computing resources.

In this work we performed two-dimensional Direct Numerical Simulations of convective flows in a confined porous medium characterized by non-isotropic permeability. We use an accurate and efficient pseudo-spectral numerical method (Fourier and Chebyshev) to discretize the equations. A two-level Adams-Bashforth scheme for the non-linear terms and an implicit Crank-Nicolson scheme for the linear terms are employed for time advancement. The code uses the FFTW3 library to perform Fast Fourier Transforms. The parallel programming paradigm used is the MPI standard, and parallelization is achieved by one-dimensional domain decomposition. The domain is discretized using up to 8192 × 1025 nodes in the horizontal and vertical directions, and the computation is massively parallelized on the Tier-0 HPC system “FERMI”.
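The time-advancement strategy described above (explicit two-level Adams-Bashforth for the nonlinear terms, implicit Crank-Nicolson for the linear ones) can be sketched on a simpler 1D periodic analogue, the viscous Burgers equation u_t + u u_x = ν u_xx, discretized in Fourier space. This is a deliberately reduced stand-in for the actual Fourier-Chebyshev solver:

```python
import numpy as np

def imex_step(u, N_prev, k, nu, dt):
    """One AB2/CN step for u_t + u*u_x = nu*u_xx on a periodic domain.

    Nonlinear term: explicit 2nd-order Adams-Bashforth.
    Diffusion term: implicit Crank-Nicolson.
    Returns the updated field and the current nonlinear term (spectral).
    """
    u_hat = np.fft.fft(u)
    N_curr = -1j * k * np.fft.fft(0.5 * u**2)   # -u*u_x in spectral form
    rhs = (u_hat * (1.0 - 0.5 * dt * nu * k**2)
           + dt * (1.5 * N_curr - 0.5 * N_prev))
    u_hat_new = rhs / (1.0 + 0.5 * dt * nu * k**2)
    return np.real(np.fft.ifft(u_hat_new)), N_curr

# usage: decaying sine wave, 128 modes on [0, 2*pi)
n, nu, dt = 128, 0.1, 1e-3
x = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
k = 2.0 * np.pi * np.fft.fftfreq(n, d=2.0 * np.pi / n)  # integer wavenumbers
u = np.sin(x)
N_prev = -1j * k * np.fft.fft(0.5 * u**2)  # bootstrap the two-level scheme
for _ in range(100):
    u, N_prev = imex_step(u, N_prev, k, nu, dt)
```

Treating the stiff diffusion term implicitly removes its severe time-step restriction, while the cheap explicit treatment of the nonlinear term avoids solving a nonlinear system at every step — the usual rationale for IMEX schemes of this kind.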

From a technical and economic point of view, fundamental knowledge and characterization of the process of permanent CO2 sequestration is crucial to ensure the realization of a ‘Zero emission fossil fuel power plant (ZEP)’, i.e. a power plant that captures at least 85% of the CO2 formed during the power generation process.

The authors thankfully acknowledge CINECA Supercomputing Center for the availability of high performance computing resources and support in the framework of the projects CARDIOS (ISCRA B HP10BT2CCX) and GECOS (ISCRA C HP10C7M9MV).

POP: Promoting Efficiency in European HPC

Nick Dingle obtained a degree in Computing Science from Imperial College London in 2001 and a Ph.D. in Computing from the same institution in 2004. He worked as a Research Associate at Imperial and at the School of Mathematics of the University of Manchester between 2004 and 2013. His main research interest is parallel numerical linear algebra, with a particular focus on the efficient solution of large sparse systems. He is currently employed by NAG Ltd as an HPC Application Analyst on the EU-funded Performance Optimisation and Productivity Centre of Excellence in Computing Applications (POP) project, but retains a visiting position at the University of Manchester.

The Performance Optimisation and Productivity Centre of Excellence in Computing Applications (POP) provides independent performance assessments of academic and industrial HPC codes in all domains. Our assessment includes a detailed estimate of the potential performance gains and an identification of the techniques needed to achieve them. This service is free to all organisations based within the European Union.

POP is a team with excellence in performance tools and tuning, programming models and practices, and research and development, applied to academic and industrial use cases. We work with code developers, users, infrastructure operators and vendors.

POP is funded by the European Union’s Horizon 2020 programme and is a collaboration between Barcelona Supercomputing Center, High Performance Computing Center Stuttgart, Jülich Supercomputing Centre, Numerical Algorithms Group Ltd, RWTH Aachen and TERATEC.

High-resolution 3D hybrid simulations of turbulence and kinetic instabilities in the expanding solar wind

Luca Franci is a Postdoctoral Research Fellow at the Department of Physics and Astronomy of the University of Florence, working on plasma physics and in particular on simulations of solar wind turbulence. He started his research activity in this field only recently (04/2014), right after completing his PhD studies on neutron stars. He already has three papers published in international peer-reviewed journals on the topic (two as first author). He has presented his results at many international conferences (e.g., EGU General Assembly 2015, AGU Fall Meeting 2015). He has years-long experience in running, debugging, profiling and optimizing parallel codes for computational astrophysics and in managing HPC resources; the latter dates back to 2011, when he attended the PRACE Tier-1 Workshop. In 2012 he attended the PRACE Winter School at CINECA, after which he was selected for an internship. In 2013 he was the PI of a CINECA ISCRA-C project and a member of the PRACE-awarded project “3DMagRoI” (6th call). Currently, he is the PI of an active CINECA ISCRA-C project and the Co-PI of a CINECA ISCRA-B project. He is also the PI of a project submitted to the DECI 13th Call, which is now under evaluation.

Turbulence in magnetized collisionless plasmas, such as the solar wind, is one of the major challenges and open questions of space physics and astrophysics. A vast store of data from space plasma missions indicates that kinetic interactions are active in the solar wind and that processes at small length scales are crucial for a proper understanding of the dynamics of the plasma from the solar corona outwards.

This project is aimed at studying the development and interplay of turbulence and kinetic processes in the expanding solar wind by means of high-resolution simulations. Its main goal is to describe the features of the turbulent cascade in the sub-ion range, to identify dissipation mechanisms, and to model particle heating.

We investigate these phenomena by employing a three-dimensional (3D) hybrid particle-in-cell (PIC) code, in which the ions are discretized into macro-particles while the electrons act as a massless, charge-neutralizing fluid. The code also employs the Hybrid Expanding-Box (HEB) Model, which simulates the solar wind expansion.

Last year, thanks to the intensive use of Tier-0 resources (CINECA’s “Fermi” machine), we performed the most accurate two-dimensional (2D) hybrid-PIC simulations of solar wind turbulence in the literature. Thanks to a very high spatial resolution and a large number of particles, these simulations reproduced many properties of the observed solar wind spectra, all simultaneously for the first time in the literature. Our simulations also proved that a very large number of particles is mandatory for a correct quantitative estimate of the proton heating.

Despite the very good agreement between our 2D results and observations, we have recently also performed accurate 3D simulations, since turbulence is inherently a 3D process and a 2D geometry strongly constrains the simulated dynamics. Due to the highly demanding computational requirements, gaining reliable insights into the properties of kinetic turbulence at sub-ion scales in 3D had not been feasible until very recently. The results from our 3D simulations have been compared with previous studies in the literature, improving our understanding of how astrophysical turbulence operates and of how plasma properties control or react to the evolution of turbulence. We also investigated the nature of the ion-scale spectral break observed in solar wind spectra. Such accurate simulations represent a strategic tool for planning future investigations and are particularly important for interpreting the observations from present and forthcoming space missions.

Performing a very high-resolution 2D study and a high-resolution 3D study has been an ambitious and innovative goal in itself. The use of the HEB model in 3D hybrid simulations was also novel, since the expanding-box model itself had only rather recently been extended to three dimensions even in the simpler magnetohydrodynamic case. Indeed, our 2D and 3D simulations represent the most accurate simulations of solar wind turbulence in the literature, and our 3D-HEB simulation is the first example of a consistent study of the evolution of turbulence accounting for the effects of expansion and of kinetic physics in three dimensions.

Investigating the properties of turbulence and kinetic instabilities at small length scales requires very large computational grids and large numbers of particles. The use of Tier-0 resources was therefore fundamental to improving our knowledge of this topic, mainly due to the high memory requirements (from 1 to 6 TB of RAM for very accurate 2D simulations and up to 12 TB for accurate 3D ones). The use of HPC systems with thousands of cores was also mandatory for completing a high-resolution 3D HEB simulation within a reasonable time. In fact, it had to run for a much longer time than standard (non-expanding) simulations, since the characteristic expansion time is much larger than the non-linear time associated with the turbulence.

Maxwell-Fluid code for simulating the terahertz generation in high-intensity laser-plasma interaction

Pedro González de Alaiza Martínez is a PhD student at CEA-DAM-DIF in Arpajon (France). He obtained a Master’s Degree in Modelling and Simulation at École Centrale Paris (France) and during his Master’s thesis he worked on a parallelized Particle-In-Cell (PIC) code for simulating plasmas under the action of ultra-high intensity lasers. He is currently studying analytically and numerically the generation of intense terahertz (THz) fields by laser-plasma interaction. His expertise domains cover unidirectional pulse propagation equations, PIC and Maxwell-Fluid codes for the production of intense photonic sources by laser-plasma interactions.

Terahertz (THz) radiation nowadays has many promising applications, in particular the remote detection of drugs or explosives and the medical imaging of tumors. Intense and broadband THz sources are needed for these applications. However, most of the existing techniques to supply THz radiation (conventional antennas, photoconductive switches, quantum cascade lasers and optical rectification) are limited by damage thresholds or by their spectral narrowness. In this context, for more than ten years an alternative technique, consisting of mixing several colours of intense femtosecond laser pulses inside a plasma spot that serves as a frequency down-converter in gases, has been successfully explored; it allows us to generate broadband THz signals, even remotely through laser filaments.

Laser-driven THz sources encompass various physical mechanisms producing THz waves: photoionization, the nonlinear Kerr effect, and plasma oscillations initiated by ponderomotive forces. Two models are widely used to understand the underlying physics in this context: the Unidirectional Pulse Propagation Equation (UPPE), which describes all the nonlinear optics accurately, and Particle-In-Cell (PIC) codes, which solve the complex plasma physics. We propose a new Maxwell-Fluid model as an in-between that couples the Maxwell equations, including all the nonlinear optics, with Vlasov equations describing the plasma.

Handling big computational domains while providing high resolution in many variables (in space and time for the laser harmonics, and in frequency for simulating an accurate THz spectrum) when solving highly nonlinear systems of hyperbolic equations represents a numerical and computational challenge. We have developed a fully parallelized code based on finite volumes, endowed with complete ionization modules for the multiple ionization of mixtures of gases and molecules (such as argon or air), which accurately solves our Maxwell-Fluid model from moderate (10^14 W/cm^2) to high (10^19 W/cm^2) laser intensities.

We have successfully employed this code to simulate THz generation by laser-plasma interaction in different situations. On the one hand, our simulations with the Maxwell-Fluid code reproduce with very good agreement the first experimental measurements of spectra in air, which show that THz emission is mainly due to the combination of both Kerr and plasma effects at moderate intensities (5×10^13 W/cm^2). On the other hand, the simulation results obtained from the Maxwell-Fluid code are in good agreement with PIC simulations up to relativistic intensities (10^18 W/cm^2). We have demonstrated numerically and analytically that photoionization induced by two-color laser pulses is the main mechanism yielding THz radiation, but plasma effects gain in importance as the laser intensity increases.

Our Maxwell-Fluid model effectively combines the physics of particle-in-cell and propagation codes. It allows us to gain a complete insight into the optical and plasma mechanisms involved in THz generation. We believe that our approach and results are important for designing bright THz sources using low-energy, ultrashort lasers.

On the binding mechanism to the Influenza A M2 channel

Current Position

Marie Sklodowska Curie Fellow at the University of Edinburgh.

The main focus of my research is to use computational simulations to investigate conformational changes in proteins and to understand how they can be exploited for drug design purposes.


  • 2015: Postdoctoral Research Scholar, University of Florida.
  • 2014: PhD in Biomedicine, University of Barcelona.
  • 2013: Visiting PhD Student, University of Bologna.
  • 2010: MSc in Biomedicine, University of Barcelona.
  • 2008: Graduate in Pharmacy, University of Barcelona.

The M2 channel is an essential protein of the influenza A virus, the causal agent of the flu. It assembles as a homotetramer across the viral envelope, and its transmembrane part plays a critical role in the virus life cycle as a pH-gated proton channel that allows acidification of the viral interior. The M2 channel is a validated target for anti-flu drugs such as amantadine (Amt) and rimantadine, which effectively inhibit the wild-type form of the channel. However, the widespread Amt-resistance mutations found in circulating flu strains (including the pandemic 2009 H1N1 swine flu and the highly lethal H5N1 avian flu) impelled the CDC to advise against its use as a flu treatment, dramatically decreasing the available therapeutic options.

Understanding how mutations affect the energetic and structural determinants of drug binding to the M2 channel is a key step in the design of new compounds that can target Amt-resistant flu strains. In order to better understand the molecular determinants of these processes, we leveraged the HPC resources provided by the Barcelona Supercomputing Center (supported by a PRACE project) to combine unrestrained Molecular Dynamics simulations with multiple-walker metadynamics and obtain an atomistic description of the binding and unbinding processes of M2 channel blockers. Our results have disclosed the existence of two ligand binding modes in the interior of the channel and allowed us to characterize the energetics of the Amt binding pathway. These findings provide valuable guidelines for the development of new inhibitors of potential therapeutic interest.

Quantum-dot solar cell modelling using density functional theory

Frédéric Labat has been assistant professor of molecular modelling and informatics at Chimie-ParisTech (Paris, France) since 2010. His main research interests concern the application of quantum chemical approaches, such as density functional theory, to solid-state systems, with a particular emphasis on energy-related devices.

With the decline of available fossil energy sources and the damage they inflict on the environment, considerable effort is devoted to the search for alternative, environmentally friendly and renewable energy sources. Among these, emerging photovoltaic technologies are expected to play a key role in the future energy mix, provided their scalability and conversion efficiencies are increased at minimal operating cost. This requires a comprehensive understanding of their basic operating principles, which remains largely lacking, with a very large number of potential candidates currently proposed at the experimental level.

In a standard solar cell, the photovoltaic effect involves the generation of electrons and holes in a single semiconductor material under illumination, and the subsequent charge collection at opposite electrodes. Light absorption and charge collection can also be separated when a heterojunction interface between materials of different ionization potentials or electron affinities is considered. In a photoanode, for instance, the driving force for the separation of electron/hole pairs into free charge carriers is a fast electron injection from the photoexcited light-harvester to the sensitized material.

Quantum dots (QDs) have recently gained marked attention as light harvesters, mainly due to their size-tuned optical response. Major challenges in quantum dot sensitized solar cells (QDSCs), however, include the efficient separation of the electron-hole pairs formed upon photoexcitation of the QDs and facile electron transfer to the electrode, which is typically achieved by combination with nanomaterials of suitable band energy acting as electron acceptors for the excited QDs.

In this contribution, we show how HPC resources can be used to address some key points of the basic operating principles of QDSC devices based on stacked QD/graphene layers, using periodic density functional theory calculations applied to large-scale systems.

Increasing Passive Safety in Railway Traffic

2013 – now
Ph.D., Applied Mechanics, Faculty of Mechanical Engineering, VSB-TU Ostrava
2011 – 2013
Master’s degree in Applied Mechanics, Faculty of Mechanical Engineering, VSB-TU Ostrava
2008 – 2011
Bachelor of Applied Mechanics, Faculty of Mechanical Engineering, VSB-TU Ostrava

2013 – now
IT4Innovations, National Supercomputing Centre, VSB-TU Ostrava
2013 – now
Academic staff – lecturer
Department of Applied Mechanics, VSB-TU Ostrava

Significant publication:
Horyl, P.; Snuparek, R.; Marsalek, P.: Behaviour of Frictional Joints in Steel Arch Yielding Supports. Archives of Mining Sciences, Vol. 59 (2014) No. 3, pp. 781-792.

2014 LSTC, Troy (MI)
2015 LSTC, Livermore (CA)

Contact in LS-DYNA
Composite in LS-DYNA
Blast in LS-DYNA
Penetration in LS-DYNA
ALE & Fluid/Structure Coupling in LS-DYNA
Smoothed Particle Hydrodynamics in LS-DYNA

Area of interest:
Mechanics – dynamics, LS-DYNA solver

Passenger safety is one of the key issues of the transport industry. In the automotive industry, designing safe vehicles is the norm and is governed by many standards. Surprisingly, this does not apply to railways, although millions of people commute by train every day. There are no crash-test standards similar to those in the automotive sector, except in the UK, where very strict rules apply. Manufacturers of passenger seats who want to sell their products in the UK have to pass rigorous crash tests, which are very expensive and prolong the design cycle of new seats. Numerical modelling and simulation is an option for reducing both the costs and the time needed to bring new products to market.

This poster presents joint research between IT4Innovations and the local SME Borcad. The main objective of the project is the development of a new generation of double seat for regional rail passenger transport. The presented double seat has to meet the revised parameters of the British crash-test standard GM/RT2100.

The main purpose of the modelling is the implementation of crash tests according to the test conditions of the experimental laboratory. The mathematical description of this problem is strongly nonlinear and time-dependent (including dummies/ATDs, preloading of the seat structure by bolts, more than a thousand spot welds, geometric and material nonlinearities, many contacts, etc.). To solve this highly nonlinear problem in a reasonable time frame, the Anselm supercomputer operated by IT4Innovations was utilized. The results of our HPC simulations, carried out with the finite element method, are used by Borcad to develop crumple zones and to reduce the mass of the double seats.

COMPAT: Computing Patterns for High Performance Multiscale Computing

Hugh Martin is a computational nucleotide-nanopore scientist with a PhD from UCL. Since finishing his PhD, Hugh has worked for the Centre for Computational Science at UCL on various EU- and UK-funded initiatives, including projects pertaining to biomedical informatics and complex natural systems. Hugh is also the General Manager at CBK Sci Con, a scientific consultancy devoted to the provision of high-end scientific, technical and management advice to businesses in e-science domains.

Multiscale phenomena are ubiquitous, and they are the key to understanding the complexity of our world. Despite the significant progress achieved through computer simulations over the last decades, we are still limited in our capability to accurately and reliably simulate hierarchies of interacting multiscale physical processes spanning a wide range of time and length scales, and we thus quickly reach the limits of contemporary high-performance computing at the tera- and petascale. Exascale supercomputers promise to lift this limitation, and in this project we will develop multiscale computing algorithms capable of producing high-fidelity scientific results and scalable to exascale computing systems.

We present the COMPAT project. With COMPAT, our main objective is to develop generic and reusable High Performance Multiscale Computing algorithms that will address the exascale challenges posed by heterogeneous architectures and will enable us to run multiscale applications with extreme data requirements while achieving scalability, robustness, resiliency, and energy efficiency. Our approach is based on generic multiscale computing patterns that allow us to implement customized algorithms to optimise load balancing, data handling, fault tolerance and energy consumption under generic exascale application scenarios. We aim to realise an experimental execution environment on our pan-European facility, which will be used to measure performance characteristics and develop models that can provide reliable performance predictions for emerging and future exascale architectures. The viability of our approach will be demonstrated by implementing nine exascale-ready grand challenge applications that pave the way to unprecedented scientific discoveries. Our ambition is to establish new standards for multiscale computing at exascale and to provide a robust and reliable software technology stack that empowers multiscale modellers to transform computer simulations into predictive science.

Aquatic Purification Assisted by Membranes

Kannan Masilamani is a PhD student under the supervision of Prof. Sabine Roller, University of Siegen.

He is working on the development of a coupled simulation framework to simulate electro-membrane processes for seawater desalination. His research is focused on the multi-species lattice Boltzmann method (LBM) and its coupling with electrodynamics. He is also interested in the development of an efficient automated octree mesh-generation tool for large-scale parallel applications.

A membrane-based electrodialysis process for seawater desalination is an energy- and cost-efficient method compared to other methods such as reverse osmosis. In electrodialysis processes, selective ion exchange membranes are used together with an electric field to separate the salt ions from the seawater. A geometrically complex structure, called a spacer, is used between the membranes to keep them apart from each other, to maintain stability, and also to improve the transport of ions and the bulk mixture.

The multi-species lattice Boltzmann method (LBM) for liquid mixtures was developed and implemented in our highly scalable simulation framework, the Adaptable Poly-Engineering Simulator (APES). The code was deployed on High Performance Computing (HPC) systems to gain insight into the processes involved. The compute time awarded by PRACE on the Cray XC40 system Hazel Hen at HLRS, Stuttgart was used to perform simulations studying the effect of interwoven and non-woven spacer structures in the flow channels. Due to the interacting phenomena and small-scale effects in this process, a high resolution is required to resolve the ion propagation in the flow. At the same time, the geometrical parameters must be considered, as they are the main aspects that can be changed in the industrial deployment of the system. The simulations conducted on the HPC systems lay the foundation for a more thorough understanding and a path towards better designs of the desalination process.
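The lattice Boltzmann building block can be sketched in a minimal single-species D2Q9 BGK form (a deliberately simplified stand-in of our own; the multi-species model implemented in APES is considerably more involved). Each step relaxes the distributions towards a local equilibrium and then streams them to neighbouring lattice sites:

```python
import numpy as np

# D2Q9 lattice: discrete velocities and weights
c = np.array([[0, 0], [1, 0], [0, 1], [-1, 0], [0, -1],
              [1, 1], [-1, 1], [-1, -1], [1, -1]])
w = np.array([4/9] + [1/9]*4 + [1/36]*4)

def equilibrium(rho, ux, uy):
    # Second-order expansion of the Maxwell-Boltzmann distribution
    cu = c[:, 0, None, None] * ux + c[:, 1, None, None] * uy
    usq = ux**2 + uy**2
    return rho * w[:, None, None] * (1 + 3*cu + 4.5*cu**2 - 1.5*usq)

def lbm_step(f, tau):
    # Macroscopic moments: density and velocity
    rho = f.sum(axis=0)
    ux = (f * c[:, 0, None, None]).sum(axis=0) / rho
    uy = (f * c[:, 1, None, None]).sum(axis=0) / rho
    # BGK collision: relax towards the local equilibrium
    f = f + (equilibrium(rho, ux, uy) - f) / tau
    # Streaming on a periodic grid
    for i in range(9):
        f[i] = np.roll(np.roll(f[i], c[i, 0], axis=0), c[i, 1], axis=1)
    return f

# usage: small density perturbation at rest on a 32x32 periodic grid
nx = ny = 32
rho0 = 1.0 + 0.01 * np.sin(2 * np.pi * np.arange(nx) / nx)[:, None] * np.ones((1, ny))
f = equilibrium(rho0, np.zeros((nx, ny)), np.zeros((nx, ny)))
for _ in range(20):
    f = lbm_step(f, tau=0.8)
```

Both the collision and the streaming step conserve mass exactly, which is one reason the method parallelizes so well: the expensive collision is purely local, and streaming touches only nearest neighbours.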

Acceleration of the boundary element library BEM4I using the Intel Xeon Phi coprocessors

Michal Merta

Researcher at IT4Innovations National Supercomputing Center


  • 2011 – present: Ph.D. study at Dept. of Applied Mathematics, Technical University of Ostrava
  • 2009 – 2011: Master’s degree in Computational Mathematics
  • Award: Joseph Fourier Prize 2015

Areas of interest

  • high performance computing
  • Intel Xeon Phi accelerators
  • boundary element method

The boundary element method (BEM) is a counterpart to the finite element method (FEM), widely used in the engineering community for the modelling of physical phenomena. Since it reduces the problem to the boundary of the computational domain, and thus does not require a volume mesh, it is especially suitable for problems stated in unbounded domains (such as sound or electromagnetic wave scattering) or for shape optimization problems. However, several factors make it far less popular than FEM: firstly, the occurrence of singular kernels in the surface integrals makes it necessary to use a sophisticated method for numerical quadrature; secondly, fast BEM techniques are needed to reduce the memory and computational complexity from quadratic to almost linear. Finally, parallelization in shared and distributed memory is crucial to allow the solution of large-scale problems.

The parallel boundary element library BEM4I, developed at IT4Innovations National Supercomputing Center, implements solvers for a wide range of problems, from simple steady-state heat transfer, to linear elasticity, to sound scattering. It is based on well-established algorithms used in the boundary element community (e.g., numerical quadrature of singular kernels and sparsification of system matrices using the adaptive cross approximation), as well as on novel approaches to discretization and parallelization. It is parallelized in a hybrid fashion, using OpenMP in shared memory and MPI in distributed memory. New approaches to distributing the system matrix among computational nodes for steady-state problems and time-dependent scattering problems are implemented.
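The adaptive cross approximation mentioned above builds a low-rank factorization of an admissible matrix block while touching only a few of its rows and columns. A minimal, partially pivoted sketch (illustrative only; BEM4I is a C++ library and its actual implementation differs):

```python
import numpy as np

def aca(get_row, get_col, n_rows, eps=1e-6, max_rank=30):
    """Partially pivoted adaptive cross approximation.

    Builds a low-rank factorization A ~ U @ V while accessing A only
    through individual rows (get_row) and columns (get_col)."""
    U, V, used = [], [], set()
    i, frob2 = 0, 0.0
    while len(U) < max_rank:
        used.add(i)
        r = get_row(i)
        for uu, vv in zip(U, V):
            r = r - uu[i] * vv            # subtract current approximation
        j = int(np.argmax(np.abs(r)))
        if abs(r[j]) < 1e-14:             # row already well approximated
            break
        v = r / r[j]                      # normalized pivot row
        c = get_col(j)
        for uu, vv in zip(U, V):
            c = c - vv[j] * uu
        U.append(c)
        V.append(v)
        frob2 += np.dot(c, c) * np.dot(v, v)   # rough norm estimate
        if np.linalg.norm(c) * np.linalg.norm(v) <= eps * np.sqrt(frob2):
            break
        cand = np.abs(c)
        cand[list(used)] = -1.0           # next pivot row: unused row with the
        i = int(np.argmax(cand))          # largest entry in the new column
    return np.array(U).T, np.array(V)

# Example: smooth kernel between two well-separated point clusters
n = 100
x = np.linspace(0.0, 1.0, n)
y = np.linspace(2.0, 3.0, n)
A = 1.0 / np.abs(x[:, None] - y[None, :])
U, V = aca(lambda i: A[i].copy(), lambda j: A[:, j].copy(), n)
err = np.linalg.norm(A - U @ V) / np.linalg.norm(A)
```

For a smooth kernel between well-separated clusters the rank needed for a fixed accuracy grows only logarithmically with the block size, which is what reduces the quadratic storage of the dense block to almost linear.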

Moreover, we focus on the use of modern Intel-based technologies to accelerate the computation. We exploit the SIMD capabilities of current processors to vectorize the computationally most demanding parts of the code. This is one of the essential steps to ensure efficient evaluation on the CPU, but it becomes even more important when utilizing the Intel Xeon Phi coprocessors with their 512-bit vector registers. In the poster we focus on the acceleration of the code using the offload feature of the Intel Xeon Phi coprocessors. We provide the results of numerical experiments performed on the Salomon supercomputer located at IT4Innovations NSC, which is equipped with 864 accelerator cards, making it the largest installation of this type in Europe. The numerical results show a significant reduction in computational time when using optimized code accelerated by the coprocessors.

Tuning the performance of the code on the current generation of Xeon Phi coprocessors (Knights Corner) will make the transition to the next, far more sophisticated and powerful generations of the accelerators (Knights Landing and Knights Hill) significantly easier. These future generations will be provided not only as additional PCIe cards but also as standalone host processors with more than 60 cores and 3 TFLOP/s of compute performance. With this performance and increasing power efficiency, they represent one of the possible ways towards exascale supercomputers. We aim to incrementally optimize our code for the new architectures and enable the solution of large-scale engineering problems, e.g., in the areas of noise prediction and reduction, shape optimization, or non-destructive testing.

Magnetic Nulls in Kinetic Simulations of Space Plasmas

Vyacheslav Olshevsky is a research associate at the Centre for mathematical Plasma-Astrophysics. He represents a broad community of researchers performing kinetic Particle-in-Cell simulations of space plasmas at several outstanding institutions across the world. Vyacheslav's research is focused on the numerical modeling of magnetized space plasmas in both kinetic and fluid approaches; in particular, on Particle-in-Cell simulations of magnetic reconnection in turbulent plasma. Vyacheslav has more than ten years of experience in high-performance computing using the largest European and American supercomputers. During his early career he used and developed numerical codes applied mainly in solar physics: radiative transfer, magnetohydrodynamic wave propagation, and turbulent magneto-convection. His main interests now are scientific data analysis and visualization, and code development. Vyacheslav collaborates with a number of outstanding research institutions around the world: the Institute of Astrophysics on the Canary Islands, Utrecht University, Stanford University, the Swedish Institute of Space Physics and the KTH Royal Institute of Technology. His work is published in peer-reviewed journals and presented at international conferences.

Magnetic nulls are particular regions in space where the magnetic field vanishes, allowing the fascinating phenomenon of magnetic reconnection to occur. Magnetic reconnection is the primary means by which space plasmas release accumulated magnetic energy, producing solar flares, coronal mass ejections and magnetic storms. Aurorae (polar lights) are perhaps the most prominent signatures of magnetic reconnection in the Earth's magnetosphere.

This contribution represents a joint effort of several groups across the world to examine magnetic reconnection in different numerical simulations, whose total cost amounts to tens of millions of CPU hours. We study electromagnetic energy conversion at magnetic nulls in massively parallel kinetic Particle-in-Cell simulations of space plasmas. In this unprecedented study, three-dimensional simulations representing various regions of space are addressed: the Earth's magnetotail and magnetosheath, dipolar and quadrupolar planetary magnetospheres, and lunar magnetic anomalies.

Strikingly, magnetic nulls often do not indicate the locations of intense energy dissipation. The so-called X-lines (classical magnetic reconnection sites) are rather stable and do not exhibit much energy conversion. However, in reconnection outflows and in turbulent space plasmas with strong electric currents, nulls of a particular type, the spiral nulls, are likely to form. In the majority of the simulations the number of spiral nulls exceeds the number of radial nulls by a factor of 3-9, in accordance with recent observations.
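The distinction between radial and spiral nulls follows from the eigenvalues of the magnetic-field Jacobian at the null: because div B = 0 the eigenvalues sum to zero, and a complex-conjugate pair means field lines wind around the null. A small illustrative check (not the analysis pipeline used in the simulations):

```python
import numpy as np

def classify_null(grad_b, tol=1e-12):
    """Classify a 3D magnetic null from the Jacobian grad B evaluated there.

    Since div B = 0, the three eigenvalues sum to zero. If a
    complex-conjugate pair is present, field lines wind around the null
    (spiral type); purely real eigenvalues give a radial null."""
    lam = np.linalg.eigvals(grad_b)
    return "spiral" if np.max(np.abs(lam.imag)) > tol else "radial"

# Radial null: B = (x, y, -2z), eigenvalues 1, 1, -2 (all real)
radial = np.diag([1.0, 1.0, -2.0])

# Spiral null: a strong current along z adds rotation in the x-y plane,
# eigenvalues 1 +/- 3i and -2
spiral = np.array([[ 1.0,  3.0,  0.0],
                   [-3.0,  1.0,  0.0],
                   [ 0.0,  0.0, -2.0]])

print(classify_null(radial))   # prints: radial
print(classify_null(spiral))   # prints: spiral
```

The strong off-diagonal (current-carrying) part in the spiral case is exactly the signature associated with the flux ropes discussed below.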

Powerful energy dissipation is detected in the vicinity of spiral nulls enclosed by magnetic flux ropes carrying strong currents. These flux ropes efficiently dissipate magnetic energy through secondary instabilities. This possible new mechanism of efficient energy dissipation in space plasmas could break the paradigm of magnetic reconnection that has dominated space physics for more than half a century.

A Novel CPS simulation framework demanding heterogeneous HPC systems

Dr. Ioannis Papaefstathiou is a Technical Manager at Synelixis and an Associate Professor at the Department of Electronic and Computer Engineering at the Technical University of Crete. He works on design and implementation methodologies for HPC as well as for CPS with tightly coupled design parameters and highly constrained resources. He was granted a PhD degree in computer science by the University of Cambridge, UK, in 2001, an M.Sc. degree (ranked 1st) by Harvard University, Cambridge, MA, in 1996 and a B.Sc. degree (ranked 2nd) by the University of Crete, Greece, in 1996. From 1994 to 1996 he was a VLSI systems engineer at ICS-FORTH; from 1997 to 2000 he was a Research Associate at the Systems Research Group, Computer Laboratory, University of Cambridge, and a hardware designer for an ATM microelectronics company (later acquired by Virata). From 2001 to 2005 he was the manager of the Crete R&D department of Ellemedia Technologies, a microelectronics company closely affiliated with Lucent's Bell Labs. He has published more than 100 papers in IEEE-sponsored journals and conferences and has been the prime Guest Editor for an issue of IEEE Micro Magazine. He has served as a scientific evaluator for the Commission of the European Communities (FP6 and FP7-IST), as well as for the Greek General Secretariat for Research and Technology. He has participated in numerous European R&D projects within several EC programmes (ESPRIT, FP6, FP7, H2020); in total, over the last 5 years he has been Principal Investigator in 12 competitively funded research projects in Europe (technical manager in 7 of them), with a cumulative budget share exceeding €4.2 million.

Cyber Physical Systems (CPS) are considered "The Next Computing Revolution" after mainframe computing (60s-70s), desktop computing and the Internet (80s-90s) and ubiquitous computing (00s). CPS are growing in capability at an extraordinary rate, driven by the increased presence and capabilities of electronic control units as well as of sensors, actuators and the interconnecting networks. In order to meet the growing requirements of automated and assisted features and to manage the inherent complexity, developers have to model and simulate these sophisticated systems at all design stages; simulation has proven itself over the last 15 years by providing an analytical approach to solving complex problems.

One of the main problems CPS designers face is the lack of simulation tools and models for system design and analysis, as well as of the processing power necessary for executing CPS simulations. The "Novel, Comprehensible, Ultra-Fast, Security-Aware CPS Simulation" (COSSIM) framework addresses all those needs by providing an open-source system that seamlessly simulates, in an integrated way, both the networking and the processing parts of a CPS, provides significantly more accurate results, especially in terms of power consumption, than existing solutions, and reports the security levels of the simulated CPS, which are critical for many applications. In summary, COSSIM is the first known framework that allows for the simulation of a complete CPS utilizing complex SoCs interconnected with sophisticated networks.

In order to execute this complex task, COSSIM needs extreme amounts of processing resources and computation time, so as to simulate a complete CPS at a low level (e.g. simulating the execution of the complete software stack, including the operating system, on a target platform at a close-to-cycle-accurate level). This processing power can only be provided by a heterogeneous HPC system consisting of high-end CPUs efficiently interconnected with high-end FPGAs and fast, large memories; this is facilitated by the fact that certain parts of COSSIM are to be implemented in reconfigurable resources. COSSIM thus presents a new, very demanding, as well as important, application domain for future HPC systems: CPS simulations.

The COSSIM simulator will enable new business models for numerous service providers, which will be able to utilize the unique features and the simulation speed provided by the HPC-run COSSIM framework to deliver sustainable, high-quality services based on novel CPS infrastructures to citizens at home, on the road and everywhere. Moreover, the technology implemented in COSSIM will enable the development of applications in a number of areas that utilize CPS; for example, new ambient assisted living, surveillance and security services will be available anywhere and at any time, increasing the safety and well-being of European citizens. Finally, the speed and accuracy of COSSIM when executed on an HPC platform will allow the development of novel CPS with great environmental impact, such as those in energy-efficient buildings, ecologically aware energy grids, agricultural applications, etc.

DRAGON: High Performance Computing of the flow past a circular cylinder from critical to supercritical Reynolds numbers

Prof. Ivette Rodríguez Pérez holds a PhD in Mechanical Engineering (UPC, 2006). She has been an Associate Professor at the Heat Engines Department and a senior researcher at the Heat and Mass Transfer Technological Centre (CTTC) of the Universitat Politècnica de Catalunya - BarcelonaTech (UPC) since 2009. She has participated in 14 national and EU funded projects on mathematical modelling and optimisation in aerodynamics and in thermal systems and equipment (turbulence modelling, HVAC systems, CSP and building efficiency, etc.), in 13 High Performance Computing (HPC) research projects on aerodynamics, one of them a Tier-0 PRACE project granted 23M core hours, and in 4 projects with industrial partners, two of them as coordinator. Her research interests are focused on computational fluid dynamics and heat transfer applied to transitional and turbulent flows, as well as on the numerical simulation of aerodynamics and of thermal systems and equipment applied to different fields. She has supervised two PhD theses and is the author of 30 papers in JCR journals, more than 70 contributions to peer-reviewed international conferences and 6 patents. Since 2004, she has been involved in teaching activities such as Solar Thermal Energy, Thermal and Thermochemical Energy Storage, Heat and Mass Transfer, and Numerical Methods courses in several graduate and undergraduate programs at UPC.

This work is set in the context of the numerical simulation of turbulent flow past bluff bodies, which is of importance due to its presence in many engineering applications (e.g. flows past an airplane, a submarine or an automobile, in turbomachines, etc.). It is part of a long-term strategy devoted to the study of massively separated flows.

The flow past a circular cylinder is a canonical case which, in spite of its simple geometry, is characterised by flow separations, transition to turbulence in the shear layers (SL) and shedding of vortices. It is well known that at Reynolds numbers of about 2x10^5 the transition from the laminar to the turbulent regime moves towards the cylinder, occurring just after separation. The range of critical Reynolds numbers up to 5x10^5 is characterised by a rapid decrease of the drag coefficient with the Reynolds number (critical regime); this rapid decrease is also known as the drag crisis. A laminar separation bubble (LSB), similar to those observed on airfoils at incidence at low-to-moderate Reynolds numbers, is also a characteristic trait of the flow pattern at critical and supercritical Reynolds numbers. In the critical transition, an LSB appears on one side of the cylinder, producing asymmetries in the forces acting on the cylinder and in the near-wake flow. As the Reynolds number increases, a second LSB on the other side stabilises the forces and the wake, as the flow enters the supercritical regime.

In this work, results obtained from large eddy simulations (LES) of the flow past smooth cylinders at critical and supercritical Reynolds numbers will be presented. The work was performed within a High Performance Computing (HPC) project awarded 23 million CPU hours by the Partnership for Advanced Computing in Europe (PRACE): "DRAGON: Understanding the DRAG crisis: ON the flow past a circular cylinder from critical to trans-critical Reynolds numbers (Project No. 2012071290)". The results obtained showed the robustness of the methodology used and allowed us to gain insight into the complex physics taking place at these high Reynolds numbers. Because drag reduction is closely related to energy savings, the mechanisms associated with the drag crisis phenomenon are an interesting object of investigation. In this research, aspects of the complex physics involved, such as the change in the mechanism by which the boundary layer transitions to turbulence in the critical regime, have been studied.

Role of helicity in three dimensional turbulence

Name: Ganapati Sahoo

Current Position: Post-doctoral Researcher under ERC grant “NewTurb”

Qualification: PhD in Physics.

Areas of interest: Statistical properties of fluid and magnetohydrodynamic turbulence. Study of intermittency and energy transfers in two- and three-dimensional turbulent flows. Numerical and experimental investigation of Lagrangian particles in fluid flows.

Skills: Direct numerical simulations and shell models in fluid and magnetohydrodynamic (MHD) turbulence. Experiments in fluid turbulence.

Turbulence plays an important role in a wide range of scientific and industrial applications. Being a complex, multi-scale nonlinear phenomenon, it is regarded as one of the most difficult problems in classical physics. The behavior of turbulent flows can be changed by changing the nature of the external force or the confining geometry, which essentially breaks some of the symmetries of the ideal homogeneous and isotropic flow. The dimensionless control parameter of a turbulent flow is the Reynolds number (Re), which attains very high values, such as 10^6-10^10, in many geophysical, atmospheric and industrial flows. In three-dimensional flows the number of degrees of freedom grows as Re^(9/4). The availability of high performance computing (HPC) makes it possible to carry out direct numerical simulations (DNS) of many such flows and make predictions about their statistical properties.
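The Re^(9/4) growth quoted above is the standard Kolmogorov estimate: resolving all scales from the integral length down to the dissipation scale requires about Re^(3/4) grid points per direction, hence Re^(9/4) in total. A quick back-of-the-envelope check:

```python
# Kolmogorov estimate of DNS resolution requirements in 3D:
# ~Re^(3/4) grid points per direction, ~Re^(9/4) in total.
for Re in (1e4, 1e6, 1e10):
    per_dim = Re ** 0.75
    total = Re ** 2.25
    print(f"Re = {Re:.0e}: ~{per_dim:.1e} points/direction, ~{total:.1e} total")
```

Already at Re ~ 10^6 the total count is a few times 10^13 degrees of freedom, which is why DNS at geophysical Reynolds numbers remains out of reach and HPC resources are essential.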

Numerical simulations, in addition, make it possible to selectively break symmetries of the Navier-Stokes equations with constraints such as helicity. In a recent simulation [1] of a decimated version of the incompressible three-dimensional Navier-Stokes equations, in which helicity was kept sign-definite using a helical projection, a reversal of the energy cascade similar to that of the two-dimensional Navier-Stokes equations was observed. Sign-definite helicity breaks the parity symmetry of the flow, one of the important symmetries contributing to the forward energy cascade in the three-dimensional Navier-Stokes equations. In our study we measure the degree to which parity symmetry controls the direction of the cascade. We introduce a mechanism in which parity is broken stochastically, but in a time-frozen manner, with helical constraints: we keep the triadic interactions in Fourier space involving modes with a definite sign of helicity and decimate the triads involving modes of the opposite sign of helicity with a fixed probability. We studied the cascade of energy in three-dimensional turbulence by changing the relative weight between positive- and negative-helicity modes [2, 3]. The results show that the presence of oppositely polarized helical modes is critical for the direct energy cascade from large scales to small scales.
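For a single Fourier mode, the helical decomposition and a frozen stochastic mask of the kind described can be sketched as follows. This is an illustrative toy, not the pseudo-spectral DNS code; in the actual study the decimation acts on entire sets of triads, and all names here are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def helical_basis(k):
    """Helical vectors h+ and h- for wavevector k (Waleffe-type
    decomposition); both are orthogonal to k, so any incompressible
    Fourier mode is a combination a+ h+ + a- h-."""
    khat = k / np.linalg.norm(k)
    ref = np.array([0.0, 0.0, 1.0])
    if abs(khat @ ref) > 0.99:        # avoid a reference parallel to k
        ref = np.array([1.0, 0.0, 0.0])
    nu = np.cross(ref, khat)
    nu /= np.linalg.norm(nu)
    e = np.cross(khat, nu)
    return e + 1j * nu, e - 1j * nu   # h+, h- (each with |h|^2 = 2)

def decimate_mode(u, k, keep_minus_prob):
    """Frozen stochastic helical mask for a single mode: the h+
    component is always kept, the h- component survives with
    probability keep_minus_prob (decided once, then frozen in time)."""
    hp, hm = helical_basis(k)
    a_plus = (u @ np.conj(hp)) / 2.0
    a_minus = (u @ np.conj(hm)) / 2.0
    if rng.random() > keep_minus_prob:
        a_minus = 0.0
    return a_plus * hp + a_minus * hm

k = np.array([1.0, 2.0, 2.0])
u = np.cross([0.3, -0.5, 0.4], k).astype(complex)  # u is perpendicular to k
v = decimate_mode(u, k, keep_minus_prob=0.0)       # fully sign-definite case
assert abs(v @ k) < 1e-12                          # still divergence-free
```

With keep_minus_prob = 1 the mask is the identity on incompressible modes; with keep_minus_prob = 0 it reproduces the fully sign-definite helical projection of Ref. [1], and intermediate values change the relative weight of the two polarizations.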

We present results from our recent high-resolution DNSs using grid sizes up to 1024^3. Our study gives deeper insight into the triadic interactions in three-dimensional turbulent flows, which are responsible for the transfer of energy and the formation of structures in such flows.


[1] L. Biferale, S. Musacchio, and F. Toschi, "Inverse energy cascade in three-dimensional isotropic turbulence," Phys. Rev. Lett. 108, 164501 (2012).

[2] G. Sahoo, F. Bonaccorso, and L. Biferale, "Role of helicity for large- and small-scale turbulent fluctuations," Phys. Rev. E 92, 051002(R) (2015).

[3] G. Sahoo and L. Biferale, "Disentangling the triadic interactions in Navier-Stokes equations," Eur. Phys. J. E 38, 114 (2015).