PRACE Preparatory Access – 16th cut-off evaluation in March 2014

Find below the results of the 16th cut-off evaluation of March 2014 for the PRACE Preparatory Access Call.

Type A – Code scalability testing

Project name: Elucidation of the 20S proteasome activation and inhibition mechanisms by organic molecules: A molecular dynamics study

Project leader: Dr Manthos Papadopoulos; National Hellenic Research Foundation, GREECE
Collaborators: Dr Aggelos Avramopoulos; Dr Georgios Leonis; Dr Konstantinos Papavasileiou
(National Hellenic Research Foundation – GR)
Research field: Chemistry and Materials
Resource awarded: 50,000 core hours on Curie Hybrid @ GENCI@CEA, France;



Abstract: This proposal’s objective is the deployment of molecular dynamics (MD) calculations on various model 20S proteasome benchmark systems, in order to evaluate computational setups that offer enhanced scalability. This know-how will be used to design a comprehensive study for the elucidation of the 20S activation and inhibition mechanisms by small organic molecules. The 20S proteasome is a large multisubunit protease (~750 kDa MW, ~6,400 residues, ~100,000 atoms) that plays a vital role in the ATP-requiring proteolytic pathway, in its 26S form in eukaryotic cells, responsible for the orderly catalytic degradation of most cellular proteins in living organisms. Substrate access to the proteasome interior is regulated by the N-terminal residues of the α subunits, which form a closed entry point at the centre of the α rings. The size of the 20S proteasome system renders computational efforts extremely expensive and technically demanding. Various model 20S benchmark systems will assist in evaluating performance scaling on CPU/GPU hybrid architecture nodes. Our preparatory benchmarks will allow a comparison of different model systems in terms of accuracy, as well as a determination of the trade-off between system size and the associated computational cost. The results of this proposal will be used in support of applications to forthcoming PRACE calls.

Briefly, 20S is in latent form in eukaryotic and mammalian cells, and its activity can be triggered by small organic molecule activators, among which Sodium Dodecyl Sulfate (SDS) – an anionic surfactant and common detergent – is prominent and has been extensively used at low concentrations. Therefore, the 20S–SDS complex constitutes an ideal candidate benchmark system both technically and scientifically, since the activation of 20S by SDS has been well established experimentally, but the exact mechanism is not known. Furthermore, experimental studies have shown that when no substrate is present, SDS inactivates 20S irreversibly. This behaviour implies a dual functionality of molecules such as SDS: initial activation followed by subsequent inhibition of the proteasome. To date, there have been no theoretical studies attempting to describe this dual-functionality interaction (initial activation and subsequent inhibition) of the 20S proteasome with SDS. Specifically, we shall perform MD simulations by means of the AMBER software in order to study: i) The 20S pore opening mechanism by SDS and its effect on the catalytic active centres located at the adjacent β ring (activation). Experimental studies indicate that pore opening gives rise to allosteric effects, leading to conformational changes in the catalytic subunits; we aim to perform benchmark computations so as to determine proper model systems for the description of this process. ii) The interaction of these molecules with the proteolytic sites (inhibition). It has been reported that although SDS is a known 20S activator, increased concentrations could reverse its functionality and render its action inhibitory, halting all peptidase activities. The cause of the 20S activity loss is not known. It is hypothesized, however, that it could be attributed to the binding of these molecules to the β ring active sites. The proposed benchmark computations will be conducted on the basis of this assumption.

Project name: Ab-initio Molecular Dynamics simulations of sorbed molecules within nanoporous materials: CP2K scalability tests with double hybrid functionals

Project leader: Prof. Pierfranco Demontis; University of Sassari, ITALY
Collaborators: Dr. Andrea Gabrieli; Dr Marco Sant; Prof. Giuseppe Suffritti
(University of Sassari – IT)
Research field: Chemistry and Materials
Resource awarded: 50,000 core hours on Curie FN @ GENCI@CEA, France; 50,000 core hours on Curie Hybrid @ GENCI@CEA, France; 50,000 core hours on Curie TN @ GENCI@CEA, France;



Abstract: The aim of this project is to test the scalability of the CP2K code in an HPC environment in order to apply for a PRACE regular access project. Within this research we want to study the dynamics of molecules sorbed inside various nanoporous materials (e.g., zeolites) via ab-initio Molecular Dynamics using double hybrid functionals.

The trajectories collected with these simulations, besides giving new insight on the physics of molecules in tight confinement, will be used to develop reliable classical Molecular Dynamics (MD) force fields via the “force matching” technique, enabling accurate large scale simulations.

Project name: MuLCO

Project leader: Dr Fabio Bernardini; University of Cagliari, ITALY
Collaborators: Mr. Pietro Bonfà
(University of Parma – IT)
Research field: Fundamental Physics
Resource awarded: 50,000 core hours on MareNostrum @ BSC, Spain; 50,000 core hours on Curie TN @ GENCI@CEA, France;



Abstract: Muon spin rotation and relaxation spectroscopy is the experimental technique best suited to study materials in which the constituent atoms have small magnetic moments and/or the magnetism shows short-range order. The technique consists of implanting spin-polarized muons that, after a slowdown process, stop at interstitial sites and, upon decay, emit a positron preferentially along the muon spin direction. Detection of the emitted positrons makes it possible to study the rotation of the muon spin induced by the microscopic magnetic field and the hyperfine interactions. Using symmetry considerations and the value of the magnetic field deduced from the muon spin rotation, it is possible to reconstruct the direction and magnitude of the magnetic moments of each atomic species in the material studied. A key factor for an accurate determination of the magnetic moments is exact knowledge of the muon position at the interstitial site. Recently, a big step forward was made using Density Functional Theory (DFT) to compute the muon interstitial site and its wave-function spread in crystalline materials. The success of the DFT approach is demonstrated by a series of successful applications in metals and insulating
fluorides and oxides [1-5]. Additionally, DFT provides information on whether muons act only as spectators or substantially perturb the system. Indeed, it is very important to take into account the effect of the muon perturbation when studying the physical properties of systems that lie on the verge of phase transitions. With this project we want to apply DFT to study muons in correlated magnetic oxides, a class of materials that is intensively studied with µSR experiments. The final aim of this project is to obtain an accurate description of the interactions between the muon and the sample in experimentally well-characterized cases and ultimately provide a strategy to identify the interstitial sites and the perturbation introduced by the muon in µSR experiments.
[1] F. Bernardini et al., Phys. Rev. B 87, 115148 (2013); J. S. Möller et al., Phys. Rev. B 87, 121108 (2013).
[2] S. J. Blundell et al., Phys. Rev. B 88, 064423 (2013).
[3] J. S. Möller et al., Physica Scripta 88, 068510 (2013).
[4] R. De Renzi et al., Superconductor Science and Technology 25, 084009 (2012).
[5] M. Bendele et al., Phys. Rev. B 85, 064517 (2012).

Project name: Amyloid beta-protein aggregation at experimental concentration in explicit solvent

Project leader: Dr. Bogdan Barz; Forschungszentrum Juelich, GERMANY
Collaborators: Jun.-Prof. Birgit Strodel
(Forschungszentrum Juelich – DE)
Research field: Medicine and Life Sciences
Resource awarded: 50,000 core hours on Curie FN @ GENCI@CEA, France; 50,000 core hours on Curie Hybrid @ GENCI@CEA, France; 50,000 core hours on Curie TN @ GENCI@CEA, France; 50,000 core hours on HERMIT @ GCS@HLRS, Germany; 100,000 core hours on SuperMUC @ GCS@LRZ, Germany;



Abstract: Aberrant protein aggregation is one of the main causes of the onset of many neurodegenerative diseases, such as Alzheimer’s disease (AD) or Parkinson’s disease. In the case of AD, the protein involved is amyloid beta, with two main alloforms of 40 (AB40) and 42 (AB42) amino acids. Many experimental studies investigate the aggregation of AB40 and AB42 into oligomers and fibrils, generally at the macroscopic level. A few of them have derived structural models of small oligomers (mainly tetramers), which are thought to be the toxic agents, but these do not provide enough information to design inhibitors that would prevent the disease. Computational studies following the aggregation of several monomers into oligomers are few and often resort to coarse-grained models combined with an implicit solvent at high solute concentrations. Ideally, one should explore the early aggregation of AB40 and AB42, and the differences between their aggregation pathways, using all-atom molecular dynamics in explicit solvent, close to experimental concentrations, with the state-of-the-art parallel molecular dynamics software GROMACS. The presence of explicit solvent as well as the low concentration is expected to have a great impact on the aggregation process and the resulting oligomer structures. However, such systems are very large and require careful testing for an efficient use of computational resources. In this project we will fine-tune and benchmark the performance of GROMACS for two systems containing 10 and 20 monomers of AB42, with a total of 2.1 and 4.4 million atoms, respectively.

Project name: Scalability of various flow solvers for turbulent subsonic jet simulation

Project leader: Dr. Peter Wassermann; Robert Bosch GmbH, GERMANY
Research field: Engineering and Energy
Resource awarded: 50,000 core hours on HERMIT @ GCS@HLRS, Germany;



Abstract: In the Corporate Sector Research and Advanced Engineering of Robert Bosch, our highly specialised employees work on innovative ideas and technological breakthroughs. New ideas are constantly taking shape that make existing products even more efficient, more comfortable, safer and more environment-friendly, while also opening up entirely new lines of business. Concurrently, market cycles are growing shorter, cost pressure is rising and Bosch products are growing increasingly complex.

To ensure the performance and reliability of our products and remain competitive, a highly simulation-driven, virtual product development process is essential throughout all development stages. Especially in the field of fluid dynamics, elaborate numerical methods are used for the assessment of early concept ideas as well as for the optimization of final component designs.

Besides the common RANS approaches, we are also applying high-fidelity, scale-resolving numerical methods to capture the detailed, highly transient flow physics when indicated by the dominating phenomena, e.g. for ICE combustion, aeroacoustics and transitional or cavitating flows. Here, resolving the relevant scales in space and time is essential for a proper component design and control of the working physics. On the one hand, these methods come with high computational effort; on the other hand, we always endeavor to shorten the simulation turnaround time. Therefore, both extended HPC resources and highly efficient, scalable numerical methods are necessary.

Up to now, our detailed Large Eddy Simulations have typically been limited by the scalability of the established codes and the HPC resources available. To provide a basis for future strategic decisions with respect to using HPC resources at HLRS, the performance and scalability of the latest versions of the codes in use should be evaluated.

The subjects of investigation are the flow dynamics and aeroacoustics of a turbulent subsonic jet.

Project name: Performance and accuracy of the linear-scaling DFT method applied to a complex metal oxide surface

Project leader: Prof Rubén Pérez; Universidad Autónoma de Madrid, SPAIN
Collaborators: Dr Milica Todorovic
(Universidad Autónoma de Madrid – ES)
Research field: Chemistry and Materials
Resource awarded: 10,000 MIC core hours on MareNostrum Hybrid Nodes @ BSC, Spain; 50,000 core hours on Curie TN @ GENCI@CEA, France;



Abstract: Promising candidates for future organic solar cell and transistor devices involve complex nanoscale structures, with thin films of organic molecules and metal oxides layered between metallic electrodes. In order to improve their design and efficiency, it is important to understand the internal structure and properties of the multilayers and their interfaces.

In this computational PRACE project, we prepare for a large-scale computational study of organic molecules on the (101) surface of TiO2 anatase, a material of interest for industrial applications. Metal oxides and their surfaces feature complex electronic structure and require accurate computational methods, but simulations of realistic large-scale systems further demand an innovative and highly parallelised computational approach. We employ a linear-scaling DFT method implemented in the OpenMX code to simulate large-scale surface reconstructions, surface defects and molecule-surface interactions with a
high degree of accuracy. The aim of this preparatory work is to evaluate the level of accuracy required and explore code performance on anatase surface benchmark systems.

Simulation quality will be verified against accurate computational data and experimental results. This work is carried out in collaboration with the experimental group of Dr. O. Custance (NIMS, Japan) and with Prof. T. Ozaki (JAIST, Japan).

Project name: Studies of magnetohydrodynamics instabilities in hypermassive neutron stars

Project leader: Prof Luciano Rezzolla; Institute for Theoretical Physics, Frankfurt am Main, GERMANY
Collaborators: Dr Daniela Alic; Mr. Filippo Galeazzi; Mr. Federico Guercilena; Dr Kentaro Takami; Dr Antonios Tsokaros; Dr Bruno Mundim
(Institute for Theoretical Physics, Frankfurt am Main – DE; Max Planck Institute for Gravitational Physics – DE)
Research field: Astrophysics
Resource awarded: 100,000 core hours on SuperMUC @ GCS@LRZ, Germany;



Abstract: During the past decades, numerous astronomical observations across the entire electromagnetic spectrum (gamma-rays, X-rays, visible light, radio waves) have allowed for the detection of some of the most energetic phenomena in the universe, such as gamma-ray bursts (GRBs) and supernovae, and for the discovery of compact objects, pulsars, active galactic nuclei (AGNs), etc. Thanks to the recent progress in numerical simulations and the significant increase in computational capabilities, numerical relativity (investigating these phenomena via general relativistic numerical simulations) has become the leading tool for shedding light on the structure of these objects and the physical laws behind such spectacular events.

Despite decades of observations, a final consensus has not yet been reached on the exact mechanisms behind short gamma-ray bursts (GRBs), which are thought to originate from binary neutron star mergers. Numerical relativity simulations of compact binaries provide a promising instrument to unravel the physical processes powering a short GRB. In addition, compact binary mergers are among the most promising sources of gravitational waves (GWs), whose first detection is expected just a few years from now. Numerical relativity is essential in order to predict the GW signals, thereby enabling their extraction from the noise of the detectors. Moreover, maximizing the scientific outcome of a GW detection will require the identification and study of coincident electromagnetic (EM) counterparts. With event rates for the NS-NS binary scenario predicted in a wide range (between 0.4 and 400 per year), finding an association between a GW and an EM signal through short GRBs and their afterglows or precursors has recently become one of the most interesting open problems in contemporary astronomy, as it provides yet another testbed for general relativity.

Considering that short GRBs are believed to be triggered after the magnetized NS binary collapses into a BH with a surrounding thick hot torus, most of the simulations proposed in this project deal either with the fate of the hypermassive neutron star (HMNS) formed after the merger or with the “life” of the BH-torus system and the merger ejecta. On the one hand, we are interested in the magnetic field amplification mechanisms and neutrino emission that are thought to power the ultra-relativistic jet. On the other hand, the optical transients powered by the radioactive decay of heavy nuclei in the merger ejecta, referred to as kilonovae, provide another possible EM counterpart to GWs.

Based on micro-physical considerations of both the equation of state describing matter at super-nuclear densities and a realistic coupling of the plasma dynamics to the evolution of EM fields, we propose to perform numerical simulations providing insight into the missing link between ultra-relativistic particle acceleration in collimated, magnetically dominated flows (jets) and the possible connection to GRBs and their astronomical sources.

With our numerical simulations we will be able to study, for the first time, the full evolution of compact binaries including essential elements of neutrino physics, electromagnetism and GWs with high-order numerical methods.

Type B – Code development and optimization by the applicant (without PRACE support)

Project name: 3IP-WP8 PCP – Benchmark validation

Project leader: Mr Eric Boyer; CINES, FRANCE
Research field: Mathematics and Computer Science
Resource awarded: 100,000 core hours on Curie Hybrid @ GENCI@CEA, France; 200,000 core hours on Curie TN @ GENCI@CEA, France; 250,000 core hours on JUQUEEN @ GCS@Jülich, Germany;



Abstract: This project will enable validation of the 3IP-WP8 PCP benchmarks. This requires porting and running the test cases, and ensuring the reliability and stability of the application.

Project name: Time evolution of ultracold binary mixtures in optical lattices

Project leader: Dr. Pavel Soldán; Charles University in Prague, CZECH REPUBLIC
Collaborators: Mr. Miroslav Urbanek
(Charles University in Prague – CZ)
Research field: Fundamental Physics
Resource awarded: 200,000 core hours on Curie FN @ GENCI@CEA, France; 200,000 core hours on Curie TN @ GENCI@CEA, France; 250,000 core hours on SuperMUC @ GCS@LRZ, Germany;



Abstract: An optical lattice is a relatively new experimental device that opens up huge possibilities for examining fundamental microscopic laws. It has the potential to answer several unsolved questions of theoretical physics that still resist all attempts to solve them. It has also been proposed as a possible realization of a quantum computer. Experiments in optical lattices make use of ultracold atoms that are confined in a periodic structure created by laser light. Currently, suitable atoms are mainly alkali-metal and alkaline-earth-metal atoms. An important further step is the shift to experiments with atomic mixtures, which allow for rich quantum systems, for example the formation of polar molecules or the mixing of bosons and fermions in optical lattices. The aim of the proposed project is to study the behaviour of such mixtures and to calculate their properties numerically. We aim to find the allowed states of selected systems and to determine their time evolution as the lattice parameters change.

Project name: Simulations of dynamics of partially ionized solar atmosphere

Project leader: Dr. Elena Khomenko; Instituto de Astrofísica de Canarias, SPAIN
Collaborators: Dr. Angel de Vicente; Dr. Manuel Luna; Dr. Nikola Vitas
(Instituto de Astrofísica de Canarias – ES)
Research field: Astrophysics
Resource awarded: 250,000 core hours on MareNostrum @ BSC, Spain; 200,000 core hours on Curie FN @ GENCI@CEA, France; 200,000 core hours on Curie TN @ GENCI@CEA, France;



Abstract: The objective of the project is to investigate the physical consequences of the very low degree of ionization of the solar atmospheric plasma on its dynamics, and on the energy propagation and release. This low degree of ionization is due to the rapid drop of temperature: the material ascending from sub-photospheric layers encounters a layer of almost neutral gas. This fact has almost always been neglected in the MHD description of the magnetized photospheric and chromospheric plasma, which assumes complete ionization. However, the presence of even a small amount of neutral atoms in the plasma may significantly change its physical properties and dynamics, which will strongly differ from the fully ionized case. To this end, we are performing 2D/2.5D/3D simulations of various dynamical phenomena in the solar atmosphere, going beyond the classical magnetohydrodynamical (MHD) description of solar plasma to take into account non-ideal effects due to the presence of neutral atoms and the weak collisional coupling of different species. Our simulations will include: (1) wave propagation in non-trivial magnetic structures; (2) formation of chromospheric spicules and jets; (3) equilibrium of magnetic structures – sunspots and flux tubes – in the multi-fluid plasma description; (4) formation of and instabilities in prominences; and (5) magneto-convection.

Project name: HPMC – High-Performance Monte Carlo for nuclear reactor safety

Project leader: Dr. Victor Hugo Sanchez Espinoza; Karlsruhe Institute of Technology, GERMANY
Collaborators: Mr. Aleksandar Ivanov; Dr Anton Travleev; Dr Jaakko Leppanen; Dr Eduard Hoogenboom; Dr Jan Dufek
(Karlsruhe Institute of Technology – DE; VTT Technical Research Centre of Finland – FI; Delft Nuclear Consultancy – NL; KTH Royal Institute of Technology – SE)
Research field: Engineering and Energy
Resource awarded: 250,000 core hours on SuperMUC @ GCS@LRZ, Germany;



Abstract: The HPMC project is a European Union-supported project for the development of High-Performance (HP) computing of neutron transport in nuclear reactor cores with the stochastic Monte Carlo (MC) method.

Monte Carlo methods have the major advantage over deterministic methods of exactly representing the complicated reactor core geometry, with over 70,000 separate fuel rods, and the continuous-energy representation of all nuclear data for neutron interactions. Their main drawback is the long computation time needed for statistically accurate results when a detailed power distribution over the reactor core is required.
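The statistical cost behind this drawback follows the generic Monte Carlo 1/√N error law, which a toy estimate makes concrete (a sketch of the principle only, unrelated to the MCNP5 or SERPENT2 codebases): halving the statistical error requires four times as many particle histories.

```python
import numpy as np

rng = np.random.default_rng(42)

def mc_pi(n_histories):
    """Toy Monte Carlo estimate of pi from n_histories random points."""
    pts = rng.uniform(-1.0, 1.0, size=(n_histories, 2))
    return 4.0 * np.mean(np.hypot(pts[:, 0], pts[:, 1]) <= 1.0)

# The standard error of the estimator scales as 1/sqrt(N): going from
# 10^3 to 10^5 histories shrinks the expected error tenfold, which is
# why core-resolved power maps demand enormous numbers of histories.
estimate = mc_pi(1_000_000)
```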

The Monte Carlo method is inherently well suited for parallel execution, and the major general-purpose Monte Carlo codes used in the project, MCNP5 (LANL, USA) and SERPENT2 (VTT, Finland), are designed for both MPI and OpenMP and their combined use for parallelisation. Calculations on a parallel computer show reasonable scalability up to, say, 32 processor cores. With many improvements, reasonable scalability up to about 1,000 processor cores on a dedicated computer was obtained. However, the scalability for 10,000 cores or more is not acceptable. Nonetheless, these large numbers of cores will be needed to perform calculations for full-size reactors within an acceptable time of one or two days.

The Preparatory Access application is intended to improve the scalability of the Monte Carlo codes, in combination with thermal-hydraulic calculations for the temperature distribution, on a top-performance supercomputer using 10,000 processor cores or more, in order to establish whether detailed full-size reactor calculations will be possible in an acceptable time.

Project name: Atomistic simulations of heterogeneous media on the Intel Xeon Phi

Project leader: Dr Daniele Coslovich; Laboratoire Charles Coulomb, FRANCE
Collaborators: Mr. Dwight Smite
(Laboratoire Charles Coulomb – FR)
Research field: Fundamental Physics
Resource awarded: 20,000 MIC core hours on MareNostrum Hybrid Nodes @ BSC, Spain;



Abstract: The study of the physical properties of heterogeneous media, such as porous materials, confined fluids, or gels, represents a broad and active area of research for both practical and fundamental reasons. Molecular dynamics simulations provide an ideal tool to gain microscopic insight into these systems – an insight that remains inaccessible to mesoscopic or macroscopic modeling approaches. However, large-scale simulations are required to describe the wide range of length scales characterizing these materials, hence the need for high-performance computing resources.

Standard strategies to parallelize molecular dynamics simulations on distributed-memory systems can be readily applied to heterogeneous media, but they tend to be rather inefficient. For instance, the local density of a porous material shows strong spatial fluctuations, which lead to severe load imbalance when using a domain decomposition strategy. Simpler approaches, such as atom or force decomposition methods, are unsuitable for large-scale simulations due to their high communication costs. With the advent of many-core architectures, however, different approaches have appeared and may naturally resolve the current computational bottlenecks.
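The imbalance can be illustrated with a small sketch (hypothetical numbers chosen for illustration, not taken from the project's code): under a static slab decomposition of a strongly clustered particle distribution, one domain ends up with several times the average load.

```python
import numpy as np

rng = np.random.default_rng(0)

# Caricature of a porous medium: 90% of the particles sit in 10% of the box.
dense = rng.uniform(0.0, 0.1, size=9_000)
dilute = rng.uniform(0.1, 1.0, size=1_000)
positions = np.concatenate([dense, dilute])

# Static 1D domain decomposition into 8 equal slabs.
counts, _ = np.histogram(positions, bins=8, range=(0.0, 1.0))

# Ratio of the busiest domain to the average; 1.0 would be perfect balance.
imbalance = counts.max() / counts.mean()
```

With these numbers the slab covering the dense region holds roughly seven times the average particle count, so seven of the eight processors would sit mostly idle each force step.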

In this project, we want to implement a new parallelization scheme to simulate heterogeneous materials that may perform well on Intel Xeon Phi coprocessors. Compared to other many-core architectures (GPUs), the Intel Xeon Phi offers a simpler programming model, which enables incremental parallelism and quicker code development. We will build on a flexible, in-house simulation code and implement a task-based parallelization scheme, which is well suited for shared-memory, many-core architectures. We will strive to fully exploit the vector capabilities of the Intel Xeon Phi, since this is crucial to achieve optimal performance. We believe that the developed code will be able to outperform existing, more sophisticated packages in the simulation of complex phenomena occurring in heterogeneous media, e.g. gel formation, flow through porous matrices, or the spinodal decomposition of viscous liquids.

Project name: Porting the Cosmological RT code CRASH to HPC

Project leader: Dr Luca Graziani; OAR – Observatory of Rome, ITALY
Collaborators: Dr. Benedetta Ciardi; Miss Nitya Hariharan; Mr Koki Kakiichi
(Max Planck Institute for Astrophysics – MPA Garching – DE)
Research field: Astrophysics
Resource awarded: 200,000 core hours on Curie FN @ GENCI@CEA, France; 200,000 core hours on Curie TN @ GENCI@CEA, France;



Abstract: The project will adapt, optimize and test the new version 4 of the cosmological radiative transfer code CRASH (Graziani, Maselli, Ciardi MNRAS 2013) on HPC facilities, by applying a combination of OpenMP, MPI and functional parallelisation strategies.

CRASH4 is a modular and multi-purpose RT code suitable for RT simulations in both cosmological and interstellar media. It is based on a combination of long-characteristics ray tracing and Monte Carlo sampling techniques on a regular Cartesian grid. The code accounts for parallel radiation propagation in multi-frequency bands (UV, soft X-ray, Lyman alpha) and self-consistent ionization of the atomic species composing the gas: H, He, and atomic metals. The parallel architecture of the new CRASH4 has been presented at the CSC 2013 supercomputing conference (see: …).

While the current CRASH4 implementation (Graziani, Ciardi MNRAS 2014, in prep.) runs successfully on small clusters (e.g. the hybrid cluster of the Max Planck Institute, MPA-Garching), it needs further development to work efficiently on large Intel-based HPC facilities, where it can perform the large-scale reionisation simulations necessary to interpret the new set of observational data coming from the next generation of radio telescopes (e.g. LOFAR or SKA). The high-resolution domain grids required to resolve structures in the gas distribution, and the millions of sources present at these scales, require HPC-class resources and Big Data management techniques. Because the CRASH4 algorithm is a Monte Carlo-based scheme, up to 10^8 photon packets per source must be emitted to ensure convergence of a run, further increasing the computational requirements of a single CRASH simulation.

The HPC porting project is structured in three steps: I) Enhancement of the synergy between the OpenMP modules implementing photon emission and propagation in different spectral bands, and the parallel scheme solving the chemical network in each domain cell. This step will be tuned to the high computational cost of the scattering processes induced by the Lyman alpha band and the resulting gas heating at high redshifts. II) Integration of the modular scheme described in I) with the existing MPI implementation required for large domain decompositions. In fact, at low redshifts the gas becomes fully transparent in its hydrogen component and the photons increase their mean free path, traveling across the entire domain and involving a number of cells proportional to the adopted grid resolution. This step will target grid resolutions higher than 1024 cells per cube side. III) Implementation of Big Data strategies (mainly through the adoption of Hadoop) to manage the huge set of input and output data resulting from the increase in computational power obtained by I) and II).

Project name: Optimization of GWIMP-COMPSs, an efficient parallel computing framework for Genome-Wide IMPutation and association studies to identify novel molecular mechanisms for complex genetic diseases

Project leader: Dr Josep M Mercader; Barcelona Supercomputing Center, SPAIN
Collaborators: Mrs Sílvia Bonàs-Guarch; Mrs Marta Guindo; Mr Elias Rodríguez-Fos; Dr Friman Sánchez; Prof David Torrents
(Barcelona Supercomputing Center – ES)
Research field: Medicine and Life Sciences
Resource awarded: 250,000 core hours on MareNostrum @ BSC, Spain;



Abstract: Despite tremendous investments to identify causal genes for complex genetic diseases, such as diabetes, asthma, and others, through genome-wide association studies (GWAS), for the majority of the diseases less than 10% of the variance attributable to genetic factors can be explained with the currently identified genetic variants. The goal of this project is to make use of whole-genome sequencing data from the 1000 Genomes Project and the UK10K exome sequencing project to gain more information and statistical power in 69 GWAS datasets comprising at least 44 different diseases and more than 250,000 subjects. This will represent the analysis of around 1,750 billion genotypes.

In order to perform this analysis, supercomputing resources and advanced statistical and systems biology techniques are required. For this, we have developed a Genome-Wide IMPutation workflow that makes use of the COMPSs parallel programming framework, termed GWIMP-COMPSs. This tool will allow the identification of new genetic regions in the genome that increase the risk for several diseases, as well as the fine-mapping of the already known susceptibility risk regions. We will then use manually curated disease-specific biological pathways and networks reconstructed by our group. Also, using novel in-house developed pathway and network analysis methods, we will identify new key biological processes and genes perturbed in subjects suffering from a variety of complex diseases, ranging from metabolic diseases to psychiatric or autoimmune diseases. We will experimentally validate the findings by replication of the associations in other cohorts and by analysing the functional effect of the discovered variants in independent cohorts for which DNA and tissue banks are available.

We expect this project to enable the discovery of novel molecular mechanisms involved in a variety of complex diseases, opening new lines of research, and to provide a novel framework for better exploiting the available genetic data to characterize the molecular bases of complex diseases.

Project name: Krylov multisplitting algorithms for large scale linear systems

Project leader: Prof Raphael Couturier; University of Franche-Comte, FRANCE
Collaborators: Dr Lilia Ziane Khodja
Research field: Mathematics and Computer Science
Resource awarded: 200.000 GPU core hours on Curie TN @ GENCI@CEA, France; 250.000 GPU core hours on JUQUEEN @ GCS@Jülich, Germany;



Abstract: Iterative methods are often used to solve large sparse linear systems because they are easier to parallelize than direct ones. However, traditional iterative solvers suffer from scalability problems on platforms with many cores: they need global synchronizations and collective communications that penalize scalability.

A possible solution consists in using multisplitting methods, which could be more efficient than traditional ones on large-scale architectures with thousands of cores. Multisplitting methods split a linear system into subsystems or blocks. Each subsystem is solved by a group of processors, and the multisplitting method manages the interactions between groups, taking into account the dependencies between the blocks. These methods have been studied for a long time.

They can be used with asynchronous iterations in order to remove synchronizations between the blocks. However, multisplitting methods usually converge very slowly to the solution; consequently, they cannot in practice solve linear systems faster than other methods.

We have developed a new solver based on the Krylov multisplitting method. The principle is to combine a classical multisplitting method with a Krylov-type acceleration step, applied from time to time, which speeds up the convergence. We have tested our method on a 3D Poisson problem of size 468^3 on a Cray computer with 2048 cores and compared the execution times with those of the GMRES method. We obtained a speed-up of approximately 4 with 2 blocks of 1024 cores.
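The idea can be illustrated with a small self-contained sketch: a block-Jacobi (multisplitting) sweep in which each block solves its subsystem exactly, alternated with a residual-minimizing combination of the stored iterates playing the role of the Krylov step. This is a toy dense-matrix analogue with invented parameters, not the authors' parallel solver.

```python
import numpy as np

def poisson_1d(n):
    """Dense toy version of the 1D Poisson (tridiagonal) matrix."""
    return 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

def multisplitting_step(A, b, x, blocks):
    """One block-Jacobi sweep: each block solves its subsystem exactly,
    using the current values of the other blocks for the coupling terms."""
    x_new = x.copy()
    for idx in blocks:
        rest = np.setdiff1d(np.arange(len(b)), idx)
        rhs = b[idx] - A[np.ix_(idx, rest)] @ x[rest]
        x_new[idx] = np.linalg.solve(A[np.ix_(idx, idx)], rhs)
    return x_new

def krylov_multisplitting(A, b, n_outer=20, n_inner=6, n_blocks=2):
    """Alternate multisplitting sweeps with a least-squares combination of
    the stored iterates: the occasional 'Krylov' step that accelerates the
    otherwise slowly converging stationary iteration."""
    blocks = np.array_split(np.arange(len(b)), n_blocks)
    x = np.zeros(len(b))
    for _ in range(n_outer):
        iterates = []
        for _ in range(n_inner):
            x = multisplitting_step(A, b, x, blocks)
            iterates.append(x.copy())
        S = np.column_stack(iterates)
        # Choose coefficients minimizing the residual ||b - A (S a)||.
        a, *_ = np.linalg.lstsq(A @ S, b, rcond=None)
        x = S @ a
    return x
```

The minimization step can only lower the residual relative to the plain sweeps, which is what makes the combined method faster than the stationary multisplitting iteration alone.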

In this project, we would like to test the scalability of our method with a larger number of processors and to improve some parts of our code.

Project name: Aerodynamic characterization of swirling flows in combustors

Project leader: Dr. Maria Teresa Parra-Santos; University of Valladolid, SPAIN
Collaborators: Mr. Victor Mendoza; Mr. Ruben Perez
(University of Valladolid – ES)
Research field: Engineering and Energy
Resource awarded: 250000 GPU core hours on MareNostrum @ BSC, Spain; 20000 MIC core hours on MareNostrum Hybrid Nodes @ BSC, Spain; 200000 GPU core hours on CURIE FN @ GENCI@CEA, France; 100.000 CPU core hours on Curie Hybrid @ GENCI@CEA, France; 200.000 GPU core hours on Curie TN @ GENCI@CEA, France;



Abstract: The aim of this project is the assessment of Large Eddy Simulation (LES) models of confined coaxial swirling jets. Despite the simple geometrical set-up of the benchmark test case, the flow pattern shows complex aerodynamic behavior. The simple burner uses two jets: an axial jet with fuel and an annular swirling jet with air. The expansion of the flow when entering the chamber produces the Outer Recirculation Zone (ORZ). If the swirl number is large enough to let the flow turn back towards the centre, the vortex breakdown phenomenon appears and forms an Inner Recirculation Zone (IRZ). Fuel-air mixing occurs in the high-shear region between the two recirculation zones; this shear region is controlled by both the ORZ near the nozzle exit and the IRZ. The final application is to improve the stabilization of lean-mixture flames by means of a swirling flow, which provides efficient fuel consumption as well as a reduction of pollutant emissions. The selection and development of state-of-the-art numerical methods is a prerequisite for accurate numerical simulations that better predict flame stability. LES is expected to describe mixing more accurately than traditional Reynolds-Averaged Navier-Stokes (RANS) approaches.

As for post-processing tasks, instantaneous and time-averaged flow fields have to be analyzed. A thorough comparison with experimental data has to be carried out, and energy spectra and Proper Orthogonal Decomposition (POD) will be computed to identify the most energetic flow structures and their decay. The temporal sampling shows that the vortex centre spins around the axis of the device, forming the precessing vortex core (PVC), whose Strouhal number exceeds two for swirl numbers of about one.
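In its simplest "method of snapshots" form, POD reduces to an SVD of the mean-subtracted snapshot matrix, with the singular values providing the energy ranking of the structures. A minimal sketch on synthetic data (the flow field and noise level are invented for illustration):

```python
import numpy as np

def pod_modes(snapshots):
    """Proper Orthogonal Decomposition via the SVD of the snapshot matrix.
    Columns of `snapshots` are flow-field snapshots; returns the spatial
    modes and the fraction of fluctuation energy captured by each mode."""
    mean = snapshots.mean(axis=1, keepdims=True)
    X = snapshots - mean                      # fluctuations about the mean
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    energy = s**2 / np.sum(s**2)              # modal energy fractions
    return U, energy

# Toy snapshot set: one dominant spatial structure plus weak noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 2.0 * np.pi, 64)
t = np.linspace(0.0, 1.0, 32)
field = np.outer(np.sin(x), np.cos(2.0 * np.pi * t)) \
        + 0.01 * rng.standard_normal((64, 32))
modes, energy = pod_modes(field)
print(f"leading mode captures {energy[0]:.1%} of the energy")
```

For a real LES dataset the snapshots would be velocity fields sampled in time, and the decay of `energy` across modes quantifies how many coherent structures dominate the flow.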

Project name: Large scale massively parallel Density Functional Tight Binding simulations of solids

Project leader: Dr. Balint Aradi; Bremen Center for Computational Materials Science, GERMANY
Collaborators: Dr. Benjamin Hourahine
(The University of Strathclyde – UK)
Research field: Chemistry and Materials
Resource awarded: 250.000 GPU core hours on SuperMUC @ GCS@LRZ, Germany;



Abstract: The Density Functional Tight Binding (DFTB) method is an approximate form of the Density Functional Theory (DFT) method, typically enabling a speed-up of two to three orders of magnitude while still maintaining good accuracy. The DFTB+ code is an open-source package which aims to provide a DFTB implementation that is both fast and efficient and at the same time modular and extensible, so that further developments of the original DFTB method can be implemented easily. Several such extensions have already been added, making DFTB+ probably the most versatile DFTB implementation currently available.

Until recently, parallelism in the DFTB+ code was only achieved via OpenMP parallelization and thread-parallel linear algebra libraries. In summer 2013, a new MPI-based DFTB+ test version was released, in which diagonalization is performed via the ScaLAPACK library, with additional parallelization over spin channels and k-points within the code. While first tests with periodic systems consisting of up to four thousand atoms produced promising scaling results with up to 256 MPI processes, tests for larger systems and higher numbers of cores could not be carried out due to lack of access to large-scale computing resources.
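Beyond the ScaLAPACK-parallel diagonalization, spin channels and k-points are independent problems that can be farmed out to separate processor groups. The toy sketch below shows one simple way such tasks might be distributed; the round-robin scheme is purely illustrative and is not DFTB+'s actual strategy.

```python
from itertools import product

def distribute_tasks(n_spin, n_kpoints, n_groups):
    """Round-robin assignment of the independent (spin, k-point)
    diagonalization tasks to processor groups, as one might layer task
    parallelism on top of a per-group ScaLAPACK diagonalization."""
    tasks = list(product(range(n_spin), range(n_kpoints)))
    groups = {g: [] for g in range(n_groups)}
    for i, task in enumerate(tasks):
        groups[i % n_groups].append(task)
    return groups

# 2 spin channels x 8 k-points spread over 4 processor groups:
groups = distribute_tasks(n_spin=2, n_kpoints=8, n_groups=4)
```

Each group then only holds the wave functions of its own tasks, which is also how this layer of parallelism reduces per-process memory.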

The goal of the current project is to test the MPI-parallelized DFTB+ code on periodic systems consisting of tens of thousands of atoms. These large-scale calculations will be useful to identify and cure bottlenecks in the parallel implementation for larger systems. It is important to apply the parallelized code to such problems at this stage so that, for example, any global design changes required to increase scalability are identified as early as possible.

It is important to note that we would like to use periodic systems for the large-scale testing of DFTB+ (see the scientific case of the project), as this involves additional techniques (such as Ewald summation) not needed for non-periodic calculations. Also, periodic systems (especially those with incompletely filled bands) are much harder to tackle using order-N techniques than non-periodic ones, so it is of great importance to have well-scaling codes using conventional O(N^3) diagonalization schemes for those applications.

Estimated project resource needs: 100,000 core-hours. We would like to use *one* of the clusters SuperMUC, MareNostrum or Curie Thin Nodes in the given order of preference.

Project name: Hypersonic Flow Simulations based on Boltzmann Kinetic Equations – Novel Parallelisation Method Implemented in Object-Oriented Framework

Project leader: Dr Rene Steijl; University of Liverpool, UNITED KINGDOM
Collaborators: Prof. George Barakos
(University of Liverpool – UK)
Research field: Engineering and Energy
Resource awarded: 250.000 GPU core hours on SuperMUC @ GCS@LRZ, Germany;



Abstract: Computational Fluid Dynamics simulations of hypersonic flows of practical interest require a wide range of flow physics phenomena to be adequately resolved. Such flows occur for example during the launch and re-entry phase of space launchers and spacecraft. Other examples include the flow around space-planes and hypersonic cruise vehicles, a number of which are under development or under consideration for use in the near future.

It is widely known that at high Mach numbers (M>5), and particularly at high altitude conditions, the Navier-Stokes equations fail to model the physics correctly.

In the present work, a parallel simulation method designed to accurately and efficiently model such complex flows is considered. Mathematical modelling of the flow at a more detailed level than the Navier-Stokes equations is employed to increase the accuracy and realism of the flow simulation. Here, kinetic models derived from the Boltzmann equation (BGK, ES-BGK, R-BGK) are used to capture the complex flow physics within hypersonic flow with rarefaction effects. For engineering applications, deterministic computational techniques for the kinetic equations, e.g. discrete-velocity or discrete-ordinate methods, typically require unrealistic amounts of computational resources. For this reason, the simulation techniques in this work involve a hybrid approach: the kinetic equations are employed where the more computationally efficient Navier-Stokes equations fail, while in the remainder of the domain the flow is assumed to be continuum and the Navier-Stokes equations are used.

Due to the large computational and memory overhead created by the discretization in velocity space (in contrast to the continuum solver, which typically stores 5 continuum-flow quantities per cell, the kinetic solver stores O(10^4) degrees of freedom per cell), the kinetic approach is only used locally. Still, the memory and CPU-time requirements are considerable, and therefore an efficient parallel implementation involving two levels of parallelism was conceived (explained later).
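A hybrid solver of this kind needs a criterion for deciding, cell by cell, where the continuum description breaks down and the kinetic equations must take over. A common choice, used here purely as an illustration (the project's actual switching criterion is not specified in the abstract), is a gradient-length-local Knudsen number with a threshold around 0.05:

```python
import numpy as np

def kinetic_flags(density, dx, mean_free_path, threshold=0.05):
    """Flag cells for the kinetic solver using a gradient-length-local
    Knudsen number Kn = lambda * |d rho/dx| / rho. The 0.05 threshold is a
    commonly quoted continuum-breakdown value, used here illustratively."""
    grad = np.gradient(density, dx)
    kn = mean_free_path * np.abs(grad) / density
    return kn > threshold

# Toy 1D density profile with a sharp, shock-like jump in the middle.
x = np.linspace(0.0, 1.0, 101)
rho = 1.0 + 4.0 / (1.0 + np.exp(-(x - 0.5) / 0.01))
flags = kinetic_flags(rho, dx=x[1] - x[0], mean_free_path=0.01)
```

Only the few cells around the steep gradient are handed to the expensive kinetic solver, which is precisely why the hybrid approach keeps the memory and CPU cost tractable.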

The solver is implemented as an application of an object-oriented C++ library developed at Liverpool University. The aim of the project is to assess the parallel performance of such hybrid simulations using the two-level parallelization strategy for different topologies of the flow domain, and to investigate a number of improvements to the parallel implementation aimed at resolving performance bottlenecks.

The multi-physics hypersonic flow solver has so far been tested on a range of parallel computers, from small local clusters (< 64 cores) and a Tier-2 university cluster (up to 512 cores) to an IBM BlueGene/Q (up to 2048 cores). Due to the relatively large memory requirements of the deterministic discrete-velocity method used for the kinetic flow domain, parallel computers of the BlueGene class are not the typical or most suitable machines for scientific/engineering simulations of this type. State-of-the-art simulations for which the solver was developed, e.g. partially rarefied diatomic gas flows with substantial thermodynamic non-equilibrium effects around (generic) aerospace vehicles, would require O(10,000) cores (Intel Xeon or comparable) with 2 GB or 4 GB per core. SuperMUC was selected, though other machines could be considered according to PRACE availability. The proposed project's main outcome would be to prepare the multi-physics solver for such large-scale simulations, pushing forward the state of the art in this field of science, for which a follow-on PRACE application is planned.

Project name: Performance improvement and parallelization of the SMUFIN protocol based on OmpSs technology

Project leader: Mr. David Torrents; Barcelona Supercomputing Center, SPAIN
Collaborators: Mr. Santi Gonzalez; Mr. Valenti Moncunill
(Barcelona Supercomputing Center – ES)
Research field: Medicine and Life Sciences
Resource awarded: 250000 GPU core hours on MareNostrum @ BSC, Spain;



Abstract: Next-generation sequencing (NGS) technologies have significantly increased the speed at which genomic data are generated. This technological improvement has led to the production of large amounts of genomic data that grow continuously. As a result, the ability to generate these data is exceeding scientists' capacity to analyse them. For instance, while a single human genome can be sequenced in less than 24 hours, its analysis requires several days. Consequently, current methods for genomic data analysis pose a growing computational challenge that must be addressed using high-performance computing (HPC) platforms.

One of the most complex analyses of genomic data is the identification and prediction of a wide range of somatic mutations. This process requires the development of accurate methods to identify disease-associated mutations, such as those driving cancer, from whole-genome sequencing data. In this scenario we present a proposal to optimize and parallelize SMUFIN, a reference-free program that detects a wide range of somatic mutations through the analysis of DNA sequencing data. SMUFIN is the only available method able to catalogue point mutations, insertions, deletions, inversions and translocations at the same time. It also produces the most accurate results in terms of reported sensitivity and specificity (sensitivity of 92% and 74% for single-nucleotide variants and structural variants, with specificities of 95% and 91%, respectively). SMUFIN produces better results than current methods, even compared with a combination of existing ones that are each specific to different types of mutations. At the same time, our program is built on an algorithm that can be computationally optimized and adapted to HPC.

The close collaboration between the BSC Life Sciences department, which has developed and tested the SMUFIN protocol, and the BSC Computer Sciences department, which brings its support with OmpSs and its expertise in HPC systems, enables the proposal of a high-performance, parallel version of SMUFIN. A preliminary analysis of the potential parallelism of SMUFIN v2.0 with OmpSs suggests that a small number of processors is enough to identify the somatic mutations of a single patient. With SMUFIN v2.0, this process is performed faster than the time required to generate the corresponding genomic data through NGS.

Our objective is to overcome the existing bottleneck in the computational prediction of somatic mutations. This will allow the massive analysis of patients and will enable research groups to analyse thousands of patients for different diseases on a reasonable time scale.

Project name: High Performance release of the GAMESH pipeline

Project leader: Dr Luca Graziani; OAR – Observatory of Rome, ITALY
Collaborators: Mr Matteo De Bennassuti; Dr Raffaella Schneider; Dr. Stefania Salvadori
(OAR – Observatory of Rome – IT; Kapteyn Astronomical Institute – NL)
Research field: Astrophysics
Resource awarded: 200000 GPU core hours on CURIE FN @ GENCI@CEA, France; 200.000 GPU core hours on Curie TN @ GENCI@CEA, France;



Abstract: This project develops, adapts and tests on HPC facilities the various components of the newly implemented GAMESH pipeline (Graziani et al., MNRAS, 2014, in prep.).

GAMESH combines the RT code CRASH4 and the merger-tree-based code GAMETE to study the interplay between radiative and chemical feedback effects. The pipeline also uses the outputs of high-resolution N-body simulations of the Milky Way galaxy to obtain spatial information on the star-forming progenitor halos.

While the pipeline already works on small clusters and performs reionisation simulations on low-resolution N-body inputs, it needs many improvements to be adapted to realistic high-resolution cases and to take advantage of HPC facilities.

First, the GAMETE code needs OpenMP parallelisation to process many halos in parallel; second, the storage system of the entire pipeline must adopt Hadoop technology to handle the high-resolution data derived from a more resolved N-body simulation. Scaling and correct dump/reload procedures should finally be implemented in the pipeline coordination software, to correctly manage the 'terminate' and 're-submission' operations on HPC queues. This is non-trivial to implement because it involves the coordination of two independent engines, running as re-entering shell calls and sharing a common execution time on the HPC queue. On the CRASH side, the project will take advantage of the results of PRACE proposal 2010PA1522.

Project name: Parallel fully resolved blood flow simulations

Project leader: Dr Alfons Hoekstra; University of Amsterdam, NETHERLANDS
Collaborators: Dr Alfons Hoekstra; Dr Eric Lorenz; Mr Lampros Mountrakis
(University of Amsterdam – NL)
Research field: Medicine and Life Sciences
Resource awarded: 250,000 core hours on FERMI @ CINECA, Italy;



Abstract: Blood platelets, one of the main ingredients of a thrombus, exhibit an excess concentration near the walls of a blood vessel, thus ensuring a more effective haemostatic response to vessel and tissue damage. This margination can be attributed to the high concentration of red blood cells (RBCs), which, apart from giving rise to the complex non-Newtonian rheological behavior of blood, largely determine the transport properties of cells and other substances such as white blood cells, platelets and oxygen. The transport behavior of platelets is one of the main topics of our research (L. Mountrakis et al., Interface Focus 3, 2 (2013)).

Red blood cells are deformable biconcave disk-shaped cells, 8 microns in diameter and 2 microns thick. Under healthy conditions, they constitute 40 to 45% of the total blood volume. This high volume fraction of RBCs renders fully resolved blood simulations, which explicitly model RBCs and platelets, computationally challenging: a single cubic millimeter of blood already requires 5 million RBCs. Yet modeling blood through its constituents allows capturing important rheological and transport properties of blood and, in combination with biochemical models, allows unique and detailed studies of thrombosis in realistic anatomies. This will improve coarse-grained blood models and enhance our understanding of the underlying physiological phenomena.

We developed a three-dimensional model in which RBCs and platelets are suspended in a plasma-like fluid. The Lattice Boltzmann Method (LBM) is used to solve the fluid flow and is coupled through the immersed boundary method (IBM) to a discrete element model (DEM) for RBCs and platelets. The resolution of the flow field is on the order of a micron, and each RBC is composed of 162 to 642 vertices. The model is built on top of Palabos, an open-source C++ CFD solver with proven scaling capabilities. Nevertheless, the heavy usage of Lagrangian particles as vertices for the RBC model shifts the weight of the calculations from the fluid to the cell mechanics and the fluid coupling, leaving the LBM with only a small fraction of the computations. This new picture requires additional work on the new bottlenecks as well as a separate scaling analysis.
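The immersed boundary coupling mentioned above interpolates fluid velocities from the Eulerian lattice to the Lagrangian cell vertices (and spreads membrane forces back) through a regularized delta function. Below is a one-dimensional sketch using Peskin's classical 4-point kernel; this is a generic illustration of the technique, not the kernel choice of the actual code.

```python
import numpy as np

def delta_peskin(r):
    """Peskin's 4-point discrete delta kernel (1D, unit grid spacing)."""
    r = np.abs(r)
    phi = np.zeros_like(r)
    m1 = r < 1.0
    m2 = (r >= 1.0) & (r < 2.0)
    phi[m1] = (3 - 2 * r[m1] + np.sqrt(1 + 4 * r[m1] - 4 * r[m1] ** 2)) / 8
    phi[m2] = (5 - 2 * r[m2] - np.sqrt(-7 + 12 * r[m2] - 4 * r[m2] ** 2)) / 8
    return phi

def interpolate_velocity(u_grid, x_vertex):
    """IBM interpolation: gather the fluid velocity onto a Lagrangian
    vertex as a delta-weighted sum over nearby grid nodes (1D toy)."""
    nodes = np.arange(len(u_grid))
    w = delta_peskin(nodes - x_vertex)
    return np.sum(w * u_grid)
```

The kernel's weights sum to one and have zero first moment, so constant and linear velocity fields are interpolated exactly; the same weights are reused to spread vertex forces back onto the lattice.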

The purpose of this project is to analyze the scaling performance of our current code, to identify and resolve parallelization bottlenecks, and to further develop the code for massively parallel systems. Next we will carry out extended validation runs of the blood model, significantly enhancing the range of validation (in terms of model parameters such as shear rate or hematocrit). This will also lead to an optimized choice of various model parameters (number of vertices per cell, IBM kernel). The result should be a validated model of fully resolved blood and a highly efficient parallel code, ready for extreme-scale production runs. The next step will then be to actually start using the code for the scientific case described in section 2.

Type C – Code development with support from experts from PRACE

Project name: Parallel mesh partitioning in Alya

Project leader: Dr Guillaume Houzeaux; Barcelona Supercomputing Center, SPAIN
Research field: Engineering and Energy
Resource awarded: 250000 GPU core hours on MareNostrum @ BSC, Spain;



Abstract: Alya is a parallel computational mechanics code developed at the Barcelona Supercomputing Center (BSC). Unfortunately, the pre-processing part of Alya is still sequential. In order to partition very large meshes (say, 100M to 10B elements) into a large number of subdomains (say, 50k to 100k), the mesh reading and partitioning must be parallelized as well. This project aims at including all the pre-processing steps of Alya inside the same MPI environment: from mesh-file reading, through partitioning, to the end of the simulation process.
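For intuition, mesh partitioning can be illustrated with the simplest geometric scheme: recursive coordinate bisection over node coordinates. This serial toy only demonstrates the idea of producing balanced subdomains; Alya's parallel pre-processing targets vastly larger meshes and is not based on this code.

```python
import numpy as np

def rcb_partition(points, n_parts):
    """Recursive coordinate bisection: repeatedly split the largest part
    along its longest axis into two equal halves until n_parts balanced
    subdomains remain."""
    parts = [np.arange(len(points))]
    while len(parts) < n_parts:
        parts.sort(key=len, reverse=True)
        idx = parts.pop(0)                            # largest part so far
        axis = np.argmax(np.ptp(points[idx], axis=0)) # longest spatial extent
        order = idx[np.argsort(points[idx, axis])]
        half = len(order) // 2
        parts += [order[:half], order[half:]]
    return parts

rng = np.random.default_rng(1)
pts = rng.random((1000, 3))          # toy "mesh node" coordinates
parts = rcb_partition(pts, 8)
sizes = sorted(len(p) for p in parts)
```

Production partitioners additionally minimize the interface (communication) between subdomains via the mesh connectivity graph, which pure coordinate bisection ignores.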

Project name: HORSE: High-order method for a new generatiOn of LaRge eddy Simulation solvEr

Project leader: Mr Jean-François Boussuge; CERFACS, FRANCE
Collaborators: Dr Guillaume Puigt
Research field: Engineering and Energy
Resource awarded: 200.000 GPU core hours on Curie TN @ GENCI@CEA, France;



Abstract: Two years ago, CERFACS started the development of a new CFD solver (called JAGUAR) based on an unstructured High-Order Method (order of accuracy strictly greater than 2) to solve unsteady flows on massively parallel platforms. This approach can provide solutions to otherwise intractable fluid-flow problems within complex geometries. These include vortex-dominated flows (e.g. the flow around a helicopter rotor or around high-lift devices), as well as problems in aeroacoustics (jets, landing-gear systems…). For such problems, an unstructured High-Order Method (HOM) with Large Eddy Simulation (LES) modelling is a good candidate.

Many HOMs have been developed for structured meshes, but the use of HOM on unstructured meshes is less common. We have chosen the "Spectral Difference Method" (SDM) approach, which is based on a local high-order polynomial representation of the solution with a specific treatment of the discontinuity at the element interfaces. This discontinuity is treated with a Riemann solver. In fact, the philosophy of this approach is very close to the "Discontinuous Galerkin Method" (DGM): the DGM uses the Finite Element framework, whereas the SDM uses the Finite Difference framework. SDM is easier to implement than DGM and requires fewer floating-point operations, leading to a reduced wall-clock time. Nevertheless, HOMs have a reputation for being CPU-costly, so we have paid particular attention to the global performance of the code, in serial and in parallel mode. As a result, the efficiency of the JAGUAR solver (expressed in microseconds per iteration and per degree of freedom) is today similar to, or even better than, that of standard low-order CFD solvers. To run on parallel platforms, different paradigms have been implemented in JAGUAR: MPI, MPI-CUDA and MPI-OpenMP. For flat MPI, we observe perfect scalability from 1 to 2048 cores. For MPI-GPU, a speed-up of 50 over 64 GPU cards is observed, with an acceleration factor of 30 for a single GPU card versus a single Sandy Bridge core. The last paradigm (MPI-OpenMP) is not as efficient: only half of the flat-MPI performance is attained.

So the objectives of this project are twofold:

1) First, we would like to analyze the flat-MPI efficiency above 2048 cores.
2) Second, we would like to improve the efficiency of the hybrid MPI-OpenMP paradigm so that it matches flat MPI. Indeed, the target for JAGUAR is to be able to run LES simulations with billions of degrees of freedom on 100,000 cores within two years. A flat-MPI approach could encounter problems with such a number of cores due to the number of MPI communications. Thus, in order to drastically reduce the stress on the MPI library, the hybrid MPI-OpenMP approach seems a good candidate and needs to be optimized.

Project name: Performance of the post-Wannier Berry-phase code for the anomalous Hall conductivity calculations.

Project leader: Dr Malgorzata Wierzbowska; University of Warsaw, POLAND
Collaborators: Dr Karolina Milowska; Dr Svjetlana Galamic-Mulaomerovic
(Ludwig-Maximilians-Universität München – DE; BarbaTech Ltd – IE)
Research field: Chemistry and Materials
Resource awarded: 250.000 GPU core hours on SuperMUC @ GCS@LRZ, Germany;



Abstract: Thin layers of dilute magnetic semiconductors (DMS) open new technological possibilities, different from those of the bulk materials. Electric manipulation can change the level of hole/electron doping of a sample, and therefore the critical temperature for magnetic ordering. Mysterious new phenomena have been found in (Ga,Mn)As thin layers, where the layer thickness, temperature, or applied magnetic field change the value, and even the sign, of the anomalous Hall conductivity (AHC) (Chiba et al., PRL 104, 106601 (2010)). The explanation of these effects could open new routes in the search for new spintronic materials, namely 2D magnetic semiconductors.

The new Wannier-based tool for interpolating any function derived from the Bloch vectors and the Hamiltonian enables accurate calculation of quantities that were not previously accessible to an ab-initio treatment. In particular, it is suited for the anomalous Hall conductivity, orbital magnetization, and optical conductivity.

We work with the Wannier90 v2.0 code, and specifically with its Berry-phase post-processing tool. It has been demonstrated that this approach reproduces the AHC (Wang et al., PRB 76, 195109 (2007)) and orbital magnetization (Lopez et al., PRB 85, 014435 (2012)) of simple metals. The scalability of this code needs to be improved to achieve good speed-up on thousands of CPUs, so that large elementary cells for surfaces and thin layers can be treated. We plan to work on this code in order to make it suitable for Tier-0 systems, and therefore for large-scale production.
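The kind of Berry-phase quantity involved can be illustrated on a minimal two-band lattice model: the Berry curvature is accumulated plaquette by plaquette over a discretized Brillouin zone (the Fukui-Hatsugai-Suzuki construction), yielding an integer Chern number. The model, grid size and parameters below are invented for illustration and have nothing to do with Wannier90's implementation.

```python
import numpy as np

def chern_number(m, n_k=40):
    """Chern number of the lower band of the two-band model
    H(k) = sin(kx) sx + sin(ky) sy + (m + cos(kx) + cos(ky)) sz,
    via the discretized Berry-curvature (Fukui-Hatsugai-Suzuki) method."""
    sx = np.array([[0, 1], [1, 0]], dtype=complex)
    sy = np.array([[0, -1j], [1j, 0]])
    sz = np.array([[1, 0], [0, -1]], dtype=complex)

    def lower_state(kx, ky):
        h = np.sin(kx) * sx + np.sin(ky) * sy \
            + (m + np.cos(kx) + np.cos(ky)) * sz
        _, vecs = np.linalg.eigh(h)   # eigenvalues sorted ascending
        return vecs[:, 0]

    ks = np.linspace(0.0, 2.0 * np.pi, n_k, endpoint=False)
    u = np.array([[lower_state(kx, ky) for ky in ks] for kx in ks])
    total = 0.0
    for i in range(n_k):
        for j in range(n_k):
            ip, jp = (i + 1) % n_k, (j + 1) % n_k
            # Gauge-invariant product of link variables around a plaquette.
            w = (np.vdot(u[i, j], u[ip, j]) * np.vdot(u[ip, j], u[ip, jp])
                 * np.vdot(u[ip, jp], u[i, jp]) * np.vdot(u[i, jp], u[i, j]))
            total += np.angle(w)
    return total / (2.0 * np.pi)
```

Because each plaquette contribution is gauge invariant, the sum is an integer to machine precision even on a coarse k-grid, which is also why such discretized Berry-phase formulas are robust in production codes.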

Project name: Memory optimization for the Octopus scientific code

Project leader: Prof. Angel Rubio; European Theoretical Spectroscopy Facility, SPAIN
Collaborators: Dr. Micael Oliveira
(University of Liège – BE)
Research field: Fundamental Physics
Resource awarded: 250000 GPU core hours on MareNostrum @ BSC, Spain;



Abstract: Density Functional Theory (DFT) and its Time-Dependent variant (TDDFT) are convenient quantum-mechanical approaches to studying the electronic structure of molecular systems and its time evolution. They have proven capable of providing an accurate description of a wide variety of phenomena at a relatively low computational cost. Although scientific codes and HPC infrastructures have improved considerably over the last years, the size of the systems that can be routinely simulated using TDDFT is still limited, and performing calculations with thousands of atoms remains a significant challenge.

The main objective of this project is to improve the performance of the OCTOPUS code by optimizing the initialization of the wave functions. It has been shown that this step can be enhanced by reducing the huge amount of memory it requires. To this end, a new type of parallelization will be implemented for this stage.

The accomplishment of this project would open new possibilities for the treatment of larger systems that have so far been out of reach, by reducing the time and memory limitations.