Publications by the researcher in collaboration with Patricia González Gómez (40)

2020

  1. Fault tolerance of MPI applications in exascale systems: The ULFM solution

    Future Generation Computer Systems, Vol. 106, pp. 467-481

2018

  1. Insights into application-level solutions towards resilient MPI applications

    Proceedings - 2018 International Conference on High Performance Computing and Simulation, HPCS 2018

2017

  1. A portable and adaptable fault tolerance solution for heterogeneous applications

    Journal of Parallel and Distributed Computing, Vol. 104, pp. 146-158

  2. An application-level solution for the dynamic reconfiguration of MPI applications

    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

  3. Assessing resilient versus stop-and-restart fault-tolerant solutions in MPI applications

    Journal of Supercomputing, Vol. 73, No. 1, pp. 316-329

  4. Resilient MPI applications using an application-level checkpointing framework and ULFM

    Journal of Supercomputing, Vol. 73, No. 1, pp. 100-113

2015

  1. I/O optimization in the checkpointing of OpenMP parallel applications

    Proceedings - 23rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing, PDP 2015

2014

  1. Extending an application-level checkpointing tool to provide fault tolerance support to OpenMP applications

    Journal of Universal Computer Science, Vol. 20, No. 9, pp. 1352-1372

  2. Failure avoidance in MPI applications using an application-level approach

    Computer Journal, Vol. 57, No. 1, pp. 100-114

  3. Improving an MPI application-level migration approach through checkpoint file splitting

    Proceedings - Symposium on Computer Architecture and High Performance Computing

  4. In-memory application-level checkpoint-based migration for MPI programs

    Journal of Supercomputing, Vol. 70, No. 2, pp. 660-670

2012

  1. Reducing application-level checkpoint file sizes: Towards scalable fault tolerance solutions

    Proceedings of the 2012 10th IEEE International Symposium on Parallel and Distributed Processing with Applications, ISPA 2012

2011

  1. Analysis of performance-impacting factors on checkpointing frameworks: The CPPC case study

    Computer Journal, Vol. 54, No. 11, pp. 1821-1837

  2. Extending the Globus information service with the Common Information Model

    Proceedings - 9th IEEE International Symposium on Parallel and Distributed Processing with Applications, ISPA 2011