Basilio Bernardo
Fraguela Rodríguez
Full Professor
Publications (111)
2024
-
A new thread-level speculative automatic parallelization model and library based on duplicate code execution
Journal of Supercomputing, Vol. 80, No. 10, pp. 13714-13737
-
STuning-DL: Model-Driven Autotuning of Sparse GPU Kernels for Deep Learning
IEEE Access, Vol. 12, pp. 70581-70599
2023
-
VENOM: A Vectorized N:M Format for Unleashing the Power of Sparse Tensor Cores
Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2023
2022
-
A highly optimized skeleton for unbalanced and deep divide-and-conquer algorithms on multi-core clusters
Journal of Supercomputing, Vol. 78, No. 8, pp. 10434-10454
-
Probing the Efficacy of Hardware-Aware Weight Pruning to Optimize the SpMM routine on Ampere GPUs
Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT
-
The New UPC++ DepSpawn High Performance Library for Data-Flow Computing with Hybrid Parallelism
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
2021
-
A Parallel Skeleton for Divide-and-conquer Unbalanced and Deep Problems
International Journal of Parallel Programming, Vol. 49, No. 6, pp. 820-845
-
A software cache autotuning strategy for dataflow computing with UPC++ DepSpawn
Computational and Mathematical Methods
-
High-performance dataflow computing in hybrid memory systems with UPC++ DepSpawn
Journal of Supercomputing, Vol. 77, No. 7, pp. 7676-7689
-
OpenCNN: A Winograd Minimal Filtering Algorithm Implementation in CUDA
Mathematics, Vol. 9, No. 17
-
ScalaParBiBit: scaling the binary biclustering in distributed-memory systems
Cluster Computing, Vol. 24, No. 3, pp. 2249-2268
2020
-
An automatic optimizer for heterogeneous devices
Future Generation Computer Systems, Vol. 106, pp. 572-584
2019
-
A Fast Solver for Large Tridiagonal Systems on Multi-Core Processors (Lass Library)
IEEE Access, Vol. 7, pp. 23365-23378
-
Analysis of interval-grouped data in weed science: The binnednp Rcpp package
Ecology and Evolution, Vol. 9, No. 19, pp. 10903-10915
-
Easy dataflow programming in clusters with UPC++ DepSpawn
IEEE Transactions on Parallel and Distributed Systems, Vol. 30, No. 6, pp. 1267-1282
-
Enhanced global optimization methods applied to complex fisheries stock assessment models
Applied Soft Computing Journal, Vol. 77, pp. 50-66
-
Portable and efficient FFT and DCT algorithms with the Heterogeneous Butterfly Processing Library
Journal of Parallel and Distributed Computing, Vol. 125, pp. 135-146
2018
-
Guiding the Optimization of Parallel Codes on Multicores Using an Analytical Cache Model
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
-
Heterogeneous distributed computing based on high-level abstractions
Concurrency Computation