Diego
Andrade Canosa
Associate Professor (Profesor Titular de Universidade)
Publications (42)
2024
-
STuning-DL: Model-Driven Autotuning of Sparse GPU Kernels for Deep Learning
IEEE Access, Vol. 12, pp. 70581-70599
2023
-
Influence of face-to-face in hybrid teaching: A case of study for a master
Iberian Conference on Information Systems and Technologies, CISTI
-
VENOM: A Vectorized N:M Format for Unleashing the Power of Sparse Tensor Cores
Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2023
-
VENOM: A Vectorized N:M Format for Unleashing the Power of Sparse Tensor Cores
International Conference for High Performance Computing, Networking, Storage and Analysis, SC
2022
-
Probing the Efficacy of Hardware-Aware Weight Pruning to Optimize the SpMM routine on Ampere GPUs
Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT
-
The New UPC++ DepSpawn High Performance Library for Data-Flow Computing with Hybrid Parallelism
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
2021
-
A software cache autotuning strategy for dataflow computing with UPC++ DepSpawn
Computational and Mathematical Methods
-
High-performance dataflow computing in hybrid memory systems with UPC++ DepSpawn
Journal of Supercomputing, Vol. 77, No. 7, pp. 7676-7689
-
OpenCNN: A Winograd Minimal Filtering Algorithm Implementation in CUDA
Mathematics, Vol. 9, No. 17
-
ScalaParBiBit: scaling the binary biclustering in distributed-memory systems
Cluster Computing, Vol. 24, No. 3, pp. 2249-2268
2020
-
An automatic optimizer for heterogeneous devices
Future Generation Computer Systems, Vol. 106, pp. 572-584
2019
-
A Fast Solver for Large Tridiagonal Systems on Multi-Core Processors (Lass Library)
IEEE Access, Vol. 7, pp. 23365-23378
-
Easy dataflow programming in clusters with UPC++ DepSpawn
IEEE Transactions on Parallel and Distributed Systems, Vol. 30, No. 6, pp. 1267-1282
-
Using Artificial Vision Techniques for Individual Player Tracking in Sport Events
XoveTIC 2019: The 2nd XoveTIC Conference (XoveTIC 2019), A Coruña, Spain, 5–6 September
2018
-
Guiding the Optimization of Parallel Codes on Multicores Using an Analytical Cache Model
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
-
Heterogeneous distributed computing based on high-level abstractions
Concurrency Computation
2017
-
Facilitating the development of stencil applications using the Heterogeneous Programming Library
Concurrency Computation, Vol. 29, No. 12
-
High productivity multi-device exploitation with the Heterogeneous Programming Library
Journal of Parallel and Distributed Computing, Vol. 101, pp. 51-68
2016
-
Towards a High Level Approach for the Programming of Heterogeneous Clusters
Proceedings of the International Conference on Parallel Processing Workshops
-
Writing a performance-portable matrix multiplication
Parallel Computing, Vol. 52, pp. 65-77