
George A. Constantinides

Imperial College London

Exploring FPGA designs for MX and beyond

Jul 01, 2024

Unlocking the Global Synergies in Low-Rank Adapters

Jun 21, 2024

Optimised Grouped-Query Attention Mechanism for Transformers

Jun 21, 2024

NeuraLUT: Hiding Neural Network Density in Boolean Synthesizable Functions

Feb 29, 2024

LQER: Low-Rank Quantization Error Reconstruction for LLMs

Feb 04, 2024

Revisiting Block-based Quantisation: What is Important for Sub-8-bit LLM Inference?

Oct 21, 2023

PolyLUT: Learning Piecewise Polynomials for Ultra-Low Latency FPGA LUT-based Inference

Sep 05, 2023

FPGA Resource-aware Structured Pruning for Real-Time Neural Networks

Aug 09, 2023

ATHEENA: A Toolflow for Hardware Early-Exit Network Automation

Apr 17, 2023

Abstract Interpretation on E-Graphs

Mar 17, 2022