Splitwise: Efficient generative LLM inference using phase splitting
Pratyush Patel, Esha Choukse, Chaojie Zhang, Aashaka Shah, Íñigo Goiri, Saeed Maleki, Ricardo Bianchini
ISCA | June 2024
Pratyush Patel, Esha Choukse, Chaojie Zhang, Íñigo Goiri, Brijesh Warrier, Nithish Mahalingam, Ricardo Bianchini
ASPLOS | April 2024
Pratyush Patel, Esha Choukse, Chaojie Zhang, Íñigo Goiri, Brijesh Warrier, Nithish Mahalingam, Ricardo Bianchini
ArXiv | August 2023, Vol abs/2308.12908
Pratyush Patel, Zibo Gong, S. Rizvi, Esha Choukse, Pulkit A. Misra, T. Anderson, Akshitha Sriraman
IEEE Computer Architecture Letters | June 2023, Vol 22: pp. 141-144