Brian Towles
Electrical and Computer Engineering
Adjunct Assistant Professor in the Department of Electrical and Computer Engineering
Education
- D.Phil. Stanford University, 2005
Positions
- Adjunct Assistant Professor in the Department of Electrical and Computer Engineering
Courses Taught
- ECE 652: Advanced Computer Architecture II
- COMPSCI 650: Advanced Computer Architecture II
Publications
- Zu Y, Ghaffarkhah A, Dang HV, Towles B, Hand S, Huda S, et al. Resiliency at Scale: Managing Google’s TPUv4 Machine Learning Supercomputer. In: Proceedings of the 21st USENIX Symposium on Networked Systems Design and Implementation, NSDI 2024. 2024. p. 761–74.
- Jouppi NP, Kurian G, Li S, Ma P, Nagarajan R, Nai L, et al. TPU v4: An Optically Reconfigurable Supercomputer for Machine Learning with Hardware Support for Embeddings. In: Proceedings - International Symposium on Computer Architecture. 2023. p. 1147–60.
- Shim KS, Greskamp B, Towles B, Edwards B, Grossman JP, Shaw DE. The Specialized High-Performance Network on Anton 3. In: Proceedings - International Symposium on High-Performance Computer Architecture. 2022. p. 1211–23.
- Shaw DE, Adams PJ, Azaria A, Bank JA, Batson B, Bell A, et al. Anton 3: Twenty Microseconds of Molecular Dynamics Simulation before Lunch. In: International Conference for High Performance Computing, Networking, Storage and Analysis, SC. 2021.
- Adams PJ, Batson B, Bell A, Bhatt J, Butts JA, Correia T, et al. The ΛnTON 3 ASIC: A fire-breathing monster for molecular dynamics simulations. In: 2021 IEEE Hot Chips 33 Symposium, HCS 2021. 2021.
- Predescu C, Lerer AK, Lippert RA, Towles B, Grossman JP, Dirks RM, et al. The u-series: A separable decomposition for electrostatics computation with improved accuracy. The Journal of Chemical Physics. 2020 Feb;152(8):084113.
- Grossman JP, Towles B, Greskamp B, Shaw DE. Filtering, Reductions and Synchronization in the Anton 2 Network. In: Proceedings - 2015 IEEE 29th International Parallel and Distributed Processing Symposium, IPDPS 2015. 2015. p. 860–70.
- Butts JA, Batson B, Chao JC, Deneroff MM, Dror RO, Fenton CH, et al. The ANTON 2 chip: a second-generation ASIC for molecular dynamics. In: 2014 IEEE Hot Chips 26 Symposium, HCS 2014. 2014.
- Shaw DE, Grossman JP, Bank JA, Batson B, Butts JA, Chao JC, et al. Anton 2: Raising the Bar for Performance and Programmability in a Special-Purpose Molecular Dynamics Supercomputer. In: International Conference for High Performance Computing, Networking, Storage and Analysis, SC. 2014. p. 41–53.
- Towles B, Grossman JP, Greskamp B, Shaw DE. Unifying on-chip and inter-node switching within the Anton 2 network. In: Proceedings - International Symposium on Computer Architecture. 2014. p. 1–12.
- Grossman JP, Towles B, Bank JA, Shaw DE. The role of Cascade, a cycle-based simulation infrastructure, in designing the Anton special-purpose supercomputers. In: Proceedings - Design Automation Conference. 2013.
- Grossman JP, Kuskin JS, Bank JA, Theobald M, Dror RO, Ierardi DJ, et al. Hardware support for fine-grained event-driven computation in Anton 2. In: International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS. 2013. p. 549–60.
- Grossman JP, Kuskin JS, Bank JA, Theobald M, Dror RO, Ierardi DJ, et al. Hardware support for fine-grained event-driven computation in Anton 2. In: ACM SIGPLAN Notices. 2013. p. 549–60.
- Jiang N, Balfour J, Becker DU, Towles B, Dally WJ, Michelogiannakis G, et al. A detailed and flexible cycle-accurate Network-on-Chip simulator. In: ISPASS 2013 - IEEE International Symposium on Performance Analysis of Systems and Software. 2013. p. 86–96.
- Dror RO, Grossman JP, MacKenzie KM, Towles B, Chow E, Salmon JK, et al. Overcoming communication latency barriers in massively parallel scientific computation. IEEE Micro. 2011 May 1;31(3):8–19.
- Dror RO, Grossman JP, Mackenzie KM, Towles B, Chow E, Salmon JK, et al. Exploiting 162-nanosecond end-to-end communication latency on Anton. In: 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2010. 2010.
- Shaw DE, Dror RO, Salmon JK, Grossman JP, MacKenzie KM, Bank JA, et al. Millisecond-scale molecular dynamics simulations on Anton. In: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, SC ’09. 2009.
- Grossman JP, Salmon JK, Ho CR, Ierardi DJ, Towles B, Batson B, et al. Hierarchical simulation-based verification of Anton, a special-purpose parallel machine. In: 26th IEEE International Conference on Computer Design 2008, ICCD. 2008. p. 340–7.
- Shaw DE, Deneroff MM, Dror RO, Kuskin JS, Larson RH, Salmon JK, et al. Anton, a special-purpose machine for molecular dynamics simulation. Communications of the ACM. 2008 Jul 1;51(7):91–7.
- Shaw DE, Deneroff MM, Dror RO, Kuskin JS, Larson RH, Salmon JK, et al. Anton, a special-purpose machine for molecular dynamics simulation. In: Proceedings - International Symposium on Computer Architecture. 2007. p. 1–12.
- Kim J, Dally WJ, Towles B, Gupta AK. Microarchitecture of a high-radix router. In: Proceedings - International Symposium on Computer Architecture. 2005. p. 420–31.
- Singh A, Dally WJ, Gupta AK, Towles B. Adaptive channel queue routing on k-ary n-cubes. In: Annual ACM Symposium on Parallel Algorithms and Architectures. 2004. p. 11–9.
- Singh A, Dally WJ, Towles B, Gupta AK. Globally Adaptive Load-Balanced Routing on Tori. IEEE Computer Architecture Letters. 2004 Jan 1;3(1):2.
- Towles B, Dally WJ. Guaranteed scheduling for switches with configuration overhead. IEEE/ACM Transactions on Networking. 2003 Oct 1;11(5):835–47.
- Singh A, Dally WJ, Gupta AK, Towles B. GOAL: A load-balanced adaptive routing algorithm for torus networks. In: Conference Proceedings - Annual International Symposium on Computer Architecture, ISCA. 2003. p. 194–205.
- Khailany B, Dally WJ, Rixner S, Kapasi UJ, Owens JD, Towles B. Exploring the VLSI scalability of stream processors. In: Proceedings - International Symposium on High-Performance Computer Architecture. 2003. p. 153–64.
- Towles B, Dally WJ, Boyd S. Throughput-centric routing algorithm design. In: Annual ACM Symposium on Parallel Algorithms and Architectures. 2003. p. 200–9.
- Owens JD, Khailany B, Towles B, Dally WJ. Comparing Reyes and OpenGL on a stream architecture. In: Proceedings of the SIGGRAPH/Eurographics Workshop on Graphics Hardware. 2002. p. 47–56.
- Towles B, Dally WJ. Worst-case traffic for oblivious routing functions. IEEE Computer Architecture Letters. 2002 Jan 1;1(1):4.
- Towles B, Dally WJ. Guaranteed scheduling for switches with configuration overhead. In: Proceedings - IEEE INFOCOM. 2002. p. 342–51.
- Gupta AK, Dally WJ, Singh A, Towles B. Scalable opto-electronic network (SOENet). In: Proceedings - Symposium on the High Performance Interconnects, Hot Interconnects. 2002. p. 71–6.
- Khailany B, Dally WJ, Chang A, Kapasi UJ, Namkoong J, Towles B. VLSI design and verification of the Imagine processor. In: Proceedings - IEEE International Conference on Computer Design: VLSI in Computers and Processors. 2002. p. 289–94.
- Owens JD, Rixner S, Kapasi UJ, Mattson P, Towles B, Serebrin B, et al. Media processing applications on the Imagine stream processor. In: Proceedings - IEEE International Conference on Computer Design: VLSI in Computers and Processors. 2002. p. 295–302.
- Towles B, Dally WJ. Worst-case traffic for oblivious routing functions. In: Annual ACM Symposium on Parallel Algorithms and Architectures. 2002. p. 1–8.
- Singh A, Dally WJ, Towles B, Gupta AK. Locality-preserving randomized oblivious routing on torus networks. In: Annual ACM Symposium on Parallel Algorithms and Architectures. 2002. p. 9–19.
- Khailany B, Dally WJ, Kapasi UJ, Mattson P, Namkoong J, Owens JD, et al. Imagine: Media processing with streams. IEEE Micro. 2001 Mar 1;21(2):35–46.
- Dally WJ, Towles B. Route packets, not wires: On-chip interconnection networks. In: Proceedings - Design Automation Conference. 2001. p. 684–9.