Brian Towles
Electrical and Computer Engineering
Adjunct Assistant Professor in the Department of Electrical and Computer Engineering

Education
- D.Phil. Stanford University, 2005
Positions
- Adjunct Assistant Professor in the Department of Electrical and Computer Engineering
Courses Taught
- ECE 652: Advanced Computer Architecture II
- COMPSCI 650: Advanced Computer Architecture II
Publications
- Zu Y, Ghaffarkhah A, Dang HV, Towles B, Hand S, Huda S, et al. Resiliency at Scale: Managing Google’s TPUv4 Machine Learning Supercomputer. In: Proceedings of the 21st Usenix Symposium on Networked Systems Design and Implementation Nsdi 2024. 2024. p. 761–74.
- Jouppi NP, Kurian G, Li S, Ma P, Nagarajan R, Nai L, et al. TPU v4: An Optically Reconfigurable Supercomputer for Machine Learning with Hardware Support for Embeddings. In: Proceedings International Symposium on Computer Architecture. 2023. p. 1147–60.
- Shim KS, Greskamp B, Towles B, Edwards B, Grossman JP, Shaw DE. The Specialized High-Performance Network on Anton 3. In: Proceedings International Symposium on High Performance Computer Architecture. 2022. p. 1211–23.
- Shaw DE, Adams PJ, Azaria A, Bank JA, Batson B, Bell A, et al. Anton 3: Twenty Microseconds of Molecular Dynamics Simulation before Lunch. In: International Conference for High Performance Computing Networking Storage and Analysis Sc. 2021.
- Adams PJ, Batson B, Bell A, Bhatt J, Butts JA, Correia T, et al. The ΛNTON 3 ASIC: A fire-breathing monster for molecular dynamics simulations. In: 2021 IEEE Hot Chips 33 Symposium Hcs 2021. 2021.
- Predescu C, Lerer AK, Lippert RA, Towles B, Grossman JP, Dirks RM, et al. The u-series: A separable decomposition for electrostatics computation with improved accuracy. The Journal of chemical physics. 2020 Feb;152(8):084113.
- Grossman JP, Towles B, Greskamp B, Shaw DE. Filtering, Reductions and Synchronization in the Anton 2 Network. In: Proceedings 2015 IEEE 29th International Parallel and Distributed Processing Symposium IPDPS 2015. 2015. p. 860–70.
- Butts JA, Batson B, Chao JC, Deneroff MM, Dror RO, Fenton CH, et al. The ANTON 2 chip a second-generation ASIC for molecular dynamics. In: 2014 IEEE Hot Chips 26 Symposium Hcs 2014. 2014.
- Shaw DE, Grossman JP, Bank JA, Batson B, Butts JA, Chao JC, et al. Anton 2: Raising the Bar for Performance and Programmability in a Special-Purpose Molecular Dynamics Supercomputer. In: International Conference for High Performance Computing Networking Storage and Analysis Sc. 2014. p. 41–53.
- Towles B, Grossman JP, Greskamp B, Shaw DE. Unifying on-chip and inter-node switching within the Anton 2 network. In: Proceedings International Symposium on Computer Architecture. 2014. p. 1–12.
- Grossman JP, Towles B, Bank JA, Shaw DE. The role of cascade, a cycle-based simulation infrastructure, in designing the anton special-purpose supercomputers. In: Proceedings Design Automation Conference. 2013.
- Grossman JP, Kuskin JS, Bank JA, Theobald M, Dror RO, Ierardi DJ, et al. Hardware support for fine-grained event-driven computation in anton 2. In: International Conference on Architectural Support for Programming Languages and Operating Systems ASPLOS. 2013. p. 549–60.
- Grossman JP, Kuskin JS, Bank JA, Theobald M, Dror RO, Ierardi DJ, et al. Hardware support for fine-grained event-driven computation in anton 2. In: ACM SIGPLAN Notices. 2013. p. 549–60.
- Jiang N, Balfour J, Becker DU, Towles B, Dally WJ, Michelogiannakis G, et al. A detailed and flexible cycle-accurate Network-on-Chip simulator. In: Ispass 2013 IEEE International Symposium on Performance Analysis of Systems and Software. 2013. p. 86–96.
- Dror RO, Grossman JP, MacKenzie KM, Towles B, Chow E, Salmon JK, et al. Overcoming communication latency barriers in massively parallel scientific computation. IEEE Micro. 2011 May 1;31(3):8–19.
- Dror RO, Grossman JP, Mackenzie KM, Towles B, Chow E, Salmon JK, et al. Exploiting 162-nanosecond end-to-end communication latency on Anton. In: 2010 ACM IEEE International Conference for High Performance Computing Networking Storage and Analysis Sc 2010. 2010.
- Shaw DE, Dror RO, Salmon JK, Grossman JP, MacKenzie KM, Bank JA, et al. Millisecond-scale molecular dynamics simulations on Anton. In: Proceedings of the Conference on High Performance Computing Networking Storage and Analysis Sc 09. 2009.
- Grossman JP, Salmon JK, Ho CR, Ierardi DJ, Towles B, Batson B, et al. Hierarchical simulation-based verification of anton, a special-purpose parallel machine. In: 26th IEEE International Conference on Computer Design 2008 Iccd. 2008. p. 340–7.
- Shaw DE, Deneroff MM, Dror RO, Kuskin JS, Larson RH, Salmon JK, et al. Anton, a special-purpose machine for molecular dynamics simulation. Communications of the ACM. 2008 Jul 1;51(7):91–7.
- Shaw DE, Deneroff MM, Dror RO, Kuskin JS, Larson RH, Salmon JK, et al. Anton, a special-purpose machine for molecular dynamics simulation. In: Proceedings International Symposium on Computer Architecture. 2007. p. 1–12.
- Kim J, Dally WJ, Towles B, Gupta AK. Microarchitecture of a high-radix router. In: Proceedings International Symposium on Computer Architecture. 2005. p. 420–31.
- Singh A, Dally WJ, Gupta AK, Towles B. Adaptive channel queue routing on k-ary n-cubes. In: Annual ACM Symposium on Parallel Algorithms and Architectures. 2004. p. 11–9.
- Singh A, Dally WJ, Towles B, Gupta AK. Globally Adaptive Load-Balanced Routing on Tori. IEEE Computer Architecture Letters. 2004 Jan 1;3(1):2.
- Towles B, Dally WJ. Guaranteed scheduling for switches with configuration overhead. IEEE ACM Transactions on Networking. 2003 Oct 1;11(5):835–47.
- Singh A, Dally WJ, Gupta AK, Towles B. GOAL: A load-balanced adaptive routing algorithm for torus networks. In: Conference Proceedings Annual International Symposium on Computer Architecture ISCA. 2003. p. 194–205.
- Towles B, Dally WJ, Boyd S. Throughput-centric routing algorithm design. In: Annual ACM Symposium on Parallel Algorithms and Architectures. 2003. p. 200–9.
- Khailany B, Dally WJ, Rixner S, Kapasi UJ, Owens JD, Towles B. Exploring the VLSI scalability of stream processors. In: Proceedings International Symposium on High Performance Computer Architecture. 2003. p. 153–64.
- Owens JD, Khailany B, Towles B, Dally WJ. Comparing Reyes and OpenGL on a stream architecture. In: Proceedings of the SIGGRAPH Eurographics Workshop on Graphics Hardware. 2002. p. 47–56.
- Towles B, Dally WJ. Worst-case traffic for oblivious routing functions. IEEE Computer Architecture Letters. 2002 Jan 1;1(1):4.
- Owens JD, Rixner S, Kapasi UJ, Mattson P, Towles B, Serebrin B, et al. Media processing applications on the imagine stream processor. Proceedings IEEE International Conference on Computer Design VLSI in Computers and Processors. 2002 Jan 1;295–302.
- Towles B, Dally WJ. Guaranteed scheduling for switches with configuration overhead. In: Proceedings IEEE INFOCOM. 2002. p. 342–51.
- Gupta AK, Dally WJ, Singh A, Towles B. Scalable opto-electronic network (SOENet). In: Proceedings Symposium on the High Performance Interconnects Hot Interconnects. 2002. p. 71–6.
- Towles B, Dally WJ. Worst-case traffic for oblivious routing functions. In: Annual ACM Symposium on Parallel Algorithms and Architectures. 2002. p. 1–8.
- Khailany B, Dally WJ, Chang A, Kapasi UJ, Namkoong J, Towles B. VLSI design and verification of the imagine processor. In: Proceedings IEEE International Conference on Computer Design VLSI in Computers and Processors. 2002. p. 289–94.
- Singh A, Dally WJ, Towles B, Gupta AK. Locality-preserving randomized oblivious routing on torus networks. In: Annual ACM Symposium on Parallel Algorithms and Architectures. 2002. p. 9–19.
- Khailany B, Dally WJ, Kapasi UJ, Mattson P, Namkoong J, Owens JD, et al. Imagine: Media processing with streams. IEEE Micro. 2001 Mar 1;21(2):35–46.
- Dally WJ, Towles B. Route packets, not wires: On-chip interconnection networks. In: Proceedings Design Automation Conference. 2001. p. 684–9.