NorthPole - IBM

Explore IBM's NorthPole, a neural inference architecture that eliminates off-chip memory by intertwining compute with memory on-chip for state-of-the-art energy efficiency.

NorthPole At A Glance

Release Year: 2023
Status: Released
Chip Type: Digital
Software: Custom end-to-end toolchain
Applications: Neural inference, Image classification, Object detection
Cores: 256
On-chip memory: 224 MB SRAM
Weight bits: 8, 4, 2
Activation bits: 8, 4, 2
Power: ~74 W

NorthPole is a neural inference architecture that blurs the boundary between compute and memory by eliminating off-chip memory and intertwining compute with memory on-chip. It is a low-precision, massively parallel, and energy-efficient spatial computing architecture.

Overview

NorthPole is a revolutionary neural inference architecture developed by IBM that fundamentally reimagines the interaction between compute and memory. Inspired by the brain’s efficiency, it eliminates the “memory wall” by removing off-chip memory entirely. Instead, it features a massively parallel, densely interconnected system where memory is intertwined with compute on a single chip.

This “spatial computing” design makes the entire chip function as an active memory: data movement is minimized, yielding large gains in energy efficiency and space utilization as well as lower latency. NorthPole is specialized for neural-network inference and supports low-precision 8-, 4-, and 2-bit operations, which are sufficient for state-of-the-art accuracy in many AI tasks.
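To see why eliminating off-chip memory is such a large lever, it helps to compare per-access energies. The figures in the sketch below are commonly cited ballpark numbers from the computer-architecture literature, not NorthPole measurements, and are meant only to illustrate the scale of the gap.

```python
# Ballpark per-access energies (rough literature figures, NOT NorthPole
# measurements) illustrating why off-chip DRAM traffic dominates energy.
DRAM_READ_PJ = 640.0   # ~one 32-bit off-chip DRAM access
SRAM_READ_PJ = 5.0     # ~one 32-bit access to a small on-chip SRAM
INT8_MAC_PJ = 0.3      # ~one low-precision multiply-accumulate

print(f"DRAM vs. on-chip SRAM: ~{DRAM_READ_PJ / SRAM_READ_PJ:.0f}x energy per access")
print(f"DRAM vs. 8-bit MAC:    ~{DRAM_READ_PJ / INT8_MAC_PJ:.0f}x energy per operation")
```

Keeping every weight and activation in on-chip SRAM moves almost all accesses to the cheap side of this gap.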

Architecture

The NorthPole chip is fabricated in a 12-nm process and consists of an array of 256 cores. Each core contains its own memory and compute units, allowing for massive parallelism. Key architectural features include:

  • Cores: 256 digital, programmable cores. Each core can perform 2048 8-bit operations per cycle (see the back-of-envelope throughput estimate after this list).
  • On-Chip Memory: A total of 224 MB of on-chip SRAM is distributed across the cores, with each core having access to its local memory bank.
  • Network-on-Chip (NoC): The architecture uses four distinct NoCs to manage data flow: one for feature map activations, one for inter-core communication, one for loading model weights, and one for instructions. This design is inspired by the white-matter and gray-matter pathways in the brain.
  • Data Precision: Natively supports 8-bit, 4-bit, and 2-bit integer precision for weights and activations, enabling significant efficiency gains.
  • No Off-Chip Memory: The entire model is stored on-chip, which means that once configured, NorthPole operates self-sufficiently without needing to access external DRAM, drastically reducing energy consumption.
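
Taking the core count and per-core throughput listed above at face value gives a simple peak-compute estimate. The clock frequency and the 2x/4x throughput scaling at 4-bit and 2-bit precision in the sketch below are assumptions made for illustration, not published NorthPole specifications.

```python
# Back-of-envelope peak-throughput estimate from the figures above.
CORES = 256                 # cores on the chip (from this guide)
OPS_PER_CYCLE_8BIT = 2048   # 8-bit operations per core per cycle (from this guide)
CLOCK_HZ = 400e6            # ASSUMED clock frequency, for illustration only

# ASSUMED: throughput doubles at 4-bit and quadruples at 2-bit precision.
for bits, scale in [(8, 1), (4, 2), (2, 4)]:
    tops = CORES * OPS_PER_CYCLE_8BIT * scale * CLOCK_HZ / 1e12
    print(f"{bits}-bit peak: ~{tops:.0f} TOPS")
```

Sustained throughput in practice depends on how well the toolchain’s schedule keeps all 256 cores busy.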

Software and Tools

IBM has developed a complete, co-designed software toolchain for NorthPole. This toolchain automates the process of mapping a pre-trained neural network onto the chip’s architecture. It handles:

  • Model Mapping: Spatially mapping the layers of a neural network across the 256-core array.
  • Orchestration: Generating an explicit schedule for all computation, memory access, and communication to ensure high utilization and prevent resource collisions.
  • Quantization-Aware Training: The toolchain supports algorithms that incorporate low-precision constraints during training, so models retain high accuracy when deployed on the hardware (a generic illustration of the idea follows this list).
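
The sketch below shows the general idea behind quantization-aware training using symmetric fake quantization with a straight-through estimator. It is a minimal, generic PyTorch illustration, not IBM’s toolchain; the function name and per-tensor scheme are assumptions made for the example.

```python
import torch

def fake_quantize(w: torch.Tensor, bits: int) -> torch.Tensor:
    """Symmetric uniform fake quantization with a straight-through estimator.

    The forward pass sees low-precision weights; gradients flow through as if
    the rounding were the identity, so the float master weights keep training.
    """
    qmax = 2 ** (bits - 1) - 1                    # e.g. 127 for 8-bit, 7 for 4-bit
    scale = w.abs().max().clamp(min=1e-8) / qmax  # per-tensor scale factor
    w_q = torch.round(w / scale).clamp(-qmax, qmax) * scale
    return w + (w_q - w).detach()                 # straight-through estimator

# One training step: the layer's float weights are quantized on the fly.
layer = torch.nn.Linear(128, 64)
x = torch.randn(32, 128)
y = torch.nn.functional.linear(x, fake_quantize(layer.weight, bits=4), layer.bias)
loss = y.pow(2).mean()
loss.backward()  # gradients still reach layer.weight despite the rounding
```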

Externally, the chip operates as a simple active memory device with three main commands: write input, run network, and read output, making it easy to integrate into larger systems.
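
Conceptually, the host-side interaction can be wrapped as in the hypothetical sketch below. The class and method names are illustrative only and do not correspond to IBM’s actual driver or API.

```python
import numpy as np

class NorthPoleDevice:
    """Hypothetical host-side wrapper mirroring the three-command usage model
    described above (names are illustrative, not IBM's API)."""

    def write_input(self, frame: np.ndarray) -> None:
        """Copy an input tensor into the chip's on-chip input buffer."""
        raise NotImplementedError  # transport details are vendor-specific

    def run_network(self) -> None:
        """Run the pre-loaded, pre-scheduled network on the resident weights."""
        raise NotImplementedError

    def read_output(self) -> np.ndarray:
        """Read the result back from the chip's output buffer."""
        raise NotImplementedError

# From the host's perspective, one inference is simply: write, run, read.
# device.write_input(frame)
# device.run_network()
# output = device.read_output()
```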

Publications

October 2023: “Neural inference at the frontier of energy, space, and time,” Dharmendra S. Modha et al., Science.

Help Us Improve this Guide

Our hardware guide is community-maintained. If you know of a chip we should add, see an error, or have updated information, please let us know by opening an issue on our GitHub repository.

Spiking Neural Network (SNN) Library Benchmarks

Discover the fastest Spiking Neural Network (SNN) frameworks for deep learning-based optimization. Performance, flexibility, and more, analyzed in depth.

Digital Neuromorphic Hardware Read List

Stay up-to-date with cutting-edge digital hardware designs for neuromorphic applications. Explore recent research on power-efficient event-driven spiking neural networks and state-of-the-art processors like TrueNorth and Loihi.

Spiking Neurons: A Digital Hardware Implementation

Learn how to model Leaky Integrate and Fire (LIF) neurons in digital hardware. Understand spike communication, synapse integration, and more for hardware implementation.