threshold-sklansky

4-bit Sklansky parallel prefix adder. Achieves minimum possible depth (log₂n) at the cost of maximum fanout. The fastest parallel prefix adder when fanout is not a constraint.

Function

S[3:0], Cout = A[3:0] + B[3:0] + Cin

Sklansky Structure (4-bit)

       G3,P3    G2,P2    G1,P1    G0,P0    Cin
         │        │        │        │       │
         │        │        │        │       │
Level 1  ●────────┤        ●────────┤       │
         │        │        │        │       │
         │        │        │        │       │
Level 2  ●────────●────────┴────────┘       │   ← HIGH FANOUT: G1:0 feeds both G2:0 and G3:0
         │        │                         │
         │        │                         │
       G3:0     G2:0     G1:0     G0       │
         │        │        │        │       │
         ▼        ▼        ▼        ▼       │
        XOR      XOR      XOR      XOR ─────┘
         │        │        │        │
         ▼        ▼        ▼        ▼
       Cout      S3       S2       S1       S0

The Fanout Problem

Sklansky's key characteristic is maximum fanout at the expense of wiring complexity:

                    Level 2 Fanout
                    ┌─────────────┐
                    │             │
     G3:2 ──────────┤             │
                    │   G1:0     ├──► feeds G2:0
     G3:0 ◄─────────┤   (prefix) ├──► feeds G3:0
                    │             │
                    └─────────────┘

     At 64 bits: G31:0 would fan out to 32 cells!

For large adders, this fanout becomes a critical path bottleneck due to wire capacitance.

Comparison: Parallel Prefix Adder Zoo

Adder Depth Prefix Cells Max Fanout Best For
Sklansky log₂(n) n·log₂(n)/2 n/2 Small adders, FPGAs
Kogge-Stone log₂(n) n·log₂(n)-n+1 2 Speed-critical, uniform load
Brent-Kung 2·log₂(n)-2 2n-2-log₂(n) 2 Area-constrained
Han-Carlson log₂(n)+1 varies varies Balanced trade-off

Why Sklansky Matters

  1. Optimal depth: Cannot do better than log₂(n) for parallel prefix
  2. Minimal prefix cells: Uses fewest cells among depth-optimal adders
  3. Simple structure: Regular, easy to generate programmatically
  4. FPGA-friendly: FPGAs handle fanout better than ASICs

Parallel Prefix Operation

(G_high, P_high) ○ (G_low, P_low) = (G_high + P_high·G_low, P_high·P_low)

This associative operation enables the parallel prefix tree construction.

Truth Table (Examples)

A B Cin S Cout Decimal
0000 0000 0 0000 0 0+0=0
0001 0001 0 0010 0 1+1=2
0110 0011 0 1001 0 6+3=9
1001 0111 0 0000 1 9+7=16
1111 1111 1 1111 1 15+15+1=31

Threshold Implementation

Generate (AND gate)

G_i: w[A_i]=1.0, w[B_i]=1.0, bias=-2.0
     fires when A_i + B_i >= 2 (both are 1)

Propagate (XOR via OR-NAND-AND)

P_i = (A_i OR B_i) AND (A_i NAND B_i)
    OR:   w=[1,1], b=-1  → fires if either input is 1
    NAND: w=[-1,-1], b=1 → fires unless both inputs are 1
    AND:  w=[1,1], b=-2  → fires if both OR and NAND fire

Parameters

Inputs 9 (A[3:0], B[3:0], Cin)
Outputs 5 (S[3:0], Cout)
Neurons 32
Layers 4
Parameters 132
Magnitude 56

Usage

from safetensors.torch import load_file
import torch

w = load_file('model.safetensors')

def sklansky_add(a3, a2, a1, a0, b3, b2, b1, b0, cin):
    a = [a0, a1, a2, a3]
    b = [b0, b1, b2, b3]

    # Generate and Propagate
    g = [a[i] & b[i] for i in range(4)]
    p = [a[i] ^ b[i] for i in range(4)]

    # Level 1: pairs
    g10 = g[1] | (p[1] & g[0])
    p10 = p[1] & p[0]
    g32 = g[3] | (p[3] & g[2])
    p32 = p[3] & p[2]

    # Level 2: Sklansky's high-fanout merge
    # G1:0 fans out to both G2:0 and G3:0 computation
    g20 = g[2] | (p[2] & g10)
    g30 = g32 | (p32 & g10)
    p30 = p32 & p10

    # Carries
    c0 = g[0] | (p[0] & cin)
    c1 = g10 | (p10 & cin)
    c2 = g20 | (p[2] & p10 & cin)
    c3 = g30 | (p30 & cin)

    # Sums
    return p[3]^c2, p[2]^c1, p[1]^c0, p[0]^cin, c3

# Test: 6 + 3 = 9
s3, s2, s1, s0, cout = sklansky_add(0,1,1,0, 0,0,1,1, 0)
print(f"6 + 3 = {s3*8 + s2*4 + s1*2 + s0}")  # Output: 9

Verification

All 512 input combinations (16 × 16 × 2) exhaustively verified correct.

Files

threshold-sklansky/
├── model.safetensors    # Threshold network weights
├── create_safetensors.py # Weight generation + exhaustive verification
├── config.json          # Circuit metadata
└── README.md

References

  • Sklansky, J. (1960). "Conditional-Sum Addition Logic"
  • IRE Transactions on Electronic Computers, EC-9(2), pp. 226-231
  • The original paper that introduced this construction

License

MIT

Downloads last month
3
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Collection including phanerozoic/threshold-sklansky

Free AI Image Generator No sign-up. Instant results. Open Now