126.96.36.199. Architectures with 4-Input LUTs in Logic Elements
If your design can tolerate pipelining, the fastest way to add three numbers A, B, and C in devices that use 4-input lookup tables is to addA + B, register the output, and then add the registered output to C. Adding A + B takes one level of logic (one bit is added in one LE), so this runs at full clock speed. This can be extended to as many numbers as desired.
Adding five numbers in devices that use 4-input lookup tables requires four adders and three levels of registers for a total of 64 LEs (for 16-bit numbers).
Verilog HDL Pipelined Binary Tree
module binary_adder_tree (a, b, c, d, e, clk, out); parameter width = 16; input [width-1:0] a, b, c, d, e; input clk; output [width-1:0] out; wire [width-1:0] sum1, sum2, sum3, sum4; reg [width-1:0] sumreg1, sumreg2, sumreg3, sumreg4; // Registers always @ (posedge clk) begin sumreg1 <= sum1; sumreg2 <= sum2; sumreg3 <= sum3; sumreg4 <= sum4; end // 2-bit additions assign sum1 = A + B; assign sum2 = C + D; assign sum3 = sumreg1 + sumreg2; assign sum4 = sumreg3 + E; assign out = sumreg4; endmodule