**Introduction**

DDR2 SDRAM is the second generation of DDR SDRAM technology, with improvements that include lower power consumption, higher data bandwidth, enhanced signal quality, and on-die termination (ODT) schemes. DDR2 SDRAM brings higher memory performance to a broad range of applications, such as PCs, embedded processor systems, image processing, storage, communications, and networking.

Altera® Stratix® III and Stratix IV devices support DDR and DDR2 SDRAM interfaces with dedicated DQS circuitry. Table 1 displays the maximum clock frequency for DDR and DDR2 SDRAM in Stratix III devices.

<table>
<thead>
<tr>
<th>Type</th>
<th>Rate</th>
<th>Speed Grade</th>
<th>f_MAX (MHz)</th>
</tr>
</thead>
<tbody>
<tr>
<td>DDR SDRAM</td>
<td>Half</td>
<td>–2</td>
<td>200</td>
</tr>
<tr>
<td></td>
<td></td>
<td>–3</td>
<td>200</td>
</tr>
<tr>
<td></td>
<td></td>
<td>–4</td>
<td>200</td>
</tr>
<tr>
<td></td>
<td></td>
<td>–4L (4)</td>
<td>200</td>
</tr>
<tr>
<td></td>
<td>Full</td>
<td>–2</td>
<td>200</td>
</tr>
<tr>
<td></td>
<td></td>
<td>–3</td>
<td>200</td>
</tr>
<tr>
<td></td>
<td></td>
<td>–4</td>
<td>200</td>
</tr>
<tr>
<td></td>
<td></td>
<td>–4L (4)</td>
<td>167</td>
</tr>
</tbody>
</table>

<table>
<thead>
<tr>
<th>Type</th>
<th>Rate</th>
<th>Speed Grade</th>
<th>f_MAX (MHz)</th>
</tr>
</thead>
<tbody>
<tr>
<td>DDR2 SDRAM</td>
<td>Half</td>
<td>–2</td>
<td>400</td>
</tr>
<tr>
<td></td>
<td></td>
<td>–3</td>
<td>333</td>
</tr>
<tr>
<td></td>
<td></td>
<td>–4</td>
<td>333 (5)</td>
</tr>
<tr>
<td></td>
<td></td>
<td>–4L (4)</td>
<td>200</td>
</tr>
<tr>
<td></td>
<td>Full</td>
<td>–2</td>
<td>267</td>
</tr>
<tr>
<td></td>
<td></td>
<td>–3</td>
<td>233</td>
</tr>
<tr>
<td></td>
<td></td>
<td>–4</td>
<td>200</td>
</tr>
<tr>
<td></td>
<td></td>
<td>–4L (4)</td>
<td>167</td>
</tr>
</tbody>
</table>

Notes to Table 1:
(1) Numbers are preliminary until characterization is final. The supported operating frequencies are memory interface maximums for the device family. Your design's actual achievable performance depends on design- and system-specific factors and on static timing analysis of the completed design.
(2) Applies to both DIMMs and components.
(3) Applies to both commercial and industrial devices.
(4) Performance is based on 0.9-V core voltage. At 1.1-V core voltage, the –4L speed grade devices have the same performance as the –4 speed grade devices.
(5) Timing cannot be closed at the target speed in the Quartus II software version 8.0, but you can generate programming files. You can use these designs for prototyping and testing, but you should not go to production until Altera releases IP that can achieve these speeds.

Table 2 displays the maximum clock frequency for DDR and DDR2 SDRAM in Stratix IV devices.

<table>
<thead>
<tr>
<th>Type</th>
<th>Speed Grade</th>
<th>f_MAX (MHz)</th>
</tr>
</thead>
<tbody>
<tr>
<td>DDR SDRAM</td>
<td>–2</td>
<td>200</td>
</tr>
<tr>
<td></td>
<td>–3</td>
<td>200</td>
</tr>
<tr>
<td></td>
<td>–4</td>
<td>200</td>
</tr>
</tbody>
</table>

**Background**

This application note describes the FPGA design flow to implement external memory interfaces using Stratix III and Stratix IV devices, and provides design guidelines.

DDR and DDR2 SDRAMs are available as components and modules, such as DIMMs, SODIMMs, and RDIMMs. This application note describes implementing DDR and DDR2 SDRAM with Stratix III, Stratix IV, or HardCopy III devices, including information on electrical and timing analysis, and the generation of a complete board-level system that you may use to demonstrate and validate the interface.

Stratix III and Stratix IV devices feature a similar input/output element (IOE) structure, so they effectively have the same external memory interface capabilities. HardCopy III devices may also be considered to have identical capabilities to their companion devices.

Throughout this document, statements made for Stratix III devices apply equally for Stratix IV and HardCopy III devices, unless otherwise mentioned.

This section gives background information on the following topics:

- DDR and DDR2 SDRAM Overview
- IOE Dedicated DDR and DDR2 SDRAM Features
- DDR and DDR2 SDRAM Interface Termination and Topology
- ALTMEMPHY Megafunction Overview

**DDR and DDR2 SDRAM Overview**

This section gives an overview of DDR SDRAM and DDR2 SDRAM.

**DDR SDRAM Overview**

DDR SDRAM is a $2n$-prefetch architecture with two data transfers per clock cycle. It uses a single-ended strobe, DQS, which is associated with a group of data pins, DQ, for read and write operations. Both DQS and DQ ports are bidirectional. Address ports are shared for read and write operations.

Write and read operations are sent in bursts; DDR SDRAM supports burst lengths of 2, 4, and 8. The column address strobe (CAS) latency is the delay between the read command being clocked into the memory and the requested data being presented at the memory pins. DDR SDRAM can have CAS latencies of 2, 2.5, and 3, depending on operating frequency.
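
As a rough illustration of how CAS latency and burst length translate into time, the following Python sketch converts clock cycles to nanoseconds. The function names and the 200-MHz example frequency are illustrative assumptions, not values taken from a particular memory data sheet.

```python
def clock_period_ns(f_mhz: float) -> float:
    """Return the clock period in nanoseconds for a frequency in MHz."""
    return 1000.0 / f_mhz


def cas_latency_ns(cas_cycles: float, f_mhz: float) -> float:
    """Delay from the read command being clocked in to the first data, in ns."""
    return cas_cycles * clock_period_ns(f_mhz)


def burst_duration_ns(burst_length: int, f_mhz: float) -> float:
    """Time one burst occupies the DQ pins, in ns.

    Two data transfers occur per clock cycle, so a burst of N transfers
    lasts N / 2 clock cycles.
    """
    return (burst_length / 2) * clock_period_ns(f_mhz)


# Example: the DDR SDRAM CAS latencies and burst lengths at an assumed 200 MHz.
for cl in (2, 2.5, 3):
    print(f"CL {cl}: {cas_latency_ns(cl, 200.0):.1f} ns")
for bl in (2, 4, 8):
    print(f"BL {bl}: {burst_duration_ns(bl, 200.0):.1f} ns")
```

The same helpers apply to DDR2 SDRAM by passing its CAS latencies and burst lengths instead.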

DDR SDRAM devices use the SSTL-2 2.5-V I/O standard and can hold between 64 MB and 1 GB of data. Each device is divided into four banks, and each bank has a fixed number of rows and columns. Only one row per bank can be accessed at a time. The **ACTIVE** command opens a row and the **PRECHARGE** command closes a row.

DDR SDRAM has a maximum frequency of 200 MHz or 400 Mbps per DQ pin.

Altera ALTMEMPHY megafunction-based designs do not support burst lengths of 8.

**DDR2 SDRAM Overview**

DDR2 SDRAM is the second generation of the DDR SDRAM standard. It is a $4n$-prefetch architecture (internally the memory operates at half the interface frequency) with two data transfers per clock cycle. DDR2 SDRAM can use a single-ended or differential strobe, DQS or DQSn, which is associated with a group of data pins, DQ, for read and write operations. DQS, DQSn, and DQ ports are bidirectional. Address ports are shared for read and write operations.

Write and read operations are sent in bursts; DDR2 SDRAM supports burst lengths of 4 and 8. DDR2 SDRAM supports CAS latencies of 2, 3, 4, and 5.

DDR2 SDRAM devices use the SSTL-18 1.8-V I/O standard and can hold between 256 MB and 4 GB of data. All DDR2 SDRAM devices have at least four banks, but higher-density devices (typically 1 GB and above) have eight internal banks. With more banks available, the page-to-hit ratio is twice as great when compared to DDR SDRAM. DDR2 SDRAM also allows bank interleaving, which represents a significant advantage for applications accessing random data. Bank interleaving can be extremely effective for concurrent operations and can hide the timing overhead that is otherwise required for opening and closing individual banks.

DDR2 SDRAM also supports ODT signal options of 50, 75, or 150 Ω on all DQ, DM, and DQS and DQSn signals.

DDR2 SDRAM has a maximum frequency of 533 MHz or 1,066 Mbps per DQ pin.
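
The relationship between clock frequency and per-pin data rate quoted above follows directly from the two transfers per clock cycle. The short Python sketch below makes the arithmetic explicit and extends it to a theoretical peak bandwidth for a given DQ width; the helper names are hypothetical, and the result ignores refresh, bank management, and bus turnaround overhead.

```python
def data_rate_mbps(f_clk_mhz: float) -> float:
    """Per-pin data rate in Mbps: two transfers per clock cycle."""
    return 2.0 * f_clk_mhz


def peak_bandwidth_mbytes_per_s(f_clk_mhz: float, dq_width_bits: int) -> float:
    """Theoretical peak bandwidth in MB/s for an interface of dq_width_bits DQ pins."""
    return data_rate_mbps(f_clk_mhz) * dq_width_bits / 8.0


# DDR SDRAM at 200 MHz and DDR2 SDRAM at 533 MHz, as quoted in the text.
print(data_rate_mbps(200.0))   # 400.0 Mbps per DQ pin
print(data_rate_mbps(533.0))   # 1066.0 Mbps per DQ pin
```

For example, a 64-bit DQ interface clocked at 400 MHz has a theoretical peak of 6,400 MB/s under this model.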

Altera ALTMEMPHY megafunction-based designs do not support burst lengths of 8.

Altera DDR and DDR2 SDRAM High-Performance Controllers do not support bank interleaving.

**DDR and DDR2 SDRAM Comparison**

Table 3 compares DDR and DDR2 SDRAM features.

<table>
<thead>
<tr>
<th>Table 3. DDR and DDR2 SDRAM Features (Part 1 of 2)</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>Feature</strong></td>
</tr>
<tr>
<td>Voltage</td>
</tr>
<tr>
<td>Density</td>
</tr>
<tr>
<td>Internal banks</td>
</tr>
<tr>
<td>Prefetch</td>
</tr>
<tr>
<td>Speed</td>
</tr>
<tr>
<td>Read latency</td>
</tr>
<tr>
<td>Additive latency</td>
</tr>
<tr>
<td>Write latency</td>
</tr>
<tr>
<td>Termination</td>
</tr>
</tbody>
</table>
Using DDR and DDR2 SDRAM in Stratix III and Stratix IV Devices

Table 3. DDR and DDR2 SDRAM Features (Part 2 of 2)

<table>
<thead>
<tr>
<th>Feature</th>
<th>DDR SDRAM</th>
<th>DDR2 SDRAM</th>
<th>DDR2 SDRAM Advantage</th>
</tr>
</thead>
<tbody>
<tr>
<td>Data strobes</td>
<td>Single-ended</td>
<td>Differential or single-ended</td>
<td>Improves timing margin.</td>
</tr>
</tbody>
</table>

(1) The Altera SDRAM high-performance controllers do not support additive latency.

**IOE Dedicated DDR and DDR2 SDRAM Features**

Stratix III devices enhance the IOE DDR capabilities of previous device generations by making the following functionality available directly in the IOE:

- DDR registers
- Alignment and synchronization registers (including I/O clock divider)
- Half data-rate registers
- DQS phase-shift circuitry (up to 4 two-phase-offset DLLs)
- DQS postamble circuitry
- Single-ended, differential or complementary DQS mode
- Read and write leveling circuitry
- Dynamic on-chip termination (OCT) control

To use these features you should use the Altera DDR and DDR2 SDRAM High-Performance Controller (a complete solution) or the Altera ALTMEMPHY megafunction (for a fully configured PHY that requires an additional custom or third-party memory controller).

Alternatively, you may access these IOE features directly via the following low-level megafunctions:

- ALTDQ_DQS megafunction, which allows you to parameterize the following features:
  - DDR
  - alignment and synchronization
  - half data rate
  - DQS mode
- ALTDLL megafunction, which allows you to parameterize the DQS phase-shift circuitry
- ALTOCT megafunction, which allows you to parameterize the IOE OCT features
- ALTPLL megafunction, which allows you to parameterize the device PLL
- ALTIOBUF megafunction, which allows you to parameterize the device I/O

**Device Pin Utilization**

Table 4 shows the DDR and DDR2 SDRAM interface pins and how to connect them to Stratix III pins.

### Table 4. Stratix III DDR and DDR2 SDRAM Interface Pin Utilization

<table>
<thead>
<tr>
<th>Pin</th>
<th>Pin Planner Symbol</th>
<th>Stratix III Pin</th>
</tr>
</thead>
<tbody>
<tr>
<td>DQ</td>
<td>Q</td>
<td>DQ. Each DQ group has a common background color for all of the associated DQ and DM pins.</td>
</tr>
<tr>
<td>DM</td>
<td>Q</td>
<td></td>
</tr>
<tr>
<td>Differential DQS or DQSn (DDR2 only)</td>
<td>S or Sbar</td>
<td>DQS or DQSn. DDR2 supports either single-ended or differential DQS signaling.</td>
</tr>
<tr>
<td>Single-ended DQS</td>
<td>S</td>
<td>DQS.</td>
</tr>
<tr>
<td>mem_clk[0] or mem_clk_n[0]</td>
<td>—</td>
<td>Any unused DIFFIO_RX pins in the same bank or side for DDR2 SDRAM interfaces with differential DQS signaling. (1)</td>
</tr>
<tr>
<td>mem_clk[n:1] or mem_clk_n[n:1]</td>
<td>—</td>
<td>Any DIFFOUT pins for DDR2 SDRAM (with single-ended DQS signaling) and DDR SDRAM. (1)</td>
</tr>
<tr>
<td>Address and command</td>
<td>—</td>
<td>Any DIFFOUT pins for DDR2 and DDR SDRAM (where n is greater than or equal to 1)</td>
</tr>
<tr>
<td>Clock source</td>
<td>—</td>
<td>Dedicated PLL clock input pin with direct (not using a global clock net) connection to the PLL and optional DLL required by the interface.</td>
</tr>
<tr>
<td>Reset</td>
<td>—</td>
<td>Dedicated clock input pin (high fan-out signal).</td>
</tr>
</tbody>
</table>

**Note to Table 4:**

(1) ALTMEMPHY mimic path requirement only.

### DQ and DQS Group Interface Width

For maximum performance and best skew across the interface, you should select a device where each required memory interface can completely reside within a single bank, or at least one side of the device.

Maximum interface width varies from device to device depending on the number of I/Os and DQS and DQ groups available. The smallest 484-pin devices can typically support a complete 16-bit wide interface in both the top and bottom banks and a complete 32-bit wide interface in the side banks. The largest 1,760-pin devices can support a 72-bit wide DQ interface in each of the left and right banks. Achievable interface width depends on the number of address and command pins that the design requires. To ensure that adequate PLL, clock, and device routing resources are available, you should always test fit any IP in the Quartus II software before PCB sign-off.

Table 5 shows the number of DDR and DDR2 SDRAM suitable DQS and DQ groups available in Stratix III devices per side.

<table>
<thead>
<tr>
<th>Package</th>
<th>Side</th>
<th>x4</th>
<th>x8/x9</th>
</tr>
</thead>
<tbody>
<tr>
<td>484-pin BGA</td>
<td>Top and bottom</td>
<td>5</td>
<td>2</td>
</tr>
<tr>
<td></td>
<td>Left and right</td>
<td>12</td>
<td>4</td>
</tr>
<tr>
<td>780-pin BGA</td>
<td>Top and bottom</td>
<td>17</td>
<td>8</td>
</tr>
<tr>
<td></td>
<td>Left and right</td>
<td>14</td>
<td>6</td>
</tr>
<tr>
<td>1,152-pin BGA</td>
<td>Top and bottom</td>
<td>26</td>
<td>12</td>
</tr>
<tr>
<td></td>
<td>Left and right</td>
<td>26</td>
<td>12</td>
</tr>
<tr>
<td>1,517-pin BGA</td>
<td>Top and bottom</td>
<td>38</td>
<td>18</td>
</tr>
<tr>
<td></td>
<td>Left and right</td>
<td>34</td>
<td>16</td>
</tr>
<tr>
<td>1,760-pin BGA</td>
<td>Top and bottom</td>
<td>44</td>
<td>22</td>
</tr>
<tr>
<td></td>
<td>Left and right</td>
<td>40</td>
<td>18</td>
</tr>
</tbody>
</table>

**Notes to Table 5:**

1. Numbers are preliminary.
2. Some DQS or DQ pins are dual purpose and can also be required as RUP, RDN, or configuration pins. A DQS or DQ group is lost if you use these pins for configuration or as RUP or RDN pins for calibrated OCT. Ensure that the DQS and DQ groups are not also required for configuration or calibrated OCT.

For more information, refer to the *External Memory Interfaces* chapter of the *Stratix III Device Handbook* and the *External Memory Interfaces* chapter of the *Stratix IV Device Handbook*.
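
To make the link between Table 5 and interface width concrete, the following Python sketch multiplies the number of ×8/×9 DQS groups on a side by eight to obtain an upper bound on the DQ width. It is only a first-pass estimate under stated assumptions: the group counts are copied from Table 5 (preliminary numbers), the `groups_lost` parameter is a hypothetical way to model groups given up to RUP, RDN, or configuration pins, and address, command, PLL, and clock resources are not modeled at all, so a test fit in the Quartus II software is still required.

```python
# x8/x9 DQS group counts per side, copied from Table 5 (preliminary numbers).
X8_GROUPS = {
    ("484-pin", "top/bottom"): 2,
    ("484-pin", "left/right"): 4,
    ("780-pin", "top/bottom"): 8,
    ("780-pin", "left/right"): 6,
    ("1,152-pin", "top/bottom"): 12,
    ("1,152-pin", "left/right"): 12,
    ("1,517-pin", "top/bottom"): 18,
    ("1,517-pin", "left/right"): 16,
    ("1,760-pin", "top/bottom"): 22,
    ("1,760-pin", "left/right"): 18,
}


def max_dq_bits(package: str, side: str, groups_lost: int = 0) -> int:
    """Upper bound on DQ width from the x8 group count alone."""
    groups = X8_GROUPS[(package, side)] - groups_lost
    if groups < 0:
        raise ValueError("groups_lost exceeds the groups available on this side")
    return groups * 8


def dq_groups_suffice(package: str, side: str, dq_width: int,
                      groups_lost: int = 0) -> bool:
    """True if enough x8 DQS groups exist on the side for dq_width DQ bits."""
    return dq_width <= max_dq_bits(package, side, groups_lost)


print(max_dq_bits("484-pin", "top/bottom"))   # 16
print(max_dq_bits("484-pin", "left/right"))   # 32
```

A 72-bit interface needs nine ×8 groups, so under this estimate it does not fit on the top or bottom of a 780-pin package (eight groups) but may fit on a 1,152-pin package, subject to the other resources noted above.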

**DDR and DDR2 SDRAM Interface Pin Description**

This section describes the DDR and DDR2 SDRAM interface pins.

**Clock Signals**

DDR2 SDRAM devices use CK and CK# signals to clock the address and command signals into the memory. Furthermore, the memory uses these clock signals to generate the DQS signal during a read through the DLL inside the memory. The DDR2 SDRAM data sheet specifies the following timings:

- \( t_{\text{DQSCK}} \) is the DQS output access time from CK, that is, the skew between the CK or CK# signals and the DDR2 SDRAM-generated DQS signal
- \( t_{\text{DSH}} \) is the DQS falling edge from CK rising edge hold time
- \( t_{\text{DSS}} \) is the DQS falling edge from CK rising edge setup time
- \( t_{\text{DQSS}} \) is the positive DQS latching edge to CK rising edge

The DDR2 SDRAM has a write requirement (\( t_{\text{DQSS}} \)) that states that the positive edge of the DQS signal on writes must be within ±25% of a clock cycle (±90°) of the positive edge of the DDR2 SDRAM clock input. Therefore, you should generate the CK and CK# signals using the DDR registers in the IOE to match the DQS signal and reduce any variations across process, voltage, and temperature. The positive edge of the DDR2 SDRAM clock, CK, is aligned with the DQS write to satisfy \( t_{\text{DQSS}} \).

The Altera SDRAM high-performance controllers can generate the CK and CK# signals using the DDR registers in the IOE to match the DQS signal and reduce any variations across process, voltage, and temperature.
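
To give a feel for the size of the \( t_{\text{DQSS}} \) window, the following Python sketch converts the ±25% requirement into picoseconds for a given memory clock frequency. The function names are hypothetical, and the skew value you pass in is assumed to come from your own timing analysis.

```python
def tdqss_window_ps(f_clk_mhz: float, fraction: float = 0.25) -> float:
    """Allowed DQS-to-CK skew on writes (plus or minus), in picoseconds."""
    period_ps = 1e6 / f_clk_mhz
    return fraction * period_ps


def meets_tdqss(skew_ps: float, f_clk_mhz: float) -> bool:
    """True if the DQS rising edge lies within the window around the CK rising edge."""
    return abs(skew_ps) <= tdqss_window_ps(f_clk_mhz)


# At 400 MHz the clock period is 2,500 ps, so the window is +/-625 ps.
print(tdqss_window_ps(400.0))
```

The window shrinks as frequency rises, which is one reason to generate CK and DQS from matching IOE registers.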

**Strobes, Data, DM, and Optional ECC Signals**

The DQS signal is bidirectional. The DQSn pins in DDR2 SDRAM devices are optional but recommended for DDR2 SDRAM designs operating at more than 333 MHz. Differential DQS operation enables improved system timing due to reduced crosstalk and less simultaneous switching noise on the strobe output drivers. The DQ pins are also bidirectional. Regardless of interface width, DDR SDRAM always operates in ×8 mode DQS groups. DDR2 SDRAM interfaces can operate in either ×4 or ×8 mode DQS groups, depending on your chosen memory device or DIMM and regardless of the actual interface width. The ×4 and ×8 configurations use one pair of bidirectional data strobe signals, DQS and DQSn, to capture input data. However, the ×16 configuration devices require two pairs of data strobes, UDQS and UDQS# (upper byte) and LDQS and LDQS# (lower byte). A group of DQ pins must remain associated with its respective DQS and DQSn pins.

The DQ signals are edge-aligned with the DQS signal during a read from the memory and are center-aligned with the DQS signal during a write to the memory. The memory controller shifts the DQ signals by –90° during a write operation to center-align the DQ and DQS signals. The memory controller delays the DQS signal during a read, so that the DQ and DQS signals are center-aligned at the capture register. Stratix III devices use a phase-locked loop (PLL) to center-align the DQS signal with respect to the DQ signals during writes and use dedicated DQS phase-shift circuitry to shift the incoming DQS signal during reads. Figure 1 shows an example where the DQS signal is shifted by 90° for a read from the DDR2 SDRAM.

Figure 1. DQ and DQS Relationship During a DDR2 SDRAM Read in Burst-of-Four Mode

Figure 2 shows an example of the relationship between the data and data strobe during a burst-of-four write.

Figure 2. DQ and DQS Relationship During a DDR2 SDRAM Write in Burst-of-Four Mode

The memory device's setup (tDS) and hold (tDH) times for the write DQ and DM pins are relative to the edges of the DQS write signals and not to the CK or CK# clock. Unlike in DDR SDRAM devices, setup and hold requirements are not necessarily balanced in DDR2 SDRAM.

The DQS signal is generated on the positive edge of the system clock to meet the tDQSS requirement. DQ and DM signals use a clock shifted –90° from the system clock, so that the DQS edges are centered on the DQ or DM signals when they arrive at the DDR2 SDRAM. The DQS, DQ, and DM board trace lengths need to be tightly matched (to within 20 ps).
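
The write timing relationship described above can be sketched numerically. The following Python example is a deliberately simplified model: it assumes that the DQ launch clock leads the DQS clock by exactly a quarter period, so that each DQS edge sits a quarter period after one DQ transition and a quarter period before the next, and that board skew is the only other error term. The `t_ds_ps` and `t_dh_ps` arguments are placeholders; take the real figures from your memory data sheet. Jitter, duty-cycle distortion, and slew-rate derating are not modeled.

```python
def write_margins_ps(f_clk_mhz: float, t_ds_ps: float, t_dh_ps: float,
                     skew_ps: float = 20.0) -> tuple:
    """Return (setup_margin, hold_margin) in ps at the memory pins.

    With DQ launched a quarter period ahead of DQS, a quarter period is
    available on each side of the DQS edge before subtracting the memory
    setup or hold requirement and the DQ-to-DQS board skew.
    """
    quarter_period_ps = 1e6 / f_clk_mhz / 4.0
    setup_margin = quarter_period_ps - t_ds_ps - skew_ps
    hold_margin = quarter_period_ps - t_dh_ps - skew_ps
    return setup_margin, hold_margin


# Hypothetical tDS = 100 ps and tDH = 175 ps at 400 MHz, with 20 ps of skew.
setup, hold = write_margins_ps(400.0, t_ds_ps=100.0, t_dh_ps=175.0)
print(setup, hold)
```

Because the setup and hold requirements need not be equal in DDR2 SDRAM, the two margins generally differ even when the strobe is perfectly centered.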

DDR2 SDRAM uses the DM pins during a write operation. Driving the DM pins low indicates that the write is valid. The memory masks the DQ signals if the DM pins are driven high. While you can use any of the I/O pins in the same bank as the associated DQS and DQ pins to generate the DM signal, Altera recommends that you use the spare DQ pin within the same DQS group as the respective data, to minimize skew.

The DM signal’s timing requirements at the DDR2 SDRAM input are identical to those for DQ data. The Stratix III DDR registers, clocked by the –90° shifted clock, create the DM signals.

Some DDR2 SDRAM devices support error correction coding (ECC) to detect and automatically correct errors in data transmission. The 72-bit DDR2 SDRAM modules contain eight ECC pins in addition to 64 data pins. Connect the eight DDR2 SDRAM device ECC pins to a single DQS or DQ group.

**Address and Command Signals**
Address and command signals in DDR2 SDRAM devices are clocked into the memory device using the CK or CK# signal. These pins operate at single data rate (SDR) using only one clock edge. The number of address pins depends on the DDR2 SDRAM device capacity. The address pins are multiplexed, so two clock cycles are required to send the row, column, and bank address. The CS, RAS, CAS, WE, CKE, and ODT pins are DDR2 SDRAM command and control pins.
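
The address multiplexing described above can be illustrated with a small Python sketch that splits a flat address into bank, row, and column fields and lists the two address phases. The field widths (2 bank bits, 13 row bits, 10 column bits) and the bank-row-column ordering are hypothetical choices for illustration only; the actual widths depend on the memory device capacity, and the mapping depends on the controller.

```python
from typing import List, NamedTuple, Tuple


class DramAddress(NamedTuple):
    bank: int
    row: int
    column: int


def split_address(addr: int, bank_bits: int = 2, row_bits: int = 13,
                  col_bits: int = 10) -> DramAddress:
    """Split a flat address laid out as bank | row | column (an assumed order)."""
    if addr < 0 or addr >= 1 << (bank_bits + row_bits + col_bits):
        raise ValueError("address out of range for the given field widths")
    column = addr & ((1 << col_bits) - 1)
    row = (addr >> col_bits) & ((1 << row_bits) - 1)
    bank = addr >> (col_bits + row_bits)
    return DramAddress(bank, row, column)


def read_sequence(a: DramAddress) -> List[Tuple[str, int, int]]:
    """Two address phases on the shared pins: row with ACTIVE, then column with READ."""
    return [("ACTIVE", a.bank, a.row), ("READ", a.bank, a.column)]


addr = (1 << 23) | (5 << 10) | 7
print(split_address(addr))  # DramAddress(bank=1, row=5, column=7)
```

The row address travels with the ACTIVE command and the column address with the subsequent read or write command, which is why the same address pins can serve both.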

The DDR2 SDRAM address and command inputs do not have a symmetrical setup and hold time requirement with respect to the DDR2 SDRAM clocks, CK, and CK#.

For SDRAM high-performance controllers in Stratix III devices, the address and command clock is always one of the PLL dedicated clock outputs whose phase can be adjusted to meet the setup and hold requirements of the memory clock. The address and command clock is also typically half-rate, although a full-rate implementation can also be created. The command and address pins use the DDIO output circuitry to launch commands from either the rising or falling edges of the clock. The chip select (CS_N), clock enable (CKE), and ODT pins are only enabled for one memory clock cycle and can be launched from either the rising or falling edge of the address and command clock signal. The address and other command pins are enabled for two memory clock cycles and can also be launched from either the rising or falling edge of the address and command clock signal.

In ALTMEMPHY-based solutions, the address and command clock ac_clk_1x is always half rate. However, because of the output enable assertion, CS_N, CKE, and ODT behave like full-rate signals even in a half-rate PHY.

**PLL and DLL Features and Availability**

Stratix III devices use PLLs to generate the memory controller clocks. The simplest, lowest-speed memory controllers may require only two clocks (a 0° system clock and a –90° write clock). However, as interface speeds increase, timing becomes harder to close, so dedicated resynchronization, postamble, and address and command clocks are typically required. Additionally, at higher frequencies the core maximum frequency becomes the bottleneck, and half-rate designs are the typical solution. Thus, complex half-rate designs at high data rates typically require 10 clock networks (refer to the *External DDR Memory PHY Interface (ALTMEMPHY) Megafunction User Guide*).

Stratix III devices are well equipped to address the clocking requirements of external DDR and DDR2 SDRAM interfaces. Stratix III PLLs have an increased number of outputs and global clock routing resources when compared to earlier device generations. Stratix III top and bottom PLLs feature 10 output (C) counters, and left and right PLLs feature 7 output (C) counters. This increased number of PLL outputs allows for the use of dedicated clock phases. In previous Stratix II designs, clock phases had to be shared.

In general, each Stratix III PLL has access to 4 global clocks (GCLK) and 6 regional clocks (RCLK) (left and right) or 10 RCLK (top and bottom).

Stratix III devices also feature four DLLs (one located in each corner of the device). The FPGA can support a maximum of four unique frequencies, with each DLL running at one frequency. Each DLL can also support two different phase offsets, which allow a single Stratix III device to support eight different DLL phase shift settings. Additionally, each DLL can access the two sides adjacent to its location. Thus each I/O bank is accessible by two different DLLs, giving more flexibility when creating multiple frequency and phase shift memory interfaces. Figure 3 shows PLL and DLL locations in Stratix III devices with global and regional clock resources.
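
The DLL reachability described above can be captured in a few lines of Python. This is only a bookkeeping sketch: the corner labels are illustrative names rather than Altera resource names, and the mapping of each corner to its two adjacent sides is inferred from the statement that each DLL can access the two sides adjacent to its location.

```python
# Illustrative corner labels; each DLL reaches the two sides that meet at its corner.
DLL_SIDES = {
    "top_left": ("top", "left"),
    "top_right": ("top", "right"),
    "bottom_left": ("bottom", "left"),
    "bottom_right": ("bottom", "right"),
}
PHASE_OFFSETS_PER_DLL = 2


def dlls_for_side(side: str) -> list:
    """Return the DLLs that can serve I/O banks on the given side."""
    return sorted(name for name, sides in DLL_SIDES.items() if side in sides)


def total_phase_settings() -> int:
    """Number of distinct DLL phase-shift settings available in one device."""
    return len(DLL_SIDES) * PHASE_OFFSETS_PER_DLL


for side in ("top", "bottom", "left", "right"):
    print(side, dlls_for_side(side))
print(total_phase_settings())  # 8
```

Every side is reachable by exactly two DLLs in this model, which matches the flexibility the text describes for multiple-frequency and multiple-phase interfaces.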

For more information, refer to the Clock Networks and PLLs in Stratix III Devices chapter and the External Memory Interfaces chapter in the Stratix III Device Handbook.

**IOE Registers**

Stratix III IOE registers include the following enhancements over the previous generation of devices, which greatly simplify high-speed memory interface design:

- Single-ended or differential DQS signaling
- Alignment and synchronization registers
- Half data rate registers
- I/O clock divider
- Programmable delay
- Read and write leveling—one per subbank. For example, bank 1a, 1b, and 1c = three circuits

Figure 4 and Figure 5 show the Stratix III IOE structure.

**Figure 4. Stratix III IOE Input Registers**

**Notes:**

1. You can bypass each register except the first in this path.
2. The 0-phase resynchronization clock from the read-leveling delay chain.
3. The input clock can be from the DQS logic block (whether the postamble circuitry is bypassed or not) or from a global clock line.
4. This input clock comes from the CQn logic block.
5. This resynchronization clock can come either from the PLL or from the read-leveling delay chain.
6. The I/O clock divider resides adjacent to the DQS logic block. In addition to the PLL and read-levelled resynchronization clock, the I/O clock divider can also be fed by the DQS bus or CQn bus.
7. The half-rate data and clock signals feed into a dual-port RAM in the FPGA core.
8. You can dynamically change the `dataoutbypass` signal after configuration.

**Figure 5. Stratix III IOE Output Registers**

Notes:
(1) You can bypass each register block of the output and output-enable paths.
(2) Data coming from the FPGA core are at half the frequency of the memory interface.
(3) Half-rate and alignment clocks come from the PLL.
(4) These registers are only used in DDR3 SDRAM interfaces.
(5) The write clock can come from either the PLL or the write leveling delay chain. There is a 90° offset between the DQ write clock and DQS write clock.

**Single-ended or Differential DQS Signaling**
Stratix III devices directly support differential DQS mode and the single-ended standard supported in previous device families. While DDR SDRAM only supports single-ended DQS, DDR2 SDRAM additionally includes the option of differential DQS signaling.

Differential DQS signaling is recommended for DDR2 SDRAM designs operating at or above 333 MHz. Differential DQS strobe operation enables improved system timing due to reduced crosstalk and less simultaneous switching noise on the strobe output drivers. You can use single-ended DQS mode for DDR2 SDRAM interfaces, but it requires more pessimistic timing data and hence results in less system timing margin.

**DDR Registers**
Similar to the previous generation of devices, DDR registers are provided on all sides of the device so that DDR I/O structures can be directly implemented in the IOE, thus saving core logic and ensuring tight skew is easily maintained, which eases timing. Stratix III devices now feature four DLLs, so DQS capture mode is now supported on every side of the device.

**Alignment and Synchronization Registers**
In previous device families, the resynchronization registers had to be located in the core of the device, which made the placement of these registers with respect to the DDR IOE critical to achieving timing. Stratix III devices include the alignment and synchronization registers directly within the IOE. Timing is therefore significantly improved, and you no longer need to ensure critical register placement with respect to the DDR IOE. Typically, the resynchronization register is clocked via a dedicated output from the PLL. However, it may also be clocked directly from the read-leveling delay chain. The output alignment registers are typically clocked from the PLL.

Generally alignment and synchronization registers are optional and can be bypassed if not required; for ALTMEMPHY-based designs, these registers are required.

Regardless of interface speed, ALTMEMPHY always implements synchronization registers. Hence latency through the PHY may not be optimal for lower frequency designs.

Stratix III devices include only one leveling delay chain per I/O subbank. For example, subbank 1A includes a single leveling chain, 1B includes a second leveling chain, and so on.
If the resynchronization clock is sourced from the leveling delay chain, it may be cascaded from bank to bank, for example, from 1A to 1B. In this configuration, memory controllers must form a single contiguous block of DQS groups that are not staggered or interleaved with another memory controller.

**Half Data Rate Registers**
As external memory interface clock speeds increase, the core $f_{\text{MAX}}$ can become the limiting factor in interface design. A common solution, which relieves core $f_{\text{MAX}}$ timing problems, is to implement a half data rate architecture. This solution doubles the data width on the core side interfaces compared to a full-rate SDR solution, but also halves the required operating frequency. To simplify PHY design and provide easier design constraints, Stratix III devices include dedicated full-rate to half-rate registers within the IOE.
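
The width and frequency trade-off of a half data rate architecture can be checked with a short calculation. The Python sketch below assumes that the full-rate core-side bus is twice the DQ width at the memory clock frequency (two transfers per clock, presented as SDR), and that the half-rate bus is four times the DQ width at half that frequency; the function name is hypothetical.

```python
def core_interface(dq_width_bits: int, f_mem_mhz: float, half_rate: bool) -> tuple:
    """Return (core_data_width_bits, core_clock_mhz) for a DDR interface.

    Full rate: 2 x DQ width at the memory clock frequency.
    Half rate: 4 x DQ width at half the memory clock frequency.
    """
    if half_rate:
        return dq_width_bits * 4, f_mem_mhz / 2.0
    return dq_width_bits * 2, f_mem_mhz


full = core_interface(64, 400.0, half_rate=False)
half = core_interface(64, 400.0, half_rate=True)
print(full, half)

# The raw throughput (width x frequency) is identical in both modes.
assert full[0] * full[1] == half[0] * half[1]
```

The half-rate option therefore costs core-side width and some latency, but it lets the core logic run at half the memory clock frequency.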

**Clock Divider**
To simplify and reduce the number of clocks required, a dedicated I/O clock divider is provided on a per DQS group basis, which can directly source the half-rate resynchronization clock from the full rate version.

To ease data alignment, a single I/O clock divider may be used for an entire interface, as the half rate resynchronization clock can be cascaded from DQ group to the adjacent DQ group. Hence, when using a common I/O clock divider, the high and low bit order may be performed across the entire interface. Individual I/O clock dividers require the data alignment to be performed on a DQ group basis.

ALTMEMPHY-based designs use a single I/O clock divider when operating in half-rate mode to reduce PHY complexity and reduce latency. Hence, ALTMEMPHY-based interfaces operating in half-rate mode cannot be interleaved.

**Programmable Delay**
Stratix III I/O registers include programmable delay chains that you may use to deskew interfaces. Each pin can have different delay settings, hence read and write margins can be increased as uncertainties between signals can be minimized.

ALTMEMPHY-based designs do not use dynamic delay chains to deskew interfaces.

**Read and Write Leveling**
Stratix III I/O registers include read and write leveling circuitry to enable skew to be removed or applied to the interface on a DQS group basis. There is one leveling circuit located in each I/O subbank.

ALTMEMPHY-based designs for DDR and DDR2 SDRAM do not use leveling circuitry, as it adds latency. DDR and DDR2 SDRAM interfaces typically have balanced, well-matched PCB delays and use a synchronized single-cycle clock structure.

**IOE OCT Features**

Stratix III devices support dynamic calibrated OCT—previous Stratix devices did not. This feature allows the specified series termination to be enabled during writes, and parallel termination to be enabled during reads. In addition to series OCT, Stratix III devices also allow slew rate control to be applied with drive strength options. These I/O features allow you to greatly simplify PCB termination schemes.

For further information, refer to the *Stratix III Device I/O Features* chapter in the *Stratix III Device Handbook* and *AN 408: DDR2 Memory Interface Termination, Drive Strength, and Loading Design Guidelines*.

### DDR and DDR2 SDRAM Interface Termination and Topology

This section discusses signal topology and termination of DDR and DDR2 SDRAM interfaces.

For more information, refer to memory vendor application notes and *AN 408: DDR2 Memory Interface Termination, Drive Strength, and Loading Design Guidelines*.

All DDR and DDR2 SDRAM interfaces use the following two classes of signal type:

- Unidirectional class I terminated signals, which include clocks, and address and command signals
- Bidirectional class II terminated signals, which include DQS, DQ, and DM signals

Although DM is output only, it typically has the same termination and topology as the DQ and DQS pins, to ensure identical timing characteristics.

In general, the only significant electrical difference between DDR and DDR2 SDRAM interfaces is that DDR SDRAM interfaces use the SSTL-2 (2.5-V) I/O standard, and DDR2 SDRAM interfaces use the SSTL-18 (1.8-V) I/O standard. In both cases, the $V_{TT}$ power source must be specified to both source and sink current.

**Unidirectional Class I Terminated Signals**

All class I signals are multiload signals: they either go to a DIMM that has multiple memory devices, or they go to all memory devices that make up the interface. The ideal topology is a balanced symmetrical tree. Altera recommends placing the class I termination to $V_{TT}$:

- At the DIMM connector (for interfaces using DIMMs)
- At the first split or division of the symmetrical tree for discrete devices

Nonsymmetrical topologies or DIMMs result in overshoot or undershoot and oscillations on the line, which may require compensation capacitors or a lower than ideal drive strength, resulting in derated interface performance.

Memory clocks are typically chosen to ensure an even and matched number of loads on each clock pair, so that the timing to each memory device is consistent assuming equal trace delays. Each clock pair should be loaded to ensure that significant slew rate distortion does not occur. Memory clocks are typically differentially terminated with an effective 100-Ω resistance. You can achieve 100-Ω differential termination in one of the following ways:

- 100-Ω single resistor directly between the positive and negative signal.
- 50-Ω single-ended resistor to $V_{TT}$ on each positive and negative pin.
- 100-Ω up to $V_{CC}$ and 100 Ω down to ground on each positive and negative pin.

Electrically all these solutions look the same to differential AC signals.
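
The equivalence of the three options can be checked with a short calculation. The following Python sketch is illustrative only: it assumes ideal resistors and treats $V_{TT}$, $V_{CC}$, and ground as ideal AC grounds, and the function and scheme names are hypothetical.

```python
def parallel(r1: float, r2: float) -> float:
    """Resistance of two resistors in parallel, in ohms."""
    return (r1 * r2) / (r1 + r2)


def differential_resistance(scheme: str) -> float:
    """Effective AC resistance seen between the positive and negative pins."""
    if scheme == "single_100":
        # One 100-ohm resistor directly across the pair.
        return 100.0
    if scheme == "vtt_50_each":
        # Each pin sees 50 ohms to VTT (assumed AC ground); the legs are in series.
        return 50.0 + 50.0
    if scheme == "split_100_100_each":
        # Each pin sees 100 ohms to VCC in parallel with 100 ohms to ground.
        per_pin = parallel(100.0, 100.0)
        return per_pin + per_pin
    raise ValueError(f"unknown scheme: {scheme}")


SCHEMES = ("single_100", "vtt_50_each", "split_100_100_each")
results = {name: differential_resistance(name) for name in SCHEMES}
for name, ohms in results.items():
    print(f"{name}: {ohms:.0f} ohms differential")
```

The options differ in DC behavior (for example, static current drawn from the supplies), which this AC-only model does not capture.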

For information about the electrical I/O termination, refer to the *Stratix III Device I/O Features* chapter of the *Stratix III Device Handbook*.

FPGA drive strength and series termination settings should maximize edge rate while ensuring that overshoot or undershoot does not occur.

The combination of drive strength and slew rate control, or the output series termination options, means that Stratix III devices can be configured for any class I termination scheme.

For further information, refer to Micron Technical Note TN4720: Point-to-Point Package Sizes and Layout Basics.

**Bidirectional Class II Terminated Signals**

Class II signals are typically point-to-point, unless you are using either:

- Multiple DIMMs
- Stacked or dual rank DIMMs or topologies

Stratix III devices include on-chip series and parallel termination. So, generally, discrete termination at the FPGA end of the line is not required.

DDR2 SDRAM devices support dynamic parallel ODT at the memory end of the line. So, typically, discrete termination is not required at the memory end of the line.

DDR SDRAM devices do not include ODT. So, generally, parallel termination is required at the memory end of the line.

DDR and DDR2 SDRAM DIMMs include a series terminator and DDR2 SDRAM includes drive strength control. So if you are using DIMMs, a series terminator at the memory end of the line is never required.


ALTMEMPHY-based designs do not support multiple DIMMs or dual-rank stacked topologies, because calibration only takes place on the first rank.

**ALTMEMPHY Megafunction Overview**

The Altera ALTMEMPHY megafunction allows the rapid creation of a physical layer interface (PHY) in Stratix III devices. The PHY safely transfers data between memory and user logic. The ALTMEMPHY megafunction GUI enables rapid configuration of the PHY. You can use the ALTMEMPHY megafunction with either a user-designed controller or the Altera DDR and DDR2 SDRAM High-Performance Controllers. You can parameterize the ALTMEMPHY megafunction to support the following features:

- Full-rate or half-rate operation
- Single-ended or differential DQS mode
- Dynamic termination

The ALTMEMPHY megafunction supports an initial calibration sequence to minimize the effect of process variations in the FPGA and memory device. During operation, the voltage and temperature (VT) tracking mechanism eliminates the effects of VT variation on timing margin. The calibration process centers the resynchronization clock phase in the middle of the data valid window to maximize the setup and hold margin.

Additionally, the ALTMEMPHY megafunction automatically generates all required TimeQuest timing constraints.

All published Stratix III DDR and DDR2 SDRAM performance data assume the design uses the ALTMEMPHY megafunction.

Altera recommends that you use an ALTMEMPHY-based DDR or DDR2 SDRAM design whenever possible. However, in some situations a simpler ALTDQ_DQS solution may be preferable.

For more information, refer to the *External DDR Memory PHY Interface (ALTMEMPHY) Megafunction User Guide.*

For more information, refer to the *ALTDLL and ALTDQ_DQS Megafunctions User Guide.*

**DDR and DDR2 SDRAM in Stratix III Devices Design Flow**

Altera recommends the design guidelines described in this section as best practices for successful memory interface implementation in Stratix III devices. These guidelines provide the fastest out-of-the-box experience with external memory interfaces in Stratix III devices. Figure 6 shows the design flow required for Stratix III memory interfaces. Each step is discussed in detail in the following sections. This flow uses the DDR2 SDRAM High-Performance Controller.

Figure 6. Design Flow for Implementing External Memory Interfaces in Stratix III Devices

(Flowchart. Steps: Start Design; Select Device; Determine Board Design Constraints; Perform Board Level Simulations; Adjust Termination Drive Strength; Add Constraints; Instantiate PHY and Controller in a Quartus II Project; Perform RTL/Functional Simulation (optional); Debug Design; Compile Design and Verify Timing; Adjust Constraints; Verify Design Functionality on Board; Design Done. Decision points: Does Simulation Give Expected Results? Does the Design Have Positive Margin? Do Signals Meet Electrical Requirements? Is Design Working?)

**Select a Device**

This section discusses the following topics:

- Bandwidth
- Full or Half Rate SDRAM Controller
- PLL and Clock Usage
- DLL Usage and Sharing
- Top, Bottom, Left, Right, and Hybrid Device Sides
- DQ and DQS Width Limits
- Address and Command, Clock, and Other Signals

Memory controllers in Stratix III devices require access to dedicated IOE features, PLLs, and several clock networks. Although Stratix III devices are feature rich in all of these areas, you must still plan resources and pins in detail whenever you implement complex IP or multiple IP cores. This section provides an overview of what to consider in such instances.

For more information, refer to the Stratix III Device Handbook and the relevant IP user guides.

Altera recommends that you create an example top-level design with the desired pin outs and all interface IP instantiated, which enables the Quartus II software to validate your design and resource allocation before PCB and schematic sign off.

As the structure of memory controllers varies considerably, this section uses the ALTMEMPHY architecture, where appropriate.

**Bandwidth**

Before designing any memory interface, determine the required bandwidth of the memory interface. Bandwidth can be expressed as:

\[
\text{Bandwidth} = \text{data width (bits)} \times \text{data rate transfer} (1/s) \times \text{efficiency}
\]

Data rate transfer (1/s) = 2 × frequency of operation
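
As a worked instance of the formula above, the following Python sketch computes the usable bandwidth of a hypothetical interface. The 64-bit width, 333-MHz clock, and 70% efficiency are assumed example values, not recommendations.

```python
def bandwidth_bits_per_s(data_width_bits: int, clock_hz: float, efficiency: float) -> float:
    """Bandwidth = data width (bits) x data transfer rate (1/s) x efficiency."""
    data_rate = 2 * clock_hz  # DDR transfers data on both clock edges
    return data_width_bits * data_rate * efficiency


# Assumed example: 64-bit interface, 333-MHz memory clock, 70% efficiency.
bw = bandwidth_bits_per_s(64, 333e6, 0.70)
print(f"Usable bandwidth: {bw / 1e9:.2f} Gbps")
```

Rearranging the same expression gives the minimum clock frequency or data width needed to meet a bandwidth target.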

After calculating the bandwidth requirements of your system, determine which memory type and device to use. Altera has a memory selection white paper, which highlights the differences between the memory types.

For information about selecting the different memory types, refer to the Selecting the Right High-Speed Memory Technology for Your System white paper.

DRAM typically has an efficiency of around 70%, but when you use the Altera memory controller, efficiency can vary from 10% to 92%.

For information on DDR and DDR2 SDRAM efficiency, refer to *The Efficiency of the DDR & DDR2 SDRAM Controller Compiler* white paper.

In addition, Altera's FPGA devices support various data widths for different memory interfaces. The memory interface support between density and package combinations differs, so you must determine which FPGA device density and package combination best suits your application.

For information about the FPGA density and package support for the different memory types, refer to the *External Memory Interfaces in Stratix III Devices* chapter of the *Stratix III Device Handbook*.

**Full or Half Rate SDRAM Controller**

When implementing memory controllers, consider whether a half-rate or a full-rate datapath is optimal for your design. Full-rate and half-rate modes are defined as follows:

- Full-rate mode presents data to the local interface at twice the width of the actual SDRAM interface at the full SDRAM clock rate
- Half-rate mode presents data to the local interface at four times the width of the actual SDRAM interface at half the SDRAM clock rate
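
The two definitions can be expressed as a small calculation. This Python sketch is illustrative: the function name and the 16-bit, 200-MHz example are assumptions. It shows that both modes present the same peak data rate to the local interface.

```python
def local_interface(mem_width_bits: int, mem_clock_mhz: float, mode: str):
    """Return (local data width in bits, local clock in MHz) for a rate mode."""
    if mode == "full":
        # Twice the SDRAM width at the full SDRAM clock rate.
        return 2 * mem_width_bits, mem_clock_mhz
    if mode == "half":
        # Four times the SDRAM width at half the SDRAM clock rate.
        return 4 * mem_width_bits, mem_clock_mhz / 2
    raise ValueError(f"unknown mode: {mode}")


# Assumed example: 16-bit SDRAM interface clocked at 200 MHz.
for mode in ("full", "half"):
    width, clock = local_interface(16, 200.0, mode)
    peak_mbps = width * clock
    print(f"{mode}-rate: {width} bits at {clock:.0f} MHz ({peak_mbps:.0f} Mbps peak)")
```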

Implementing memory controllers in half-rate mode results in the highest possible SDRAM clock frequency, while allowing the more complex core logic to operate at half this frequency. This implementation is most useful when core HDL designs are difficult to implement at the higher SDRAM clock frequency, but the required SDRAM bandwidth per I/O pin is still quite high.

However, controller operations are faster in full-rate mode with the IP operating at the same clock frequency as your system.

Consider that DDR devices can have a number of banks open at once. Each bank has a currently selected row. Changing the column within the selected row of an open bank requires no additional bank management commands. Changing the row in an active bank, or changing the bank, incurs a protocol penalty: a precharge (PCH) command must close the active row or bank, and an activate (ACT) command must open the new row or bank combination.
The duration of this penalty is a function of the controller clock frequency, the memory clock frequency, and the memory device characteristics. Calculating the impact of a change of memory and controller configuration on a given system is not a trivial task, as it depends on the nature of the accesses that are performed.

For example, each command occupies a single memory clock cycle in a full-rate controller, but two memory clock cycles in a half-rate controller. The bank is not available for the subsequent ACT command until \( t_{RP} \) after the PCH. So while the issuing of commands can be slower using a half-rate controller, the respective memory timing parameters remain the same.

Hence when a memory controller is used in half-rate mode, the control circuitry is clocked at half rate and so control operations are slower than in full-rate mode. However, the memory’s clock frequency and physical properties are not affected.
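
The effect can be estimated with a simplified model. The following Python sketch assumes that the controller issues commands only on its own clock edges, so the wait between PCH and ACT rounds up to a whole number of controller cycles. The 15-ns $t_{RP}$ and 200-MHz memory clock are assumed example values, and real controllers may schedule commands differently.

```python
import math


def precharge_wait(t_rp_ns: float, mem_clock_mhz: float, mode: str):
    """Return (controller cycles, elapsed ns) between PCH and the next ACT."""
    mem_period_ns = 1000.0 / mem_clock_mhz
    # A half-rate controller cycle lasts two memory clock periods.
    ctrl_period_ns = mem_period_ns * (2 if mode == "half" else 1)
    cycles = math.ceil(t_rp_ns / ctrl_period_ns)
    return cycles, cycles * ctrl_period_ns


# Assumed example values: tRP = 15 ns, 200-MHz memory clock.
for mode in ("full", "half"):
    cycles, elapsed = precharge_wait(15.0, 200.0, mode)
    print(f"{mode}-rate: {cycles} controller cycles, {elapsed:.0f} ns")
```

Under these assumptions the memory requirement (15 ns) is identical in both modes, but the half-rate controller waits 20 ns because of its coarser command granularity.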

Half-rate memory controllers always require at least one extra clock resource, the half-rate system clock, in addition to the number of full-rate clocks that are required by default.

**PLL and Clock Usage**

The exact number of clocks and hence PLLs required in your design depends greatly on the memory interface frequency, and the IP used.

You can build simple low-speed DDR interfaces that typically require only two clocks: system and write. You can then use the rising and falling edges of these two clocks to derive four phases (0°, 90°, 180°, and 270°).

However, as clock speeds increase, the timing margin decreases and additional clocks are required, to optimize setup and hold and meet timing. Typically, the following dedicated clocks may be required: resynchronization, and address and command.

Stratix III IOE includes dedicated circuitry for postamble protection, which is derived directly from the resynchronization clock.

In addition, some memory controller designs, like the ALTMEMPHY megafunction, use a VT tracking clock to measure and compensate for VT changes and their effects.

Half-rate memory controllers also require an additional half-rate system clock.

Consider the following points:

- PLLs in Stratix III devices connect to a maximum of four global clock nets.
- Top or bottom PLLs in Stratix III devices connect to a maximum of ten regional clock nets.
- Left or right PLLs in Stratix III devices connect to a maximum of six regional clock nets.
- EP3S...80 and larger devices have two PLLs located in the middle of each side of the device.
- EP3S...200 and larger devices additionally have corner PLLs, which connect to six regional clock nets only.
- Dual regional clock nets are created by using a regional clock net from each region. For example, a single dual regional clock net uses two regional clock nets.
- If the design uses a dedicated PLL only to generate a DLL input reference clock, the PLL mode must be set to **No Compensation**, or the Quartus II software forces this setting automatically.
- If the design cascades PLLs, the source (upstream) PLL should have a low-bandwidth setting, while the destination (downstream) PLL should have a high-bandwidth setting.
- In Stratix III devices, two PLLs may be cascaded to each other through the clock network. In addition, where two PLLs exist adjacent to each other, there is a direct connection between them that does not require the global clock network. Using this path reduces clock jitter when cascading PLLs. Cascaded PLLs are not recommended for ALTMEMPHY-based designs.

  > You can only cascade PLLs between adjacent PLLs on the same side of the device.

  > If PLLs are cascaded in ALTMEMPHY-based designs, you must use the adjacent PLL (direct connection) method.

- Input and output delays are only fully compensated for, when the dedicated clock input pins associated with that specific PLL are used as its clock source.
- If the clock source for the PLL is not a dedicated clock pin for that specific PLL, jitter is increased, timing margin suffers, and the design may require an additional global or regional clock.

The following additional ALTMEMPHY-specific points apply:

- ALTMEMPHY megafunctions require one global or regional clock, and five regional clock nets in Stratix III devices. Hence six clocks in total are required.

Any PLL on any side of a Stratix III device can support a single ALTMEMPHY interface. Ideally, you should pick a PLL and a PLL input clock pin that are located on the same side of the device as the memory interface pins.

Because each PLL can connect to only four global clock nets, while the ALTMEMPHY megafunction requires six clock nets, an ALTMEMPHY-based design cannot cross from one side of a Stratix III device to the other side. That is, an ALTMEMPHY-based design can only exist within a dual regional side of a Stratix III device.

If a single ALTMEMPHY interface spans two side quadrants, a middle side PLL must be the source for that interface. The ten dual region clocks that the single interface requires prevent the design from using the adjacent PLL (if available) for a second interface.

If a single ALTMEMPHY interface spans two top or bottom quadrants, a middle top or bottom PLL must be the source for that interface. The ten dual region clocks that the single interface requires should not prevent the design from using the adjacent PLL (if available) for a second interface.

For more information on clock networks, refer to Clock Networks and PLLs in Stratix III Devices in the Stratix III Device Handbook.

For more information on multiple memory controllers, refer to AN 462: Implementing Multiple Memory Interfaces Using the ALTMEMPHY Megafunction.

**DLL Usage and Sharing**

DDR and DDR2 SDRAM interfaces in Stratix III devices use DQS phase-shift circuitry for data capture. All Stratix III devices include a total of four DLLs: one located in each corner of the device. Each DLL can support two different phase offsets, and each DLL can access the two sides adjacent to its location. Hence, there are opportunities for DLL sharing or multiple different memory interface types on a single side of a Stratix III device.

DLL reference clocks must come from either dedicated clock input pins located on either side of the DLL or from specific PLL output clocks. Any clock running at the memory frequency is valid for the DLLs.

For more information on DLLs, refer to the External Memory Interfaces chapter in the Stratix III Device Handbook.

To minimize the number of clocks routed directly on the PCB, this reference clock is typically sourced from the memory controller's PLL. In general, DLLs can use the PLLs directly adjacent to them (corner PLLs when available) or the closest PLL located in the two sides adjacent to its location.

When designing for 780-pin packages with SE80, SE110, and SL150 devices, the PLL to DLL reference clock connection is limited.

Figure 7 shows the PLL and DLL reference clock connections for devices in the 780-pin package. DLL3 is isolated from a direct PLL connection and can only receive a reference clock externally from pins clk[11:4].p.

The DLL reference clock should be the same frequency as the memory interface, but the phase is not important.

The required DQS capture phase is optimally chosen based on operating frequency and external memory interface type (DDR, DDR2, DDR3, QDRII, or RLDRAM II). As each DLL supports two possible phase offsets, two different memory interface types operating at the same frequency can easily share a single DLL. More may be possible, depending on the phase shift required.
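
The sharing rule can be sketched as a simple check. This Python example is illustrative only: the interface names, frequencies, and phase values are hypothetical, and the check covers only the two constraints stated here (one reference frequency and at most two phase offsets per DLL), not device-specific routing limits.

```python
MAX_PHASE_OFFSETS_PER_DLL = 2  # each DLL supports two different phase offsets


def can_share_dll(interfaces) -> bool:
    """interfaces: iterable of (name, frequency_mhz, dqs_phase_deg) tuples."""
    frequencies = {freq for _, freq, _ in interfaces}
    phases = {phase for _, _, phase in interfaces}
    # One DLL locks to a single reference frequency and offers two phase offsets.
    return len(frequencies) == 1 and len(phases) <= MAX_PHASE_OFFSETS_PER_DLL


# Hypothetical interfaces; the phase values are placeholders, not IP defaults.
same_freq = [("if_a", 200.0, 90), ("if_b", 200.0, 72), ("if_c", 200.0, 90)]
mixed_freq = [("if_a", 200.0, 90), ("if_d", 267.0, 90)]

print(can_share_dll(same_freq))   # three interfaces, but only two distinct phases
print(can_share_dll(mixed_freq))  # different frequencies cannot share one DLL
```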

Altera memory IP always specifies a default optimal phase setting. To override this setting, refer to the respective IP user guide.

To simplify the interface to core IP connections, multiple memory interfaces operating at the same frequency usually share the same system and static clocks as each other where possible. This sharing minimizes the number of dedicated clock nets required and reduces the number of different clock domains found within the same design.

As each DLL can directly drive four banks, but each PLL only has complete C (output) counter coverage of two banks (using dual regional networks), situations can occur where a second PLL operating at the same frequency is required. Because cascaded PLLs increase jitter and reduce timing margin, first check whether an alternative second DLL and PLL combination is available and more suitable.

**Top, Bottom, Left, Right, and Hybrid Device Sides**

This section discusses how to determine which device side to use (top and bottom, left and right, and hybrid).

**Top or Bottom and Left or Right Interfaces**

Ideally any interface should wholly reside in a single bank. However, interfaces that span multiple adjacent banks or the entire side of a device are also fully supported. Although vertical and horizontal timing parameters are not identical, timing closure can be achieved on all sides of the FPGA for the maximum interface frequency.

**Hybrid Interfaces**

The PLL regional clock net restriction and the fact that each DLL can drive its two adjacent sides suggests that an optimal PLL, DLL, and memory interface configuration resides in a single quadrant spanning two adjacent sides of the device. For maximum performance, Altera recommends that data groups for external memory interfaces should ideally reside within a single bank, but always within the same side of a device. High-speed memory interfaces in top or bottom versus left or right IOE have different timing characteristics and timing margins are affected.

However, Altera can support interfaces with hybrid data groups that wrap around a corner of the device between vertical and horizontal I/O at some speeds (see Table 6).

Table 6. Hybrid Memory Interface Speeds (Half Rate) Notes (1) through (4)

<table>
<thead>
<tr>
<th>Memory Type</th>
</tr>
</thead>
<tbody>
<tr>
<td>DDR2 SDRAM</td>
</tr>
<tr>
<td>DDR SDRAM</td>
</tr>
</tbody>
</table>

**Notes to Table 6:**

(1) Numbers are preliminary until characterization is final. The supported operating frequencies listed here are memory interface maximums for the FPGA device family. Your design’s actual achievable performance is based on design and system specific factors, and static timing analysis of the completed design.

(2) Applies to both DIMMs and components.

(3) In the Quartus II software version 8.0, hybrid functionality is not allowed and you see the following Fitter error:

"Error: Cannot place DQ I/O "mem_dq[nn]" to I/O location Pin_Nn since its memory interface I/O group cannot be placed"

(4) The Quartus II software version 8.0 SP1 includes native support for hybrid interfaces at these rates.

(5) At 1.1-V core voltage, the –4L speed grade devices have the same performance as the –4 speed grade devices.

**DQ and DQS Width Limits**

Stratix III devices do not limit the width of DDR or DDR2 SDRAM interfaces beyond the following requirements:

- All DQ, clock, and address signals of the interface should reside within the same bank or side of the device.
- The maximum possible interface width in any particular device is limited by the number of DQS groups available within that bank or side (see Table 5).
- Sufficient regional clock networks must be available to the interface PLL to allow implementation within the required number of quadrants.
- Sufficient spare pins must exist within the chosen bank or side of the device to include all other address and command, and clock pin placement requirements.
- The greater the number of banks, the greater the skew. Altera therefore recommends that you always generate a test project of your desired configuration and confirm that it meets timing.

**Address and Command, Clock, and Other Signals**

This section describes the following signals:

- Address and command
- Clock
- Other signals

**DDR SDRAM Component Additional Pins**

The largest individual DDR SDRAM components typically available are 1 Gb ×4 devices. These devices usually require a maximum of 29 pins, which can be broken down in the following way:

- 4 DQ pins
- 1 DM pin
- 1 DQS pin
- 14 A[13:0] pins
- 2 BA[1:0] pins
- 2 CK and CK# pins
- 5 CKE, CS#, RAS#, CAS#, WE# pins

DQ, DM, and DQS should reside in a dedicated ×4 DQS group. The remaining 23 additional signals should be placed within the same bank.

**DDR SDRAM DIMM Additional Pins**

The largest DDR SDRAM DIMMs typically available are 2 GB ×72 dual rank modules. These modules usually require a maximum of 119 pins, which can be broken down in the following way:

- 72 DQ pins
- 9 DM[8:0] pins
- 9 DQS[8:0] pins
- 14 A[13:0] pins
- 2 BA[1:0] pins
- 6 CK and CK#[2:0] pins
- 7 CKE[1:0], CS#[1:0], RAS#, CAS#, WE# pins

DQ, DM, and DQS should reside in 9 ×8 DQS groups ensuring that DQ group pin order is maintained. The remaining 29 additional signals should be placed within the same bank or side of the device.

ALTMEMPHY-based interfaces do not directly support dual rank implementations.

**DDR2 SDRAM Component Additional Pins**

The largest individual DDR2 SDRAM components typically available are 1 Gb ×4 devices. These devices usually require a maximum of 32 pins, which can be broken down in the following way:

- 4 DQ pins
- 1 DM pin
- 2 DQS and DQSn pins
- 14 A[13:0] pins
- 3 BA[2:0] pins
- 2 CK and CK# pins
- 5 CKE, CS#, RAS#, CAS#, WE# pins
- 1 ODT pin

DDR2 SDRAM has additional ODT and DQSn signals compared with DDR SDRAM. Stratix III ×4 DQS groups support either DQSn or DM, but not both. Additional DM pins can be placed in a spare DDIO pin located within the same bank (to minimize skew), if you do not require leveling delay chains.

ALTMEMPHY-based interfaces do not support both differential DQS mode and DM in a ×4 configuration.

DQ, DQS, and DQSn (and DM) should reside in a dedicated ×4 DQS group. The remaining 25 or 26 (with DM) additional signals should be placed within the same bank.

Future 2 Gb and 4 Gb devices will require A14 and A15 signal connectivity.

**DDR2 SDRAM DIMM Additional Pins**

The largest DDR2 SDRAM DIMMs typically available are 4 GB ×72 dual rank modules. These modules usually require a maximum of 132 pins, which can be broken down in the following way:

- 72 DQ pins
- 9 DM[8:0] pins
- 9 DQS[8:0] pins
- 9 DQSn[8:0] pins
- 15 A[14:0] pins
- 3 BA[2:0] pins
- 6 CK and CK#[2:0] pins
- 7 CKE[1:0], CS#[1:0], RAS#, CAS#, WE# pins
- 2 ODT[1:0] pins

DQ, DM, DQS, and DQSn should reside in 9 ×8 DQS groups, ensuring that DQ group pin order is maintained. The remaining 33 additional signals should be placed within the same bank or side of the device.

ALTMEMPHY-based interfaces do not directly support dual rank implementations.
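
The pin totals quoted in the four preceding subsections can be cross-checked with a short tally. This Python sketch uses only the pin counts listed above; the dictionary layout and function name are illustrative.

```python
# Pin counts per signal class, copied from the lists in this section.
CONFIGS = {
    "DDR component":  {"DQ": 4, "DM": 1, "DQS": 1, "A": 14, "BA": 2, "CK": 2, "CTRL": 5},
    "DDR DIMM":       {"DQ": 72, "DM": 9, "DQS": 9, "A": 14, "BA": 2, "CK": 6, "CTRL": 7},
    "DDR2 component": {"DQ": 4, "DM": 1, "DQS": 1, "DQSn": 1, "A": 14, "BA": 3,
                       "CK": 2, "CTRL": 5, "ODT": 1},
    "DDR2 DIMM":      {"DQ": 72, "DM": 9, "DQS": 9, "DQSn": 9, "A": 15, "BA": 3,
                       "CK": 6, "CTRL": 7, "ODT": 2},
}

# Signals that belong in the dedicated DQS groups.
DQS_GROUP_SIGNALS = {"DQ", "DM", "DQS", "DQSn"}


def tally(pins: dict):
    """Return (total pins, pins that must be placed outside the DQS groups)."""
    total = sum(pins.values())
    in_groups = sum(n for name, n in pins.items() if name in DQS_GROUP_SIGNALS)
    return total, total - in_groups


for name, pins in CONFIGS.items():
    total, remaining = tally(pins)
    print(f"{name}: {total} pins total, {remaining} outside the DQS groups")
```

For the DDR2 component the tally gives 25 remaining signals with DM inside the ×4 group; if DM is placed on a spare pin instead, the remaining count is 26.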

**RUP and RDN Calibration Blocks**

If calibrated series, parallel, or dynamic termination is used for the I/O in your design, your design requires a calibration block. This block requires a pair of RUP and RDN pins located within the same VCCIO voltage bank.

This calibration block is not required to be within the same bank or side of the device as the IOEs it is serving. However, RUP and RDN pins are typically shared with DQ and DQS pins in Stratix III devices.

DQS and DQSn pins in some of the ×4 groups can also be used as RUP and RDN pins. You cannot use a ×4 group for memory interfaces if you are using its pin members as RUP and RDN pins for OCT calibration. You may use the ×8/×9 group that includes this ×4 group, if either of the following conditions apply:

- You are not using DM pins with your differential DQS pins
- You are not using complementary or differential DQS pins

A ×8/×9 group comprises 12 pins, because the group is formed by stitching together two ×4 groups of 6 pins each. A typical ×8 or ×16 DDR2 SDRAM device uses one DQS, one DM, and 8 DQ pins per byte, which totals 10 pins. So if you choose your pin assignment carefully, you can use the 2 extra pins for RUP and RDN.

If you are using both DM and differential DQS modes at the same time, pick different pin locations for RUP and RDN pins—for example, in the bank that contains address and command pins.
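
The pin budget behind these conditions can be sketched as follows. This Python example only counts pins in a 12-pin ×8/×9 group; it is a necessary check, not a sufficient one, because the RUP and RDN functions exist only on specific pins. The function names are hypothetical.

```python
GROUP_PINS_X8_X9 = 12  # two stitched x4 groups of 6 pins each
RUP_RDN_PINS = 2


def spare_pins(use_dm: bool, differential_dqs: bool, dq_pins: int = 8) -> int:
    """Pins left in the x8/x9 group after placing DQ, DQS, DQSn, and DM."""
    used = dq_pins + 1                  # DQ pins plus DQS
    used += 1 if differential_dqs else 0  # DQSn
    used += 1 if use_dm else 0            # DM
    return GROUP_PINS_X8_X9 - used


def has_room_for_rup_rdn(use_dm: bool, differential_dqs: bool) -> bool:
    return spare_pins(use_dm, differential_dqs) >= RUP_RDN_PINS


print(has_room_for_rup_rdn(use_dm=True, differential_dqs=False))
print(has_room_for_rup_rdn(use_dm=False, differential_dqs=True))
print(has_room_for_rup_rdn(use_dm=True, differential_dqs=True))
```

Only the last case, DM together with differential DQS, leaves a single spare pin, which is why RUP and RDN must then be placed elsewhere.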

You need to pick your DQS and DQ pins manually for the ×8, ×16 and ×18, or ×32 and ×36 groups if any of their pins are used for RUP and RDN. The Quartus II software may not place these pins correctly and may give you a no-fit.

**Instantiate PHY and Controller in a Quartus II Project**

After selecting the appropriate device and memory type, create a project in the Quartus® II software that targets the device and memory type.

When instantiating the datapath for DDR and DDR2 SDRAM interfaces in Stratix III devices, Altera recommends that you use the ALTMEMPHY megafunction for the datapath and PHY. The ALTMEMPHY megafunction features a license-free PHY that you may use with the Altera SDRAM high-performance controllers or your own custom controller.

The Altera high-performance controllers automatically include the ALTMEMPHY megafunction. Even if you plan to use your own controller, Altera recommends that you first create a design using an SDRAM high-performance controller and then replace the Altera controller with your own controller. This method gives you an example design, which you can simulate and verify on your own PCB.

For more information about instantiating the PHY, refer to the *External DDR Memory PHY Interface (ALTMEMPHY) Megafunction User Guide*.

**Perform Board-Level Simulations and Line Simulation**

This design flow indicates that you determine board design constraints and perform board-level simulations at the end of the flow. However, Altera recommends that prelayout SI simulations (line simulations) take place before board layout and that you use these parameters and rules during the initial design development cycle. Advanced I/O timing and board trace models now directly impact device timing closure.

**Add Constraints**

The next step in the design flow is to add the timing, location, and physical constraints related to the external memory interface. These constraints include timing, pin locations, I/O standards, and pin loading assignments. The ALTMEMPHY megafunction only supports timing analysis using the TimeQuest Timing Analyzer with Synopsys Design Constraints (.sdc) assignments. These constraints are derived from the parameters you entered for the ALTMEMPHY megafunction or the SDRAM high-performance controller, based on the DDR2 and DDR SDRAM data sheet and tolerances from the board layout. The ALTMEMPHY megafunction uses TimeQuest timing constraints and the timing driven fitter to achieve timing closure.

After you instantiate the ALTMEMPHY megafunction, the ALTMEMPHY MegaWizard generates the following files that you need to properly constrain the design:

- `<variation_name>_phy_ddr_timing.sdc` to set timing constraints
- `<variation_name>_pin_assignments.tcl` to add I/O standard setting assignments
- `<variation_name>_phy_assign_dq_groups.tcl` to add the DQ group assignments to relate the DQ and DQS pin groups together for the Quartus II fitter to place them correctly

These script files are based on the design name used when instantiating the ALTMEMPHY megafunction. If you plan to use your own top-level design, you must edit the scripts to match your custom top-level design.

For more information about creating, generating, and setting the constraints for the design, refer to the *External DDR Memory PHY Interface (ALTMEMPHY) Megafunction User Guide*.

To determine which drive strength and termination to use, refer to *AN 408: DDR2 Memory Interface Termination, Drive Strength, and Loading Design Guidelines*.

**Plan Resources**

This section describes planning resources.

Table 7 shows the pin placements that Altera recommends.

<table>
<thead>
<tr>
<th>Signal</th>
<th>Pin on FPGA</th>
<th>Pin on Memory Device</th>
</tr>
</thead>
<tbody>
<tr>
<td>Data (mem_dq)</td>
<td>DQ</td>
<td>DQ</td>
</tr>
<tr>
<td>Data mask (mem_dm)</td>
<td>DQ (1)</td>
<td>DM</td>
</tr>
<tr>
<td>Data strobe (mem_dqs)</td>
<td>DQS or DQSn</td>
<td>DQS or DQS#</td>
</tr>
<tr>
<td>Memory clock (mem_clk)</td>
<td>DQ, DQS, or DQSn (2), (3)</td>
<td>CK or CK#</td>
</tr>
<td>Address</td>
<td>Any user I/O (4)</td>
<td>A or BA</td>
</tr>
<tr>
<td>Command</td>
<td>Any user I/O (4)</td>
<td>CS#, RAS#, CAS#, WE#, CKE, or ODT</td>
</tr>
</tbody>
</table>

**Notes to Table 7:**

1. The DM pins must be in the DQ group.
2. Use any unused DQ or DQS pins with DIFFIO_RX capability for mem_clk[0] and mem_clk_n[0].
3. Use any unused DQ or DQS pins with DIFFOUT capability for mem_clk[n:1] and mem_clk_n[n:1], where n is greater than or equal to 1.
4. Ensure that address and command pins are placed on the same side of the device as the memory clock pins. Also, if OCT is used, ensure that the RUP and RDN pins are assigned correctly.

The SDRAM high-performance controllers do not generate pin assignments for non-memory signals such as clock sources, or pin location assignments for the design. Launch Pin Planner to make these assignments to the design.

**Advanced I/O Timing**

As part of I/O planning, especially with high-speed designs, you should take board-level signal integrity and timing into account. When adding an FPGA device with high-speed interfaces to a board design, the quality of the signal at the far end of the board route, and the propagation delay in getting there, are vital for proper system operation.

The advanced I/O timing option is turned on by default for Stratix III devices.

Ensure that the overall board trace models are a reasonable approximation for each I/O standard on each PCB. For high-speed complex interfaces like DDR and DDR2 SDRAM, ensure that the board trace models are accurate for each specific signal class by using Pin Planner. Pin Planner includes a GUI schematic representation of the board trace model that you are modifying.

Board trace models include two transmission line segments (near and far). These line segments are ideal for SDRAM interfaces. You can use the near transmission line to represent the PCB and the far transmission line to represent the DIMM.

The board trace model should only include PCB or off-chip information. Do not include the Stratix III I/O pin and package capacitance, OCT, or drive strength settings, as the Quartus II software ascertains these dynamically.

ODT at the memory should be included as external discrete termination, and the capacitive loading of the memory should be calculated for each net and also added.

> Ideally the distributed capacitance and inductance of your PCB traces should be ascertained from your PCB development tool. However, in general a 50-ohm trace is approximately 3 pF and 8 nH per inch.
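
The rule-of-thumb values above can be sanity-checked with the standard lossless transmission line relations $Z_0 = \sqrt{L/C}$ and $t_{pd} = \sqrt{LC}$. The following Python sketch is an approximation for illustration; use values from your PCB development tool for real board trace models.

```python
import math


def trace_parameters(c_pf_per_inch: float, l_nh_per_inch: float):
    """Return (characteristic impedance in ohms, delay in ps per inch)."""
    c = c_pf_per_inch * 1e-12  # farads per inch
    l = l_nh_per_inch * 1e-9   # henries per inch
    z0 = math.sqrt(l / c)              # lossless line impedance
    delay_ps = math.sqrt(l * c) * 1e12  # propagation delay per inch
    return z0, delay_ps


# Rule-of-thumb values quoted above: about 3 pF and 8 nH per inch.
z0, tpd = trace_parameters(3.0, 8.0)
print(f"Z0 = {z0:.1f} ohms, delay = {tpd:.0f} ps per inch")
```

With these inputs the impedance comes out close to 50 Ω (about 51.6 Ω) and the delay is roughly 155 ps per inch.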

Trace delay information can be entered on a per-net basis if desired, but in general a net-group basis is sufficient. You can select multiple nets at the same time and enter their board trace models simultaneously.

Altera suggests the following net groups:

- `mem_clk`
- `mem_addr` (`mem_a` and `mem_ba`)
- `mem_ctrl` (`mem_cas_n`, `mem_cke`, `mem_cs_n`, `mem_odt`, `mem_ras_n`, `mem_we_n`)
- `mem_dq_group0` (`mem_dq[7..0]`, `mem_dm[0]`)
- `mem_dq_group1` (`mem_dq[15..8]`, `mem_dm[1]`)
- `mem_dqs0` and `mem_dqsn0`

> The DQS pin can be combined with the respective DQ group as a single-ended signal, otherwise each differential DQS pin pair should be entered separately.
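
When an interface has many pins, it can help to sort signal names into the suggested net groups with a script before entering the board trace models. The following Python sketch is hypothetical: the function `net_group` and its patterns are illustrative, it assumes the `mem_*` naming shown in the list above and ×8 DQ groups, and it does not call any Quartus II API.

```python
import re

# Hypothetical name patterns for the suggested net groups (x8 DQ groups assumed).
NET_GROUP_PATTERNS = [
    (re.compile(r"^mem_clk(_n)?(\[\d+\])?$"), "mem_clk"),
    (re.compile(r"^mem_(a|ba)(\[\d+\])?$"), "mem_addr"),
    (re.compile(r"^mem_(cas_n|ras_n|we_n|cke|cs_n|odt)(\[\d+\])?$"), "mem_ctrl"),
]

def net_group(signal: str) -> str:
    """Return the suggested net group name for a memory interface signal."""
    for pattern, group in NET_GROUP_PATTERNS:
        if pattern.match(signal):
            return group
    m = re.match(r"^mem_dq\[(\d+)\]$", signal)
    if m:  # eight DQ pins per group
        return f"mem_dq_group{int(m.group(1)) // 8}"
    m = re.match(r"^mem_dm\[(\d+)\]$", signal)
    if m:  # one DM pin per DQ group
        return f"mem_dq_group{int(m.group(1))}"
    m = re.match(r"^mem_dqsn?\[(\d+)\]$", signal)
    if m:  # DQS and DQSn of a group are entered together
        return f"mem_dqs{int(m.group(1))}"
    raise ValueError(f"unrecognized signal name: {signal}")

for name in ["mem_clk_n[0]", "mem_ba[1]", "mem_dq[12]", "mem_dm[0]", "mem_dqsn[0]"]:
    print(name, "->", net_group(name))
```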

DIMM board trace models and SDRAM component capacitive loading information should be obtained from your memory vendor directly and must be included into your Quartus II board trace model parameters.

More precise board trace models result in more accurate TimeQuest timing analysis.

For more information, refer to the I/O Management chapter of the Quartus II Handbook.

**Perform RTL or Functional Simulation (Optional)**

After you instantiate the SDRAM high-performance controller, it generates an example design and driver for testing the memory interface. Figure 8 shows a system-level diagram of the example design that the SDRAM high-performance controller creates.

For more information on the different files generated by the DDR2 SDRAM high-performance controller, refer to the *DDR and DDR2 SDRAM High-Performance Controller User Guide*.

During parameterization of the SDRAM high-performance controller, you can choose to generate a simulation model of the ALTMEMPHY megafunction, an example design, and a testbench, so that you can perform functional simulation on the design. The Quartus II software supports simulation of the DDR2 SDRAM interface with the ALTMEMPHY megafunction and the SDRAM high-performance controller.

### Compile Design and Verify Timing

After constraining the design, compile the design in the Quartus II software. During the generation of the ALTMEMPHY megafunction or the SDRAM high-performance controller, the MegaWizard Plug-In Manager generates a verify timing script, `<variation_name>_phy_report_timing.tcl`. After compiling the design in the Quartus II software, run the timing script to produce the timing report for different paths, such as write data, read data, address and command, and core (entire interface) timing paths in the design.

The verify timing script reports margins on the following paths:

- Address and command setup and hold margin
- Half-rate address and command setup and hold margin
- Core setup and hold margin
- Core reset and removal setup and hold margin
- Write setup and hold margin
- Read capture setup and hold margin

Refer to AN 438: Constraining and Analyzing Timing for External Memory Interfaces for detailed information about timing analysis and reporting using the ALTMEMPHY megafunction.

**Adjust Constraints**

In the timing report of the design, you can see the worst-case setup and hold margins for the different paths in the design. If the setup and hold margins are unbalanced, balance them by adjusting the phase setting of the clocks that clock these paths.

For example, the address and command outputs are clocked by an address and command clock whose phase can differ from that of the system clock, which is at $0^\circ$. The system clock clocks the clock outputs going to the memory. If the report timing script indicates that the default phase setting for the address and command clock results in more hold margin than setup margin, adjust the address and command clock to be less negative than the default phase setting with respect to the system clock, so that there is less hold margin. Similarly, if there is more setup margin, adjust the address and command clock to be more negative than the default phase setting with respect to the system clock.

For detailed information about the clocks that the ALTMEMPHY megafunction uses, refer to the ALTMEMPHY Megafunction User Guide.

**Determine Board Design Constraints and Perform Board-Level Simulations**

To determine the correct board constraints, run board-level simulations to see if the settings provide the optimal signal quality. Because many variables can affect the signal integrity of the memory interface, simulating the memory interface provides an initial indication of how well it performs. Various electronic design automation (EDA) simulation tools are available to perform board-level simulations. Perform the simulations on the data, data strobe, control, command, and address signals. If the memory interface does not have good signal integrity, adjust settings such as the drive strength, termination scheme, or termination values to improve the signal integrity. Changing these settings affects the timing, so you may need to repeat timing closure afterward.

For detailed information about understanding the different effects on signal integrity design, refer to AN 408: DDR2 Memory Interface Termination, Drive Strength and Loading Design Guidelines.

Trace information from your board-level simulation should be fed back into the Quartus II Advanced I/O timing information.

**Device-Side Termination**

The Stratix III devices support both series and parallel OCT resistors to improve signal integrity. The Stratix III OCT eliminates the need for external termination resistors on the FPGA side, which simplifies board design and reduces overall board cost. You can dynamically switch between the series and parallel OCT resistor depending on whether the Stratix III devices are performing a write or a read operation. The OCT features offer user-mode calibration to compensate for any variation in voltage and temperature during normal operation to ensure that the OCT values remain constant. The parallel and series OCT features on the Stratix III devices are available in either 25-Ω or 50-Ω settings.

**Memory-Side Termination**

On the DDR2 SDRAM, there is a dynamic parallel ODT feature that you can turn on when the FPGA is writing to the DDR2 SDRAM and turn off when the FPGA is reading from the DDR2 SDRAM. To further improve signal integrity, DDR2 SDRAM supports output drive strength control so that the driver can better match the transmission line.

For more information on available settings of the ODT, the output drive strength features, and the timing requirements for driving the ODT pin, refer to your DDR2 SDRAM datasheet.

**Adjust Termination and Drive Strength**

Altera recommends the following termination scheme for single rank DDR2 SDRAM interfaces:

- **FPGA side:**
  - DQ and DQS: calibrated 50-Ω dynamic OCT
  - DM: calibrated 50-Ω series OCT
  - Command and address: maximum drive strength
  - Memory clock: uncalibrated 50-Ω series OCT
- **DDR2 SDRAM side:**
  - DQ, DQS, and DM: 50-Ω ODT and reduced output drive strength
  - Command and address: 50-Ω external discrete parallel termination to VTT
  - Memory clock: 100-Ω differential termination

Memory clocks use uncalibrated 50-Ω series OCT, to ensure the memory device does not observe glitches during power-up and initialization.

Although the recommendations are based on the simulations and experimental results, you must perform simulations, either using I/O buffer information specification (IBIS) or HSPICE models, to determine the quality of signal integrity on your designs.

**Verify Design Functionality**

Perform system level verification to correlate the system against your design targets, using the Altera SignalTap® II logic analyzer.

Refer to the Design Debugging Using the SignalTap II Embedded Logic Analyzer chapter in Volume III of the Quartus II Software Handbook for detailed information about using the SignalTap II logic analyzer.

**Example Project Walkthrough**

This walkthrough shows how to use the design flow (see “DDR and DDR2 SDRAM in Stratix III Devices Design Flow” on page 22) to design a 72-bit wide, 400-MHz, 800-Mbps DDR2 SDRAM interface. This example design also provides some recommended settings, including termination scheme and drive strength setting, to simplify the design. Although the example design is specifically for the DDR2 SDRAM interface, the design flow for a DDR SDRAM interface is the same.
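
The relationship between the 400-MHz clock and the 800-Mbps per-pin data rate follows from double data rate operation, where data is transferred on both clock edges. The short Python sketch below works through that arithmetic for the 72-bit interface. The function name `ddr_rates` is illustrative, and the aggregate figure is a raw peak that ignores refresh, bus turnaround, and any ECC allocation.

```python
def ddr_rates(clock_mhz: float, width_bits: int) -> tuple[float, float]:
    """Return (per-pin data rate in Mbps, raw aggregate rate in Gbps).

    DDR transfers data on both clock edges, so the per-pin rate is 2 x clock.
    """
    per_pin_mbps = 2 * clock_mhz
    aggregate_gbps = per_pin_mbps * width_bits / 1000
    return per_pin_mbps, aggregate_gbps

per_pin, aggregate = ddr_rates(400, 72)
print(f"{per_pin:.0f} Mbps per pin, {aggregate:.1f} Gbps aggregate")
# prints: 800 Mbps per pin, 57.6 Gbps aggregate
```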

The example design targets the Stratix III FPGA Development Kit, which includes a DIMM module (MT9HTF12872AY-800E). This flow applies to any other development kit or PCB.

**Software Requirements**

This walkthrough assumes that you have experience with the Quartus II software. In addition, ensure you have the following software installed:

- Quartus II software v8.0 SP1
- DDR and DDR2 SDRAM High-Performance Controller v8.0 SP1

**Select Device**

This example design uses the EP3SL150F1152-C2 device, which supports 72-bit wide DDR2 SDRAM at 400 MHz. The design uses a 72-bit wide 1-GB Micron MT9HTF12872AY-800E 400-MHz DDR2 SDRAM DIMM.

**Create a Quartus II Project**

Create a project in the Quartus II software that targets the EP3SL150F1152-C2 device (see Figure 9).

For detailed step-by-step instructions on how to create a Quartus II project, refer to the Tutorial in the Quartus II software, which is available by clicking the Help menu in the Quartus II window and selecting Tutorial.

**Instantiate a PHY and a Controller**

After creating a Quartus II project, instantiate the DDR2 SDRAM controller. This example design uses the DDR2 SDRAM high-performance controller, which instantiates the ALTMEMPHY megafunction automatically.

Before you open the MegaWizard Plug-In, you must add the Derated (Micron MT9HTF12872AY-800 400Mhz).xml file to your <installation directory>\80\ip\ddr2_high_perf\lib directory. The .xml file is included in the application note .zip file.

Select the DDR2 SDRAM High Performance Controller in the Interfaces section of the MegaWizard Plug-In Manager (see Figure 10). For this example, enter ddr2_dimm for the name of the DDR2 SDRAM high-performance controller.

Figure 10. Select the DDR2 SDRAM High-Performance Controller

Parameterize the DDR2 SDRAM high-performance controller for a 400-MHz, 72-bit wide DDR2 SDRAM interface.

1. In the memory setting tab, set Speed grade to 2.

2. For PLL reference clock frequency, enter 50 (to match the on-board oscillator).

3. For the memory preset, select Derated (Micron MT9HTF12872AY-800 400MHz), which gives a 72-bit wide 1-GB 400-MHz DDR2 DIMM (see Figure 11).

Figure 11. Parameterize the DDR2 SDRAM High-Performance Controller

4. To create a memory preset, click **Modify parameters**. In the **Preset Editor** dialog box, you can modify the memory presets (see Figure 12).

**Figure 12. Modify the Memory Presets to Create a Custom Memory**

The $t_{IS}$, $t_{IH}$, $t_{DS}$, and $t_{DH}$ parameters typically require slew rate derating.

For more information on slew rate derating and how to perform slew rate derating calculations, refer to the memory vendor datasheet.

Simulation and measurement show the following slew rates for the clock, address and command, DQ, and DQS pins on the Stratix III development board when using the default I/O standard and drive options:

- Address and command = 0.5 V/ns
- CLK and CLK# = 1.5 V/ns (differential)
- DQ = 1.5 V/ns
- DQS and DQSn = 2.8 V/ns (differential)

Hence, the correct $t_{IS}$, $t_{IH}$, $t_{DS}$, and $t_{DH}$ values for this design are:

- $t_{IS}$ = 345 ps
- $t_{IH}$ = 280 ps
- $t_{DS}$ = 317 ps
- $t_{DH}$ = 292 ps

You should always simulate or measure your own design and topology to ensure accurate timing information and analysis.

The DDR2 SDRAM has a write requirement ($t_{DQSS}$) that states the positive edge of the DQS signal on writes must be within ±25% of a clock cycle (±90°) of the positive edge of the DDR2 SDRAM clock input. To achieve this skew requirement, ALTMEMPHY-based designs always use DDR IOE registers to generate the CK and CK# signals.
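
To put the $t_{DQSS}$ requirement in absolute terms, the ±25% window can be converted to picoseconds for a given memory clock frequency. The following Python sketch does this for the 400-MHz clock used in this example. The function name `tdqss_window` is illustrative; confirm the actual limits against your memory device datasheet.

```python
def tdqss_window(clock_mhz: float, fraction: float = 0.25) -> tuple[float, float, float]:
    """Return (clock period in ps, +/- window in ps, +/- window in degrees)."""
    tck_ps = 1e6 / clock_mhz          # clock period in picoseconds
    return tck_ps, fraction * tck_ps, fraction * 360.0

tck, window_ps, window_deg = tdqss_window(400)
print(f"tCK = {tck:.0f} ps, window = +/-{window_ps:.0f} ps (+/-{window_deg:.0f} degrees)")
# prints: tCK = 2500 ps, window = +/-625 ps (+/-90 degrees)
```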

5. In the **PHY Settings** panel, enter the board skew for your board in the **Board Timing Parameters** section. This timing parameter is the board trace variation between DQ and DQS. The default value is 20 ps. If your board performs better or worse than this number, update it accordingly. The wizard uses this number to calculate the overall system timing margin. For this example design, enter 20 ps, because the board skew tolerance target is 20 ps.

Figure 13 shows generation and board timing parameters.

6. Also, in the **Advanced PHY Settings** section (see Figure 13), turn on **Use differential DQS**. Differential DQS mode enhances the signal-to-noise ratio and is recommended for DDR2 SDRAM interfaces operating above 266 MHz, or wherever noise margin is a concern. The Stratix III development board is routed for differential DQS mode.

7. Turn on **Enable Dynamic Parallel On-Chip Termination (OCT)** for this example. The Stratix III development board was designed to use OCT and does not include discrete external termination on the DQ, DQS, or DM pins.

8. Enter 240 in **Dedicated clock phase** for the **Address/Command Clock Settings**. Timing analysis shows that 240° is optimal for the Stratix III development board.

The settings in **Auto-Calibration Simulation Options** are for RTL simulation only and are not applicable for gate-level simulation.

9. Click **Finish** to generate your MegaCore function variation. The MegaWizard Plug-In Manager generates all the files necessary for your DDR2 SDRAM controller, and generates an example top-level design, which you may use to test or verify board operation. **Figure 14** shows generation messages including tips on Quartus II settings.

For detailed step-by-step instructions for parameterizing the DDR2 SDRAM high-performance controller, refer to the **DDR and DDR2 SDRAM High-Performance Controller User Guide**.

---

**Figure 14. Generation**

![Generation - DDR2 SDRAM High Performance Controller](image)

---

**Add Constraints**

After instantiating the DDR2 SDRAM high-performance controller, the ALTMEMPHY megafunction generates the constraints files for the example design. Apply these constraints to the design before compilation.

**Add Timing Constraints**

When you instantiate an SDRAM high-performance controller, it generates a timing constraints file, `<variation_name>_phy_ddr_timing.sdc`. The timing constraint file constrains the clock and input and output delay on the SDRAM high-performance controller.

To add timing constraints, follow these steps:

1. On the Assignments menu, click **Settings**.
2. In the **Category** list, expand **Timing Analysis Settings**, and select **TimeQuest Timing Analyzer**.
3. Select the `<variation_name>_phy_ddr_timing.sdc` file and click **Add**.
4. Click **OK**.

**Add Pin and DQ Group Assignments**

The pin assignment script, `<variation_name>_pin_assignments.tcl`, sets up the I/O standards for the DDR2 SDRAM interface. It also launches the DQ group assignment script, `<variation_name>_phy_assign_dq_groups.tcl`, which relates the DQ and DQS pin groups together for the fitter to place them correctly in the Quartus II software.

This script does not create a clock for the design. You need to create a clock for the design and provide pin assignments for the signals of both the example driver and testbench that the MegaCore variation generates.

Run the `<variation_name>_pin_assignments.tcl` script to add the pin, I/O standard, and DQ group assignments to the example design.

**Set Top-Level Entity**

Before compiling the design, set the top-level entity of the project to the correct entity. The ALTMEMPHY megafunction entity is `<variation_name>_phy.v` or `.vhd`; the SDRAM high-performance controller entity is `<variation_name>.v` or `.vhd`.

The example top-level design, which instantiates the SDRAM high-performance controller and an example driver, is `<variation_name>_example_top.v` or `.vhd`.

To set the top-level file, follow these steps:

1. Open the top-level entity file, `<variation_name>_example_top.v` or `.vhd`.

2. On the Project menu, click **Set as Top-Level Entity**.

**Set Optimization Technique**

To ensure the remaining unconstrained paths are routed with the highest speed and efficiency, set the optimization technique to **Speed**. To set the optimization technique, follow these steps:

1. On the Assignments menu, click **Settings**.

2. Select **Analysis & Synthesis Settings**.

3. Select **Speed** under **Optimization Technique**. Click **OK**.

**Set Fitter Effort**

To set the fitter effort to Standard Fit, follow these steps:

1. On the Assignments menu, click **Settings**.

2. Expand **Fitter Settings**.

3. Turn on **Optimize Hold Timing** and select **All Paths**.

4. Turn on **Optimize Fast Corner Timing**.

5. Select **Standard Fit** under **Fitter Effort**.

6. Click **OK**.

**Enter Pin Location Assignments**

To enter the pin location assignments, follow these steps:

1. Run Analysis and Synthesis. On the Processing menu, point to **Start** and click **Start Analysis and Synthesis**.

2. Assign all of your pins, so that the Quartus II software fits your design correctly and gives correct timing analysis. To assign pin locations for the Stratix III development board, run the Altera-provided `S3_Host_DDR2_PinLocations.tcl` file or manually assign pin locations by using the Pin Planner.

   The SDRAM high-performance controller autogenerated scripts do not make any pin location assignments.

If you are at the design exploration phase of your design cycle and do not have any PCB defined pin locations, you should still manually define an initial set of pin constraints, which can become more specific during your development process.

To manually assign pin locations, follow these steps:

1. Open Pin Planner. On the Assignments menu, click Pin Planner.

2. Assign DQ and DQS pins.
   a. To select the device DQS pin groups that the design uses, assign each DQS pin in your design to the required DQS pin in the Pin Planner. The Quartus II Fitter then automatically places the respective DQ signals onto suitable DQ pins within each group. To see DQS groups in Pin Planner, right-click, select Show DQ/DQS Pins, and click In x8/x9 Mode. Pin Planner shows each DQS group in a different color and with a different legend: S = DQS pin, Sbar = DQSn pin, and Q = DQ pin (see Figure 15).

   Most DDR2 SDRAM devices operate in ×8/×9 mode. However, because some DDR2 SDRAM devices operate in ×4 mode, refer to your specific memory device datasheet.

   b. Select the DQ mode to match the DQ group width (number of DQ pins/number of DQS pins) of your memory device. DQ mode is not related to the memory interface width.

   DQ group order and DQ pin order within each group are not important. However, you must place DQ pins in the same group as their respective strobe pin.

3. Place DM pins within their respective DQ group.

4. Place address, control, and command pins on any spare I/O pins, ideally within the same bank or side of the device as the `mem_clk` pins.

5. Ensure you place `mem_clk` pins on differential I/O pairs for the `CK/CK#` pin pair. To identify differential I/O pairs, right-click in Pin Planner and select **Show Differential Pin Pair Connections**. Pin pairs show a red line between each pin pair.


   You must place `mem_clk[0]` and `mem_clk_n[0]` on a `DIFFIO_RX` pin pair, if your design uses differential DQS signaling.

6. Ensure mem_clk pins use any regular adjacent I/O pins, ideally differential I/O pairs for the CK/CK# pin pair. To identify differential I/O pairs, right-click in Pin Planner and select Show Differential Pin Pair Connections. Pin pairs show a red line between each pin pair.

7. Place the clock_source pin on a dedicated PLL clock input pin with a direct connection to the SDRAM controller PLL and DLL pair—usually on the same side of the device as your memory interface. This recommendation reduces PLL jitter, saves a global clock resource, and eases timing and fitter effort.

8. Place the global_reset_n pin (like any high fan-out signal) on a dedicated clock pin.

For more information on how to use the Quartus II Pin Planner, refer to the I/O Management chapter in volume 2 of the Quartus II Handbook.

**Virtual Pins**

The example top-level design, which is autogenerated by the high-performance controller, includes an example driver to stimulate the interface. This example driver is not part of the SDRAM high-performance controller IP, but allows easy testing of the IP.

The example driver outputs several test signals to indicate its operation and the status of the stimulated memory interface. These signals are pnf, pnf_per_byte, and test_complete. They are not part of the memory interface, but are for testing. Either route these signals to a debug header or assign them as virtual pins using the Quartus II Assignment Editor. When using the example driver for testing, do not remove these signals from the top-level signal list. Otherwise, the Quartus II software optimizes the driver away, and the example driver fails.

To assign virtual pin assignments for the Stratix III development board, run the Altera-provided s3_Host_ddr2_exdriver_vpin.tcl file or manually assign virtual pin assignments using the Assignment Editor.

**Advanced I/O Timing**

ALTMEMPHY-based designs assume that the memory address and command signals are length-matched to the memory clock signals. Typically, this assumption does not hold for DIMM-based designs, so verify the difference in your design. For the Stratix III development board fitted with the MT9HTF12872AY DIMM, the address and command signals are 750 ps longer than the clock signals.

To amend the TimeQuest .sdc file, `<variation_name>_phy_ddr_timing.sdc`, to include this difference, follow these steps:

1. Open the `ddr2_dimm_phy_ddr_timing.sdc` file in a text editor and find the following line (usually line 31):
   
   ```
   set t(additional_addresscmd_tpd) 0.000
   ```

2. Change the line to the following text:
   
   ```
   set t(additional_addresscmd_tpd) 0.750
   ```

3. Save the file.

   > If the DDR2 SDRAM controller .sdc file is regenerated, this change is lost and you must re-edit the file.
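
Because regenerating the controller discards this edit, you may prefer to reapply it with a small script rather than by hand. The following Python sketch is one hypothetical way to do so. The helper names `set_addresscmd_tpd` and `patch_sdc_file` are illustrative and are not part of the Quartus II software, and the sketch assumes that the generated file contains exactly one `set t(additional_addresscmd_tpd)` line in the form shown in the steps above.

```python
import re
from pathlib import Path

# Matches the generated assignment line and captures everything except the value.
_TPD_LINE = re.compile(
    r"^([ \t]*set[ \t]+t\(additional_addresscmd_tpd\)[ \t]+)\S+(.*)$",
    re.MULTILINE,
)

def set_addresscmd_tpd(sdc_text: str, delay_ns: float) -> str:
    """Return sdc_text with the additional address/command delay set to delay_ns."""
    new_text, count = _TPD_LINE.subn(
        lambda m: f"{m.group(1)}{delay_ns:.3f}{m.group(2)}", sdc_text
    )
    if count != 1:
        raise ValueError(f"expected one additional_addresscmd_tpd line, found {count}")
    return new_text

def patch_sdc_file(path: str, delay_ns: float) -> None:
    """Rewrite the .sdc file at path in place with the new delay value."""
    sdc = Path(path)
    sdc.write_text(set_addresscmd_tpd(sdc.read_text(), delay_ns))

# Demonstration on an in-memory copy of the generated line.
sample = "set t(additional_addresscmd_tpd) 0.000\n"
print(set_addresscmd_tpd(sample, 0.750), end="")
# prints: set t(additional_addresscmd_tpd) 0.750
```

For this example design, calling `patch_sdc_file("ddr2_dimm_phy_ddr_timing.sdc", 0.750)` from the project directory after each regeneration would reapply the 750-ps value.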

### Board Trace Delay Models

For accurate I/O timing analysis, the Quartus II software must be aware of the board trace and loading information. Derive and refine this information during your PCB development process, first through prelayout (line) simulation and finally through post-layout (board) simulation. For external memory interfaces that use memory modules (DIMMs), this information should include the trace and loading information of the module in addition to that of the main and host platform. You can obtain the module information from your memory vendor.

To enter board trace information, follow these steps:

1. In Pin Planner, select the pin or group of pins that you want to enter the information for.

2. Right-click and select **Board Trace Model**.

   Figures 16 through 19 show a typical board trace model for an address, memory clock, DQ, and DQS pin on the Stratix III development board including the data for the MT9HTF12872AY-800E memory module that is included with the kit.

Figure 16. Stratix III Development Board Address Signal Board Trace Model

Figure 17. Stratix III Development Board Memory Clock Signal Board Trace Model

Figure 18. Stratix III Development Board DQ Signal Board Trace Model

Table 8 shows the board trace model parameters for the Stratix III development board.

### Table 8. Stratix III Development Board Trace Model Summary

<table>
<thead>
<tr>
<th>Net</th>
<th>Near (FPGA End of Line)</th>
<th>Far (Memory End of Line)</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td>Length</td>
<td>L_per_length</td>
</tr>
<tr>
<td>Addr (1)</td>
<td>2.717</td>
<td>3.3</td>
</tr>
<tr>
<td>CLK</td>
<td>3.069</td>
<td>2.8</td>
</tr>
<tr>
<td>CKE/CS#</td>
<td>2.717</td>
<td>3.3</td>
</tr>
<tr>
<td>ODT</td>
<td>2.717</td>
<td>3.3</td>
</tr>
<tr>
<td>DQS0</td>
<td>3.017</td>
<td>3.4</td>
</tr>
<tr>
<td>DQS1</td>
<td>3.005</td>
<td>3.4</td>
</tr>
<tr>
<td>DQS2</td>
<td>2.851</td>
<td>3.4</td>
</tr>
<tr>
<td>DQS3</td>
<td>2.653</td>
<td>3.4</td>
</tr>
</tbody>
</table>

Altera recommends you use the Board Trace Model assignment on all DDR and DDR2 SDRAM interface signals. To apply board trace model assignments for the Stratix III development board, run the Altera-provided `S3_Host_DDR2_BTModels.tcl` file or manually enter the board trace models using the Quartus II Pin Planner.

The Stratix III development board has the following compensation capacitors fitted to its DDR2 SDRAM address and command, and CLK and CLK# signals:

- **Address and command** = 33 pF compensation capacitors
- **CLK and CLK#** = 7 pF (differential) compensation capacitors

These capacitors are typically fitted to designs that use nonsymmetrical DIMM designs. You should simulate your design to see if compensation capacitors are required. Stratix III devices have various programmable drive strength and OCT I/O options, so compensation capacitors should not usually be required. Fitting compensation capacitors reduces the edge rate of your signals, so you should observe memory vendor derating guidelines.

For more information on compensation capacitors, refer to *Micron Technical Note TN_47_01*.

### Perform RTL or Functional Simulation (Optional)

This section describes RTL and functional simulation.

**Set Up Simulation Options**

To set up simulation options, follow these steps:

1. Obtain and copy the vendor's memory model to a suitable location. For example, obtain the `ddr2.v` and `ddr2_parameters.vh` memory model files from the Micron website and save them in the testbench directory.

   Some vendor DIMM models do not use DM pin operation, which can cause calibration failures. In these cases, use the vendor's component models directly.

2. Open the memory model file in a text editor and add the following define statements to the top of the file:

   ```
   `define sg25  // memory device speed grade -25
   `define x8    // memory device width per DQS
   ```

   The two define statements prepare the DDR2 SDRAM interface model.

   The first statement specifies the memory device speed grade as –25. The second statement specifies the memory device width per DQS.

3. Open the testbench in a text editor, instantiate the downloaded memory model, and connect its signals to the rest of the design.

4. Delete the `START` and `END MEGAWIZARD` comments to ensure the MegaWizard Plug-In Manager does not overwrite the changes when the controller is regenerated.

**Run Simulation with NativeLink**

To run the simulation with NativeLink, follow these steps:

1. Set the absolute path to your third-party simulator executable.
   a. On the Assignments menu, point to EDA Tool Settings and click Simulation.
   b. In the Category list expand EDA Tool Settings and click Simulation.
   c. Under Tool Name, select ModelSim-Altera.
   d. Under NativeLink settings, select Compile test bench.
   e. Enter the name of your testbench top-level module and simulation period.

2. Elaborate your design. On the Processing menu point to Start and click Start Analysis and Elaboration.

3. On the Tools menu point to the EDA Simulation Tool and click Run EDA RTL Simulation. This step creates the \simulation directory in your project directory and a script that compiles all necessary files and runs the simulation.

   For example waveforms, refer to the DDR and DDR2 SDRAM High-Performance Controller User Guide.

**Compile Design and Verify Timing**

To compile the design, on the Processing menu, click Start Compilation.

After successfully compiling the design, run the MegaWizard-generated verify timing script, ddr2_dimm_phy_report_timing.tcl, which produces a timing report for the design.

Figure 20 shows the timing margin report in the message window in the Quartus II software.

The report timing script performs the following tasks:

- Creates a timing netlist
- Reads the .sdc file
- Updates the timing netlist

To run the report timing script in the TimeQuest Timing Analyzer window, follow these steps:

1. Open the TimeQuest Timing Analyzer window in the Quartus II software.
2. Double-click Update Timing Netlist in the left pane, which automatically runs Create Timing Netlist and Read SDC. After a task is executed, it turns green.
3. After completing the tasks, run the report timing script: on the Script menu, click Run Tcl Script.

Figure 21 shows the timing margin report in the **TimeQuest Timing Analyzer** window after running the report timing script. The results are the same as those reported in the Quartus II software (see Figure 20).

**Figure 21. Timing Margin Report in TimeQuest Timing Analyzer**

For more information about the **TimeQuest Timing Analyzer** window, refer to the **Quartus II TimeQuest Timing Analyzer** chapter in volume III of the **Quartus II Handbook**.

Refer to AN 438: **Constraining and Analyzing Timing for External Memory Interfaces** for detailed information about timing analysis.

**Adjust Constraints**

For example, if the timing margin report shows negative hold margin on the address and command datapath, adjusting the clock that regulates the address and command output registers can improve the hold margin. To find out which clock is clocking the address and command registers, click the address and command report in the Report panel of the TimeQuest Timing Analyzer and select the path that violates the hold time (see Figure 22).

**Figure 22. Report on the Path That Violates Hold Time**

![Figure 22](image)

The report indicates that *clk6* of the PLL is clocking the address and command registers. Go to the PLL megafunction and change the phase setting of *clk6*. For this design, the initial phase setting of *clk6* is 315°, which launches the address and command signals too early and causes a hold time violation. To remedy this violation, delay the launch of the address and command signals by increasing the phase setting of *clk6*. The reported hold margin is –45 ps, so delay *clk6* by more than 45 ps. Use the frequency of *clk6* to translate this time delay to degrees in the PLL setting. For this example, *clk6* runs at 200 MHz (a 5-ns period), so 45 ps translates to approximately 3°. To ensure positive hold margin, delay *clk6* by more than 3°, which means the new phase setting for *clk6* must be larger than 318°. For this example, set the new phase setting for *clk6* to 330° so that there is sufficient hold time. Alternatively, you can select a phase that balances setup and hold times.
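
The conversion between a time delay and a PLL phase setting used in this paragraph can be sketched as follows. This is a minimal Python illustration of the arithmetic only. The function names are hypothetical, and the sketch assumes that a phase shift moves the clock edge by the same fraction of the clock period, with no other timing changing.

```python
def delay_to_phase_deg(delay_ps: float, clock_mhz: float) -> float:
    """Convert a time delay to degrees of phase at the given clock frequency."""
    period_ps = 1e6 / clock_mhz
    return 360.0 * delay_ps / period_ps

def phase_to_delay_ps(phase_deg: float, clock_mhz: float) -> float:
    """Convert degrees of phase to a time delay at the given clock frequency."""
    period_ps = 1e6 / clock_mhz
    return phase_deg / 360.0 * period_ps

# Values from the text: -45 ps hold margin, 200-MHz clock, 315 degree initial phase.
shift = delay_to_phase_deg(45, 200)
print(f"45 ps at 200 MHz = {shift:.2f} degrees")               # 3.24 degrees
print(f"minimum new phase = {315 + shift:.2f} degrees")         # 318.24 degrees
print(f"330 - 315 degrees = {phase_to_delay_ps(330 - 315, 200):.0f} ps of delay")  # 208 ps
```

Moving the phase from 315° to 330° therefore delays the clock by roughly 208 ps, which comfortably exceeds the 45 ps needed. Always confirm the resulting margins by rerunning the report timing script.
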

After modifying the *clk6* phase setting, recompile the design for the new PLL setting to take effect. Run the report timing script again. Figure 23 shows the timing margin reported in the Quartus II software after adjusting the phase setting of *clk6*.

![Figure 23. Timing Margin Reported After Adjusting clk6](image)

The timing report shows that all the timing margins are met.

**Determine Board Design Constraints and Perform Board-Level Simulations**

Stratix III devices support both series and parallel OCT resistors to improve signal integrity. Another benefit of the Stratix III OCT resistors is eliminating the need for external termination resistors on the FPGA side. This feature simplifies board design and reduces overall board cost. You can dynamically switch between the series and parallel OCT resistor depending on whether the Stratix III devices are performing a write or a read operation. The OCT features offer user-mode calibration to compensate for any variation in voltage and temperature during normal operation to ensure that the OCT values remain constant. The parallel and series OCT features of the Stratix III devices are available in either a 25-Ω or 50-Ω setting.

Refer to the *Stratix III Device I/O Features* chapter of the *Stratix III Device Handbook* for more information about the OCT features.

Refer to the respective memory data sheet for additional information about the available settings of the ODT and the output driver impedance features, and the timing requirements for driving the ODT pin in DDR2 SDRAM.

*Figure 24* illustrates the write operation to the DDR2 SDRAM with the ODT feature turned on and using the 50-Ω series OCT feature of the Stratix III FPGA device. In this setup, the transmitter (FPGA) is properly terminated with matching impedance to the transmission line, thus eliminating any ringing or reflection. The receiver (DDR2 SDRAM) is also properly terminated when the dynamic ODT setting is at 75 Ω.

*Figure 24. Write Operation Using Parallel ODT and 50-Ω Series OCT of the Stratix III FPGA Device*

*Figure 25* illustrates the read operation from the DDR2 SDRAM using the parallel OCT feature of the Stratix III device. In this setup, the output impedance of the driver (DDR2 SDRAM) is no larger than 21 Ω, in keeping with the SSTL-18 JEDEC specification JESD79-2. Combined with a series resistor on the dual in-line memory module (DIMM), the driver impedance matches that of the transmission line, resulting in optimal signal transmission to the receiver (FPGA). The receiver (FPGA) is properly terminated with 50 Ω, which matches the impedance of the transmission line and thus eliminates any ringing or reflection.

Finally, the loading seen by the FPGA during writes to the memory differs between a system that uses DIMMs and a system that uses discrete components. The additional loading from the DIMM connector can reduce the edge rates of the signals arriving at the memory, which reduces the available timing margin.
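
The termination discussion above can be checked to first order with the standard reflection coefficient of a transmission line, Γ = (Z_term − Z_line) / (Z_term + Z_line). The following Python sketch assumes a 50-Ω line, as the text implies, and ideal resistive terminations. The helper names are illustrative, and the series resistance it computes is only the value that would bring a 21-Ω driver up to the line impedance, not a value taken from any DIMM data sheet.

```python
def reflection_coefficient(z_term_ohm, z_line_ohm=50.0):
    """First-order reflection coefficient at a resistive termination."""
    return (z_term_ohm - z_line_ohm) / (z_term_ohm + z_line_ohm)


def series_resistance_to_match(z_driver_ohm, z_line_ohm=50.0):
    """Series resistance that raises the driver impedance to the line impedance."""
    return max(z_line_ohm - z_driver_ohm, 0.0)


# Receiver-side terminations mentioned in the text (assumed 50-ohm line).
for label, z_term in [("50-ohm parallel OCT (read)", 50.0),
                      ("75-ohm ODT (write)", 75.0)]:
    gamma = reflection_coefficient(z_term)
    print(f"{label}: reflection coefficient = {gamma:.2f}")

# Source-side matching for a driver at the 21-ohm upper bound.
print(f"Series resistance for a 21-ohm driver: {series_resistance_to_match(21.0):.0f} ohm")
```

This estimate ignores package parasitics, connector loading, and trace discontinuities, so it complements rather than replaces board-level simulation.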


**Adjust Drive Strength**

Due to the loading of the line, the Quartus II software may report that the default or chosen drive strength cannot drive the line to the specified toggle rate or minimum pulse width (see Figure 26). If you encounter this error, use an I/O standard with a stronger drive strength. Re-simulate your design with the new drive strength to confirm that signal quality is still acceptable.

The Quartus II software v8.0 has a bug that results in an incorrect calculation for the toggle rate for differential I/O standards.
The `mem_clk` and `mem_clk_n[2:0]` require a drive strength setting change from Output Termination – Series 50 ohm without calibration to Current Strength – 12 mA. Using the Stratix III development board trace model, differential SSTL18-CI 12-mA drive results in a maximum frequency of 337 MHz.

The `mem_dqs` and `dqs_n[8..0]` require a drive strength setting change from Output Termination – Series 50 ohm without calibration to Current Strength – 12 mA.

To change the `mem_clk` and `mem_dqs` drive strength, either use the Quartus II Assignment Editor or run the Altera-provided `S3_Host_DDR2_IODriveStrength.tcl` script.
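
As a rough illustration of what such a change looks like in a Quartus II settings file, the following Python sketch prints one current-strength assignment line per pin. It is not the content of the Altera-provided script. The assignment name `CURRENT_STRENGTH_NEW` and the value `12MA` are assumptions to confirm against your own project's .qsf file, and the 3-bit width for `mem_clk` is assumed to mirror `mem_clk_n[2:0]`.

```python
def expand_bus(name, width):
    """Expand a bus name into per-bit pin names, for example mem_clk[0]."""
    return [f"{name}[{i}]" for i in range(width)]


def drive_strength_lines(pins, strength="12MA"):
    """Build QSF-style current-strength assignment lines (assumed syntax)."""
    return [
        f"set_instance_assignment -name CURRENT_STRENGTH_NEW {strength} -to {pin}"
        for pin in pins
    ]


# Assumption: mem_clk is 3 bits wide, matching mem_clk_n[2:0] in the text.
clock_pins = expand_bus("mem_clk", 3) + expand_bus("mem_clk_n", 3)

for line in drive_strength_lines(clock_pins):
    print(line)
```

Any existing output termination assignment on the same pins would also need to be changed, which this sketch does not attempt.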

**Verifying Design on a Board**

The SignalTap II logic analyzer shows read and write activity in the system.

For more information on using the SignalTap II logic analyzer, refer to the following documents:

- Design Debugging Using the SignalTap II Embedded Logic Analyzer chapter in the Quartus II Handbook
- AN 323: Using SignalTap II Embedded Logic Analyzers in SOPC Builder Systems
- AN 446: Debugging Nios II Systems with the SignalTap II Logic Analyzer

To add the SignalTap II logic analyzer, follow these steps:

1. On the Tools menu click SignalTap II Logic Analyzer.
2. In the Signal Configuration window next to the Clock box, click … (Browse Node Finder).
3. Type `*phy_clk` in the Named box, for Filter select SignalTap II: pre-synthesis, and click List.
4. Select ddr2_dimm_example_top|ddr2_dimm:ddr2_dimm_inst|ddr2_dimm_controller_phy:ddr2_dimm_controller_phy_inst|phy_clk|phy_clk in Nodes Found and click > to add the signal to Selected Nodes.
5. Click OK.
6. Under Signal Configuration, specify the following settings:
   - For Sample depth, select 512
   - For RAM type, select Auto
   - For Trigger flow control, select Sequential
   - For Trigger position, select Center trigger position
   - For Trigger conditions, select 1
7. On the Edit menu, click Add Nodes.
8. Search for specific nodes by typing `*local*` in the Named box, for Filter select SignalTap II: pre-synthesis, and click List.
9. Select the following nodes in **Nodes Found** and click > to add to **Selected Nodes**:

- local_address
- local_rdata
- local_rdata_valid
- local_read_req
- local_ready
- local_wdata
- local_wdata_req
- local_write_req
- pnf
- pnf_per_byte
- test_complete (trigger)
- resynchronization_successful (contained within the PHY instance)
- postamble_successful (contained within the PHY instance)
- rsu_codvw_phase (contained within the sequencer instance)
- rsu_read_latency (contained within the sequencer instance)
- rsu_no_dvw_err (contained within the sequencer instance)
- rsu_grt_one_dvw_err (contained within the sequencer instance)
- rsu_multiple_valid_latencies_err (contained within the sequencer instance)

Do not add any DDR SDRAM interface signals to the SignalTap II logic analyzer. Doing so increases the load on these signals and adversely affects the timing analysis.

10. Click **OK**.

11. To reduce the SignalTap II logic size, turn off **Trigger Enable** on the following bus signals:

- local_address
- local_rdata
- local_wdata
- pnf_per_byte
- rsu_codvw_phase
- rsu_read_latency

12. Right-click **Trigger Conditions** for the test_complete signal and select **Rising Edge**.

**Figure 27** shows the completed SignalTap II logic analyzer.

13. On the File menu, click Save to save the SignalTap II .stp file to your project.

If you see the message Do you want to enable SignalTap II file “stp1.stp” for the current project, click Yes.
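
The node list in step 9 mixes names that the `*local*` search returns with names that need a separate search. The following Python sketch uses `fnmatch` as a stand-in for the Node Finder wildcard matching, on the assumption that the Named box behaves like shell-style globbing. It operates on the leaf names listed above rather than on full hierarchical paths.

```python
from fnmatch import fnmatchcase

# Leaf node names taken from step 9.
nodes = [
    "local_address", "local_rdata", "local_rdata_valid", "local_read_req",
    "local_ready", "local_wdata", "local_wdata_req", "local_write_req",
    "pnf", "pnf_per_byte", "test_complete",
    "resynchronization_successful", "postamble_successful",
    "rsu_codvw_phase", "rsu_read_latency", "rsu_no_dvw_err",
    "rsu_grt_one_dvw_err", "rsu_multiple_valid_latencies_err",
]


def find_nodes(pattern, names):
    """Return the names that match a glob-style pattern (case-sensitive)."""
    return [name for name in names if fnmatchcase(name, pattern)]


local_nodes = find_nodes("*local*", nodes)
remaining = [name for name in nodes if name not in local_nodes]

print(f"Matched by *local*: {len(local_nodes)} nodes")
print("Need a separate search:")
for name in remaining:
    print(f"  {name}")
```

Under this assumption, the `*local*` filter returns the eight `local_` signals. The status and sequencer signals need further searches, for example with patterns such as `*pnf*` or `*rsu_*`.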

**Compile the Project**

After you add signals to the SignalTap II logic analyzer, recompile your design. On the Processing menu, click Start Compilation.

**Verify Timing**

After the design compiles, ensure that TimeQuest timing analysis passes successfully. In addition to this FPGA timing analysis, check your PCB or system SDRAM timing. To run timing analysis, run the `<variation name>_phy_report_timing.tcl` script by following these steps:

1. On the Tools menu, click Tcl Scripts.
2. Select `<variation name>_phy_report_timing.tcl` and click Run.

**Connect the Development Board**

Connect the development board to your computer.

**Download the Object File**

On the Tools menu, click SignalTap II Logic Analyzer. The SignalTap II dialog box appears.

The SOF Manager should contain the `<your project name>.sof` file. To add the correct file to the SOF Manager, follow these steps:

1. Click ... to open the Select Program Files dialog box (see Figure 28).
2. Select `<your project name>.sof`.
3. Click Open.
4. To download the file, click the Program Device button (see Figure 28).

**Test the Example Design in Hardware**

After the example design, including the SignalTap II logic analyzer, successfully downloads to your development board, click Run Analysis to run once, or click Autorun Analysis to run continuously. Figure 29 shows the design analysis.

**Conclusion**

DDR and DDR2 SDRAM devices are widely used in FPGA designs, and DDR technology is the most popular DRAM architecture. Stratix III devices have dedicated circuitry to interface with DDR and DDR2 SDRAM at speeds up to 200 MHz (400 Mbps) and 400 MHz (800 Mbps), respectively, with comfortable and consistent margins. System designers can enhance their Stratix III system performance by using commercial off-the-shelf SDRAM without increasing cost. Altera offers a complete, proven memory solution in Stratix III devices for DDR and DDR2 interfaces, which allows you to use these devices in applications that require low power, high bandwidth, and improved performance.

**References**

- External Memory Interfaces chapter of the Stratix III Device Handbook
- External Memory Interfaces chapter of the Stratix IV Device Handbook
- External DDR Memory PHY Interface (ALTMEMPHY) Megafunction User Guide
- DDR and DDR2 SDRAM High-Performance Controller User Guide
- DDR and DDR2 SDRAM Controller Compiler User Guide

Table 9 shows the revision history for this document.

### Table 9. Document Revision History

<table>
<thead>
<tr>
<th>Date and Document Version</th>
<th>Changes Made</th>
<th>Summary of Changes</th>
</tr>
</thead>
<tbody>
<tr>
<td>August 2008 v2.0</td>
<td>Significant rewrite.</td>
<td>Updated for the Quartus II software version 8.0</td>
</tr>
<tr>
<td></td>
<td></td>
<td>Added Stratix IV and HardCopy III devices</td>
</tr>
<tr>
<td></td>
<td></td>
<td>Updated walkthrough to target Stratix III Development Kit</td>
</tr>
<tr>
<td>October 2007 v1.1</td>
<td>Updated and added new figures, and added a new section.</td>
<td>—</td>
</tr>
<tr>
<td>February 2007 v1.0</td>
<td>Initial release</td>
<td>—</td>
</tr>
</tbody>
</table>