Why use custom design?
- Latch: we use latch-based design to avoid the overhead of regular D flip-flops and to borrow time easily from stage to stage.
- Regular placement: full control of cell placement suits DSP blocks, which have regular data paths, and lets us minimize local cell routing. Standard cells normally share one cell height, so increasing drive strength means increasing cell width. In our custom design, drive strength is increased by growing the cell height instead, which keeps the data path flow regular and well organized.
- Clock: we can manually place latches in rows, which lets us use a regular fish-bone clock tree with large driver cells to minimize clock skew and insertion delay.
- Domino: we can use domino logic and tri-state logic when it’s needed.
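The time borrowing mentioned in the latch item can be illustrated with a small sketch. It assumes a simplified model that is not taken from the actual design: latches alternate between two complementary clock phases, each latch stays transparent for half a period, and all delays are in picoseconds. The function names and the numbers are hypothetical.

```python
def borrowed_time(stage_delays, period, setup=0):
    """Return how much time each stage borrows from the following stage.

    Assumed model (illustration only): consecutive latches open half a
    period apart and each stays transparent for half a period.
    Raises ValueError if data reaches a latch after it has closed.
    """
    half = period / 2
    depart = 0.0  # departure time, measured from the driving latch's opening edge
    borrowed = []
    for i, delay in enumerate(stage_delays):
        # Arrival measured from the opening edge of the receiving latch.
        arrive = depart + delay - half
        if arrive > half - setup:
            raise ValueError(f"stage {i} arrives after the latch closes")
        # Early data waits for the latch to open; late data passes straight
        # through the transparent latch and eats into the next stage's time.
        depart = max(0.0, arrive)
        borrowed.append(depart)
    return borrowed


def fits_hard_edges(stage_delays, period, setup=0):
    """Same stages with edge-triggered boundaries: no borrowing allowed."""
    half = period / 2
    return all(delay <= half - setup for delay in stage_delays)


delays_ps = [600, 300, 500, 400]           # hypothetical stage delays
print(borrowed_time(delays_ps, 1000))      # [100.0, 0.0, 0.0, 0.0]
print(fits_hard_edges(delays_ps, 1000))    # False
```

In this example the 600 ps stage overruns its 500 ps half-cycle by 100 ps, and the following 300 ps stage absorbs the overrun. With hard edge-triggered boundaries the same pipeline would fail at this clock period.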
Challenges of custom design?
- Design effort: far larger than RTL plus synthesis, and the result is very hard to modify. For example, we are porting one of our 28nm HPM designs to 28nm HPC+ and adding some new features. It will take 3 engineers (1 design, 2 layout) 4 to 5 months to finish. With RTL, it would take no more than a week.
- Timing analysis: we cannot use PrimeTime; instead we have to set up NanoTime and do characterization. OCV is not fully supported, and run time is much longer.
- May not be useful in 16nm FinFET process:
  - even larger on-chip variation in 16nm makes the lack of an STA tool a major disadvantage
  - the self-heating effect makes large clock trunk drivers weak spots
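To show why incomplete OCV support matters more as variation grows, here is a minimal sketch of a derated setup check. It uses the common textbook form, where the launch path is slowed by a late derate and the capture clock is sped up by an early derate. It is not the algorithm of NanoTime or PrimeTime, and every number and the function name `setup_slack` are illustrative assumptions, not data from the design.

```python
def setup_slack(period, launch_clk, data, capture_clk, setup, derate=0.0):
    """Setup slack in ps with a symmetric OCV derate (illustrative model).

    derate=0.1 means launch clock and data are taken 10% slower,
    and the capture clock 10% faster, than nominal.
    """
    late = 1.0 + derate
    early = 1.0 - derate
    arrival = (launch_clk + data) * late               # pessimistic data arrival
    required = period + capture_clk * early - setup    # pessimistic capture time
    return required - arrival


# Hypothetical path: 1000 ps period, 200 ps clock insertion delay on both
# sides, 700 ps of logic, 50 ps setup time.
for derate in (0.0, 0.1, 0.2):
    print(derate, round(setup_slack(1000, 200, 700, 200, 50, derate), 3))
```

With these assumed numbers the slack shrinks from about 250 ps with no derate to about 140 ps at 10% and about 30 ps at 20%, which is why a flow that cannot model OCV fully becomes riskier as variation increases.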