I needed to align instruction execution to a 64-bit boundary for a custom CPU architecture that I'm building. The ISA has three instruction lengths: 16-bit, 32-bit, and 48-bit. The core does a 64-bit fetch from the instruction cache, and the issue was that if an instruction straddled two blocks, I needed to fetch both blocks, which would hurt performance a bit. So I had to modify the GCC sources for the ISA that I'm using. Instead of doing it the right way, I just did it the lazy way and modified the GCC backend to print

.p2alignw 3, 0x00ff, 4
.p2alignw 3, 0x00ff, 2

on each 48-bit and 32-bit instruction, to generate NOPs. And it did work, lol.
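
To see why those two directives are enough, here is a rough Python model of the padding logic. In GNU as, `.p2alignw 3, 0x00ff, N` pads to the next 8-byte boundary with the 2-byte fill word `0x00ff`, but only if that takes at most `N` bytes; otherwise it emits nothing. The sketch makes some assumptions that are not spelled out above: the max-skip 4 directive goes before 48-bit instructions and the max-skip 2 one before 32-bit instructions, instructions always start on a 16-bit boundary, and `0x00ff` is the NOP encoding. The function names are made up for illustration.

```python
BLOCK = 8  # fetch width in bytes (64 bits), i.e. 2**3 from ".p2alignw 3"

# Assumed mapping (not stated in the post): instruction length in bytes
# -> max-skip argument of the directive printed before it.
# 16-bit instructions get no directive, modelled here as max-skip 0.
MAX_SKIP = {6: 4, 4: 2, 2: 0}


def p2align_padding(offset, max_skip, block=BLOCK):
    """Bytes of padding emitted at `offset`, or 0 if more than max_skip is needed."""
    pad = (-offset) % block
    return pad if pad <= max_skip else 0


def layout(lengths):
    """Return (start_offset, length) for each instruction after padding."""
    offset = 0
    placed = []
    for n in lengths:
        offset += p2align_padding(offset, MAX_SKIP[n])
        placed.append((offset, n))
        offset += n
    return placed


def straddles(start, length, block=BLOCK):
    """True if the instruction's bytes fall into two different fetch blocks."""
    return start // block != (start + length - 1) // block


if __name__ == "__main__":
    # 32-bit, 48-bit, 32-bit, 32-bit
    print(layout([4, 6, 4, 4]))  # [(0, 4), (8, 6), (16, 4), (20, 4)]
```

In this model the padding is only emitted when the instruction would otherwise cross a block boundary: a 48-bit instruction at offset 2 needs 6 bytes to reach the boundary, which exceeds the max-skip of 4, so it is left alone and fits exactly.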
