crc32: add fast variant of regular crc_32 function

The CRC32_FAST macro selects which version is compiled. In tests the
fast version is 2x faster, at the expense of 960 more bytes for the
lookup table. For now the default is the speed-optimized version, but in
the future I would like to enable this for ports where we can afford the
extra storage and/or memory requirements.

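For illustration only, here is a minimal sketch of how a compile-time
switch like this could look. It is not the code in this change. It
assumes the reflected IEEE 802.3 polynomial and a 16-entry versus
256-entry table, which would account for the 960-byte difference
(1024 - 64). Every name except CRC32_FAST is hypothetical, and the table
is built at run time here only to keep the example short.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: only CRC32_FAST comes from the commit message. */
#define CRC32_POLY 0xEDB88320u /* reflected IEEE 802.3 polynomial (assumed) */

#ifdef CRC32_FAST
#define CRC32_BITS 8 /* 256 entries * 4 bytes = 1024-byte table */
#else
#define CRC32_BITS 4 /* 16 entries * 4 bytes = 64-byte table */
#endif
#define CRC32_ENTRIES (1u << CRC32_BITS)

static uint32_t crc32_table[CRC32_ENTRIES];
static int crc32_table_ready;

/* Fill the table: entry i is the CRC remainder of the CRC32_BITS-bit value i. */
static void crc32_init_table(void)
{
	for (uint32_t i = 0; i < CRC32_ENTRIES; i++) {
		uint32_t c = i;

		for (int k = 0; k < CRC32_BITS; k++)
			c = (c & 1u) ? (c >> 1) ^ CRC32_POLY : c >> 1;
		crc32_table[i] = c;
	}
	crc32_table_ready = 1;
}

uint32_t crc32_sketch(const uint8_t *buf, size_t len)
{
	uint32_t crc = 0xFFFFFFFFu;

	if (!crc32_table_ready)
		crc32_init_table();

	while (len--) {
		crc ^= *buf++;
#ifdef CRC32_FAST
		/* One lookup per byte. */
		crc = (crc >> 8) ^ crc32_table[crc & 0xFFu];
#else
		/* Two lookups per byte, one per 4-bit nibble. */
		crc = (crc >> 4) ^ crc32_table[crc & 0x0Fu];
		crc = (crc >> 4) ^ crc32_table[crc & 0x0Fu];
#endif
	}
	return crc ^ 0xFFFFFFFFu;
}
```

Both variants should produce the same checksum. Building with
-DCRC32_FAST only trades table size for fewer lookups per byte. A real
port would more likely keep the table const so it lands in flash rather
than RAM.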
Change-Id: I8c7fde6b6ff130f0fdc7c8e472825e25bcdba022