I think the high-level question is: do we really want BYTES_IN_PARTICLE != BYTES_IN_WORD?

Under some object models, there is an advantage in 64-bit mode to being able to "mis-align" the lowest word of the object so that a subsequent field (e.g. at offset 4 in the object) can be 64-bit aligned.  I think it is likely that, with the switch to the forward scalar object model and the current placement of the array length to the "right" of the TIB, there is no longer an advantage in doing this.

If it makes the code simpler/cleaner, then my suggestion would be that we drop the distinction between BYTES_IN_PARTICLE and BYTES_IN_WORD.

There are still cases where the VM is going to want to allocate storage that (at a known offset from the start of the allocated region) is aligned to a larger unit than BYTES_IN_WORD (for example, double[] in 32-bit mode and scalar objects that contain long/double fields in 32-bit mode).  So we need this ability, but subword alignment in 64-bit mode may not be that useful (at least under the current object model).
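
To make the "aligned at a known offset" requirement concrete, here is a minimal sketch of the arithmetic involved. It is not the VM's actual allocator code: the class name AlignSketch, the method alignWithOffset, and the 12-byte header used in the example are all hypothetical, and addresses are modelled as plain longs.

```java
public final class AlignSketch {
    /**
     * Returns the smallest address >= cursor such that (address + offset)
     * is a multiple of alignment. Assumes alignment is a power of two.
     * Hypothetical helper, for illustration only.
     */
    public static long alignWithOffset(long cursor, int alignment, int offset) {
        long mask = alignment - 1L;
        // Number of bytes to skip so that (cursor + delta + offset) % alignment == 0.
        long delta = (-(cursor + offset)) & mask;
        return cursor + delta;
    }

    public static void main(String[] args) {
        // 32-bit mode, double[]: assume (hypothetically) a 12-byte header,
        // so element 0 sits at offset 12 and must be 8-byte aligned.
        long arrayStart = alignWithOffset(0x1000L, 8, 12);
        System.out.println(Long.toHexString(arrayStart));      // 1004
        System.out.println((arrayStart + 12) % 8);              // 0

        // The "mis-align" case: a field at offset 4 must be 8-byte aligned,
        // so the object start itself ends up only 4-byte aligned.
        long objectStart = alignWithOffset(0x2000L, 8, 4);
        System.out.println(Long.toHexString(objectStart));     // 2004
        System.out.println(objectStart % 8);                    // 4
    }
}
```

Both cases come out of the same calculation; the only question raised above is whether the second case (a start address that is deliberately not word aligned) is still needed in 64-bit mode.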