Update: I have tested my ASM code as "standalone" code, and it works fine (I only changed "beq bit_clear" to "bne bit_clear" and added push/pop).
So I suppose there is only one reason the FIQ handler does not work correctly: it was not registered properly, and the CPU jumps into undefined space. But why?