That's the OP's point. His/her question is: why allocate an entire page if, instead, you could have a single bit 'process did not initialize its source of randomness yet' in kernel space per process?
The only advantage I see is that the current solution allows one to implement the random number generator independently of the kernel. Introducing that bit creates a tight coupling.
Actually, there's an even better solution: give the kernel a userspace address range that it zeroes on fork. There's no reason at all why this address range should be restricted to exactly 4KB or a multiple thereof (one could imagine the kernel doing some page-table tricks to avoid a full memset() for large areas, but that's an optimization that can be added transparently to an API that supports arbitrary address ranges).
The futex API (including set_tid_address) is a precedent for this kind of syscall: there, too, userspace registers an address that the kernel later writes to on its own.
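To make the "arbitrary address range zeroed on fork" idea concrete, here is a rough userspace sketch of the semantics I have in mind. It is only an emulation: wipe_on_fork() is a hypothetical name, not an existing syscall, and it is built on pthread_atfork() instead of kernel support. The struct rng_state and demo_fork_resets_state() are likewise made up for illustration.

```c
#include <pthread.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a "zero this range on fork" syscall.
 * Only one range is tracked, to keep the sketch short. */
static void  *wipe_addr;
static size_t wipe_len;

/* Runs in the child right after fork() returns there. */
static void wipe_in_child(void)
{
    if (wipe_addr != NULL)
        memset(wipe_addr, 0, wipe_len);
}

/* Register an arbitrary range; it need not be page-aligned or page-sized.
 * Returns 0 on success, an error number otherwise. */
static int wipe_on_fork(void *addr, size_t len)
{
    static int handler_installed;

    if (!handler_installed) {
        int err = pthread_atfork(NULL, NULL, wipe_in_child);
        if (err != 0)
            return err;
        handler_installed = 1;
    }
    wipe_addr = addr;
    wipe_len = len;
    return 0;
}

/* Illustrative RNG state: a few dozen bytes, far less than a page. */
struct rng_state {
    int seeded;
    unsigned char key[32];
};

/* Returns 1 if the child saw a wiped state while the parent kept its own,
 * 0 if not, -1 on error. */
static int demo_fork_resets_state(void)
{
    static struct rng_state st;
    int status;
    pid_t pid;

    memset(st.key, 0xAB, sizeof st.key);
    st.seeded = 1;
    if (wipe_on_fork(&st, sizeof st) != 0)
        return -1;

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)   /* child: the state must read as "not seeded yet" */
        _exit((st.seeded == 0 && st.key[0] == 0) ? 0 : 1);

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return st.seeded == 1 && WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

The catch with doing this in userspace is that atfork handlers only run when the process goes through fork() itself; a raw clone() bypasses them, which is why the range would have to be registered with the kernel to be reliable.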