The MIT lecture, for example, shows a DQN with convolutional layers:
https://youtu.be/i6Mi2_QM3rA?t=1531
Or here: https://towardsdatascience.com/welcome-to-deep-reinforcement-learning-part-1-dqn-c3cab4d41b6b
I would be interested to know whether this is a problem in general or only with this implementation of DQN. I couldn't find any source saying that it is a problem.