@YuKay it can be a bit complicated depending on how you want to look at it.
Caution: what follows is a great simplification of how things work.
4K = 3840 x 2160 pixels or 4096 x 2160 pixels
The EVO sensor is 12 MP which is typically 4000 x 3000 pixels
The 1" sensor is 20 MP which is 5472 x 3648 pixels
More pixels doesn't mean better pixels.
The sensors are likely recent Sony designs. Typically the 1" sensor would have larger pixels, which means more sensitivity to light, and that usually shows up as better low-light photos and videos with less noise.
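To put rough numbers on "larger pixels", here is a back-of-the-envelope sketch in Python. The sensor widths are my assumption (commonly quoted nominal values of about 13.2 mm for a 1" type and 6.17 mm for a 1/2.3" type, not Autel's published specs), and the 12 MP sensor is assumed to be the 1/2.3" type. The function names are made up for illustration. The last function assumes noise is dominated by photon shot noise, which is itself a simplification.

```python
# Nominal active-area widths in mm. ASSUMED typical values, not from any Autel spec sheet.
SENSOR_WIDTH_MM = {'1"': 13.2, '1/2.3"': 6.17}
# Horizontal pixel counts quoted earlier in this post.
PIXELS_WIDE = {'1"': 5472, '1/2.3"': 4000}


def pixel_pitch_um(sensor):
    """Approximate pixel pitch in micrometres: sensor width / pixels across."""
    return SENSOR_WIDTH_MM[sensor] * 1000.0 / PIXELS_WIDE[sensor]


def area_ratio(big='1"', small='1/2.3"'):
    """How many times more area one pixel of `big` has than one pixel of `small`."""
    return (pixel_pitch_um(big) / pixel_pitch_um(small)) ** 2


def shot_noise_snr_gain(big='1"', small='1/2.3"'):
    """Per-pixel SNR advantage if noise is pure shot noise (SNR ~ sqrt(light collected))."""
    return area_ratio(big, small) ** 0.5


if __name__ == "__main__":
    for name in SENSOR_WIDTH_MM:
        print(f"{name}: about {pixel_pitch_um(name):.2f} um per pixel")
    print(f"pixel area ratio: about {area_ratio():.1f}x")
    print(f"shot-noise SNR gain per pixel: about {shot_noise_snr_gain():.2f}x")
```

With those assumed widths, each pixel on the 1" sensor has a bit more than twice the light-gathering area, even though that sensor has more pixels.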
Now come the technical challenges. How do you go from 4000 x 3000 to 3840 x 2160, or from 5472 x 3648 to 3840 x 2160? Do you simply "look" at only a 3840 x 2160 subset of pixels, or do you sample all of them and downsize to the desired 3840 x 2160?
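The two options can be sketched with a little arithmetic. This is only an illustration of the geometry, not how Autel's (or anyone's) image pipeline actually works, and the function names are hypothetical. The "downsize" option here assumes the camera first takes the largest 16:9 window that fits on the sensor and then scales it.

```python
def crop_plan(sensor_w, sensor_h, out_w=3840, out_h=2160):
    """Option 1: read only an out_w x out_h window of pixels, with no scaling."""
    if out_w > sensor_w or out_h > sensor_h:
        raise ValueError("output is larger than the sensor")
    return {
        "window": (out_w, out_h),
        "fraction_of_sensor_used": (out_w * out_h) / (sensor_w * sensor_h),
        "scale": 1.0,
    }


def downsample_plan(sensor_w, sensor_h, out_w=3840, out_h=2160):
    """Option 2: take the largest window with the output aspect ratio, then scale it down."""
    # The window is limited either by the sensor width or by the sensor height.
    win_w = min(sensor_w, sensor_h * out_w // out_h)
    win_h = win_w * out_h // out_w
    return {
        "window": (win_w, win_h),
        "fraction_of_sensor_used": (win_w * win_h) / (sensor_w * sensor_h),
        "scale": out_w / win_w,
    }


if __name__ == "__main__":
    for w, h in [(4000, 3000), (5472, 3648)]:
        print(f"{w} x {h}")
        print("  crop:      ", crop_plan(w, h))
        print("  downsample:", downsample_plan(w, h))
```

For the 12 MP sensor, the full-width 16:9 window is 4000 x 2250, so there is very little to downsize. For the 20 MP sensor it is 5472 x 3078, so either a lot of pixels get combined or a straight crop uses well under half of the sensor.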
The magic happens in the choices made in getting to 3840 x 2160 and in the image processor used. With good choices, like the ones Autel seems to have made, in good light you may not see much difference between 4K60p from a 1" sensor and from a 1/2.3" sensor. In low light the physics of pixel size will definitely favor the larger pixels in the 1" sensor. There's only so much magic that the image processor can do.
If you're taking photos instead of video, the differences are magnified by the greater resolution of the 1" sensor.
P.S. never underestimate the value of a good glass lens.
P.P.S. the stuttering *might* be attributable to how the video from the camera is handled in post-processing.
I hope this makes sense.