Pixel shift: 4 x megapixels but how much more resolution?
My brain is melting. Help!
My Sony A7riv supports 4-shot and 16-shot pixel shifting which I use frequently and my question relates to this. I've discussed pixel-shifting on other threads but not in this precise context so I'll define the terms again for clarity.
The 4-shot mode shifts the sensor left, right, up or down by 3.76µm, the pitch of one pixel (or sensel), for each shot. When these 4 images are combined in Sony's Imaging Edge software, every pixel in the output RAW image gets true, sampled R, G and B values with no interpolated values generated by a debayering algorithm (as happens for single-shot images, where only one of R, G or B is ever a true sample while the other two are guessed from values in neighbouring pixels). The 4-shot method significantly improves colour resolution but doesn't increase image size. The output image has identical pixel dimensions and megapixel count to each of the input images.
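To make the sampling pattern concrete, here's a small Python sketch - an illustration only, not Sony's actual combining step - that checks the four full-pixel shifts give every output pixel a directly sampled R, G and B value over an RGGB mosaic:

```python
import numpy as np

def bayer_channel(r, c):
    """0=R, 1=G, 2=B for an RGGB pattern at sensel (row r, col c)."""
    return {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 2}[(r % 2, c % 2)]

H, W = 6, 6                                   # a tiny stand-in sensor
shifts = [(0, 0), (0, 1), (1, 1), (1, 0)]     # sensor offsets in whole pixels
sampled = np.zeros((H, W, 3), dtype=bool)     # which channels got a true sample

for dy, dx in shifts:
    for y in range(H):
        for x in range(W):
            # in this shot, scene position (y, x) is covered by sensel (y - dy, x - dx)
            sampled[y, x, bayer_channel(y - dy, x - dx)] = True

print(sampled.all())   # True: no channel ever needs to be interpolated
```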
The 16-shot mode can be considered as four separate 4-shot sequences. However, in this case the sensor is pre-shifted left, right, up or down by HALF the sensel pitch (1.88µm) before triggering each 4-shot sequence that proceeds exactly as stated above. When the resultant 16 images are combined in Sony's Imaging Edge software, each and every pixel in the output RAW image has true R, G and B values as before, but the image is twice the pixel width and height of the originals, thus four times as many megapixels.
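The same idea shows why the 16-shot output is twice the width and height: the half-pitch pre-shifts combined with the full-pitch shifts visit 16 distinct sensor positions on a half-pixel grid (again, just an illustrative sketch):

```python
from itertools import product

pre  = [(0.0, 0.0), (0.0, 0.5), (0.5, 0.5), (0.5, 0.0)]   # half-pitch pre-shifts
full = [(0, 0), (0, 1), (1, 1), (1, 0)]                   # full-pitch shifts per 4-shot cycle
positions = sorted({(py + fy, px + fx) for (py, px), (fy, fx) in product(pre, full)})

print(len(positions))   # 16 distinct positions...
print(positions)        # ...on a grid with 0.5-pixel (1.88µm) spacing, hence 2x width and 2x height
```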
In most documentation and discussions, megapixels and resolution are conflated, so the resultant image is deemed to have twice the linear resolution, or four times the resolution overall. This is clearly not true! For that you would need a sensor with four times as many sensels in the same space (twice as many horizontally and vertically). They'd be 1.88µm square too - which they certainly aren't in this case. Pixel-shifting doesn't quite emulate that.
Instead, every pixel in the output is a mixture of the RGB light (signal) that fell directly on separate portions of it (several times) AND half the signal that fell on each of its four neighbours in the left, right, up and down directions AND a quarter of the signal that fell on each of its four diagonal neighbours. It's like a sensor with 3.76µm sensels positioned at 1.88µm pitch, where the overlapping parts could all receive a full-strength signal from the light falling on each of them (physically impossible).
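Those overlap fractions can be checked with a few lines of Python (my own sketch; units are half-pixel pitches, so the 3.76µm aperture is 2 units wide and each output cell is 1 unit wide):

```python
def overlap_1d(a_lo, a_hi, b_lo, b_hi):
    """Length of the overlap between intervals [a_lo, a_hi] and [b_lo, b_hi]."""
    return max(0.0, min(a_hi, b_hi) - max(a_lo, b_lo))

aperture = (-1.0, 1.0)   # one 3.76µm sensel aperture, centred on the output cell at (0, 0)

for dy in (-1, 0, 1):
    for dx in (-1, 0, 1):
        cell_y = (dy - 0.5, dy + 0.5)        # neighbouring 1.88µm output cell
        cell_x = (dx - 0.5, dx + 0.5)
        frac = overlap_1d(*aperture, *cell_y) * overlap_1d(*aperture, *cell_x)
        print((dy, dx), frac)
# (0, 0): 1.0   left/right/up/down neighbours: 0.5   diagonal neighbours: 0.25
```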
So what on earth does this mean for a resolution increase - or the amount of oversampling, if you prefer to look at it that way? I'm completely flummoxed. I wouldn't be surprised if sqrt(2) was involved here, i.e. the increase in true resolution or oversampling is 1.4x, but I can't figure it out. The more I think about it, the more confused I get.
Over to the geniuses, and thanks in advance.
Cheers
Beats
Erratum: in trying to simplify the description as much as possible for clarity, I inadvertently implied/stated that the sensor was shifted up, down, left and right from a fixed "home" position. The sensor is shifted down, left, up and right in that order, with a shot being taken before each move. The sensor ends up back in the initial position. I've left the original text so as not to invalidate related responses.
Re: Pixel shift: 4 x megapixels but how much more resolution?
not a genius, but I've worked with pixel shift quite a lot and like to think about resolution...
short story is that the real world difference seems impossible to calculate theoretically, for several reasons.
the main one is that it depends on the physical construction of the sensor, mainly fill factor and micro lenses.
if the sensor's light-sensitive surface were completely flat and each pixel had a 44% fill factor, then in theory it should be possible to achieve 4x the resolution of a full RGB image. with a fill factor of 100% there would be hardly any increase.
in reality the sensor surface is quite complex, with micro lenses to guide the light, reducing what is possible. it will probably even depend on the location of the pixel (center vs corner) and the angle the light hits the sensor (focal length and design of the lens).
on top of that, the mechanics of the pixel shift likely aren't perfect.
so instead of calculating, it makes more sense to simply test a real world system. of course that also introduces limitations of the lens, but that's usually a good thing.
just some thoughts
chris
Re: Pixel shift: 4 x megapixels but how much more resolution?
The conflated version is more accurate than not: half-pixel shifting does double the usable cutoff frequency of the sensor, but the extended range is captured at a progressively lower contrast.
A math summary goes like this...
Assume that the photosites of the sensor have 100% uniform coverage, so their response is a hat function with total width equal to pixel pitch.
The Fourier transform of a hat function is a sinc function whose first zero occurs at one full cycle per width of the hat.
So, ignoring sampling, the MTF of the sensor will be that same sinc function, not falling to zero until a spatial frequency of one cycle per pixel width.
But if you sample only at full pixel positions, the Nyquist limit kicks in and restricts the valid range to less than one cycle per two pixel widths. All frequencies higher than that will be aliased down to lower frequencies, giving an incorrect result.
Half-pixel shifting moves the Nyquist limit out to one cycle per sensor pixel width, which then matches the first zero of the MTF. This allows valid capture of spatial frequencies between one and two cycles per two sensor pixels, albeit at the reduced contrast of the sinc function as it drops toward its first zero.
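For concreteness, a small numerical sketch of that (the pitch and frequencies are illustrative only):

```python
import numpy as np

p = 1.0                                    # pixel pitch (arbitrary units)
f = np.array([0.25, 0.5, 0.75, 1.0]) / p   # spatial frequency, cycles per pixel width

mtf = np.abs(np.sinc(f * p))               # np.sinc(x) = sin(pi*x)/(pi*x): hat-function MTF
for fi, m in zip(f, mtf):
    print(f"{fi:.2f} cy/px   MTF {m:.2f}")

# Nyquist limit: 0.5 cy/px with full-pixel sampling, 1.0 cy/px with half-pixel shifts.
# The newly usable band from 0.5 to 1.0 cy/px is captured at the low contrast shown
# above (MTF ~0.64 at 0.5 cy/px, falling to 0 at 1.0 cy/px).
```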
Make sense?
--Rik
Re: Pixel shift: 4 x megapixels but how much more resolution?
Thanks Chris,
I have tried comparing 4- vs 16-shot pixel-shift images of the same subject, but I'm so close to the limits of resolution (using 365nm illumination), and arguably need a larger image scale projected onto the sensor, that it was difficult to draw any useful conclusions.
I'm currently projecting images at 1x through a setup designed for C-Mount cameras - it's what my Oly BX61 came fitted with. I had decided to try more magnification with a projection lens and was gifted the perfect "Olympus PE 2.5x 125" by a very good friend. After reading loads of "stuff" about mounting cameras on the BX series, I *thought* I'd found the perfect adapters - but they didn't work - at all. I didn't waste *too* much money.

I now know what I'm really looking for, but expect a long wait for it. So I'm exploring my current method of increasing image scale via pixel shift a bit further. It might be worth accepting that, instead of enduring the long-term annoyance, and cost, of trying to convert to a projection-lens setup. I thought a bit of "theory" might help settle that decision. It probably won't, as you say, but it's worth looking, even if only to scratch the itch.
Re: Pixel shift: 4 x megapixels but how much more resolution?
Hi Rik,
rjlittlefield wrote: ↑Fri Oct 06, 2023 8:35 am
... half-pixel shifting does double the usable cutoff frequency of the sensor, but the extended range is captured at a progressively lower contrast.
--Rik
Yes, it did all make perfect sense in the end - once I realised "sinc function" wasn't a typo for "sine function". I looked that up and all became clear.
The quoted section was immediately clear though. Thanks for the explanation - it all helps.
One emerging question though. The "progressively lower contrast" applies to progressively smaller or closer features on a subject, right? That is, in a practical example, the closer the dots on a diatom get, the less contrast there will be between them and the gap between. This sounds pretty much like the situation normally (without pixel shift). Have I missed something there?
Cheers
Re: Pixel shift: 4 x megapixels but how much more resolution?
I can't speak for Sony's 4x vs 16x pixel-shift capture, but for me a good estimate of an image's "true" information content is to downscale it by a factor of x, then upscale it again by 1/x to the original resolution, and find the factor at which the process doesn't result in any information loss.
on the S1R with my best lenses this downscale factor x is about 0.7x
in other words:
an 8x pixel-shifted image nominally has 188MP.
if I downscale it by 0.7x I get a 132MP image which can be upscaled again to 188MP without noticeable loss of detail.
a perfect full RGB image of the sensor would result in 47MP worth of information (actually impossible to achieve without oversampling).
so the pixelshifted image has about 2.8x the resolution in total pixels, or about 1.67x in each direction.
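A rough sketch of that round-trip test (illustration only, assuming Pillow and numpy; the file name is a placeholder):

```python
import numpy as np
from PIL import Image

def roundtrip_loss(path, linear_factor):
    """Downscale by linear_factor in each dimension, upscale back,
    and return the mean absolute difference per 8-bit channel."""
    img = Image.open(path).convert("RGB")
    w, h = img.size
    small = img.resize((round(w * linear_factor), round(h * linear_factor)), Image.LANCZOS)
    back = small.resize((w, h), Image.LANCZOS)
    return np.abs(np.asarray(img, float) - np.asarray(back, float)).mean()

# ~0.84 linear is ~0.7x in pixel count, matching the 188MP -> 132MP example above
print(roundtrip_loss("pixelshift_crop.tif", 0.84))
```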
I would expect the sony to be in the same ballpark. If you like, upload a full resolution or a 4x and 16x pixelshift tif file (a small crop of the sharpest part actually is enough) and I'll have a look.
chris
Re: Pixel shift: 4 x megapixels but how much more resolution?
There is a hidden flaw for half-pixel shifts, namely that they shift half a pixel up, down, left and right - no diagonal shifts, as mentioned. The camera sensor is made of groups of four smaller cells covered with R, G and B filters (the Bayer pattern), most of them arranged as two greens, one red and one blue, with the R and B most likely positioned diagonally. So unless the sensor is shifted diagonally, the R cell will never capture anything where the B cell is (was), i.e. info captured by the R cell never contains blue-channel info.
Therefore, the signals captured by simply shifting up, down, left and right can NOT be interpreted as coming from a denser sensor (each cell is a sensor) with a global RGB filter over the lens (so that all cells capture R, G and B info).
Without shifting, yeah, we can treat each pixel (4 cells) as if it were just one sample and then apply Nyquist theory or any other signal-processing theory. But when shifting only up, down, left and right, with no diagonal shifting, things get fundamentally different; the discrete nature of the system might kick in.
Just 2 cents.
Re: Pixel shift: 4 x megapixels but how much more resolution?
Beatsy wrote: ↑Fri Oct 06, 2023 11:26 am
One emerging question though. The "progressively lower contrast" applies to progressively smaller or closer features on a subject, right? That is, in a practical example, the closer the dots on a diatom get, the less contrast there will be between them and the gap between. This sounds pretty much like the situation normally (without pixel shift). Have I missed something there?
Nope, you're good. Small details lose contrast for multiple reasons, whose effects tend to combine by multiplying the MTFs.
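A minimal numerical illustration of the MTFs multiplying (the lens MTF below is made up purely to show the combination):

```python
import numpy as np

f = np.linspace(0, 1, 5)                 # spatial frequency, cycles per pixel width
lens = np.clip(1 - f / 1.2, 0, None)     # hypothetical lens MTF, cutoff at 1.2 cy/px
pixel = np.abs(np.sinc(f))               # 100%-fill photosite aperture MTF
system = lens * pixel                    # the contrast losses multiply
print(np.round(system, 2))
```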
--Rik
Re: Pixel shift: 4 x megapixels but how much more resolution?
Something has gone wrong with that textual description (and all the analysis that follows from it).
The animation at https://www.youtube.com/watch?v=lwwzM81k5VE makes clear that diagonal shifts are captured also. They are the "4th cycle", beginning at 1:41.
--Rik
Re: Pixel shift: 4 x megapixels but how much more resolution?
rjlittlefield wrote: ↑Fri Oct 06, 2023 9:01 pm
Something has gone wrong with that textual description (and all the analysis that follows from it).
The animation at https://www.youtube.com/watch?v=lwwzM81k5VE makes clear that diagonal shifts are captured also. They are the "4th cycle", beginning at 1:41.
--Rik
Thanks Rik. Yes, in trying to simplify the description as much as possible for clarity, I inadvertently implied/stated that the sensor was shifted up, down, left and right from a fixed "home" position. As the video shows, the sensor is shifted down, left, up and right in that order - so diagonals are included. My bad. Sorry. I've added an "erratum" to the original post.
Re: Pixel shift: 4 x megapixels but how much more resolution?
ah, OK. Well, I do not have access to YouTube all the time.
so from the OP's description, I mistook it as being shifted by half a pixel (from the words "pre-shifted . . . by HALF"), then performing full pixel shifts (from the words "before triggering each 4-shot sequence that proceeds exactly as stated above") - I mistook "stated above" as referring to the 4 full-pixel-shift operation.
It looks like it is a two-stage half-pixel shifting scheme; in that case, sure, it all works fine (in theory).
However, for those schemes with a single-stage half-pixel shift, the additional loss of contrast will be there, as essentially that kind of scheme can be thought of as "sensor-assisted" image enlargement. It will be better than plain image enlargement, but the missing information requires some kind of artificial calculation to fill in the void.
Re: Pixel shift: 4 x megapixels but how much more resolution?
For what it is worth, the engineers at Olympus have estimated that the real resolution of an 80Mp image made by a 20Mp sensor with 8-shot pixel shifting is equivalent to a 50Mp image. The scaling factor is about 0.625. This is the reduction they make in the camera when outputting a jpg from the 80Mp RAW shifted image.
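For comparison (simple arithmetic, not from either source), the linear-resolution gains implied by the two empirical estimates in this thread:

```python
print((50 / 20) ** 0.5)    # Olympus 8-shot estimate: 50MP real vs 20MP sensor  -> ~1.58x linear
print((132 / 47) ** 0.5)   # S1R round-trip estimate: 132MP real vs 47MP sensor -> ~1.68x linear
```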