The Radeon 760M was potentially a better choice for handheld gaming devices, with its 8 RDNA3 CUs. Those obviously had to be implemented using both shader arrays, since a single shader array only provides 6 RDNA3 CUs. So the 760M is probably implemented as 4 CUs from each shader array, with 2 CUs disabled on each shader array.
Now each shader array (see block diagram) houses 2 Render Back-Ends (RBEs) with 8 ROPs per RBE, for a total of 4 RBEs across the 2 shader arrays and up to 32 ROPs. But on the 760M only 16 of those ROPs are enabled (one RBE per shader array). That leaves a very unbalanced ROPs-to-shaders ratio: with only 16 ROPs, that 760M binning is not really going to need all the shader compute its 8 enabled CUs provide.
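To make the ROP arithmetic above concrete, here is a minimal sketch. The counts come straight from the paragraph above; the function and constant names are just made up for illustration.

```python
# Topology as described in the post; names are illustrative, not AMD's.
SHADER_ARRAYS = 2
RBES_PER_ARRAY = 2
ROPS_PER_RBE = 8


def total_rops(rbes_enabled_per_array: int) -> int:
    """ROPs available when this many RBEs are enabled in each shader array."""
    if not 0 <= rbes_enabled_per_array <= RBES_PER_ARRAY:
        raise ValueError("a shader array only has 2 RBEs")
    return SHADER_ARRAYS * rbes_enabled_per_array * ROPS_PER_RBE


full_die_rops = total_rops(2)   # all 4 RBEs enabled
cut_down_rops = total_rops(1)   # one RBE per shader array, as on the 760M

print(f"Full die: {full_die_rops} ROPs, 760M: {cut_down_rops} ROPs")
```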
If the 760M had all 32 ROPs enabled across all 4 RBEs, then its pixel fill rate could have matched or even exceeded the 780M's, because with fewer CUs and shader cores enabled the 760M would have used less power and could have held higher average sustained clocks. And really, the 760M with only 16 ROPs enabled is never going to see even its 8 CUs' worth of shader compute taxed to begin with, so unbalanced is that shader-cores-to-ROPs/RBE ratio. A Radeon 760M variant with 32 ROPs/4 RBEs enabled would have had a better ratio and could game every bit as well as the 780M, as not that much shader compute is needed! That's especially so for RDNA3, where each shader core can dual-issue FP32 instructions, so there is plenty of FP32 compute for 4 full RBEs and 32 ROPs even with only 8 CUs enabled on that 760M SKU.
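A rough way to put numbers on that balance argument is shaders per ROP. This sketch assumes 64 shader cores per RDNA3 CU (my assumption, not stated above) and takes the 780M as the full 12 CUs (6 per shader array) with 32 ROPs. It ignores dual-issue, which would only add more FP32 per ROP.

```python
SHADERS_PER_CU = 64  # assumption: shader cores per RDNA3 CU


def shaders_per_rop(cus: int, rops: int) -> float:
    """Shader cores available per ROP for a given configuration."""
    return cus * SHADERS_PER_CU / rops


# (CUs, ROPs) per configuration; the middle entry is the hypothetical variant.
configs = {
    "760M as shipped (8 CUs, 16 ROPs)": (8, 16),
    "760M hypothetical (8 CUs, 32 ROPs)": (8, 32),
    "780M (12 CUs, 32 ROPs)": (12, 32),
}

for name, (cus, rops) in configs.items():
    print(f"{name}: {shaders_per_rop(cus, rops):.0f} shaders per ROP")
```

Under those assumptions the shipped 760M carries more shader cores per ROP than the 780M does, while the hypothetical 32-ROP variant carries fewer.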
Now the number of TMUs scales with the CU count, while the number of RBEs scales with the shader arrays enabled, so the 760M with fewer TMUs enabled would have less texel processing capability. But if average sustained clocks are higher thanks to fewer shaders being enabled and less power draw, then any texel processing deficiency can be at least partially made up there. And if 32 ROPs were enabled on the 760M and its average sustained clocks were higher, then the 760M would have a higher average sustained pixel fill rate than even the 780M for gaming workloads.
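Here is a back-of-envelope sketch of that clock argument, using the usual theoretical peaks (ROPs x clock for pixels, TMUs x clock for texels). The 4 TMUs per CU figure is my assumption, and both clock values are hypothetical placeholders picked only to show the direction of the effect, not measured sustained clocks.

```python
TMUS_PER_CU = 4  # assumption: texture units per RDNA3 CU


def pixel_fill_mpix(rops: int, clock_mhz: int) -> int:
    """Theoretical peak pixel fill rate in MPixels/s (one pixel per ROP per clock)."""
    return rops * clock_mhz


def texel_fill_mtex(cus: int, clock_mhz: int) -> int:
    """Theoretical peak texel rate in MTexels/s (one texel per TMU per clock)."""
    return cus * TMUS_PER_CU * clock_mhz


# Hypothetical sustained clocks, NOT measurements.
CLOCK_780M_MHZ = 2400
CLOCK_760M_32ROP_MHZ = 2600

pixel_780m = pixel_fill_mpix(32, CLOCK_780M_MHZ)
pixel_760m_32rop = pixel_fill_mpix(32, CLOCK_760M_32ROP_MHZ)

texel_780m = texel_fill_mtex(12, CLOCK_780M_MHZ)
texel_760m_32rop = texel_fill_mtex(8, CLOCK_760M_32ROP_MHZ)
texel_ratio = texel_760m_32rop / texel_780m

print(f"Pixel fill: 780M {pixel_780m} vs 32-ROP 760M {pixel_760m_32rop} MPix/s")
print(f"Texel rate of 32-ROP 760M relative to 780M: {texel_ratio:.2f}")
```

With equal ROP counts, any clock advantage goes straight into pixel fill rate. The texel side only reaches parity at 1.5x the 780M's clock (12 CUs vs 8), which is why the deficit gets partially rather than fully made up.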