Quote from: Dont_Look_Up on April 15, 2025, 20:28:18
ZBook Ultra has a larger battery than the Flow Z13 (74.5 vs. 56 watt-hours)

Quote from: Dont_Look_Up on April 15, 2025, 20:28:18
shows how unoptimised the default power profiles are.
Quote from: Worgarthe on April 15, 2025, 14:13:45
Edit: So the game runs at 30 fps but the total system power is 7.6 to 8.4 W, which is completely insane and way too impressive 😳
Quote"Dual-Channel" seems wrong even on HP's datasheet and quickspecs. Ignoring that LPDDR5 channels are 32bit, the Ryzen AI Max 256-bit memory bus should have 4 64bit channels ("Quad-channel"). AMD's specs do not list a channel number. Memory channels are not mentioned on this site's review nor in Apple's specifications (though some sites say the MacBookPro M4 Max 128GB has 8 channels). For accuracy maybe the channels should be omitted here as in the MacBookPro review. (Or make it "multi-channel" if you need to contrast with "single-channel".)Good point, it could be 8*32-bit, instead of 4*64-bit like in normal desktop systems or both. But it leads to the same following bandwidth speed calculation:
Quote
You can't use the AIDA memory benchmark result for this formula ..

Right, ty for making me look it up again: Strix Halo is a 256-bit chip, so its theoretical bandwidth is:

8000 (MT/s) * 64 (bits per channel) * 4 (channels, i.e. the 256-bit memory bus width) / 8 (bits to bytes) / 1000 (MB to GB)
= 256 GB/s

The practical benchmark value is often 70-80% of the theoretical value, so 256 GB/s * 0.75 = ~192 GB/s.

tokens per second = 192 GB/s / 39.6 GB (Llama-3.3-70B-Instruct-Q4_K_M.gguf)
                  = 4.85
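For reference, the same arithmetic written out as a short Python sketch. This is only a back-of-the-envelope estimate: the 0.75 efficiency factor and the 39.6 GB model file size are the assumptions from the post above, not measurements.

Code:
# Back-of-the-envelope LLM inference estimate for Strix Halo.
# Constants come from the discussion above; 0.75 is an assumed
# bandwidth efficiency, not a measured value.
MTS = 8000            # LPDDR5X transfer rate in MT/s
BUS_WIDTH_BITS = 256  # memory bus width (4*64-bit or 8*32-bit)
EFFICIENCY = 0.75     # practical bandwidth is often 70-80% of theoretical
MODEL_GB = 39.6       # Llama-3.3-70B-Instruct-Q4_K_M.gguf file size

theoretical_gbs = MTS * BUS_WIDTH_BITS / 8 / 1000  # 256.0 GB/s
practical_gbs = theoretical_gbs * EFFICIENCY       # ~192 GB/s

# A dense model streams all of its weights once per generated token,
# so bandwidth / model size gives a rough upper bound on tokens/s.
print(practical_gbs / MODEL_GB)  # ~4.85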
Quote from: tokens per second on April 15, 2025, 08:55:01
I agree. You can still roughly calculate the speed (works for "dense" models like the Llama-70B):
Code:
tokens per second = bandwidth / filesize
                  = 121177 MB/s / 39600 MB (Llama-3.3-70B-Instruct-Q4_K_M.gguf)
                  = 3.06

You can't use the AIDA memory benchmark result for this formula because it's just a classic CPU-related workload with its own intrinsic bottlenecks. I am sure the iGPU can achieve much more throughput, near the theoretical limit of the memory bus.
Quote from: Dont_Look_Up
No, he is playing "Celeste" at 1080p with full effects for 9.5 hours!
Check his video at 34 min (I can't post the link).

Ok, thank you for the info with the timestamp. I know the vid is long, so that's helpful and I appreciate it. I will check in a moment.
Quote from: Worgarthe on April 15, 2025, 11:29:22
Quote from: Dont_Look_Up on April 15, 2025, 11:25:34
He gets 9.5 hours of battery life while gaming!
Playing Minesweeper and Solitaire?
Quote from: Dont_Look_Up on April 15, 2025, 11:25:34
He gets 9.5 hours of battery life while gaming!

Playing Minesweeper and Solitaire?
Quote
3. The biggest bummer: as always, there is not a single test regarding AI model inference using llama.cpp or ollama. This notebook has been made especially for this purpose, so please add some benchmarks using at least a 70b Q4 model and do the same on a similarly performing MacBook Pro.

I agree. You can still roughly calculate the speed (works for "dense" models like the Llama-70B):
Code:
tokens per second = bandwidth / filesize
                  = 121177 MB/s / 39600 MB (Llama-3.3-70B-Instruct-Q4_K_M.gguf)
                  = 3.06
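The same estimate as a minimal Python helper. The 121177 MB/s figure is the AIDA64 memory read result cited above; whether that number reflects what the iGPU can actually pull is debated further up the thread.

Code:
# Rough dense-model speed estimate: every generated token has to
# stream the full weight file through memory once.
def tokens_per_second(bandwidth_mb_s: float, filesize_mb: float) -> float:
    return bandwidth_mb_s / filesize_mb

# AIDA64 read bandwidth vs. Llama-3.3-70B-Instruct-Q4_K_M.gguf
print(tokens_per_second(121177, 39600))  # ~3.06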