@ljs
> establishing higher order folios will have fault scalability/reclaim advantages also
By the time khugepaged comes around to collapse stuff, you're probably done faulting, so fault scalability is probably not much of a concern anymore, right? Unless we're talking about swapin performance. But yeah, reclaim I could see.
As for the follow-up, I think you could, at least in theory, benefit from multiple mTHP sizes: if we were unable to obtain a larger mTHP size, we could try for a smaller one. But you'd always want the largest possible.
I don't think you'd get a benefit out of it in terms of TLB, though - like, on modern AMD machines, I believe that if you can't get a 16K page, the TLB will have to use 4K entries anyway, so from the TLB perspective there's no sense in having 8K pages?
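
To make the "try the largest size, fall back to smaller" idea concrete, here's a rough userspace sketch. It is not kernel code and not a claim about what khugepaged does; `pick_largest_order`, `try_alloc_fn` and `alloc_up_to` are hypothetical names made up for illustration, and the bitmask layout (bit N set means order N is allowed) is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback: returns true if a folio of this order could be obtained. */
typedef bool (*try_alloc_fn)(unsigned int order, void *ctx);

/*
 * Walk the enabled orders from largest to smallest and return the first
 * order for which try_alloc() succeeds, or -1 if none does.
 * Assumption: bit N set in enabled_orders means "order N is allowed".
 */
static int pick_largest_order(unsigned long enabled_orders,
                              unsigned int max_order,
                              try_alloc_fn try_alloc, void *ctx)
{
    assert(try_alloc != NULL);
    assert(max_order < sizeof(enabled_orders) * 8);

    for (int order = (int)max_order; order >= 0; order--) {
        if (!(enabled_orders & (1UL << order)))
            continue;               /* this size is not enabled, skip it */
        if (try_alloc((unsigned int)order, ctx))
            return order;           /* largest size that worked wins */
    }
    return -1;                      /* nothing could be obtained */
}

/* Toy stand-in for an allocator: succeeds only up to the order stored in ctx. */
static bool alloc_up_to(unsigned int order, void *ctx)
{
    unsigned int limit = *(const unsigned int *)ctx;
    return order <= limit;
}
```

With orders 9, 4 and 2 enabled and a toy allocator that can only manage order 2, this would settle on order 2; if order 9 is obtainable, it never looks at the smaller ones.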