It would be nice if AWS could write something official about what they are doing.
I've been noticing major performance changes in our instances and have no idea if it is related to Meltdown or something else.
Google released a blog post specifically on performance: https://blog.google/topics/google-cloud/protecting-our-googl...
It would be nice to have similar transparency from AWS.
My team saw a 40% CPU usage increase on all of our EC2 instances and even our RDS instances. We were shocked since the media was downplaying the performance impact.
I tried to start a poll but it seems as though my team was just the unlucky one: item?id=16109036
One interesting aspect of these issues and mitigations is that the performance impact really depends on the workload. Just because Google saw little performance impact on their servers doesn't mean your application won't see one. And just because someone said their CPU usage went up 2x doesn't mean it will go up for you.
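To put rough numbers on the workload point, here is a back-of-envelope sketch. It assumes the KPTI cost shows up as a fixed extra cost per kernel entry (syscall, interrupt, page fault). The function name, the 0.5 µs figure, and the syscall rates are all made up for illustration, not measured on any real system.

```python
# Back-of-envelope sketch, not a measurement. All inputs are hypothetical.

def kpti_overhead_fraction(syscalls_per_sec: float, extra_us_per_syscall: float) -> float:
    """Fraction of one core's time spent on the added per-syscall cost."""
    return syscalls_per_sec * extra_us_per_syscall / 1_000_000

# Assumed extra cost per kernel entry, in microseconds (illustrative only).
ASSUMED_EXTRA_US = 0.5

# Hypothetical syscall rates per core for two very different workloads.
workloads = {
    "compute-bound batch job": 1_000,
    "I/O-heavy network service": 500_000,
}

for name, rate in workloads.items():
    frac = kpti_overhead_fraction(rate, ASSUMED_EXTRA_US)
    print(f"{name}: ~{frac:.2%} of a core")
```

With these made-up inputs the batch job loses about 0.05% of a core and the network service about 25%, which is the whole point: the same patch can be invisible for one workload and painful for another. The only way to know for your own application is to measure it.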
On an unrelated note, I kind of wish Meltdown had been discovered and disclosed separately from Spectre. Intel has managed to weasel its way out of taking responsibility by implying that this is not a bug and that all the other CPUs have similar issues. If they had to respond to Meltdown only, it would have been a bit harder for their PR and legal departments to deny the security and performance implications.
"Why I like to run my own hardware for $100, Alex"
You can patch various tiers of servers at your own leisure, depending on threat levels and exposure. Measure the impact, capacity plan, etc. Rather than it being forced on you across all tiers because cloud.
Pardon this likely naive question, but I haven’t seen it addressed yet in all the coverage: what’s the cost in electricity of patching this vulnerability? Does a company like Amazon, running a massive cloud infrastructure, see a non-negligible increase in its cost of doing business?
Does anyone have more info on the performance recovery today? We experienced similar performance issues over the last few days, with a seemingly complete recovery today (on a cluster of ~2500 HVM T-1s).
Trying to foresee the future...
Could we expect Intel to fix the design flaw^Wfeature so that future servers (and desktops) can run without KPTI while still not being affected by Meltdown? If so, what timeline could we expect? Say a year for new CPU designs, plus a year to roll out new machines in datacenters?