H-1B workers are absolutely exploited. I'd personally testify to the abuses at financial companies.
I've heard threats like "If you don't complete this by tomorrow, you can expect to be back in India by next week." From one Indian to another, too...
It's awful to witness it. Please don't spread misinformation that "They are not taking abuse or being exploited"
That is extremely disingenuous.
Sorry, that's untrue. Top companies such as Google, Meta, or Microsoft absolutely do not abuse H-1B workers or treat them any differently. As I said, there may be smaller outfits or IT shops where this is happening.
You have to look harder. It's not always as explicit as OP describes, with threats of deportation. Rather, there's a huge power imbalance.
Who can we ask to stay late? Who "doesn't mind" 12-hour days? Who "doesn't mind" being on call? Who won't mind if they get a smaller bonus or raise? How about Sandeep, who is afraid to say no because if he says no too many times and loses his job, he and his entire family have to move back overseas with minimal notice?
That's how real exploitation happens these days. And sometimes even good managers don't realize they're doing it, because, after all, poor Sandeep even said he didn't mind! He's just a really hard worker!
You are right: unless you have been present at every manager-employee interaction, you can't say it has "never" happened. But claiming that this is happening requires more than just one or two instances, right?
I used to think the same thing until I was brought in on an important legacy project that put me tangential to the "inner circle", where I heard a lot of this kind of thing along with other shady practices, like authoring the RFP for the government to publish so that the requirements favored the company.
So much of this.
The number of times I've seen someone complain about slow DB performance when they're trying to connect to it from a different VPC, bottlenecking themselves to 100 Mbit/s, is stupidly high.
It literally depends on where things are in a data center. If you're closely coupled, on a 10G line on the same switch going to the same server rack, I bet performance will be so much more consistent.
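To put rough numbers on it (the 100 Mbit/s and 10 Gbit/s figures are just the ones mentioned above, and the 1 GiB payload is an assumption for illustration, not a measurement):

```python
# Back-of-envelope transfer time for a bulk query result at the two
# link speeds discussed above (ignores protocol overhead entirely).
def transfer_seconds(payload_bytes: int, link_bits_per_sec: float) -> float:
    return payload_bytes * 8 / link_bits_per_sec

payload = 1 * 1024**3  # hypothetical 1 GiB result set

for label, bps in [("100 Mbit/s cross-VPC bottleneck", 100e6),
                   ("10 Gbit/s same-switch link", 10e9)]:
    print(f"{label}: {transfer_seconds(payload, bps):.1f} s")
# roughly 86 s vs 0.9 s for the same payload
```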
Datacenter networks run at up to 400 Gbps and beyond (many places are adopting 1+ Tbps on core switching).
However, individual servers may still operate at 10, 25, or 40 Gbps to save cost on the thousands of NICs in a row of racks. Alternatively, servers with multiple 100G connections split that bandwidth allocation up among dozens of VMs so each one gets 1 or 10G.
Yes, but you have to think about contention. While the top-of-rack switch might have 2x400 gig links to the core, that's shared with the entire rack and all the other machines trying to shout at the core switching infra.
Then stuff goes away, or routes get congested, etc., etc., etc.
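As a toy illustration of that contention (the per-rack figures here are assumptions, not anyone's real fabric):

```python
# Oversubscription for a single assumed rack: 40 servers with 25G NICs
# behind the 2 x 400G uplinks mentioned above.
servers_per_rack = 40
server_nic_gbps = 25
uplinks = 2
uplink_gbps = 400

offered = servers_per_rack * server_nic_gbps   # 1000 Gbit/s possible demand
available = uplinks * uplink_gbps              # 800 Gbit/s to the core
print(f"oversubscription ratio: {offered / available:.2f}:1")
# 1.25:1 here; real fabrics are often far more oversubscribed, so a VM
# only sees its "full" bandwidth when its neighbours are quiet.
```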
Bandwidth-delay product does not help serialized transactions. If you're reaching out to disk for results, or if you have locking transactions on a table, the achievable operation rate drops dramatically as latency between the host and the disk increases.
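A minimal sketch of why, assuming strictly dependent I/Os (the latency figures below are illustrative ballparks, not measurements of any particular provider):

```python
# For strictly serialized (dependent) I/O you cannot issue request N+1
# until request N returns, so throughput is bounded by round-trip
# latency, not bandwidth.
def max_serial_ops_per_sec(round_trip_latency_s: float) -> float:
    return 1.0 / round_trip_latency_s

for label, rtt in [("local NVMe (~0.1 ms, assumed)", 0.0001),
                   ("network volume (~1 ms, assumed)", 0.001),
                   ("congested / cross-AZ path (~5 ms, assumed)", 0.005)]:
    print(f"{label}: {max_serial_ops_per_sec(rtt):,.0f} dependent ops/s max")
# 10,000 -> 1,000 -> 200 ops/s: a 50x drop with no change in bandwidth.
```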
The typical way to trade bandwidth away for latency would, I guess, be speculative requests. In the CPU world at least. I wonder if any cloud providers have some sort of framework built around speculative disk reads (or maybe it is a totally crazy trade to make in this context)?
You'd need the whole stack to understand your data format in order to make speculative requests useful. It wouldn't surprise me if cloud providers indeed do speculative reads, but there isn't much they can do to understand your data format, so chances are they're just reading a few extra blocks beyond where your OS read and hoping that the next OS-initiated read falls there so it can be serviced from the prefetched data. Because of full-disk encryption, the storage stack may not be privy to the actual data, so it couldn't make smarter, data-aware decisions even if it wanted to. That limits it to primitive readahead, or maybe statistics based on previously-seen patterns (if it sees that a request for block X is often followed by block Y, it may choose to prefetch Y the next time it sees block X accessed).
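That last "block X is often followed by block Y" idea is easy to sketch. This is purely an illustration of the heuristic, not any provider's actual implementation:

```python
from collections import Counter, defaultdict

class NaivePrefetcher:
    """Remember which block tends to follow which, and suggest prefetching
    the most common successor once it has been seen often enough."""

    def __init__(self, min_count: int = 3):
        self.successors = defaultdict(Counter)  # block -> Counter of next blocks
        self.last_block = None
        self.min_count = min_count

    def on_read(self, block: int) -> int | None:
        """Record this access; return a block worth prefetching, if any."""
        if self.last_block is not None:
            self.successors[self.last_block][block] += 1
        self.last_block = block
        if self.successors[block]:
            candidate, count = self.successors[block].most_common(1)[0]
            if count >= self.min_count:
                return candidate  # caller would issue an async read for this
        return None
```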
A problem in applications such as databases is when the outcome of an IO operation is required to initiate the next one - for example, you must first read an index to know the on-disk location of the actual row data. This is where the higher latency absolutely tanks performance.
A solution could be to make the storage drives smarter: have an NVMe command that could say "search in this range for this byte pattern" and another that says "use the outcome of the previous command as the start address and read N bytes from there". This could help speed up the aforementioned scenario (effectively your drive would do the index scan and row retrieval for you), but it would require cooperation between the application, the filesystem, and the encryption system (typical current FDE would break this).
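For contrast, here is what the dependent lookup looks like today from the host's side. The index-entry layout (8-byte offset, 4-byte length) is purely made up for the example:

```python
import os

def read_row_via_index(fd: int, index_offset: int, index_len: int) -> bytes:
    # Two dependent reads: the device latency is paid twice, in sequence.
    index_entry = os.pread(fd, index_len, index_offset)     # 1st round trip
    row_offset = int.from_bytes(index_entry[:8], "little")  # decode on the host
    row_len = int.from_bytes(index_entry[8:12], "little")
    return os.pread(fd, row_len, row_offset)                # 2nd round trip

# With the hypothetical "search + dependent read" commands described above,
# both steps would run on the device and the host would pay the
# host<->device latency only once. No such standard NVMe command exists today.
```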
That said, the problem can get more complex than this really fast: write barriers, for example, and dirty caches. Any application that forces writes, where the writes are actually enforced by the kernel, is going to suffer.
The same is true for SSD settings. There are a number of tweakable values on SSDs when it comes to write commit and cache usage which can affect performance. Desktop OSes tend to play fast and loose with these settings, while server defaults tend to be more conservative.
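A quick way to see the forced-write effect on your own stack is to compare fsync-per-write against one fsync at the end. The record count and size below are arbitrary; the absolute numbers depend entirely on the storage stack, so treat this as an illustration rather than a benchmark:

```python
import os
import tempfile
import time

def write_records(fsync_each: bool, records: int = 200, size: int = 4096) -> float:
    """Write `records` blocks and return elapsed seconds."""
    payload = b"x" * size
    fd, path = tempfile.mkstemp()
    start = time.perf_counter()
    try:
        for _ in range(records):
            os.write(fd, payload)
            if fsync_each:
                os.fsync(fd)   # force the barrier on every record
        if not fsync_each:
            os.fsync(fd)       # one barrier for the whole batch
    finally:
        os.close(fd)
        os.unlink(path)
    return time.perf_counter() - start

print("fsync each write: ", write_records(True))
print("fsync once at end:", write_records(False))
```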
Which can cause considerable "amusement" depending on the provider - one I won't name directly, but which is much more centered on actually renting racks than on their (now) cloud offering - if you had a virtual machine older than a year or so, deleting and restoring it would get you onto a newer "host" and you'd be faster for the same cost.
Otherwise it'd stay on the same physical piece of hardware it was allocated to when new.
"Hardware degradation detected, please turn it off and back on again"
I've been able to do zero-downtime migrations in VMware for a decade, but they can't seamlessly move my VM to a machine that works in 2024? Great, thanks. Amusing.
It's better (and better still with other providers) but I naively thought that "add more RAM" or "add more disk" was something they would be able to do with a reboot at most.
Resizing VMs doesn't really fit the "cattle" thinking of public cloud, although IMO that was kind of a premature optimization. This would be a perfect use case for live migration.
Cloud makes provisioning more servers quicker because you are paying someone to basically have a bunch of servers ready to go right away with an API call instead of a phone call, maintained by a team that isn’t yours, with economies of scale working for the provider.
Cloud does not do anything else.
None of these latency/speed problems are cloud-specific. If you have on-premise servers and you are storing your data on network-attached storage, you have the exact same problems (and also the same advantages).
Unfortunately the gap between local and network storage is wide. You win some, you lose some.
Oh, I'm not a complete neophyte (in what seems like a different life now, I actually worked for a big hosting provider); I was just surprised by the big penalty for cross-VPC traffic implied by the parent poster.
It's more a matter of adding additional abstraction layers. For example, in most public clouds the best you can hope for is to place two things in the same availability zone to get the best performance. But when I worked at Google, internally they had more sophisticated colocation constraints than that: for example, you could require two things to be on the same rack.
Your comment implies that a network hop between two VPCs is inherently slow. My understanding is that VPCs are akin to a network encryption/isolation boundary, but they should not meaningfully slow down network transfers between each other.
I think you may be conflating that with the fact that across two VPCs you may be slightly more likely to be doing a cross-availability-zone or potentially even cross-region network hop? I just think it's important to be on the pulse of what's really going on here.
AWS is pretty transparent about exactly what sort of cores you are getting, and has different types available for different use cases; a typical example would be something like r7iz, which is aimed at peak single-threaded perf: https://aws.amazon.com/ec2/instance-types/r7iz/
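You can also pull those details programmatically. A rough sketch with boto3 (instance type, region, and the specific fields printed are my choices for illustration; requires AWS credentials):

```python
import boto3

# Look up published CPU details for an instance type, e.g. the r7iz
# family mentioned above.
ec2 = boto3.client("ec2", region_name="us-east-1")
resp = ec2.describe_instance_types(InstanceTypes=["r7iz.2xlarge"])
for it in resp["InstanceTypes"]:
    print(it["InstanceType"],
          it["VCpuInfo"]["DefaultVCpus"], "vCPUs,",
          it["ProcessorInfo"].get("SustainedClockSpeedInGhz"), "GHz sustained")
```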