erica
@erica

For months now, I've been having an issue with my internet that I'm incapable of solving. The most basic way I can explain it is: If I am downloading at high speeds (70MB/s and higher) or uploading and downloading at the same time (Streaming to Twitch and watching YouTube at the same time, for example) my internet drops out. Web pages will still load, I am not kicked off the internet entirely, but whatever connection I had to that server is timed out and will take a couple seconds to re-establish.

I'll put a LOT of details and troubleshooting I've done below the fold. I am truly at my wit's end. If you can help, I'd appreciate it.


Some primary details.
Here is my dxdiag. My router is an ASUS RT-AX82U and our internet is gig fibre from Telus here in Vancouver, BC. Speedtest result, Fast.com result.

Here is a HWMonitor snapshot.

Quick Info (For summary reasons, please read until the end, the problem is likely not listed here.)

  • This issue has been going on for approximately six-to-seven months. It's very possible the issue was present long before this but time is really hard to figure out within the past couple years. It's been an issue forever but I can't pinpoint a precise time frame. Some computer upgrades were done in that time, some hardware changes, etc. A lot has happened both related to my computer and not.
  • It happens at all times during the day, every day.
  • It does not affect my wife's computer who is on the same network. Our computers used to be relatively identical in hardware, yet only I had the issues. I've since upgraded my computer and the issues persist. She continues to have no problem.
  • The internet does not "cut out" the way it does with severe dropout issues. I can run services like Speedtest/Fast.com without issue, also. These test services show no issue on their end.
  • This happens when I bypass every step of our network and plug from my PC directly into the wall.

I will update additional information here if I come across it.

  • I do not run a VPN on this computer.
  • It's mentioned below but for clarity's sake I have changed the ethernet cable. Both were Cat 6.
  • My wife's computer is on wifi but we hard-wired it to test and in both cases she had no issues.
  • The issues happened on wifi and wired for my old computer. My new one no longer has wifi capabilities.
  • It's not DNS. DNS has been changed to 1.1.1.1/1.0.0.1 and the DNS cache was flushed. (And Steam's cache was cleared for good measure.) The issue remains.
  • It's not ipv6, I disabled it and the issue remains.
  • Once the problem is triggered in Steam, it cannot redownload until ??? magic happens. I don't know how else to explain it. Repeatedly clicking the retry button in the Downloads screen will not even start the download, it will just constantly tell me "no internet connection". Like it's just blocking the download before it can try.
  • Some webpages take a suspiciously long time to load. As in, a few seconds instead of instantly. The way it manifests looks like when I try to load a webpage without an internet connection. (On Firefox, it's just a blank screen with "Connecting to..." in the bottom left)
  • It's not Windows' NCSI issue or WPAD/PADS. Both were checked.
  • It does not seem to be ipv4/ipv6 related, both inherently and more granularly to things like checksum offloads. (A few of these options were tested with no results.)
  • I can't remember if I've said this below but for posterity's sake: synching in Dropbox (up or down) gives me no noticeable issues. Unfortunately Dropbox provides me with no real detailed readout of network traffic so it looks like things are smooth sailing. Unlike Steam, however, the traffic never outright stops/fails.
  • Logging into a new user account on Windows provided no help; the issue happened there as well.
  • A Wireshark test shows connection loss somewhere between the computer and the router. It's hard to pinpoint the exact cause because of how the problem is manifesting but currently the likely culprit is the NIC.

The Internet

The Problem

As stated, I have gigabit fibre from Telus. It was installed into our house because the current/previous owners of the property are on a different ISP (Shaw). This is what I thought would be the first problem since internet in Canada notoriously sucks.

Troubleshooting

  • Every point of issue was addressed.
  • They changed our modem, no change.
  • They changed the actual terminal that connects us to the fibre drop, no change.
  • They flushed our IP and did a full reset of our system, connection, and all that on their backend. No change.
  • As far as I can tell, this is not an ISP issue. Everything they can check from their end shows absolutely zero problems or flags from their end all the way the plug in our house.
  • This led us to the conclusion that, since my wife was not having issues, my network card was the problem, however...

The Hardware

The Problem

My dxdiag is provided above but most issues I face, I face with software instead of in my PC experience itself. This morning, I could not log into Dropbox using email because the authentication would immediately drop. Logging in through Google worked, I'm not sure why.

My computer is also, now, mostly new. These issues started on my old computer and, as mentioned above, I thought the network card was the culprit. Since it was an onboard network interface, I upgraded a few things alongside the motherboard. The only parts that I kept are the CPU, the RAM, the 500GB NVMe main drive, a 1TB SSD secondary drive, and my CPU cooler (a Corsair H600 something or other. it's an RGB liquid cooling one). Everything else, I upgraded.

The old motherboard was an Asus Prime X570-P. Both my old and new motherboard have a Realtek network card.

Troubleshooting

  • Motherboard upgraded. I went from an Asus one to an MSI board, the problem persists.
  • The ethernet cord I am using is the same as my old PC but that was also changed from an older one that had the same issue.
  • The keyboard I'm using is a Cooler Master SK622 and last night when testing things, unplugging it mysteriously solved the problem for hours. I switched to an old keyboard and thought by some insane reason the kb was the issue. However, the problem came back. I don't think the keyboard is the problem but there was something that gave me hours of relief when doing that and maybe the answer is in there.
  • I am on a fresh Windows install and a lot of my old drivers are not reinstalled yet.
  • The ones that are, that are out of the ordinary at least, are for my audio interface which is an Arturia Minifuse 1. This is one of the devices I switched out months ago but it was purchased after we changed routers.
  • We changed routers. As mentioned above, we have an ASUS RT-AX82U but the issue happened before we got this one and the issue also happens when networked directly into the modem.
  • Every drive and every device was plugged or unplugged to see if it was an issue. The problem persists.
  • About 7 months ago, I changed my CPU cooler. I switched it out for the same model but I had to because the little fan inside the liquid cooling heat sink was rattling. Since switching it, that rattling has stopped but the fans have definitely been louder. Like they're working to cool a lot more than they have to. It would make very little sense to me that the CPU is overheating and causing it to drop network connections considering this would be so much more obvious if this was the case. I wouldn't be able to play games (I can), I'd be kicked off Discord calls all the time (I'm not). I doubt it's this, but I should point out this part changed anyway.

Steam

The Problem

When downloading from Steam, it will start downloading at full throttle and last a few seconds before cutting out. The download sharply stops and provides no information other than "Content Servers Unreachable" or, more often, "No Internet Connection".

If the connection is feeling especially unstable, it won't even synch Steam Cloud save files.

Here is a snippet of a log from Steam showing whatever issues arise when downloading.

Troubleshooting

  • Clearing the download cache gives me a momentary fix, though by momentary I mean anywhere between one second and one minute. It's the only thing I can do that provides a certifiable lapse in issue, restarting Steam does little to help.
  • I've tried downloading to other drives and this does nothing. Downloading to a brand-new NVMe SSD has the same problem as my old USB-C SSD drives.
  • Reinstalling Steam did not fix the issue.
  • Reinstalling Windows did not fix the issue.
  • Since this happens with other software, I figure Steam is not the issue anyway.

OBS / Twitch / Discord

The problem

I don't have any screenshots to provide for this but any regular viewer of my streams will know what happens. My stream will just cut out if I try to do "too much internet stuff" at once. I stream at an average of about 3500kbps upstream. Our internet is rated for 1Gbps up & down so, again, not sure I'm really testing the limits of my network here. However, if I'm streaming, opening something like a YouTube video (often just loading YouTube itself) will crash the connection to Twitch and YouTube. The stream will cut out and after a few seconds, it will resume. If I stop the stream manually in Twitch, I won't be able to re-start the stream until I give it a while because it gives me an error saying it cannot connect to Twitch. Streaming music from Apple Music will very rarely make this happen but it's almost always streaming from YouTube. Trying to stream the Capcom Showcase had about 4-5 disconnects at varying intervals.

Here is a 'stats for nerds' readout of YouTube if you need it.

The same thing happens in Discord. Streaming a videogame at 720p30 is fine! I've never run into an issue. The moment I try to stream something I am watching at the same time, however, the problems start. Streaming a show to some friends a few weekends ago produced the same result: I was able to stream it for a good hour or two until the issues settled in and it would disconnect at completely random intervals. I would not get fully kicked off the internet and my wife, who was also in the call, did not disconnect at all. On my end, the stream I was watching would start buffering and Discord would be disconnected but it's probably important to point out that I was never dropped from the call at any point. Just a "ah, you cut out for a bit".

Here is an OBS log of when I was able to replicate this issue.

Troubleshooting

  • There's very little troubleshooting I could do here. I looked at some things in my router settings at the time that this started and I couldn't really find anything, and I'm not knowledgeable enough about networking to touch anything without fearing breaking something. Like I said, though, this happened even when I wasn't connected to our router.

Other Bullshit

The Problem

Trying to solve all this stuff has me at a loss. There's so much other stuff I tried that provided no solution.

Troubleshooting

  • As I mentioned earlier, I'm on a fresh Windows install. Very little carried from the previous install, my C:/ drive was wiped clean and as far as I know Windows only carries your theme/personalization settings.
  • I took a glance at Windows Firewall and didn't notice anything out of the ordinary. Temporarily disabling it to let a download through did nothing.
  • Streaming stuff works fine. I can watch Youtube, Twitch, Netflix, etc etc with no issues and the only time I do encounter issues, it's generally a service or actual internet hiccup.
  • Downloading things works fine. I can download files off browsers without issue and I frequently snatch files from a private seedbox through FTP without issue. The browser files go at whatever general internet speeds are, my FTP files transfer much faster but again: without issue.

This is all I can think about to summarize and explain this issue.

I've tried everything. Something about this setup is the issue, it's absolutely absurd the problem is happening on two computers which share just a handful of parts and more importantly do not share the one managing networking. I can't think of what the problem could be. I've tried what I can with the time and energy I have. I have work to do and the shock this is causing to my mental is too much. I spent too much money hoping this would have been solved.

If you have any information that could help, if you have any troubleshooting steps I should take that I have not outlined above, please for the love of god tell me.

I need this internet for work. I paid for the fastest speeds because I need them to do what I do without worrying about issues and I'm now facing the worst of it. If this doesn't get fixed, I'm considering opening a bounty on this because I just don't know what else to do.

I will give out an Ascari plush to whoever finds the solution. It's the least I can do right now. Please solve my problem and free her from my hands.


You must log in to comment.

in reply to @erica's post:

It might help or it might not but your post didn't mention it so just in case: have you tried using another set of DNS servers like Google's 8.8.8.8/8.8.4.4 or Cloudflare's 1.1.1.1/1.0.0.1? I know your ISP said everything is fine but like, maybe their "this is fine" is routing you towards nodes of popular web services that end up congested somehow, thus making you prone to issues like this? Again, it might not help at all, but it's fairly easy to do, so I reckon it's worth a shot.

As for Steam specifically it seems ppl are saying that maybe changing the download location helps but that obviously wouldn't solve for the other issues 😵‍💫

I've tried changing download locations but yeah, that solved nothing. It also affects more than Steam so that's out of the question.

I've not tried changing DNS servers because I don't know what that is or how to do that, unfortunately. I assume if I changed it to Google's I wouldn't be able to just, use that

Knowing how to change your DNS server is a nice skill to have.
A couple of years ago, Turkey had a bad internet censoring phase and at least once people wrote 8.8.8.8 on a wall, because most government censoring involves asking ISPs to mess with their DNS results.

It's also how Pi-hole works. You tell your computer and phone that the DNS server is a Raspberry Pi running a modified DNS server that blocks all ad servers, so there is no overhead when loading the web; ad requests simply fail.

The first thing I’d try to test this is miserable: installing Linux on an internal or external SSD and setting up Steam to download a game (or finding a simpler test case like downloading a random large file off a suitably fast service and doing that in Linux), just to see if there is something peculiar to Windows causing this to happen. Hopefully someone else has an easier path to resolving the issue.

I was going to suggest this to ~completely rule out issues on the software side for your computer. You don't even need to install Linux; just boot into a live environment, and (idk) download a bunch of Linux ISOs via bit torrent while you stream via OBS?

oh, this sounds like maybe a problem with bufferbloat on your router? it's where it doesn't prioritize packets coming into the network buffer, so under high load when the buffer fills, it just starts dropping new packets as they arrive. Then the only new packets that can enter the buffer are when it has room if the packet arrives before others fill it up again
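To make the "buffer fills, new packets get dropped" part concrete, here's a tiny toy simulation. It's only a sketch: the function name, the tick-based timing and the numbers are all made up for illustration, and a real router queue is far more complicated than a fixed-size FIFO.

```python
from collections import deque


def simulate_tail_drop(arrivals_per_tick, drain_per_tick, capacity, ticks):
    """Toy model of a fixed-size FIFO buffer that drops packets when full.

    Every tick, `arrivals_per_tick` packets show up and at most
    `drain_per_tick` packets leave. All parameters are illustrative.
    """
    queue = deque()
    delivered = 0
    dropped = 0
    for tick in range(ticks):
        # New packets only get in if there is still room in the buffer.
        for _ in range(arrivals_per_tick):
            if len(queue) < capacity:
                queue.append(tick)
            else:
                dropped += 1
        # The link drains a limited number of packets per tick.
        for _ in range(min(drain_per_tick, len(queue))):
            queue.popleft()
            delivered += 1
    return {"delivered": delivered, "dropped": dropped, "queued": len(queue)}
```

With arrivals equal to the drain rate nothing is lost; once arrivals outpace the drain rate the buffer fills within a few ticks and every tick after that loses packets, no matter which connection they belong to.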

there's different queue prioritization algorithms and QoS stuff you can do that might be implemented for your router. gimme a minute to read this more thoroughly, but let me know if any of that sounds reasonable

it looks like your router lets you do some manual traffic shaping with QoS tagging that could help somewhat, but it looks like its based on port & protocol, and since most of your traffic is gonna be wrapped in HTTPS, that might not help massively

would you be open to flashing different software on your router like openwrt? edit: ah, it doesn't look like openwrt supports your model of asus router, but there's asus merlin, an open-source firmware that also does QoS

imo, try dinking with whatever QoS features it has built-in first if you feel squeamish about flashing your router, then try merlin firmware if that fails

I've also experienced odd dropouts for some of my machines when I enabled QoS on an old router. In particular (iirc) it happened to the machine I was trying to guarantee prioritization for.

Yeah, it's one of those problems that are like, really? My current router also has had some problems - I set up its DHCP server to give out an IP for a pihole for DNS that I run locally, and it just gave out the ISP's DNS while the dashboard indicated it was giving out the pihole's DNS.

For this sort of problem, I'd definitely simplify the network topology as much as possible to try to troubleshoot, especially since it sounds fairly easily reproducible (ie, "get a bunch of traffic going").

hi! im a network engineer.

So the problem here is interesting in that it is clearly not a network layer problem (you still get other connections working while some cut out), and it only happens with some services. Since you can stream netflix/youtube etc, it doesn't seem to be triggered by high bandwidth utilization, either. the youtube stats don't seem to show anything, and the OBS logs only say "connection timed out, closing" essentially. the steam logs say that it couldn't find the chunk it was looking for on multiple cache servers, stopped trying, started again, and then it started working, which is kinda weird.

so it doesn't seem to be a layer 3 problem, it's not DNS, it's probably not buffers. my first thoughts here are a) your router or windows have trouble dealing with too many open connections at once (tcp port exhaustion), though that would be surprising, b) your ISP is doing some weird CGNAT thing and things break because they don't expect CGNAT OR your router doesn't handle it properly OR you're going over IPv6 to some services and not others and either the ones over v6 or the ones over v4 break because of some configuration in your computer / router / NIC.
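If you want a rough way to check the "too many open connections" theory in (a), one option is to count TCP connections by state while the problem is happening. The sketch below assumes you've pasted in the text output of `netstat -ano -p tcp` from a Windows command prompt; the function name is made up, and the parsing just relies on the usual column order (protocol, local address, remote address, state, PID).

```python
from collections import Counter


def count_tcp_states(netstat_output: str) -> Counter:
    """Count TCP connections per state from `netstat -ano -p tcp` text.

    Assumes rows shaped like: TCP  <local>  <remote>  <STATE>  <PID>.
    Header lines and non-TCP rows are skipped.
    """
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[0].upper() == "TCP":
            states[fields[3]] += 1
    return states
```

A few hundred connections would be unremarkable. Thousands sitting in TIME_WAIT or ESTABLISHED right when downloads die would make port exhaustion worth a closer look, since as far as I know current Windows only hands out around 16,000 dynamic ports by default.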

Suggestions would be:

  • try seeing if these issues still happen with a VPN? or even get more generalized with a VPN
  • try exchanging your NIC with your wife's computer and see if it's related to the NIC or to the rest of the machine / OS
  • try disabling IPv6 and seeing if that makes things better or worse
  • does your wife have those issues if she uses your computer? or could it be a usage pattern that you have that triggers the problem?

that's kinda weird tho good one and good luck

It wasn't ipv6 unfortunately. I don't run a VPN and I can't easily exchange my NIC with hers since they're both onboard.

The problem on my end is replicable to a point of frustration. Any Steam download triggers it. Sometimes I get lucky and the download just goes. But rarely.

Not to this extreme degree, but I had a similar problem with a PC wired over an ethernet connection in college. I was having trouble with an issue similar to what you've listed here, and a bunch of networking features on my consoles would not work--even when it would tell me that I had an open NAT, it would behave like my NAT was closed, etc.

The solution was disabling IPv6 in control panel/on my devices--something, somewhere along the network (either in the college's network or comcast or whatever, not sure, i'm very dumb at networking) meant that I needed to stay on IPv4.

edit--never mind, I'm wrong about the specifics here--it wasn't IPv6->4 I don't think, since this would've been ~2013 or 2014? But it definitely involved unchecking something in my hardware and network settings, I'll see if I can find the exact thing. Disabling IPv6 might just be the modern equivalent though

I have an absurdly-similar problem, currently using HeroNet cable though and not any fibre. The failure point speed-wise is extremely similar and the services of failure are the same.

Assessing the issue on my end, it appears to happen regardless-of-anything if wifi is enabled in my router (MikroTik hAP mini), and will happen essentially on a timer if wifi is off. This always has only affected devices operating over ethernet, and particularly-exclusively my production desktop, particularly-much-more only since changing my motherboard to a newer one which has a particularly more capable onboard ethernet NIC than previous, but only in terms of raw capability and not features.

My own, extremely similar issue, notably acted nearly the same when switching to a PCIe ethernet NIC with significantly more features, with exceptions related to the bandaid solution I found.

I can lengthen this aforementioned timer by a significant amount by reducing the target operating clock of my router as well, and keeping my home cold, which has indicated to me in my case that this issue is related to thermal or power-consumption throttling of my router. Looking through this thread, I've seen bufferbloat mentioned by two other individuals, one of whom dismissed it, and that would be a fitting thing for a throttling router to cause.

I would check thermals of your router somehow. If it doesn't have instrumentation to retrieve the temperature from within the LAN, try putting a powerful fan over it and see if it stops happening, or starts to take longer. If this is thermals-related, turning off Wi-Fi on the router may severely help as Wi-Fi radios like to heat up devices a large amount, or reducing the target operating clock.

Typically the switch chip inside of a router will heat up severely by the actual rate of all traffic, meaning fast-tracked traffic would heat up the switch more if this is the point of failure and it may remove this issue to turn off the fasttrack in your router, as much as this might reduce network performance.

I'm sorry this is happening to you, it sounds horrendous to troubleshoot and you've already put in so much effort and dollars.

One thing that would help diagnose the issues is a packet capture using a tool like Wireshark, which records and decodes every network packet going through your computer while the capture is running. Unfortunately these captures aren't very intuitive to read, and since it's a copy of network traffic it's probably got information in it which either reveals your location/IP (candy for bad actors) or outright contains passwords or other information. So if you're able to find someone who you trust who can help read Wireshark packet captures (or you spend some time reading about it yourself and then post some obfuscated capture contents), that'd be the next troubleshooting step.

For example, a packet capture would help directly identify your connection to Twitch or Steam, and tell you the precise error that was given when the connection was closed. It would be interesting to find out if these connection protocols have something in common, or if the errors themselves are the same error.

You mention your modem changed and your motherboard was changed - presumably meaning you have a new network card - and the issue is still persisting. Did your router change? Have you tried isolating devices on your home network to see if it happens when only your device is on the network? (And then adding back bit by bit to find a trigger/threshold?) At this point you've eliminated a lot of what's changed, so you have to start hunting down what hasn't.

Stuff like this can get really tough to track down. When in doubt, shut everything down - all services, all devices, everything, and then start re-enabling one by one. You can do this in Safe Mode on your computer. Bring services online one at a time and see what breaks.

It’s mentioned above in a different section that the router has changed. Also that it happens with the PC wired directly to the modem.

So FTTP terminal changed, modem changed, router changed, Ethernet cable changed, network interface changed (seemingly twice? But at least once), motherboard changed, Windows reinstalled, new CPU cooler… this is a sneaky little gremlin.

Your old motherboard (Asus PRIME X570-P) and your new motherboard (B550-A-PRO) seem to share the exact same network chipset: Realtek® 8111H (listed as RTL8111H on the Asus), which means you're more than likely using the same driver between your entire OS reinstalls and Motherboard replacements.

There was a streamer I knew that had issues with his MSI motherboard and onboard NIC controller, but he had one with the "Gamer Network Card" (Killer™ E2500 Gigabit LAN controller) which has common packet management issues (I think it's related to QoS, but I never looked too much into it): https://www.reddit.com/r/KillerNetworking/comments/oqfyo5/killer_e2500_ethernet_issues/.

The thing is, the problems you are listing are very similar - connection drops but only sometimes, only partially, wouldn't drop when he was streaming a specific game, but would drop if he streamed literally any other content, etc.

I read your entire troubleshooting section, but you never did say you tried a different network device - a USB Ethernet adapter or a PCI Express network card. Those can be had for quite cheaply (between 8-11 dollars) and would eliminate that extra variable.

I couldn't find any similar reports on your chipset, but it seems very curious to me that both motherboards have the exact same network issues and they have the same network controller (unless I'm mistaken on your current MSI motherboard, but your dxdiag only tells me ms-7c56, which shows up as a B550-A-PRO to me), so that's where I would start.

I was actually going to suggest something similar. Sometimes ethernet chipsets are just garbage and it's not noticed easily. Aura, does your board have WiFi built in? If so try using it with the ethernet unplugged to see if it still persists. If it acts more normal maybe try picking up a PCI-E Ethernet card from Intel or getting your hands on a fast USB3 Ethernet dongle if you can't do that. This definitely sounds like a networking issue more than a Windows or any other software issue (outside of drivers that is)

I think they said that their first motherboard had WiFi, but the new one doesn't. Either way, the old ethernet driver and the device would still be enabled when they were trying the WiFi connection, so disabling that device/uninstalling the driver entirely would also be something to think about in this scenario.

I would still make sure (in a scenario where you are using a different device to test it) to disable and/or uninstall the existing network chipset/driver. Who knows what it could be doing to the Windows network stack.

If I read correctly you have a new Windows install. I'm assuming that's from the new board so there shouldn't be any issues with old drivers, and that issue wouldn't really be causing what's going on anyway. Hopefully the new Ethernet card works! Fingers crossed!!!

Just to clarify, both motherboards have the same Network Controller Chipset (RTL8111H), which means that Windows would have installed the same drivers from the Windows repository. If you manually installed motherboard manufacturer provided drivers, they would also be the same (but more likely a different version of the same driver, as manufacturers rarely ever update 3rd party system drivers, unless something massively broken happens in one version, and more often than not, not even then).

Changing the motherboard doesn't change your network driver if the chipset on the motherboard is the exact same one :D

Hi Erica,

I would suggest a different network card such as Intel. Realtek network cards are known to have faulty drivers. I own an ASUS G14 laptop and the community has well documented that the Realtek network card in these laptops is not good at all, so I've swapped it out for an Intel AX210 network card.

I would say try a different network card brand and see if that works!

Tsunau

That is a persistent bugger of a problem. :/

I mean, we’re getting down to a handful of things that haven’t changed? CPU, RAM, storage…maybe power supply? And the electrical wiring in your residence. (It’s not unheard of, minor little circuit-wide problems causing intermittent issues. Has any of your testing caused you to be plugged into another electrical circuit? Is your electrical wiring notably old?)

Well, can’t hurt to test what’s left. Let’s say Memtest86+ for the RAM, and…maybe Prime95 for the CPU? Let each run as long as you can stand, Memtest is something you’re gonna install on a flash drive and boot from instead of Windows, but Prime95 is a Windows application that just happens to cause an absurd amount of CPU load. Problems should arise in minutes, Memtest does basic tests in like the first 30 minutes or so that should catch anything serious, but it has extended tests that will keep trying little corner cases for hours.

You’ll probably want HWMonitor running a lot of the time, that should tell you about your CPU temps and a bunch of other info that might be helpful. Core Temp is a tool more focused on specifically the CPU temps if the first doesn’t work or doesn’t sound appealing or whatever.

If the USB keyboard thing wasn’t a total fluke, you may have issues with your mobo’s USB hardware? But you’ve switched motherboards, and reinstalled Windows. I don’t understand how any USB-level problem would withstand those two things.

I asked around and the general agreement was with this. One even more direct thing is that the CPU has a special section dedicated to encryption features and it could be that Steam is stressing that particular section leading to this sort of thing.

Memtest86+ would be a first test. Given you have two similar machines, maybe swapping the CPU between them might show the issue move between the two of them which would narrow it down?

There's a chance that whatever is happening gets logged to the Windows Event Viewer. I'm suspicious of the "access denied" messages in the Steam log -- that could be a socket being forcefully closed by the OS, though for what reason I couldn't say.

back in the day i used to have a router that would fall over at random, and eventually I worked out that it was bittorrent — the router didn’t have that big of a connection tracking pool and it would saturate and then, in the glorious tradition of shit written badly in C, it would crash.

this kind of smells like that kind of shenanigan: your connections work but when you start saturating your own link it just stops dead and it’s making me wonder if some tracking pool is dropping longer-running connections off the back. Could be CGNAT, could be your network driver, could be something else.

i’d eliminate CGNAT as a confounding factor — see if your router has a public IP — and then boot linux on the thing which should help determine if it’s a hardware thing or a windows-plus-hardware thing.
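For the "see if your router has a public IP" step, here's a small sketch of the check. It assumes you copy the WAN address off the router's status page by hand; the function name is made up. The 100.64.0.0/10 block is the shared address space reserved for carrier-grade NAT, and the other three blocks are the usual private ranges.

```python
import ipaddress

# Shared address space reserved for carrier-grade NAT (RFC 6598).
CGNAT_SPACE = ipaddress.ip_network("100.64.0.0/10")
# The classic private IPv4 ranges (RFC 1918).
PRIVATE_SPACES = [
    ipaddress.ip_network(net)
    for net in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def classify_wan_address(address: str) -> str:
    """Roughly classify the WAN address shown on a router's status page.

    Simplified on purpose: anything that is not CGNAT or RFC 1918 space
    is reported as "public", which ignores other reserved ranges.
    """
    ip = ipaddress.ip_address(address)
    if ip.version != 4:
        return "not-ipv4"
    if ip in CGNAT_SPACE:
        return "cgnat"
    if any(ip in net for net in PRIVATE_SPACES):
        return "private"
    return "public"
```

A "cgnat" or "private" answer means there's another layer of NAT upstream of the router. Even with a "public" answer, it's worth comparing the router's WAN address against what a "what is my IP" site reports, since a mismatch points the same way.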

also: the ipv6 thing can go both ways. it could be that your ISP has ipv4 issues that your wife never runs into because she’s on ipv6. try turning it back on again and see if the issue becomes less frequent. this is what i’d suspect next if you have CGNAT. your ISP might even be handing out a NAT64 dns server, putting everyone except you on a mostly-ipv6 internet connection even to legacy sites.

you could also test the long-running connection thing with something low bandwidth. i’m not sure what has a fast keepalive that’s ready to hand — irc would probably catch a >30s connection failure, perhaps?
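If IRC isn't handy, the same idea can be sketched with a plain TCP heartbeat. This assumes you have some server you control that echoes back whatever it receives (the host and port are yours to supply, and the function names here are made up); it holds one connection open and reports the first time a small ping fails to come back.

```python
import socket
import time


def _recv_exact(sock, size):
    """Read exactly `size` bytes, or return what arrived before the peer closed."""
    chunks = b""
    while len(chunks) < size:
        chunk = sock.recv(size - len(chunks))
        if not chunk:
            break
        chunks += chunk
    return chunks


def heartbeat(host, port, beats=10, interval=5.0, timeout=3.0):
    """Hold one TCP connection open and echo a tiny payload every `interval` seconds.

    Returns a list of (seconds_since_start, ok) tuples and stops at the
    first failed beat. Assumes the remote end is an echo-style service.
    """
    results = []
    start = time.monotonic()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        for i in range(beats):
            payload = f"ping {i}\n".encode()
            try:
                sock.sendall(payload)
                ok = _recv_exact(sock, len(payload)) == payload
            except OSError:
                # Covers timeouts and resets on the long-lived connection.
                ok = False
            results.append((round(time.monotonic() - start, 3), ok))
            if not ok:
                break
            time.sleep(interval)
    return results
```

Left running with something like `beats=720, interval=5.0` while a Steam download is started and stopped, the timestamp of the first failed beat can be lined up against when the download died. Because it uses almost no bandwidth, a failure here would point at connection tracking rather than raw throughput.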

honestly though i’m mostly stumped. this is a nasty one.

I doubt this is the issue (and really lack any specific knowledge to help you), but on the off chance it is I thought I would say something. My motherboard has...extremely bad USB port management, and if I "overload" certain ports in certain ways it'll just drop USB devices. I have one USB port I can never plug anything into, and a pair where I can plug four devices into one side using a USB hub, but plugging just two of those devices into each of those side by side ports causes the motherboard to randomly kick one. I can imagine a situation where this behaviour somehow leads to it dropping network devices.

Since unplugging your keyboard mysteriously solved the problem for hours, maybe try avoiding that USB port and using a different one with a USB hub? I assume the replacement keyboard that caused the issue to resume was inserted into the same USB port.

High order spitball here, but- in reading your troubleshooting actions, it sounds like your local user account was preserved / migrated during the reinstall. So, worth a shot:

  1. create a new user account
  2. attempt the high bandwidth actions that were failing before from the new user account, see if any of them behave differently.

This shouldn't work, but if it does, you may have user specific files / registry entries / etc. that are persisting as an issue.

I agree that creating a new user account is something worth trying. I see a lot of weird issues, and sometimes it's the windows account or something specific to it. I'd create a local admin user account by running command prompt as admin, then doing the two following commands:

net user "username" password /add
net localgroup administrators "username" /add

This gets you a local account with the username "username" and password "password." I'd just delete it once you are done testing.

Also, since no one else has tossed it out there, a great catch-all for Windows is these two commands:

sfc /scannow
DISM.exe /Online /Cleanup-Image /restorehealth

Both of these can fix corrupt files in Windows. It's unlikely to be the problem, but there's no harm in trying them.

Re-read and see that you've tried a new user account already. Crazy issue. My other instinct is that there is some Quality of Service stuff on the router causing issues. Sometimes that stuff hurts more than helps.

This would be a hell of a ticket if it came in at work. Props for providing so much detail

One more thing comes to mind: Windows 10/11 has a built-in "metered connection" toggle. It should be in Settings > Network & Internet > Ethernet. iirc it's off by default, but maybe something with your ISP makes Windows think it's a metered connection. The issues you are seeing would be consistent with something trying to meter your connection.

Shot in the dark, but have you tried a traceroute from your computer to the servers of whatever service you're having issues with? Steam would be a good test: get a network monitoring app like Wireshark and start a download. idk how Steam does their CDN, but running only Steam should make finding the IP or hostname easy. I know from work (support for a hyperscaler) that bad routing can basically turn fast internet back into the 56k era due to packet loss and latency.

Does the connection drop during a speedtest? After getting a new router I encountered a similar issue for a few days: it felt slow and spotty, and would die during a synthetic speedtest. I reseated the SFP fibre module, reset and updated the router, and it went away.

Godspeed, hope this networking hell will be done soon!

Given everything else so far my guess is it’s to do with electrical/noise issues rather than the network itself. Something unshielded or destructively ringing at certain network.. intensities? In order I guess try a different power outlet for everything, then power cables swapped around or replaced, I might even suggest try DisplayPort instead of HDMI or vice versa but it’s probably not that, definitely try a powered USB hub for all peripherals (plugged into the back then the front of the computer to test), then try a not onboard/builtin network adapter (at least a USB Ethernet dongle to see if the problem still arises). It wasn’t totally clear to me if the power supply def got replaced, if not I would think of doing that soon-ish but I know it’s a PITA.

Jeez this sounds horrible. This reminds me of when my Bluetooth would always freakishly disconnect and reconnect. The only solution I found (on reddit) was to unplug the power cable from my pc and press/hold the power button for 30 seconds. That would discharge the weird power issue, and the Bluetooth device on the mobo would work like normal when I plugged the pc back in and booted it. I nearly threw my pc into the river when this weird trick worked!

Recently I had a weird network issue with an MSI B550 board that would essentially "work" but had weird side effects. The issue came down to shitty drivers: Windows had auto pulled and installed the wrong drivers on the pc. Have a glance at the Device Manager, maybe something weird is happening there. I even went to the network chip manufacturer's website to download the proper drivers, but it still didn't work. The only solution was to install the drivers from the CD that came with the mobo. Even the drivers from the mobo website (identical to the CD version) would not launch correctly unless they were running off some kind of mounted CD device or through some bloatware looking app. What a weird thing, and this was for a recent generation mobo too. My guess is that it's pulling bad drivers. Network cards are pretty shitty, with a bunch of weird crossover cards that are labeled one thing but are actually another card. I would try to reach out to the Level1Techs community (Wendell is like the go-to for pc related issues), or maybe reach out to Brad Shoemaker and Will Smith. They are constantly running through weird network/tech issues, and since you are familiar with former Giant Bomb alumni maybe they can help.

Haunted PCs are such a fucking clusterfuck to fix, sorry you're going through this.

A couple thoughts after stewing on this for a few days and watching the updates:

  1. What do your router logs look like? They can sometimes provide hints about connection drops.
  2. Does your router have any QoS enabled? I've had some problems with QoS implementations even when trying to guarantee a machine is prioritized.
  3. Have you tried reproducing the problem under another OS (like a Linux Live USB)? There's policy stuff that just trying a new user in Windows isn't going to affect and such, and there may have been something inadvertently set when you set up your new OS after reinstalling.
  4. I'm also somewhat inclined to think this is a result of some electronic noise in your house. Isolating with a UPS would help (if it's really noise), but it can be something of a last-resort to debug since they can be expensive. That said, they can be pretty handy in terms of preventing work loss when the power goes out, and can be handy in a pinch for charging devices if the power goes out for extended periods.

There's also esoteric behavior like Windows having the Nagle Algorithm enabled by default - there's some information about it and how to disable it here: https://support.microsoft.com/en-us/topic/fix-tcp-ip-nagle-algorithm-for-microsoft-message-queue-server-can-be-disabled-74ba2f6a-e558-d1df-1c60-57b0fab68ccc - I wouldn't expect that to solve your problem, but it can help improve latency and stuttering when gaming and such.

This sort of behavior can happen with an IP Address conflict. Which is a higher level issue than what most are posting here, but it's more insidious because it can go undiagnosed for a while.

The fix is to go to Network -> Change Adapter Settings and see if there is an existing IP set there for the interface you're working with. If there is, then either set a new one on the same subnet (making sure it doesn't conflict with another device on the network) or clear the entries and see if the issue persists. Check both IPv6 and IPv4 settings but prioritize IPv4. The router should be assigning addresses automatically through DHCP, which a statically set IP can conflict with. Routers can also have a static table that they set on their end that could conflict.
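
To make the "same subnet, no conflicts" part concrete, here is a small sketch of the checks I'd do by eye. Everything in it is an assumption for illustration: `check_static_address` is a made-up helper, and the subnet, DHCP pool bounds and device list are values you'd have to read off your own router's admin page.

```python
import ipaddress


def check_static_address(address, subnet, dhcp_start, dhcp_end, known_devices=()):
    """Return a list of problems with a manually assigned IPv4 address.

    All inputs are strings copied from the adapter settings and the router
    admin page (hypothetical values, nothing is discovered automatically).
    """
    ip = ipaddress.ip_address(address)
    net = ipaddress.ip_network(subnet)
    problems = []
    if ip not in net:
        problems.append("outside the subnet")
    elif ip in (net.network_address, net.broadcast_address):
        problems.append("reserved network or broadcast address")
    # A static address inside the pool can later be leased to another device.
    if ipaddress.ip_address(dhcp_start) <= ip <= ipaddress.ip_address(dhcp_end):
        problems.append("inside the DHCP pool")
    if ip in {ipaddress.ip_address(d) for d in known_devices}:
        problems.append("already used by another device")
    return problems
```

An empty list means the address at least passes these basic checks. It doesn't prove there's no conflict, since a device the router doesn't know about could still be squatting on the same address.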

If there are no addresses set for any adapters, open a command prompt (Win+R, then type cmd.exe) and try ipconfig /release and ipconfig /renew to force a new IP lease from DHCP on your local network.