blep
@blep

DecayWTF
@DecayWTF

there's a bunch of dumb bullshit going on here. I started writing this last night and it's kinda long.


First thing: / as path separator is a Unixism and was not historically standard by any means. It is generally accepted that the first hierarchical filesystem was implemented as part of Multics (an ambitious attempt at a timesharing OS that could serve as a "computing utility" with similar reliability to the phone or electrical systems) and used > as a path separator. DEC's TOPS-20 and VMS both use(d) a . (and have other similarities as well), MIT's homegrown ITS used a ;. Even among "modern" operating systems, this continued; Mac OS prior to OS X used a :. Beyond that, many of these systems had other niceties; TOPS-20 and VMS, for instance, have structured pathnames where the directory is a single delimited component, wrapped in <> and [], respectively. Many systems include some sort of drive or volume specifier; Unix's single tree is extremely unusual, historically. Finally, many operating systems, particularly microcomputer systems like the mini-OS built into the Commodore 1541 disk drive, Digital Research's CP/M and early PC-DOS, but even some early mainframe OSes, had no support for hierarchical filesystems at all, so there was no path to separate.

So how does this relate to Windows? Well, when the IBM Personal Computer was in development, besides the home computer market (C64, Timex Sinclair, BBC Micro, TRS-80, etc) there was already a thriving-ish market for business machines based on the Intel 8080/8085 or Zilog Z80 CPUs, which ran CP/M. IBM had no operating system for the IBM PC at the time and approached both Digital Research about porting CP/M to the PC, and Microsoft, who at the time were known for selling the Microsoft BASIC interpreter that was already licensed to every computer manufacturer that would take MS's calls.

There's a lot of mostly bullshit legendry around how this all went down but the important part is that MS did not have an OS to sell at the time, so instead they acquired a CP/M semi-workalike called 86-DOS from Seattle Computer Products. CP/M didn't have directory paths, but instead had drive letters (named devices like this being a typical way to address different volumes on basically every non-Unix system; CP/M just had a particularly simple-minded approach), so each filename, fully qualified, was something like A:FILENAME.TXT, for a file named FILENAME.TXT on the first floppy drive. 86-DOS copied this and a lot of other CP/M-isms.

The early PC DOS utilities settled on / as a switch specifier for passing parameters to commands (eg, if you want a directory listing and you want the listing to pause after each "page", or screen, of listing you would type "DIR /P"). There's a lot of confusion about where this came from. It didn't come from CP/M or IBM's mainframe OSes, none of which used this convention, though that's commonly said to be the case. However, DEC systems (like TOPS-10 and TOPS-20) did, and MS did a lot of early development of eg Altair BASIC on a PDP-10 so it's likely that's where it's from.

So for PC DOS 1.0 we had drive letters pointing to named floppy drives, and forward slashes as command flags, and that was fine. DOS 2.0, however, introduced a hierarchical filesystem (a modification of the really simple fs used in 1.0 that was called FAT, and which in a modified 32-bit form is still commonly used today) so as to be able to support hard drives. The obvious choice for the path separator is a forward slash; it's 1982/1983 now and Unix has unequivocally won, such that DOS 2.0 is consciously taking on a number of other Unixisms like pipes, file descriptors and environment variables. However, we use forward slashes pervasively for command switches. So instead we use backslashes. We don't discard the drive letter system, for many reasons (backwards compatibility, and the fact that most PCs, even those with hard drives, never had a stable "root" FS because people were constantly booting from floppies that were going to be swapped around for various reasons); we just adapt it: assign letters to all floppies first, then to all hard drives. IBM branded hardware only ever supported two internal floppy drives, and two drives was typical on clones, so C: was usually the first hard drive, and so as of DOS 5.0 that was hardwired in, which is why your system drive is C:. Windows up to 3.11 was all just a shell on top of DOS (and Win9x was still mostly that) so Windows inherits the same structure too.

Nowadays, modern Windows supports / as a path separator and supports Unix-style volume mounts as well. Drive letters are still the primary designator and most storage devices are still addressed this way, but if you have more partitions than letters you can set up mount points for the ones you can't assign a letter to. In the old days you just couldn't have more than 26 total drives, but there was no hardware that could support that anyway without stupid partition tricks, so you'd only run into the limit if you were doing a lot of network drive mapping or going nuts with SUBST, which meant that in practice it was mostly something systems administrators worried about.

Those long paths starting with \\ are called UNC paths and are a sorta-kinda single tree naming system that allows all "devices" to appear in a single tree. "\\localhost\c$" is actually referring to your computer "over the network" and connecting to the autogenerated full disk share for the C: drive.



in reply to @blep's post:

when you access \\localhost\c$, you access an SMB network share of your c drive that Windows sets up automatically. fun fact: this allows domain admins to access any domain machine’s c drive over the network. it can actually be removed, though i haven’t tried that.

just checked on my local machine (where I've disabled the default shares) and apparently \\localhost\[drive letter]$ works to access local drives even if you've disabled the shares, which is wild

what happens when you mount more than 26 disks?

you can mount a disk at any (empty) directory of any existing NTFS volume. so actually there’s no guarantee that C:\SomeDir is really on the “C” volume

If you mount more than 24 disks (A: and B: are reserved for floppy disks specifically), any further disks will be mounted but won't be easily accessible. Windows doesn't have "mount points": mounting a filesystem happens automatically when you first access a device that's marked as a volume, and things like drive letters or "mount points" are actually symbolic links.

The real path of a disk is something like \Device\HarddiskVolume3 (in kernel path syntax), and a drive letter is actually a symbolic link with a path like \GLOBAL??\C:. C: is not a file, but a symbolic link object, and \GLOBAL?? is not a directory, but an object directory: both are abstractions used to give a structure to the kernel object namespace. Symbolic links under \GLOBAL?? are known as "DOS devices", and include drive letters, proper DOS devices like COM1, AUX, PRN etc. and other kinds of symbolic links to devices.
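One way to see that a drive letter is just a symbolic link is to ask Windows where it points. This is a minimal sketch, assuming Python with ctypes and the Win32 QueryDosDeviceW call; the helper name dos_device_target is made up for illustration, and on anything that isn't Windows it simply returns None.

```python
import ctypes
import os


def dos_device_target(name: str):
    """Return the kernel path a "DOS device" such as "C:" links to, or None."""
    if os.name != "nt":
        return None  # QueryDosDeviceW only exists on Windows
    buf = ctypes.create_unicode_buffer(1024)
    # QueryDosDeviceW(lpDeviceName, lpTargetPath, ucchMax) returns 0 on failure
    written = ctypes.windll.kernel32.QueryDosDeviceW(name, buf, len(buf))
    if written == 0:
        return None
    # The buffer holds a list of NUL-separated strings; .value is the first one
    return buf.value


if __name__ == "__main__":
    # On a Windows machine this prints something of the form
    # \Device\HarddiskVolumeN; the number depends on the system.
    print(dos_device_target("C:"))
```

Note that the device name is passed without a trailing backslash ("C:", not "C:\").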

A Win32 path like C:\dir\file is translated to a kernel path like \??\C:\dir\file; for various backwards compatibility reasons, the translation includes stripping trailing dots from path components, converting forward slashes to backslashes and collapsing special directory entries like .. and . (e.g. C:\dir\.\dir/../file... translates to \??\C:\dir\file as well). In the kernel namespace, there is actually no object directory named \??: it's shorthand for a somewhat complex per-process search path, with \GLOBAL?? as its fallback.
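To make the normalization step concrete, here is a toy version in Python. It only follows the three rules described above (slash conversion, collapsing . and .., stripping trailing dots), only handles fully qualified drive-letter paths, and the function name win32_to_nt is invented for this sketch; the real translation has many more special cases.

```python
def win32_to_nt(path: str) -> str:
    """Toy Win32 -> NT translation for paths of the form X:\\... only."""
    # Forward slashes are treated the same as backslashes.
    path = path.replace("/", "\\")
    drive, rest = path[:2], path[2:]
    parts = []
    for comp in rest.split("\\"):
        if comp in ("", "."):
            continue  # empty components and "." disappear
        if comp == "..":
            if parts:
                parts.pop()  # ".." removes the previous component, never the drive
            continue
        comp = comp.rstrip(".")  # trailing dots are stripped
        if comp:
            parts.append(comp)
    return "\\??\\" + drive + "\\" + "\\".join(parts)


if __name__ == "__main__":
    print(win32_to_nt(r"C:\dir\.\dir/../file..."))  # prints \??\C:\dir\file
```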

Does this look familiar? Yes, the \\?\-prefixed syntax is a way to skip the normalization step, and feed barely-disguised kernel paths straight to the APIs. If you pass \\?\C:\dir\file, a trivial translation algorithm will be used that turns that into \??\C:\dir\file (what about .. and . directory entries and forward slashes? you are at the mercy of the filesystem implementation for how those are handled). An older escape syntax is the \\.\ prefix, which works almost exactly the same, but is generally used to access devices that aren't in the old DOS set of devices accessible from any path; e.g. COM1 through COM9 are accessible by simply putting the device name in any part of the path (COM1, \\.\COM1, C:\COM1, COM1.txt, C:\COM1\etc all work), but COM10 and above must be accessed as \\.\COM10.

Don't read too much into the \\ syntax, all this is unrelated to UNC (network) paths, which predate Windows NT and the \\.\ and \\?\ escapes. In fact, UNC paths like \\localhost\c$\Users\cat\Desktop\help.txt are internally translated as \??\UNC\localhost\c$\Users\cat\Desktop\help.txt, where UNC is the "DOS device" (symbolic link) for the Multiple UNC Provider (MUP), a router that sits in front of network filesystems, calling each in turn until one accepts the host/share pair in the path. In this case, the host is localhost and the share is c$, and the path will be accepted by the SMB redirector, which automatically creates a share named x$ for each x: drive letter (yes, they're remotely accessible). The reason the share name doesn't include the : part is that it's an invalid character in paths, except to separate the drive letter from the rest of the path; and the reason for the $ is that it's an ancient convention to create hidden entities: all shares (and, I believe, user names and group names as well) that end with $ won't be enumerated.
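The two prefix translations described so far can be put side by side in a few lines of Python. This is a sketch only: the function name translate_prefixed is hypothetical, it covers just the \\?\ escape and plain UNC paths, and it skips the normalization that real UNC paths go through.

```python
def translate_prefixed(path: str) -> str:
    """Toy translation of \\\\?\\ paths and UNC paths to kernel-style paths."""
    if path.startswith("\\\\?\\"):
        # \\?\ escape: no normalization, the prefix is swapped for \??\
        return "\\??\\" + path[4:]
    if path.startswith("\\\\"):
        # UNC: \\host\share\... is routed through the "UNC" DOS device (MUP)
        return "\\??\\UNC\\" + path[2:]
    raise ValueError("only \\\\?\\ and UNC paths are handled in this sketch")


if __name__ == "__main__":
    print(translate_prefixed(r"\\?\C:\dir\file"))
    print(translate_prefixed(r"\\localhost\c$\Users\cat\Desktop\help.txt"))
```

The \\?\ check has to come first, since those paths also start with two backslashes.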

The full explanation of the Windows path syntax is much more complex. This is the best explanation I know of, although it might be outdated by now: https://googleprojectzero.blogspot.com/2016/02/the-definitive-guide-on-win32-to-nt.html

How can you access the 25th disk? Drive letters, as I said, aren't the only kind of symbolic links created for disk volumes. Others include:

  • mount manager symbolic links; they look like \\?\Volume{GUID}. You can enumerate these with the command line utility mountvol (which you can also use to assign and revoke drive letters or "mount points", which aren't really mount points). They can be enumerated programmatically with FindFirstVolume/FindNextVolume.
  • device manager (aka "Plug & Play") symbolic links; they look like \\?\<device instance path>#{device interface GUID} (the device instance path's backslashes are replaced with #). For example, for a volume enumerated as STORAGE\VOLUME\{AF0B22BB-8E21-4013-9764-211103064B98}#0000000011D00000 that implements interface {53f5630d-b6bf-11d0-94f2-00a0c91efb8b} (regular disk volume), the device manager will create symbolic link \\?\STORAGE#Volume#{af0b22bb-8e21-4013-9764-211103064b98}#0000000033D00000#{53f5630d-b6bf-11d0-94f2-00a0c91efb8b}. These can be enumerated with device manager APIs, like CM_Get_Device_Interface_List or SetupDiEnumDeviceInterfaces.
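The naming rule for device manager links is plain string manipulation, so it can be sketched in Python. The function name device_interface_link is made up for this example, and it only builds the name from the rule as described above; it doesn't query the device manager or change letter case.

```python
def device_interface_link(instance_path: str, interface_guid: str) -> str:
    """Build a device interface link name from its two ingredients."""
    # Backslashes in the device instance path become '#', then the
    # interface GUID is appended after one more '#'.
    return "\\\\?\\" + instance_path.replace("\\", "#") + "#" + interface_guid


if __name__ == "__main__":
    link = device_interface_link(
        r"STORAGE\VOLUME\{AF0B22BB-8E21-4013-9764-211103064B98}#0000000011D00000",
        "{53f5630d-b6bf-11d0-94f2-00a0c91efb8b}",
    )
    print(link)
```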

You can also create "mount points" on supported filesystems (like NTFS), as special "reparse point" directories. "Reparse points" are a multi-purpose superset of symbolic links, that come in many different kinds. One of the built-in kinds of reparse points is the "mount point", a symbolic link to a volume managed by the mount manager (not any volume but specifically those managed by the mount manager). These differ from proper UNIX mount points in a couple ways:

  • they're actually persisted on disk, they aren't just in-memory abstractions (as a corollary, the target filesystem's on-disk format must allow for storing reparse points). This has its advantages, because you can reorder volumes on a disk, or move them to different physical disks altogether, and all the mount points will keep magically working (volume GUIDs are stored on-disk, in that hidden reserved space at the end of the disk. Does your partitioning software correctly support the - undocumented - format of that database?).
  • since they're just symbolic links, a volume can have many of them. Corollary: if C:\mount is a mount point for \Device\HarddiskVolume4, \\?\C:\mount\.. translates to \Device\HarddiskVolume4\.., which points to the root directory of the target volume, not the root directory of C:; in regular use, this is hidden by the fact that paths like C:\mount\.. are normalized in user mode before being passed to the kernel (C:\mount\.. -> C:\ -> \??\C:\ -> \Device\HarddiskVolume3\), but if you use the \\?\ escape you have to perform this kind of normalization yourself (otherwise, \\?\C:\mount\.. -> \??\C:\mount\.. -> \Device\HarddiskVolume4\..). On UNIX, this doesn't happen because you can only have one mount point per filesystem (... with exceptions/complications that I won't go into), and the kernel can pretend that the root of the mounted filesystem is actually a subdirectory of the directory that contains the mount point: /media/mount/.. actually "exists" (albeit only in memory) and it correctly points to /media/, instead of being user mode trickery like on Windows.
  • Windows applications generally have no idea what to do with mount points and they'll behave like you pranked them. Expect all sorts of things to stop working correctly, from file operations failing to incorrect calculations of disk space (to avoid this, just pass a full path to GetDiskFreeSpaceEx, instead of passing just the drive letter to GetDiskFreeSpace).
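The disk space advice in the last point can be followed from Python without touching the Win32 API directly. This is a small sketch: shutil.disk_usage takes a full path and, as far as I know, is implemented on Windows with GetDiskFreeSpaceEx, so it reports on the volume that actually holds the directory rather than on whatever the drive letter points to. The helper name free_bytes_at is invented here.

```python
import os
import shutil


def free_bytes_at(directory: str) -> int:
    """Free space, in bytes, on the volume that actually holds `directory`."""
    # Pass the full directory path, not just its drive letter: if the
    # directory sits under a mount point, the two can be different volumes.
    return shutil.disk_usage(directory).free


if __name__ == "__main__":
    here = os.getcwd()
    print(here, free_bytes_at(here))
```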