WineHQ

8.6. File management

Over time, the Windows API has come closer to the old Unix paradigm "Everything is a file". This section therefore covers file management first, but also some other objects, like directories and even devices, which Windows manipulates in a rather coherent way. We'll see later on some other objects that fit (more or less) into this picture (pipes and consoles, to name a few).

First of all, Wine, while implementing the Windows file interface, needs to map a file name (expressed in the Windows world) onto a file name in the Unix world. This encompasses several aspects: how to map the file names, how to map access rights (both on files and directories), and how to map physical devices (hard disks, but also other devices, like serial or parallel interfaces, and even VxDs).

8.6.1. Various Windows formats for file names

Let's first review the various forms Windows uses for file names.

8.6.1.1. The DOS inheritance

In the beginning was DOS, where each file has to sit on a drive, designated by a single letter. To separate device names from directory or file names, a ':' was appended to this letter, giving the (in)famous C: drive designations. Another great invention was to use fixed names for accessing devices: not only were these names fixed, in that you couldn't change them even if you wished to, but they were also insensitive to the location where you used them. For example, it's well known that COM1 designates the first serial port, but c:\foo\bar\com1 also designates the first serial port. This is still true today: on XP, you still cannot name a file COM1, whatever the directory!

Later on (with Windows 95), Microsoft decided to overcome some limitations in file names. This included getting out of the 8+3 format (8 letters for the name, 3 letters for the extension) and so being able to use "long names" (that's the "official" naming; as you can guess, a name in 8+3 format is a short name), and also allowing previously forbidden characters in a file name (like a space, or even an additional '.'). You could then name a file My File V0.1.txt instead of myfile01.txt. Just to keep things fun, for many years the on-disk format stored the short name as the real name and used some tricky aliasing techniques to store the long name. When newer file systems were introduced (NTFS with NT) to replace the old FAT system (which had evolved little since the first days of DOS), the long name became the real name while the short name took the alias role.

Windows also started to support mounting network shares and seeing them as if they were local disks (through a specific drive letter). The way this was done changed over the years, so we won't go into all the details (especially on the DOS and Win9x side).

8.6.1.2. The NT way

The introduction of NT brought a deep change in the way devices are handled, compared to DOS:

  • There's no longer a forest of DOS drive letters (even if the assign command was a way to create symbolic links in the forest), but a single hierarchical space.

  • This hierarchy includes several distinct elements. For example, \Device\Harddisk0\Partition0 refers to the first partition on the first physical hard disk of the system.

  • This hierarchy covers way more than just the file and drive related objects: it includes most of the objects in the system. We'll only cover the file related part here.

  • This hierarchy is not directly accessible from the Win32 API, but only from the NTDLL API. The Win32 API only allows manipulating part of this hierarchy (the rest being hidden from it). Of course, the part you see from the Win32 API looks very similar to the one that DOS provided.

  • Mounting a disk is performed by creating a symbolic link in this hierarchy from \Global??\C: (the name seen from the Win32 API) to \Device\HarddiskVolume1, which determines the partition on a physical disk where C: is going to be seen.

  • Network shares are also accessible through a symbolic link. However, in this case, the symbolic link is created from \Global??\UNC\host\share\ (for the share share on the machine host) to what's called a network redirector, which takes care of 1/ the connection to the remote share, and 2/ handling with that remote share the rest of the path (after the name of the server and the name of the share on that server).

    Note: In NT naming convention, \Global?? can also be called \?? to shorten the access.

All of this makes the NT system much more flexible (you can add new types of filesystems if you want), provides a unique name space for all objects, and makes most operations boil down to creating relationships between different objects.

8.6.1.3. Wrap up

Let's end this chapter about files in Windows with a review of the different formats used for file names:

  • c:\foo\bar is a full path.

  • \foo\bar is an absolute path; the full path is created by prepending the default drive (i.e. the drive of the current directory).

  • bar is a relative path; the full path is created by prepending the current directory.

  • c:bar is a drive relative path. Note that the case where c: is the drive of the current directory is rather easy; it's implemented the same way as the relative path case just above. In the rest of this section, drive relative path will only cover the case where the drive in the path isn't the drive of the current directory. The resolution of this to a full pathname differs according to the version of Windows and some parameters. Let's take some time browsing through these issues.

    On Windows 9x (as well as on DOS), the system maintains a process wide set of default directories, one per drive. Hence, in this case, it resolves c:bar to the default directory on drive c: plus file bar. Of course, the default per drive directory is updated each time a new current directory is set (only the current directory of the drive specified is modified).

    On Windows NT, things differ a bit. Since NT implements a namespace for files closer to a single tree (instead of 26 drives), having a current directory per drive is a bit awkward. Hence, the default behavior of Windows NT is to have only one current directory across all drives (in fact, a current directory expressed in the global tree); this directory is of course specific to a given process. c:bar is resolved this way:

    • If c: is the drive of the default directory, the final path is the current directory plus bar.

    • Otherwise it's resolved into c:\bar.

    • In order to bridge the gap between the two implementations (Windows 9x and NT), NT adds a bit of complexity to the second case. If the =C: environment variable is defined, then its value is used as the default directory for drive C:. This is handy, for example, when writing a DOS shell, where having a current directory per drive is still implemented, even on NT. This mechanism (through environment variables) is implemented in CMD.EXE, where those variables are set when you change directories with the cd command. Since environment variables are inherited at process creation, the current directory settings are inherited by child processes, hence mimicking the behavior of the old DOS shell. There's no mechanism (in NTDLL or KERNEL32) to set up the relevant environment variables when the current directory changes. This behavior is clearly a band-aid, not a full featured extension of the current directory behavior.

    Wine fully implements all those behaviors (the Windows 9x vs NT ones are triggered by the version flag in Wine).

  • \\host\share is a UNC (Universal Naming Convention) path, i.e. it represents a file on a remote share.

  • \\.\device denotes a physical device installed in the system (as seen from the Win32 subsystem). A standard NT system will map it to the \??\device NT path. Then, in a standard configuration, \??\device is likely to be a link to a physical device described and hooked into the \Device\ tree. For example, COM1 is a link to \Device\Serial0.

  • On some versions of Windows, paths were limited to MAX_PATH characters. To circumvent this, Microsoft allowed paths to be up to 32,767 characters long, on the conditions that the path is expressed in Unicode (no ANSI version) and that the path is prefixed with \\?\. This convention is applicable to any of the cases described above.
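The forms listed above can be told apart from the first few characters of the path alone. The following is a minimal sketch of such a classifier; the function name classify_win32_path and the returned labels are made up for this illustration and are not part of Wine or Windows.

```python
import re

def classify_win32_path(path: str) -> str:
    """Return the kind of Win32 path, following the list above.

    Hypothetical helper: the labels are illustrative only.
    """
    if path.startswith("\\\\?\\"):
        return "long"            # \\?\ prefix: up to 32,767 characters
    if path.startswith("\\\\.\\"):
        return "device"          # \\.\device
    if path.startswith("\\\\"):
        return "unc"             # \\host\share
    if re.match(r"^[A-Za-z]:\\", path):
        return "full"            # c:\foo\bar
    if re.match(r"^[A-Za-z]:", path):
        return "drive-relative"  # c:bar
    if path.startswith("\\"):
        return "absolute"        # \foo\bar
    return "relative"            # bar
```

The order of the tests matters: the \\?\ and \\.\ prefixes must be checked before the generic UNC prefix, and the full path form before the drive relative one.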

To summarize what we've discussed so far, let's put everything into a single table.

Table 8-1. DOS, Win32 and NT paths equivalences

Type of path | Win32 example | NT equivalent | Rule to construct
Full path | c:\foo\bar.txt | \Global??\C:\foo\bar.txt | Simple concatenation
Absolute path | \foo\bar.txt | \Global??\J:\foo\bar.txt | Simple concatenation using the drive of the default directory (here J:)
Relative path | gee\bar.txt | \Global??\J:\mydir\mysubdir\gee\bar.txt | Simple concatenation using the default directory (here J:\mydir\mysubdir)
Drive relative path | j:gee\bar.txt | On Windows 9x (and DOS), J:\toto\gee\bar.txt | On Windows 9x (and DOS), \toto is the default directory on drive J:
Drive relative path | j:gee\bar.txt | On Windows NT, J:\gee\bar.txt | On Windows NT, if =J: isn't set
Drive relative path | j:gee\bar.txt | On Windows NT, J:\tata\titi\gee\bar.txt | On Windows NT, if =J: is set to J:\tata\titi
UNC (Universal Naming Convention) path | \\host\share\foo\bar.txt | \Global??\UNC\host\share\foo\bar.txt | Simple concatenation
Device path | \\.\device | \Global??\device | Simple concatenation
Long paths | \\?\... | | With this prefix, paths can take up to 32,767 characters (instead of MAX_PATH for all the others). Once the prefix is stripped, the path is handled like one of the previous ones, just providing internal buffers large enough.
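As a rough illustration of the NT rules for drive relative paths, here is a sketch under stated assumptions: the names resolve_drive_relative_nt and to_nt_path are hypothetical, the environment is passed as a plain dictionary keyed by upper-case names such as "=J:", and no normalization (such as handling of . or ..) is attempted.

```python
def resolve_drive_relative_nt(path, current_dir, env):
    """Resolve 'X:rest' into a full path, the Windows NT way.

    current_dir is the single per-process current directory, as a full
    path such as 'J:\\mydir'; env maps names such as '=J:' to directories.
    """
    drive, rest = path[:2].upper(), path[2:]
    if current_dir[:2].upper() == drive:
        base = current_dir                         # same drive: plain relative path
    else:
        base = env.get("=" + drive, drive + "\\")  # =X: if set, else the drive root
    return base.rstrip("\\") + "\\" + rest

def to_nt_path(full_path):
    """Turn a full Win32 path into its NT form by simple concatenation."""
    return "\\Global??\\" + full_path[0].upper() + full_path[1:]
```

With a current directory of C:\work, resolve_drive_relative_nt("j:gee\\bar.txt", "C:\\work", {}) yields J:\gee\bar.txt, while passing {"=J:": "J:\\tata\\titi"} as the environment yields J:\tata\titi\gee\bar.txt.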

8.6.2. Wine implementation

We'll mainly cover in this section the way Wine opens a file (in the Unix sense) when given a Windows file name. This will include mapping the Windows path onto a Unix path (including the devices case), handling the access rights, the sharing attribute if any...

8.6.2.1. Mapping a Windows path into an absolute Windows path

First of all, we described in the previous section how to convert any path into an absolute path. Wine implements all of these algorithms. Note also that this transformation is done with information local to the process (default directory, environment variables...). We'll assume in the rest of this section that all paths have been transformed into absolute form.

8.6.2.2. Mapping a Windows (absolute) path onto a Unix path

When Wine is requested to map a path name (in DOS form, with a drive letter, e.g. c:\foo\bar\myfile.txt), it converts it into the following Unix path: $(WINEPREFIX)/dosdevices/c:/foo/bar/myfile.txt. The Wine configuration process is responsible for setting $(WINEPREFIX)/dosdevices/c: to be a symbolic link pointing to the directory in the Unix hierarchy that the user wants to expose as the C: drive in the DOS forest of drives.

This scheme allows:

  • a very simple algorithm to map a DOS path name into a Unix one (no need for Wine server calls)

  • a very configurable implementation: it's very easy to change a drive mapping

  • a rather readable configuration: no need for sophisticated tools to read a drive mapping, ls -l $(WINEPREFIX)/dosdevices says it all.

This scheme is also used to implement UNC path names. For example, Wine maps \\host\share\foo\bar\MyRemoteFile.txt into $(WINEPREFIX)/dosdevices/unc/host/share/foo/bar/MyRemoteFile.txt. It's then up to the user to decide where $(WINEPREFIX)/dosdevices/unc/host/share shall point to (or be). For example, it can be a symbolic link to a directory on the local machine (just for emulation purposes), a symbolic link to the mount point of a remote disk (done through Samba or NFS), or even the real mount point. Wine will not do any checking here, nor will it help in actually mounting the remote drive.
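The mapping for drive letters and UNC names is purely textual, which a few lines can show. This is only a sketch: the function name dos_to_unix is hypothetical, the drive letter is assumed to be lower-cased as in the examples above, and no case insensitive lookup is performed at this stage.

```python
import os

def dos_to_unix(path, wineprefix):
    """Map an absolute DOS or UNC path under <wineprefix>/dosdevices."""
    dosdevices = os.path.join(wineprefix, "dosdevices")
    if path.startswith("\\\\"):
        # \\host\share\rest -> dosdevices/unc/host/share/rest
        parts = ["unc"] + path[2:].split("\\")
    else:
        # c:\rest -> dosdevices/c:/rest
        drive, rest = path[:2].lower(), path[3:]
        parts = [drive] + (rest.split("\\") if rest else [])
    return os.path.join(dosdevices, *parts)
```

Because the result goes through the dosdevices symbolic links, changing a drive mapping only requires repointing one link; the function itself never needs to know where C: really lives.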

We've seen how Wine maps a drive letter or a UNC path onto the Unix hierarchy; we now have to look at how the filename is searched within this hierarchy. The main issue is case sensitivity. Here's a reminder of the various properties of the file systems in the field.

Table 8-2. File systems' properties

FS name | Length of elements | Case sensitivity (on disk) | Case sensitivity for lookup
FAT, FAT16 or FAT32 | Short name (8+3) | Names are always stored in upper-case | Case insensitive
VFAT | Short name (8+3) + alias on long name | Short names are always stored in upper-case. Long names are stored with case preservation. | Case insensitive
NTFS | Long name + alias on short name (8+3) | Long names are stored with case preservation. Short names are always stored in upper-case. | Case insensitive
Linux FS (ext2fs, ext3fs, reiserfs...) | Long name | Case preserving | Case sensitive

Case sensitivity vs. preservation: When we say that most file systems in NT are case insensitive, this applies to looking up a file, where matches are made in a case insensitive mode. This is different from the "case preservation" mechanism of VFAT or NTFS, which stores file names as they were given when creating the file, while still doing case insensitive matches.

Since most file systems used in NT are case insensitive and most Unix file systems are case sensitive, Wine performs a case insensitive search once it has determined the Unix path it has to look for. This means, for example, that to open $(WINEPREFIX)/dosdevices/c:/foo/bar/myfile.txt, Wine successively opens each directory in the path and checks, in this order, for the existence of the directory entry in the form given in the file name (i.e. case sensitive) and, if it's not found, in a case insensitive form. This also allows passing a Unix path (instead of a DOS or NT path) to most Win32 file APIs, but we'll come back to this later.

This also means that the algorithm doesn't correctly handle the case of two files in the same directory whose names differ only in the case of the letters. If there are two such files (i.e. their names match in a case insensitive comparison), Wine will pick the right one if the filename given matches one of the names in a case sensitive way, but will pick one of the two (without defining which one) if the filename given matches neither name in a case sensitive way, but both in a case insensitive way. For example, if the two filenames are my_neat_file.txt and My_Neat_File.txt, Wine's behavior when opening MY_neat_FILE.txt is undefined.
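The lookup order described above (exact match first, then a case insensitive scan, one directory at a time) can be sketched as follows. This is an illustration only, not Wine's code: lookup_component and resolve_case_insensitive are hypothetical names, and the sketch lists whole directories where a real implementation would be more careful about cost.

```python
import os

def lookup_component(directory, name):
    """Find name in directory: exact match first, then case insensitive."""
    entries = os.listdir(directory)
    if name in entries:
        return name                  # case sensitive match wins
    wanted = name.lower()
    for entry in entries:
        if entry.lower() == wanted:
            return entry             # first hit; listing order is unspecified
    return None

def resolve_case_insensitive(root, components):
    """Walk the path components below root, one directory at a time."""
    current = root
    for component in components:
        found = lookup_component(current, component)
        if found is None:
            return None
        current = os.path.join(current, found)
    return current
```

The "first hit" in the case insensitive scan is exactly where the undefined behavior comes from: with my_neat_file.txt and My_Neat_File.txt in the same directory, which one is returned for MY_neat_FILE.txt depends on the order in which the directory is listed.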

As Windows, in its early days, didn't support the notion of symbolic links on directories, lots of applications (and some old native DLLs) are not ready for this feature. Mainly, they assume that the directory structure is a tree, which has lots of consequences for navigating the forest of directories (i.e. there cannot be two ways of going from one directory to another, there cannot be cycles...). In order to prevent bad behavior in such applications, Wine provides an option. By default, symbolic links on directories are not followed by Wine. There's an option to follow them (see the Wine User Guide), but this could be harmful.

Wine considers that Unix file names are long filenames. This seems a reasonable approach; this is also the approach followed by most Unix OSes when mounting Windows partitions (with filesystems like FAT, FAT32 or NTFS). Therefore, Wine tries to support short names as best it can. Basically, there are two options:

  • The filesystem on which the inspected directory lies is a real Windows FS (like FAT, FAT32 or NTFS) and the OS has support for accessing the short filename (for example, Linux does this on FAT, FAT32 or VFAT). In this case, Wine makes full use of this information and really mimics the Windows behavior: the short filename used for any file is the same as on Windows.

  • If the conditions listed above are not met (either the FS has no physical short name support, or the OS doesn't provide access to the short name), Wine computes the short filename for a given long filename on its own. We cannot ensure that the generated short name is the same as on Windows (because the algorithm on Windows takes into account the order of creation of files, which cannot be implemented in Wine: Wine would have to cache the short names of every directory it uses!). The short name is made up of part of the long name (the first characters), with the rest being a hashed value. This has several advantages:

    • The algorithm is rather simple and low cost.

    • The algorithm is stateless (it doesn't depend on the other files in the directory).

    But it also has the drawbacks of its advantages:

    • The algorithm isn't the same as on Windows, which means a program cannot use short names generated on Windows. This could happen when copying an existing installed program from Windows (for example, on a dual boot machine).

    • Two long file names can end up with the same short name (Windows handles the collision in this case, while Wine doesn't). We rely on our hash algorithm to make this as unlikely as possible (even if it can still happen).
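To make the idea of a stateless, hash-based short name concrete, here is a small sketch. It is not Wine's algorithm: the layout (four characters, a '~', three hexadecimal digits) and the use of CRC-32 are assumptions chosen for this illustration, as is the name short_name.

```python
import zlib

def _keep_alnum(text):
    """Upper-case text and drop everything but ASCII letters and digits."""
    return "".join(c for c in text.upper() if c.isascii() and c.isalnum())

def short_name(long_name):
    """Derive an 8+3 name from long_name without looking at the directory."""
    stem, dot, ext = long_name.rpartition(".")
    if not dot:                          # no extension at all
        stem, ext = long_name, ""
    # Hash the lower-cased name so that lookups stay case insensitive.
    digest = zlib.crc32(long_name.lower().encode("utf-8")) & 0xFFF
    base = _keep_alnum(stem)[:4] + "~" + format(digest, "03X")
    suffix = _keep_alnum(ext)[:3]
    return base + ("." + suffix if suffix else "")
```

The function only looks at its argument, which is what makes it cheap and stateless; it is also why two different long names can collide, since only 12 bits of hash are kept in this sketch.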

Wine also allows a full Unix path name to be given as a parameter in most file APIs. This is handy when running a Wine (or Winelib) program from the command line, as one doesn't need to convert the path into the Windows form. However, Wine checks that the Unix path given can be accessed from one of the defined drives, ensuring that only part of the Unix / hierarchy can be accessed.

As a side note, as Unix doesn't widely provide a Unicode interface to filenames, and Windows implements filenames as Unicode strings (even on the physical layer with NTFS; the FAT variants are ANSI), we need to properly map between the two. At startup, Wine defines what's called the Unix Code Page, that is, the code page the Unix kernel uses as a reference for strings. Wine then uses this code page for all the mappings it has to do between a Unicode path (on the Windows side) and an ANSI path to be used in a Unix path API. Note that this will work as long as a disk isn't mounted with a different code page than the one the kernel uses as a default.

We describe below how Windows devices are mapped to Unix devices. Before that, let's finish the pure file round-up with some basic operations.

8.6.2.3. Access rights and file attributes

Now that we have looked at how Wine converts a Windows pathname into a Unix one, we need to cover the various metadata attached to a file or a directory.

In Windows, access rights are simplistic: a file can be read-only or read-write. Wine sets the read-only flag if the file doesn't have the Unix user-write flag set. As a matter of fact, there's no way for Wine to report that a file cannot be read (that notion doesn't exist under Windows): the file will be seen, but trying to open it will return an error. The Unix exec-flag is never reported, and Wine doesn't use it to allow or forbid running a new process (as Unix does).

Last but not least: hidden files. This exists on Windows but not really on Unix! To be exact, in Windows, the hidden flag is a piece of metadata associated with any file or directory; in Unix, it's a convention based on the syntax of the file name (whether it starts with a '.' or not). Wine implements two behaviors (chosen by configuration), which affect file and directory names starting with a '.'. In the first mode (ShowDotFile is FALSE), every file or directory whose name starts with '.' is returned with the hidden flag turned on. This is the natural behavior on Unix (for ls or even file explorers). In the second mode (ShowDotFile is TRUE), Wine never sets the hidden flag, hence every file will be seen.
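The two attribute rules above can be sketched as a single function. The helper windows_attributes and its show_dot_files parameter are hypothetical names for this illustration; the two constants carry the values the Win32 headers give them.

```python
import os
import stat

FILE_ATTRIBUTE_READONLY = 0x01
FILE_ATTRIBUTE_HIDDEN = 0x02

def windows_attributes(unix_path, show_dot_files=False):
    """Compute the read-only and hidden attributes for a Unix file."""
    mode = os.stat(unix_path).st_mode
    attrs = 0
    if not mode & stat.S_IWUSR:
        attrs |= FILE_ATTRIBUTE_READONLY   # no user-write bit: read-only
    name = os.path.basename(unix_path)
    if name.startswith(".") and not show_dot_files:
        attrs |= FILE_ATTRIBUTE_HIDDEN     # Unix dot file convention
    return attrs                           # the exec bit is never reported
```

Note that the read bits play no role here: an unreadable file still shows up with ordinary attributes, and the failure only appears when the file is opened.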

Finally, before opening a file, Windows uses sharing attributes to check whether the file can be opened. For example, the first process in the system to open a given file can forbid, for as long as it keeps the file open, another process from opening it for write access, while still granting opens for read access. This is fully supported in Wine by moving all those checks into the Wine server, which has a global view of the system. Note that what's moved into the Wine server is only the check made when the file is opened, to implement the Windows sharing semantics. Further operations on the file (like reading and writing) do not require heavy support from the server.

The other good reason for putting the code for actually opening a file in the server is that an opened file in Windows is managed through a handle, and handles can only be created in Wine server!
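The sharing check lends itself to a small model: a single table, owned by one process, records how each file is currently open, and every new open is compared against it. The sketch below is a simplification and not the Wine server's code; SharingTable and compatible are hypothetical names, delete sharing is left out, and closing a file is not modeled. The constants carry their Win32 values.

```python
FILE_SHARE_READ, FILE_SHARE_WRITE = 0x1, 0x2
GENERIC_READ, GENERIC_WRITE = 0x80000000, 0x40000000

def compatible(access, sharing):
    """True if the requested access is allowed by the given sharing mode."""
    if access & GENERIC_READ and not sharing & FILE_SHARE_READ:
        return False
    if access & GENERIC_WRITE and not sharing & FILE_SHARE_WRITE:
        return False
    return True

class SharingTable:
    """System wide record of who has each file open, and how."""

    def __init__(self):
        self.opens = {}  # path -> list of (access, sharing) pairs

    def try_open(self, path, access, sharing):
        for other_access, other_sharing in self.opens.get(path, []):
            # The check is symmetric: the new open must respect existing
            # sharing modes, and existing opens must respect the new one.
            if not compatible(access, other_sharing):
                return False
            if not compatible(other_access, sharing):
                return False
        self.opens.setdefault(path, []).append((access, sharing))
        return True
```

If a first caller opens a file for read and write access while sharing it for reading only, a later attempt to open it for writing is refused, while a later open for reading succeeds as long as it shares both read and write.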

Just a note about attributes on directories: while we can easily map the meaning of Windows FILE_ATTRIBUTE_READONLY on a file, we cannot do so for a directory. With Windows semantics, this flag (when set) means "do not delete the directory", while the lack of the w attribute in Unix means "don't write to it nor delete it". Therefore, Wine uses an asymmetric mapping here: if the directory (in Unix) isn't writable, then Wine reports the FILE_ATTRIBUTE_READONLY attribute; the other way around, when asked to set the FILE_ATTRIBUTE_READONLY attribute on a directory, Wine simply does nothing.

8.6.2.4. Operations on file

8.6.2.4.1. Reading and writing

Reading and writing are the basic operations on files. Wine of course implements them, basing the implementation on client side calls to the Unix equivalents (like read() or write()). Note that the Wine server is involved in any read or write operation, as Wine needs to transform the Windows handle to the file into a Unix file descriptor it can pass to any Unix file function.

8.6.2.4.2. Getting a Unix fd

This is a major step in any file related operation. Basically, each file opened (at the Windows level) is first opened in the Wine server, where the fd is stored. Then Wine (on the client side) uses recvmsg() to receive the fd passed from the Wine server process to the client process. Since this operation could be lengthy, Wine implements a cache mechanism so that the fd is sent only once, but getting an fd from a handle on a file (or any other Unix object which can be manipulated through a file descriptor) still requires a round trip to the Wine server.
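Passing a descriptor between processes relies on the ancillary data of sendmsg() and recvmsg() on a Unix domain socket. The sketch below shows the mechanism with Python's socket.send_fds() and socket.recv_fds() wrappers (Python 3.9 or later, Unix only); the helper names send_fd and receive_fd and the one-byte payload are assumptions of this example and say nothing about Wine's actual wire protocol.

```python
import os
import socket

def send_fd(sock, fd):
    """Server side: ship an open descriptor over a Unix domain socket."""
    # At least one byte of ordinary data must accompany the descriptor.
    socket.send_fds(sock, [b"F"], [fd])

def receive_fd(sock):
    """Client side: receive the descriptor (a recvmsg() under the hood)."""
    _data, fds, _flags, _addr = socket.recv_fds(sock, 1, 1)
    return fds[0]
```

The descriptor that comes out on the receiving side has a different number but refers to the same open file, so the receiver can read() or write() it directly without further help from the sender.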

8.6.2.4.3. Locking

Windows provides file locking capabilities. When a lock is set (and a lock can be set on any contiguous range in a file), it controls how other processes in the system can access that range of the file. Since locked ranges on a file are defined in a system wide manner, the implementation resides in wineserver. It tries to make use of Unix file locking (if the underlying OS and the mounted disk where the file sits support this feature) with fcntl() and the F_SETLK command. If this isn't supported, then wineserver just pretends it works.
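A range lock with a "pretend it works" fallback might look like the sketch below. Python's fcntl.lockf() is a thin wrapper over fcntl() record locks; the helper names lock_range and unlock_range are hypothetical, and the choice of ENOLCK and EOPNOTSUPP as the "locking not supported" errors is an assumption of this example.

```python
import errno
import fcntl

def lock_range(fd, offset, length, exclusive=True):
    """Try to lock [offset, offset + length) without blocking."""
    kind = fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH
    try:
        fcntl.lockf(fd, kind | fcntl.LOCK_NB, length, offset)
    except OSError as err:
        if err.errno in (errno.EACCES, errno.EAGAIN):
            return False          # another process holds a conflicting lock
        if err.errno not in (errno.ENOLCK, errno.EOPNOTSUPP):
            raise                 # a genuine error, e.g. a bad descriptor
        # Locking is unavailable on this file system: pretend it worked.
    return True

def unlock_range(fd, offset, length):
    """Release a range previously locked with lock_range()."""
    fcntl.lockf(fd, fcntl.LOCK_UN, length, offset)
```

Keep in mind that these Unix locks are per process: two locks taken by the same process never conflict with each other, which is one more reason to keep the bookkeeping in a single server process.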

8.6.2.4.4. I/O control

So far, there's been no need to implement DeviceIoControl() support for files and directories. Windows does support it, but only for very specific needs (like compression management, or file system related information). This isn't the case for devices (including disks), which we'll cover in the section on devices below.

8.6.2.4.5. Buffering

Wine doesn't do any buffering on file accesses but relies on the underlying Unix kernel for that (when possible). This scheme is needed because it's easier to implement multiple accesses to the same file at the kernel level than at the Wine level. Doing lots of small reads on the same file can turn into a performance bottleneck, because each read operation needs a round trip to the server in order to get a file descriptor (see above).

8.6.2.4.6. Overlapped I/O

Windows introduced the notion of overlapped I/O. Basically, it just means that an I/O operation (think read / write to start with) will not wait until it's completed, but rather return to the caller as soon as possible, and let the caller handle the wait operation and determine when the data is ready (for a read operation) or has been sent (for a write operation). Note that the overlapped operation is linked to a specific thread.

This has several benefits: a server can handle several clients without requiring multi-threading techniques, and you can handle an event driven model more easily (e.g. properly shutting down a server while it waits in a lengthy read() operation).

Note that Microsoft's support for this feature evolved along the various versions of Windows. For example, Windows 95 and 98 only support overlapped I/O for serial and parallel ports, while NT also supports files, disks, sockets, pipes, and mailslots.

Wine implements overlapped I/O operations. This is mainly done by queuing in the server a request that will be triggered when the current state changes (like data becoming available for a read operation). This readiness is signaled to the calling process by queuing a specific APC, which will be called within the next wait operation the thread performs. This APC then does the hard work of the I/O operation. This scheme makes it possible to put a wait mechanism in place and to attach a routine to be called (in the thread's context) when the state changes, all in a rather transparent manner (embedded in any generic wait operation). However, it isn't 100% perfect. As the heavy operations are done in the context of the calling thread, lengthy operations have an impact on that thread, especially on its latency. In order to provide effective support for overlapped I/O operations, we would need to rely on Unix kernel features (AIO is a good example).
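The interplay between the queued request, the APC and the wait operation can be modeled in a few lines. This is a single-process toy, not Wine's implementation: SimulatedThread, start_overlapped_read and the server_poll closure are hypothetical names, and the "server" is reduced to a select() call on the descriptor.

```python
import os
import select
from collections import deque

class SimulatedThread:
    """Stand-in for a client thread with a queue of pending APCs."""

    def __init__(self):
        self.apcs = deque()

    def queue_apc(self, func):
        self.apcs.append(func)

    def wait(self):
        """A wait operation: run every queued APC in this thread's context."""
        while self.apcs:
            self.apcs.popleft()()

def start_overlapped_read(thread, fd, size, results):
    """Start a read that returns at once; return the server's poll step."""
    def apc():
        results.append(os.read(fd, size))   # the real I/O happens in the APC

    def server_poll():
        readable, _, _ = select.select([fd], [], [], 0)
        if readable:
            thread.queue_apc(apc)           # state changed: signal the thread
        return bool(readable)

    return server_poll
```

The sketch also shows the limitation mentioned above: the read itself runs inside wait(), in the calling thread, so a slow read delays whatever that thread wanted to do next.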

8.6.2.5. Devices & volume management

We've covered so far the ways file names are mapped into Unix paths. We still need to cover devices. Like regular files, devices are manipulated in Windows with read / write operations, but also with control mechanisms (speed or parity of a serial line, volume name of a hard disk, ...). Since this is also supported in Linux, Wine needs to open (in the Unix sense) a device when given a Windows device name. This section applies to DOS device names, which are seen in NT as nicknames to other devices.

Firstly, Wine implements the Win32 to NT mapping as described above, hence every device path (in the NT sense) is of the following form: \??\devicename (or \DosDevices\devicename). As Windows device names are case insensitive, Wine also converts them to lower case before any operation.

Then, the first thing Wine does is check whether $(WINEPREFIX)/dosdevices/devicename exists. If so, it's used as the final Unix path for the device. The configuration process is in charge of creating, for example, a symbolic link between $(WINEPREFIX)/dosdevices/PhysicalDrive0 and /dev/hda0.

If such a link cannot be found, and the device name looks like a DOS disk name (like C:), Wine first tries to get the Unix device from the path $(WINEPREFIX)/dosdevices/c: (i.e. the device which is mounted on the target of the symbolic link); if this doesn't give a Unix device, Wine checks whether $(WINEPREFIX)/dosdevices/c:: exists. If so, it's assumed to be a link to the actual Unix device. For example, for a CD-ROM, $(WINEPREFIX)/dosdevices/e:: would be a symbolic link to /dev/cdrom. If this doesn't exist (we're still handling a device name of the C: form), Wine tries to get the Unix device from the system information (/etc/mtab and /etc/fstab on Linux). We cannot apply this method in all cases, because we have no assurance that the directory can actually be found. One could have, for example, a CD-ROM drive used only as an audio CD player (i.e. never mounted), so that no information about the device itself is available.

If none of this works, some basic names are checked: if the device name is NUL, then /dev/null is returned. If the device name is a default serial name (COM1 up to COM9) or a default printer name (LPT1 up to LPT9), then Wine tries to open the Nth serial line (resp. printer) in the system. Otherwise, some basic old DOS name support is provided (AUX is transformed into COM1 and PRN into LPT1), and the whole process is retried with those new names.

To sum up:

Table 8-3. Mapping of Windows device names into Unix device names

Windows device name | NT device name | Mapping to Unix device name
<any_path>AUX | \Global??\AUX | Treated as an alias to COM1
<any_path>PRN | \Global??\PRN | Treated as an alias to LPT1
<any_path>COM1 | \Global??\COM1 | $(WINEPREFIX)/dosdevices/com1 (if the symbolic link exists) or the Nth serial line in the system (on Linux, /dev/ttyS0).
<any_path>LPT1 | \Global??\LPT1 | $(WINEPREFIX)/dosdevices/lpt1 (if the symbolic link exists) or the Nth printer in the system (on Linux, /dev/lp0).
<any_path>NUL | \Global??\NUL | /dev/null
\\.\E: | \Global??\E: | $(WINEPREFIX)/dosdevices/e:: (if the symbolic link exists) or guessing the device from /etc/mtab or /etc/fstab.
\\.\<device_name> | \Global??\<device_name> | $(WINEPREFIX)/dosdevices/<device_name> (if the symbolic link exists).
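The fallback chain summarized in the table can be sketched as a loop that retries once an old DOS alias has been rewritten. This is a simplified illustration: resolve_device and DOS_ALIASES are hypothetical names, the Linux device names are the ones given in the table, and the lookups through the mounted directory, /etc/mtab and /etc/fstab are deliberately left out.

```python
import os
import re

DOS_ALIASES = {"aux": "com1", "prn": "lpt1"}

def resolve_device(name, wineprefix):
    """Return a Unix path for a DOS device name, or None if unknown."""
    name = name.lower()                    # device names are case insensitive
    dosdevices = os.path.join(wineprefix, "dosdevices")
    while True:
        if re.fullmatch(r"[a-z]:", name):
            # Drive letter: look for the raw device link, e.g. e:: -> /dev/cdrom.
            raw = os.path.join(dosdevices, name + ":")
            return raw if os.path.lexists(raw) else None
        link = os.path.join(dosdevices, name)
        if os.path.lexists(link):
            return link                    # an explicit link always wins
        if name == "nul":
            return "/dev/null"
        match = re.fullmatch(r"(com|lpt)([1-9])", name)
        if match:
            base = "/dev/ttyS" if match.group(1) == "com" else "/dev/lp"
            return base + str(int(match.group(2)) - 1)
        if name in DOS_ALIASES:
            name = DOS_ALIASES[name]       # retry the whole process
            continue
        return None
```

Since the explicit link is checked before the built-in defaults, a user can redirect COM1 (and therefore AUX) to any Unix device simply by creating $(WINEPREFIX)/dosdevices/com1.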

Now that we know which Unix device to open for a given Windows device, let's cover the operations on it. These can be read / write operations, I/O control, and even others.

Read and write operations are supported on real disks & CDROM devices, under several conditions:

  • Foremost, as the ReadFile() and WriteFile() calls are mapped onto the Unix read() and write() calls, the user (from the Unix perspective, the one running the Wine executable) must have read (resp. write) access to the device. It wouldn't be wise to let a user write directly to a hard disk!

  • Block sizes for read and write must be the size of a physical block (generally 512 bytes for a hard disk; for a CD, it depends on the type of CD used), and offsets must also be a multiple of the block size.
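The alignment condition is simple arithmetic, shown here as a sketch. Both helper names are hypothetical; the default of 512 bytes is the usual hard disk block size mentioned above, and "a multiple of the block size" is taken as the requirement for lengths as well as offsets.

```python
def is_block_aligned(offset, length, block_size=512):
    """True if a raw device access starts and ends on block boundaries."""
    return (length > 0
            and offset % block_size == 0
            and length % block_size == 0)

def aligned_span(offset, length, block_size=512):
    """Widen [offset, offset + length) to whole blocks: (start, size)."""
    start = offset - offset % block_size
    end = -(-(offset + length) // block_size) * block_size  # round up
    return start, end - start
```

A caller that wants 100 bytes at offset 700 would therefore have to read the whole block starting at offset 512 and extract the bytes it needs from that buffer.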

Wine also reads (if the first condition above about access rights is met) the volume information from a hard disk or a CD ROM to be displayed to a user.

Wine also recognizes VxDs as devices, but those VxDs must be the Wine builtin ones (Wine will never allow loading native VxDs). They are configured with symbolic links in the $(WINEPREFIX)/dosdevices/ directory, which point to the actual builtin DLL. This DLL exports a single entry point, which Wine uses when a call to DeviceIoControl is made with a handle opened to this VxD. This provides some compatibility for old Win9x apps that still talk directly to VxDs. As VxDs are no longer supported on Windows NT, newer programs are less likely to make use of this feature, so we don't expect lots of development in this area, even though the framework is there and working. Note also that Wine doesn't provide support for native VxDs (as a game, count how many times this information is written in the documentation; as an advanced exercise, find how many more occurrences we need in order to stop questions about whether it's possible or not).