Proposed interface for file operations by the file dialog code (dlls\commdlg\filedlg*.c)
wine at troy.rollo.name
Mon Aug 15 21:55:35 CDT 2005
Work is currently proceeding on a branched version to create additional APIs
for WINE that use UNIX path names rather than Windows ones. This is useful
for Winelib apps and seeks to make them look more like they are native apps,
thereby addressing some of the complaints that Winelib apps are somehow of
lesser status than ports using other APIs. After some discussion, it was
decided that this would remain a branch until at least such time as the
implementation was proven to work in real-world situations.
Part of this involves producing file dialog APIs that operate appropriately
in this context. That includes taking UNIX path names on input, producing
UNIX path names on output and in callbacks to the application, and browsing a
hierarchy that does not include Windows-isms such as drive letters.
Making these modifications directly in the existing code would produce a set
of differences that would cause significant headaches for branch maintenance
and any future merge-back to WineHQ. The objective is to reduce the
differences so as to improve compatibility between the branches.
The patch referenced above took a minimal-change approach to this problem by
implementing an interface that mostly implemented the small operations made
by the existing code, without even putting in local stubs (hence the
inconsistent calling conventions in the interface).
The general principle I have used is that path names should, as far as
possible, be opaque so that the file dialog code itself never examines their
contents directly, but rather calls functions in the interface to extract or
locate particular portions of the path, to modify or concatenate paths or to
make use of the paths.
The minimalist interface:
HRESULT WINAPI (*get_top_folder) (IShellFolder **);
LPITEMIDLIST (*get_pidl_from_name)(IShellFolder *, LPWSTR);
BOOL (*get_display_name) (LPCITEMIDLIST, LPWSTR);
BOOL WINAPI (*change_directory) (LPCWSTR);
UINT WINAPI (*get_directory) (UINT, LPWSTR);
void (*qualify_path) (LPWSTR, LPCWSTR);
void (*complete_path) (LPWSTR);
BOOL (*has_invalid_char) (LPCWSTR);
LPWSTR WINAPI (*find_next_component) (LPCWSTR);
LPWSTR WINAPI (*find_file_name) (LPCWSTR);
LPWSTR WINAPI (*find_extension) (LPCWSTR);
BOOL WINAPI (*file_exists) (LPCWSTR);
BOOL WINAPI (*is_directory) (LPCWSTR);
LPWSTR WINAPI (*add_dir_sep) (LPWSTR);
DWORD WINAPI (*get_full_path) (LPCWSTR, DWORD, LPWSTR, LPWSTR*);
Minimalist interface vs ideal interface:
On IRC this morning Alexandre said he would prefer a well-designed interface
to the minimalist approach, hence this discussion.
Since the interface is entirely internal to commdlg, I will use the cdecl
calling convention.
Detailed discussion follows.
The code_page member was put there because the UNIX file name APIs may not use
the same code page as other A entry points. WINE uses CP_ACP (as does Windows
- although CP_THREAD_ACP is subject to further investigation) for most
purposes, but CP_UNIXCP for UNIX path names when translating them to UTF16.
Often CP_UNIXCP will be something like UTF8, or it may be ISO8859-1 in
situations where Windows would use CP1252. It may be that a Winelib
application should have CP_ACP set to be the UNIX code page, but it may not
(and unless it does something special, will not).
So places where A->W or W->A conversions happen on path names need to make
sure they use the right code page in the context. The minimalist approach was
to make this a data member, but the ideal approach would be to have functions
which performed the appropriate conversions. Looking through the existing
code, the conversions are performed in some contexts where the output is
allocated, and others where the output is a fixed size buffer. If we want to
have just one conversion function for each direction, this would give us:
CHAR *filename_wtoa(WCHAR const *in, CHAR *out, int bufrange);
WCHAR *filename_atow(CHAR const *in, WCHAR *out, int bufrange);
"in" is the input buffer.
"out" is the output buffer (NULL if we want the method to allocate).
"bufrange" is the number of elements pointed to by "out".
The return value is a pointer to the file name on success, and is either the
value of "out", or an allocated buffer where "out" is NULL. On failure the
return value is NULL.
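
A minimal sketch of the buffer convention described above, for the W->A
direction only. This is not the proposed implementation: the name
filename_wtoa_sketch is hypothetical, plain wchar_t/char stand in for
WCHAR/CHAR, and the conversion itself is a placeholder that only accepts
7-bit ASCII where a real version would convert using CP_UNIXCP.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical sketch: "out" == NULL means "allocate for me", otherwise
 * "out" points to "bufrange" elements. Returns the file name, or NULL. */
char *filename_wtoa_sketch(const wchar_t *in, char *out, int bufrange)
{
    size_t len, i;
    int allocated = 0;

    assert(in != NULL);
    len = wcslen(in);

    if (out == NULL) {
        out = malloc(len + 1);
        if (out == NULL)
            return NULL;
        allocated = 1;
    } else if (bufrange <= 0 || len + 1 > (size_t)bufrange) {
        return NULL;                    /* fixed-size buffer too small */
    }

    for (i = 0; i < len; i++) {
        /* Placeholder conversion (assumption): ASCII only. */
        if ((unsigned long)in[i] > 0x7f) {
            if (allocated)
                free(out);
            return NULL;
        }
        out[i] = (char)in[i];
    }
    out[len] = '\0';
    return out;
}
```

With this shape a caller holding a stack buffer passes it together with its
element count, while a caller that wants an allocated result passes NULL and
later frees the return value.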
The sep_char is perhaps the rudest part of the minimalist interface since it
does not treat the path names as opaque. It is used in the following
contexts: The handling of CDM_GETFILEPATH, where it is used to paste the file
name and directory name together; and in FILEDLG95_InitControls, where it is
used to determine if the input file name has a path component.
With an ideal interface the CDM_GETFILEPATH handling would be changed to use a
general path qualification function. Determining if the input file name has a
path component could be handled in one of two ways: with a method for
querying this; or by searching for the start of the file name component and
testing if that is the start of the string [i.e. find_file_name(input) !=
input].
I prefer the latter method since it means one less entry in the interface, but
if the boolean function were preferred it might appear as:
BOOL has_path(char const *filename);
get_top_folder exists because the Windows path versions of the dialog use the
Desktop folder as their top level, but a UNIX path version should arguably
use the UNIX root as its top level. Operations retrieving the top level
folder (SHGetDesktopFolder) appear in many places. I would prefer to return
the pointer though, hence:
The next three functions in the minimal interface (get_pidl_from_name;
get_display_name; and get_folder_from_pidl) are functions that are already
implemented for Windows path names and are used to handle conversions between
item ID lists and path names. Unless somebody thinks the existing
implementations are in need of reworking, I don't see any reason not to
include them as is in the interface.
The next two functions (change_directory; and get_directory) are currently
direct calls to SetCurrentDirectoryW and GetCurrentDirectoryW in the default
implementation of the interface. In the UNIX path versions there is some
difficulty in how these should be handled for reasons that are too complex to
go into here, but by having the interface at least a start can be made on
figuring out how to deal with these. GetCurrentDirectoryW is usually called
in contexts where the buffer is allocated (albeit wastefully), but in one
case is called on a stack buffer. I am inclined to have the stack buffer
replaced by an allocated one, hence:
BOOL set_directory(WCHAR const *dir);
Next comes qualify_path, which is used to generate a fully qualified and
canonical (no '/../' sequences) path name given a directory name and a
(possibly already fully qualified) path name. The minimalist declaration is
based on the way this operation was implemented in FILEDLG95_OnOpen, but in
accordance with my preference for allocating string buffers I would prefer:
WCHAR *qualify_path(WCHAR const *path, WCHAR const *dir);
It may be that if dir == NULL the function would use the current directory.
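
To illustrate what a UNIX path version of qualify_path might do, here is a
rough sketch. Everything in it is an assumption rather than part of the
proposal: the name qualify_path_sketch, the use of wchar_t in place of WCHAR,
the requirement that "dir" be absolute, and the purely textual treatment of
"." and ".." (symbolic links are not resolved). The dir == NULL case
mentioned above is left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical sketch: returns an allocated, canonical absolute path.
 * The caller frees the result. */
wchar_t *qualify_path_sketch(const wchar_t *path, const wchar_t *dir)
{
    size_t dlen = 0, plen, len, w = 0;
    wchar_t *buf, *r, *start;

    assert(path != NULL);
    if (path[0] != L'/') {              /* relative: needs a directory */
        assert(dir != NULL && dir[0] == L'/');
        dlen = wcslen(dir);
    }
    plen = wcslen(path);

    buf = malloc((dlen + plen + 2) * sizeof(wchar_t));
    if (buf == NULL)
        return NULL;
    if (dlen != 0) {                    /* buf = dir + "/" + path */
        wmemcpy(buf, dir, dlen);
        buf[dlen] = L'/';
        wmemcpy(buf + dlen + 1, path, plen + 1);
    } else {                            /* path was already absolute */
        wmemcpy(buf, path, plen + 1);
    }

    /* Canonicalize in place: the write index w never passes the reader r. */
    r = buf;
    while (*r != L'\0') {
        while (*r == L'/')
            r++;
        if (*r == L'\0')
            break;
        start = r;
        while (*r != L'\0' && *r != L'/')
            r++;
        len = (size_t)(r - start);
        if (len == 1 && start[0] == L'.')
            continue;                   /* drop "." */
        if (len == 2 && start[0] == L'.' && start[1] == L'.') {
            while (w > 0 && buf[--w] != L'/')
                ;                       /* drop the previous component */
            continue;
        }
        buf[w++] = L'/';
        wmemmove(buf + w, start, len);
        w += len;
    }
    if (w == 0)
        buf[w++] = L'/';                /* everything cancelled: root */
    buf[w] = L'\0';
    return buf;
}
```

Because the result is allocated, the caller does not need to guess a maximum
path length, which matches the preference for allocated string buffers
stated above.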
complete_path is only used in one place - in FILEDLG95_OnOpen where the
routine walks through its path elements, and tacks on a trailing backslash to
paths like "c:" (which will be the first component of the path for
"c:\windows\system.ini"). find_next_component (currently set to
PathFindNextComponentW) is only used in that same context, so perhaps the
better solution is to combine these two with:
WCHAR *next_component(WCHAR const *path, WCHAR const *last);
"last" is the most recent return value (or NULL for the first call).
"path" is the input path.
The return value is an allocated string containing the next path component,
so you would get "c:\", "windows", "system.ini" as return values.
has_invalid_char is used to test if the path name contains any invalid
characters. The current default implementation is wrong, but reflects what
was already there. The general rule is that Win32 paths (at least in the file
dialog) should not contain '/', ':' (except as a drive letter (*)), '<', '>',
and '|'. IIRC, wild cards are forbidden in file names, but the filedlg code
does not treat them as invalid because they are valid for entry into the edit
box of the file dialog. Under UNIX, there are no invalid file name characters
- any string is a valid file name although it may be a relative file name. I
would keep this function but rename it:
BOOL valid_file_name(WCHAR const *filename);
(*) can the file dialogs address stream names under NT? Is it meaningful to do
the same under Wine since UNIX has no concept of file sub-streams?
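
As an illustration of the rule stated above, the two variants of this check
might look roughly as follows. The function names are hypothetical, wchar_t
stands in for WCHAR, and the treatment of ':' (allowed only as the second
character after a letter) is one assumed reading of "except as a drive
letter". Wild cards are deliberately accepted, as in the existing code.

```c
#include <assert.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical Win32-path check: rejects '/', '<', '>', '|' and any ':'
 * that is not part of a leading drive letter such as "c:". */
int valid_win32_file_name_sketch(const wchar_t *filename)
{
    size_t i;

    assert(filename != NULL);
    for (i = 0; filename[i] != L'\0'; i++) {
        switch (filename[i]) {
        case L'/':
        case L'<':
        case L'>':
        case L'|':
            return 0;
        case L':':
            if (i != 1 || !iswalpha((wint_t)filename[0]))
                return 0;
            break;
        default:
            break;
        }
    }
    return 1;
}

/* Hypothetical UNIX-path check: any string is a valid file name. */
int valid_unix_file_name_sketch(const wchar_t *filename)
{
    assert(filename != NULL);
    return 1;
}
```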
find_file_name and find_extension find the final path component, and the
extension (if any) in the file name. These are fairly pure functions for
handling otherwise opaque path names, so they would remain as is but without
the WINAPI calling convention:
WCHAR *find_file_name(WCHAR *filename);
WCHAR *find_extension(WCHAR *filename);
Next come file_exists and is_directory. These could be replaced by a single
routine that requests the type of the file and returns -1 if the file does
not exist:
int get_file_type(WCHAR const *filename);
Returns (with optional symbolic constants):
-1: no such file
0: ordinary file
This would also simplify some other code where is_directory is called
immediately after file_exists.
add_dir_sep is used when pasting a path name and a wildcard string. If there
are no objections, qualify_path could be used for such situations, thereby
avoiding the need for a separate add_dir_sep.
get_full_path is currently a call to GetFullPathNameW. It is used in 3 places
in FILEDLG95_InitControls. Each time it is used to do three things: (1)
convert any 8.3 filenames to the long path name version; (2) extract the file
name portion (if any) of the resulting long path; and (3) extract the
directory name portion, storing the two in separate locations. I would
simplify the call, allocating the result:
WCHAR *get_full_path(WCHAR const *path);
The file name component would then be discovered by a call to find_file_name.
Obvious candidates for changes:
1. find_extension could be implemented using find_file_name in a way that
obviates the need for a separate find_extension.
2. Might find_file_name be implemented in terms of next_component? This likely
depends on the behaviour of next_component with a path like "f:\windows\" -
if it only returns "f:\" and "windows" then it would not be suitable, but if
it returns "f:\", "windows" and "" then it would.
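
Point 1 can be sketched for UNIX path names as below. The function names are
hypothetical and wchar_t stands in for WCHAR. Returning a pointer to the
terminating NUL when there is no extension is an assumption modelled on
PathFindExtensionW, and a leading dot (as in ".profile") is treated as the
start of an extension for simplicity.

```c
#include <assert.h>
#include <wchar.h>

/* Hypothetical UNIX-path version: the file name starts after the last '/'. */
wchar_t *find_file_name_sketch(wchar_t *filename)
{
    wchar_t *sep;

    assert(filename != NULL);
    sep = wcsrchr(filename, L'/');
    return sep != NULL ? sep + 1 : filename;
}

/* The extension is searched for only inside the file name component, so a
 * dot in a directory name is never mistaken for an extension. */
wchar_t *find_extension_sketch(wchar_t *filename)
{
    wchar_t *name = find_file_name_sketch(filename);
    wchar_t *dot = wcsrchr(name, L'.');

    return dot != NULL ? dot : name + wcslen(name);
}
```

If the second function is this thin, callers could do the same search
themselves, which is the sense in which a separate find_extension entry
might not be needed.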
CHAR *(*filename_wtoa)(WCHAR const *in, CHAR *out, int bufrange);
WCHAR *(*filename_atow)(CHAR const *in, WCHAR *out, int bufrange);
LPITEMIDLIST (*get_pidl_from_name)(IShellFolder *, LPWSTR);
BOOL (*get_display_name)(LPCITEMIDLIST, LPWSTR);
BOOL (*set_directory)(WCHAR const *dir);
int (*get_file_type)(WCHAR const *filename);
WCHAR *(*qualify_path)(WCHAR const *path, WCHAR const *dir);
WCHAR *(*get_full_path)(WCHAR const *filename);
WCHAR *(*next_component)(WCHAR const *path, WCHAR const *last);
WCHAR *(*find_file_name)(WCHAR *filename);
WCHAR *(*find_extension)(WCHAR *filename);
BOOL (*valid_file_name)(WCHAR const *filename);
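
To show how a table of such entries keeps path names opaque to the dialog
code, here is a cut-down sketch with a single entry. The structure name, the
two implementations and the has_path helper are all hypothetical; wchar_t
stands in for WCHAR, and the separator sets are assumptions.

```c
#include <assert.h>
#include <wchar.h>

/* Hypothetical one-entry version of the interface table. */
struct file_ops_sketch {
    wchar_t *(*find_file_name)(wchar_t *filename);
};

/* Returns a pointer just past the last character found in "seps". */
static wchar_t *after_last(wchar_t *s, const wchar_t *seps)
{
    wchar_t *name = s;

    for (; *s != L'\0'; s++)
        if (wcschr(seps, *s) != NULL)
            name = s + 1;
    return name;
}

wchar_t *win_find_file_name(wchar_t *filename)
{
    assert(filename != NULL);
    return after_last(filename, L"\\/:");
}

wchar_t *unix_find_file_name(wchar_t *filename)
{
    assert(filename != NULL);
    return after_last(filename, L"/");
}

const struct file_ops_sketch win_ops_sketch  = { win_find_file_name };
const struct file_ops_sketch unix_ops_sketch = { unix_find_file_name };

/* Dialog-side code: never looks at separators itself. */
int has_path(const struct file_ops_sketch *ops, wchar_t *input)
{
    return ops->find_file_name(input) != input;
}
```

The same has_path code gives different answers for "c:\windows\system.ini"
depending on which table it is handed, since a backslash is an ordinary
character in a UNIX file name.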