About symlinks, NTFS, xattr, forks/streams, and Win10 NT linux subsystem lxss

Continuing the discussion from Support for indexing attribute in DOpus generally and Metadata Panel:

For readers in this forum: I am sharing this because it can be quite difficult to find a coherent discussion on the web of symbolic links, NTFS attributes, forks (alternate data streams), and the available tools (including WIM, VHDX, etc.) for managing files, volumes, and network drives with Windows and SMB/Samba. And yet this is a very powerful combination of components that is essential for devops and IoT work, which often involves NAS-based setups with multiple Windows machines, VMs, and/or mixed Linux environments.

Aside: prior to my current CTO job, I worked at Microsoft as Architect for VB and VBScript, JScript and JavaScript, and also for PowerShell, .NET Mobile, and other things. DOpus is an awesomely powerful tool for managing files, directories, filesystems, servers, etc. I really can't say enough good things about DOpus, and I regularly push it to my friends and anyone who asks me for IT help.

(cross-posted from github: https://github.com/docker/for-win/issues/109).

NTFS not only supports symlinks, it does so with more control than Linux (ext family) and BSD (including the macOS family). However, this is poorly understood because the web is littered with FUD about security issues.

NTFS supports forks (aka ADS, alternate data streams), which can be and are used extensively in the OS and by many tools to provide xattr capabilities as well. See Tuxera link here for the "tag, xattr, stream/fork" discussion on referencing and using NTFS partitions on Linux and macOS.
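To make the stream idea concrete, here is a minimal Python sketch of the naming convention: on NTFS a named stream is addressed as `file:streamname`. The helper names `ads_path` and `write_stream` are my own (hypothetical), and the actual write is assumed to work only on Windows with an NTFS volume.

```python
import os


def ads_path(path: str, stream: str) -> str:
    """Build the 'file:stream' name that NTFS uses to address a named stream."""
    if not stream or any(c in stream for c in ":\\/"):
        raise ValueError("stream name must be non-empty without ':', '\\' or '/'")
    return f"{path}:{stream}"


def write_stream(path: str, stream: str, data: bytes) -> None:
    """Write bytes into a named stream (assumption: Windows + NTFS only)."""
    if os.name != "nt":
        raise OSError("alternate data streams need Windows and an NTFS volume")
    with open(ads_path(path, stream), "wb") as fh:
        fh.write(data)
```

For example, `write_stream("notes.txt", "user.tags", b"draft")` would leave the main contents of notes.txt untouched and store the tag in a side stream.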

NTFS/Windows supports five forms of links (with symlinks implemented as ntfs-reparse-points).
a) Hard links for files (but not directories) that live on the same volume partition.
b) Exported (global) "soft" links, termed directory junctions (a type of "ntfs-reparse-point" for directories only, not files).
c) Imported (local) links, termed simply "directory or file symbolic links" (a type of "ntfs-reparse-point").
d) LNK files, termed simply shell "shortcuts" (they contain paths and can be tracked and managed by the OS).
e) URL files, termed simply "browser URL files" (they work portably as a URI link form using file://....).
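As a rough illustration of the difference between forms (a) and (c), the following Python sketch creates one hard link and one relative symbolic link to the same file and reports how each looks to the OS. The function name `demo_links` and the file names are made up for this example. On Windows, `os.symlink` is assumed to need the symlink privilege discussed in this post, so the call may fail in an unelevated session.

```python
import os
import tempfile


def demo_links(workdir: str) -> dict:
    """Create a hard link and a symbolic link to one file and describe both."""
    target = os.path.join(workdir, "target.txt")
    with open(target, "w") as fh:
        fh.write("hello")

    hard = os.path.join(workdir, "hard.txt")
    soft = os.path.join(workdir, "soft.txt")
    os.link(target, hard)           # form (a): second name for the same file
    os.symlink("target.txt", soft)  # form (c): stores a relative target path

    return {
        "hard_is_same_file": os.path.samefile(target, hard),
        "hard_is_link": os.path.islink(hard),  # a hard link is not a reparse point
        "soft_is_link": os.path.islink(soft),
        "soft_target": os.readlink(soft),
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(demo_links(d))
```

The hard link is indistinguishable from the original file, while the symbolic link is a separate object that only carries the target path.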

NTFS is a journaled file system (USN journal), so one can build or install services that monitor the file system and track any form of change to it.

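Creating symlinks of the form (c) requires the "Create symbolic links" privilege, which by default means an elevated admin session (although downward relative path links shouldn't ever require that). Junctions of the form (b) can be created without elevation.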

fsutil (behavior set SymlinkEvaluation) can be used with the R2R, R2L, L2R, and L2L settings to control how symbolic links are resolved when they are reached through Samba/SMB/CIFS-mounted NTFS volumes.
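The four settings name the link location and the target location: L2L is local-to-local, L2R local-to-remote, R2L remote-to-local, and R2R remote-to-remote. As a hedged sketch, this Python helper only builds the argument list for `fsutil behavior set SymlinkEvaluation`; the helper name `symlink_evaluation_cmd` is hypothetical, and actually running the command is assumed to need an elevated prompt on Windows.

```python
CLASSES = ("L2L", "L2R", "R2L", "R2R")


def symlink_evaluation_cmd(**flags: bool) -> list:
    """Build an `fsutil behavior set SymlinkEvaluation` argument list."""
    unknown = set(flags) - set(CLASSES)
    if unknown:
        raise ValueError(f"unknown evaluation class: {sorted(unknown)}")
    args = ["fsutil", "behavior", "set", "SymlinkEvaluation"]
    # Emit the classes in a fixed order, 1 = enabled, 0 = disabled.
    args += [f"{name}:{int(flags[name])}" for name in CLASSES if name in flags]
    return args
```

The result can be passed to `subprocess.run(cmd, check=True)`. The current state can be inspected with `fsutil behavior query SymlinkEvaluation`.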

When symlinks are viewed locally (on the same machine that hosts the NTFS volume), a JunctionLink and a DirSymLink to a directory behave equivalently.

When symlinks are viewed remotely (through a network-mounted volume), a JunctionLink (/J) is resolved first on the remote machine (the server hosting the volume), whereas a DirSymLink (/D) is always resolved locally, on the viewing machine, after the link is seen. Hence the export/import labels for their usage: junctions for export usage, DirSymLinks for import usage.
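The rule in the last two paragraphs can be written down as a tiny decision function. This is only a toy model of the behavior described in this post, not a call into any Windows API; the name `resolving_machine` and the string labels are invented for illustration.

```python
def resolving_machine(link_kind: str, viewer: str) -> str:
    """Return which machine resolves the link target: 'server' or 'client'.

    link_kind: 'junction' (mklink /J) or 'symlink' (mklink /D)
    viewer:    'local' (on the machine hosting the volume) or 'remote'
    """
    if link_kind not in ("junction", "symlink"):
        raise ValueError(f"unknown link kind: {link_kind}")
    if viewer not in ("local", "remote"):
        raise ValueError(f"unknown viewer: {viewer}")
    if viewer == "local":
        return "server"  # the viewer is the hosting machine, so both kinds agree
    # Remote viewers: junctions resolve on the server, symlinks on the client.
    return "server" if link_kind == "junction" else "client"
```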

NTFS/Windows uses drive letters, which can make the definition of absolute and relative target paths a little more complex to understand at first look. For example, a target path of C:\... is obviously absolute. But if you made a link at C:\foo and set the target to "\", that "\" is also absolute, just implicitly on C.

To be a relative path there must be no leading drive letter and no leading "\" (aka "/"). Windows/NTFS namespace (UNC) paths actually use the \\?... format, and the NT subsystems map drive letters through symbolic links in the NT object-space. (The Sysinternals WinObj tool is an easy way to view the object-space.)
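The absolute/relative distinction above can be checked mechanically. Here is a small sketch using Python's `pathlib.PureWindowsPath`, which parses Windows paths on any platform; the function name `classify_target` and its four labels are my own shorthand, not official terminology.

```python
from pathlib import PureWindowsPath


def classify_target(target: str) -> str:
    """Classify a link target by its leading drive letter and leading slash."""
    p = PureWindowsPath(target)
    if p.drive and p.root:
        return "absolute"        # e.g. C:\foo
    if p.root:
        return "rooted"          # e.g. \foo: absolute, implicitly on the link's drive
    if p.drive:
        return "drive-relative"  # e.g. C:foo: relative to the current dir on C:
    return "relative"            # e.g. ..\foo or foo\bar
```

Only the last case is a relative target in the sense used in this post.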

A JunctionLink is only for "directories". In practice, JunctionLinks (a form of ntfs-reparse-point) can only reference and resolve local-volume-path-targets.

Thus JunctionLinks should really only be used with absolute target-paths, and only when you want to export their target on a network drive. I.e., to export a link whose target is on a different local volume-partition (which hard links cannot span) when you want that target to resolve, both remotely and locally, to the same absolute path on the server's local volume-partition. For example, you have N:\foo --> C:\bar-path, and you want local and remote users accessing the network-server's "N:\foo" directory to see the network-server's "C:\bar-path" contents. If you don't use a (/J) link but use a (/D) link, the remote users will find the network-server's "N:\foo" confusingly resolving to their own local "C:\bar-path".

The same rule, just less obviously, applies to root-level references (absolute paths with no drive letter) on the same local volume. I.e., it applies when you want to export the path to the clients of a remote network drive-share, with the link-target-path invisibly resolved on the server before sharing.

Like, say, you have a network share volume "N:\" and you want to export a path on some part of drive "C:\foo\path" (as a link target), then and only then does a (/J) JunctionLink make sense and is in fact the only way to do that (resolve an absolute path on a remote machine). If you used a (/D) dir-symlink with an absolute path on a network-drive, you would get quickly confused as a remote consumer of that network drive since it would try to find that absolute path on your local machine and not the remote machine.

You also need to use junctions on the same drive when the path cannot be made "relative" but is in fact absolute relative to the root of the same drive. So from "N:\foo\bar" to "N:\abc\def" you cannot make a valid absolute "\..." or relative "..\..." DirSymLink without issues; you need a JunctionLink of the form "\abc\def".

In all other cases, you want to use an NTFS DirSymLink or FileSymLink (same type of ntfs-reparse-point just applied to a directory vs a file).

See the mklink command for basic usage from the CLI (command line). Also see the dir command for displaying symlink targets and the "L" attribute, and its /R switch for viewing xattr/forks on files and directories.
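For scripting, the mklink variants map onto a small table of switches. The sketch below only builds the command line and does not run it; `mklink_cmd` and the `kind` labels are hypothetical names. Because mklink is a cmd.exe built-in rather than a standalone program, the command is wrapped in `cmd /c`.

```python
def mklink_cmd(link: str, target: str, kind: str = "file") -> list:
    """Build a `cmd /c mklink ...` argument list for the given link kind."""
    switches = {
        "file": [],          # file symbolic link (no switch)
        "dir": ["/D"],       # directory symbolic link
        "junction": ["/J"],  # directory junction
        "hard": ["/H"],      # hard link (files only, same volume)
    }
    if kind not in switches:
        raise ValueError(f"unknown link kind: {kind}")
    return ["cmd", "/c", "mklink", *switches[kind], link, target]
```

For the export example in this post, `mklink_cmd(r"N:\foo", r"C:\bar-path", "junction")` gives the /J form. Afterwards, `dir /AL` lists the reparse points in a directory together with their targets, and `dir /R` lists alternate data streams.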

See also: WIM services and 7z support for creating WIM files, which are useful for archiving, copying, and properly restoring directory structures that contain symlinks: 7z a -snl -sns archive.wim source-files....

More detailed information on NTFS reparse-points and NT naming rules (including commentary on the lxss file mechanism, the beta Linux NT subsystem for Windows 10) can readily be found by googling the keywords used in this sentence and in the paragraphs above.

David Simmons (afm-scm.org / thelightphone.com)
