Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add helper function Dirent.FollowSymlink() #36

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

hjkatz
Copy link

@hjkatz hjkatz commented Sep 10, 2019

Added a helper function for handling symlinks to Dirent. This function can be used as follows:

    err := godirwalk.Walk("./example", &godirwalk.Options{
        FollowSymbolicLinks: true,
        Callback: func(path string, de *godirwalk.Dirent) error {
            // resolve any symlinks we encounter to their linked filepath
            de, err := de.FollowSymlink()
            if err != nil {
                return err
            }

            log.WithFields(log.Fields{
                "path":      path,
                "IsRegular": de.IsRegular(),
                "IsDir":     de.IsDir(),
                "IsSymlink": de.IsSymlink(),
            }).Info("")

            return nil
        },
    })

This is useful because without following the symlink via filepath.EvalSymlinks() and creating a new Dirent, then the original Dirent on some systems will return de.IsSymlink() // true and de.IsDir() // false and de.IsRegular() // false, which can cause accidental mistakes. By using de.FollowSymlink() you can then be confident that de.IsDir() and de.IsRegular() will now return stat information about the followed symlink.

This PR also includes adding Dirent.Path() and Dirent.path to support absolute paths when following the symlink.

Note: There exists a subtle flaw in that you can't reassign and initialize a variable in the same := or = in golang, so instead these must be separated out. :/ The other option is to pre-declare both variables such as var err error or adding it to the function proto.

@karrick
Copy link
Owner

karrick commented Sep 27, 2019

Can you provide a mini file system example that led you down the path of adding this functionality?

There is nothing in the new FollowSymLinks method that requires it to have internal representation of the Dirent structure. This method could very well be implemented in the callback:

		Callback: func(osPathname string, de *godirwalk.Dirent) error {
			// resolve any symlinks we encounter to their linked filepath
			if de.IsSymlink() {
				resolvedPath, err := filepath.EvalSymlinks(osPathname)
				if err != nil {
					return nil, err
				}
				absPath, err := filepath.Abs(resolvedPath)
				if err != nil {
					return nil, err
				}
				de, err = NewDirent(absPath)
				if err != nil {
					return nil, err
				}
			}

			log.WithFields(log.Fields{
				"path":      path,
				"IsRegular": de.IsRegular(),
				"IsDir":     de.IsDir(),
				"IsSymlink": de.IsSymlink(),
			}).Info("")

			return nil
		},

Or implemented as a function in a given program.

func NewDirentByFollowingSymlinks(osPathname string, de *godirwalk.Dirent) (*godirwalk.Dirent, error) {
	resolvedPath, err := filepath.EvalSymlinks(osPathname)
	if err != nil {
		return nil, err
	}
	absPath, err := filepath.Abs(resolvedPath)
	if err != nil {
		return nil, err
	}
	return NewDirent(absPath)
}

Then in the Callback, you could call this function:

		Callback: func(osPathname string, de *godirwalk.Dirent) error {
			// resolve any symlinks we encounter to their linked filepath
			if de.IsSymlink() {
				var err error
				de, err = NewDirentByFollowingSymlinks(osPathname, de)
				if err != nil {
					return nil, err
				}
			}

			log.WithFields(log.Fields{
				"path":      path,
				"IsRegular": de.IsRegular(),
				"IsDir":     de.IsDir(),
				"IsSymlink": de.IsSymlink(),
			}).Info("")

			return nil
		},

I have two primary and two minor concerns with adding this into godirwalk right now. My first concern is expanding the API without my clearly understanding the benefit of doing so, especially in light of the two alternate solutions I have provided above.

My second concern flows from the first. If we added FollowFromSymlink to the API, it has limited use because not everyone would necessarily want to create a new Dirent from the absolute pathname, but maybe simply the resolved path.

My third concern is minor, but storing the entire pathname in each Dirent structure adds memory churn, when it might not always be required.

Finally my final concern is some of the API calling semantics of the proposed FollowSymlink method. It calls by value rather than by pointer, making a copy of the Dirent, only to return a pointer of itself if the entry is not a symlink. Then for all of the error conditions, it returns a copy of the copy of the entry along with the error. This is a problem because it allows sloppy programming, where a user might not bother checking the error and might just check the first return value for something. Finally, in the happy case, it calls NewDirent which just allocates yet another chunk of RAM for the new Dirent instance.

@hjkatz
Copy link
Author

hjkatz commented Oct 4, 2019

Thanks for the review.

I'll address the minor concerns first about memory and relative path:

  • For the value receiver vs. pointer receiver, that can be changed and was simply a copy/paste from the other functions in dirent.go, I agree that a pointer receiver is better in this case.
  • For returning a new Dirent vs changing the underlying Dirent received via pointer, this is initially how I wanted to implement this but got different feedback from an private review first. I think that we could receive a pointer to FollowSymlink() and it would change the underlying dirent to receive information about the new node on the filesystem. Either way is fine with me.
  • For the extra pathname being stored, I found it strange that a Dirent doesn't know where it came from or how to represent itself, it's like it only stores the information related to some basic filesystem info. I think adding in path for Dirent in general would be a good choice to allow for more flexibility in this library

Why should we add this feature at all?

The use case that led me to this library is being able to walk filepaths in golang with the ability to interact with symlinks. This library is the most full featured and user friendly, and I think this additional API fits with godirwalk's calling card of "interacts with symlinks".

Edit: I also originally thought that FollowSymbolicLinks: true would have automatically followed the symlinks and was surprised to see that godirwalk only just told me about the links and didn't actually follow them. This led me to think, "Well why couldn't godirwalk just follow the links automatically for me?".

I could write an internal function that extends this api only in my one project, but I want to share this ability with the world and share in the user friendly functionality. Just knowing that a path is a symlink is great, because now we can interact with it, and being able to directly interact with symlinks as a first class citizen of this library is a cherry on top in my opinion. And that's why I sought to add this feature to godirwalk.

Clarification:

If we added FollowFromSymlink to the API, it has limited use because not everyone would necessarily want to create a new Dirent from the absolute pathname, but maybe simply the resolved path.

What do you mean by this where someone would only want to visit the symlink on the resolved path? In my understanding symlinks (as they're soft links) represent a relative path to another path on the system that must have an absolute representation. These symlinks could be relative to another relative path to the originating path used in the call to Walk() but it could also not be relative to that, so I'm a bit confused on what you mean by the resolved path as a resolved path may not exist relative to the original path for all symlinks we encounter while walking?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants