Tool: Hash file creation & validation

Overview:

Here's a script add-in that will provide the following:

  • A new function called GenerateHashFile
  • A new column called Hash Match

Screenshot: (image not included)

(Here I'm using label assignments to highlight where there's an 'error' in the Hash Match column)

Usage of GenerateHashFile command:

  • GenerateHashFile
    Generates a file called checksums.md5 in the source folder, containing the MD5 hash and relative path of each selected file. If only one file is selected, the output file is the original filename with .md5 appended.

  • GenerateHashFile TYPE=SHA
    Generates checksums.sha1 (or, for a single selected file, the original filename with .sha1 appended) containing the SHA-1 hashes of the selected files.

  • GenerateHashFile FILENAME=MyHashes.sha TYPE=SHA
    Calculates the SHA-1 hashes of the selected files as above, but writes the list to MyHashes.sha instead of the default filename.
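
The naming rules above can be sketched outside of Opus. This is only an illustration in Python, not part of the script: `choose_output_file` and its parameters are hypothetical names, and the logic is a reading of the VBScript listing further down (FILENAME wins, a single selection gets the hash extension appended, anything else goes to checksums.<ext> in the source folder).

```python
from pathlib import PureWindowsPath

# Maps the TYPE argument to the extension the script appends.
EXTENSIONS = {"md5": "md5", "sha": "sha1"}


def choose_output_file(selected, source_dir, hash_type="md5", filename=None):
    """Return the path the hash list would be written to (hypothetical helper)."""
    # As in the script, anything other than "sha" falls back to md5.
    ext = EXTENSIONS["sha" if hash_type.lower() == "sha" else "md5"]
    if filename is not None:
        # FILENAME=... overrides the default naming entirely.
        return PureWindowsPath(filename)
    if len(selected) == 1:
        # One item selected: <original name>.<ext> next to the item.
        return PureWindowsPath(selected[0] + "." + ext)
    # Several items selected: checksums.<ext> in the source folder.
    return PureWindowsPath(source_dir) / ("checksums." + ext)


if __name__ == "__main__":
    print(choose_output_file([r"C:\Tools\a.exe"], r"C:\Tools"))
    print(choose_output_file([r"C:\Tools\a.exe", r"C:\Tools\b.exe"], r"C:\Tools", "SHA"))
```

Judging by the listing, the Hash Match column only looks for <name>.md5 / <name>.sha1 and checksums.md5 / checksums.sha1, so a custom FILENAME such as MyHashes.sha would presumably not be picked up by the column.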

Format of checksum file:

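The checksum file format is shared by many existing hash checkers, e.g. md5sum and the HashCheck shell extension. Each line holds the hash, a space, an asterisk, and the file's path relative to the checksum file: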

954856d66e704aea6446a6429abc002e *regeditor\de\OORegEdt.dll
1e782065c29d66aa0be77cbd72b0c616 *regeditor\en\OORegEdt.dll
55aa734b40e211abe30e911b54753818 *regeditor\OORegEdt.exe
1cf07ad9c204ea69bd4946c80e52a76c *regeditor\OORegEdt.INI
a987a7d847ca473bf53881a01d7c3537 *regeditor\OORegEdt32.exe
732eb88101907deea293a6506ce3acdc *regeditor\README.TXT
a02fd0e210db0175fe2132dbcc2c2867 *Cleanup.bat
c7517af60d377f3feaae84dbc279e0e9 *dbgview.chm
baaca87fe5ac99e0f1442b54e03056f4 *Dbgview.exe
d22ff2cc70fa2eec94aaa6c6f49e6eb0 *Eula.txt
7cb9cb2e8aeba12583eb2d6937e443ab *nircmdc.exe
bee5c8cd7840a53ce071feaacb9a3987 *odrive.exe
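
For anyone who wants to read such a file outside of Opus, here is a minimal Python sketch of the parsing. It mirrors what the script does (split each non-empty line on " *" and keep it only if that yields exactly two parts); the function names are my own and are not part of the add-in.

```python
def parse_checksum_line(line):
    """Split '<hash> *<relative path>' into (relative_path, hash), or return None."""
    line = line.strip()
    if not line:
        return None
    parts = line.split(" *")
    if len(parts) != 2:
        # Same rule as the script: exactly one " *" separator per line.
        return None
    digest, rel_path = parts
    return rel_path, digest


def parse_checksum_text(text):
    """Build a {relative_path: hash} mapping from the contents of a checksum file."""
    entries = {}
    for line in text.splitlines():
        parsed = parse_checksum_line(line)
        if parsed is not None:
            entries[parsed[0]] = parsed[1]
    return entries


if __name__ == "__main__":
    sample = (
        "55aa734b40e211abe30e911b54753818 *regeditor\\OORegEdt.exe\n"
        "a02fd0e210db0175fe2132dbcc2c2867 *Cleanup.bat\n"
    )
    for path, digest in parse_checksum_text(sample).items():
        print(path, digest)
```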

If you need another format let me know!

I welcome all comments and suggestions!

Code for reference:

Option Explicit

Dim fso           : Set fso = CreateObject( "Scripting.FileSystemObject" )
Dim fsu           : Set fsu = DOpus.FSUtil
Dim hashFiles     : Set hashFiles = CreateObject( "Scripting.Dictionary" )
Dim hashData      : Set hashData  = CreateObject( "Scripting.Dictionary" )

Function OnInit( initData )

    initData.name           = "Hash"
    initData.version        = "1.0"
    initData.copyright      = "(c) 2016 Tenebrous"
'   initData.url            = "https://resource.dopus.com/viewforum.php?f=35"
    initData.desc           = "Generates checksums.md5 and can validate files against it"
    initData.default_enable = true
    initData.min_version    = "12.0"

    With initData.AddCommand()
        .name     = "GenerateHashFile"
        .method   = "OnCommandGenerateHash"
        .desc     = "Generates a file with the hashes in for each selected file"
        .label    = "GenerateHashFile"
        .template = "FILENAME/K,TYPE/K,FILES/M"
    End With

    With initData.AddColumn()
        .name = "HashMatch"
        .label = "Hash Match"
        .type = ""
        .method = "OnColumnHashMatch"
        .autogroup = true
        .autorefresh = true
        .justify = "left"
        .multicol = false
    End With

End Function

Function OnCommandGenerateHash( scriptCmdData )

    Dim outputFile
    Dim outputPath
    Dim temp
    Dim hash
    Dim extension

    Dim fileList

    If scriptCmdData.func.args.got_arg.files Then
        Set fileList = scriptCmdData.func.args.files
    Else
        Set fileList = scriptCmdData.func.sourcetab.selected
    End If

    '' by default use md5 unless told otherwise
    hash = "md5"

    If scriptCmdData.func.args.got_arg.type Then
        If LCase(scriptCmdData.func.args.type) = "sha" Then
            hash = "sha"
        End If
    End If

    extension = HashExtension(hash)

    If scriptCmdData.func.args.got_arg.filename Then
        
        '' write to specified hash file

        outputFile = scriptCmdData.func.args.filename
        Set outputPath = DOpus.FSUtil.NewPath( outputFile )
        
        If outputPath.test_parent Then
            outputPath.Parent
        End If

    Else
        
        '' no hash file specified, decide on filename

        If fileList.count = 1 Then

            '' only one file selected, write hash to selected filename with hash extension 

            outputFile = fileList(0) & "." & extension
            Set outputPath = fsu.NewPath( outputFile )

            If outputPath.test_parent Then
                outputPath.Parent
            End If

        Else

            '' multiple files selected, write hashes to "checksums." with hash extension

            Set outputPath = scriptCmdData.func.command.source
            Set temp = fsu.NewPath(outputPath)
            temp.Add "checksums." & extension
            outputFile = CStr(temp)

        End If

    End If



    Dim progress : Set progress = scriptCmdData.func.command.Progress

    progress.Init scriptCmdData.func.sourcetab.lister, "Calculating hashes"
    
    With progress
        .owned = true
        .bytes = false
        .abort = true
        .pause = false
        .skip  = false
    End With


    progress.Show
    progress.SetStatus "Counting files..."

    Dim outputStream : Set outputStream = fso.OpenTextFile( outputFile, 2, True )
    Dim file
    Dim item

    For Each file in fileList

        Set item = fsu.GetItem( file )

        If item.is_dir Then
            CountFolder file, progress
        Else
            progress.AddFiles 1
        End If    

    Next '' file


    progress.SetStatus "Calculating hashes..."

    For Each file in fileList

        Set item = fsu.GetItem( file )

        If item.is_dir Then
            CalculateFolder outputStream, outputPath, file, progress, hash
        ElseIf CStr(item) <> outputFile Then
            progress.SetName CStr(file)
            CalculateFile outputStream, outputPath, file, progress, hash
            progress.StepFiles 1
        End If

    Next '' file

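    progress.Hide

    '' close the hash file before refreshing so the column reads the finished file
    outputStream.Close
    Set outputStream = Nothing

    '' fso is shared with the column code, so it is deliberately not released here
    ClearCache
    scriptCmdData.func.command.RunCommand "Go REFRESH=all"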

End Function

Function CountFolder( folderPath, progress )

    Dim enumerator
    Dim item

    Set enumerator = fsu.ReadDir( folderPath, True )
    While Not enumerator.complete
        Set item = enumerator.Next
        If Not item.is_dir Then
            progress.AddFiles 1
        End If
    Wend

    enumerator.Close

End Function

Function CalculateFolder( outputStream, basePath, folderPath, progress, hash )

    Dim enumerator
    Dim item

    Set enumerator = fsu.ReadDir( folderPath, True )
    While Not enumerator.complete
        Set item = enumerator.Next
        If Not item.is_dir Then
            progress.SetName CStr(item)
            CalculateFile outputStream, basePath, item, progress, hash
            progress.StepFiles 1
        End If
    Wend

    enumerator.Close

End Function

Function CalculateFile( outputStream, basePath, filename, progress, hash )
    outputStream.WriteLine fsu.Hash( filename, hash ) & " *" & Mid( filename, Len(basePath) + 2 )
End Function

Function OnColumnHashMatch( scriptColData )

    On Error Resume Next
    
    If scriptColData.item.is_dir Then
        Exit Function
    End If

    If Err.Number <> 0 Then
        Exit Function
    End If

    Select Case LCase(scriptColData.item.ext)

        Case "." & HashExtension("md5")
            scriptColData.value = CompareAllHashes( CStr( scriptColData.item ), "md5" )
            Exit Function

        Case "." & HashExtension("sha")
            scriptColData.value = CompareAllHashes( CStr( scriptColData.item ), "sha" )
            Exit Function

    End Select

    On Error Goto 0

    Dim result : result = ""

    result = CompareFileHash( "md5", scriptColData.item ) _
           & " " _
           & CompareFileHash( "sha", scriptColData.item )

    scriptColData.value = Trim( result )

End Function

Function CompareFileHash( hash, item )

    Dim result
    Dim path : Set path = fsu.NewPath( item.path )
    Dim hashFullPath

    '' firstly check for filename.<hash> in same folder
    Set hashFullPath = fsu.NewPath( path )
    hashFullPath.Add item.name & "." & HashExtension(hash)

    If fsu.Exists( hashFullPath ) Then
        result = CompareHash( hashFullPath.pathpart, CStr(hashFullPath), CStr(item), hash )
    Else        

        '' now find checksums.<hash> in current or any parent folder

        Set hashFullPath = FindHashFileForPath( path, HashExtension(hash) )

        If Not hashFullPath Is Nothing Then
            result = CompareHash( hashFullPath.pathpart, CStr(hashFullPath), CStr(item), hash )
        End If

    End If

    CompareFileHash = result

End Function

Function FindHashFileForPath( path, extension )

    Dim pathStr : pathStr = CStr( path )
    Dim key     : key = extension & "_" & pathStr

    If hashFiles.Exists( key ) Then
        Set FindHashFileForPath = hashFiles( key )
        Exit Function
    End If

    Dim hashFullPath
    Set FindHashFileForPath = Nothing

    Do While True 

        Set hashFullPath = fsu.NewPath( path )
        hashFullPath.Add "checksums." & extension

        If fsu.Exists( hashFullPath ) Then
            Set FindHashFileForPath = hashFullPath
            Exit Do
        End If

        If path.test_parent <> True Then
            Exit Do
        End If

        path.Parent

    Loop

    hashFiles.Add key, FindHashFileForPath

End Function

Function GethashData( hashFilePathStr )

    If hashData.Exists( hashFilePathStr ) Then
        Set GethashData = hashData( hashFilePathStr )
        Exit Function
    End If

    Set GethashData = CreateObject( "Scripting.Dictionary" )

    Dim inputStream  : Set inputStream = fso.OpenTextFile( hashFilePathStr, 1 )

    Dim line
    Dim parts

    While Not inputStream.AtEndOfStream
        
        line = Trim(inputStream.ReadLine)

        If line <> "" Then

            parts = Split( line, " *" )

            If UBound( parts ) = 1 Then
                GethashData.Add parts(1), parts(0)
            End If

        End If

    Wend

    '' release the file handle once the whole hash file has been read
    inputStream.Close

    hashData.Add hashFilePathStr, GethashData

End Function

Function CompareHash( rootPath, checksumFilePath, itemPath, hash )

    Dim relativePath : relativePath = mid( itemPath, len(rootPath)+2 )
    Dim hashes       : Set hashes = GethashData( checksumFilePath )
    Dim previousHash
    Dim currentHash

    CompareHash = ""

    If hashes.Exists( relativePath ) Then

        previousHash = hashes( relativePath )
        currentHash = fsu.Hash( itemPath, hash )

        '' compare case-insensitively in case the hash file uses upper-case hex
        If LCase(previousHash) = LCase(currentHash) Then
            CompareHash = HashName(hash) + ":ok"
        Else
            CompareHash = HashName(hash) + ":error"
        End If

    End If

End Function

Function HashName(hash)
    Select Case hash
        Case "md5": HashName = "md5"
        Case "sha": HashName = "sha1"
    End Select
End Function

Function HashExtension(hash)
    Select Case hash
        Case "md5": HashExtension = "md5"
        Case "sha": HashExtension = "sha1"
    End Select
End Function

Function CompareAllHashes( checksumFile, hash )

    Dim line
    Dim parts
    Dim previousHash
    Dim currentChecksum
    Dim filename
    Dim filenamePath
    Dim item

    Dim count   : count   = 0
    Dim countOK : countOK = 0

    Dim inputStream : Set inputStream = fso.OpenTextFile( checksumFile, 1 )
    Dim inputPath   : Set inputPath = fsu.NewPath( checksumFile )
    inputPath.parent

    While Not inputStream.AtEndOfStream
        
        line = Trim(inputStream.ReadLine)

        If line <> "" Then

            parts = Split( line, " *" )

            If UBound( parts ) = 1 Then

                count = count + 1

                previousHash = parts(0)
                filename = parts(1)

                Set filenamePath = fsu.NewPath( inputPath )
                filenamePath.Add filename

                currentChecksum = fsu.Hash( filenamePath, hash )

                '' compare case-insensitively in case the hash file uses upper-case hex
                If LCase(previousHash) = LCase(currentChecksum) Then
                    countOK = countOK + 1
                End If

            End If

        End If

    Wend

    '' release the file handle once the whole hash file has been read
    inputStream.Close

    If countOK = count Then
        CompareAllHashes = HashName(hash) + ":ok (" & countOK & " / " & count & ")"
    Else
        CompareAllHashes = HashName(hash) + ":error (" & countOK & " / " & count & ")"
    End If

End Function

Function ClearCache()
    Set hashFiles = CreateObject( "Scripting.Dictionary" )
    Set hashData  = CreateObject( "Scripting.Dictionary" )
End Function

Hash.vbs.txt (10.8 KB)
Generate hash file (MD5).dcf (636 Bytes)
Generate hash file (SHA1).dcf (321 Bytes)

Nicely done!

It works in v11 too even if there is no highlight where there's an error. :+1:

Replace line 123
Dim outputStream : Set outputStream = fso.OpenTextFile( outputFile, 2, True )
with
Dim outputStream : Set outputStream = fso.OpenTextFile( outputFile, 2, True, -2 )

Line 314 (in GethashData)
Dim inputStream : Set inputStream = fso.OpenTextFile( hashFilePathStr, 1 )
with
Dim inputStream : Set inputStream = fso.OpenTextFile( hashFilePathStr, 1, , -2 )

Line 390 (in CompareAllHashes)
Dim inputStream : Set inputStream = fso.OpenTextFile( checksumFile, 1 )
with
Dim inputStream : Set inputStream = fso.OpenTextFile( checksumFile, 1, , -2 )

The added fourth argument is OpenTextFile's format parameter: -2 (TristateUseDefault) opens the file using the system default rather than as ASCII. These changes let the script support Unicode:

  1. Unicode Little Endian With BOM
  2. Unicode Little Endian Without BOM
  3. Unicode Big Endian With BOM
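
For comparison, here is a rough Python sketch of reading a checksum file that may or may not start with a byte-order mark. It is not a description of what FileSystemObject does internally with -2; `decode_checksum_bytes` and its `fallback` parameter are hypothetical, and a file without a BOM can only be decoded correctly if the caller supplies the right fallback encoding.

```python
import codecs

# Byte-order marks checked in order. The UTF-8 BOM is an extra case,
# not one of the formats listed above.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def decode_checksum_bytes(raw, fallback="cp1252"):
    """Decode the raw bytes of a checksum file, honouring a leading BOM if present."""
    for bom, encoding in _BOMS:
        if raw.startswith(bom):
            # "utf-16" and "utf-8-sig" both consume the BOM while decoding.
            return raw.decode(encoding)
    # No BOM: nothing to detect, so trust the caller's choice of encoding.
    return raw.decode(fallback)


if __name__ == "__main__":
    line = "d22ff2cc70fa2eec94aaa6c6f49e6eb0 *Eula.txt"
    print(decode_checksum_bytes(codecs.BOM_UTF16_LE + line.encode("utf-16-le")))
    print(decode_checksum_bytes(line.encode("utf-16-le"), fallback="utf-16-le"))
```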

Hey there,
I was looking for a way to do exactly this, generating checksums ready for cold storage, but I'm struggling with the verification side of things.

Given structure:

  • Images
    • Cats
    • Dogs
      • Big Dogs
      • Small Dogs
    • Fish

I go to the /images/ folder, right-click "Dogs", and generate the checksums, which gives me Dogs.md5. So far, so good.

Ideally, I would select all the folders in "Images" and generate one large digest. At some future point I would then reselect those folders and run "Compare to manifest", or manually trigger the population of the Hash Match column. These structures can be quite large and contain many thousands of files, so I only want to thrash the disk when necessary.

  1. Is it possible to trigger the calculation manually and ensure that it never runs without being triggered?
  2. If not, and the column is hidden, am I correct to presume it won't trigger the script?

Many thanks!

Looks like you could use Folder > Flat View > Mixed (No Folders) to get a list of all files, select all, then run the script's command without arguments. That should create a checksums.md5 or similar file in the top-level folder with everything in it.

The verification part looks like it will look for that file in parent folders if it isn't found where each file is, so that side should work too. (As long as there aren't any other hash files lower down, at least.)
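
That parent-folder lookup can be sketched in Python as follows. This is only an approximation of what FindHashFileForPath in the script appears to do (without its caching), and `find_checksums_file` is a hypothetical name.

```python
from pathlib import Path


def find_checksums_file(start_dir, extension="md5"):
    """Return the nearest checksums.<extension> at or above start_dir, else None."""
    current = Path(start_dir).resolve()
    # Check the folder itself first, then each parent up to the root.
    for folder in [current, *current.parents]:
        candidate = folder / ("checksums." + extension)
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    print(find_checksums_file("."))
```

Because the search stops at the first match, a checksums.md5 lower down the tree shadows the top-level one for every file beneath it, which is the caveat mentioned above.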

If the column is hidden, it won't run unless you add the column or run the script's command. (At least, from a quick look.)

Hey Leo,
The creation of the hashes is quite simple, it turns out: just select the files or folders and hit the button.
The verification starts at some indeterminate point after the column is revealed, and finishes at some later point, provided you don't interfere with anything while it's working.

Is there any way to expose the progress of such a background task? Validating checksums on a 4 TB set could... take a while.

Scripts can show progress bars (when they're enumerating the files themselves) but there's nothing built-in that would do that automatically or for a script column that's working in the background.

A tool like QuickPar (which also handles md5/sfv) or QuickSFV might be worth looking at if you want to hash and later verify a huge number of files with a GUI. Both tools are old but still work well in my experience.
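
To illustrate the "enumerate the files yourself and report progress" idea outside of Opus, here is a minimal Python sketch that verifies an MD5 manifest on demand and reports each file as it goes. It assumes the "<hash> *<relative path>" format shown at the top of the thread, with backslash-separated paths relative to the manifest's folder; `verify_manifest`, `file_md5` and the `report` callback are hypothetical names, not anything Opus or the script provides.

```python
import hashlib
from pathlib import Path


def file_md5(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files are never loaded into memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(manifest_path, report=print):
    """Check every entry in the manifest; return the relative paths that failed."""
    manifest_path = Path(manifest_path)
    root = manifest_path.parent
    entries = []
    for line in manifest_path.read_text().splitlines():
        parts = line.strip().split(" *")
        if len(parts) == 2:
            entries.append((parts[0].lower(), parts[1]))
    failures = []
    for index, (expected, rel_path) in enumerate(entries, start=1):
        # Manifest paths use backslashes; rebuild them for the local platform.
        target = root.joinpath(*rel_path.split("\\"))
        ok = target.is_file() and file_md5(target) == expected
        if not ok:
            failures.append(rel_path)
        report(f"[{index}/{len(entries)}] {'ok' if ok else 'error'}  {rel_path}")
    return failures
```

Since nothing runs until `verify_manifest` is called, this kind of standalone check only touches the disk when it is explicitly triggered.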

I prefer MultiPar over QuickPar.