RegexColumnsReloaded (easily add regex columns or any kind of column)

There have been several regex column scripts in the past year.
I was first wow'ed by this feature when I came across wowbagger's script. Since then, I have been experimenting with columns and writing several scripts to suit my needs. At the risk of crowding the space, I thought I would share this one because some features may interest others.

This is the core script that is used for Columns: Movie Filename Database and Columns: Books & Comics Filename Database.

Tutorial
There is a full tutorial on my new page about Using File Names to Make a Database.

I'm really excited about this way of working with files and look forward to feedback and a fruitful discussion. Apart from direct feedback and brainstorming on this script, I am particularly interested in hearing your thoughts about using file naming conventions to manage metadata.

Features

  1. Can be used for any kind of column: space is clearly marked in the script to easily add NON-regex columns

  2. Columns should autorefresh when the name or the file changes

  3. Clear prefix facilitates column selection in the Folder Format pane without affecting display names

  4. The column definition can be configured with a number of properties, all optional
    See the tutorial about column definitions or this image.


  1. The simple data structure is directly addressable, eliminating the need for getters

Thanks to those who have answered questions at one time or another while I was playing with the scripts (tbone, Leo and Jon come to mind but I'm sure there are others).
:slight_smile:

Download
RegexColumnsReloaded.js.txt (30.5 KB)

Changelog
0.9.2 At tbone's request, changed the column prefix from RegexReloaded_ to RegexReloaded.
0.9.1 Added three properties (got these ideas from Apocalypse): fileOnly, dirOnly, fullCupValue.

1 Like
var columns = {
'Vendor' : {
     pattern: /^([^=]+)=([^=]+)=([^=]+)=([^=]+)=([^=]+)/,
     group: 2
     },
'Date' : {
     pattern: /^([^=]+)=([^=]+)=([^=]+)=([^=]+)=([^=]+)/,
     group: 4,
     type: date
     },
'Amount' : {
     pattern: /^([^=]+)=([^=]+)=([^=]+)=([^=]+)=([^=]+)/,
     group: 5
     },
// Trick Column, feel free to delete
'APreport' : {
     pattern: /^(AP)rep([^=]+)=([^=]+)=([^=]+)=([^=]+)=([^=]+)$|^AP(rep)([^=]+)=([^=]+)=([^=]+)$/,
     subjectPrefix: "APrep",
     returnTheFirstNonEmptyGroup: true
     },
// Trick Column, feel free to delete
'AP' : {
     pattern: /^(Y)(?:[^=]*=){4}/,
     group: 1,
     subjectPrefix: "Y"
     },
// Trick Column, feel free to delete
'Report' : {
     pattern: /^(Y)(?:[^=]*=){2}/,
     group: 1,
     subjectPrefix: "Y"
     }
 };

I tried modifying the regexColumns with the above column definitions but then they don't show up in the columns list of dopus. Am I doing something wrong?

Never mind figured it out. The type DATE needs to be in Quotes

(.*)\s*-\s*\(([^\)]+_)?([^\)#]*)#?(\d\d)\)(.*)\((\d{4})\)

Debuggex Demo

What to do when I have a single regex? Test against
Brandon Sanderson - (The Stormlight Archive #02) Words of Radiance
Brandon Sanderson - (The Stormlight Archive #01) The Way of Kings

  'AuthorV' : { 
       // The book's title (stripping [the keys=values] )
       pattern: /^([^-]*)\s?-\s?\(?([^\)#-]*)#?(\d\d+)(?:\)|\s?-\s?)([^(]*)(?:\((\d{4})\))?/i,
       group: 1
       },
  'SeriesV' : { 
       // e.g. Tintin
       // acceptable aliases for key: ser, series
       pattern: /^([^-]*)\s?-\s?\(?([^\)#-]*)#?(\d\d+)(?:\)|\s?-\s?)([^(]*)(?:\((\d{4})\))?/i,
       group: 2
       },
  'NumV' : { 
       // Number in the series, e.g. 01
       // acceptable aliases for key: num, nb, no
       pattern: /^([^-]*)\s?-\s?\(?([^\)#-]*)#?(\d\d+)(?:\)|\s?-\s?)([^(]*)(?:\((\d{4})\))?/i,
       group: 3,
       width: 3
       },
  'TitleV' : { 
       // The book's title (stripping [the keys=values] )
       pattern: /^([^-]*)\s?-\s?\(?([^\)#-]*)#?(\d\d+)(?:\)|\s?-\s?)([^(]*)(?:\((\d{4})\))?/i,
       group: 4
       },
  'YearV' : { 
       // The book's title (stripping [the keys=values] )
       pattern: /^([^-]*)\s?-\s?\(?([^\)#-]*)#?(\d\d+)(?:\)|\s?-\s?)([^(]*)(?:\((\d{4})\))?/i,
       group: 5,
       width: 4
       },

Isn't it wasteful to do the match everytime? Is there a way to cache the results use it in each of the other columns?

Yes, definitely. You just have to delete most of the script.

Do the matching once, fanning the match into groups.
Then you can use a switch statement on colName:

switch(colName) {
    case 'some name':
    ColumnData.value = something based on the regex groups
    break

   case 'some other name':
   ... 

As it is, this script can only access the first level of attributes for the item class. For example, the user can't use "metadata.doc.author" as target.

I found a solution to that on this question of StackOverflow.

Step-by-step

1 - Add this function that parses the target string and returns the correct last level object:

function findprop(obj, path) {
    var args = path.split('.'), i, l;
    for (i = 0, l = args.length; i < l; i++) {
        if (!obj[args[i]]) {
            return;
        }
        obj = obj[args[i]];
    }
    return obj;
}

2 - Replace:

This line
fileSubject = ColumnData.item[columns[colName].target];
For this
fileSubject = findprop(ColumnData.item, columns[colName].target);

Hi,

just stumbled across this script and I would like to give a try and use it as well, maybe to learn a bit more about scripting in DO and a like the idea of dynamic columns in DO.

However, besides the initial purpose of the script, I've no clue on how this would be of any help on my side. So, I would like to hear about some additional tips from others here where they use this script.

Thanks,
Michael

Hi Michael,
Have you seen the screenshots on the Using File Names to Make a Database page?

There's one screenshot for Movie columns, another for Book columns.

Another use would be for project files: Client, Project Code etc

One benefit is that it allows you to "pivot" the way the data is sorted, compared to a situation where everything is in subfolders (which of course is usually the best choice).

For instance let's say you organize movies into subfolders for directors. But then when you want to search movies by genre, by rating, key actor, year, etc, your "Directors" subfolders can be limiting, although Flat view helps.

In that kind of situation naming conventions + custom columns can be a solution.

1 Like