RegexColumnsReloaded (easily add regex columns or any kind of column)

playful · June 26, 2015, 5:18am

There have been several regex column scripts in the past year.
I was first wow'ed by this feature when I came across wowbagger's script. Since then, I have been experimenting with columns and writing several scripts to suit my needs. At the risk of crowding the space, I thought I would share this one because some features may interest others.

This is the core script that is used for Columns: Movie Filename Database and Columns: Books & Comics Filename Database.

Tutorial
There is a full tutorial on my new page about Using File Names to Make a Database.

I'm really excited about this way of working with files and look forward to feedback and a fruitful discussion. Apart from direct feedback and brainstorming on this script, I am particularly interested in hearing your thoughts about using file naming conventions to manage metadata.

Features

Can be used for any kind of column: space is clearly marked in the script to easily add NON-regex columns
Columns should autorefresh when the name or the file changes
Clear prefix facilitates column selection in the Folder Format pane without affecting display names
The column definition can be configured with a number of properties, all optional
See the tutorial about column definitions or this image.

The simple data structure is directly addressable, eliminating the need for getters

Thanks to those who have answered questions at one time or another while I was playing with the scripts (tbone, Leo and Jon come to mind but I'm sure there are others).

Download
RegexColumnsReloaded.js.txt (30.5 KB)

Changelog
0.9.2 At tbone's request, changed the column prefix from RegexReloaded_ to RegexReloaded.
0.9.1 Added three properties (got these ideas from Apocalypse): fileOnly, dirOnly, fullCupValue.

VijaySaraff · November 18, 2016, 4:30pm

var columns = {
'Vendor' : {
     pattern: /^([^=]+)=([^=]+)=([^=]+)=([^=]+)=([^=]+)/,
     group: 2
     },
'Date' : {
     pattern: /^([^=]+)=([^=]+)=([^=]+)=([^=]+)=([^=]+)/,
     group: 4,
     type: date
     },
'Amount' : {
     pattern: /^([^=]+)=([^=]+)=([^=]+)=([^=]+)=([^=]+)/,
     group: 5
     },
// Trick Column, feel free to delete
'APreport' : {
     pattern: /^(AP)rep([^=]+)=([^=]+)=([^=]+)=([^=]+)=([^=]+)$|^AP(rep)([^=]+)=([^=]+)=([^=]+)$/,
     subjectPrefix: "APrep",
     returnTheFirstNonEmptyGroup: true
     },
// Trick Column, feel free to delete
'AP' : {
     pattern: /^(Y)(?:[^=]*=){4}/,
     group: 1,
     subjectPrefix: "Y"
     },
// Trick Column, feel free to delete
'Report' : {
     pattern: /^(Y)(?:[^=]*=){2}/,
     group: 1,
     subjectPrefix: "Y"
     }
 };

I tried modifying the regexColumns with the above column definitions but then they don't show up in the columns list of dopus. Am I doing something wrong?

VijaySaraff · November 18, 2016, 4:37pm

Never mind figured it out. The type DATE needs to be in Quotes

VijaySaraff · December 11, 2016, 5:20pm

(.*)\s*-\s*\(([^\)]+_)?([^\)#]*)#?(\d\d)\)(.*)\((\d{4})\)

Debuggex Demo

What to do when I have a single regex? Test against
Brandon Sanderson - (The Stormlight Archive #02) Words of Radiance
Brandon Sanderson - (The Stormlight Archive #01) The Way of Kings

  'AuthorV' : { 
       // The book's title (stripping [the keys=values] )
       pattern: /^([^-]*)\s?-\s?\(?([^\)#-]*)#?(\d\d+)(?:\)|\s?-\s?)([^(]*)(?:\((\d{4})\))?/i,
       group: 1
       },
  'SeriesV' : { 
       // e.g. Tintin
       // acceptable aliases for key: ser, series
       pattern: /^([^-]*)\s?-\s?\(?([^\)#-]*)#?(\d\d+)(?:\)|\s?-\s?)([^(]*)(?:\((\d{4})\))?/i,
       group: 2
       },
  'NumV' : { 
       // Number in the series, e.g. 01
       // acceptable aliases for key: num, nb, no
       pattern: /^([^-]*)\s?-\s?\(?([^\)#-]*)#?(\d\d+)(?:\)|\s?-\s?)([^(]*)(?:\((\d{4})\))?/i,
       group: 3,
       width: 3
       },
  'TitleV' : { 
       // The book's title (stripping [the keys=values] )
       pattern: /^([^-]*)\s?-\s?\(?([^\)#-]*)#?(\d\d+)(?:\)|\s?-\s?)([^(]*)(?:\((\d{4})\))?/i,
       group: 4
       },
  'YearV' : { 
       // The book's title (stripping [the keys=values] )
       pattern: /^([^-]*)\s?-\s?\(?([^\)#-]*)#?(\d\d+)(?:\)|\s?-\s?)([^(]*)(?:\((\d{4})\))?/i,
       group: 5,
       width: 4
       },

Isn't it wasteful to do the match everytime? Is there a way to cache the results use it in each of the other columns?

playful · December 11, 2016, 6:22pm

Yes, definitely. You just have to delete most of the script.

Do the matching once, fanning the match into groups.
Then you can use a switch statement on colName:

switch(colName) {
    case 'some name':
    ColumnData.value = something based on the regex groups
    break

   case 'some other name':
   ...

andersonnnunes · March 17, 2017, 1:55am

As it is, this script can only access the first level of attributes for the item class. For example, the user can't use "metadata.doc.author" as target.

I found a solution to that on this question of StackOverflow.

Step-by-step

1 - Add this function that parses the target string and returns the correct last level object:

function findprop(obj, path) {
    var args = path.split('.'), i, l;
    for (i = 0, l = args.length; i < l; i++) {
        if (!obj[args[i]]) {
            return;
        }
        obj = obj[args[i]];
    }
    return obj;
}

2 - Replace:

This line
fileSubject = ColumnData.item[columns[colName].target];
For this
fileSubject = findprop(ColumnData.item, columns[colName].target);

Micky · January 26, 2020, 1:45pm

Hi,

just stumbled across this script and I would like to give a try and use it as well, maybe to learn a bit more about scripting in DO and a like the idea of dynamic columns in DO.

However, besides the initial purpose of the script, I've no clue on how this would be of any help on my side. So, I would like to hear about some additional tips from others here where they use this script.

Thanks,
Michael

playful · January 26, 2020, 2:27pm

Hi Michael,
Have you seen the screenshots on the Using File Names to Make a Database page?

There's one screenshot for Movie columns, another for Book columns.

Another use would be for project files: Client, Project Code etc

One benefit is that it allows you to "pivot" the way the data is sorted, compared to a situation where everything is in subfolders (which of course is usually the best choice).

For instance let's say you organize movies into subfolders for directors. But then when you want to search movies by genre, by rating, key actor, year, etc, your "Directors" subfolders can be limiting, although Flat view helps.

In that kind of situation naming conventions + custom columns can be a solution.