Using the metadata framework for classification and remediation

To apply tags to files, folders, and shares, you must create a CSV file with the metadata key-value pairs. You can either create the CSV file manually or use a third-party tool or script to generate the CSV file with tagging information for paths.

To apply metadata tags

  1. Create a CSV file with the tagging information. You can create more than one CSV file with tagging information for paths.

    To assign tags to files, ensure that the CSV file name starts with File_ (for example, File_tags.csv). Enter the path of each file with its tag name and tag value. CSV files with any other name are treated as containing folder paths.

    Note:

    i18n and special characters are not supported in tag names.

  2. Save the CSV files in the data/console/tags folder in the Data Insight installation directory on the Management Server.
  3. A scheduled job, TagsConsumerJob, parses the CSV files and creates a Tags database for each share. The job imports the tags for the paths into Data Insight. By default, the job runs once a day.

    If you run the job manually using the configcli command, the job consumes all the CSV files under the Tags folder, whether or not they have changed.

    Whenever the scheduled job runs, it checks whether the modified time of any CSV file under the Tags folder is later than the time of the previous execution of the job. If the job finds any such CSV file, it processes all the CSV files under the Tags folder. If no CSV file has been modified since the job was last executed, the job does not take any action.

    The job does not accept any tag name that starts with mx_ because this prefix is reserved for internal use by Data Insight. Whenever the job processes the CSV files, Data Insight deletes all existing tags (except tags that start with mx_) from all files and folders and attaches the new tags.

    Note:

    If a path is tagged in two different CSV files with the same tag name, but with a different value, then the value in the last CSV file that is processed is applied.

  4. To replace existing tags, update the CSV file with the new tags. The scheduled job replaces the existing tags with the new tags. Any paths that are discarded during the last run of the job are logged in the $DATADIR/console/generictags_scan_status_5.0.db database.

    To remove all tags, delete the CSV from the Tags folder.

  5. Create a DQL report to retrieve the tags from the database.

    Here are a few example queries that you can use:

    • To fetch all paths in your storage environment along with the tags (my_tag) assigned to them.

      FROM path GET name, TAG my_tag
    • To get all paths owned by user Joe Camel tagged with the needs_assessment tag.

      FROM owner GET TAG owner.path.needs_assessment, owner.path.name 
      IF user.name="joe_camel"
  6. To verify the names of tags that are stored for a share, run the idxreader command on the indexer node.

    idxreader -i $MATRIX_DATA_DIR/indexer/default/99/99 -gettags all
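
The change-detection rule in step 3 can be modeled in a few lines of Python. This is a conceptual sketch of the documented behavior only, not the implementation of TagsConsumerJob. The function name csv_changed_since is hypothetical, and the sketch assumes that the CSV files sit directly in the Tags folder rather than in subfolders.

```python
from pathlib import Path


def csv_changed_since(tags_dir, last_run_epoch):
    """Return True if any *.csv file in tags_dir was modified after last_run_epoch.

    Models the documented rule only: a single changed file is enough to
    trigger processing of every CSV file in the folder.
    """
    return any(
        csv_path.stat().st_mtime > last_run_epoch
        for csv_path in Path(tags_dir).glob("*.csv")
    )
```

A result of True corresponds to the case in which the scheduled job reprocesses every CSV file. A result of False corresponds to the case in which the job takes no action.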

Format of CSV file

The CSV file with the metadata tags should be in the following format:

File/folder path, tag name, tag value

For example, \\filer\share\foo,tname,tvalue

Here, tname is the name of the tag and tvalue is the tag value.
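
A file in this format can also be generated with a script. The following is a minimal sketch that uses the Python standard csv module. It is not a Data Insight tool, and the function name write_tag_rows and the sample paths and tags are hypothetical. The csv module wraps any field that contains a comma in double quotes, which is the quoting that this format requires for such paths and tag names.

```python
import csv
import io


def write_tag_rows(rows, stream):
    """Write (path, tag_name, tag_value) rows in the three-column format.

    csv.writer uses minimal quoting by default, so only fields that
    contain a comma, a double quote, or a line break are quoted.
    """
    writer = csv.writer(stream, lineterminator="\n")
    for path, tag_name, tag_value in rows:
        writer.writerow([path, tag_name, tag_value])


# Hypothetical sample data: one plain path and one path that contains a comma.
sample_rows = [
    (r"\\filer\share\foo", "tname", "tvalue"),
    (r"\\filer\share\foo,bar", "tname", "tvalue"),
]

buffer = io.StringIO()
write_tag_rows(sample_rows, buffer)
print(buffer.getvalue(), end="")
```

To write to disk instead of an in-memory buffer, pass a file opened with newline="" (as the csv module documentation recommends), for example open("File_tags.csv", "w", newline="") for a CSV file that contains file paths.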

Note:

Multiple values for the same tag are not supported.

If the path or the tag name contains a comma, enclose the text in double quotes (" "). For example, if the folder name is foo,bar, you can add the path to the CSV file as follows:

"\\filer\share\foo,bar",t_name,t_value

To tag a share, add the share path to a CSV file that contains folder paths. The following are examples of share-level paths:

CIFS/DFS

\\filer\share

SharePoint

URL of the site collection

NFS

<export path>, for example, /data/finance/docs

Box

\\Box\<box name in Data Insight>
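
The rules in this section (three fields per row, no tag names that start with mx_, no special characters in tag names, and one value per tag for a path) can be checked before the CSV files are copied to the Tags folder. The following Python sketch is one possible pre-check, not a Data Insight utility. The name check_tag_rows is hypothetical, and because the set of unsupported characters is not listed here, the sketch assumes that only ASCII letters, digits, and underscores are safe in tag names.

```python
import csv
import re

RESERVED_PREFIX = "mx_"
# Assumption: the allowed characters are not documented, so this sketch
# conservatively accepts only ASCII letters, digits, and underscores.
SAFE_TAG_NAME = re.compile(r"[A-Za-z0-9_]+")


def check_tag_rows(rows):
    """Return a list of problems found in (path, tag_name, tag_value) rows."""
    problems = []
    first_value = {}  # (path, tag_name) -> first value seen for that pair
    for row_no, row in enumerate(rows, start=1):
        if not row:
            continue  # csv.reader yields an empty list for a blank line
        if len(row) != 3:
            problems.append(f"row {row_no}: expected 3 fields, found {len(row)}")
            continue
        path, tag_name, tag_value = row
        if tag_name.startswith(RESERVED_PREFIX):
            problems.append(f"row {row_no}: tag name {tag_name!r} uses the reserved mx_ prefix")
        elif not SAFE_TAG_NAME.fullmatch(tag_name):
            problems.append(f"row {row_no}: tag name {tag_name!r} has unsupported characters")
        previous = first_value.setdefault((path, tag_name), tag_value)
        if previous != tag_value:
            problems.append(f"row {row_no}: {path!r} already has {tag_name}={previous!r}")
    return problems


# Hypothetical sample input; csv.reader accepts any iterable of lines.
sample_lines = [
    r'"\\filer\share\foo,bar",mx_owner,1',
    r"\\filer\share\a,dept,hr",
    r"\\filer\share\a,dept,finance",
]
for problem in check_tag_rows(csv.reader(sample_lines)):
    print(problem)
```

To look for conflicting values across several CSV files, pass the rows of all the files to a single call, for example by chaining the readers with itertools.chain.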