Posted 30 Aug 2021

My Media Search: Find and Enjoy the Media Files on your PC

Looking for that great Killers track about being in a Rut? Just enter "Killers Rut" and it'll come right up
See how easy it is to add full-text indexing and implement set operations in a fun little Windows app.

Introduction

My Media Search is a fun little Windows app for finding and enjoying your media files.

You tell it which directories to index, like Pictures, Music, and Videos...

Image 1

...and it indexes the files in those folders (and not in ones you don't want) to support fast searches...

Image 2

...then you can search for silliness...

Image 3

Once you get a list of search results, you can open the files, open their containing folders, and get all the detailed metadata that Windows maintains for each file:

Image 4

Man, that's a lot of metadata!

My Media Search Explained

My Media Search is a Windows Forms application, using .NET Framework 4.7.2. I wanted to use .NET 5.0, but the only out-of-the-box code I knew of for getting file properties and thumbnails was from Windows APIs, namely the Microsoft.WindowsAPICodePack libraries.

You had to see this coming... My Media Search is powered by the metastrings database! I retooled metastrings for this application, and for its general coherence as software:

  • Dropped support for MySQL. Using metastrings only makes sense where performance and scalability are not issues. This positions it as a lightweight database for applications, without having to support being part of an unwieldy client-server database solution.
  • With MySQL out of the way, I was able to lift the length limit on strings, as SQLite has no such limitation. This means you won't have to use the "long strings" API, which I was tempted to remove, but mscript uses it, so it lives on. For now.
  • To cement its role as a small, easy-to-use database, I added full-text indexing for all string values: not long strings, just the Define / SELECT stuff. With full-text indexing in place, metastrings was poised to implement My Media Search.

Code Overview

The lib project has the SearchInfo class, which implements almost all of the non-UI functionality.

The cmd project is a proof-of-concept for SearchInfo. You can use it to index an arbitrary directory, update the index, and perform searches. Note that it indexes the one directory you give it, erasing indexing for any other directories. Just a proof of concept.

C#
using System;
using System.Threading.Tasks;
using System.Collections.Generic;
using System.IO;

namespace fql
{
    class Program
    {
        [STAThread]
        static async Task<int> Main(string[] args)
        {
            if (args.Length < 1)
            {
                Console.WriteLine("Usage: <directory path>");
                return 0;
            }

            string dirPath = args[0].Trim();
            if (!Directory.Exists(dirPath))
            {
                Console.WriteLine("ERROR: Directory does not exist: {0}", dirPath);
                return 1;
            }

#if !DEBUG
            try
#endif
            {
                while (true)
                {
                    Console.WriteLine();
                    Console.WriteLine("Commands: reset, update, search, quit");

                    Console.WriteLine();
                    Console.Write("> ");

                    string line = Console.ReadLine()?.Trim();
                    if (line == null) // end of input, e.g. redirected stdin
                        break;
                    if (line.Length == 0)
                        continue;

                    Console.WriteLine();

                    if (line == "reset")
                    {
                        SearchInfo.Reset();
                        Console.WriteLine("DB reset");
                    }
                    else if (line == "update")
                    {
                        var updateResult = 
                            await SearchInfo.UpdateAsync
                            (
                                new List<string> { dirPath }, 
                                new List<string>(), 
                                OnDirectoryUpdate
                            );
                        Console.WriteLine("DB updated: files added: {0} - removed: {1} - modified: {2} - indexed: {3}", 
                                          updateResult.filesAdded, 
                                          updateResult.filesRemoved, 
                                          updateResult.filesModified,
                                          updateResult.indexSize);
                    }
                    else if (line.StartsWith("search "))
                    {
                        var results = await SearchInfo.SearchAsync
                                      (line.Substring("search ".Length).Trim());
                        Console.WriteLine($"Search results: {results.Count}");
                        foreach (var result in results)
                            Console.WriteLine(result);
                    }
                    else if (line == "quit")
                    {
                        Console.WriteLine("Quitting...");
                        break;
                    }
                }
            }
#if !DEBUG
            catch (Exception exp)
            {
                Console.WriteLine("EXCEPTION: {0}", exp);
                return 1;
            }
#endif
            Console.WriteLine("All done.");
            return 0;
        }

        static void OnDirectoryUpdate(UpdateInfo update)
        {
            Console.WriteLine(update.ToString());
        }
    }
}

The app project is the top-level Windows Forms application, nothing too interesting there, just usual Windows Forms stuff.

The SearchInfo Class

Get All Windows Metadata For a File

C#
public static Dictionary<string, string> GetFileMetadata(string filePath)
{
    Dictionary<string, string> metadata = new Dictionary<string, string>();

    Shell32.Shell shell = new Shell32.Shell();
    Shell32.Folder objFolder = shell.NameSpace(Path.GetDirectoryName(filePath));

    List<string> headers = new List<string>();
    for (int i = 0; i < short.MaxValue; ++i)
    {
        string header = objFolder.GetDetailsOf(null, i);
        if (string.IsNullOrEmpty(header))
            break;

        headers.Add(header);
    }
    if (headers.Count == 0)
        return metadata;

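    // Find the folder item for our file, then collect every non-empty detail column
    foreach (Shell32.FolderItem2 item in objFolder.Items())
    {
        if (!filePath.Equals(item.Path, StringComparison.OrdinalIgnoreCase))
            continue;

        for (int i = 0; i < headers.Count; ++i)
        {
            string details = objFolder.GetDetailsOf(item, i);
            if (!string.IsNullOrWhiteSpace(details))
                metadata[headers[i]] = details; // indexer, so a repeated header name cannot throw
        }
        break; // the file was found, no need to scan the rest of the folder
    }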

    return metadata;
} // GetFileMetadata()

That code is one of the reasons for using .NET Framework and not .NET 5+. Without this function and the tiles view of search results, .NET 5+ would probably work.

The Search Index Algorithm

  1. Gather all file system file paths and last modified dates from the chosen directories
  2. Remove all file system file paths in the exclusion directories
  3. Gather all database file paths and last modified dates
  4. ProcessFiles: March through all file paths, from both the file system and the database, determining which to add to or remove from the database
  5. Do database operations to update the index
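
Steps 1 and 2, plus the combined path list that step 4 walks, could look something like the sketch below. This is not the code from the lib project: the class name IndexInputsSketch, the method names GatherFiles and AllPaths, the use of UTC ticks for the last modified value, and the case-insensitive path comparison are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper, not part of the My Media Search source.
public static class IndexInputsSketch
{
    // Steps 1 and 2: map each file under the included directories to its
    // last modified time (assumed here to be UTC ticks), skipping any file
    // that lives under an excluded directory.
    // Note: EnumerateFiles can throw UnauthorizedAccessException on
    // protected folders; a real app would need to handle that.
    public static Dictionary<string, long> GatherFiles(
        IEnumerable<string> includeDirs, IEnumerable<string> excludeDirs)
    {
        // End each exclusion with a separator so that excluding "Music"
        // does not also exclude "MusicVideos".
        List<string> excluded = excludeDirs
            .Select(d => Path.GetFullPath(d).TrimEnd(Path.DirectorySeparatorChar)
                         + Path.DirectorySeparatorChar)
            .ToList();

        var files = new Dictionary<string, long>(StringComparer.OrdinalIgnoreCase);
        foreach (string dir in includeDirs)
        {
            foreach (string path in Directory.EnumerateFiles(
                         Path.GetFullPath(dir), "*", SearchOption.AllDirectories))
            {
                if (excluded.Any(ex => path.StartsWith(ex, StringComparison.OrdinalIgnoreCase)))
                    continue;

                files[path] = File.GetLastWriteTimeUtc(path).Ticks;
            }
        }
        return files;
    }

    // Input for step 4: every path known to the file system or the
    // database, each listed once (a set union).
    public static HashSet<string> AllPaths(
        Dictionary<string, long> inFs, Dictionary<string, long> inDb)
    {
        var all = new HashSet<string>(inFs.Keys, StringComparer.OrdinalIgnoreCase);
        all.UnionWith(inDb.Keys);
        return all;
    }
}
```

Keeping both sides as path-to-timestamp dictionaries makes the later comparison a pair of key lookups per path, which matches how ProcessFiles consults filesLastModifiedInDb and filesLastModifiedInFs.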

Here's the implementation of the central ProcessFiles function:

C#
private static void ProcessFiles(IEnumerable<string> filePaths, DirProcessInfo info)
{
    List<string> filesToAdd = new List<string>();
    List<object> filesToRemove = new List<object>(); // object for direct metastrings use

    foreach (string filePath in filePaths)
    {
        bool inDb = info.filesLastModifiedInDb.ContainsKey(filePath);
        bool inFs = info.filesLastModifiedInFs.ContainsKey(filePath);

        if (inDb && !inFs)
        {
            ++info.filesRemoved;
            filesToRemove.Add(filePath);
            continue;
        }

        if (inFs && !inDb)
        {
            ++info.filesAdded;
            filesToAdd.Add(filePath);
            continue;
        }

        if (!inDb && !inFs) // weird!
        {
            ++info.filesRemoved;
            filesToRemove.Add(filePath);
            continue;
        }

        // else in both

        if (info.filesLastModifiedInDb[filePath] < info.filesLastModifiedInFs[filePath])
        {
            ++info.filesModified;
            filesToAdd.Add(filePath);
            continue;
        }
    }
    info.toDelete = filesToRemove;

    info.toAdd = new List<Tuple<string, long, string>>(filesToAdd.Count);
    foreach (string filePath in filesToAdd)
    {
        string searchData =
            filePath.Substring(UserRoot.Length)
                .Replace(Path.DirectorySeparatorChar, ' ')
                .Replace('.', ' ');
        while (searchData.Contains("  "))
            searchData = searchData.Replace("  ", " ");
        searchData = searchData.Trim();

        long lastModified = info.filesLastModifiedInFs[filePath];
        info.toAdd.Add
        (
            new Tuple<string, long, string>(filePath, lastModified, searchData)
        );
    }
}

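The code for computing the searchData string for full-text indexing drops the UserRoot prefix, turns directory separators and dots into spaces (so each folder name, the file name, and the file extension become separate words), collapses repeated spaces, and trims the result.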
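
To make the transformation easier to see, here is the same logic pulled out into a standalone function. The class name SearchDataSketch, the method name BuildSearchData, and the userRoot parameter (standing in for the UserRoot value used above) are hypothetical. The StartsWith guard is an addition for safety and is not in the original code.

```csharp
using System;
using System.IO;

// Hypothetical helper that mirrors the searchData logic in ProcessFiles.
public static class SearchDataSketch
{
    public static string BuildSearchData(string filePath, string userRoot)
    {
        // Drop the user root prefix when present (guard added for this sketch)
        string relative = filePath.StartsWith(userRoot, StringComparison.OrdinalIgnoreCase)
            ? filePath.Substring(userRoot.Length)
            : filePath;

        // Directory separators and dots become word boundaries
        string searchData = relative
            .Replace(Path.DirectorySeparatorChar, ' ')
            .Replace('.', ' ');

        // Collapse runs of spaces, then trim the ends
        while (searchData.Contains("  "))
            searchData = searchData.Replace("  ", " ");

        return searchData.Trim();
    }
}
```

On Windows, calling BuildSearchData with the path C:\Users\me\Music\The Killers\Rut.mp3 and the root C:\Users\me produces "Music The Killers Rut mp3", which is why a search for "Killers Rut" can match the file.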

Once ProcessFiles figures out what needs to be done, this code interacts with metastrings to do the deed:

C#
using (var ctxt = msctxt.GetContext())        // create the metastrings Context
{
    update.Start("Cleaning search index...", dirProcInfo.toDelete.Count);
    updater?.Invoke(update);
    await ctxt.Cmd.DeleteAsync("files", dirProcInfo.toDelete);

    update.Start("Updating search index...", dirProcInfo.toAdd.Count);
    updater?.Invoke(update);
    Define define = new Define("files", null); // reuse this object, pull allocs out of loops
    foreach (var tuple in dirProcInfo.toAdd)
    {
        define.key = tuple.Item1;
        define.metadata["filelastmodified"] = tuple.Item2;
        define.metadata["searchdata"] = tuple.Item3;

        await ctxt.Cmd.DefineAsync(define);

        ++update.current;
        if ((update.current % 100) == 0)
            updater?.Invoke(update);
    }
}

Conclusion

So build the app and enjoy playing with it. I think you will find it useful for digging through your thousands of pictures and songs to find just what you're looking for.

Implementing this app was made easy by metastrings.

I hope you now have confidence adding full-text searching to your applications.

Enjoy!

History

  • 29th August, 2021: Initial version

License

This article, along with any associated source code and files, is licensed under The Apache License, Version 2.0

About the Author

Michael Sydney Balloni
Software Developer
United States
Michael Balloni is a manager of software development at a cybersecurity software and services provider.

Check out https://www.michaelballoni.com for all the programming fun he's done over the years.

He has been developing software since 1994, back when Mosaic was the web browser of choice. IE 4.0 changed the world, and Michael rode that wave for five years at a .com that was a cloud storage system before the term "cloud" meant anything. He moved on to a medical imaging gig for seven years, working up and down the architecture of a million-line C++ system.

Michael has been at his current cybersecurity gig for six years, making his way into management. He still loves to code, so he sneaks in as much as he can at work and at home.
