a website project involving the works of longmont potion castle - you can call me stretchie

🧭 Overview 🧭

this website allows for searching through subtitles and speakers within the longmont potion castle discography.

this website can currently be viewed at:

🎮 Features 🎮

its basic feature is that albums and tracks have their own pages, with each track page containing the subtitles for that track. its smart feature is that all of this data is indexed so that it can be searched. its neat feature is that the lpc usb collection can be uploaded into the site so that tracks can be played easily, and playback can jump to the point in a track where a certain subtitle line is spoken.

📘 Backstory 📘

some time ago, i wanted to answer one question - how many calls does alex trebek show up in throughout the discography of lpc?

there are great resources out there like talkin’ whipapedia that have detailed info about albums, tracks, their subtitles, and more. however, that data isn’t structured in a formal way and therefore can’t be indexed in a way that answers my original question. given that i’ve been programming since i was in elementary school, i knew i could create something that would tell me, and i wanted it to be something that i could share within the niche community of lpc.

🔎 Searching 🔎

as of v1.4.0, the search feature uses a logical ‘and’ when operating, instead of a logical ‘or’. this change in behavior affects when multiple words are searched. before, the search would return any subtitles containing any word that was entered. now, the search will only return subtitles that contain all words being searched.

for example, a search term of “cheese pizza” previously returned 134 results - all subtitles containing either the word “cheese” or “pizza”. now, the same search of “cheese pizza” returns 7 results - all subtitles containing both the words “cheese” and “pizza”.

note that results returned are not based on phrase matching. for example, a subtitle of “i want a cheese pizza” will be returned, but so will “i would like cheese on my pizza”. due to limitations of lunr.js, phrase matching is not possible.

also note that the ordering of the words does not matter, so a search for “cheese pizza” and for “pizza cheese” will return the same results.
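
to make the difference between the two behaviors concrete, here is a minimal sketch in plain python. it is not how lunr.js works internally (lunr also does stemming and scoring, which this skips), and the names tokenize and search are made up for this illustration.

```python
import re


def tokenize(text: str) -> set[str]:
    """lowercase a string and split it into a set of words."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def search(subtitles: list[str], query: str, mode: str = "and") -> list[str]:
    """return subtitles containing all ("and") or any ("or") of the query words."""
    terms = tokenize(query)
    if not terms:
        return []
    results = []
    for line in subtitles:
        words = tokenize(line)
        if mode == "and":
            matched = terms <= words  # every search word must be present
        else:
            matched = bool(terms & words)  # at least one search word is present
        if matched:
            results.append(line)
    return results


# hypothetical subtitle lines, not taken from the real data
subtitles = [
    "i want a cheese pizza",
    "i would like cheese on my pizza",
    "do you sell pizza?",
]

and_results = search(subtitles, "cheese pizza", mode="and")  # first two lines only
or_results = search(subtitles, "cheese pizza", mode="or")    # all three lines
```

because the words are compared as sets, word order and adjacency are ignored, which mirrors the two notes above: no phrase matching, and “pizza cheese” behaves the same as “cheese pizza”.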

⚙️ Components ⚙️

this website is built with the static site generator jekyll. whisper-webui analyzes the audio tracks and outputs subtitles (what is spoken) with speaker diarization (determining who says what), which are then transformed into json files. each json file containing a track’s speakers and subtitles data must be manually reviewed and corrected as needed. as changes are made, jekyll build recreates the site’s pages and combines all JSON data into one single JSON data file (combined_data.json).

because the website is static, there is no server-side processing that occurs (other than serving files) - the searching functions run locally within the browser.
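
the combining step is done by jekyll build in this project, but the idea can be sketched in python. this is only an illustration: the function name combine_tracks, the extra Album_Slug and Track_Slug keys, and the flat-list shape of the output are assumptions, not the actual layout of combined_data.json.

```python
import json
from pathlib import Path


def combine_tracks(json_root: Path) -> list[dict]:
    """gather every track's subtitle lines into one flat list."""
    combined = []
    # assumed layout: <json_root>/<album-slug>/<track-slug>.json
    # files sitting directly in json_root are not matched by this pattern
    for track_file in sorted(json_root.glob("*/*.json")):
        lines = json.loads(track_file.read_text(encoding="utf-8"))
        for line in lines:
            combined.append({
                "Album_Slug": track_file.parent.name,  # folder name
                "Track_Slug": track_file.stem,         # filename without .json
                **line,
            })
    return combined
```

writing the result out would then be a single json.dump call, and the browser only has to fetch that one file before indexing.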

↪️ Converting Tracks to Subtitles ↪️

i am using whisper-webui (deployed via pinokio) to run speech-to-text with speaker diarization (who says what) on the .mp3 files and output subtitle files (.srt).

↘️ Converting Subtitles to JSON ↘️

i am using this python tool to convert the subtitle files to json. it also outputs a metadata.json file and a metadata.yml file in the form this project needs.
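
for a rough idea of what such a conversion involves, here is a minimal sketch. it is not the actual tool. the function name srt_to_entries is hypothetical, and the assumption that diarized cues look like “SPEAKER_00|text” (speaker label, a pipe, then the spoken text) may not match what whisper-webui really writes, so the delimiter is a parameter.

```python
import json
import re

# matches an srt timing line such as "00:00:02,140 --> 00:00:02,920"
TIMING = re.compile(r"(\d{2}:\d{2}:\d{2},\d{3}) --> (\d{2}:\d{2}:\d{2},\d{3})")


def srt_to_entries(srt_text: str, delimiter: str = "|") -> list[dict]:
    """turn srt text into a list of subtitle dictionaries."""
    entries = []
    # srt cues are separated by blank lines
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.strip().splitlines()
        if len(lines) < 3:
            continue  # not a complete cue (index, timing, text)
        timing = TIMING.match(lines[1].strip())
        if timing is None:
            continue  # skip cues with a malformed timing line
        text = " ".join(part.strip() for part in lines[2:])
        speaker = ""
        # assumption: the speaker label comes before the delimiter
        if delimiter in text:
            speaker, text = (part.strip() for part in text.split(delimiter, 1))
        entries.append({
            "Index": int(lines[0].strip()),
            "Start Time": timing.group(1),
            "End Time": timing.group(2),
            "Speaker": speaker,
            "Text": text,
        })
    return entries


sample = (
    "1\n00:00:02,140 --> 00:00:02,920\nSPEAKER_01|Betty Boop Diner.\n\n"
    "2\n00:00:04,008 --> 00:00:08,449\nSPEAKER_00|Hi, can I please get\n"
    "a take-up or a pick-up?\n"
)
as_json = json.dumps(srt_to_entries(sample), indent=4)
```

the generic speaker labels would still need to be replaced by hand with names like “LPC” or “Woman 1”, which is the manual review step mentioned earlier.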

💽 JSON Structure for Albums and Tracks 💽

the main JSON data file resides at /assets/data.json

{
  "Albums": [
    { "Album": "Longmont Potion Castle",
      "Album_Slug": "longmont-potion-castle",
      "Album_Picture": "LPC_1.jpg",
      "Year": 1988,
      "Tracks": [
        {
          "Track_Title": "Longmont Theme 1",
          "Track_Number": 1,
          "Track_JSONPath": "longmont-theme-1.json",
          "Track_Slug": "longmont-theme-1",
          "Aliases": "Wallace Thrasher",
          "Establishments": "UPS",
          "Speakers_Adjusted": "false",
          "Subtitles_Adjusted": "false",
          "USB_Filename": "longmont-theme-1.mp3",
          "Whisper_Model": "distil-whisper/distil-large-v3"
        }
      ]
    }
  ]
}

it is possible that some keys are not present in all tracks, but the necessary ones of Track_Title, Track_Number, Track_JSONPath, and Track_Slug are listed for each track.
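
a quick way to confirm that every track carries the necessary keys is a small check like the one below. this is a sketch, not part of the project. the function name missing_track_keys is made up, and it only assumes the Albums / Tracks nesting shown above.

```python
REQUIRED_TRACK_KEYS = ("Track_Title", "Track_Number", "Track_JSONPath", "Track_Slug")


def missing_track_keys(data: dict) -> list[tuple[str, str, str]]:
    """return (album, track, missing key) for every required key that is absent."""
    problems = []
    for album in data.get("Albums", []):
        album_name = album.get("Album", "<unknown album>")
        for track in album.get("Tracks", []):
            track_name = track.get("Track_Title", "<untitled track>")
            for key in REQUIRED_TRACK_KEYS:
                if key not in track:
                    problems.append((album_name, track_name, key))
    return problems


# hypothetical data with one incomplete track
example = {
    "Albums": [
        {
            "Album": "Longmont Potion Castle",
            "Tracks": [
                {
                    "Track_Title": "Longmont Theme 1",
                    "Track_Number": 1,
                    "Track_JSONPath": "longmont-theme-1.json",
                    "Track_Slug": "longmont-theme-1",
                },
                {"Track_Title": "Incomplete Track", "Track_Number": 2},
            ],
        }
    ]
}
problems = missing_track_keys(example)
```

an empty list means every track has the four necessary keys. optional keys such as Aliases or USB_Filename are deliberately not checked.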

💽 JSON Structure for Track Subtitles 💽

the JSON data for each track resides in a folder named after the respective album title’s slug, within the /assets/json folder

[
    {
        "Index": 1,
        "Start Time": "00:00:02,140",
        "End Time": "00:00:02,920",
        "Speaker": "Woman 1",
        "Text": "Betty Boop Diner."
    },
    {
        "Index": 2,
        "Start Time": "00:00:04,008",
        "End Time": "00:00:08,449",
        "Speaker": "LPC",
        "Text": "Hi, can I please get a take-up or a pick-up?"
    }
]
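
the “Start Time” and “End Time” values use the srt timestamp format (hours:minutes:seconds,milliseconds). to jump into a track at a subtitle line, that value has to become a number of seconds at some point. the sketch below shows the arithmetic in python. the function name timestamp_to_seconds is hypothetical, and the site itself would do the equivalent in the browser.

```python
def timestamp_to_seconds(timestamp: str) -> float:
    """convert an srt timestamp like "00:00:04,008" into seconds."""
    clock, millis = timestamp.split(",")
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    return hours * 3600 + minutes * 60 + seconds + int(millis) / 1000


# how far into the track the second example line starts, in seconds
start = timestamp_to_seconds("00:00:04,008")
# how long that line lasts, in seconds
duration = timestamp_to_seconds("00:00:08,449") - start
```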

🚘 Under The Hood 🚘

when the search pages are accessed, the single combined JSON data file (/assets/json/combined_data.json) is retrieved from the server, then lunr indexes the data so that it becomes searchable. lunr currently indexes two categories - speakers and subtitles.
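
as an illustration of what a speaker index makes possible (for example, the alex trebek question from the backstory), here is a plain python sketch. it does not use lunr, the function name tracks_per_speaker is made up, and it assumes each combined entry has a Speaker and a Track_Slug key, which may not match the real combined file.

```python
from collections import defaultdict


def tracks_per_speaker(entries: list[dict]) -> dict[str, set[str]]:
    """map each speaker (lowercased) to the set of tracks they appear in."""
    index = defaultdict(set)
    for entry in entries:
        speaker = entry.get("Speaker", "").strip().lower()
        if speaker:
            index[speaker].add(entry["Track_Slug"])
    return dict(index)


# hypothetical entries, not taken from the real data
entries = [
    {"Track_Slug": "track-a", "Speaker": "Alex Trebek", "Text": "hello?"},
    {"Track_Slug": "track-a", "Speaker": "LPC", "Text": "hi."},
    {"Track_Slug": "track-a", "Speaker": "Alex Trebek", "Text": "who is this?"},
    {"Track_Slug": "track-b", "Speaker": "alex trebek", "Text": "hello?"},
]
index = tracks_per_speaker(entries)
trebek_track_count = len(index["alex trebek"])
```

using a set per speaker means a speaker with many lines in one track still counts that track only once.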

the keys of USB_Directory and USB_Filename refer to the respective directory and filename of the mp3 that resides on the “LPC Ultimate Session Bundle” usb drives that are occasionally available for sale via lpc’s website. these two pieces of data are used to play audio if the files from the usb collection are uploaded.

🛠️ Building 🛠️

to install the project’s dependencies, ensure Ruby is installed, then install its necessary gems by running: bundle install; bundle update;

to build, run this command from the jekyll directory: JEKYLL_ENV=development bundle exec jekyll build

to build and start a local web server, run this command from the jekyll directory: JEKYLL_ENV=development bundle exec jekyll serve

when deploying to production, JEKYLL_ENV must be set to production. the development environment tends to display more of the information within data.json than the production environment does.

📤 Deployment 📤

commits to the main branch trigger two github actions:

  • deploy-production-build.yml:
  • publish-to-github-pages.yml:
    • runs jekyll build --baseurl "/wallace-thrasher" to generate the site on the “gh-pages” branch
    • a separate action then uses the “gh-pages” branch to deploy to github pages

the commit to “production-build” is pulled by netlify to redeploy its copy of the site.

✍️ How to Contribute ✍️

if you’ve read this far and have an interest in contributing to this project - it is welcomed and appreciated!

please refer to CONTRIBUTING.md

☑️ To-Do’s ☑️

the to-do list has been moved to TODO.md

🪪 Licensing 🪪

this project is licensed under the GPLv3, and this license applies to all past versions and branches of the project.

🤓 Technical Details 🤓

here are various badges related to this project’s code and its deployments

Deploy a Production Build

Publish to GitHub Pages – GitHub Action to publish to GitHub Pages

Netlify Status – deployment status to Netlify

GitHub last commit – when last committed to GitHub

GitHub code size – deployed source code size

GitHub repo size – source code repository size

GitHub License – the open source license

GitHub Release – the latest version

notes on version history can be found on the version history page

this website was last built on May 8, 2025 at 12:10 AM EDT