SlideTool

Short Description: SlideTool is a C#/.NET application that automates photo metadata tagging, using AWS-Textract for OCR and ExifTool for metadata embedding. It scans a directory tree, identifies slide scans and the handwritten notes that belong to them, and writes the recognized text into the metadata of the corresponding TIFF files.

Technologies:

Name            Type
C#              Programming Language
GitHub Actions  CI/CD Tool
.NET            Framework
AWS-Textract    SaaS

Long Description:

SlideTool works on digitized slide albums. Each album page is stored as one full-page scan plus individual raw scans of the slides on that page. The tool runs the handwritten notes on the slide frames through AWS-Textract for OCR, matches each recognized text to its slide, and writes the result into that slide's TIFF metadata with ExifTool.

Key Features

  • Directory Path Specification and Scanning: Users specify a directory path for the application to scan. SlideTool recursively navigates through all subdirectories, identifying digital representations of album pages, each containing a full-page scan and several individual raw slide scans.
  • OCR via AWS-Textract: Uses AWS-Textract to run OCR on the handwritten text on the plastic frame of each slide.
  • Intelligent OCR Text Mapping: Processes the OCR results to map each text instance back to the slide it belongs to, so that every note ends up on the correct image.
  • Metadata Writing with ExifTool: Uses ExifTool to embed the OCR-extracted text in the metadata of the TIFF files, which makes the text searchable and keeps it stored with the image.
  • User-Friendly Interface: Designed to be operated easily with minimal training, facilitating efficient management of large datasets.
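The text-to-slide mapping described above can be illustrated with a nearest-center rule. This is only a sketch under stated assumptions, not SlideTool's actual algorithm: the types Box, OcrLine, and SlideRegion and the method MapTextToSlides are hypothetical names, and the positions of the slides on the full-page scan are assumed to be known already. Coordinates are page-relative (0 to 1), which is how Textract reports bounding boxes. The sketch assumes .NET 6 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types for illustration; none of these names come from SlideTool.
// Boxes use page-relative coordinates (0..1), as Textract does for bounding boxes.
public readonly record struct Box(double Left, double Top, double Width, double Height)
{
    public double CenterX => Left + Width / 2;
    public double CenterY => Top + Height / 2;
}

public sealed record OcrLine(string Text, Box Bounds);
public sealed record SlideRegion(string FileName, Box Bounds);

public static class SlideMapper
{
    // Assigns each OCR line to the slide whose center is nearest to the line's center,
    // then joins the lines assigned to a slide in top-to-bottom order.
    // Slides that receive no text do not appear in the result.
    public static Dictionary<string, string> MapTextToSlides(
        IReadOnlyList<SlideRegion> slides, IEnumerable<OcrLine> lines)
    {
        if (slides.Count == 0)
            throw new ArgumentException("At least one slide region is required.", nameof(slides));

        return lines
            .GroupBy(line => slides.MinBy(s => DistanceSquared(s.Bounds, line.Bounds))!.FileName)
            .ToDictionary(
                group => group.Key,
                group => string.Join(" ", group.OrderBy(l => l.Bounds.Top).Select(l => l.Text)));
    }

    // Squared Euclidean distance between the centers of two boxes.
    private static double DistanceSquared(Box a, Box b)
    {
        double dx = a.CenterX - b.CenterX;
        double dy = a.CenterY - b.CenterY;
        return dx * dx + dy * dy;
    }
}
```

A nearest-center rule is the simplest choice that works when slides sit in a regular grid; a real implementation might instead test whether the text box overlaps the slide frame, or reject text that is too far from any slide.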

Security and Scalability

  • Data Integrity: Aims to keep the extracted text accurate from OCR through embedding, so that the metadata written to each TIFF matches what is written on the slide frame.
  • Scalable Processing: Capable of handling large volumes of data efficiently, scaling to meet the demands of professional and archival environments.

Operational Details

  • Automated Workflow: Automates the entire process of metadata tagging from scanning directories to embedding metadata, reducing manual labor and error.
  • CI/CD Integration: Uses GitHub Actions for continuous integration and delivery, so that changes to the software are built and checked automatically.
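The two ends of the automated workflow, directory scanning and metadata embedding, can be sketched as follows. This is a minimal illustration, not SlideTool's actual code: the class and method names are hypothetical, grouping files by directory assumes one album page per directory, and writing to the ImageDescription tag is an assumption, since the source does not say which tags SlideTool uses. The sketch expects an `exiftool` executable on the PATH.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;

// Hypothetical helper for illustration; not part of SlideTool.
public static class SlideWorkflow
{
    // Recursively finds every TIFF below the root and groups the files by
    // containing directory (assumption: one directory per album page).
    public static IEnumerable<IGrouping<string, string>> FindTiffsByDirectory(string root) =>
        Directory.EnumerateFiles(root, "*", SearchOption.AllDirectories)
            .Where(IsTiff)
            .GroupBy(path => Path.GetDirectoryName(path)!);

    private static bool IsTiff(string path)
    {
        string extension = Path.GetExtension(path);
        return extension.Equals(".tif", StringComparison.OrdinalIgnoreCase)
            || extension.Equals(".tiff", StringComparison.OrdinalIgnoreCase);
    }

    // Builds the ExifTool invocation without starting it. ArgumentList passes each
    // argument separately, so text containing spaces or quotes needs no escaping.
    public static ProcessStartInfo BuildExifToolCommand(string tiffPath, string description)
    {
        var startInfo = new ProcessStartInfo("exiftool")
        {
            UseShellExecute = false,
            RedirectStandardError = true,
        };
        startInfo.ArgumentList.Add($"-ImageDescription={description}"); // assumed target tag
        startInfo.ArgumentList.Add(tiffPath);
        return startInfo;
    }

    // Runs ExifTool and fails loudly if it reports an error.
    public static void WriteDescription(string tiffPath, string description)
    {
        using var process = Process.Start(BuildExifToolCommand(tiffPath, description))
            ?? throw new InvalidOperationException("Could not start exiftool.");
        string errors = process.StandardError.ReadToEnd();
        process.WaitForExit();
        if (process.ExitCode != 0)
            throw new InvalidOperationException($"exiftool failed for {tiffPath}: {errors}");
    }
}
```

The sketch leaves out ExifTool's `-overwrite_original` option on purpose: by default ExifTool keeps the unmodified file as a `_original` backup, which is a safer starting point for archival material.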

Usage Scenario

SlideTool is intended for museums, archives, and other organizations that manage large photographic collections. It suits environments that need accurate digitization and cataloging of photographic content, and it makes archival photographs easier to find and use.

By automating the steps from directory scanning to metadata embedding, SlideTool reduces the manual work and the error rate of tagging photographic archives.