Behind the scenes

Stefan Weil edited this page Jun 26, 2023 · 5 revisions

SchedulerJob

The first background job to be executed is the SchedulerJob. It enumerates all mounted storages and schedules a StorageCrawlJob for each of them.
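The fan-out described above can be sketched as follows. This is only an illustration in Python, not the Recognize implementation; `JobQueue` and `schedule_crawls` are hypothetical names, and storages are reduced to plain integer ids.

```python
from dataclasses import dataclass, field


@dataclass
class JobQueue:
    """Hypothetical in-memory stand-in for the background job list."""
    jobs: list = field(default_factory=list)

    def add(self, job_name: str, argument: dict) -> None:
        self.jobs.append((job_name, argument))


def schedule_crawls(storage_ids: list[int], queue: JobQueue) -> None:
    """Schedule one StorageCrawlJob per mounted storage."""
    for storage_id in storage_ids:
        queue.add("StorageCrawlJob", {"storage_id": storage_id})


queue = JobQueue()
schedule_crawls([1, 2, 5], queue)
print(len(queue.jobs))  # 3
```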

StorageCrawlJob

The StorageCrawlJob fetches a batch of 100 files with a supported MIME type from the storage it was scheduled for and adds them to the queues of the respective classifiers.
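A minimal sketch of this batching and routing step, under stated assumptions: the MIME-prefix-to-queue mapping, the queue names, and the resume-by-file-id behaviour are all hypothetical and chosen for illustration. Only the batch size of 100 comes from the description above.

```python
from collections import defaultdict

BATCH_SIZE = 100  # batch size stated in the text

# Hypothetical mapping from MIME type prefix to classifier queue name.
QUEUE_BY_MIME_PREFIX = {
    "image/": "images",
    "video/": "videos",
    "audio/": "audio",
}


def queue_for(mime_type):
    """Return the queue name for a MIME type, or None if unsupported."""
    for prefix, name in QUEUE_BY_MIME_PREFIX.items():
        if mime_type.startswith(prefix):
            return name
    return None


def crawl_batch(files, resume_after=0):
    """Route up to BATCH_SIZE supported files into per-classifier queues.

    `files` is an iterable of (file_id, mime_type) tuples sorted by file_id.
    Returns the queues and the id of the last file queued, so that a
    follow-up run could continue from there (an assumption of this sketch).
    """
    queues = defaultdict(list)
    last_queued = resume_after
    taken = 0
    for file_id, mime_type in files:
        if file_id <= resume_after:
            continue  # already handled by an earlier batch
        name = queue_for(mime_type)
        if name is None:
            continue  # no classifier handles this MIME type
        queues[name].append(file_id)
        last_queued = file_id
        taken += 1
        if taken == BATCH_SIZE:
            break
    return dict(queues), last_queued


files = [(i, "image/jpeg") for i in range(1, 151)] + [(151, "text/plain")]
first, last_id = crawl_batch(files)
second, _ = crawl_batch(files, resume_after=last_id)
print(len(first["images"]), len(second["images"]))  # 100 50
```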

ClassifyXXXJob

The classifier jobs fetch files from their respective queues and run a classifier model on the file contents by calling a Node.js process. For normal classifiers, the returned tags are set on the files that have been processed. In the case of the face classifier, the model returns face vectors along with some metadata for each face detected in a photo; these are stored in a database table called recognize_face_detections.

Note that multiple classifier jobs may run simultaneously at any given time, each adhering to the processor limits in isolation, so in total more CPUs and resources may be used than the number you configured in the settings.
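As a rough worked example of this effect: if each job respects the limit on its own, the worst-case total is the per-job limit multiplied by the number of jobs running at once. The function name and the numbers below are illustrative assumptions; how many jobs actually overlap depends on how background jobs are run on your instance.

```python
def worst_case_cores(per_job_core_limit: int, concurrent_jobs: int) -> int:
    """Upper bound on cores in use when every job honors its own limit."""
    return per_job_core_limit * concurrent_jobs


# A limit of 4 cores with 3 classifier jobs running at the same time:
print(worst_case_cores(4, 3))  # 12
```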

ClusterFacesJob

This job fetches all face detections that have been stored for a user and runs a clustering algorithm over them, taking into account already assigned clusters and manual edits made by the user. Each user has their own clusters.
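To illustrate what "taking into account already assigned clusters" can mean, here is a deliberately simple greedy sketch. It is not the clustering algorithm Recognize uses. It assigns each unassigned detection to the nearest existing cluster centroid if that centroid is within a threshold, opens a new cluster otherwise, and never touches existing assignments. The dict keys, the threshold value, and the function names are hypothetical. The function would be called once per user, since clusters are not shared between users.

```python
import math
from collections import defaultdict


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(column) / len(column) for column in zip(*vectors)]


def cluster_user_faces(detections, threshold=0.5):
    """Assign a cluster id to each unassigned detection of one user.

    `detections` is a list of dicts with the keys "vector" and
    "cluster_id" (None when unassigned). Existing assignments,
    including manual edits, are left unchanged.
    """
    members = defaultdict(list)
    for det in detections:
        if det["cluster_id"] is not None:
            members[det["cluster_id"]].append(det["vector"])

    next_id = max(members, default=0) + 1
    for det in detections:
        if det["cluster_id"] is not None:
            continue
        best_id, best_dist = None, threshold
        for cluster_id, vectors in members.items():
            dist = math.dist(det["vector"], centroid(vectors))
            if dist <= best_dist:
                best_id, best_dist = cluster_id, dist
        if best_id is None:
            # Nothing close enough: open a new cluster.
            best_id = next_id
            next_id += 1
        det["cluster_id"] = best_id
        members[best_id].append(det["vector"])
    return detections


faces = [
    {"vector": [0.0, 0.0], "cluster_id": 1},     # already assigned
    {"vector": [0.1, 0.0], "cluster_id": None},
    {"vector": [5.0, 5.0], "cluster_id": None},
    {"vector": [5.1, 5.0], "cluster_id": None},
]
print([f["cluster_id"] for f in cluster_user_faces(faces)])  # [1, 1, 2, 2]
```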

DAV endpoint

Recognize provides a WebDAV endpoint that exposes people as folders/collections containing photos that are enriched with metadata containing the recognized faces and their positions.
