Importing in Miranda

This page outlines how we import records into the Miranda beta. There are currently two import methods: small batch importing and bulk importing. In both cases, we import a JSON file of Miranda records, which were created either by transforming current holdings records or by writing original Miranda records.

We will continue to refine and improve our importing process.

Small batch imports:

For a single record or a small number of Miranda records, typically a JSON file containing fewer than 300 records, we can import directly into Miranda from the system's web-based administrative interface. Through this interface, we can also validate records against the Miranda schema to help ensure a successful import. The importer's size limit, however, makes it impractical for larger imports.
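As a rough illustration of the size guideline above, the sketch below counts the records in a JSON file and suggests which import method fits. It assumes the file's top level is a JSON array of records and treats 300 as a rule of thumb rather than a hard system limit; the function name and constant are hypothetical and not part of Miranda.

```python
import json

# Rule-of-thumb threshold taken from the guideline above; not a Miranda setting.
SMALL_BATCH_LIMIT = 300


def choose_import_method(json_text: str, limit: int = SMALL_BATCH_LIMIT) -> str:
    """Return "small batch" or "bulk" for the contents of a JSON import file.

    Assumes the top level of the file is a JSON array of records; the real
    Miranda import format may wrap records differently.
    """
    records = json.loads(json_text)
    if not isinstance(records, list):
        raise ValueError("expected a JSON array of records")
    return "small batch" if len(records) < limit else "bulk"
```

For example, a file holding two records would be routed to the small batch importer, while a file holding 300 or more would be routed to the bulk process.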

Bulk imports:

For larger files or a batch of multiple files, we use the bulk import process, which runs as an AWS ECS task started from the AWS Console. This decouples the import process from the administrative interface, allowing the import to operate as a long-running background process. In this method, files are first uploaded to a specified S3 bucket. The task then reads the files and imports them into Miranda. The process can be monitored through logs that are written to CloudWatch while the task runs.
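We currently start the task by hand from the AWS Console, but the same two steps (upload, then run the task) could be scripted. The sketch below is only an illustration under several assumptions: the bucket name, key prefix, cluster, task definition, and subnets are placeholders, and the Fargate launch type is a guess rather than a description of our actual setup. The functions accept client objects so they can be exercised without AWS access; `upload_file` and `run_task` are the standard boto3 S3 and ECS client methods.

```python
from pathlib import Path


def upload_import_files(s3_client, paths, bucket, prefix="imports/"):
    """Upload each local file to the bucket and return the S3 keys used.

    The "imports/" prefix is a placeholder, not the real bucket layout.
    """
    keys = []
    for path in paths:
        key = prefix + Path(path).name
        s3_client.upload_file(str(path), bucket, key)
        keys.append(key)
    return keys


def start_import_task(ecs_client, cluster, task_definition, subnets):
    """Start one run of the import task and return its ARN.

    Assumes a Fargate task in private subnets; adjust to the real setup.
    """
    response = ecs_client.run_task(
        cluster=cluster,
        taskDefinition=task_definition,
        launchType="FARGATE",
        networkConfiguration={
            "awsvpcConfiguration": {
                "subnets": subnets,
                "assignPublicIp": "DISABLED",
            }
        },
    )
    if not response.get("tasks"):
        # run_task reports problems in "failures" instead of raising.
        raise RuntimeError(f"task did not start: {response.get('failures')}")
    return response["tasks"][0]["taskArn"]
```

With boto3 installed and credentials configured, the clients would come from `boto3.client("s3")` and `boto3.client("ecs")`. Progress would still be followed in the task's CloudWatch log group.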

During both import processes, Miranda reads each import file and separates it into individual records for import. It evaluates each record against the schema and either accepts it or returns an error noting a validation issue with the record. If the record is accepted, three things then occur: the record is added to Miranda's data store; if properly flagged, it is added to Miranda's search index; and, if it references a file, such as an image, via the fileURL field, that URL is added to a queue so the file can be copied into Miranda's own file system.
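The per-record flow described above can be sketched in a few lines. This is a simplified illustration, not Miranda's actual code: the required fields stand in for the real schema, the `searchable` flag name is hypothetical, and a dict, a set, and a list-like queue stand in for the data store, search index, and file-copy queue. Only the fileURL field name comes from the description above.

```python
import json

# Hypothetical stand-in for the Miranda schema: fields every record must have.
REQUIRED_FIELDS = ("id", "title")


def validate(record):
    """Return a list of validation errors; an empty list means accepted."""
    if not isinstance(record, dict):
        return ["record is not a JSON object"]
    return [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in record]


def import_file(json_text, store, search_index, file_queue):
    """Import each record in a JSON array; return (accepted_count, errors).

    errors is a list of (position_in_file, [messages]) for rejected records.
    """
    accepted, errors = 0, []
    for position, record in enumerate(json.loads(json_text)):
        problems = validate(record)
        if problems:
            errors.append((position, problems))
            continue
        # 1. Every accepted record goes into the data store.
        store[record["id"]] = record
        # 2. Only flagged records are indexed ("searchable" is a made-up flag name).
        if record.get("searchable"):
            search_index.add(record["id"])
        # 3. A referenced file is queued for copying rather than copied inline.
        if record.get("fileURL"):
            file_queue.append(record["fileURL"])
        accepted += 1
    return accepted, errors
```

Rejected records are collected and reported rather than stopping the run, so one invalid record does not block the rest of the file. Queuing file URLs instead of copying files inline keeps the record import itself fast.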

Future importing:

In the next phase of the project, we plan to automate regular updates from Voyager and other Folger systems, and to allow administrators to create Miranda records in-platform, among other tasks.