
Wednesday, 1 July 2026

Metadata

Zenodo keeps extending its platform to support modern open science workflows. It runs on the InvenioRDM infrastructure, is hosted at CERN, and serves as a widely trusted repository for long-term data preservation and scholarly compliance. [1, 2]

The latest technical capabilities, automated integrations, and policy rules on Zenodo include: [3]

1. AI-Assisted Metadata (The AIRDEC Project)

Zenodo integrates artificial intelligence to simplify the deposition workflow and improve metadata accuracy: [4]
  • Authority Linking: The system automatically cross-references and links authors to their ORCID profiles, and maps institutional data directly to Research Organization Registry (ROR) identifiers.
  • Contextual Suggestions: AI tools derive subject classifications and keywords directly from your uploaded abstract or record context.
  • Policy Compliance: Automated checks review submissions against repository and community curation guidelines before final publication. [4]
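As a small illustration of the authority-linking step, a client preparing metadata could check that an ORCID iD is well formed before submitting it. The sketch below uses the ISO 7064 MOD 11-2 check digit that ORCID iDs carry; the function names are hypothetical, and nothing here is claimed to be how Zenodo itself performs the link.

```python
def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True if the iD has 16 characters and a matching check character."""
    compact = orcid.replace("-", "").upper()
    if len(compact) != 16 or not compact[:15].isdigit():
        return False
    return orcid_check_digit(compact[:15]) == compact[15]


# 0000-0002-1825-0097 is the sample iD ORCID uses in its own documentation.
print(is_valid_orcid("0000-0002-1825-0097"))
```

A check like this only catches typos; it does not confirm that the iD belongs to the named author.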

2. Post-Publication Deletion and Tombstone Architecture

To let users correct mistakes without undermining the permanence required by the scholarly record, Zenodo uses a tiered deletion protocol: [5]
  • The 30-Day Autonomy Window: Users can modify or delete files on their own within the first 30 days after publication. While metadata is being edited, a countdown indicator shows how many days remain. [5]
  • The Post-30-Day Support Queue: Once a record crosses the 30-day mark, it cannot be deleted by the user. A formal ticket must be sent to the Zenodo Support Team detailing a critical justification (e.g., severe copyright violation or a privacy leak). [5]
  • Automated Tombstones: If an older record is approved for deletion, the data files are removed, but the landing page is replaced by a permanent "tombstone" placeholder. This preserves the registered DOI history, prevents dead links, and tells anyone following a citation that the material was removed. [5]
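The tiered rule above can be sketched as a simple date calculation. This is a minimal illustration, assuming the window is counted in whole days from the publication date; the boundary handling and the function names are assumptions, not Zenodo's actual implementation.

```python
from datetime import date

# Assumption: 30 whole days, as described in the text above.
SELF_SERVICE_WINDOW_DAYS = 30


def days_remaining(published: date, today: date) -> int:
    """Return the days left in the self-service window, never below zero."""
    elapsed = (today - published).days
    return max(SELF_SERVICE_WINDOW_DAYS - elapsed, 0)


def deletion_route(published: date, today: date) -> str:
    """Return 'self-service' inside the window, otherwise 'support-ticket'."""
    if days_remaining(published, today) > 0:
        return "self-service"
    return "support-ticket"


print(days_remaining(date(2026, 6, 1), date(2026, 6, 11)))
print(deletion_route(date(2026, 6, 1), date(2026, 7, 1)))
```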

3. Horizon Europe and EU-Funded Curations

Zenodo serves as a primary compliance platform for European Union Open Science mandates, executing automated synchronization pipelines: [6, 7]
  • EU Community Tracking: The platform tracks and auto-harvests EU-funded data uploaded outside standard directories. Over 2,700 distinct European project sub-communities are onboarded onto the platform. [6]
  • The EU Open Research Repository: This specialized sub-collection provides dedicated data spaces for Horizon Europe, Marie Skłodowska-Curie Actions (MSCA), and European Research Council (ERC) grants. [7]
  • Expanded Storage Allowances: Standard uploads are restricted to a 50 GB threshold, but records submitted through verified EU Project Communities receive an expanded allowance of 200 GB per dataset. [1, 7]
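Before preparing a large deposit, it can help to check the total size against the two allowances mentioned above. The following is a rough sketch, assuming decimal gigabytes (the text does not say whether GB or GiB is meant) and hypothetical helper names.

```python
from typing import Iterable

GB = 10**9  # Assumption: decimal gigabytes.
STANDARD_LIMIT_BYTES = 50 * GB
EU_COMMUNITY_LIMIT_BYTES = 200 * GB


def upload_limit(in_eu_project_community: bool) -> int:
    """Return the allowance in bytes for the given kind of record."""
    if in_eu_project_community:
        return EU_COMMUNITY_LIMIT_BYTES
    return STANDARD_LIMIT_BYTES


def fits_allowance(file_sizes: Iterable[int], in_eu_project_community: bool) -> bool:
    """Return True if the combined file sizes stay within the allowance."""
    return sum(file_sizes) <= upload_limit(in_eu_project_community)


sizes = [30 * GB, 30 * GB]  # two hypothetical 30 GB files
print(fits_allowance(sizes, in_eu_project_community=False))
print(fits_allowance(sizes, in_eu_project_community=True))
```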

4. Advanced API Structuring and Rate Limiting

Because Zenodo's metadata index is valuable to data-mining scrapers and large language models, the platform enforces strict API traffic boundaries:
  • Search API Boundaries: Zenodo implements targeted rate-limits across its specific records search endpoints to mitigate aggressive automated web harvesting.
  • Segregated Upload Pipelines: Programmatic deposition and file upload APIs use isolated pathways from public search indexes, ensuring massive batch-data uploads proceed at top speeds without getting throttled by search-rate parameters.
  • Sandbox Verification: Developers should build and test their token-authenticated workflows against the Zenodo Sandbox API and confirm their error handling before running code against production.
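A client that respects rate limits typically backs off when the server answers with HTTP 429 (Too Many Requests). The sketch below shows one way to do that with only the standard library. The `send` callable is a hypothetical stand-in for whatever function performs the request, and the use of a numeric `Retry-After` header is an assumption based on general HTTP conventions, not on documented Zenodo behavior.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str]]  # (status code, response headers)


def call_with_backoff(
    send: Callable[[], Response],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() until it stops returning 429 or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, headers = send()
        if status != 429 or attempt == max_attempts - 1:
            break
        retry_after = headers.get("Retry-After", "")
        if retry_after.isdigit():
            delay = float(retry_after)  # the server suggested a wait in seconds
        else:
            delay = base_delay * 2**attempt  # exponential fallback
        sleep(delay)
    return status, headers
```

Passing `sleep` as a parameter keeps the function easy to exercise in tests, which suits the sandbox-first workflow described above.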
