Add documentation infrastructure and improve documentation site

This commit adds several improvements to the documentation:
- Updated Sphinx configuration for better autodoc generation
- Added a versions template for Read the Docs
- Refined content for index, introduction, and FAQ pages
- Updated usage instructions
- Incremented project version
- Improved setup.py to handle README loading
Alexander Minges 2025-07-10 10:04:38 +02:00
parent 1a1eded67a
commit c97a89967c
Signed by: Athemis
SSH key fingerprint: SHA256:TUXshgulbwL+FRYvBNo54pCsI0auROsSEgSvueKbkZ4
10 changed files with 140 additions and 54 deletions


@@ -6,23 +6,25 @@
 # Note that environment variables can be set in several places
 # See https://docs.gitlab.com/ee/ci/variables/#cicd-variable-precedence
 stages:
   - test
   - secret-detection
+  - build-docs
+  - pages
 variables:
   PIP_CACHE_DIR: "$CI_PROJECT_DIR/.cache/pip"
 cache:
   paths:
     - ".cache/pip/"
     - ".venv/"
 test:
   stage: test
   image: python:3
   before_script:
     - python -m pip install --upgrade pip
     - pip install -r requirements.txt
     - pip install -r requirements-dev.txt
   script:
     - pytest
   artifacts:
     reports:
       junit: junit.xml
@@ -30,13 +32,46 @@ test:
         coverage_format: cobertura
         path: coverage.xml
     paths:
       - htmlcov/
     expire_in: 1 week
   coverage: "/(?i)total.*? (100(?:\\.0+)?\\%|[1-9]?\\d(?:\\.\\d+)?\\%)$/"
   only:
     - branches
     - merge_requests
 secret_detection:
   stage: secret-detection
+
+build-docs:
+  stage: build-docs
+  image: python:3
+  before_script:
+    - python -m pip install --upgrade pip
+    - pip install -r requirements.txt
+    - pip install -r requirements-doc.txt
+  script:
+    - cd docs
+    - make html
+  artifacts:
+    paths:
+      - docs/build/html/
+    expire_in: 1 week
+  only:
+    - branches
+    - merge_requests
+
+pages:
+  stage: pages
+  dependencies:
+    - build-docs
+  script:
+    - mkdir -p public
+    - cp -r docs/build/html/* public/
+  artifacts:
+    paths:
+      - public
+    expire_in: 1 week
+  only:
+    - main
+
 include:
   - template: Security/Secret-Detection.gitlab-ci.yml


@@ -0,0 +1,19 @@
+{%- if versions %}
+<div class="rst-versions" data-toggle="rst-versions" role="note" aria-label="versions">
+  <span class="rst-current-version" data-toggle="rst-current-version">
+    <span class="fa fa-book"> Read the Docs</span>
+    v: {{ current_version }}
+    <span class="fa fa-caret-down"></span>
+  </span>
+  <div class="rst-other-versions">
+    <dl>
+      <dt>Versions</dt>
+      {%- for item in versions %}
+      <dd>
+        <a href="{{ item.url }}">{{ item.name }}</a>
+      </dd>
+      {%- endfor %}
+    </dl>
+  </div>
+</div>
+{%- endif %}


@@ -14,7 +14,7 @@ sys.path.insert(0, os.path.abspath('../..'))
 project = 'doi2dataset'
 copyright = '2025, Alexander Minges'
 author = 'Alexander Minges'
-release = '1.0'
+release = '2.0.2'
 
 # -- General configuration ---------------------------------------------------
 # https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration
@@ -24,6 +24,19 @@ extensions = ["sphinx.ext.autodoc", "sphinx.ext.napoleon"]
 templates_path = ['_templates']
 exclude_patterns = []
 
+# -- Options for autodoc ----------------------------------------------------
+# https://www.sphinx-doc.org/en/master/usage/extensions/autodoc.html
+
+autodoc_default_options = {
+    'members': True,
+    'undoc-members': True,
+    'show-inheritance': True,
+    'special-members': '__init__',
+}
+
+# Suppress warnings about duplicate object descriptions
+suppress_warnings = ['autodoc.import_object', 'ref.duplicate']
+
 
 # -- Options for HTML output -------------------------------------------------
 # https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output


@@ -1,7 +0,0 @@
-doi2dataset module
-==================
-
-.. automodule:: doi2dataset
-   :members:
-   :show-inheritance:
-   :undoc-members:


@@ -1,14 +1,36 @@
 Frequently Asked Questions (FAQ)
 ================================
 
-Q: What is **doi2dataset**?
-A: **doi2dataset** is a tool to process DOIs and generate metadata for Dataverse datasets by fetching data from external APIs like OpenAlex and CrossRef.
+**Q: What is doi2dataset?**
+
+A: **doi2dataset** is a tool to process DOIs and generate standard Dataverse citation metadata by fetching data from external APIs like OpenAlex and CrossRef.
+
+----
+
+**Q: How do I install doi2dataset?**
 
-Q: How do I install **doi2dataset**?
 A: You can clone the repository from GitHub or install it via pip. Please refer to the Installation section for details.
 
-Q: Can I upload metadata directly to a Dataverse server?
-A: Yes, the tool provides an option to upload metadata via the command line using the ``-u`` flag. Ensure that your configuration in `config.yaml` is correct.
+----
+
+**Q: Can I upload metadata directly to a Dataverse server?**
+
+A: Yes, the tool provides an option to upload metadata via the command line using the ``-u`` flag. Ensure that your configuration in `config.yaml` includes the correct Dataverse connection details.
+
+----
+
+**Q: What command line options are available?**
+
+A: The tool supports several options including ``-f`` for input files, ``-o`` for output directory, ``-d`` for depositor name, ``-s`` for subject, ``-m`` for contact email, ``-u`` for upload, and ``-r`` for using ROR identifiers.
+
+----
+
+**Q: Do I need to configure PIs in the config file?**
+
+A: No, PI configuration is optional. It's only used as a fallback for determining corresponding authors when they're not explicitly specified in the publication metadata.
+
+----
+
+**Q: Where can I find the API documentation?**
 
-Q: Where can I find the API documentation?
 A: The API reference is generated automatically in the Modules section of this documentation.


@@ -8,17 +8,24 @@ doi2dataset documentation
 Overview
 --------
 
-**doi2dataset** is a Python tool designed to process DOIs and generate metadata for Dataverse datasets.
+**doi2dataset** is a Python tool designed to process DOIs and generate standard citation metadata for Dataverse datasets.
 It retrieves data from external APIs such as OpenAlex and CrossRef and converts it into a format that meets Dataverse requirements.
 
 Key Features:
 
-- **Validation** and normalization of DOIs
-- Retrieval and processing of **metadata** (e.g., abstract, license, author information)
-- Automatic mapping and generation of metadata fields (e.g., title, description, keywords)
-- Support for controlled vocabularies and complex (compound) metadata fields
-- Optional **uploading** of metadata to a Dataverse server
+- **DOI validation** and normalization
+- **Metadata retrieval** from external APIs (OpenAlex, CrossRef)
+- **Standard Dataverse metadata** generation including:
+  - Title, publication date, and alternative URL
+  - Author information with affiliations and ORCID identifiers
+  - Dataset contact information (corresponding authors)
+  - Abstract and description
+  - Keywords and subject classification
+  - Grant/funding information
+  - License information when available
+- **Optional uploading** of metadata to a Dataverse server
 - **Progress tracking** and error handling using the Rich library
+- **Research Organization Registry (ROR)** support for institutional identifiers
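As an illustration of the first feature in the list, DOI validation and normalization could look roughly like the sketch below. It is not taken from doi2dataset: the function name `normalize_doi` is hypothetical, and the pattern is a commonly used approximation of modern DOI syntax, not a complete validator.

```python
import re

# Approximate pattern: "10.", a 4-9 digit registrant code, "/", then a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")
PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(raw: str) -> str:
    """Strip common prefixes, lowercase, and validate; raise ValueError if invalid."""
    doi = raw.strip()
    lowered = doi.lower()
    for prefix in PREFIXES:
        if lowered.startswith(prefix):
            doi = doi[len(prefix):]
            break
    # DOIs are case-insensitive, so lowercasing gives a canonical form.
    doi = doi.strip().lower()
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"Not a valid DOI: {raw!r}")
    return doi
```

Normalizing before any API lookup means that a bare DOI, a `doi:` identifier, and a resolver URL all map to the same key.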


@@ -1,8 +1,8 @@
 Introduction
 ============
 
-Welcome to the **doi2dataset** documentation. This guide provides an in-depth look at the tool, its purpose, and how it can help you generate metadata for Dataverse datasets.
+Welcome to the **doi2dataset** documentation. This guide provides an in-depth look at the tool, its purpose, and how it can help you generate standard citation metadata for Dataverse datasets.
 
-The **doi2dataset** tool is aimed at researchers, data stewards, and developers who need to convert DOI-based metadata into a format compatible with Dataverse. It automates the retrieval of metadata from external sources (like OpenAlex and CrossRef) and performs necessary data transformations.
+The **doi2dataset** tool is aimed at researchers, data stewards, and developers who need to convert DOI-based metadata into a format compatible with Dataverse. It automates the retrieval of metadata from external sources (like OpenAlex and CrossRef) and generates standard Dataverse citation metadata blocks including title, authors, abstract, keywords, and funding information.
 
 In the following sections, you'll learn about the installation process, usage examples, and a detailed API reference.


@@ -1,7 +0,0 @@
-setup module
-============
-
-.. automodule:: setup
-   :members:
-   :show-inheritance:
-   :undoc-members:


@@ -21,6 +21,7 @@ The tool offers several command line options:
 - ``-s, --subject``: Default subject for the metadata.
 - ``-m, --contact-mail``: Contact email address.
 - ``-u, --upload``: Flag to upload metadata to a Dataverse server.
+- ``-r, --use-ror``: Use Research Organization Registry (ROR) identifiers for institutions when available.
 
 Configuration via config.yaml
 -------------------------------
@@ -42,27 +43,18 @@ Make sure that your **config.yaml** is properly configured before running the to
       auth_password: "your_password"
       dataverse: "your_dataverse_name"
 
-    phase:
-      Phase1:
-        start: 2010
-        end: 2015
-      Phase2:
-        start: 2016
-        end: 2020
     pis:
       - given_name: "John"
         family_name: "Doe"
         email: "john.doe@example.com"
         orcid: "0000-0001-2345-6789"
         affiliation: "Example University"
-        project:
-          - "Project A"
-          - "Project B"
 
     default_grants:
       - funder: "Funder Name"
         id: "GrantID12345"
-      - funder: "Another Funding Agency"
-        id: "GrantID98765"
 
 Usage Example with Configuration
 ----------------------------------
@@ -70,7 +62,7 @@ If you have configured your **config.yaml** and want to process DOIs from a file
 
 .. code-block:: bash
 
-    python doi2dataset.py -f dois.txt -o output/ -d "John Doe" -s "Medicine, Health and Life Sciences" -m "john.doe@example.com" -u
+    python doi2dataset.py -f dois.txt -o output/ -d "Doe, John" -s "Medicine, Health and Life Sciences" -m "john.doe@example.com" -u -r
 
 This command will use the options provided on the command line as well as the settings from **config.yaml**.
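The command line used in the example above can be approximated with `argparse`. This is a sketch under stated assumptions, not the tool's actual parser: the short flags and the long names `--subject`, `--contact-mail`, `--upload`, and `--use-ror` come from the option list in this file, while `--file`, `--output-dir`, and `--depositor` are guessed long names.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="doi2dataset")
    # Long names for -f, -o and -d are assumptions made for this sketch.
    parser.add_argument("-f", "--file", help="Input file with DOIs")
    parser.add_argument("-o", "--output-dir", default=".", help="Output directory")
    parser.add_argument("-d", "--depositor", help="Depositor name")
    parser.add_argument("-s", "--subject", help="Default subject for the metadata")
    parser.add_argument("-m", "--contact-mail", help="Contact email address")
    parser.add_argument("-u", "--upload", action="store_true",
                        help="Upload metadata to a Dataverse server")
    parser.add_argument("-r", "--use-ror", action="store_true",
                        help="Use ROR identifiers for institutions when available")
    return parser


args = build_parser().parse_args(
    ["-f", "dois.txt", "-o", "output/", "-d", "Doe, John", "-u", "-r"]
)
print(args.depositor, args.upload, args.use_ror)
```

Modelling `-u` and `-r` as `store_true` switches reflects how they appear in the example: they take no value and default to off.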


@@ -1,10 +1,22 @@
+import os
 from setuptools import find_packages, setup
 
+# Get the directory containing this file
+here = os.path.abspath(os.path.dirname(__file__))
+
+# Read the README file
+readme_path = os.path.join(here, "README.md")
+long_description = ""
+if os.path.exists(readme_path):
+    with open(readme_path, encoding="utf-8") as f:
+        long_description = f.read()
+
 setup(
     name="doi2dataset",
     version="1.0",
     description="A tool to process DOIs and generate metadata for Dataverse.org datasets.",
-    long_description=open("README.md", encoding="utf-8").read() if open("README.md", encoding="utf-8") else "",
+    long_description=long_description,
     long_description_content_type="text/markdown",
     author="Alexander Minges",
     author_email="alexander.minges@uni-due.de",