# doi2dataset

**doi2dataset** is a Python tool designed to process DOIs and generate metadata for Dataverse.org datasets. It retrieves metadata from external APIs (such as OpenAlex and CrossRef), maps metadata fields, and can optionally upload the generated metadata to a Dataverse.org instance.
## Features

- **DOI Validation and Normalization:** Validates DOIs and converts them into a standardized format.
- **Metadata Retrieval:** Fetches metadata such as title, abstract, license, and author information from external sources.
- **Metadata Mapping:** Automatically maps and generates metadata fields (e.g., title, description, keywords), including support for controlled vocabularies and compound fields.
- **Optional Upload:** Allows uploading of metadata directly to a Dataverse.org server.
- **Progress Tracking:** Uses the Rich library for user-friendly progress tracking and error handling.
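
To illustrate what DOI validation and normalization can look like, here is a minimal sketch. It is not taken from the doi2dataset source: the function name `normalize_doi`, the prefix list, and the loose regular expression are assumptions chosen for the example, and the tool's own standardized format may differ.

```python
import re

# Prefixes people commonly paste in front of a bare DOI (assumed list).
_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)

# Loose pattern for modern DOIs: "10.", a 4-9 digit registrant code,
# a slash, then a non-empty suffix without whitespace.
_DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def normalize_doi(raw: str) -> str:
    """Return the bare, lower-cased DOI, or raise ValueError if it does not look like one."""
    doi = raw.strip()
    lowered = doi.lower()
    for prefix in _PREFIXES:
        if lowered.startswith(prefix):
            # Drop the resolver URL or "doi:" label, keeping only the identifier.
            doi = doi[len(prefix):]
            break
    # DOIs are case-insensitive, so lower-casing gives a stable form.
    doi = doi.strip().lower()
    if not _DOI_PATTERN.match(doi):
        raise ValueError(f"not a valid DOI: {raw!r}")
    return doi
```

With this sketch, a resolver URL, a `doi:`-prefixed string, and a bare DOI all reduce to the same identifier, while input that does not match the pattern is rejected with a `ValueError`.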
## Installation

Clone the repository:

```bash
git clone https://git.athemis.de/Athemis/doi2dataset
cd doi2dataset
```
## Configuration

Before running the tool, configure the necessary settings in the `config.yaml` file located in the project root. This file contains details such as:

- Connection details (URL, API token, authentication credentials)
- Mapping of project phases
- Principal Investigator (PI) information
- Default grant configurations
## Usage

Run doi2dataset from the command line by providing one or more DOIs:

```bash
python doi2dataset.py [options] DOI1 DOI2 ...
```
### Command Line Options

- `-f, --file`
  Specify a file containing DOIs (one per line).

- `-o, --output-dir`
  Directory where metadata files will be saved.

- `-d, --depositor`
  Name of the depositor.

- `-s, --subject`
  Default subject for the metadata.

- `-m, --contact-mail`
  Contact email address.

- `-u, --upload`
  Upload metadata to a Dataverse.org server.
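
The options above could be modelled with `argparse` roughly as follows. This is a sketch for illustration, not the parser doi2dataset actually ships: the flag names come from the list above, but the helper names `build_parser` and `collect_dois`, the default output directory, treating `--upload` as a boolean switch, and merging positional DOIs with those read from `--file` are all assumptions.

```python
from __future__ import annotations

import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented options (hypothetical reconstruction)."""
    parser = argparse.ArgumentParser(prog="doi2dataset.py")
    parser.add_argument("dois", nargs="*", help="One or more DOIs")
    parser.add_argument("-f", "--file", help="File containing DOIs (one per line)")
    # Assumed default: write into the current directory.
    parser.add_argument("-o", "--output-dir", default=".", help="Directory for metadata files")
    parser.add_argument("-d", "--depositor", help="Name of the depositor")
    parser.add_argument("-s", "--subject", help="Default subject for the metadata")
    parser.add_argument("-m", "--contact-mail", help="Contact email address")
    # Assumed to be a simple on/off switch.
    parser.add_argument("-u", "--upload", action="store_true", help="Upload metadata")
    return parser


def collect_dois(args: argparse.Namespace) -> list[str]:
    """Combine DOIs given on the command line with those listed in --file, skipping blank lines."""
    dois = list(args.dois)
    if args.file:
        for line in Path(args.file).read_text(encoding="utf-8").splitlines():
            line = line.strip()
            if line:
                dois.append(line)
    return dois


if __name__ == "__main__":
    parsed = build_parser().parse_args()
    print(collect_dois(parsed))
```

Reading the file line by line and dropping empty lines matches the "one per line" format described for `-f, --file`; how the real tool orders or deduplicates DOIs from both sources is not specified here.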
## Documentation

Documentation is generated using Sphinx. See the `docs/` directory for detailed API references and usage examples.
## Testing

Tests are implemented with pytest. To run the tests, execute:

```bash
pytest
```
## Contributing

Contributions are welcome! Please fork the repository and submit a pull request with your improvements.
## License

This project is licensed under the MIT License. See the [LICENSE.md](LICENSE.md) file for details.