Update testing documentation and improve test structure
All checks were successful
Test pipeline / test (push) Successful in 12s
parent 1c84cae93b
commit eb270cba9b
9 changed files with 617 additions and 20 deletions
60
README.md
@@ -69,30 +69,34 @@ Documentation is generated using Sphinx. See the `docs/` directory for detailed

## Testing

Tests are implemented with pytest. The test suite provides comprehensive coverage of core functionalities.
## Testing

### Running Tests

To run the tests, execute:
Tests are implemented with pytest. The test suite provides comprehensive coverage of core functionalities. To run the tests, execute:

```bash
pytest
```

Or using the Python module syntax:

```bash
python -m pytest
```

### Code Coverage

The project includes code coverage analysis using pytest-cov. Current coverage is approximately 53% of the codebase, with key utilities and test infrastructure at 99-100% coverage.
The project includes code coverage analysis using pytest-cov. Current coverage is approximately 61% of the codebase, with key utilities and test infrastructure at 99-100% coverage.

To run tests with code coverage analysis:

```bash
pytest --cov=doi2dataset
pytest --cov=.
```

Generate a detailed HTML coverage report:

```bash
pytest --cov=doi2dataset --cov-report=html
pytest --cov=. --cov-report=html
```

This creates a `htmlcov` directory. Open `htmlcov/index.html` in a browser to view the detailed coverage report.
@@ -102,38 +106,56 @@ A `.coveragerc` configuration file is provided that:
- Configures reporting to ignore common non-testable lines (like defensive imports)
- Sets the output directory for HTML reports

To increase coverage:
1. Focus on adding tests for the MetadataProcessor class
2. Add tests for the LicenseProcessor and SubjectMapper with more diverse inputs
3. Create tests for the Configuration loading system
Recent improvements have increased coverage from 48% to 61% by adding focused tests for:
- Citation building functionality
- License processing and validation
- Metadata field extraction
- OpenAlex integration
- Publication data parsing and validation

Areas that could benefit from additional testing:
- More edge cases in the MetadataProcessor class workflow
- Additional CitationBuilder scenarios with diverse inputs
- Complex network interactions and error handling

### Test Structure

The test suite is organized into six main files:

1. **test_doi2dataset.py**: Basic tests for core functions like phase checking, name splitting and DOI validation.
2. **test_fetch_doi_mock.py**: Tests API interactions using a mock OpenAlex response stored in `srep45389.json`.
3. **test_citation_builder.py**: Tests for building citation metadata from API responses.
4. **test_metadata_processor.py**: Tests for the metadata processing workflow.
5. **test_license_processor.py**: Tests for license processing and validation.
6. **test_publication_utils.py**: Tests for publication year extraction and date handling.
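
The mock-based approach described for `test_fetch_doi_mock.py` can be sketched roughly as follows. This is only an illustration under stated assumptions: `fetch_work`, its `http_get` parameter, and the inline `SAVED_RESPONSE` dictionary are hypothetical stand-ins, not the project's actual code or the contents of `srep45389.json`.

```python
from unittest.mock import Mock

# Made-up stand-in for a saved API response; a real test would
# load the JSON fixture from disk instead.
SAVED_RESPONSE = {"title": "Example title", "publication_year": 2000}


def fetch_work(doi, http_get):
    """Hypothetical fetcher: http_get is injected so tests can replace it."""
    response = http_get(f"https://api.openalex.org/works/https://doi.org/{doi}")
    response.raise_for_status()
    return response.json()


def test_fetch_work_uses_saved_response():
    # The fake response returns the saved payload and never touches the network.
    fake_response = Mock()
    fake_response.json.return_value = SAVED_RESPONSE
    fake_get = Mock(return_value=fake_response)

    work = fetch_work("10.1000/example", fake_get)

    assert work["title"] == "Example title"
    fake_get.assert_called_once()
    fake_response.raise_for_status.assert_called_once()
```

Injecting the HTTP callable is one possible design; patching the HTTP library with `unittest.mock.patch` would achieve the same isolation.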

### Test Categories

The test suite includes the following categories of tests:
The test suite covers the following categories of functionality:

#### Core Functionality Tests

- **DOI Validation and Processing**: Tests for DOI normalization, validation, and filename sanitization.
- **Phase Management**: Tests for checking publication year against defined project phases.
- **Name Processing**: Tests for proper parsing and splitting of author names in different formats.
- **Email Validation**: Tests for proper validation of email addresses.
- **DOI Validation and Processing**: Parameterized tests for DOI normalization, validation, and filename sanitization with various inputs.
- **Phase Management**: Tests for checking publication year against defined project phases, including boundary cases.
- **Name Processing**: Extensive tests for parsing and splitting author names in different formats (with/without commas, middle initials, etc.).
- **Email Validation**: Tests for proper validation of email addresses with various domain configurations.
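
As a rough illustration of what table-driven DOI tests of this kind might look like, here is a minimal sketch. The helpers `normalize_doi` and `is_valid_doi`, the regular expression, and the sample inputs are all assumptions made for this example; the real doi2dataset functions may use different names and rules.

```python
import re

# Hypothetical helpers written only for this illustration.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")
DOI_PREFIXES = ("https://doi.org/", "http://doi.org/", "doi:")


def normalize_doi(raw: str) -> str:
    """Strip whitespace and a leading resolver URL or 'doi:' prefix."""
    doi = raw.strip()
    for prefix in DOI_PREFIXES:
        if doi.lower().startswith(prefix):
            return doi[len(prefix):]
    return doi


def is_valid_doi(raw: str) -> bool:
    """Return True if the normalized input looks like a DOI."""
    return DOI_PATTERN.match(normalize_doi(raw)) is not None


# (input, expected validity) pairs; each row is one test case.
DOI_CASES = [
    ("10.1038/srep45389", True),
    ("https://doi.org/10.1038/srep45389", True),
    ("doi:10.1038/srep45389", True),
    ("  10.1038/srep45389  ", True),
    ("10.1038/", False),           # missing suffix after the slash
    ("11.1038/srep45389", False),  # does not start with "10."
    ("not-a-doi", False),
    ("", False),
]


def test_doi_validation_cases():
    for raw, expected in DOI_CASES:
        assert is_valid_doi(raw) is expected, raw
```

With pytest, the same table could be passed to `@pytest.mark.parametrize("raw, expected", DOI_CASES)` so that each row is reported as a separate test.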

#### API Integration Tests

- **Mock API Responses**: Tests that use a saved OpenAlex API response (`srep45389.json`) to simulate API interactions without making actual network requests.
- **Data Fetching**: Tests for retrieving and parsing data from the OpenAlex API.
- **Abstract Extraction**: Tests for extracting and cleaning abstracts from OpenAlex's inverted index format.
- **Abstract Extraction**: Tests for extracting and cleaning abstracts from OpenAlex's inverted index format, including handling of empty or malformed abstracts.
- **Subject Mapping**: Tests for mapping OpenAlex topics to controlled vocabulary subject terms.
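
For readers unfamiliar with the inverted index format, the sketch below shows one way an abstract could be rebuilt from a word-to-positions mapping and tested for the empty case. The function name `abstract_from_inverted_index` and the sample data are hypothetical; the project's own extraction code may differ, for example in how it cleans the text or handles malformed entries.

```python
from typing import Dict, List, Optional


def abstract_from_inverted_index(index: Optional[Dict[str, List[int]]]) -> str:
    """Rebuild plain text from a mapping of word -> list of positions."""
    if not index:
        # Covers both a missing (None) and an empty index.
        return ""
    placed = []
    for word, positions in index.items():
        for position in positions:
            placed.append((position, word))
    # Sorting by position restores the original word order.
    placed.sort()
    return " ".join(word for _, word in placed)


def test_abstract_is_rebuilt_in_order():
    index = {"Deep": [0], "learning": [1, 3], "beats": [2]}
    assert abstract_from_inverted_index(index) == "Deep learning beats learning"


def test_abstract_missing_or_empty():
    assert abstract_from_inverted_index(None) == ""
    assert abstract_from_inverted_index({}) == ""
```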

#### Metadata Processing Tests

- **Citation Building**: Tests for properly building citation metadata from API responses.
- **License Processing**: Tests for correctly identifying and formatting license information.
- **License Processing**: Tests for correctly identifying and formatting license information from various license IDs.
- **Principal Investigator Matching**: Tests for finding project PIs based on ORCID identifiers.
- **Configuration Loading**: Tests for properly loading and validating configuration from files.
- **Metadata Workflow**: Tests for the complete metadata processing workflow.

These tests ensure that all components work correctly in isolation and together as a system.
These tests ensure that all components work correctly in isolation and together as a system, with special attention to edge cases and error handling.

## Contributing