Validators API Reference¶

Validation classes and utilities.

XSD Validator¶

`XSDValidator` ¶

Bases: BaseValidator

XSD schema validator for C-CDA documents.

Validates C-CDA documents against the official HL7 C-CDA XSD schemas.

Usage

>>> # Use default schemas (auto-downloads if needed)
>>> validator = XSDValidator()
>>> result = validator.validate(document)
>>> if result.is_valid:
...     print("Document is valid!")
... else:
...     print("Validation errors:")
...     for error in result.errors:
...         print(f"  - {error}")

>>> # Or provide custom schema path
>>> validator = XSDValidator("/path/to/schemas/CDA.xsd")
>>> result = validator.validate(document)

Note

XSD schemas are automatically downloaded on first use if not present. Set auto_download=False to disable automatic downloads.

Attributes¶

`schema_location` `property` ¶

Get the schema file location.

Functions¶

`__init__(schema_path=None, auto_download=True, max_errors=100)` ¶

Initialize XSD validator with schema file.

Parameters:

Name	Type	Description	Default
`schema_path`	`Optional[Union[str, Path]]`	Path to the CDA.xsd schema file. If None, uses default location and auto-downloads if needed.	`None`
`auto_download`	`bool`	Automatically download schemas if missing. Default: True. Set to False to disable automatic downloads.	`True`
`max_errors`	`Optional[int]`	Maximum number of errors to extract and store (default: 100). Set to None for unlimited. Limiting errors reduces memory usage.	`100`

Raises:

Type	Description
`FileNotFoundError`	If schema file doesn't exist and auto_download=False
`XMLSchemaParseError`	If schema is invalid

Note

On first use, XSD schemas (~2MB) will be automatically downloaded from HL7's official repository. This may take a few moments.
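
The interaction between `schema_path`, `auto_download`, and `FileNotFoundError` can be sketched as follows. This is a minimal stand-in, assuming the XSD validator uses the same check, download, re-check order that the `SchematronValidator` source on this page shows. `ensure_schema` and `fake_download` are hypothetical names for illustration and are not part of ccdakit.

```python
from pathlib import Path
from tempfile import TemporaryDirectory
from typing import Callable


def ensure_schema(path: Path, auto_download: bool, download: Callable[[Path], None]) -> Path:
    """Return `path` if the schema exists, downloading it first when allowed."""
    # Only attempt a download when the file is missing and downloads are enabled
    if not path.exists() and auto_download:
        download(path)
    # Re-check: downloads may be disabled, or the download may have failed
    if not path.exists():
        raise FileNotFoundError(f"Schema file not found: {path}")
    return path


def fake_download(target: Path) -> None:
    """Hypothetical downloader: writes a placeholder instead of fetching from HL7."""
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text("<schema/>")


with TemporaryDirectory() as tmp:
    schema = Path(tmp) / "schemas" / "CDA.xsd"

    # With auto_download=False a missing schema is an immediate error
    try:
        ensure_schema(schema, auto_download=False, download=fake_download)
        missing_raised = False
    except FileNotFoundError:
        missing_raised = True

    # With auto_download=True the file is fetched before the existence check
    downloaded = ensure_schema(schema, auto_download=True, download=fake_download)
    downloaded_exists = downloaded.exists()
```

In an offline or locked-down environment, passing `auto_download=False` makes a missing schema fail fast instead of attempting network access.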

`validate(document)` ¶

Validate a C-CDA document against XSD schema.

Parameters:

Name	Type	Description	Default
`document`	`Union[_Element, str, bytes, Path]`	Document to validate. Accepts an `etree._Element` (parsed XML element), a `str` (XML string or file path), `bytes` (XML bytes), or a `Path` (path to an XML file).	required

Returns:

Type	Description
`ValidationResult`	ValidationResult with errors from schema validation

Raises:

Type	Description
`FileNotFoundError`	If file path doesn't exist
`XMLSyntaxError`	If document is not well-formed XML
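
Because a `str` argument may be either XML text or a file path, it helps to see how the two are told apart. The sketch below mirrors the heuristic in the `BaseValidator._parse_document` source shown later on this page, assuming `XSDValidator` inherits it unchanged. `classify_string_input` is a hypothetical helper written for this illustration.

```python
import tempfile
from pathlib import Path


def classify_string_input(document: str) -> str:
    """Classify a str argument as 'xml', 'file', or 'xml-fallback'."""
    # Leading whitespace is ignored when looking for the opening '<'
    if document.lstrip().startswith("<"):
        return "xml"
    # Anything else is tried as a file path first
    if Path(document).exists():
        return "file"
    # A path that does not exist falls back to XML parsing
    return "xml-fallback"


kind_xml = classify_string_input("  <ClinicalDocument/>")

with tempfile.NamedTemporaryFile(suffix=".xml") as handle:
    kind_file = classify_string_input(handle.name)

kind_missing = classify_string_input("no/such/file.xml")
```

One consequence of this heuristic: a misspelled path passed as a `str` is parsed as XML and surfaces as a syntax error, whereas the same path wrapped in `Path(...)` triggers the `FileNotFoundError` branch.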

`validate_file(file_path)` ¶

Convenience method to validate a file.

Parameters:

Name	Type	Description	Default
`file_path`	`Union[str, Path]`	Path to XML file	required

Returns:

Type	Description
`ValidationResult`	ValidationResult with errors from schema validation

Raises:

Type	Description
`FileNotFoundError`	If file doesn't exist

`validate_string(xml_string)` ¶

Convenience method to validate an XML string.

Parameters:

Name	Type	Description	Default
`xml_string`	`str`	XML document as string	required

Returns:

Type	Description
`ValidationResult`	ValidationResult with errors from schema validation

`validate_bytes(xml_bytes)` ¶

Convenience method to validate XML bytes.

Parameters:

Name	Type	Description	Default
`xml_bytes`	`bytes`	XML document as bytes	required

Returns:

Type	Description
`ValidationResult`	ValidationResult with errors from schema validation

Schematron Validator¶

`SchematronValidator` ¶

Bases: BaseValidator

Schematron validator for C-CDA documents.

Validates C-CDA documents using ISO Schematron rules for business logic, template conformance, and ONC certification requirements.

Usage

>>> # Use default HL7 C-CDA R2.1 Schematron (auto-cleaned version)
>>> validator = SchematronValidator()
>>> result = validator.validate(document)
>>> if result.is_valid:
...     print("Document passes Schematron validation!")
... else:
...     print("Validation errors:")
...     for error in result.errors:
...         print(f"  - {error}")

>>> # Use custom Schematron file
>>> validator = SchematronValidator("/path/to/custom.sch")
>>> result = validator.validate(document)

Note

Schematron validation requires both the .sch file and voc.xml vocabulary file. These are automatically downloaded and cleaned on first use.

The official HL7 Schematron file contains IDREF errors that prevent lxml from loading it. This validator automatically uses a cleaned version that fixes these errors while preserving all validation rules.
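
The `.sch` file refers to `voc.xml` by a relative name, so the two files need to sit in the same directory. The sketch below reproduces the path logic of the resolver in the class source on this page using only `pathlib`. `resolve_include` is a hypothetical function name, and the real resolver hands the resolved path back to lxml rather than returning it.

```python
from pathlib import Path
from tempfile import TemporaryDirectory
from typing import Optional


def resolve_include(schematron_path: Path, url: Optional[str]) -> Optional[Path]:
    """Map a relative reference such as 'voc.xml' to a file beside the .sch file."""
    # Absolute URLs are left to the parser's default handling
    if not url or url.startswith(("http://", "https://", "file://")):
        return None
    candidate = schematron_path.parent / url
    # Only claim the reference when the sibling file actually exists
    return candidate if candidate.exists() else None


with TemporaryDirectory() as tmp:
    sch = Path(tmp) / "custom.sch"
    sch.write_text("<schema/>")
    (Path(tmp) / "voc.xml").write_text("<systems/>")

    found = resolve_include(sch, "voc.xml")
    found_name = found.name if found else None
    missing = resolve_include(sch, "other.xml")
    remote = resolve_include(sch, "https://example.org/voc.xml")
```

When supplying a custom Schematron file, placing its vocabulary file in the same directory keeps such relative references resolvable.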

Source code in ccdakit/validators/schematron.py

class SchematronValidator(BaseValidator):
    """
    Schematron validator for C-CDA documents.

    Validates C-CDA documents using ISO Schematron rules for business logic,
    template conformance, and ONC certification requirements.

    Usage:
        >>> # Use default HL7 C-CDA R2.1 Schematron (auto-cleaned version)
        >>> validator = SchematronValidator()
        >>> result = validator.validate(document)
        >>> if result.is_valid:
        ...     print("Document passes Schematron validation!")
        ... else:
        ...     print("Validation errors:")
        ...     for error in result.errors:
        ...         print(f"  - {error}")

        >>> # Use custom Schematron file
        >>> validator = SchematronValidator("/path/to/custom.sch")
        >>> result = validator.validate(document)

    Note:
        Schematron validation requires both the .sch file and voc.xml vocabulary file.
        These are automatically downloaded and cleaned on first use.

        The official HL7 Schematron file contains IDREF errors that prevent lxml
        from loading it. This validator automatically uses a cleaned version that
        fixes these errors while preserving all validation rules.
    """

    # SVRL namespace for validation report
    SVRL_NS = "http://purl.oclc.org/dsdl/svrl"

    def __init__(
        self,
        schematron_path: Optional[Union[str, Path]] = None,
        phase: Optional[str] = None,
        auto_download: bool = True,
        max_errors: Optional[int] = 100,
    ):
        """
        Initialize Schematron validator.

        Args:
            schematron_path: Path to Schematron file (.sch).
                If None, uses default HL7 C-CDA R2.1 Schematron.
            phase: Schematron phase to use (e.g., "errors", "warnings").
                If None, validates all phases.
            auto_download: Automatically download Schematron files if missing.
                Default: True. Set to False to disable automatic downloads.
            max_errors: Maximum number of errors to extract and store (default: 100).
                Set to None for unlimited. Limiting errors reduces memory usage significantly.

        Raises:
            FileNotFoundError: If schematron file doesn't exist and auto_download=False
            etree.SchematronParseError: If schematron is invalid

        Note:
            On first use, Schematron files (~63MB) will be automatically downloaded
            from HL7's official GitHub repository. This may take a few moments.
        """
        self.schematron_path = self._resolve_schematron_path(schematron_path)
        self.phase = phase
        self.auto_download = auto_download
        self.max_errors = max_errors

        # Attempt auto-download if file doesn't exist
        if not self.schematron_path.exists() and self.auto_download:
            self._attempt_auto_download()

        # Check if file exists after download attempt
        if not self.schematron_path.exists():
            raise FileNotFoundError(
                f"Schematron file not found: {self.schematron_path}\n"
                "Expected file: schemas/schematron/HL7_CCDA_R2.1.sch\n\n"
                "Options:\n"
                "1. Allow automatic download (default): SchematronValidator(auto_download=True)\n"
                "2. Download manually from: https://github.com/HL7/CDA-ccda-2.1\n"
                "3. Provide your own file: SchematronValidator(schematron_path='/path/to/file.sch')"
            )

        self.schematron = self._load_schematron()

    def _resolve_schematron_path(self, path: Optional[Union[str, Path]]) -> Path:
        """
        Resolve Schematron file path.

        Uses the cleaned version of HL7 C-CDA R2.1 Schematron by default, as the
        original file contains IDREF errors that prevent lxml from loading it.

        Args:
            path: User-provided path or None for default

        Returns:
            Resolved Path object
        """
        if path is not None:
            return Path(path)

        # Default to cleaned HL7 C-CDA R2.1 Schematron in package
        # The cleaned version has IDREF errors fixed for lxml compatibility
        current_dir = Path(__file__).parent
        package_root = current_dir.parent.parent

        # Check common locations for cleaned version first
        cleaned_locations = [
            package_root / "schemas" / "schematron" / "HL7_CCDA_R2.1_cleaned.sch",
            Path("schemas") / "schematron" / "HL7_CCDA_R2.1_cleaned.sch",
            Path.cwd() / "schemas" / "schematron" / "HL7_CCDA_R2.1_cleaned.sch",
        ]

        for location in cleaned_locations:
            if location.exists():
                return location

        # Fall back to original (for backwards compatibility)
        original_locations = [
            package_root / "schemas" / "schematron" / "HL7_CCDA_R2.1.sch",
            Path("schemas") / "schematron" / "HL7_CCDA_R2.1.sch",
            Path.cwd() / "schemas" / "schematron" / "HL7_CCDA_R2.1.sch",
        ]

        for location in original_locations:
            if location.exists():
                return location

        # Return default expected location (cleaned version, may not exist yet)
        return package_root / "schemas" / "schematron" / "HL7_CCDA_R2.1_cleaned.sch"

    def _attempt_auto_download(self) -> None:
        """
        Attempt to automatically download Schematron files.

        This method tries to download the official HL7 C-CDA R2.1 Schematron
        files if they're not present. Downloads are only attempted once.
        """
        try:
            print("Schematron files not found. Attempting automatic download...")
            print("This is a one-time download (~63MB). Please wait...")

            downloader = SchematronDownloader()
            success, message = downloader.download_all(force=False)

            if success:
                print(message)
                print("✓ Schematron files ready for validation!")
            else:
                warnings.warn(
                    f"Automatic download failed:\n{message}\n"
                    "You can provide your own Schematron file or download manually.",
                    UserWarning,
                    stacklevel=2,
                )

        except Exception as e:
            warnings.warn(
                f"Automatic download failed: {e}\n"
                "You can provide your own Schematron file using: "
                "SchematronValidator(schematron_path='/path/to/file.sch')",
                UserWarning,
                stacklevel=2,
            )

    def _load_schematron(self) -> isoschematron.Schematron:
        """
        Load and compile Schematron rules.

        Returns:
            Compiled Schematron object

        Raises:
            etree.SchematronParseError: If schematron is invalid
        """
        try:
            # Create custom resolver for voc.xml and other includes
            class SchematronResolver(etree.Resolver):
                def __init__(self, base_path: Path):
                    self.base_path = base_path.parent
                    super().__init__()

                def resolve(self, url, id, context):
                    # Handle voc.xml and other relative references
                    if url and not url.startswith(("http://", "https://", "file://")):
                        resolved_path = self.base_path / url
                        if resolved_path.exists():
                            return self.resolve_filename(str(resolved_path), context)
                    return None

            # Parse Schematron document with custom resolver
            parser = etree.XMLParser()
            parser.resolvers.add(SchematronResolver(self.schematron_path))

            # Parse with file path to set base URL
            schematron_doc = etree.parse(str(self.schematron_path), parser)

            # Create Schematron validator
            # store_schematron=False to reduce memory usage (saves ~10-20MB per validator)
            # store_report=True needed to access validation_report for error extraction
            # For HL7 files, we need to be less strict about validation
            kwargs = {
                "store_schematron": False,
                "store_report": True,
                # Skip schema validation to be more permissive with HL7 files
                "validate_schema": False,
            }

            if self.phase is not None:
                kwargs["phase"] = self.phase

            return isoschematron.Schematron(schematron_doc, **kwargs)

        except etree.XMLSyntaxError as e:
            raise etree.SchematronParseError(
                f"Failed to parse Schematron file at {self.schematron_path}: {e}"
            ) from e
        except Exception as e:
            raise etree.SchematronParseError(
                f"Failed to load Schematron at {self.schematron_path}: {e}"
            ) from e

    def validate(self, document: Union[etree._Element, str, bytes, Path]) -> ValidationResult:
        """
        Validate a C-CDA document against Schematron rules.

        Args:
            document: Document to validate. Can be:
                - etree._Element: Parsed XML element
                - str: XML string or file path
                - bytes: XML bytes
                - Path: Path to XML file

        Returns:
            ValidationResult with Schematron validation findings

        Raises:
            FileNotFoundError: If file path doesn't exist
            etree.XMLSyntaxError: If document is not well-formed XML
        """
        result = ValidationResult()

        try:
            # Parse document
            doc_element = self._parse_document(document)

            # Run Schematron validation
            is_valid = self.schematron.validate(doc_element)

            if not is_valid:
                # Extract validation messages from SVRL report
                report = self.schematron.validation_report
                issues, has_more = self._extract_issues_from_report(report)

                # Categorize issues by level (schematron reports as failed-assert or successful-report)
                for issue in issues:
                    if issue.level == ValidationLevel.ERROR:
                        result.errors.append(issue)
                    elif issue.level == ValidationLevel.WARNING:
                        result.warnings.append(issue)
                    else:
                        result.infos.append(issue)

                # Add info message if more errors exist
                if has_more:
                    result.infos.append(
                        ValidationIssue(
                            level=ValidationLevel.INFO,
                            message=f"Additional validation errors exist beyond the limit of {self.max_errors}. "
                            f"Use SchematronValidator(max_errors=None) to see all errors.",
                            code="ERROR_LIMIT_REACHED",
                        )
                    )

        except etree.XMLSyntaxError as e:
            result.errors.append(
                ValidationIssue(
                    level=ValidationLevel.ERROR,
                    message=f"XML syntax error: {e}",
                    location=f"Line {e.lineno}" if hasattr(e, "lineno") else None,
                    code="XML_SYNTAX_ERROR",
                )
            )
        except FileNotFoundError as e:
            result.errors.append(
                ValidationIssue(
                    level=ValidationLevel.ERROR,
                    message=str(e),
                    code="FILE_NOT_FOUND",
                )
            )
        except Exception as e:
            result.errors.append(
                ValidationIssue(
                    level=ValidationLevel.ERROR,
                    message=f"Schematron validation error: {e}",
                    code="SCHEMATRON_ERROR",
                )
            )

        return result

    def _extract_issues_from_report(self, report: etree._Element) -> tuple[List[ValidationIssue], bool]:
        """
        Extract validation issues from SVRL report.

        Args:
            report: SVRL validation report element

        Returns:
            Tuple of (List of ValidationIssue objects, has_more flag indicating if limit was reached)
        """
        issues = []
        has_more = False

        # Extract failed assertions (errors)
        for element in report.findall(f".//{{{self.SVRL_NS}}}failed-assert"):
            # Check if we've reached the limit
            if self.max_errors is not None and len(issues) >= self.max_errors:
                has_more = True
                break

            issue = self._parse_failed_assert(element)
            if issue:
                issues.append(issue)

        # Extract successful reports (warnings/info) if we haven't hit the limit
        if not has_more:
            for element in report.findall(f".//{{{self.SVRL_NS}}}successful-report"):
                # Check if we've reached the limit
                if self.max_errors is not None and len(issues) >= self.max_errors:
                    has_more = True
                    break

                issue = self._parse_successful_report(element)
                if issue:
                    issues.append(issue)

        return issues, has_more

    def _parse_failed_assert(self, element: etree._Element) -> Optional[ValidationIssue]:
        """
        Parse failed-assert element from SVRL report.

        Args:
            element: failed-assert element

        Returns:
            ValidationIssue or None
        """
        # Extract message text
        text_elem = element.find(f"{{{self.SVRL_NS}}}text")
        if text_elem is None:
            return None

        message = self._extract_text_content(text_elem)
        if not message:
            return None

        # Extract location (XPath where assertion failed)
        location = element.get("location")

        # Extract rule ID (CONF ID or template ID)
        rule_id = element.get("id")

        # Build error code from rule ID
        code = f"SCHEMATRON_{rule_id}" if rule_id else "SCHEMATRON_ERROR"

        # Format full error message for parser
        full_message = f"ERROR at {location}: {message}" if location else f"ERROR: {message}"

        # Parse error for enhanced display
        parsed_error = SchematronErrorParser.parse_error(full_message)

        return ValidationIssue(
            level=ValidationLevel.ERROR,
            message=message,
            location=location,
            code=code,
            parsed_data=parsed_error.to_dict(),
        )

    def _parse_successful_report(self, element: etree._Element) -> Optional[ValidationIssue]:
        """
        Parse successful-report element from SVRL report.

        Successful reports are typically warnings or informational messages.

        Args:
            element: successful-report element

        Returns:
            ValidationIssue or None
        """
        # Extract message text
        text_elem = element.find(f"{{{self.SVRL_NS}}}text")
        if text_elem is None:
            return None

        message = self._extract_text_content(text_elem)
        if not message:
            return None

        # Extract location
        location = element.get("location")

        # Extract rule ID
        rule_id = element.get("id")

        # Determine level based on rule ID or message content
        # C-CDA Schematron typically uses role="warning" or role="info"
        role = element.get("role", "").lower()
        if "warning" in role or "warn" in message.lower():
            level = ValidationLevel.WARNING
        else:
            level = ValidationLevel.INFO

        code = f"SCHEMATRON_{rule_id}" if rule_id else "SCHEMATRON_INFO"

        # Format full message for parser
        severity_label = "WARNING" if level == ValidationLevel.WARNING else "INFO"
        full_message = (
            f"{severity_label} at {location}: {message}"
            if location
            else f"{severity_label}: {message}"
        )

        # Parse for enhanced display
        parsed_error = SchematronErrorParser.parse_error(full_message)

        return ValidationIssue(
            level=level,
            message=message,
            location=location,
            code=code,
            parsed_data=parsed_error.to_dict(),
        )

    def _extract_text_content(self, element: etree._Element) -> str:
        """
        Extract text content from element, handling nested elements.

        Args:
            element: Element containing text

        Returns:
            Concatenated text content
        """
        # Get all text including from nested elements
        text_parts = []

        # Get element's direct text
        if element.text:
            text_parts.append(element.text.strip())

        # Get text and tail from direct children only (deeper descendants are not visited)
        for child in element:
            if child.text:
                text_parts.append(child.text.strip())
            if child.tail:
                text_parts.append(child.tail.strip())

        # Join and clean up
        full_text = " ".join(text_parts)
        # Remove extra whitespace
        return " ".join(full_text.split())

    def validate_file(self, file_path: Union[str, Path]) -> ValidationResult:
        """
        Convenience method to validate a file.

        Args:
            file_path: Path to XML file

        Returns:
            ValidationResult with Schematron validation findings

        Raises:
            FileNotFoundError: If file doesn't exist
        """
        return self.validate(Path(file_path))

    def validate_string(self, xml_string: str) -> ValidationResult:
        """
        Convenience method to validate an XML string.

        Args:
            xml_string: XML document as string

        Returns:
            ValidationResult with Schematron validation findings
        """
        return self.validate(xml_string)

    def validate_bytes(self, xml_bytes: bytes) -> ValidationResult:
        """
        Convenience method to validate XML bytes.

        Args:
            xml_bytes: XML document as bytes

        Returns:
            ValidationResult with Schematron validation findings
        """
        return self.validate(xml_bytes)

    @property
    def schematron_location(self) -> Path:
        """Get the Schematron file location."""
        return self.schematron_path

    @property
    def validation_phase(self) -> Optional[str]:
        """Get the validation phase being used."""
        return self.phase

Attributes¶

`schematron_location` `property` ¶

Get the Schematron file location.

`validation_phase` `property` ¶

Get the validation phase being used.

Functions¶

`__init__(schematron_path=None, phase=None, auto_download=True, max_errors=100)` ¶

Initialize Schematron validator.

Parameters:

Name	Type	Description	Default
`schematron_path`	`Optional[Union[str, Path]]`	Path to Schematron file (.sch). If None, uses default HL7 C-CDA R2.1 Schematron.	`None`
`phase`	`Optional[str]`	Schematron phase to use (e.g., "errors", "warnings"). If None, validates all phases.	`None`
`auto_download`	`bool`	Automatically download Schematron files if missing. Default: True. Set to False to disable automatic downloads.	`True`
`max_errors`	`Optional[int]`	Maximum number of errors to extract and store (default: 100). Set to None for unlimited. Limiting errors reduces memory usage significantly.	`100`

Raises:

Type	Description
`FileNotFoundError`	If schematron file doesn't exist and auto_download=False
`SchematronParseError`	If schematron is invalid

Note

On first use, Schematron files (~63MB) will be automatically downloaded from HL7's official GitHub repository. This may take a few moments.
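
The `max_errors` cap from the parameter table can be illustrated in isolation. The sketch below follows the limit check used when issues are extracted in the class source on this page, but works on plain strings. `take_limited` is a hypothetical helper, not a ccdakit function.

```python
from typing import Iterable, List, Optional, Tuple


def take_limited(messages: Iterable[str], max_errors: Optional[int]) -> Tuple[List[str], bool]:
    """Collect up to `max_errors` messages; the flag reports whether more remained."""
    kept: List[str] = []
    for message in messages:
        # The limit is checked before each item, so the flag is set only
        # when at least one further item exists beyond the cap
        if max_errors is not None and len(kept) >= max_errors:
            return kept, True
        kept.append(message)
    return kept, False


findings = [f"assertion {n} failed" for n in range(250)]

capped, has_more = take_limited(findings, max_errors=100)
everything, more_unlimited = take_limited(findings, max_errors=None)
exact, more_exact = take_limited(findings[:100], max_errors=100)
```

With exactly as many findings as the limit, nothing is dropped and the flag stays `False`. The `ERROR_LIMIT_REACHED` info message is therefore added only when at least one finding was left out.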

`validate(document)` ¶

Validate a C-CDA document against Schematron rules.

Parameters:

Name	Type	Description	Default
`document`	`Union[_Element, str, bytes, Path]`	Document to validate. Accepts an `etree._Element` (parsed XML element), a `str` (XML string or file path), `bytes` (XML bytes), or a `Path` (path to an XML file).	required

Returns:

Type	Description
`ValidationResult`	ValidationResult with Schematron validation findings
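
The findings in the result come from an SVRL report, in which `failed-assert` elements become errors and `successful-report` elements become warnings or info messages. Below is a minimal sketch of that extraction using the standard library's `xml.etree.ElementTree` in place of lxml. The report fragment, rule IDs, and `extract_issues` are hypothetical, and the level rule is simplified to look only at the `role` attribute.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Optional

SVRL_NS = "http://purl.oclc.org/dsdl/svrl"

# Hypothetical SVRL fragment; real reports are produced by the Schematron engine
REPORT = f"""
<svrl:schematron-output xmlns:svrl="{SVRL_NS}">
  <svrl:failed-assert id="rule-1" location="/ClinicalDocument">
    <svrl:text>SHALL contain exactly one id.</svrl:text>
  </svrl:failed-assert>
  <svrl:successful-report id="rule-2" role="warning" location="/ClinicalDocument/title">
    <svrl:text>SHOULD contain a   title.</svrl:text>
  </svrl:successful-report>
</svrl:schematron-output>
"""


def extract_issues(report_xml: str) -> List[Dict[str, Optional[str]]]:
    """Turn SVRL findings into plain dicts, failed assertions first."""
    root = ET.fromstring(report_xml)
    issues: List[Dict[str, Optional[str]]] = []
    for tag, default_level in (("failed-assert", "error"), ("successful-report", "info")):
        for element in root.findall(f".//{{{SVRL_NS}}}{tag}"):
            text = element.find(f"{{{SVRL_NS}}}text")
            if text is None:
                continue
            # Join nested text and collapse runs of whitespace
            message = " ".join("".join(text.itertext()).split())
            if not message:
                continue
            level = default_level
            if tag == "successful-report" and "warning" in element.get("role", "").lower():
                level = "warning"
            rule_id = element.get("id")
            issues.append({
                "level": level,
                "message": message,
                "location": element.get("location"),
                "code": f"SCHEMATRON_{rule_id}" if rule_id else None,
            })
    return issues


issues = extract_issues(REPORT)
```

Each issue carries the XPath `location` reported by the engine, which is usually the quickest way to find the offending element in the document.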

Raises:

Type	Description
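`FileNotFoundError`	If file path doesn't exist. In the class source shown on this page, `validate` catches this exception and records it in `result.errors` with code `FILE_NOT_FOUND` rather than propagating it.
`XMLSyntaxError`	If document is not well-formed XML. In the class source shown on this page, this is likewise caught and recorded with code `XML_SYNTAX_ERROR`.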
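
In the class source on this page, `validate` wraps its work in `try`/`except` and converts failures into entries in `result.errors`, so callers can inspect the error `code` instead of catching exceptions. The sketch below shows that mapping with stand-ins: `Issue` is a hypothetical substitute for `ValidationIssue`, and the standard library's `ET.ParseError` stands in for lxml's `XMLSyntaxError`.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Optional


@dataclass
class Issue:
    """Hypothetical stand-in for ccdakit's ValidationIssue."""
    message: str
    code: str
    location: Optional[str] = None


def issue_from_exception(error: Exception) -> Issue:
    """Map an exception to an error issue, most specific type first."""
    if isinstance(error, ET.ParseError):
        line, _column = error.position
        return Issue(f"XML syntax error: {error}", "XML_SYNTAX_ERROR", f"Line {line}")
    if isinstance(error, FileNotFoundError):
        return Issue(str(error), "FILE_NOT_FOUND")
    # Anything else is reported under a generic code
    return Issue(f"Schematron validation error: {error}", "SCHEMATRON_ERROR")


try:
    ET.fromstring("<ClinicalDocument>")  # unclosed root element
except ET.ParseError as exc:
    syntax_issue = issue_from_exception(exc)

missing_issue = issue_from_exception(FileNotFoundError("File not found: missing.xml"))
other_issue = issue_from_exception(RuntimeError("unexpected failure"))
```

A caller can then branch on `issue.code` to distinguish an unreadable input from a genuine conformance failure.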

`validate_file(file_path)` ¶

Convenience method to validate a file.

Parameters:

Name	Type	Description	Default
`file_path`	`Union[str, Path]`	Path to XML file	required

Returns:

Type	Description
`ValidationResult`	ValidationResult with Schematron validation findings

Raises:

Type	Description
`FileNotFoundError`	If file doesn't exist

`validate_string(xml_string)` ¶

Convenience method to validate an XML string.

Parameters:

Name	Type	Description	Default
`xml_string`	`str`	XML document as string	required

Returns:

Type	Description
`ValidationResult`	ValidationResult with Schematron validation findings

`validate_bytes(xml_bytes)` ¶

Convenience method to validate XML bytes.

Parameters:

Name	Type	Description	Default
`xml_bytes`	`bytes`	XML document as bytes	required

Returns:

Type	Description
`ValidationResult`	ValidationResult with Schematron validation findings

Base Validator¶

`BaseValidator` ¶

Bases: ABC

Abstract base class for C-CDA validators.

All validators should inherit from this class and implement the validate method.
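
A custom validator follows the same pattern: subclass the abstract base and implement `validate`. The sketch below is self-contained and therefore defines its own stand-ins. The `BaseValidator` and `ValidationResult` classes here are simplified, hypothetical versions of the ccdakit classes, and `RootElementValidator` is an invented rule used only for illustration.

```python
import xml.etree.ElementTree as ET
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import List


@dataclass
class ValidationResult:
    """Hypothetical stand-in with only the fields this sketch needs."""
    errors: List[str] = field(default_factory=list)

    @property
    def is_valid(self) -> bool:
        return not self.errors


class BaseValidator(ABC):
    """Stand-in for the abstract base: subclasses must implement validate()."""

    @abstractmethod
    def validate(self, document: str) -> ValidationResult:
        ...


class RootElementValidator(BaseValidator):
    """Hypothetical rule: the root element must be ClinicalDocument."""

    def validate(self, document: str) -> ValidationResult:
        result = ValidationResult()
        root = ET.fromstring(document)
        # Strip a namespace prefix such as '{urn:hl7-org:v3}' before comparing
        local_name = root.tag.rsplit("}", 1)[-1]
        if local_name != "ClinicalDocument":
            result.errors.append(f"Unexpected root element: {local_name}")
        return result


ok = RootElementValidator().validate('<ClinicalDocument xmlns="urn:hl7-org:v3"/>')
bad = RootElementValidator().validate("<Patient/>")

# The abstract base itself cannot be instantiated
try:
    BaseValidator()
    abstract_blocked = False
except TypeError:
    abstract_blocked = True
```

Because `validate` is abstract, forgetting to implement it in a subclass fails at instantiation time rather than at the first validation call.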

Source code in ccdakit/validators/base.py

class BaseValidator(ABC):
    """
    Abstract base class for C-CDA validators.

    All validators should inherit from this class and implement
    the validate method.
    """

    @abstractmethod
    def validate(self, document: Union[etree._Element, str, bytes, Path]) -> ValidationResult:
        """
        Validate a C-CDA document.

        Args:
            document: Document to validate. Can be:
                - etree._Element: Parsed XML element
                - str: XML string or file path
                - bytes: XML bytes
                - Path: Path to XML file

        Returns:
            ValidationResult with errors, warnings, and info messages

        Raises:
            FileNotFoundError: If file path doesn't exist
            etree.XMLSyntaxError: If document is not well-formed XML
        """
        pass

    def _parse_document(self, document: Union[etree._Element, str, bytes, Path]) -> etree._Element:
        """
        Parse document into an lxml Element.

        Args:
            document: Document in various formats

        Returns:
            Parsed XML element

        Raises:
            FileNotFoundError: If file path doesn't exist
            etree.XMLSyntaxError: If document is not well-formed XML
        """
        if isinstance(document, etree._Element):
            return document

        if isinstance(document, Path):
            if not document.exists():
                raise FileNotFoundError(f"File not found: {document}")
            return etree.parse(str(document)).getroot()

        if isinstance(document, str):
            # Check if it looks like XML (starts with < or whitespace then <)
            stripped = document.lstrip()
            if stripped.startswith("<"):
                # Parse as XML string
                return etree.fromstring(document.encode("utf-8"))
            # Otherwise try as file path
            path = Path(document)
            if path.exists():
                return etree.parse(str(path)).getroot()
            # If not found, try parsing as XML anyway (might be malformed)
            return etree.fromstring(document.encode("utf-8"))

        if isinstance(document, bytes):
            return etree.fromstring(document)

        raise TypeError(
            f"Unsupported document type: {type(document)}. "
            "Expected etree._Element, str, bytes, or Path"
        )
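
The `str` branch of `_parse_document` is the least obvious one, so a minimal sketch of it may help. It assumes the standard library's `xml.etree.ElementTree` in place of lxml so that it runs without extra dependencies; `parse_str` is a hypothetical helper, not a ccdakit function.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def parse_str(document: str) -> ET.Element:
    """Stand-in for the str branch of _parse_document (stdlib parser, not lxml)."""
    if document.lstrip().startswith("<"):
        # Looks like XML: parse the string itself.
        return ET.fromstring(document.encode("utf-8"))
    path = Path(document)
    if path.exists():
        # Names an existing file: parse the file.
        return ET.parse(str(path)).getroot()
    # Not an existing path: parsed as XML anyway, which fails for a bare filename.
    return ET.fromstring(document.encode("utf-8"))


# Case 1: inline XML, with leading whitespace.
inline = parse_str("  <ClinicalDocument/>").tag

# Case 2: a string that names an existing file.
with tempfile.TemporaryDirectory() as tmp:
    xml_file = Path(tmp) / "doc.xml"
    xml_file.write_text("<ClinicalDocument/>", encoding="utf-8")
    from_file = parse_str(str(xml_file)).tag

# Case 3: a string that names a file that does not exist.
try:
    parse_str("no_such_file.xml")
    fallback_error = None
except ET.ParseError as exc:
    fallback_error = type(exc).__name__

print(inline, from_file, fallback_error)
```

With lxml, the third case would surface as `etree.XMLSyntaxError` rather than `FileNotFoundError`, because a string that does not name an existing file falls through to the XML parser. Passing a `Path` avoids the ambiguity.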

Functions¶

`validate(document)` `abstractmethod` ¶

Validate a C-CDA document.

Parameters:

Name	Type	Description	Default
`document`	`Union[_Element, str, bytes, Path]`	Document to validate. Can be an `etree._Element` (parsed XML element), a `str` (XML string or file path), `bytes` (XML bytes), or a `Path` (path to an XML file).	required

Returns:

Type	Description
`ValidationResult`	ValidationResult with errors, warnings, and info messages

Raises:

Type	Description
`FileNotFoundError`	If file path doesn't exist
`XMLSyntaxError`	If document is not well-formed XML

Source code in ccdakit/validators/base.py

@abstractmethod
def validate(self, document: Union[etree._Element, str, bytes, Path]) -> ValidationResult:
    """
    Validate a C-CDA document.

    Args:
        document: Document to validate. Can be:
            - etree._Element: Parsed XML element
            - str: XML string or file path
            - bytes: XML bytes
            - Path: Path to XML file

    Returns:
        ValidationResult with errors, warnings, and info messages

    Raises:
        FileNotFoundError: If file path doesn't exist
        etree.XMLSyntaxError: If document is not well-formed XML
    """
    pass

Validation Rules¶

`ValidationRule` ¶

Bases: ABC

Base class for custom validation rules.

Example

class MyCustomRule(ValidationRule):
    def __init__(self):
        super().__init__(
            name="my_custom_rule",
            description="Validates custom business logic"
        )

    def validate(self, document: etree._Element) -> List[ValidationIssue]:
        issues = []
        # Implement validation logic
        return issues

Source code in ccdakit/validators/rules.py

class ValidationRule(ABC):
    """
    Base class for custom validation rules.

    Example:
        class MyCustomRule(ValidationRule):
            def __init__(self):
                super().__init__(
                    name="my_custom_rule",
                    description="Validates custom business logic"
                )

            def validate(self, document: etree._Element) -> List[ValidationIssue]:
                issues = []
                # Implement validation logic
                return issues
    """

    def __init__(self, name: str, description: str):
        """
        Initialize validation rule.

        Args:
            name: Unique identifier for the rule
            description: Human-readable description of what the rule checks
        """
        self.name = name
        self.description = description

    @abstractmethod
    def validate(self, document: etree._Element) -> List[ValidationIssue]:
        """
        Apply rule to document.

        Args:
            document: Parsed C-CDA XML document element

        Returns:
            List of validation issues found (empty list if valid)
        """
        raise NotImplementedError(f"Rule '{self.name}' must implement validate()")

    def __repr__(self) -> str:
        """String representation of rule."""
        return f"<ValidationRule: {self.name}>"

Functions¶

`__init__(name, description)` ¶

Initialize validation rule.

Parameters:

Name	Type	Description	Default
`name`	`str`	Unique identifier for the rule	required
`description`	`str`	Human-readable description of what the rule checks	required

Source code in ccdakit/validators/rules.py

def __init__(self, name: str, description: str):
    """
    Initialize validation rule.

    Args:
        name: Unique identifier for the rule
        description: Human-readable description of what the rule checks
    """
    self.name = name
    self.description = description

`validate(document)` `abstractmethod` ¶

Apply rule to document.

Parameters:

Name	Type	Description	Default
`document`	`_Element`	Parsed C-CDA XML document element	required

Returns:

Type	Description
`List[ValidationIssue]`	List of validation issues found (empty list if valid)

Source code in ccdakit/validators/rules.py

@abstractmethod
def validate(self, document: etree._Element) -> List[ValidationIssue]:
    """
    Apply rule to document.

    Args:
        document: Parsed C-CDA XML document element

    Returns:
        List of validation issues found (empty list if valid)
    """
    raise NotImplementedError(f"Rule '{self.name}' must implement validate()")

`__repr__()` ¶

String representation of rule.

Source code in ccdakit/validators/rules.py

def __repr__(self) -> str:
    """String representation of rule."""
    return f"<ValidationRule: {self.name}>"

Validation rule classes are available in:

- ccdakit.validators.rules - Base rules and composites
- ccdakit.validators.common_rules - Common reusable rules

See the Validation Guide for usage examples.
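
As a rough illustration of the subclassing pattern, the sketch below implements one concrete rule. To stay runnable without ccdakit or lxml, it uses assumptions throughout: `Rule` is a stand-in that mirrors the `ValidationRule` constructor and `validate` signature, `Issue` is a hypothetical substitute for `ValidationIssue` (whose real fields are not shown on this page), and parsing uses the standard library's `xml.etree.ElementTree`. The title check itself is an example business rule chosen for illustration.

```python
import xml.etree.ElementTree as ET
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List

HL7_NS = "urn:hl7-org:v3"  # XML namespace used by CDA documents


@dataclass
class Issue:
    """Hypothetical stand-in for ValidationIssue; the real fields may differ."""
    rule: str
    message: str


class Rule(ABC):
    """Stand-in mirroring ValidationRule's constructor and abstract validate()."""

    def __init__(self, name: str, description: str):
        self.name = name
        self.description = description

    @abstractmethod
    def validate(self, document: ET.Element) -> List[Issue]:
        ...


class TitleRequiredRule(Rule):
    """Example rule: the document root must contain a non-empty <title>."""

    def __init__(self):
        super().__init__(
            name="title_required",
            description="Document must have a non-empty title",
        )

    def validate(self, document: ET.Element) -> List[Issue]:
        title = document.find(f"{{{HL7_NS}}}title")
        if title is None or not (title.text or "").strip():
            return [Issue(self.name, "ClinicalDocument/title is missing or empty")]
        return []  # an empty list means the rule passed


rule = TitleRequiredRule()
good = ET.fromstring(
    f'<ClinicalDocument xmlns="{HL7_NS}"><title>Summary</title></ClinicalDocument>'
)
bad = ET.fromstring(f'<ClinicalDocument xmlns="{HL7_NS}"/>')

good_issues = rule.validate(good)
bad_issues = rule.validate(bad)
print(good_issues, bad_issues)
```

Returning an empty list for a passing document, rather than `None`, matches the contract described for `validate` and lets callers concatenate results from several rules.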

Schema Manager¶

`SchemaManager` ¶

Manager for C-CDA XSD schemas.

Helps with schema discovery, downloading, and path management.

Source code in ccdakit/validators/utils.py

class SchemaManager:
    """
    Manager for C-CDA XSD schemas.

    Helps with schema discovery, downloading, and path management.
    """

    def __init__(self, schema_dir: Optional[Path] = None):
        """
        Initialize schema manager.

        Args:
            schema_dir: Directory containing schemas. Defaults to project's schemas/ directory.
        """
        self.schema_dir = schema_dir or DEFAULT_SCHEMA_DIR
        self.schema_dir.mkdir(parents=True, exist_ok=True)

    def is_installed(self) -> bool:
        """
        Check if C-CDA schemas are installed.

        Returns:
            True if CDA.xsd exists in schema directory
        """
        return self.get_cda_schema_path().exists()

    def get_cda_schema_path(self) -> Path:
        """
        Get path to main CDA.xsd schema file.

        Returns:
            Path to CDA.xsd (may not exist)
        """
        return self.schema_dir / "CDA.xsd"

    def get_schema_info(self) -> dict:
        """
        Get information about installed schemas.

        Returns:
            Dictionary with schema installation status and paths
        """
        cda_path = self.get_cda_schema_path()
        return {
            "installed": cda_path.exists(),
            "schema_dir": str(self.schema_dir),
            "cda_schema": str(cda_path),
            "cda_exists": cda_path.exists(),
            "files": [f.name for f in self.schema_dir.iterdir() if f.is_file()],
        }

    def download_schemas(
        self,
        version: str = "R2.1",
        url: Optional[str] = None,
        force: bool = False,
    ) -> Tuple[bool, str]:
        """
        Download C-CDA schemas from HL7.

        Note: This is a helper function, but schemas may need to be
        downloaded manually from HL7's website due to licensing.

        Args:
            version: C-CDA version (R2.1 or R2.0)
            url: Custom download URL (overrides version)
            force: Force re-download even if schemas exist

        Returns:
            Tuple of (success: bool, message: str)

        Raises:
            ValueError: If version is not supported
        """
        if self.is_installed() and not force:
            return (
                True,
                f"Schemas already installed at {self.schema_dir}. Use force=True to re-download.",
            )

        if url is None:
            if version not in SCHEMA_URLS:
                raise ValueError(
                    f"Unsupported version: {version}. "
                    f"Supported versions: {list(SCHEMA_URLS.keys())}"
                )
            url = SCHEMA_URLS[version]

        try:
            # Download zip file
            zip_path = self.schema_dir / "schemas.zip"
            urlretrieve(url, zip_path)

            # Extract schemas
            with zipfile.ZipFile(zip_path, "r") as zip_ref:
                zip_ref.extractall(self.schema_dir)

            # Clean up zip file
            zip_path.unlink()

            return True, f"Schemas downloaded successfully to {self.schema_dir}"

        except Exception as e:
            return False, f"Failed to download schemas: {e}"

    def print_installation_instructions(self) -> None:
        """Print instructions for manually downloading schemas."""
        instructions = f"""
C-CDA XSD Schema Installation Instructions
==========================================

The C-CDA XSD schemas must be downloaded from HL7 due to licensing restrictions.

Method 1: Download from HL7 (Recommended)
------------------------------------------
1. Visit the HL7 C-CDA download page:
   - R2.1: https://www.hl7.org/implement/standards/product_brief.cfm?product_id=492
   - R2.0: https://www.hl7.org/implement/standards/product_brief.cfm?product_id=379

2. Download the schema package (e.g., "CCDA_R2.1_Schemas.zip")

3. Extract the following files to: {self.schema_dir}
   - CDA.xsd (main schema file)
   - POCD_MT000040_CCDA.xsd
   - datatypes.xsd
   - voc.xsd
   - NarrativeBlock.xsd
   - SDTC/ directory (if available)

Method 2: Use Schema Manager (Automated)
-----------------------------------------
>>> from ccdakit.validators.utils import SchemaManager
>>> manager = SchemaManager()
>>> success, message = manager.download_schemas(version="R2.1")
>>> print(message)

Note: Automated download may not work due to HL7's licensing requirements.
      Manual download is recommended.

Verification
------------
After installation, verify schemas are available:

>>> manager = SchemaManager()
>>> info = manager.get_schema_info()
>>> print(info)

The 'cda_exists' field should be True.
"""
        print(instructions)

Functions¶

`__init__(schema_dir=None)` ¶

Initialize schema manager.

Parameters:

Name	Type	Description	Default
`schema_dir`	`Optional[Path]`	Directory containing schemas. Defaults to project's schemas/ directory.	`None`

Source code in ccdakit/validators/utils.py

def __init__(self, schema_dir: Optional[Path] = None):
    """
    Initialize schema manager.

    Args:
        schema_dir: Directory containing schemas. Defaults to project's schemas/ directory.
    """
    self.schema_dir = schema_dir or DEFAULT_SCHEMA_DIR
    self.schema_dir.mkdir(parents=True, exist_ok=True)

`is_installed()` ¶

Check if C-CDA schemas are installed.

Returns:

Type	Description
`bool`	True if CDA.xsd exists in schema directory

Source code in ccdakit/validators/utils.py

def is_installed(self) -> bool:
    """
    Check if C-CDA schemas are installed.

    Returns:
        True if CDA.xsd exists in schema directory
    """
    return self.get_cda_schema_path().exists()

`get_cda_schema_path()` ¶

Get path to main CDA.xsd schema file.

Returns:

Type	Description
`Path`	Path to CDA.xsd (may not exist)

Source code in ccdakit/validators/utils.py

def get_cda_schema_path(self) -> Path:
    """
    Get path to main CDA.xsd schema file.

    Returns:
        Path to CDA.xsd (may not exist)
    """
    return self.schema_dir / "CDA.xsd"

`get_schema_info()` ¶

Get information about installed schemas.

Returns:

Type	Description
`dict`	Dictionary with schema installation status and paths

Source code in ccdakit/validators/utils.py

def get_schema_info(self) -> dict:
    """
    Get information about installed schemas.

    Returns:
        Dictionary with schema installation status and paths
    """
    cda_path = self.get_cda_schema_path()
    return {
        "installed": cda_path.exists(),
        "schema_dir": str(self.schema_dir),
        "cda_schema": str(cda_path),
        "cda_exists": cda_path.exists(),
        "files": [f.name for f in self.schema_dir.iterdir() if f.is_file()],
    }
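
To show what the returned dictionary looks like before and after installation, here is a minimal sketch. `schema_info` is a hypothetical free function that copies the body of `get_schema_info` so the example runs without ccdakit; the temporary directory and file contents are placeholders.

```python
import tempfile
from pathlib import Path


def schema_info(schema_dir: Path) -> dict:
    """Stand-in that builds the same dictionary as get_schema_info."""
    cda_path = schema_dir / "CDA.xsd"
    return {
        "installed": cda_path.exists(),
        "schema_dir": str(schema_dir),
        "cda_schema": str(cda_path),
        "cda_exists": cda_path.exists(),
        "files": [f.name for f in schema_dir.iterdir() if f.is_file()],
    }


with tempfile.TemporaryDirectory() as tmp:
    schema_dir = Path(tmp)

    before = schema_info(schema_dir)  # empty directory: nothing installed

    # Simulate an installation: one schema file plus a subdirectory.
    (schema_dir / "CDA.xsd").write_text("<schema/>", encoding="utf-8")
    (schema_dir / "SDTC").mkdir()

    after = schema_info(schema_dir)

print(before["installed"], after["installed"], after["files"])
```

Note that `files` lists only regular files directly inside the schema directory, so subdirectories such as `SDTC/` do not appear in it.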

`download_schemas(version='R2.1', url=None, force=False)` ¶

Download C-CDA schemas from HL7.

Note: This is a helper function, but schemas may need to be downloaded manually from HL7's website due to licensing.

Parameters:

Name	Type	Description	Default
`version`	`str`	C-CDA version (R2.1 or R2.0)	`'R2.1'`
`url`	`Optional[str]`	Custom download URL (overrides version)	`None`
`force`	`bool`	Force re-download even if schemas exist	`False`

Returns:

Type	Description
`Tuple[bool, str]`	Tuple of (success: bool, message: str)

Raises:

Type	Description
`ValueError`	If version is not supported

Source code in ccdakit/validators/utils.py

def download_schemas(
    self,
    version: str = "R2.1",
    url: Optional[str] = None,
    force: bool = False,
) -> Tuple[bool, str]:
    """
    Download C-CDA schemas from HL7.

    Note: This is a helper function, but schemas may need to be
    downloaded manually from HL7's website due to licensing.

    Args:
        version: C-CDA version (R2.1 or R2.0)
        url: Custom download URL (overrides version)
        force: Force re-download even if schemas exist

    Returns:
        Tuple of (success: bool, message: str)

    Raises:
        ValueError: If version is not supported
    """
    if self.is_installed() and not force:
        return (
            True,
            f"Schemas already installed at {self.schema_dir}. Use force=True to re-download.",
        )

    if url is None:
        if version not in SCHEMA_URLS:
            raise ValueError(
                f"Unsupported version: {version}. "
                f"Supported versions: {list(SCHEMA_URLS.keys())}"
            )
        url = SCHEMA_URLS[version]

    try:
        # Download zip file
        zip_path = self.schema_dir / "schemas.zip"
        urlretrieve(url, zip_path)

        # Extract schemas
        with zipfile.ZipFile(zip_path, "r") as zip_ref:
            zip_ref.extractall(self.schema_dir)

        # Clean up zip file
        zip_path.unlink()

        return True, f"Schemas downloaded successfully to {self.schema_dir}"

    except Exception as e:
        return False, f"Failed to download schemas: {e}"
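
Download problems are reported through the returned tuple rather than raised; only an unsupported `version` raises `ValueError`. The sketch below replays the download, extract, and clean-up steps against a small local archive addressed through a `file://` URL, so it runs offline. `fetch_and_extract` is a hypothetical stand-in for the `try` block above, and the archive contents are made up for illustration.

```python
import tempfile
import zipfile
from pathlib import Path
from typing import Tuple
from urllib.request import urlretrieve


def fetch_and_extract(url: str, schema_dir: Path) -> Tuple[bool, str]:
    """Stand-in for the download, extract, and clean-up steps of download_schemas."""
    zip_path = schema_dir / "schemas.zip"
    try:
        urlretrieve(url, zip_path)                   # download the archive
        with zipfile.ZipFile(zip_path, "r") as zip_ref:
            zip_ref.extractall(schema_dir)           # unpack next to it
        zip_path.unlink()                            # remove the archive
        return True, f"Schemas downloaded successfully to {schema_dir}"
    except Exception as e:
        return False, f"Failed to download schemas: {e}"


with tempfile.TemporaryDirectory() as tmp:
    tmp_path = Path(tmp)

    # A tiny local archive plays the role of the schema package.
    archive = tmp_path / "package.zip"
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("CDA.xsd", "<schema/>")

    schema_dir = tmp_path / "schemas"
    schema_dir.mkdir()

    ok, message = fetch_and_extract(archive.as_uri(), schema_dir)
    extracted = (schema_dir / "CDA.xsd").exists()
    zip_removed = not (schema_dir / "schemas.zip").exists()

    # A URL that cannot be fetched yields (False, message) instead of an exception.
    bad_ok, bad_message = fetch_and_extract(
        (tmp_path / "missing.zip").as_uri(), schema_dir
    )

print(ok, extracted, zip_removed, bad_ok)
```

Callers should therefore check the first element of the tuple instead of relying on exception handling alone.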

`print_installation_instructions()` ¶

Print instructions for manually downloading schemas.

Source code in ccdakit/validators/utils.py

def print_installation_instructions(self) -> None:
    """Print instructions for manually downloading schemas."""
    instructions = f"""
C-CDA XSD Schema Installation Instructions
==========================================

The C-CDA XSD schemas must be downloaded from HL7 due to licensing restrictions.

Method 1: Download from HL7 (Recommended)
------------------------------------------
1. Visit the HL7 C-CDA download page:
   - R2.1: https://www.hl7.org/implement/standards/product_brief.cfm?product_id=492
   - R2.0: https://www.hl7.org/implement/standards/product_brief.cfm?product_id=379

2. Download the schema package (e.g., "CCDA_R2.1_Schemas.zip")

3. Extract the following files to: {self.schema_dir}
   - CDA.xsd (main schema file)
   - POCD_MT000040_CCDA.xsd
   - datatypes.xsd
   - voc.xsd
   - NarrativeBlock.xsd
   - SDTC/ directory (if available)

Method 2: Use Schema Manager (Automated)
-----------------------------------------
>>> from ccdakit.validators.utils import SchemaManager
>>> manager = SchemaManager()
>>> success, message = manager.download_schemas(version="R2.1")
>>> print(message)

Note: Automated download may not work due to HL7's licensing requirements.
      Manual download is recommended.

Verification
------------
After installation, verify schemas are available:

>>> manager = SchemaManager()
>>> info = manager.get_schema_info()
>>> print(info)

The 'cda_exists' field should be True.
"""
    print(instructions)
