Tech debt: Improve documentation of Event model fields in Kafka parser models #7118

@leandrodamascena

Description

Enhance the Kafka parser models with field descriptions and examples using Pydantic's Field() functionality. This improvement will provide better documentation and metadata for Kafka event parsing, following the pattern established in PR #7100.

Motivation

Currently, the Kafka models lack detailed field documentation, making it harder for developers to:

  • Understand field purposes without referencing external AWS documentation
  • Generate rich API documentation with tools like Swagger/OpenAPI
  • Create realistic test data using model factories
  • Get helpful IntelliSense in IDEs

Proposed Changes

Use Field() to add description and examples parameters to every field of the models defined in the file below.

Files to modify:

  • aws_lambda_powertools/utilities/parser/models/kafka.py

Reference events:
Check the sample events in tests/events/ for realistic field values:

  • kafkaEvent.json
  • kafkaMskEvent.json
  • kafkaSelfManagedEvent.json

Implementation Requirements

  • ✅ Add detailed description for each field explaining its purpose and usage
  • ✅ Include practical examples showing realistic AWS Kafka values
  • ✅ Base descriptions on official AWS MSK and Kafka documentation
  • ✅ Maintain all existing functionality, types, and validation logic
  • ✅ Follow the pattern already established in the EventBridge, Kinesis, and ALB models

Example Implementation

from pydantic import BaseModel, Field


# Before
class KafkaRecordModel(BaseModel):
    topic: str
    partition: int
    offset: int
    timestamp: int
    timestamp_type: str
    key: str
    value: str

# After
class KafkaRecordModel(BaseModel):
    topic: str = Field(
        description="The Kafka topic name from which the record originated.",
        examples=[
            "my-topic",
            "user-events",
            "order-processing"
        ]
    )
    partition: int = Field(
        description="The partition number within the topic from which the record was consumed.",
        examples=[0, 1, 5]
    )
    offset: int = Field(
        description="The offset of the record within the partition.",
        examples=[123, 456789, 1000000]
    )
    timestamp: int = Field(
        description="The timestamp of the record in milliseconds since Unix epoch.",
        examples=[1640995200000, 1672531200000]
    )
    timestamp_type: str = Field(
        description="The type of timestamp (CreateTime or LogAppendTime).",
        examples=["CreateTime", "LogAppendTime"]
    )
    key: str = Field(
        description="The message key, base64-encoded.",
        examples=[
            "dXNlci0xMjM=",
            "b3JkZXItNDU2"
        ]
    )
    value: str = Field(
        description="The message value, base64-encoded.",
        examples=[
            "eyJtZXNzYWdlIjogIkhlbGxvIEthZmthIn0=",
            "eyJ1c2VyX2lkIjogMTIzLCAiYWN0aW9uIjogImxvZ2luIn0="
        ]
    )
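The key and value examples above are base64 strings, so it can help to confirm what they decode to before reusing them in descriptions or tests. The following is a minimal sketch using only the standard library; the variable names are illustrative and not part of the Powertools codebase.

```python
import base64
import json

# Example strings copied from the key and value fields shown above.
key_examples = ["dXNlci0xMjM=", "b3JkZXItNDU2"]
value_example = "eyJtZXNzYWdlIjogIkhlbGxvIEthZmthIn0="

# Keys decode to plain UTF-8 identifiers.
decoded_keys = [base64.b64decode(k).decode("utf-8") for k in key_examples]

# The value decodes to a small JSON document.
decoded_value = json.loads(base64.b64decode(value_example))

print(decoded_keys)   # ['user-123', 'order-456']
print(decoded_value)  # {'message': 'Hello Kafka'}
```

Picking examples that decode to readable payloads like these keeps the generated documentation easy to sanity-check.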

Benefits

For Developers

  • Better IntelliSense with field descriptions and example values
  • Self-documenting code without needing external AWS documentation
  • Faster development with immediate reference for acceptable values
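As a rough illustration of the "immediate reference" point, the metadata passed to Field() can also be read back at runtime. The sketch below assumes Pydantic v2 and uses a hypothetical, trimmed-down RecordSketch model rather than the real KafkaRecordModel; it prints each field's description and builds a sample instance from the first example of every field.

```python
from pydantic import BaseModel, Field


class RecordSketch(BaseModel):
    """Hypothetical stand-in with two of the fields from the example above."""

    topic: str = Field(
        description="The Kafka topic name from which the record originated.",
        examples=["my-topic", "user-events"],
    )
    partition: int = Field(
        description="The partition number within the topic from which the record was consumed.",
        examples=[0, 1, 5],
    )


# model_fields maps each field name to its FieldInfo (Pydantic v2).
for name, info in RecordSketch.model_fields.items():
    print(f"{name}: {info.description} (e.g. {info.examples[0]!r})")

# Build a sample instance from the first example of each field.
sample = RecordSketch(
    **{name: info.examples[0] for name, info in RecordSketch.model_fields.items()}
)
```

A similar loop could seed test fixtures, although a dedicated factory library would likely be a better fit for larger models.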

For Documentation Tools

  • Rich Swagger/OpenAPI docs via .model_json_schema()
  • Automated documentation generation with comprehensive metadata
  • Interactive documentation with practical examples
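To make the documentation-tooling benefit concrete, here is a minimal sketch of how description and examples surface in the JSON schema. It assumes Pydantic v2 and a hypothetical single-field RecordSketch model, not the actual Kafka model.

```python
from pydantic import BaseModel, Field


class RecordSketch(BaseModel):
    """Hypothetical stand-in carrying only the topic field."""

    topic: str = Field(
        description="The Kafka topic name from which the record originated.",
        examples=["my-topic", "user-events"],
    )


# model_json_schema() returns a plain dict that OpenAPI tooling can consume.
schema = RecordSketch.model_json_schema()
topic_schema = schema["properties"]["topic"]

print(topic_schema["description"])
print(topic_schema["examples"])
```

The description and examples keys sit next to the field's type in the schema, which is where documentation generators typically look for them.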

Getting Started

This is a great first issue for newcomers to Powertools for AWS! The task is straightforward and helps you get familiar with our codebase structure.

Need help?

We're here to support you! Feel free to:

  • Ask questions in the comments
  • Request guidance on implementation approach

Labels: good first issue, help wanted, parser, tech-debt

Project status: Backlog

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions