Open Cybersecurity Schema Framework
The Open Cybersecurity Schema Framework (OCSF) is an open standard for cybersecurity event logging and data normalization. The framework is made up of a set of categories, event classes, data types, and an attribute dictionary. It is not restricted to cybersecurity or to events; however, the initial focus of the framework has been a schema for cybersecurity events.
This repository contains the core schema definitions that enable consistent representation of security events across different tools and platforms. The core schema for cybersecurity events is intended to be agnostic to implementations. OCSF is intended to be used by products and devices that produce log events, by analytic systems, and by logging systems that retain log events.
🚀 Quick Start
Explore the Schema: Visit schema.ocsf.io to browse the complete schema interactively.
Key Resources:
- Understanding OCSF - Comprehensive white paper
- Contributing Guide - How to contribute to the schema
- Changelog - Latest updates and changes
📁 Repository Structure
├── events/ # Event class definitions organized by category
├── objects/ # Reusable object definitions
├── profiles/ # Schema profiles for specific use cases
├── extensions/ # Schema extensions (Linux, Windows, etc.)
├── metaschema/ # Schema validation rules
├── templates/ # Template definitions
├── categories.json # Event category definitions
├── dictionary.json # Attribute dictionary
└── version.json # Current schema version
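To get a quick feel for the layout above, the following sketch counts the JSON definition files under each top-level directory and reads the schema version. It is an illustration, not a tool shipped with this repository: the function name summarize_repo is hypothetical, and the code assumes that version.json holds a JSON object with a "version" key.

```python
import json
from pathlib import Path


def summarize_repo(root: str) -> dict:
    """Count JSON files per top-level schema directory and read the version."""
    base = Path(root)
    summary = {}
    for folder in ("events", "objects", "profiles", "extensions"):
        folder_path = base / folder
        # rglob also walks nested folders, e.g. events/<category>/.
        if folder_path.is_dir():
            summary[folder] = sum(1 for _ in folder_path.rglob("*.json"))
        else:
            summary[folder] = 0

    # Assumption: version.json looks like {"version": "<major>.<minor>.<patch>"}.
    version_file = base / "version.json"
    if version_file.is_file():
        data = json.loads(version_file.read_text(encoding="utf-8"))
        summary["version"] = data.get("version")
    else:
        summary["version"] = None
    return summary


if __name__ == "__main__":
    # Run from the repository root to summarize the local checkout.
    print(summarize_repo("."))
```

Running the script from a local checkout prints one count per directory, which is a convenient sanity check after adding new event classes or objects.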
🎯 What is OCSF?
OCSF provides:
- Standardized Event Schema: Common structure for cybersecurity events
- Extensible Framework: Support for custom extensions and profiles
- Format Agnostic: Works with JSON, Parquet, Avro, and other formats
- Vendor Neutral: Open standard not tied to any specific vendor
The framework consists of:
- Categories: High-level groupings (Network, System, Application, etc.)
- Event Classes: Specific event types within categories
- Objects: Reusable data structures
- Attributes: Individual data fields with standardized definitions
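As a rough illustration of how these pieces fit together, the sketch below builds a single event as a Python dictionary. The numeric IDs are illustrative assumptions (an authentication-style class and a logon-style activity) and should be checked against the schema browser before use. The helper build_event is hypothetical, and the type_uid formula reflects our understanding of the OCSF convention (class_uid * 100 + activity_id).

```python
import json


def build_event(class_uid: int, activity_id: int, time_ms: int, **attributes) -> dict:
    """Assemble a minimal OCSF-style event (illustrative, not validated)."""
    event = {
        # Category: assumed here to be the thousands digit of the class ID.
        "category_uid": class_uid // 1000,
        # Event class within the category.
        "class_uid": class_uid,
        "activity_id": activity_id,
        # Assumed convention: type_uid = class_uid * 100 + activity_id.
        "type_uid": class_uid * 100 + activity_id,
        "time": time_ms,
    }
    # Remaining attributes and nested objects are passed through unchanged.
    event.update(attributes)
    return event


# Example values are made up for illustration.
example = build_event(
    3002,  # assumed: an authentication event class
    1,     # assumed: a logon activity
    1700000000000,
    severity_id=1,
    # "metadata" and "user" are objects: reusable structures of attributes.
    metadata={"version": "1.0.0", "product": {"name": "Example Product"}},
    user={"name": "alice"},
)

if __name__ == "__main__":
    print(json.dumps(example, indent=2))
```

Because the event is plain key-value data, the same structure can be serialized to JSON, Parquet, Avro, or another format without changing the schema.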
🔧 Usage
OCSF is designed for:
- Security Tools: SIEM, SOAR, EDR, and other security platforms
- Log Producers: Applications, devices, and systems generating security events
- Analytics Platforms: Tools processing and analyzing security data
- Data Pipelines: ETL processes normalizing security data
Automated PR Review
Pull requests that change schema definitions are automatically reviewed by two complementary workflows.
Static Anti-Pattern Check
A deterministic checker (check_antipatterns.py) runs on every PR and flags structural design issues in changed attributes. It runs without an API key — fast, free, and reproducible.
What it detects:
- Boolean Trap: is_*_known booleans that mask multi-state concepts
- Boolean Proliferation: objects accumulating 3+ is_* booleans that may belong in an enum
- Missing Sibling: _id enum attributes without a corresponding string sibling
- Tautological Description: descriptions that just restate the attribute name
- Enum Without Description: enum values that have captions but no descriptions
- Generic Naming: attributes named type, name, value with thin descriptions
- Type Inconsistency: same attribute name with different types across objects
- ID Without Enum: integer_t _id attributes with no enum values defined
- Duplicate Attribute: is_X / X pairs with the same type
- Duplicate Description: different attribute names sharing identical description text
- Learned Patterns: rules extracted from previous LLM reviews (stored in .github/config/learned_antipatterns.json)
Deprecated attributes are ignored across all checks.
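To make two of the checks listed above concrete, here is a minimal sketch of Boolean Proliferation and Missing Sibling. It is a simplified reimplementation for illustration, not the code in check_antipatterns.py. The input shape (a mapping of object name to attribute specs with "type", "enum", and "deprecated" keys) and both function names are assumptions.

```python
BOOLEAN_LIMIT = 3  # the "3+" threshold named in the check description


def boolean_proliferation(objects: dict) -> list:
    """Flag objects that carry BOOLEAN_LIMIT or more is_* boolean attributes."""
    findings = []
    for obj_name, attrs in objects.items():
        flags = sorted(
            name
            for name, spec in attrs.items()
            if name.startswith("is_")
            and spec.get("type") == "boolean_t"
            and not spec.get("deprecated")  # deprecated attributes are skipped
        )
        if len(flags) >= BOOLEAN_LIMIT:
            findings.append(f"{obj_name}: {len(flags)} is_* booleans ({', '.join(flags)})")
    return findings


def missing_sibling(objects: dict) -> list:
    """Flag *_id enum attributes that lack a sibling named without the suffix."""
    findings = []
    for obj_name, attrs in objects.items():
        for name, spec in attrs.items():
            if spec.get("deprecated") or not name.endswith("_id") or "enum" not in spec:
                continue
            sibling = name[: -len("_id")]
            if sibling not in attrs:
                findings.append(f"{obj_name}.{name}: no string sibling '{sibling}'")
    return findings


# Hypothetical sample input, not taken from the real schema.
SAMPLE_OBJECTS = {
    "device": {
        "is_managed": {"type": "boolean_t"},
        "is_trusted": {"type": "boolean_t"},
        "is_compliant": {"type": "boolean_t"},
        "type_id": {"type": "integer_t", "enum": {"1": {"caption": "Desktop"}}},
    },
    "file": {
        "is_system": {"type": "boolean_t"},
        "type_id": {"type": "integer_t", "enum": {"1": {"caption": "Regular"}}},
        "type": {"type": "string_t"},
    },
}

if __name__ == "__main__":
    for finding in boolean_proliferation(SAMPLE_OBJECTS) + missing_sibling(SAMPLE_OBJECTS):
        print(finding)
```

Both functions are pure and depend only on their input, which is what keeps this style of check deterministic and reproducible across runs.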
LLM Description Review
A Claude-powered reviewer (review_descriptions.py) evaluates the compiled (fully resolved) schema for description quality and semantic anti-patterns that static analysis cannot catch.
How it works:
- The description-review-compile workflow triggers on pull_request events, compiles the schema with ocsf-schema-compiler, and saves review context as an artifact
- The description-review-comment workflow triggers on successful compilation, downloads the artifact, calls Claude, and posts an advisory comment on the PR
What it evaluates:
- Description clarity, precision, and self-containedness
- Whether descriptions are specific enough for an LLM to correctly populate fields from raw telemetry
- Semantic anti-patterns (overlap between attributes, incorrect type choices, indirect attribution)
- CHANGELOG.md convention compliance
The LLM reviewer is context-aware — on re-reviews it acknowledges fixed suggestions and re-raises outstanding ones.
LLM-to-Static Learning Pipeline
When the LLM discovers a novel anti-pattern not covered by static rules, the finding is labeled in the PR comment and logged as JSON ready for addition to .github/config/learned_antipatterns.json. Once a human approves and merges the learned pattern, the static checker picks it up on all future PRs — turning a one-time LLM insight into a permanent, free, deterministic check.
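The sketch below shows one way a static checker could consume approved learned patterns. The real layout of learned_antipatterns.json is not documented here, so the entry format (an "id", a "name_pattern" regular expression, and a "message") is purely hypothetical, as are the function names and the sample rule.

```python
import json
import re


def load_learned_patterns(text: str) -> list:
    """Parse a JSON list of learned patterns into (id, compiled regex, message)."""
    patterns = []
    for entry in json.loads(text):
        patterns.append((entry["id"], re.compile(entry["name_pattern"]), entry["message"]))
    return patterns


def apply_patterns(patterns: list, attribute_names: list) -> list:
    """Return one finding per attribute name matched by a learned pattern."""
    findings = []
    for name in attribute_names:
        for pattern_id, regex, message in patterns:
            if regex.search(name):
                findings.append(f"{name}: [{pattern_id}] {message}")
    return findings


# Hypothetical file content; the real schema of this file may differ.
SAMPLE_CONFIG = """
[
  {
    "id": "LP001",
    "name_pattern": "_flag$",
    "message": "prefer an is_* boolean or an enum over a *_flag suffix"
  }
]
"""

if __name__ == "__main__":
    rules = load_learned_patterns(SAMPLE_CONFIG)
    for finding in apply_patterns(rules, ["retry_flag", "src_ip"]):
        print(finding)
```

Keeping learned rules as data rather than code means a human can review and merge a new pattern without touching the checker itself.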
🤝 Contributing
We welcome contributions! Please see our Contributing Guide for details on:
- How to propose schema changes
- Development workflow
- Community guidelines
📋 Versioning
OCSF follows semantic versioning. Check version.json for the current version.
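For consumers that want to compare schema versions, a minimal sketch follows. It assumes version.json contains a JSON object with a "version" key, and it applies the general semantic versioning convention that a major version bump signals breaking changes. It is not a statement of OCSF's own compatibility policy, and the helper names and version strings are made up.

```python
import json


def parse_version(text: str) -> tuple:
    """Split 'MAJOR.MINOR.PATCH' into integers, ignoring any -suffix or +build."""
    core = text.split("-", 1)[0].split("+", 1)[0]
    major, minor, patch = (int(part) for part in core.split("."))
    return major, minor, patch


def same_major(version_a: str, version_b: str) -> bool:
    """True when both versions share a major number (semver convention)."""
    return parse_version(version_a)[0] == parse_version(version_b)[0]


# Assumed file content for illustration.
SAMPLE_VERSION_JSON = '{"version": "1.3.0-dev"}'

if __name__ == "__main__":
    current = json.loads(SAMPLE_VERSION_JSON)["version"]
    print(parse_version(current))
    print(same_major(current, "1.1.0"))
```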
📄 License
Licensed under the Apache License 2.0.
Need Help?