Semantics

Filters

Data filtering conditions for metrics, allowing you to focus on specific subsets of data and create targeted analytics.

Overview

Filters are essential for creating focused analytics by restricting data to specific criteria. They can be applied at different stages of query execution and support various comparison operators and data types.

Diagram showing how filters work at different stages of query execution

Use Cases

Filters are used for:

  • Data Segmentation: Focus on specific user groups, time periods, or regions
  • Quality Control: Exclude invalid or test data from analysis
  • Performance Optimization: Reduce data volume for faster queries
  • Business Logic: Apply complex business rules and conditions
  • Security: Restrict data access based on user permissions

Syntax

Type: SemanticFilter

List of filters to be applied to the data. Defines data filtering conditions for WHERE or HAVING clauses.

{
    "filters": [
        {
            "name": "active_users",
            "description": "Filter for active users only",
            "query": "status",
            "table": "users",
            "operator": "equals",
            "value": "active",
            "value_type": "string",
            "filter_type": "where",
            "is_active": true
        },
        {
            "name": "revenue_range",
            "description": "Revenue between limits",
            "query": "revenue",
            "operator": "between",
            "min_value": 1000,
            "max_value": 10000,
            "value_type": "number",
            "filter_type": "having"
        },
        {
            "name": "segment_filter",
            "description": "Multiple segment filter",
            "query": "segment",
            "operator": "in",
            "values": ["premium", "enterprise"],
            "value_type": "string"
        }
    ]
}

Parameters

name

Type: str
Required: Yes

Unique identifier name for this filter. Used to reference the filter in queries and API calls.

{
    "name": "active_users"
}

description

Type: Optional[str]
Default: None
Required: No

Human-readable explanation of this filter. Helps developers understand the filtering logic.

{
    "description": "Only include users with active status"
}

query

Type: str
Required: Yes

Column name or SQL expression that defines this filter. Used in query generation to specify the data field for filtering.

{
    "query": "status"
}

table

Type: Optional[str]
Default: None
Required: No

Source table name for the filter. Specifies which table the filter applies to.

{
    "table": "users"
}

operator

Type: Optional[FilterOperator]
Default: None
Required: No

Comparison operator for the filter. Determines how the filter value is compared.

Available Values:

  • equals, not_equals
  • greater_than, greater_than_equals
  • less_than, less_than_equals
  • in, not_in, like, not_like
  • is_null, is_not_null
  • between, not_between
{
    "operator": "equals"
}

value

Type: Optional[Any]
Default: None
Required: No

Single comparison value for the filter. Used with operators like equals, greater_than, etc.

{
    "value": "active"
}

value_type

Type: FilterValueType
Default: FilterValueType.STRING
Required: No

Type of the filter value. Ensures proper data type handling in the filter.

Available Values:

  • string, number, boolean, date, timestamp, array, null
{
    "value_type": "string"
}

filter_type

Type: FilterType
Default: FilterType.WHERE
Required: No

Where to apply the filter. Determines whether filtering happens before or after aggregation.

Available Values:

  • where: Pre-aggregation filtering
  • having: Post-aggregation filtering
{
    "filter_type": "where"
}

is_active

Type: bool
Default: True
Required: No

Whether this filter is currently active. Allows you to enable/disable filters without removing them.

{
    "is_active": true
}

custom_expression

Type: Optional[str]
Default: None
Required: No

Custom SQL expression that overrides the standard filter logic. Allows for complex filtering conditions.

{
    "custom_expression": "status = 'active' AND created_at > '2024-01-01'"
}

conditional

Type: bool
Default: False
Required: No

Boolean flag indicating whether to use conditional logic (CASE WHEN statements) instead of the standard filter operators.

{
    "conditional": true
}

conditions

Type: Optional[Condition]
Default: None
Required: No

A Condition object defining complex filtering logic with CASE WHEN statements.

{
    "conditional": true,
    "conditions": {
        "when_clauses": [...],
        "else_return": "..."
    }
}

Conditional Logic

Filters support conditional logic for complex filtering scenarios that can't be expressed with simple operators.

Example: Filter with Date Extraction

Filter by weekday sessions only:

{
    "name": "weekday_only",
    "description": "Filter for weekday sessions",
    "query": "session_date",
    "table": "sessions",
    "conditional": true,
    "conditions": {
        "when_clauses": [
            {
                "field": {
                    "column": "session_date",
                    "table": "sessions",
                    "transforms": [
                        {
                            "function": "EXTRACT",
                            "params": {"part": "DOW"}
                        }
                    ]
                },
                "operator": "IN",
                "compare_values": [1, 2, 3, 4, 5],
                "then_return": 1
            }
        ],
        "else_return": 0
    },
    "filter_type": "where"
}

Choosing Between Filter Types

Use standard operators when:

  • Simple value comparisons
  • Straightforward filtering logic
  • Performance is critical

Use custom_expression when:

  • Complex SQL is required
  • Mix of AND/OR logic
  • Can't be expressed with available operators

Use conditional when:

  • Need data transformations before filtering
  • CASE WHEN logic is required
  • Want to leverage the transform pipeline

For complete details on building conditional statements, see the Conditional Logic Architecture.

values

Type: Optional[List[Any]]
Default: None
Required: No

Array of values for IN/NOT_IN operators. Used when filtering against multiple values.

{
    "values": ["premium", "enterprise"]
}

min_value

Type: Optional[Any]
Default: None
Required: No

Minimum value for BETWEEN operator. Used with range-based filtering.

{
    "min_value": 1000
}

max_value

Type: Optional[Any]
Default: None
Required: No

Maximum value for BETWEEN operator. Used with range-based filtering.

{
    "max_value": 10000
}

formatting

Type: Optional[List[OutputFormat]]
Default: None
Required: No

Array of OutputFormat objects for filter value transformation. Allows formatting of filter values.

{
    "formatting": [
        {
            "name": "date_format",
            "type": "format",
            "format_type": "datetime",
            "format_string": "YYYY-MM-DD"
        }
    ]
}

Patterns

Basic Equality Filters

{
    "filters": [
        {
            "name": "active_users",
            "query": "status",
            "operator": "equals",
            "value": "active",
            "value_type": "string"
        },
        {
            "name": "premium_users",
            "query": "subscription_tier",
            "operator": "equals",
            "value": "premium",
            "value_type": "string"
        }
    ]
}

Range Filters

{
    "filters": [
        {
            "name": "revenue_range",
            "query": "total_revenue",
            "operator": "between",
            "min_value": 1000,
            "max_value": 10000,
            "value_type": "number"
        },
        {
            "name": "date_range",
            "query": "created_at",
            "operator": "between",
            "min_value": "2024-01-01",
            "max_value": "2024-12-31",
            "value_type": "date"
        }
    ]
}

List Filters

{
    "filters": [
        {
            "name": "allowed_countries",
            "query": "country_code",
            "operator": "in",
            "values": ["US", "CA", "MX", "GB"],
            "value_type": "string"
        },
        {
            "name": "excluded_segments",
            "query": "user_segment",
            "operator": "not_in",
            "values": ["test", "internal"],
            "value_type": "string"
        }
    ]
}

Pattern Matching Filters

{
    "filters": [
        {
            "name": "email_domains",
            "query": "email",
            "operator": "like",
            "value": "%@company.com",
            "value_type": "string"
        },
        {
            "name": "exclude_test_emails",
            "query": "email",
            "operator": "not_like",
            "value": "%@test.%",
            "value_type": "string"
        }
    ]
}

Null Value Filters

{
    "filters": [
        {
            "name": "has_email",
            "query": "email",
            "operator": "is_not_null",
            "value_type": "string"
        },
        {
            "name": "exclude_incomplete",
            "query": "phone_number",
            "operator": "is_not_null",
            "value_type": "string"
        }
    ]
}

Complex Custom Filters

{
    "filters": [
        {
            "name": "high_value_customers",
            "query": "user_id",
            "custom_expression": "total_revenue > 10000 AND subscription_tier = 'premium'",
            "value_type": "string"
        },
        {
            "name": "recent_active_users",
            "query": "user_id",
            "custom_expression": "last_login_date > CURRENT_DATE - INTERVAL '30 days' AND status = 'active'",
            "value_type": "string"
        }
    ]
}

WHERE vs HAVING Filters

WHERE Filters (Pre-aggregation)

{
    "name": "active_orders",
    "query": "status",
    "operator": "equals",
    "value": "completed",
    "filter_type": "where"
}

Filters individual records before aggregation

HAVING Filters (Post-aggregation)

{
    "name": "high_revenue_segments",
    "query": "total_revenue",
    "operator": "greater_than",
    "value": 100000,
    "filter_type": "having"
}

Filters aggregated results after grouping

Dynamic Filters

Parameter-based Filters

{
    "filters": [
        {
            "name": "date_range_filter",
            "query": "created_at",
            "operator": "between",
            "min_value": "$CORTEX_start_date",
            "max_value": "$CORTEX_end_date",
            "value_type": "date"
        }
    ]
}

User-specific Filters

{
    "filters": [
        {
            "name": "user_region_filter",
            "query": "region",
            "operator": "equals",
            "value": "$CORTEX_user_region",
            "value_type": "string"
        }
    ]
}

Performance

Indexing Strategy

{
    "filters": [
        {
            "name": "status_filter",
            "query": "status",
            "operator": "equals",
            "value": "active"
        }
    ]
}
# Ensure index on status column

Filter Order

  • Apply most selective filters first
  • Use indexed columns when possible
  • Consider filter cardinality

Best Practices

  1. Use Appropriate Operators: Choose the right operator for your use case
  2. Optimize for Performance: Use indexed columns and selective filters
  3. Handle Data Types: Ensure proper value types for accurate filtering
  4. Test Edge Cases: Consider null values and edge cases
  5. Document Complex Logic: Use descriptions for complex custom expressions

Video Tutorial

Video: How to set up and configure filters in the Cortex dashboard

Filter Builder Interface

Screenshot of the visual filter builder in the Cortex dashboard

Advanced Examples

Multi-condition Filters

{
    "filters": [
        {
            "name": "complex_user_filter",
            "custom_expression": "status = 'active' AND (subscription_tier = 'premium' OR total_revenue > 5000) AND created_at > '2024-01-01'"
        }
    ]
}

Time-based Filters

{
    "filters": [
        {
            "name": "last_30_days",
            "query": "created_at",
            "operator": "greater_than",
            "value": "CURRENT_DATE - INTERVAL '30 days'",
            "value_type": "date"
        }
    ]
}