c77_mvc

PostgreSQL Materialized View Management Extension

← Back to Portfolio

c77_mvc PostgreSQL Extension

A PostgreSQL extension for materialized view management and table fitness analysis.

PostgreSQL 11+ MIT License Version 1.0

🔍 Overview

c77_mvc (Materialized View and table fitness utilities) provides database administrators and developers with tools to:

  1. Create optimized materialized views with synthetic keys and content hashing
  2. Manage materialized view health with automatic staleness detection and refresh
  3. Analyze table fitness for partitioning, indexing, and query optimization
  4. Monitor data quality with encoding issue detection and isolation

✨ Features

📊 Materialized View Management

  • Create materialized views with synthetic keys and proper partitioning
  • Track content hashes to detect data changes efficiently
  • Isolate records with encoding issues into separate views
  • Monitor materialized view health with configurable thresholds
  • Automatically refresh views based on staleness metrics
  • Estimate refresh times based on historical performance

🔧 Table Fitness Analysis

  • Analyze column characteristics for partitioning and indexing
  • Identify optimal column combinations for keys and partitioning
  • Evaluate data quality with comprehensive metrics
  • Calculate overall Data Quality Index (DQI)
  • Use statistical sampling for efficient analysis of large tables

📋 Requirements

🚀 Installation

Quick Install

If you have both extensions available in your PostgreSQL extensions directory:

CREATE EXTENSION c77_dbh;  -- Install dependency first
CREATE EXTENSION c77_mvc;

From Source

# Clone repository
git clone https://git.jctr3.com/trogers1884/c77_mvc.git
cd c77_mvc

# Copy files to PostgreSQL extension directory
export PGEXTDIR=$(pg_config --sharedir)/extension
sudo cp c77_mvc.control $PGEXTDIR/
sudo cp c77_mvc--1.0.sql $PGEXTDIR/

# Create extension in your database
psql -d your_database -c "CREATE EXTENSION c77_dbh;"
psql -d your_database -c "CREATE EXTENSION c77_mvc;"

💡 Basic Usage

Table Fitness Analysis

-- Analyze a table for fitness metrics
SELECT * FROM public.c77_mvc_analyze_table_fitness('schema_name', 'table_name');

Creating Optimized Materialized Views

-- Create an optimized materialized view
SELECT * FROM public.c77_mvc_create_optimized_matv(
  'source_schema',           -- Source schema
  'source_table',            -- Source table
  'target_schema',           -- Target schema
  'matc_target_view_name',   -- Target materialized view name (must start with matc_)
  ARRAY['customer_id'],      -- Partition columns
  ARRAY['last_updated'],     -- Order-by columns
  ARRAY['notes'],            -- Columns to exclude from hash calculation (optional)
  false                      -- Filter for latest records only (optional)
);

Managing Materialized View Health

-- Check materialized view health
SELECT * FROM public.c77_mvc_manage_matv_health(
  'schema_name',       -- Schema name
  'matc_view_name',    -- Materialized view name
  'quick',             -- Validation type: 'quick', 'daily', 'full'
  NULL                 -- Action: NULL, 'refresh', 'repair', 'reindex'
);

-- Check and refresh if needed
SELECT * FROM public.c77_mvc_manage_matv_health(
  'schema_name',
  'matc_view_name',
  'daily',
  'refresh'
);

🏗️ View Structure

When you create an optimized materialized view, the extension creates multiple objects:

Object Naming Pattern Purpose
View vtw_* Source view with content hash, synthetic key, and encoding status
Materialized View matc_* Materialized copy of the vtw_ view
View vm_* Clean data view (excludes encoding issues)
View vprob_* Problematic data view (only encoding issues)

🔍 Table Fitness Analysis Deep Dive

The table fitness analysis provides comprehensive insights into your table structure and data quality:

Key Metrics Analyzed

📊 Column Characteristics

  • Data type distribution
  • Null value percentages
  • Unique value counts
  • Average and maximum lengths

🔑 Key Analysis

  • Primary key effectiveness
  • Composite key recommendations
  • Partitioning column candidates
  • Index optimization suggestions

📈 Data Quality Index (DQI)

  • Overall data quality score
  • Completeness metrics
  • Consistency indicators
  • Encoding issue detection

Example Analysis Output

-- Sample fitness analysis result
SELECT 
    table_name,
    total_rows,
    data_quality_index,
    recommended_partition_columns,
    suggested_indexes,
    encoding_issues_count
FROM public.c77_mvc_analyze_table_fitness('public', 'large_customer_table');

🎯 Materialized View Optimization

Synthetic Key Generation

The extension automatically generates synthetic keys for materialized views to improve performance:

-- The generated view includes a synthetic key
-- Example structure of vtw_ view:
SELECT 
    md5(concat_ws('|', col1, col2, col3))::uuid AS synthetic_key,
    sha256(row_data) AS content_hash,
    CASE WHEN has_encoding_issues(row_data) THEN true ELSE false END AS has_encoding_issues,
    *
FROM source_table;

Content Hash Tracking

Content hashing enables efficient change detection:

💡 How Content Hashing Works

  • SHA-256 hashing of row content (excluding specified columns)
  • Change detection by comparing hash values
  • Incremental refreshes based on hash differences
  • Data integrity verification through hash validation

Health Monitoring

The extension provides comprehensive health monitoring for materialized views:

-- Monitor view staleness and performance
SELECT 
    view_name,
    last_refresh_time,
    refresh_duration,
    staleness_score,
    recommended_action
FROM public.c77_mvc_manage_matv_health('myschema', 'matc_customer_summary', 'daily', NULL);

🔧 Advanced Configuration

Custom Partitioning Strategies

The extension supports various partitioning strategies based on your data patterns:

-- Time-based partitioning
SELECT * FROM public.c77_mvc_create_optimized_matv(
    'sales', 'transactions', 'analytics', 'matc_monthly_sales',
    ARRAY['date_trunc(''month'', transaction_date)'],  -- Monthly partitions
    ARRAY['transaction_date DESC'],
    ARRAY['notes', 'metadata'],
    false
);

-- Hash-based partitioning for large datasets
SELECT * FROM public.c77_mvc_create_optimized_matv(
    'users', 'profiles', 'cache', 'matc_user_profiles',
    ARRAY['hashtext(user_id::text) % 10'],  -- 10 hash partitions
    ARRAY['last_active DESC'],
    ARRAY['temporary_data'],
    true
);

Data Quality Monitoring

Separate views are created to isolate and monitor data quality issues:

-- Check for encoding issues
SELECT COUNT(*) FROM vprob_customer_data;

-- View clean data only
SELECT * FROM vm_customer_data WHERE active = true;

-- Full dataset with quality indicators
SELECT 
    *,
    CASE WHEN has_encoding_issues THEN 'PROBLEM' ELSE 'CLEAN' END AS data_status
FROM matc_customer_data;

📊 Performance Optimization

Statistical Sampling

For large tables, the extension uses statistical sampling to provide efficient analysis:

📈 Sampling Strategy

  • Adaptive sampling based on table size
  • Stratified sampling for better representation
  • Confidence intervals for accuracy estimation
  • Sample size optimization to balance speed and accuracy

Refresh Optimization

-- Optimize refresh performance with parallel processing
-- The extension automatically determines optimal refresh strategies

-- Check refresh performance history
SELECT 
    refresh_date,
    duration_seconds,
    rows_affected,
    refresh_method
FROM matv_refresh_history 
WHERE view_name = 'matc_large_dataset'
ORDER BY refresh_date DESC
LIMIT 10;

🎯 Use Cases

📊 Analytics Dashboards

Create fast, up-to-date materialized views for dashboard queries with automatic refresh management

🔍 Data Quality Monitoring

Isolate and monitor data quality issues while maintaining clean datasets for production use

⚡ Performance Optimization

Analyze table fitness to optimize indexing, partitioning, and query performance

🔄 ETL Pipeline Support

Efficiently track data changes and optimize refresh cycles for ETL processes

📈 Reporting Systems

Generate optimized materialized views for complex reporting requirements

🗂️ Data Archival

Create efficient snapshots and historical views with content integrity verification

🛠️ Troubleshooting

Common Issues

🚨 Dependency Extension Missing

If you see errors about missing functions, ensure c77_dbh extension is installed first:

CREATE EXTENSION c77_dbh;
CREATE EXTENSION c77_mvc;

Performance Tuning

Intelligent materialized view management for PostgreSQL!

This extension combines advanced materialized view optimization with comprehensive table fitness analysis to help you build high-performance data systems.

View Git Repository →