Nova postagem

查找

Pergunta
· Mar. 2

download files

Hi,

How can I connect to AWS S3 and download a file?

I'm running : IRIS for Windows (x86-64) 2024.1 (Build 262U) Thu Mar 7 2024 15:55:25 EST

 

Thanks

1 Comment
Discussão (1)2
Entre ou crie uma conta para continuar
Artigo
· Mar. 1 22min de leitura

Multi-Layered Security Architecture for IRIS Deployments on AWS with InterSystems IAM

Introduction

In today's rapidly evolving threat landscape, organizations deploying mission-critical applications must implement robust security architectures that protect sensitive data while maintaining high availability and performance. This is especially crucial for enterprises utilizing advanced database management systems like InterSystems IRIS, which often powers applications handling highly sensitive healthcare, financial, or personal data.

This article details a comprehensive, multi-layered security architecture for deploying InterSystems IRIS clusters on AWS using Kubernetes (EKS) and InterSystems IAM. By implementing defense-in-depth principles, this architecture provides protection at every level—from the network perimeter to the application layer and data storage.

Why a Multi-Layered Approach Matters

Single-layer security strategies are increasingly inadequate against sophisticated attack vectors. When one security control fails, additional layers must be in place to prevent a complete compromise. Our architecture implements security controls at five critical layers:

  1. Perimeter Security: Using AWS WAF and CloudFront to filter malicious traffic before it reaches your services
  2. Network Security: Leveraging AWS VPC, Security Groups, and Kubernetes network policies
  3. API Security: Implementing InterSystems IAM with advanced security plugins
  4. Application Security: Hardening the Web Gateway with strict URI restrictions
  5. Database Security: Configuring IRIS Cluster with robust authentication and encryption

By the end of this article, you'll understand how to implement each security layer and how they work together to create a defense-in-depth strategy that protects your IRIS deployments against a wide range of threats while maintaining performance and scalability.
 


Architecture Overview

Our security architecture is built around the principle of defense-in-depth, with each layer providing complementary protection. Here's a high-level overview of the complete solution:


IRIS Cluster Deployment Structure

Our IRIS Cluster is deployed using the InterSystems Kubernetes Operator (IKO) with a carefully designed topology:

  1. Data Tier: Two IRIS instances in a mirrored configuration for high availability and data redundancy
  2. Application Tier: Two IRIS application servers that access data via ECP (Enterprise Cache Protocol)
  3. API Gateway: InterSystems IAM (based on Kong) for API management and security
  4. Web Gateway: Three Web Gateway instances (CSP+Nginx) for handling web requests
  5. Arbiter: One arbiter instance for the mirrored data tier

This architecture separates concerns and provides multiple layers of redundancy:

  • The data tier handles the database operations with synchronous mirroring
  • The application tier focuses on processing business logic
  • The IAM layer manages API security
  • The Web Gateway layer handles HTTP/HTTPS requests

Each component plays a specific role in the security stack:

  1. AWS WAF (Web Application Firewall): Filters malicious traffic using rule sets that protect against common web exploits, SQL injection, and cross-site scripting (XSS). It also implements URI whitelisting to restrict access to only legitimate application paths.
  2. AWS CloudFront: Acts as a Content Delivery Network (CDN) that caches static content, reducing the attack surface by handling requests at edge locations. It also provides an additional layer of DDoS protection.
  3. AWS ALB (Application Load Balancer): Configured as a Kubernetes Ingress controller, it performs TLS termination and routes traffic to the appropriate backend services based on URL paths.
  4. InterSystems IAM: Built on Kong, this API gateway enforces authentication, authorization, rate limiting, and request validation before traffic reaches the application.
  5. Web Gateway: The InterSystems Web Gateway with hardened configuration restricts access to specific URI paths and provides additional validation.
  6. IRIS Cluster: The IRIS database deployed in a Kubernetes cluster with secure configuration, TLS encryption, and role-based access controls.

This multi-layered approach ensures that even if one security control is bypassed, others remain in place to protect your applications and data.


Layer 1: Perimeter Security with AWS WAF and CloudFront

The first line of defense in our architecture is at the network perimeter, where we implement AWS WAF and CloudFront to filter malicious traffic before it reaches our services.

1.1 AWS WAF Implementation

AWS Web Application Firewall is configured with custom rule sets to protect against common web exploits and restrict access to authorized URI paths only. Here's how we've configured it:

# WAF Configuration in Ingress alb.ingress.kubernetes.io/wafv2-acl-arn: arn:aws:wafv2:region-1:ACCOUNT_ID:regional/webacl/app_uri_whitelisting/abcdef123456 

Our WAF rules include:

  • URI Path Whitelisting: Only allowing traffic to specified application paths such as /app/, /csp/broker/, /api/, and /csp/appdata
  • SQL Injection Protection: Blocking requests containing SQL injection patterns
  • XSS Protection: Filtering requests with cross-site scripting payloads
  • Rate-Based Rules: Automatically blocking IPs that exceed request thresholds
  • Geo-Restriction Rules: Limiting access to specific geographic regions when appropriate

By implementing these rules at the perimeter, we prevent a significant portion of malicious traffic from ever reaching our application infrastructure.
 

1.2 CloudFront Integration

AWS CloudFront works alongside WAF to provide additional security benefits:

  • Edge Caching: Static content is cached at edge locations, reducing the load on backend services and minimizing the attack surface
  • DDoS Protection: CloudFront's globally distributed infrastructure helps absorb DDoS attacks
  • TLS Enforcement: All connections are secured with TLS 1.2+ and modern cipher suites
  • Origin Access Identity: Ensures that S3 buckets hosting static content can only be accessed through CloudFront

CloudFront is configured to forward specific headers to the backend services, ensuring that security contexts are preserved throughout the request flow:

  • X-Forwarded-For
  • X-Real-IP

This configuration allows downstream services to identify the original client IP address for rate limiting and logging purposes, even as requests pass through multiple layers.


Layer 2: Network Security with AWS VPC and Security Groups

The second layer of our security architecture focuses on network-level controls implemented through AWS VPC, Security Groups, and Kubernetes network policies.

2.1 VPC Design for Isolation

Our IRIS deployment runs within a custom VPC with the following characteristics:

  • Private Subnets: All IRIS and IAM pods run in private subnets with no direct internet access
  • NAT Gateways: Outbound internet access is controlled through NAT gateways
  • Multiple Availability Zones: Resources are distributed across three AZs for high availability

This design ensures that backend services are never directly exposed to the internet, requiring all traffic to flow through the controlled ingress points.

2.2 Security Group Configuration

Security groups act as virtual firewalls controlling inbound and outbound traffic. Our implementation includes multiple security groups with tightly scoped rules:

# Security Groups referenced in Ingress alb.ingress.kubernetes.io/security-groups: sg-000000000, sg-0100000000, sg-012000000, sg-0130000000  These security groups implement:
  • Ingress Rules: Allowing traffic only on required ports (443 for HTTPS)
  • Source IP Restrictions: Limiting access to specific CIDR blocks for administrative interfaces
  • Egress Rules: Restricting outbound connections to only necessary destinations

This granular control ensures that even if a container is compromised, its ability to communicate with other resources is limited by the security group rules.

2.3 Kubernetes Network Policies

Within the EKS cluster, we implement Kubernetes Network Policies to control pod-to-pod communication:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-iam-webgateway
  namespace: app1
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/component: webgateway
  policyTypes:
    - Ingress
  ingress:
    - ports:
        - protocol: TCP
          port: 443

 

These policies ensure that:

  • IRIS pods only accept connections from authorized sources (Web Gateway, IAM)
  • IAM pods only accept connections from the Ingress controller
  • Web Gateway pods only accept connections from IAM

This multi-layered network security approach creates isolation boundaries that contain potential security breaches and limit lateral movement within the application environment.


Layer 3: API Security with InterSystems IAM

At the heart of our security architecture lies InterSystems IAM , a powerful API management solution built on Kong. This component provides critical security capabilities including authentication, authorization, rate limiting, and request validation.

3.1 InterSystems IAM Overview

InterSystems IAM serves as the API gateway for all requests to IRIS services, ensuring that only authorized and legitimate traffic reaches your application. In our implementation, IAM is deployed as a StatefulSet within the same Kubernetes cluster as the IRIS instances, allowing for seamless integration while maintaining isolation of concerns.

The IAM gateway is configured with TLS/SSL termination and is exposed only through secure endpoints. All communication between IAM and the IRIS Web Gateway is encrypted, ensuring data privacy in transit.

3.2 Advanced Rate Limiting Configuration

To protect against denial-of-service attacks and abusive API usage, we've implemented advanced rate limiting through IAM's rate-limiting-advanced plugin. This configuration uses Redis as a backend store to track request rates across distributed IAM instances.

 

{ "name": "rate-limiting-advanced", "config": { "identifier": "ip", "strategy": "redis", "window_type": "sliding", "limit": [2000, 3000], "window_size": [60, 60], "redis": { "host": "my-release-redis-master.default.svc.cluster.local", "port": 6379, "timeout": 2000, "keepalive_pool_size": 30 }, "error_message": "API rate limit exceeded" } }  

This configuration provides two tiers of rate limiting:

  • Tier 1: 2,000 requests per minute with a sliding window
  • Tier 2: 3,000 requests per minute with a sliding window

The sliding window approach provides more accurate rate limiting compared to fixed windows, preventing traffic spikes at window boundaries. When a client exceeds these limits, they receive a 429 status code with a custom error message.

3.3 Secure Session Management

For applications requiring user sessions, we've configured IAM's session plugin with secure settings to prevent session hijacking and maintain proper session lifecycle:

{ "name": "session", "config": { "secret": "REDACTED", "cookie_secure": true, "cookie_same_site": "Strict", "cookie_http_only": true, "idling_timeout": 900, "absolute_timeout": 86400, "rolling_timeout": 14400 } }

Key security features implemented include:

  • HTTP-only cookies: Prevents JavaScript access to session cookies, mitigating XSS attacks
  • Secure flag: Ensures cookies are only sent over HTTPS connections
  • Same-Site restriction: Prevents CSRF attacks by restricting cookie usage to same-site requests
  • Multiple timeout mechanisms:
    • Idling timeout (15 minutes): Expires sessions after inactivity
    • Rolling timeout (4 hours): Requires re-authentication periodically
    • Absolute timeout (24 hours): Maximum session lifetime regardless of activity

3.4 Request Validation for Input Sanitization

To protect against injection attacks and malformed requests, we've implemented strict request validation using IAM's request-validator plugin. This is particularly important for securing the CSP broker, which is a critical component that handles client-server communication in InterSystems applications.

{ "name": "request-validator", "config": { "version": "kong", "body_schema": [ { "RequestParam": { "type": "integer", "required": true, "between": [1, 10] } }, { "EventType": { "type": "string", "required": true, "match": "^[a-zA-Z0-9$_]{100}$" } }, { "SessionID": { "type": "string", "required": true, "match": "^00b0[a-zA-Z0-9]{40}$" } } ], "verbose_response": true, "allowed_content_types": ["application/x-www-form-urlencoded"] } } 

This configuration enforces strict validation rules:

  • Input fields must match exact data types and constraints
  • String inputs must match specific regular expression patterns
  • Only allowed content types are accepted

The CSP broker is particularly sensitive because it serves as a communication channel between client browsers and the IRIS server. By validating all requests at the IAM layer before they reach the broker, we create an additional security barrier that protects against malformed or malicious requests targeting this critical component. When a request fails validation, IAM returns a detailed error response that helps identify the validation issue without revealing sensitive information about your backend systems.

3.5 Trusted IPs Configuration

To further enhance security, IAM is configured to recognize trusted proxies and properly determine client IP addresses:

{ "trusted_ips": [ "10.0.0.0/24", "10.1.0.0/24", "10.0.3.0/24" ], "real_ip_header": "X-Forwarded-For", "real_ip_recursive": "on" }

This configuration ensures that:

  • Rate limiting correctly identifies client IPs even behind proxies
  • Security rules using IP identification work properly
  • Access logs record actual client IPs rather than proxy IPs

By implementing these advanced security features in InterSystems IAM, we've created a robust API security layer that complements the perimeter and network security measures while protecting the application and database layers from malicious or excessive traffic.


Layer 4: Application Security with Web Gateway Hardening

The fourth layer of our security architecture focuses on hardening the InterSystems Web Gateway, which serves as the interface between the IAM API gateway and the IRIS database.

4.1 Web Gateway Configuration in Kubernetes

The Web Gateway is deployed as part of the IrisCluster custom resource, with specific security-focused configuration:

webgateway:

  image: containers.intersystems.com/intersystems/webgateway-nginx:2023.3
  type: nginx
  replicas: 2
  applicationPaths:
    - /csp/app1
    - /csp/app2
    - /app3
    - /csp/app4
    - /app5
    - /csp/bin
  alternativeServers: LoadBalancing
  loginSecret:
    name: iris-webgateway-secret

This configuration restricts the Web Gateway to only serve specific application paths, limiting the attack surface by preventing access to unauthorized endpoints.

4.2 CSP.ini Security Hardening

The Web Gateway's CSP.ini configuration is hardened with several security measures:

[SYSTEM]
No_Activity_Timeout=480
System_Manager=127.0.0.1
Maximum_Logged_Request_Size=256K
MAX_CONNECTIONS=4096
Server_Response_Timeout=60
Queued_Request_Timeout=60
Default_Server=IRIS

[APP_PATH:/app]
Alternative_Servers=LoadBalancing
Alternative_Server_0=1~~~~~~server-compute-0
Response_Size_Notification=Chunked Transfer Encoding and Content Length
KeepAlive=No Action
GZIP_Compression=Enabled
GZIP_Exclude_File_Types=jpeg gif ico png gz zip mp3 mp4 tiff
GZIP_Minimum_File_Size=500

 

Key security features in this configuration include:

  1. Disabled System Manager: The System Manager interface is disabled except from localhost
  2. Manual Configuration Only: Auto-configuration is disabled to prevent unauthorized changes
  3. Path Restrictions: Each application path has specific security settings
  4. Authentication Enforcement: AutheEnabled=64 enforces authentication
  5. Session Timeout: 15-minute session timeout aligned with IAM settings
  6. Locked CSP Names: Prevents path traversal attacks by locking CSP names

4.3 Advanced Nginx Security Configuration

Our implementation uses a heavily hardened Nginx configuration for the Web Gateway, which provides several layers of defense:

# Define whitelist using map

map $request_uri $whitelist_uri {

    default 0;

    "~^/app/.*$" 1;

    "~^/app/.*\.(csp|css|ico|js|png|woff2|ttf|jpg|gif)$" 1;

    "~^/csp/broker/cspxmlhttp.js$" 1;

    "~^/csp/broker/cspbroker.js$" 1;

    "~^/csp/app/.*$" 1;

    "~^/csp/bin/Systems/Module.cxw.*$" 1;

}



# Block specific URIs globally
map $request_uri $block_uri {

    default 0;

    "~*%25login" 1;

    "~*%25CSP\.PasswordChange\.cls" 1;

    "~*%25ZEN\.SVGComponent\.svgPage" 1;

}



# Custom error pages
error_page 403 /403.html;



# URI Whitelisting enforcement
if ($whitelist_uri = 0) {

    return 403;

}



# Deny access to forbidden file types
location ~* \.(ppt|pptx)$ {

    deny all;

    return 403;

}



# Deny access to blocked URIs
if ($block_uri) {

    return 403;

}



# Comprehensive logging for security analysis
log_format security '$real_client_ip - $remote_user [$time_local] '

                   '"$request" $status $body_bytes_sent '

                   '"$http_referer" "$http_user_agent" '

                   '"$http_x_forwarded_for" "$request_body"';

This configuration implements several critical security controls:

  1. URI Whitelisting: Only explicitly allowed paths can be accessed
  2. Blocking Dangerous Paths: Automatically blocks access to dangerous endpoints
  3. Blocking Risky File Types: Prevents access to potentially dangerous file types
  4. Security Logging: Detailed logging of all requests for forensic analysis
  5. Client IP Extraction: Properly extracts real client IPs from X-Forwarded-For headers
  6. Custom Error Pages: Standardized error responses that don't leak system information

Additionally, we implement strong security headers and request limits:

# Security headers

add_header X-XSS-Protection "1; mode=block" always;

add_header X-Content-Type-Options "nosniff" always;

add_header X-Frame-Options "SAMEORIGIN" always;

add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;

# Buffer and request size limits

client_max_body_size 50M;

client_body_buffer_size 128k;

client_header_buffer_size 1k;

large_client_header_buffers 4 4k;

# SSL/TLS security

ssl_protocols TLSv1.2 TLSv1.3;

ssl_prefer_server_ciphers on;

ssl_ciphers 'ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384';

These settings protect against:

  • Cross-site scripting (XSS)
  • MIME type confusion attacks
  • Clickjacking
  • SSL downgrade attacks
  • Buffer overflow attempts
  • Large payload attacks

4.4 TLS Configuration

The Web Gateway is configured to use modern TLS settings, ensuring secure communication:

tls:     

  webgateway:
  secret:
    secretName: iris-tls-secret

Our TLS implementation ensures:

  • Only TLS 1.2+ protocols are allowed
  • Strong cipher suites with forward secrecy are enforced
  • Certificates are properly validated
  • Session management is secure

By implementing this extensive hardening of the Web Gateway, we create a robust security layer that protects the IRIS database from unauthorized access and common web application vulnerabilities.

Layer 5: Database Security in IRIS Clusters

The final layer of our security architecture focuses on securing the IRIS database itself, ensuring that even if all previous layers are compromised, the data remains protected.

5.1 IrisCluster Secure Configuration with InterSystems Kubernetes Operator (IKO)

The IRIS cluster is deployed using the IrisCluster custom resource definition provided by the InterSystems Kubernetes Operator (IKO), with security-focused configuration:

apiVersion: intersystems.com/v1alpha1
kind: IrisCluster
metadata:
  name: example-app
  namespace: example-namespace
spec:
  tls:     

    common:
      secret:
        secretName: iris-tls-secret
    mirror:
      secret:
        secretName: iris-tls-secret
    ecp:
      secret:
        secretName: iris-tls-secret
  topology:     

    data:
      image: containers.intersystems.com/intersystems/iris:2023.3
      preferredZones:
        - region-1a
        - region-1b
      mirrored: true
      podTemplate:
        spec:
          securityContext:
            runAsUser: 51773  # irisowner
            runAsGroup: 51773
            fsGroup: 51773
      irisDatabases:
        - name: appdata
          mirrored: true
          ecp: true
      irisNamespaces:
        - name: APP
          routines: appdata
          globals: appdata
    compute:
      image: containers.intersystems.com/intersystems/iris:2023.3
      replicas: 1
      compatibilityVersion: "2023.3.0"
      webgateway:
        image: containers.intersystems.com/intersystems/webgateway-nginx:2023.3
        replicas: 1
        type: nginx
        applicationPaths:
          - /csp/sys
          - /csp/bin
          - /api/app
          - /app
    iam:
      image: containers.intersystems.com/intersystems/iam:3.4
    arbiter:
      image: containers.intersystems.com/intersystems/arbiter:2023.3
      preferredZones:
        - region-1c

Our IKO deployment includes several critical security features:

  1. TLS Encryption: All communication between IRIS instances is encrypted using TLS
  2. Database Mirroring: High availability with synchronous mirroring ensures data integrity
  3. Non-Root Execution: IRIS runs as the non-privileged irisowner user
  4. ECP Security: Enterprise Cache Protocol connections are secured with TLS
  5. Zone Distribution: Components are distributed across availability zones for fault tolerance
  6. Resource Isolation: Clear separation between data and compute nodes
  7. IRIS Namespaces: Properly configured namespaces that map to secure databases
  8. Arbiter Node: Dedicated arbiter node in a separate availability zone

5.2 IRIS Database Security Settings

Within the IRIS database, best practices for security include implementing several key security settings:

  1. Delegated Authentication: Configure IRIS to use external authentication mechanisms for centralized identity management
  2. Audit Logging: Enable comprehensive auditing for security-relevant events like logins, configuration changes, and privilege escalation
  3. System Security: Apply system-wide security settings that align with industry standards

These practices ensure that authentication is managed centrally, all security-relevant activities are logged for forensic purposes, and the system adheres to secure configuration standards.

5.3 IRIS Resource-Based Security

IRIS provides a robust security framework based on resources and roles that allows for fine-grained access control. This framework can be used to implement the principle of least privilege, giving users and services only the permissions they need to perform their functions.

Resource-Based Security Model

The IRIS resource-based security model includes:

  1. Resources: Secure objects such as databases, services, applications, and system operations
  2. Permissions: Different levels of access to resources (Read, Write, Use)
  3. Roles: Collections of permissions on resources that can be assigned to users
  4. Users: Accounts that are assigned roles and can authenticate to the system

This model allows security administrators to create a granular security structure that restricts access based on job functions and needs. For example:

  • Database administrators might have full access to database resources but limited access to application resources
  • Application users might have access only to specific application functions
  • Service accounts for integrations might have narrow permissions tailored to their specific needs

InterSystems Documentation

The implementation of role-based security in IRIS is well-documented in the InterSystems official documentation:

By leveraging IRIS's built-in security framework, organizations can create a security model that follows the principle of least privilege, significantly reducing the risk of unauthorized access or privilege escalation.

5.4 Data Encryption

IRIS database files are encrypted at rest using AWS EBS encryption with customer-managed KMS keys:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: iris-ssd-storageclass
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp3
  encrypted: "true"
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true

The EKS cluster is configured to use encrypted EBS volumes for all persistent storage, ensuring that data at rest is protected with AES-256 encryption.

5.5 Backup and Disaster Recovery

To protect against data loss and ensure business continuity, our architecture implements:

  1. Journal Mirroring: IRIS journals are stored on separate volumes and mirrored
  2. Automated Backups: Regular backups to encrypted S3 buckets
  3. Cross-AZ Replication: Critical data is replicated to a secondary AWS AZ

This approach ensures that even in case of a catastrophic failure or security incident, data can be recovered with minimal loss.

Implementation Guide

To implement this multi-layered security architecture for your own IRIS deployments on AWS, follow these high-level steps:

Step 1: Set Up AWS Infrastructure

  1. Create a VPC with private and public subnets across multiple availability zones
  2. Set up NAT gateways for outbound connectivity from private subnets
  3. Create security groups with appropriate ingress and egress rules
  4. Deploy an EKS cluster in the private subnets

Step 2: Configure AWS Security Services

  1. Create an AWS WAF Web ACL with appropriate rule sets
  2. Set up CloudFront distribution with WAF association
  3. Configure AWS ALB for Kubernetes Ingress

Step 3: Deploy InterSystems IAM

  1. Create necessary Kubernetes secrets for certificates and credentials
  2. Deploy the IAM StatefulSet using the IrisCluster operator
  3. Configure IAM security plugins (rate limiting, session management, request validation)

Step 4: Deploy and Secure IRIS Cluster

  1. Create an IrisCluster custom resource with security configurations
  2. Configure TLS for all communication
  3. Deploy the Web Gateway with hardened configuration
  4. Set up database mirroring and ECP security

Step 5: Implement Monitoring and Logging

  1. Configure centralized logging with ElasticSearch
  2. Set up security monitoring with Datadog
  3. Implement alerting for security events
  4. Enable IRIS audit logging

Monitoring and Incident Response

A robust security architecture must include continuous monitoring and incident response capabilities. Our implementation includes:

6.1 Security Monitoring

The architecture includes comprehensive monitoring using Datadog and ElasticSearch:

  1. Real-time Log Analysis: All components send logs to a centralized ElasticSearch cluster
  2. Security Dashboards: Datadog dashboards visualize security metrics and anomalies
  3. Automated Alerting: Alerts are generated for suspicious activities or security violations

6.2 Incident Response

A defined incident response process ensures timely reaction to security events:

  1. Detection: Automated detection of security incidents through monitoring
  2. Classification: Incidents are classified by severity and type
  3. Containment: Procedures to contain incidents, including automated responses
  4. Eradication: Steps to eliminate the threat and restore security
  5. Recovery: Procedures for restoring normal operations
  6. Lessons Learned: Post-incident analysis to improve security posture

Performance Considerations

Implementing multiple security layers can impact performance. Our architecture addresses this through:

7.1 Caching Strategies

  1. CloudFront Caching: Static content is cached at edge locations
  2. API Gateway Caching: IAM implements response caching for appropriate endpoints
  3. Web Gateway Caching: CSP pages are cached when possible

7.2 Load Balancing

  1. Multi-AZ Deployment: Services are distributed across availability zones
  2. Horizontal Scaling: Components can scale horizontally based on load
  3. Affinity Settings: Pod anti-affinity ensures proper distribution

7.3 Performance Metrics

During our implementation, we observed the following performance impacts:

  1. Latency: Average request latency increased by only 20-30ms with all security layers
  2. Throughput: System can handle over 2,000 requests per second with all security measures
  3. Resource Usage: Additional security components increased CPU usage by approximately at 15%

These metrics demonstrate that a robust security architecture can be implemented without significant performance degradation.

Conclusion

The multi-layered security architecture described in this article provides comprehensive protection for InterSystems IRIS deployments on AWS. By implementing security controls at every layer—from the network perimeter to the database—we create a defense-in-depth strategy that significantly reduces the risk of successful attacks.

Key benefits of this approach include:

  1. Comprehensive Protection: Multiple layers provide protection against a wide range of threats
  2. Defense in Depth: If one security control fails, others remain in place
  3. Scalability: The architecture scales horizontally to handle increased load
  4. Manageability: Infrastructure as Code approach makes security controls reproducible and versionable
  5. Compliance: The architecture helps meet regulatory requirements for data protection

By leveraging AWS security services, InterSystems IAM, and secure IRIS configurations, organizations can build secure, high-performance applications while protecting sensitive data from evolving threats.

References

  1. InterSystems Documentation: IRIS Security Guide
  2. AWS Security Best Practices: AWS Security Pillar
  3. Kubernetes Security: EKS Best Practices Guide
  4. OWASP API Security: Top 10 API Security Risks
  5. InterSystems Container Registry: containers.intersystems.com
3 Comments
Discussão (3)1
Entre ou crie uma conta para continuar
Artigo
· Mar. 1 7min de leitura

IRIS Vector Search for Matching Companies and Climate Action

Hey, community! 👋

We are a team of Stanford students applying technology to make sense of climate action. AI excites us because we know we can quickly analyze huge amounts of text.

As we require more reports on sustainability, such as responsibility reports and financial statements, it can be challenging to cut through the noise of aspirations and get to the real action: what are companies doing

That’s why we built a tool to match companies with climate actions scraped from company sustainability reports.

In this post, we’ll show you how to implement this tool as a chatbot interfacing with an InterSystems IRIS vector database.

  • Step 1: Setup
  • Step 2: Create database table
  • Step 3: Embed content and populate database
  • Step 4: Use vector search with user input
  • Step 5: Find climate actions

Step 1: Setup

Make sure to download Docker and then follow the InterSystems community hackathon start guide. The guide walks you through creating a new database container and connecting to the locally hosted web database portal. Navigate to http://localhost:52773/csp/sys/UtilHome.csp and use the credentials username: demo, password: demo to login.

Step 2: Create database table

Now, for the fun part!

First, we use Python to create the database to store our text chunks. We use the following script:

import pandas as pd
import numpy as np
import iris
import time
import os

### Vector Database Setup
username = "demo"
password = "demo"
hostname = os.getenv("IRIS_HOSTNAME", "localhost")
port = "1972"
namespace = "USER"
CONNECTION_STRING = f"{hostname}:{port}/{namespace}"
print(CONNECTION_STRING)

### Connect to IRIS
conn = iris.connect(CONNECTION_STRING, username, password)
cursor = conn.cursor()

### Create new table. Replace BASLABS to your prefix.
embedTableName = "BASLABS.ClimateReportsEmbed"
tableDefinition = """
( 
source_url VARCHAR(1000),
page INT,
total_pages INT,
content VARCHAR(5000), 
content_embedding VECTOR(DOUBLE, 384)
)
"""

### Re-create if table already exists
if CREATE_TABLE:
    try:
        cursor.execute(f"DROP TABLE {embedTableName}")
    except:
        pass
    cursor.execute(f"CREATE TABLE {embedTableName} {tableDefinition}")
  • We use iris.connect to create an SQL conn instance in Python and connect to the database. We can then use the cursor object to perform create/read/update/delete operations within the database.
  • We define the table structure as a string in the variable tableDefinition. Notably, we include a content_embedding field of type VECTOR(DOUBLE, 384). You must make sure that the number matches the dimension of your text embeddings. In this project, we are using 384 to match the “all-MiniLM-L6-v2” embedding model from the SentenceTransformers package.
  • Finally, we use cursor.execute to create the table.

Step 3: Embed content and populate database

Next, we will populate the database.

To do so, create a file with a list of PDF links you want to scrape. You can also use our list of links that we used during the Stanford TreeHacks hackathon. Your file should look something like this:

<http://website1.xyz/pdf-content-upload1.pdf>
<http://website2.xyz/sustainability-report-2024.pdf>
...

Then, we use LangChain PyPDFLoader(url) to load and split the PDF content.

import pandas as pd
from langchain.docstore.document import Document
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_iris import IRISVector
from langchain_community.document_loaders import PyPDFLoader

# Database setup as above...

# Embedding model
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("all-MiniLM-L6-v2")

# Setup Text Splitter
text_splitter = RecursiveCharacterTextSplitter(
    separators=["\\n\\n", "\\n", ". ", "? ", "! "],
    chunk_size=1000,
    chunk_overlap=100,
    length_function=len,
    is_separator_regex=False,
)

# Utility functions
def get_df(docs, pdf_url):
    data = [
        {
            "page": doc.metadata["page"],
            "total_pages": doc.metadata["total_pages"],
            "content": doc.page_content,
        }
        for doc in docs
    ]
    doc_df = pd.DataFrame(data)
    doc_df["source_url"] = pdf_url
    doc_embeddings = model.encode(doc_df["content"], normalize_embeddings=True).tolist()
    doc_df["content_embedding"] = doc_embeddings
    return doc_df
    
def insert_df(doc_df):
    sql = f"""
    INSERT INTO {embedTableName}
    (source_url, page, total_pages, content, content_embedding) 
    VALUES (?, ?, ?, ?, ?, ?, TO_VECTOR(?))
    """
    data = [
        (
            row["source_url"],
            row["page"],
            row["total_pages"],
            row["content"],
            str(row["content_embedding"]),
        )
        for index, row in doc_df.iterrows()
    ]
    results = cursor.executemany(sql, data)

# Main loop
with open("pdf_links.txt") as f:
        links = f.read().split("\\n")
 
for url in links:   
	loader = PyPDFLoader(url)
	documents = loader.load()
	docs = text_splitter.split_documents(documents)
	print(f"Found {len(docs)} docs for url: {url}")
	df = get_df(docs, url)
	insert_df(df)
	
  • First, we set up an embedding model, which we import from sentence_transformers.
  • Next, we define how to split the text of the reports. We use LangChains recursive text splitter that tries to split on provided characters in order until the chunks are small enough.
  • We then create two utility functions: get_df and insert_df. We utilize pandas dataframes to package text content and its metadata for later storage effectively.
  • Finally, we loop over all links in the text file, loading the PDF using LangChain, splitting into smaller chunks, embedding their contents, and inserting them into the IRIS Vector Database.

Step 4: Use vector search with user input

Finally, we will create a chatbot wrapper to perform a vector search based on a company climate description. First, we package the database functionality into an IRIS_Database class:

import iris
import time
import os
import pandas as pd
from sqlalchemy import create_engine

from sentence_transformers import SentenceTransformer

class IRIS_Database:
    def __init__(self):
        ### Vector Database Setup
        username = "demo"
        password = "demo"
        hostname = os.getenv("IRIS_HOSTNAME", "localhost")
        port = "1972"
        namespace = "USER"
        CONNECTION_STRING = f"{hostname}:{port}/{namespace}"

        self.conn = iris.connect(CONNECTION_STRING, username, password)
        self.cursor = self.conn.cursor()

        # Tables
        self.report_table = "BASLABS.ClimateReportsEmbed"

        # Embeddings
        self.embed = SentenceTransformer("all-MiniLM-L6-v2")

    def get_report_section(self, search_query, n=5):
        if not self.conn:
            self.connect()
				# Embed query using the same embedding model as before
        search_vector = self.embed.encode(
            search_query, normalize_embeddings=True
        ).tolist()
        # SQL query to perform vector search
        sql = f"""
            SELECT TOP ? source_url, page, content
            FROM {self.report_table}
            ORDER BY VECTOR_DOT_PRODUCT(content_embedding, TO_VECTOR(?)) DESC
        """
        self.cursor.execute(sql, [n, str(search_vector)])
        # Format results
        results = self.cursor.fetchall()
        return [
            dict(
                source_url=row[0],
                page=str(row[1]),
                content=row[2],
            )
            for row in results
        ]
  • First, we set up the database as before in the __init__ method. Notice that we also load the embedding model to embed our search queries later on.
  • The get_report_section does all of the heavy lifting. First, it embeds the query vector using the same model used when populating the database. Then, we use an SQL query to perform the vector search. The magic lies in ORDER BY VECTOR_DOT_PRODUCT(content_embedding)since it is here, we find the cosine similarities between text embeddings and return the most similar chunks.
  • Finally, we format the results as a Python dictionary and return to the user.

Step 5: Find climate actions

As a final step, we can create a front-facing client application called vector search. Below, we include a basic chat interface without a language model. But feel free to extend it to your needs! See our DevPost for more details on setting up an agentic tool calling workflow using DAIN!

import os
import sys
import json
from IRIS_Database import IRIS_Database

def get_report_section(search_query: str) -> list[dict]:
    """
    Perform RAG search for report data.
    """
    db = BAS_Database()
    result = db.get_report_section(search_query)
    return result
    
# ask user for
while True:
	print("Please tell me a bit more about your company and your goals. (q to quit)")
	user_input = input("> ")
	if user_input.lower().startswith("q"):
		break
		
	results = get_report_section(user_input)
	print(f"Found {len(results)} matches!")
	for r in results:
		print(r)

Wrapping up

And that’s it! We hope this can inspire you to use an embedding search to match individuals, companies, and even nations with more ambitious climate actions. Thank you to InterSystems for providing these services.

If you have any questions, please leave them in the comments below.

Happy InterSystems Hacking!

Authors: Alice, Suze, and Bubble

Discussão (0)0
Entre ou crie uma conta para continuar
Pergunta
· Mar. 1

How to use upload manager to upload data to codetable tables.

I am trying to upload using Upload manager functionality in Trakcare , but unable to do so without any guide or documentation.

If anyone can assist me through that would be greatly appreciated.

Thanks.

1 Comment
Discussão (1)1
Entre ou crie uma conta para continuar
Artigo
· Mar. 1 6min de leitura

Leveraging InterSystems IRIS for Health Data Analytics with Explainable AI and Vector Search

Introduction

To achieve optimized AI performance, robust explainability, adaptability, and efficiency in healthcare solutions, InterSystems IRIS serves as the core foundation for a project within the x-rAI multi-agentic framework. This article provides an in-depth look at how InterSystems IRIS empowers the development of a real-time health data analytics platform, enabling advanced analytics and actionable insights. The solution leverages the strengths of InterSystems IRIS, including dynamic SQL, native vector search capabilities, distributed caching (ECP), and FHIR interoperability. This innovative approach directly aligns with the contest themes of "Using Dynamic SQL & Embedded SQL," "GenAI, Vector Search," and "FHIR, EHR," showcasing a practical application of InterSystems IRIS in a critical healthcare context.

1 Comment
Discussão (1)1
Entre ou crie uma conta para continuar