When using standard SQL or the object layer in InterSystems IRIS, metadata consistency is usually maintained through built-in validation and type enforcement. However, legacy systems that bypass these layers—directly accessing globals—can introduce subtle and serious inconsistencies.
Understanding how drivers behave in these edge cases is crucial for diagnosing legacy data issues and ensuring application reliability.
The DATATYPE_SAMPLE
database is designed to help analyze error scenarios where column values do not conform to the data types or constraints defined in the metadata. The goal is to evaluate how InterSystems IRIS and its drivers (JDBC, ODBC, .NET) and different tools behave when such inconsistencies occur. In this post, I’ll focus on the JDBC driver.
What's the Problem?
Some legacy applications write directly to globals. If a relational model (created via CREATE TABLE
or manually defined using a global mapping) is used to expose this data, the mapping defines the underlying values conform to the declared metadata for each column.
When this assumption is broken, different types of problems may occur:
- Access Failure: A value cannot be read at all, and an exception is thrown when the driver tries to access it.
- Silent Corruption: The value is read successfully but does not match the expected metadata.
- Undetected Mutation: The value is read and appears valid, but was silently altered by the driver to fit the metadata, making the inconsistency hard to detect.
Simulating the Behavior
To demonstrate these scenarios, I created the DATATYPE_SAMPLE
database, available on the InterSystems Open Exchange:
🔗 Package page
🔗 GitHub repo
The table used for the demonstration:
CREATE TABLE SQLUser.Employee (
ID BIGINT NOT NULL AUTO_INCREMENT,
Age INTEGER,
Company BIGINT,
DOB DATE,
FavoriteColors VARCHAR(4096),
Name VARCHAR(50) NOT NULL,
Notes LONGVARCHAR,
Picture LONGVARBINARY,
SSN VARCHAR(50) NOT NULL,
Salary INTEGER,
Spouse BIGINT,
Title VARCHAR(50),
Home_City VARCHAR(80),
Home_State VARCHAR(2),
Home_Street VARCHAR(80),
Home_Zip VARCHAR(5),
Office_City VARCHAR(80),
Office_State VARCHAR(2),
Office_Street VARCHAR(80),
Office_Zip VARCHAR(5)
);
Example 1: Access Failure
To simulate an inconsistency, I injected invalid values into the DOB
(Date of Birth\Datatype DATE) column using direct global access. Specifically, the rows with primary keys 101, 180, 181, 182, 183, 184, and 185 were populated with values that do not represent valid dates.
The values looks like this now:
As you can see, a string was appended to the end of a $H
(Horolog) value. According to the table's metadata, this column is expected to contain a date—but the stored value clearly isn't one.
So what happens when you try to read this data? Well, it depends on the tool you're using. I tested a few different tools to compare how they handle this kind of inconsistency.
1) SquirrelSQL (SQuirreL SQL Client Home Page)
When SquirrelSQL attempts to access the data, an error occurs. It tries to read all rows and columns, and any cell that contains invalid data is simply marked as "ERROR"
. Unfortunately, I couldn't find any additional details or error messages explaining the cause.
2) SQLWorkbench/J (SQL Workbench/J - Home)
SQL Workbench/J stops processing the result set as soon as it encounters the first invalid cell. It displays an error message like "Invalid date"
, but unfortunately, it doesn't provide any information about which row caused the issue.
3) DBVisualizer (dbvis) & DBeaver (dbeaver)
DBVisualizer and DBeaver behave similarly. Both tools continue reading the result set and provide detailed error messages for each affected cell. This makes it easy to identify the corresponding row that caused the issue.
4) SQL DATA LENS (SQL Data Lens - a powerful tool for InterSystems IRIS and Caché)
With the latest release of SQL DATA LENS, you get detailed information about the error, the affected row, and the actual database value. As shown in the screenshot, the internal value for the first row in columns DOB is "39146<Ruined>"
, which cannot be cast to a valid DATE
.
SQL DATA LENS also allows you to configure whether result processing should stop at the first erroneous cell or continue reading to retrieve all available data.
The next part of this article will shows details about:
Silent Corruption: The value is read successfully but does not match the expected metadata.
Undetected Mutation: The value is read and appears valid, but was silently altered by the driver to fit the metadata, making the inconsistency hard to detect.
Andreas