
Question
· Mar. 25

Anyone experience hurdles when using the Production Validator toolkit?

I've been testing out the Production Validator toolkit to see what it can and can't do. It seems really interesting, and there are use cases where it could genuinely streamline upgrades (or at least parts of them), but I kept running into hurdles with the documentation. I'm curious whether anyone else has used it.

Did you experience any issues getting it working? Any clarifications you would have liked in the documentation? Any use cases you worked through that made it particularly valuable?

The hurdles I experienced included:

  1. The documentation states that the toolkit comes pre-installed in 2024.3+, but I got to a point where it kept throwing a <METHOD DOES NOT EXIST> error. On investigation, the toolkit either wasn't installed or was only partially installed, so I had to download and install the package from the WRC.
  2. The documentation was frequently a little vague about things like the terminal command inputs. For example, it isn't clear what the arguments in the following command are for (I later figured out that it creates a new temporary namespace in which the copied production is loaded and run):
    HSLIB>set sc = ##class(HS.InteropTools.HL7.Compare.ProductionExtract).ConfigureAndImport(<Testing_Namespace>,<Full_Path>)
  3. Since I was simulating an upgrade between 2024.1 and 2024.3, I wasn't able to produce any output showing differences. The documentation doesn't have any kind of test or demo baked in, just screenshots, so I'm still not 100% sure what it can and can't handle (see Outstanding Questions below).

 

Outstanding Questions

  1. I don't have any demo productions that use BPLs, so there's an open question as to how the Production Validator handles those when they're present.
  2. Couldn't a good amount of this be scripted, since most of these commands take no user input? (And where they do, wouldn't it make sense to script as much as possible? A rough sketch of what that might look like follows this list.)
  3. I hadn't worked with the syntax that produces the JSON and initially thought it was invalid JSON (what with the doubled double quotes, e.g. ""example""). Would a link to that syntax documentation really be unnecessary? Would an example of the JSON output be out of place? (See the escaping note after this list.)
  4. It isn't totally clear what the workflow is until you've banged your head into every wall. Is there a way to clarify that with a diagram/wireframe of the workflow, or would that be inappropriate?
  5. Does this allow rerunning a test once it has been run once? I tried, but kept bumping into errors like "DB already exists" and "namespace already exists", so I had to create a new namespace and delete the COMPARE file each time. Is that just how it runs, or is there a way to rerun that isn't easily identifiable? (The sketch after this list includes a cleanup step for exactly this.)
  6. Initially I tried to get this all running in a Docker container, but ran into too many challenges to justify the time spent. I ended up running two parallel instances (e.g. 2024.1 and 2024.3) locally, and that worked a treat. Has anyone been able to test this using a Docker container?
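
On the scripting and rerun points (items 2 and 5): below is a minimal sketch of what a rerun-friendly wrapper might look like. It assumes only the ConfigureAndImport call shown above plus the standard Config.Namespaces system API; the TESTNS name and export path are hypothetical placeholders, and note that deleting a namespace does not remove its databases (Config.Databases / SYS.Database would cover that, omitted here).

    // Hypothetical rerun wrapper; TESTNS and the export path are placeholders.
    set ns = "TESTNS"
    set path = "/data/exports/production_extract.xml"

    // Drop the namespace left over from the previous run (must run in %SYS).
    new $namespace
    set $namespace = "%SYS"
    if ##class(Config.Namespaces).Exists(ns) {
        set sc = ##class(Config.Namespaces).Delete(ns)
        do:$system.Status.IsError(sc) $system.Status.DisplayError(sc)
    }

    // Rerun the documented import step from HSLIB.
    set $namespace = "HSLIB"
    set sc = ##class(HS.InteropTools.HL7.Compare.ProductionExtract).ConfigureAndImport(ns, path)
    do:$system.Status.IsError(sc) $system.Status.DisplayError(sc)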
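
On item 3: the doubled double quotes are ObjectScript string-literal escaping, not invalid JSON. Inside an ObjectScript string literal, "" stands for a single " character, so the output itself is valid JSON. For example:

    // "" inside an ObjectScript string literal is an escaped quote, so this
    // source-level {""name"":""example""} is really the JSON {"name":"example"}.
    set json = "{""name"":""example""}"
    write json,!                // prints {"name":"example"}
    set obj = ##class(%DynamicObject).%FromJSON(json)
    write obj.name,!            // prints example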

Interested in what you all have experienced, whether you've made any customizations, and whether you've figured out any workarounds.

Thanks

Article
· Mar. 25 · 1 min read

How to log application errors to the ^ERRORS global

InterSystems FAQ rubric

This can be done with TRY-CATCH:

    #dim ex As %Exception.AbstractException
    TRY {
        // Code that causes an error
    }
    CATCH ex {
        do ex.Log()
    }

If you use ^%ETN, call it via the BACK entry point (BACK^%ETN).

Also take a look at the related article: How to get application errors (^ERRORS) using a command

Question
· Mar. 25

How to tokenize a text using SentenceTransformer?

Hi all.

I'm trying to create an indexed table with a vector field so I can search by the vector value.
I've been investigating and found that to get the vector value for a given text (token), you use a Python method like the following:

ClassMethod TokenizeData(desc As %String) As %String [ Language = python ]
{
    import iris
    # Step 2: Generate Document Embeddings
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer('/opt/irisbuild/all-MiniLM-L6-v2')

    # Generate embeddings for each document
    document_embeddings = model.encode(desc)

    return document_embeddings.tolist()
}
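
(For context, invoking this method from ObjectScript looks roughly like the following; MyApp.Tokenizer is a hypothetical class name, since the real class isn't shown here.)

    // Hypothetical invocation; MyApp.Tokenizer stands in for the real class.
    set vec = ##class(MyApp.Tokenizer).TokenizeData("The quick brown fox")
    write vec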

The all-MiniLM-L6-v2 model was downloaded from https://ollama.com/library/all-minilm and installed into my Docker instance.

When I tried to test this method (from Visual Studio), it threw the following error:

<THROW>DebugStub+40^%Debugger.System.1 *%Exception.PythonException <PYTHON EXCEPTION> 246 <class 'OSError'>: It looks like the config file at '/opt/irisbuild/all-MiniLM-L6-v2/config.json' is not a valid JSON file.

Then I changed the config.json file to make it valid JSON (I wrote only the curly braces) and repeated the test, but got a new error:

<THROW>DebugStub+40^%Debugger.System.1 *%Exception.PythonException <PYTHON EXCEPTION> 246 <class 'safetensors_rust.SafetensorError'>: Error while deserializing header: HeaderTooSmall

Does anyone know how to fix this problem?
Is there any other way to create the vector value so I can index it?

Best regards.

Question
· Mar. 25

Walking a virtual document's structure

Is there a generic process for "walking" the structure of a virtual document, e.g. an HL7 message (EnsLib.HL7.Message) or an XML document (EnsLib.EDI.XML.Document)?

At least we'd want to be able to visit all "nodes" (HL7 fields or sub-fields, XML nodes) in the virtual document and be able to work out/generate the Property Path (so we could call "GetValueAt"). 

We can just about come up with something generic for HL7, since it only nests down to four levels within each segment, though we're using numeric Property Paths at that point rather than symbolic ones (MSH:1.3 etc.); a rough sketch of that approach follows.
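
A rough sketch of that numeric-path approach, under assumptions not confirmed by the documentation: that whole-segment paths like 2 and field paths like 2:3 are valid inputs to GetValueAt, and that GetValueAt sets an error status for an out-of-range path (the 50-field probe limit is arbitrary):

    /// Sketch: probe an EnsLib.HL7.Message with numeric property paths and
    /// stop where GetValueAt reports an error status. Components and
    /// sub-components could be probed the same way ("2:3.1", "2:3.1.2").
    ClassMethod WalkHL7(pMsg As EnsLib.HL7.Message)
    {
        set seg = 1
        for {
            set raw = pMsg.GetValueAt(seg, , .sc)    // whole-segment path "1", "2", ...
            quit:$$$ISERR(sc)
            write "Segment ", seg, " (", $extract(raw, 1, 3), ")", !
            for field = 1:1:50 {
                set val = pMsg.GetValueAt(seg_":"_field, , .sc)
                write:val'="" "  ", seg, ":", field, " = ", val, !
            }
            set seg = seg + 1
        }
    }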

Reading the documentation has not so far cast light! Any pointers welcome. Thanks.
