OpenAI API for embeddings?

Does anyone know if it’s possible to use the OpenAI blocks to create an embedding from a string? Or would one use the standard API calls to do this?

Can you say a bit more about what you mean by “an embedding from a string”?

OpenAI has a function to create ‘embeddings’ from strings / chunks of text. Embeddings are 1,536-dimensional vector representations of the text. They can be obtained by sending a string (or an entire dataset) to an OpenAI endpoint, which runs a model over the data and returns a vector representation of each string: Embeddings - OpenAI API. At least, that’s my noob understanding.
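
To make the shape of such a call concrete, here is a sketch of how the request to that endpoint could be assembled in Python (the model name, key, and helper function are my own illustrative choices, not anything from the thread):

```python
import json

# The OpenAI embeddings endpoint (per OpenAI's API docs).
EMBEDDINGS_URL = "https://api.openai.com/v1/embeddings"

def build_embedding_request(text, api_key, model="text-embedding-ada-002"):
    """Build the headers and JSON body for an embeddings call."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = {"model": model, "input": text}
    return headers, json.dumps(body)

# Actually sending it needs the `requests` package and a real key:
#   import requests
#   headers, body = build_embedding_request("hello world", "sk-...")
#   resp = requests.post(EMBEDDINGS_URL, headers=headers, data=body)
#   vector = resp.json()["data"][0]["embedding"]  # a list of 1,536 floats
```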

Fascinating. I had no idea.

The built-in OpenAI block is fairly limited so you would need to configure the API manually for that.


OK - that was surprisingly easy! I managed to get a vector embedding back from the OpenAI API. That means that, IF I can vectorise large datasets, I could in theory:

  • Ask a question (record this)
  • Convert the sound file to text using WhisperAPI
  • Convert the text string to an embedding / vector
  • Compare that vector to other vectors in the database
  • Return similar vectors > the basis of semantic search

For reference: add the Auth header (with the API key) and Content-Type to the API configuration menu, along with the URL. Then set the model and input as shown to get this to work:
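
Spelled out in code, that configuration amounts to roughly the following (a sketch: the key is a placeholder, and the model name is my assumption about which embedding model was used):

```python
# URL for the embeddings endpoint:
url = "https://api.openai.com/v1/embeddings"

# Auth + Content-Type headers, as entered in the API configuration menu:
headers = {
    "Authorization": "Bearer YOUR_API_KEY",  # placeholder: substitute a real key
    "Content-Type": "application/json",
}

# Model and input fields of the request body:
body = {
    "model": "text-embedding-ada-002",  # assumed 1,536-dimension embedding model
    "input": "Retrieve a test embedding for this text",
}
```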


Quick follow-up on this, as I did try to integrate that into my previous code: be aware that the above only works because the string “Retrieve a test embedding for this text” is in double quotes. Any string you use here needs to be in double quotes. My return variable from WhisperAPI.com was not, so I needed to add them in using a Join.
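
The underlying reason is that the request body is JSON, where string values must be double-quoted (and any quotes inside the text escaped). If you're building the body in code rather than with a Join block, a JSON encoder handles this safely; here `whisper_text` is a stand-in for the WhisperAPI return value:

```python
import json

whisper_text = 'He said "hello" to me'  # stand-in for the transcript string

# Naively joining quote characters breaks if the text contains quotes:
naive = '"' + whisper_text + '"'

# json.dumps produces a valid JSON string literal, escaping inner quotes:
safe = json.dumps(whisper_text)
```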
