Getting Started with NLP and spaCy Transcripts
Chapter: Part 4: NLP with huggingface and LLMs
Lecture: spaCy-LLM

0:00 Alright, so I'm back in my notebook and I've taken a couple of extra steps. First, I've installed some extra packages.
0:09 I've installed spacy-llm, which is the LLM plugin, and I also installed this library called python-dotenv.
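If you want to follow along, the install step in a notebook would look roughly like this (spacy-llm and python-dotenv are the package names as published on PyPI):

```python
# Notebook cell: install the spaCy LLM plugin and the dotenv helper
%pip install spacy-llm python-dotenv
```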
0:16 The reason why I've installed this library is because I need some environment variables that have my OpenAI credentials,
0:23 and those are stored in this .env file. Now, the two variables that I've set there are the OpenAI API organization, which is my personal identifier,
0:34 and I have also set the OpenAI API key, which is my secret key. These are the two environment variables that I've declared in that particular file,
0:45 and these are the two environment variables that this llm provider needs. Besides this .env file though, there's also this other file here,
0:53 which is called spaCy-llm-config. Here's what it looks like. But in this particular case, this config file is somewhat limited.
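For reference, the .env file mentioned a moment ago might look something like this; the variable names follow the OpenAI conventions that spacy-llm reads, and the values are placeholders:

```
# .env -- keep this file out of version control; values are placeholders
OPENAI_API_ORG="org-..."
OPENAI_API_KEY="sk-..."
```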
1:02 We can see that there is this pipeline defined here, where there's only an llm component,
1:07 but inside that llm component we can see that there's a task definition and that there is a model definition.
1:14 This model definition over here contains everything that spaCy-llm needs to understand what backend to use.
1:21 There are lots of backends that are described in the documentation, but I'm using the GPT-3.5 setting here, which is ChatGPT.
1:30 Then, for a task, I have a suite of tasks that I can pick from, and you can kind of look at this as a recipe that contains both the
1:40 prompt generator as well as the response translator. But in order to generate the prompt, the minimum thing that I would need
1:48 is to know the names of the labels that I would have to predict. And that's something I'm able to define over here.
1:55 So, in layman's terms, this is all the configuration that you need to have an NER model that detects the dish, the ingredient, and the equipment.
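As a sketch, a spacy-llm config along those lines could look roughly like the following. The registry names (spacy.NER.v2 for the task, spacy.GPT-3-5.v1 for the model) are taken from the spacy-llm documentation and can differ between versions, so treat this as an illustration rather than the exact file shown in the notebook:

```ini
[nlp]
lang = "en"
pipeline = ["llm"]

[components]

[components.llm]
factory = "llm"

# The task: builds the NER prompt and translates the response back into spans
[components.llm.task]
@llm_tasks = "spacy.NER.v2"
labels = ["DISH", "INGREDIENT", "EQUIPMENT"]

# The model: tells spacy-llm which backend to call (here an OpenAI GPT-3.5 model)
[components.llm.model]
@llm_models = "spacy.GPT-3-5.v1"
```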
2:06 And this configuration is something I can actually use to bootstrap a spaCy model.
2:11 To do that, there is this assemble function from the spacy_llm.util module. I can point that to the config file.
2:21 And because I've loaded my keys by running this load_dotenv function, that means that the keys in this file are now properly loaded,
2:30 then this NLP model, whenever it sees text, will do a call to OpenAI. When we send something to OpenAI, we send the prompt plus the text.
2:40 Then we get a response back. And then the text in that response, that is something that spaCy-llm can then use to construct a spaCy doc object.
2:50 And that's what we see happen here. As a user, it really feels like you're using spaCy as you would normally.
2:57 It's just that all of this in the background is abstracted away from you. But that is still definitely kind of a nice feature, I would say.
3:06 We are able to use big LLM models, and it still feels like we're just using spaCy.
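Putting the steps from this lecture together, a minimal end-to-end sketch could look like the code below; the config filename and the example sentence are just placeholders:

```python
from dotenv import load_dotenv
from spacy_llm.util import assemble

# Load OPENAI_API_ORG / OPENAI_API_KEY from the .env file
load_dotenv()

# Build the pipeline from the spacy-llm config file; every call to nlp()
# sends the generated prompt plus the text to OpenAI and turns the response
# into a regular spaCy Doc
nlp = assemble("spacy-llm-config.cfg")

doc = nlp("Whisk the eggs in a bowl before folding them into the pancake batter.")
for ent in doc.ents:
    print(ent.text, ent.label_)
```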

