Getting Started with NLP and spaCy Transcripts
Chapter: Part 4: NLP with huggingface and LLMs
Lecture: Prompting

0:00 In the previous video we gave ChatGPT a prompt and it was able to generate text that was interesting to us.
0:08 However, before we just got a big list of items but maybe we can come up with a more elaborate prompt for a more
0:16 elaborate NLP task. One thing I could do is literally write a prompt telling ChatGPT to perform NLP for me. So I could do something like
0:25 "from the text below, extract the following entities for me." And I could list something like dish, ingredient,
0:39 and maybe equipment. And then I could add "here's some text: I know of a great pizza recipe with anchovies."
0:49 And here we can see that ChatGPT responds and that it's actually able to make some solid detections. It is able to confirm that
0:59 pizza is indeed a dish. We're also able to see that anchovies is an ingredient, so that's nice. And it's also able to say that the
1:09 equipment, well there's no equipment mentioned in this bit of text over here. Now what's kind of interesting here is
1:17 that technically you could look at this and you could say hey it seems that ChatGPT can be used to do some named entity
1:24 recognition on our behalf. And that's definitely a cool observation, but we should also be a little bit critical and wonder whether what ChatGPT is
1:31 giving us here is actually enough. That's because usually, when you're applying NLP, text goes in, you give that to your
1:41 NLP system and then structured data comes out. In this case it's a little bit different. We have text in the form of a prompt
1:50 with the text we're interested in detecting. We're giving that to a large language model and then what comes out, well that is
1:58 actually more text. Technically speaking there's no structured information here. This is just text that's being generated on our
2:05 behalf. So that means we need an extra step here to turn the response text into something
2:12 that is structured. So how might you actually get something that's structured here? Well, one thing you can do is
2:20 change this prompt over here. Right now we're just asking it to give me entities but we could also ask it to give the
2:28 entities in a specific format. So let's just try that real quick. All right so here's an adaptation. It's a
2:36 small change but what I've now done is I've said well there's a very specific format that you have to follow. So there's a dish and then there has to
2:43 be a comma delimited list of strings. Then there's an ingredient and the same thing. And although it's a subtle change we do
2:52 see now that instead of it being a bulleted list we really just get new lines in the format that we're asking for
2:59 and we see again that it's performing the NLP task in a way that we like. Now if you were to build a proper NLP system for this
3:08 what you kind of need are two components. You need some sort of way to take this output and to turn it into something that is structured.
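A minimal sketch of that response-to-structure step, assuming the model answers with one `label: item, item` line per entity type. The exact format used in the video is not shown in this transcript, so this layout and the helper name `parse_response` are illustrative assumptions:

```python
from typing import Dict, List


def parse_response(response: str, labels: List[str]) -> Dict[str, List[str]]:
    """Turn 'label: a, b' lines into a dict mapping each label to a list of strings."""
    entities: Dict[str, List[str]] = {label: [] for label in labels}
    for line in response.splitlines():
        # Skip lines that do not look like 'label: values'.
        if ":" not in line:
            continue
        label, _, values = line.partition(":")
        label = label.strip().lower()
        # Ignore labels we did not ask for.
        if label not in entities:
            continue
        entities[label] = [v.strip() for v in values.split(",") if v.strip()]
    return entities


# Hypothetical response text in the assumed format.
response = "dish: pizza\ningredient: anchovies\nequipment:"
print(parse_response(response, ["dish", "ingredient", "equipment"]))
# {'dish': ['pizza'], 'ingredient': ['anchovies'], 'equipment': []}
```

A model may not always stick to the requested format, which is why the sketch skips lines and labels it does not recognize instead of raising an error.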
3:17 If I were to think in terms of spaCy it'd be nice to somehow get this into a spaCy document object right.
3:24 In particular it'd be nice to see those as entities on that object. But maybe something else we would like our system to do as well
3:31 is maybe generate this prompt. You can imagine that I might be able to start with some text that goes in. That'd be the text over here.
3:41 But constructing a prompt over here such that the information that comes out over
3:46 here is nice and structured. It'd be kind of nice if we have some sort of a prompt generator for that as well.
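Such a prompt generator could be sketched roughly as follows. The prompt wording is adapted from what is said in this video, while the function name `build_prompt` and the exact layout of the format instructions are assumptions for illustration, not what any particular library does:

```python
from typing import List


def build_prompt(text: str, labels: List[str]) -> str:
    """Build an entity-extraction prompt that asks for one 'label: items' line per label."""
    label_list = ", ".join(labels)
    # One format line per label, so the response is easy to parse afterwards.
    format_lines = "\n".join(
        f"{label}: <comma delimited list of strings>" for label in labels
    )
    return (
        f"From the text below, extract the following entities: {label_list}.\n"
        "Answer in exactly this format:\n"
        f"{format_lines}\n\n"
        f"Here's some text: {text}"
    )


print(build_prompt(
    "I know of a great pizza recipe with anchovies.",
    ["dish", "ingredient", "equipment"],
))
```

Keeping the labels as a parameter means the same function can be reused for other entity types, with only the input text and the label list changing.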
3:54 And hopefully it's also clear that this prompt generator, together with this response translator, well,
4:02 these two things would be nice to have in some sort of a library that just handles a whole bunch of this translation and prompt generation
4:10 on my behalf. And this is exactly what a plugin called spaCy LLM does for you. It can generate prompts on your
4:19 behalf and there's actually a fair amount of effort going into making sure that prompts are generated according
4:25 to the latest literature. And for each prompt we can also have a proper response translator. But even better spaCy LLM will also
4:34 allow you to pick the LLM provider. You can use tools from OpenAI if you like, but OpenAI is not the only LLM vendor out there. In fact there are also
4:43 open source models from Hugging Face that you can run locally that spaCy LLM can also communicate with on your behalf. I hope
4:50 this was useful context but next up what I think I should do is just give a very quick demo of spaCy LLM.

