Getting Started with NLP and spaCy Transcripts
Chapter: Part 4: NLP with huggingface and LLMs
Lecture: spaCy remains super useful
0:00
So you might be a little bit curious now because I've shown you two different ways of doing NLP with AI tools.
0:07
On one end we've got spaCy where text can go in and because we have a spaCy model that was trained using training data with labels we
0:16
are able to get some structured information out. But you might also look at spacy-llm, which is also able to get some structured information out.
0:27
But the main difference is that here I would also need to have some labels and here I would only need a prompt.
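The "only need a prompt" side can be sketched with a small spacy-llm config file. This is a sketch, not the course's exact setup: the registered task and model names (`spacy.NER.v3`, `spacy.GPT-3-5.v2`) and the labels are illustrative and depend on your spacy-llm version.

```ini
[nlp]
lang = "en"
pipeline = ["llm"]

[components]

[components.llm]
factory = "llm"

[components.llm.task]
@llm_tasks = "spacy.NER.v3"
labels = ["DISH", "INGREDIENT"]

[components.llm.model]
@llm_models = "spacy.GPT-3-5.v2"
```

Notice there is no training data anywhere in this file: the labels are just names the LLM is prompted with, whereas the classic spaCy pipeline would need annotated examples for each of them.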
0:34
So that might make you think that this way of building machine learning models is actually preferable.
0:40
It's easier to get started with, and wouldn't that always be better? And here I want to add a little bit of warning and a little bit of nuance.
0:48
If you're going down the LLM approach be reminded that it is a bit costly.
0:54
This is especially true if you are using third-party providers: you're going to have to pay.
0:59
But even if you're running models locally these models tend to be very heavyweight.
1:03
And you also need to pay for compute so that's something to keep in mind.
1:08
spaCy models tend to be nice and lightweight which is definitely not the case for these LLMs.
1:13
Second, especially if you're using a third party, be aware that your data needs to go there. And depending on the industry, that might be a no-go.
1:22
The third reason is accuracy. And the easiest way to explain that is maybe by drawing a chart. Now imagine that we have labels on the x-axis over here.
1:36
So the more labels we've got the more to the right that will be over here.
1:40
And let's for all intents and purposes say that we've got some sort of measure for performance. Well in my experience so far with no labels whatsoever
1:50
you can get a pretty decent performance out of an LLM. So even at zero labels when you've got a decent prompt you can get pretty good performance.
1:59
The spaCy model on the other hand is probably going to start out a bit slow.
2:04
But there does come this point usually where the spaCy model is able to surpass the LLM. Now this probably won't hold for every use case out there
2:17
but I do have a small theory on why I've seen this in practice so much.
2:21
Even when you've got a pretty good prompt this LLM over here can be seen as a very general model.
2:28
OpenAI really needs to facilitate a wide spectrum of use cases. Our little spaCy model over here doesn't.
2:36
The lightweight spaCy model only needs to concern itself with a small subset of natural language and on a very precise task.
2:45
And I think that's why at some point the spaCy model tends to perform a bit better as you collect more and more labels.
2:53
That's not to say that LLMs aren't useful though. There is this moment in the beginning when you're bootstrapping
2:58
when the performance of an LLM is actually going to be pretty good.
3:01
But this phenomenon, at least to me, is the reason that I really like to use spacy-llm early in a project as an annotation aid.
3:10
With very little effort I can get a model that's okay and I can compare that to my own spaCy model
3:16
and again, when there's disagreement, I might be able to give those examples priority when annotating. And it's tricks like that that really do feel very useful.
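The disagreement trick can be sketched in a few lines of Python. The helper and the toy predictions below are hypothetical, not part of spaCy or spacy-llm; in practice the prediction sets would come from each pipeline's `doc.ents`.

```python
def disagreements(texts, preds_a, preds_b):
    """Return the texts where two models' entity predictions differ.

    preds_a / preds_b: one set of (start, end, label) tuples per text,
    e.g. collected from the LLM pipeline and the trained spaCy model.
    """
    return [
        text
        for text, a, b in zip(texts, preds_a, preds_b)
        if a != b
    ]


texts = ["one pizza please", "extra basil on top"]
llm_preds = [{(4, 9, "DISH")}, set()]
spacy_preds = [{(4, 9, "DISH")}, {(6, 11, "INGREDIENT")}]

# The models disagree on the second text, so a human should look at
# that one first.
to_annotate = disagreements(texts, llm_preds, spacy_preds)
# → ["extra basil on top"]
```

The point of prioritising disagreements is that agreement is weak evidence the prediction is fine, while disagreement marks exactly the examples where a human label buys the most information.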
3:24
So in summary, who knows, times might change, maybe these LLMs become more and more lightweight and they also become more performant.
3:31
But it does feel that, at least for now, there's still a lot to be said for also training your own spaCy model
3:36
even though we've got these very useful LLMs at our disposal as well.