Getting Started with NLP and spaCy Transcripts
Chapter: Part 4: NLP with huggingface and LLMs
Lecture: spaCy plugins
0:00
One cool feature of spaCy is that it also has plugins. And you can explore some of them by going to this Universe tab over here.
0:11
Now there's a lot of different types of plugins. Some of these plugins are things that use spaCy under the hood,
0:16
whereas other plugins are adding new functionality to the library.
0:21
For example, this Blackstone project over here is a project that contains a spaCy pipeline
0:26
that has been trained on legal texts. We can also scroll down a bit and we can see that there are
0:31
these community models that have been trained on different languages. This one is trained on Danish.
0:36
But there are also projects that fall a little bit more in the hobby category, like this Hobbit spaCy plugin
0:41
that adds NLP support for Middle Earth. A bit of a fan language there.
0:46
But there are also some interesting projects for, let's say, Ancient Greek or Latin.
0:51
These languages technically aren't spoken that much anymore, but it's still pretty cool to see that one can make a spaCy plugin for such a language.
1:01
And there are also use cases for this, especially if you're a little bit more in the linguistic humanities.
1:06
And I guess another good example of a spaCy plugin would be this one. This one is called negspaCy,
1:11
which gives a pipeline for negating concepts in text. And it's pretty hard. You can typically see the instructions on how to install the plugin.
1:21
And there's usually also an example that you can just copy and paste to get going locally.
1:26
And this particular plugin has some algorithms to deal with negation on entities.
1:31
So just to zoom in on the example that's listed here, the sentence here reads, She does not like Steve Jobs, but likes Apple products.
1:41
So the negation there is referring to Steve Jobs. And that's a property that you can extract on that entity if you've added this pipeline component.
1:51
And you should also be able to confirm that negation is not active on Apple, which is, I assume, the other entity found in this sentence.
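To make the idea of "a property on an entity" concrete, here is a minimal sketch of how a pipeline component can attach a negation flag to entities. This is not the plugin's actual algorithm or API. The component name `toy_negation`, the attribute `is_negated`, the cue words, and the three-token window are all assumptions made up for this illustration, and a blank English pipeline with a rule-based entity ruler stands in for a trained model so that nothing needs to be downloaded.

```python
import spacy
from spacy.language import Language
from spacy.tokens import Span

# Hypothetical settings for this sketch only, not taken from any plugin.
NEGATION_CUES = {"not", "no", "never"}
WINDOW = 3  # how many tokens before the entity we inspect

# Register a custom attribute, readable afterwards as `ent._.is_negated`.
Span.set_extension("is_negated", default=False, force=True)


@Language.component("toy_negation")
def toy_negation(doc):
    """Flag an entity as negated if a cue word appears just before it."""
    for ent in doc.ents:
        start = max(ent.start - WINDOW, 0)
        preceding = doc[start:ent.start]
        if any(token.lower_ in NEGATION_CUES for token in preceding):
            ent._.is_negated = True
    return doc


# A blank pipeline has no trained NER, so an entity ruler supplies
# the two entities from the example sentence.
nlp = spacy.blank("en")
ruler = nlp.add_pipe("entity_ruler")
ruler.add_patterns([
    {"label": "PERSON", "pattern": "Steve Jobs"},
    {"label": "ORG", "pattern": "Apple"},
])
nlp.add_pipe("toy_negation", last=True)

doc = nlp("She does not like Steve Jobs, but likes Apple products.")
results = {ent.text: ent._.is_negated for ent in doc.ents}
print(results)
```

With this toy rule the flag should be set on Steve Jobs and not on Apple, which mirrors what the plugin's example demonstrates. A real negation plugin uses a far more careful approach than a fixed window of cue words, so treat this only as a picture of the mechanism.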
2:01
negspaCy is just one example of a plugin, but this spaCy Universe over here has lots of them.
2:06
And it can be worthwhile to just have a look at what's in there.
2:11
Now, the one thing to remember is that these universe projects are not hosted by spaCy itself.
2:16
These are community projects. And that does mean the projects can be a little bit stale.
2:21
If we have a look at the GitHub repository for negspaCy, then we can see that it's still fairly well maintained.
2:26
I can see that there are some unit tests, and the last commit was about a year ago,
2:31
which isn't that bad. But I can also see that this project is over 5 years old.
2:36
And it can happen that a plugin, especially after 5 years, isn't maintained as much anymore. That does totally happen with open source packages.
2:46
So before you dive deep and start using one of these plugins in production, it will be good to just at least have a glance at the GitHub repository,