Getting Started with NLP and spaCy Transcripts
Chapter: Part 4: NLP with huggingface and LLMs
Lecture: spaCy plugins

0:00 One cool feature of spaCy is that it also has plugins. And you can explore some of them by going to this Universe tab over here.
0:11 Now there's a lot of different types of plugins. Some of these plugins are things that use spaCy under the hood,
0:16 whereas other plugins are adding new functionality to the library.
0:21 For example, this Blackstone project over here is a project that contains a spaCy pipeline
0:26 that has been trained on legal texts. We can also scroll down a bit and we can see that there are
0:31 these community models that have been trained on different languages. This one is trained on Danish.
0:36 But there are also projects that fall a little bit more in the hobby category, like this Hobbit spaCy plugin
0:41 that adds NLP support for Middle Earth. A bit of a fan language there.
0:46 But there's also some interesting projects for, let's say, Ancient Greek or Latin.
0:51 These languages technically aren't spoken that much anymore, but it's still pretty cool to see that one can make a spaCy plugin for such a language.
1:01 And there are also use cases for this, especially if you're a little bit more in the linguistic humanities.
1:06 And I guess another good example of a spaCy plugin would be this one. This one is called negspaCy,
1:11 which gives a pipeline for negating concepts in text. And it's pretty hard. You can typically see the instructions on how to install the plugin.
1:21 And there's usually also an example that you can just copy and paste to get going locally.
1:26 And this particular plugin has some algorithms to deal with negation on entities.
1:31 So just to zoom in on the example that's listed here, the sentence here reads, She does not like Steve Jobs, but likes Apple products.
1:41 So the negation there applies to Steve Jobs. And that's a property that you can extract on that entity if you've added this pipeline component.
1:51 And you should also be able to confirm that negation is not active on Apple, which is, I assume, the other entity found in this sentence.
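To make the idea of negation scope concrete, here is a minimal pure-Python sketch run on the same sentence. It is not negspaCy's implementation and does not use spaCy at all: the trigger and terminator word lists, the function name negated_entities, and the hand-supplied entity list are all assumptions made for illustration. The rule is simply that a negation word marks everything after it as negated until a word like "but" ends the scope.

```python
# Toy negation scoping, loosely in the spirit of NegEx-style rules.
# NOT negspaCy's implementation; word lists and names are illustrative assumptions.
NEGATION_TRIGGERS = {"not", "no", "never"}
SCOPE_TERMINATORS = {"but", "however"}


def negated_entities(sentence, entities):
    """Map each entity to True if a negation trigger precedes it
    with no scope terminator in between, else False."""
    # Very rough tokenization: drop commas and periods, split on whitespace.
    tokens = sentence.replace(",", " ").replace(".", " ").lower().split()
    result = {}
    for entity in entities:
        ent_tokens = entity.lower().split()
        # Find where the entity's token sequence first starts in the sentence.
        start = next(
            (i for i in range(len(tokens) - len(ent_tokens) + 1)
             if tokens[i:i + len(ent_tokens)] == ent_tokens),
            None,
        )
        negated = False
        if start is not None:
            # Walk the tokens before the entity and track the open scope.
            for tok in tokens[:start]:
                if tok in NEGATION_TRIGGERS:
                    negated = True
                elif tok in SCOPE_TERMINATORS:
                    negated = False
        result[entity] = negated
    return result


sentence = "She does not like Steve Jobs, but likes Apple products."
flags = negated_entities(sentence, ["Steve Jobs", "Apple"])
print(flags)  # {'Steve Jobs': True, 'Apple': False}
```

In the real plugin the entities come from the spaCy pipeline rather than a hand-written list, and the rules are considerably richer than this two-set lookup.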
2:01 negspaCy is just one example of a plugin, but this spaCy universe over here has lots of them.
2:06 And it can be worthwhile to just have a look at what's in there.
2:11 Now, the one thing to remember is that these universe projects are not hosted by spaCy itself.
2:16 These are community projects. And that does mean the projects can be a little bit stale.
2:21 If we have a look at the GitHub repository for negspaCy, then we can see that it's still fairly well maintained.
2:26 I can see that there are some unit tests, and the last commit was about a year ago,
2:31 which isn't that bad. But I can also see that this project is over 5 years old.
2:36 And it can happen that a plugin, especially after 5 years, isn't maintained as much anymore. That does totally happen with open source packages.
2:46 So before you dive deep and start using one of these plugins in production, it will be good to just at least have a glance at the GitHub repository,

