Managing Python Dependencies Transcripts
Chapter: Finding Quality Python Packages
Lecture: "Rules of Thumb" for Selecting a Great Package - Part 2

0:01 Not all Python packages are going to have a dedicated project homepage. But what every project should have is some form of README file,
0:10 that introduces you to the project. So I always check those too. And what I like to see here, is that
0:17 I want the README to cover the basics of the project, what does the library do, and how do I install it.
0:24 You can learn a lot about the quality of a library by looking at how the maintainers communicate the value that the library provides.
0:31 I also want to know what license the project is under, because that could really influence in what circumstances
0:37 you can actually use the project and then, it makes sense to quickly check who the author is, is it a group of people,
0:44 is it a company, is it an individual contributor, what have they done in the past, and do they seem trustworthy?
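
For a package that is already installed, the license and author information mentioned above can also be read from its metadata. This is only a minimal sketch using the standard library's importlib.metadata module (Python 3.8 or newer); summarize_metadata is a hypothetical helper name, not something from the lecture, and many packages leave some of these fields empty, so the README and license file remain the primary source.

```python
from importlib import metadata

# Metadata fields that are useful for a quick trust check.
# Packages are free to leave any of these empty.
FIELDS = ("Name", "Version", "Summary", "License", "Author", "Author-email")


def summarize_metadata(dist_name):
    """Return selected metadata fields for an installed distribution.

    Returns None when the distribution is not installed in the
    current environment.
    """
    try:
        meta = metadata.metadata(dist_name)
    except metadata.PackageNotFoundError:
        return None
    # .get() yields None for fields the package did not declare.
    return {field: meta.get(field) for field in FIELDS}


if __name__ == "__main__":
    info = summarize_metadata("requests")
    if info is None:
        print("requests is not installed in this environment")
    else:
        for field, value in info.items():
            print(f"{field}: {value}")
```

A None value for a field only means the package did not declare it in its metadata; it says nothing about the quality of the project by itself.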
0:50 Let's take a look at a real project README now. Alright, I am going to try and find the README file for the Requests library now.
0:58 So, typically, what you'd be looking for is a link to the project source repository so it already looks like this is hosted on GitHub here,
1:06 so I am just going to look for a link. Alright, there we go, requests at GitHub, so that should be the link to the GitHub project
1:15 where we can check out the project README, yes, this is it, so when I scroll down this is where GitHub displays the README file,
1:20 and for other source control websites like Bitbucket, they will either display the README in the same fashion
1:26 or you can view it by finding the README file and then clicking directly on that.
1:30 Now the first thing that I can see here is that this looks really similar to the project homepage which isn't a bad thing,
1:36 I mean, this contains all of the information that I wanted to get out of the README and it looks like it's really well structured and nicely formatted.
1:42 So this is great, this tells me how to install the library, and it looks really simple, it's pointing me to the documentation
1:49 and it also tells me how to contribute to the project. If you're wondering what should go into a great README file, I wrote this article a while ago,
1:57 about how you can write a great README for a GitHub project, I am covering a number of things here that in my opinion
2:02 should go into a great README, for example, it should talk about what the project actually does, how to install it,
2:08 some example usage, how someone could set up a development environment, some link to a change log, and then also license and author info,
2:17 you can check out the full article in the link that you see here. Now let's go back to the Requests README. I said that I'd like to know
2:29 under which license a library was published, so let's find that out now. Usually where you can find that information is in a license file,
2:35 at the root of the repository. So this tells us that Requests is under the Apache license, a popular open source license; if you're wondering
2:43 what the most common open source licenses are, and what their terms are there is a really great website you should check out.
2:49 Go to choosealicense.com/licenses and they have great simple and human readable explanations of the conditions and permissions
2:59 and limitations in the most popular open source licenses. So for example, this is the Apache license used by Requests,
3:06 and this gives us a quick overview of the terms of the license, without actually having to drill down into the nitty gritty details.
3:14 Another thing that I'd like to know is who the authors are, who wrote a library. Now, typically, in an open source library,
3:22 you can find an AUTHORS file that will list all the contributors, again here with Requests you can get a really quick overview
3:30 of who the core maintainers are, and then there was apparently a whole bunch of people who have submitted patches over time,
3:37 and this is a great sign because it means you have a project leadership and then you also have a large group of people who are dedicating patches
3:43 and contributions to the project. We could also check out the GitHub user account that hosts the Requests library,
3:50 and in this case, it's Kenneth Reitz and you can see that Kenneth has a number of very popular libraries in the Python space,
3:57 he is working for a respectable and well-known company and these are all indicators that Requests is a really great library.
4:05 At the end of step 4, maybe the field has narrowed down a little bit further, every Python library should have a good project README,
4:13 and I find it helpful to familiarize myself with the licensing terms for the project, and the team of people working on or maintaining the library.
4:24 In step 5, you're going to make sure that the project is actively maintained. In my mind, this is one of the most important quality indicators,
4:32 now how can you find out if a project is under active development, usually a great way to find that information is to check out
4:40 the project changelog and the update history. You could either do that directly on PyPI or by checking the project source repository,
4:49 also on the source repository you can usually find a bug tracker for the project.
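
The update history on PyPI can also be inspected from a script. The following is a rough sketch, assuming PyPI's JSON endpoint at https://pypi.org/pypi/<name>/json with its "releases" mapping and "upload_time_iso_8601" field; the function names are hypothetical, and the number it produces is only one signal next to the changelog and the bug tracker.

```python
import json
from datetime import datetime, timezone
from urllib.request import urlopen


def fetch_pypi_payload(package_name):
    """Download a package's JSON description from PyPI (needs network access)."""
    url = f"https://pypi.org/pypi/{package_name}/json"
    with urlopen(url, timeout=10) as response:
        return json.load(response)


def release_dates(payload):
    """Map each version to the upload time of its earliest file."""
    dates = {}
    for version, files in payload.get("releases", {}).items():
        uploads = [
            # Older Python versions cannot parse a trailing "Z", so spell out UTC.
            datetime.fromisoformat(f["upload_time_iso_8601"].replace("Z", "+00:00"))
            for f in files
        ]
        if uploads:  # versions without uploaded files are skipped
            dates[version] = min(uploads)
    return dates


def days_since_last_release(payload, now=None):
    """Return the number of days since the newest upload, or None if there is none."""
    now = now or datetime.now(timezone.utc)
    dates = release_dates(payload)
    if not dates:
        return None
    return (now - max(dates.values())).days


if __name__ == "__main__":
    payload = fetch_pypi_payload("requests")
    print("Days since last release:", days_since_last_release(payload))
```

A long gap since the last release is not a verdict on its own; a seasoned library that still does its job can go a long time without a new version.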
4:53 Now this can tell you a lot about how the project is being maintained. Are there discussions going on, are there many open tickets for severe bugs?
5:03 If there are no tickets, then that is usually not a great sign either, because in my experience, any project that gets some traction,
5:09 has a flood of bug reports coming in; now I would recommend that you skim through some of those bug reports,
5:16 just to make sure that there isn't some large problem with the project that would affect your ability to use it properly.
5:21 Another piece of information you can find directly on the source repository is when the last commit to the project happened.
5:28 Now you don't want to discount projects that do not have a lot of development activity going on at the moment,
5:34 I'd rather pick a well seasoned project that is also well maintained or at least not abandoned over one that's super maintained but also brand new,
5:44 because then you don't really know what the future holds, maybe the project is going to get abandoned in a few months,
5:50 and then you're stuck with it, whereas a seasoned library that still does its job properly but it's not getting a lot of feature updates,
5:56 could still be totally worth your while, there is nothing wrong with an older library that does its job really well.
6:01 At the end of step 5, your list of candidate projects will likely have narrowed down further and this is a good thing,
6:07 the more projects you can weed out, the easier it will be to pick the perfect library for your use case. You are almost done here. In step 6,
6:17 you would spot check the source code of the project. I always like to look under the hood of a library that I am going to use in my own programs.
6:27 And usually, this is really easy to do if you're dealing with an open source project,
6:30 you just open the project repository website and browse around in the code a little bit. Here is what I like to see.
6:37 Does the code follow community best practices, for example, does it have a consistent formatting style,
6:43 are there comments in the code, are there docstrings, stuff like that, another hugely important topic for me is whether or not
6:51 the code has automated test coverage, in my mind, a good quality Python package will always have an automated test suite.
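
One small part of this spot check, the presence of docstrings, can be roughly automated. This is only a sketch using the standard library's ast module; docstring_coverage is a hypothetical helper, and a count like this is no substitute for actually reading the code.

```python
import ast


def docstring_coverage(source):
    """Return (documented, total) for the functions and classes in Python source."""
    tree = ast.parse(source)
    nodes = [
        node
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
    ]
    # ast.get_docstring returns None when a definition has no docstring.
    documented = sum(1 for node in nodes if ast.get_docstring(node) is not None)
    return documented, len(nodes)


if __name__ == "__main__":
    import sys

    # Usage: python docstring_check.py path/to/module.py
    with open(sys.argv[1], encoding="utf-8") as handle:
        documented, total = docstring_coverage(handle.read())
    print(f"{documented} of {total} functions and classes have docstrings")
```

The script can be pointed at a single file downloaded from the project repository, without installing the library.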
6:58 Looking at the code will also give you a good idea of how experienced the developers were who wrote the library;
7:04 often you can tell at a glance whether it was someone who had a deep understanding of Python who wrote a library,
7:10 or if it was someone who was maybe coming from an entirely different language background and was just kind of told to write a Python library.
7:18 Now, this doesn't automatically mean disaster, but it's still a really good quality indicator. In the end,
7:24 it all boils down to the question would you feel comfortable making small changes to this library if you had to?
7:31 Because that is what the worst case scenario is. Imagine you are building a really successful application that is using a particular library and then
7:38 the original authors of the library stop maintaining it. Well, if you don't want to give up your project,
7:44 it will pretty much come down to you maintaining this library, at least enough so you can use it for your own purposes.
7:50 This is something that I always try to keep in the back of my head when I make a decision whether to use one library or another.
7:56 Alright, let's take a look at what this looks like in practice. So I am back here looking at the GitHub repository for the Requests library.
8:06 And that gives me a really easy way to browse through the library source code,
8:09 so I don't even have to install it, I can just use the GitHub code viewer and browse around and I don't need to pull this over into my own editor.
8:18 So what I would do here is try and find the main directory where all the source files live, and in this case,
8:24 it's the requests folder so typically this would be named after the library, and you can see here there is a bunch of Python files in there.
8:32 This seems pretty well structured already and you can also see there is a lot of activity here so these are being updated all the time.
8:40 Now let's check out one of those files. For example, the cookies.py file, that sounds tasty. And I would just spend some time reading that code,
8:48 so what I immediately like here is that there are docstrings, the imports are nicely formatted, you can see here
8:55 the classes seem like they are named properly, again, there are extensive docstrings for everything,
9:03 this class here with these methods on it, they seem well structured, right, there are no crazy long, thousand-line methods here.
9:14 This is all pretty nice and tidy and when I scroll further through the file,
9:18 it all just seems like it's following a structure and it's formatted in the way that makes it easy on the eyes, and that is usually a really good sign,
9:29 like imagine you have to maintain this code, personally I would much rather work with code that looks like this, than some convoluted mess.
9:37 And you can see here it seems to adhere to the PEP 8 formatting style which I think is also a good sign
9:43 because if you are also using PEP 8 or something similar, then this library code is going to look similar to your application code,
9:48 which also helps maintenance. Yeah, so I would say this looks pretty good, let's see if we can find some tests.
9:58 Okay, so there is a tests folder here, and again, it looks like there are a whole bunch of tests here, so let's check out the test_structures, alright,
10:11 so they are using pytest which is a library that personally I like a lot so this would be a good sign for me, first of all I love the fact
10:19 they have an automated test suite here and just glancing over those tests, I mean, they seem pretty reasonable, right,
10:31 they seem like they are actually testing stuff, they are not just placeholders or dummy tests they are actually doing some things.
10:37 Now, usually I wouldn't do like a full code review for a library that I want to use, but I just want to do some spot checking to get an idea
10:46 of the code quality for that library, because, in the worst case scenario I might actually have to do some maintenance work on this library,
10:53 if someone stops maintaining it and it's an important part of my own application, then I would be pretty much responsible for keeping this thing alive
11:01 so that I can continue to use it. So this is always something that is in the back of my head;
11:07 of course, Requests here passes that test with flying colors, and seeing how popular that library is, it's probably going to be maintained
11:13 for a really long time so I wouldn't be too worried about this, but, of course it helps that it has great code quality too.
11:20 Okay you made it all the way to step 7, and this is the last step in this workflow.
11:27 So at this point, you would have a much narrowed-down list of candidates, and now it's time to try out a few of them.
11:34 So at this point, I would go over my notes and my memories, and take this narrowed-down list of candidates and just start installing
11:40 some of them to try them out, in an interpreter session, and I am always using a fresh virtual environment for that
11:47 so that I am not cluttering up my system. I would encourage you to do the same, and then you can just launch
11:52 into an interpreter session, import the library, and play with it a little bit. Or you might write a couple of small programs
11:59 just to get a feel for how the library works, so for example, with Requests, maybe I would write a little program
12:06 that downloads a file over HTTP and then I would try and implement the same example with a different library to get a feel
12:13 for what the strengths and weaknesses are of each of them. Now actually, installing the library and then trying it out is
12:20 going to tell you something very important; it's going to tell you whether the package installs and imports cleanly, because,
12:27 at the end of the day that is super important, even if you have the best library for your purpose, if it's so painful to install
12:34 or it doesn't work on your system, then that is not going to help you. So I always make sure to actually get some hands on experience
12:41 with my top 3 choices or so, so that I can be confident in the decision that I make. Another very important question is
12:48 whether or not you enjoy working with the package. I strongly believe that developers should use tools that they enjoy working with,
12:54 and this also applies to third party packages and modules and libraries. So for me, this would always factor into the decision,
13:03 now I realize that there might be business constraints and sometimes you just have to work with something that you are not enjoying as much.
13:09 But if there is a way to get the best of both worlds, a really great library that is actually fun to work with,
13:14 I would always pick the one that is fun to work with and gets the job done.
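
The "installs and imports cleanly" check from step 7 can be sketched as a tiny script that is run inside the fresh virtual environment after installing a candidate. This is a minimal sketch using only the standard library; imports_cleanly is a hypothetical helper name, and the module name passed to it is whatever the candidate library's import name happens to be.

```python
import importlib


def imports_cleanly(module_name):
    """Try to import a module and report whether it worked.

    Returns (True, "") on success, or (False, reason) when the import
    raises, for example because the package failed to install or a
    compiled dependency is missing on this system.
    """
    try:
        importlib.import_module(module_name)
    except Exception as exc:  # deliberately broad: any failure counts
        return False, f"{type(exc).__name__}: {exc}"
    return True, ""


if __name__ == "__main__":
    # "requests" is just the example from this lecture.
    for candidate in ("requests",):
        ok, reason = imports_cleanly(candidate)
        print(candidate, "OK" if ok else f"FAILED ({reason})")
```

A clean import is only the first hurdle; playing with the library in an interpreter session is still what tells you whether you enjoy working with it.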

