Managing Python Dependencies Transcripts
Chapter: Finding Quality Python Packages
Lecture: "Rules of Thumb" for Selecting a Great Package - Part 1

0:01 When you're looking to find a quality Python package to help you out with a problem at hand, it can be a little bit overwhelming,
0:07 having to select between all of these different options. In my time as a Python developer, I've come up with
0:13 a series of rules of thumb for selecting a great package. I've turned this into a seven step workflow
0:19 that you can use to find and select quality Python packages. Let me walk you through it now, step by step. Think of this workflow as a funnel.
0:29 First, you're going to find a pool of candidate packages, and just do a bunch of research and basically
0:36 collect as many packages as possible that could help you with the problem at hand, and then, with each of the steps in this workflow,
0:42 we're going to successively refine this list by excluding packages. Each step will help you gather more information and give you
0:51 a better understanding of the quality of each package. The goal of this process is to make the decision of which package to use really, really simple.
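As a rough sketch of the funnel idea, the snippet below treats each later step as a filter that removes candidates from the list. The package names, the `Candidate` fields, and the thresholds are all hypothetical placeholders, not values from the course; the point is only the shape of the process.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Candidate:
    """Notes collected about one candidate package (fields are illustrative)."""
    name: str
    mentions: int = 0        # how often it came up during research
    has_docs: bool = False   # did the project homepage look helpful?


# A funnel step takes one candidate and says whether it stays on the list.
Step = Callable[[Candidate], bool]


def run_funnel(candidates: Iterable[Candidate], steps: Iterable[Step]) -> List[Candidate]:
    """Apply each step in order, keeping only candidates that pass."""
    remaining = list(candidates)
    for step in steps:
        remaining = [c for c in remaining if step(c)]
    return remaining


# Hypothetical candidates and criteria, purely for illustration.
pool = [
    Candidate("package-a", mentions=5, has_docs=True),
    Candidate("package-b", mentions=1, has_docs=True),
    Candidate("package-c", mentions=4, has_docs=False),
]
steps = [
    lambda c: c.mentions >= 2,  # popularity check
    lambda c: c.has_docs,       # homepage / documentation check
]

shortlist = run_funnel(pool, steps)
print([c.name for c in shortlist])
```

Each step only ever shrinks the list, so the order of the steps affects how much work you do but not which packages survive.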
1:00 You'll be starting with this long list of candidate packages and in the beginning, it will be almost impossible to tell
1:07 which one is the perfect package for your use case, but as you keep narrowing down that list, by the end of this workflow,
1:13 you will have narrowed down that candidate list so much and you will have built a great understanding of the strengths and weaknesses of each library,
1:21 that making that decision is going to be very easy for you. The ability to find and identify great Python packages
1:28 is very helpful even if you're working on your own, but it gets so much more powerful if you have to justify your dependency decisions
1:36 to a team of other developers or to your manager. You can apply the same workflow and the same criteria and use them to explain your decisions;
1:46 to give you a concrete example, you could just take this process and as you go through it, take extensive notes
1:52 and basically compile a report about your decision, and after you've gone through those seven steps in the workflow,
1:58 this is going to be a pretty bulletproof report that you can then share with your team, or your manager.
2:04 Alright, let's jump right in and then you'll learn this important skill in no time. Let's start with step 1: finding candidate packages.
2:13 The first thing that I usually do is that I come up with a list of candidate packages that will help me solve the problem at hand.
2:20 And there are a number of ways you can fill up that list. In my mind, it really helps to come up with a series of options,
2:26 so that you have a base for comparison. Now, let's talk about how you would fill up that candidate package list.
2:32 I often start out by browsing through the curated lists I told you about earlier, so I would just open up those websites like Awesome Python,
2:40 I will try and find the matching category that is relevant to the problem that I am trying to solve and then I'll just click through that category
2:49 checking out all the packages that are listed there. Another option would be just to run a quick Google search
2:56 for two to five relevant keywords. Imagine you are looking for a way to upload files to Amazon's S3 service using Python.
3:04 Here is what I would do, so for that, I just open up Google and then I would probably search for something like S3 upload Python,
3:11 you know, very focused keywords and just kind of sprinkle the minimal set of keywords that I could think about, and I just search for that.
3:20 And then the results here are going to give me a pretty good overview, so I probably just click through the first three results or so,
3:28 and just check out what they have to say. Now, this question here, pretty much is what I had in mind
3:34 and looking at the first answer points me to the boto library, so I'll probably check that out and add it to the list,
3:42 and then I do the same thing for the other top search results. Now, in this case, I know from personal experience that boto is a great choice,
3:49 so the fact that we're already seeing this result is a pretty good sign. Honestly, I found that a quick Google search can really help you out here,
3:57 it often digs up the right content immediately, pointing you to results on Stack Overflow or on forums like Reddit or Hacker News.
4:05 So I usually do that really early on in my research process, when I am looking for a new Python package.
4:11 You've already seen that I looked at a Stack Overflow result here, so Stack Overflow is another great site you can use
4:19 to find recommendations for Python packages. If you haven't used Stack Overflow before, it's basically a question-and-answer site for developers.
4:28 And you can search it as well, so I am just going to punch in the same keywords that I previously searched on Google, just to see what comes up,
4:35 so by default, this will be sorted by relevance, which is kind of an opaque measure, so often, I'll just immediately switch over to the votes tab
4:42 which will give me the most upvoted answers. Alright, so let's check out the first answer here.
4:48 So, this is the number one upvoted answer for this question, I am not going to read the full question,
4:54 I just want to see what kinds of libraries and tools people recommend here. And as I scroll down, I can immediately see that okay,
5:00 boto is another library that people recommend, so again, this will be a pretty good indicator that I should really check out
5:07 this boto library because it just keeps popping up again and again. Another great recommendation for finding quality Python packages
5:16 is community forums like Reddit or Hacker News, and sometimes you can also use Twitter like that. Let's take a look at those now.
5:25 Reddit is a community forum website that has a pretty large Python community; you can find it at reddit.com/r/Python.
5:33 And Reddit has a search feature as well, again, what I would do here is I would punch in the same keywords and then I would limit my search.
5:42 In this case, we could probably drop the Python because we're just searching the Python forum.
5:47 So, anything S3 related will pretty much be about Python. Alright, let's see what we got here, so this looks pretty helpful already,
5:54 one interesting bit here is that you can see when the question was submitted, or when the forum thread was created.
6:02 So you want to make sure you are not looking at super old content for things that could change frequently.
6:07 But let's just check out this discussion here. So this looks like this is not going to give me the answer immediately,
6:12 but I can still learn a lot about how people talk about the problem here, what keywords they use and that could point me in the right direction
6:21 to actually find the library that does what I want or I actually find a discussion where someone recommends a specific library
6:27 and then other people can respond to that discussion and I can read what they have to say and that is going to give me
6:33 a pretty good idea of whether or not that library might be the right choice for me. Another helpful community forum is Hacker News.
6:41 Now, by default Hacker News doesn't have a search function built in, but you can use a third party search at hn.algolia.com
6:50 that can do a full-text search on comments and stories inside Hacker News. Again, let's punch in S3 upload Python and see what happens.
6:59 Alright, so looking at these results again I see boto popping up here so this could be interesting, maybe this result is a little bit old,
7:06 but again, this could be a good way to fill up that candidate list and identify libraries that other people recommend and use.
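One lightweight way to keep track of which packages keep popping up is to tally mentions per source as you go. The sketch below uses `collections.Counter`; the `notes` entries mirror the searches so far in this walkthrough, while the note format and the function name `tally_mentions` are just one assumed way of taking notes.

```python
from collections import Counter
from typing import Iterable, Tuple


def tally_mentions(notes: Iterable[Tuple[str, str]]) -> Counter:
    """Count in how many distinct sources each package was mentioned.

    `notes` holds (source, package) pairs; repeated pairs count once.
    """
    unique_pairs = set(notes)
    return Counter(package for _source, package in unique_pairs)


# Notes from the searches so far: boto came up in each of them.
notes = [
    ("Google", "boto"),
    ("Stack Overflow", "boto"),
    ("Hacker News", "boto"),
]

counts = tally_mentions(notes)
for package, sources in counts.most_common():
    print(f"{package}: mentioned in {sources} sources")
```

Counting distinct sources rather than raw hits keeps one long thread from inflating a package's score.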
7:13 Even if you're not using Twitter, just the fact that so many people share their thoughts on Twitter all the time, can be pretty powerful
7:21 if you're looking for an answer to your programming question, I know it sounds a little bit crazy but this works more often than you'd think,
7:28 so let's try it out, I don't know what is going to happen. Again, I am searching for the same set of keywords,
7:33 and then I am just going to check out some of the responses here,
7:37 alright, so sometimes it's going to reference other source material like Stack Overflow, or blog posts, okay, so this looks pretty interesting here,
7:45 this guy is talking about a script that uploads stuff to S3, so why don't we check it out.
7:50 So just looking at the code here, it looks like this guy is not using a specific library
7:54 to talk to S3, but he is using the command line tool, this aws s3 command,
7:59 so this could be another option for us to research now, maybe it's a good choice, I don't know, I know this process is a little bit time consuming
8:05 but it's really impressive what this process can dig up. If you do this for an hour or two, you're going to be pretty much an expert
8:12 on what's out there in terms of libraries that could help you with this job. If you've searched all these sites and you're still not happy
8:23 with this candidate package list that you've built up, then it might make sense to search PyPI directly.
8:29 Personally, I find it a little bit hard to find stuff on PyPI because the interface is pretty clunky, and there is very little curation.
8:38 But it might still make sense to spend a few minutes on that and see if you can dig up something useful.
8:45 Now, another option to get those candidate packages would be to actually ask a question on Stack Overflow or Reddit,
8:52 so on all of these sites you can create a free account, and just start asking questions, of course, you want to be mindful
8:59 of questions that people have asked in the past, so I recommend that you do some research first
9:04 to avoid running the danger of posting duplicate questions. But usually, people are pretty receptive and helpful on these forums,
9:10 so it might make sense to give it a shot. However, it's rather time intensive to write and post the question
9:16 and then have to wait a couple of hours or even days to get a response. Now at the end of step one, you should have a list of candidate packages
9:25 that you want to do some further research on. After you've generated a list of candidate packages,
9:32 the next step is to check out how popular these packages are. Usually popularity is a good sign if you're looking for a Python package
9:41 because that often means that the package is well maintained, it's high quality,
9:46 and you can't really go wrong with installing it and using it for your own purposes. Now, how can you find out if a package is popular?
9:54 One way to do it would be to check out the download stats. You used to be able to just go to PyPI and check out the download stats for a package,
10:03 but this feature was removed when the PyPI architecture changed. So right now, you can't really get those stats; they might come back in the future,
10:10 and then I think they are a really good indicator, but right now, we'll have to go with something else. Another good popularity indicator would be
10:18 just the number of Google results and Reddit results and Stack Overflow results or recommendations you find for a given package.
10:25 And often, this step of the research process happens in combination with the first one, so as you go along and search these sites,
10:31 you can take mental notes of which packages show up frequently. And this could be really valuable information,
10:37 when you have to make a decision which one you are going to use. If a package is hosted on GitHub, you could also check out their GitHub page
10:44 and see how many stars they have on GitHub, so the star system on GitHub is a pretty simple voting system
10:50 where people can favorite or star repositories. Now, if you are thinking about installing a library
10:56 that has let's say 5 thousand or 10 thousand stars, it's pretty much a no brainer. If it only has 10 or 20 then maybe that is not a bad sign,
11:04 but it's also probably not a super popular library. Another way to get at that information is using the Python.libhunt website
11:13 and it includes a popularity indicator that is based on some other opaque values. Sometimes it can be helpful to compare two packages
11:21 and just kind of see which one has more traction. Now at the end of step two, you should have a pretty good understanding
11:28 of the relative popularity of your candidate packages. Once you have narrowed down the list of candidate packages
11:35 I would start checking out the actual project homepages. You could learn a lot from a project website,
11:42 things like does this website actually feel helpful, is it answering my questions that I have as a new user, does the website look actively maintained,
11:52 and how successful does this project look, did someone actually spend the time to make the website helpful and nice;
11:59 let's play through this with an actual Python project website. A great example here is the Requests library, and right away when this site loads up,
12:11 this looks like a really high quality library, it has its own logo here, it looks like it's supporting a bunch of Python versions,
12:18 it looks like it has automated tests which is always a great thing to see, and the project maintainer is also tracking test code coverage.
12:26 Here on the left you can see that the page has this embedded GitHub stars indicator, and as you can tell,
12:34 the library has a high number of stars here which is usually a good sign. What I like here as well is that the page starts with a concrete example
12:41 of what you can do with the library and what it looks like to use it. This is great, so they even have a bunch of user testimonials
12:48 from really well known people in the Python community, and when I scroll down further, I can see here that it has a pretty extensive user guide
12:56 that covers a number of interesting things and seems really well structured, there is also in depth API documentation which is always a good sign.
13:05 Another sign that this is a really popular and strong library is that it has a contributor guide with all kinds of information
13:12 about how to contribute to the project, the code style they use, how people should report bugs, and a really small and unpopular library
13:21 is usually not going to have a need for that. So when you see something like that, that is usually a strong sign
13:27 that the library is really popular and very successful. And by extension that means it's usually a safe choice
13:34 for you to use that library in your own programs. By the way, if you're wondering where to find a project's homepage,
13:42 if it has one, you can usually find the link on PyPI, so it will be right here on the left, and for older versions of PyPI
13:49 you will typically have to scroll all the way down and then you can find the link to the project homepage there.
13:56 There we go, this is the homepage link for the Requests library. At the end of step 3, after you've checked a couple of project homepages,
14:04 your list should have narrowed down a little bit further, at this point you are starting to get to know these projects a lot better
14:10 and you have a good idea of how popular they are, how well maintained they are, and whether or not you like them.
14:16 So maybe you can already start excluding some libraries that you are not really enjoying as much.
14:21 Of course, not all libraries are going to have a dedicated website or homepage, that doesn't automatically mean that the library is not great quality,
14:28 many Python projects don't actually have dedicated homepages, but if there is one, it absolutely makes sense to check it out.
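If a package is already installed, its homepage link can often be read from the package metadata instead of the PyPI page. This is a minimal sketch, assuming Python 3.8 or newer and a project that declares a `Home-page` or `Project-URL` field, which not every project does; `find_homepage` and `homepage_for` are hypothetical helper names.

```python
from importlib import metadata
from typing import Optional


def find_homepage(meta) -> Optional[str]:
    """Return a homepage URL from package metadata, or None if none is declared.

    `meta` is any object with `get` and `get_all`, such as the message
    returned by `importlib.metadata.metadata()`.
    """
    home = meta.get("Home-page")
    if home and home.strip().upper() != "UNKNOWN":
        return home.strip()
    # Newer packages often list URLs as "Label, https://..." entries instead.
    for entry in meta.get_all("Project-URL") or []:
        label, _, url = entry.partition(",")
        if label.strip().lower() in {"homepage", "home", "home-page"}:
            return url.strip()
    return None


def homepage_for(package_name: str) -> Optional[str]:
    """Look up the homepage of an installed package; None if not installed."""
    try:
        return find_homepage(metadata.metadata(package_name))
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints a URL if Requests is installed and declares one, otherwise None.
    print(homepage_for("requests"))
```

A `None` result only means no homepage was declared in the metadata; as noted above, that alone says nothing about the quality of the library.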

