Building Data-Driven Web Apps with Pyramid and SQLAlchemy Transcripts
Chapter: Testing web apps
Lecture: Sitemaps and testing

0:00 You've heard of the Pareto principle
0:01 in one form or another.
0:03 It's often referred to as the 80/20 rule
0:05 the idea that most of the value can be gotten
0:08 by doing a little bit of the work...
0:10 80% value, 20% work.
0:13 That's good, right? We can actually accomplish a lot of that benefit
0:16 by having a sitemap. Well, what's a sitemap?
0:20 So, a site map is actually a listing of all of the URLs
0:24 and a little bit of detail about them
0:26 targeted for search engines.
0:28 Google, Search bots, Bing, things like that.
0:32 Here's an example that might make sense for our...
0:35 And what we can do is we could define a Chameleon template
0:38 that will generate this.
0:40 So we can have one for the site, one for account/login
0:43 one for account/register and then every package on the site
0:47 we could use data-driven behaviors to pull out all of that.
0:51 See that down there at the bottom
0:53 tal:repeat p in packages, we're going to spit that out.
0:56 You want to have one of these types of things anyway.
0:58 It's massively helpful for search engines.
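As a rough illustration of what such a sitemap contains, here is a minimal sketch that builds the same kind of document in plain Python. It is a stand-in for the Chameleon template described in the lecture, not the course's actual template. The function name `build_sitemap`, the `/project/<id>` URL shape, and the sample package names are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(base_url, package_ids):
    """Return sitemap XML text: fixed pages plus one entry per package.

    Hypothetical helper; the lecture generates this with a template instead.
    """
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    # Static pages mentioned in the lecture.
    paths = ["/", "/account/login", "/account/register"]
    # Data-driven part: one URL per package (URL shape is an assumption).
    paths += [f"/project/{pid}" for pid in package_ids]
    for path in paths:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = base_url + path
    return ET.tostring(urlset, encoding="unicode")


sitemap_xml = build_sitemap("http://localhost:6552", ["requests", "sqlalchemy"])
print(sitemap_xml)
```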
1:01 So we could have this template
1:02 and our view model would be really simple.
1:04 It's just going to go and say "Give me all the packages."
1:07 We may want to set a limit, probably not, more likely
1:09 you want to just cache this and refresh it periodically.
1:12 Anyway, we want to generate this somehow
1:15 and here's our controller, super easy.
1:18 We register sitemap.xml as the URL
1:22 and then we just generate this stuff using that template.
1:25 If we have this in place
1:26 and I'm going to give you one for this website
1:27 we're not going to write it from scratch.
1:29 I'll give it to you, you can adapt it for your own.
1:31 Then we can use this for testing.
1:33 We can go through here and just request
1:35 every single page in the sitemap
1:37 and make sure that none of them 500
1:39 you know server crash or 404 or things like that.
1:42 That's actually really helpful. This is the 80/20 rule.
1:46 We can just do a little bit of work to grab the site map
1:48 and request everything.
1:50 It won't test everything we're looking for
1:51 but like I said most of the time when these pages fail
1:54 they fail hard and just fully crash
1:56 500, 404, something like that.
1:59 They don't just send back wrong HTML.
2:01 That does happen, but they fully crash most of the time
2:06 which is actually kind of good for testing.
2:09 So, let's see this just in the concept here
2:11 and then we'll go look at it in code for just a moment.
2:13 We're going to write a little bit of code
2:15 to get the sitemap text, so what we need to do
2:18 is actually use our functional test behavior
2:22 to request /sitemap.xml.
2:25 It's going to return some value, response.
2:28 We can go to the text and we can say
2:30 so the XPath queries are simpler
2:32 let's get rid of the namespaces
2:34 replace this namespace that has to be there with nothing
2:37 that means we don't have to worry about it
2:38 in what we're about to do.
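A minimal sketch of the namespace trick just described, using only the standard library. The helper name `strip_sitemap_namespace` and the sample XML are assumptions for illustration. It also shows why the step matters: with the namespace in place, a plain `url/loc` query finds nothing.

```python
import xml.etree.ElementTree as ET

# The namespace declaration a valid sitemap carries on its root element.
SITEMAP_NS_ATTR = 'xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"'


def strip_sitemap_namespace(xml_text):
    """Remove the sitemap namespace so un-prefixed path queries match."""
    return xml_text.replace(SITEMAP_NS_ATTR, "")


# Sample sitemap text (made up for this example).
raw = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>http://localhost:6552/account/login</loc></url>"
    "</urlset>"
)

# With the namespace, every tag is namespace-qualified, so this matches nothing.
with_ns = ET.fromstring(raw).findall("url/loc")
# After stripping it, the simple query works.
without_ns = ET.fromstring(strip_sitemap_namespace(raw)).findall("url/loc")
```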
2:40 We can write a test that'll say
2:41 I want to go and test every URL there, so get us that text.
2:46 I want to create an XML DOM element with etree.fromstring
2:51 and in there we're going to do an XPath query for url/loc.
2:55 This will pull through, pull it out
2:58 and generate every single URL on our site.
3:02 Here we replace localhost:6552 with nothing.
3:05 I mean, we would put the full address in our sitemap
3:08 but we just want to do a relative request here
3:11 like /account and so on. And then
3:15 now we have just the URLs as a flat list of strings.
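The extraction step could be sketched like this. It uses the standard library's `xml.etree.ElementTree`, whose `findall` supports a limited XPath subset, rather than whichever etree module the course code uses. The function name `relative_urls`, the default host prefix, and the sample document are assumptions, and the input is expected to have its namespace already removed.

```python
import xml.etree.ElementTree as ET


def relative_urls(sitemap_text, host_prefix="http://localhost:6552"):
    """Return the site-relative URLs listed in namespace-free sitemap XML."""
    root = ET.fromstring(sitemap_text)
    urls = []
    for loc in root.findall("url/loc"):
        # Drop the scheme and host so the test can issue relative requests.
        relative = loc.text.strip().replace(host_prefix, "")
        # A bare host with no trailing slash maps to the site root.
        urls.append(relative or "/")
    return urls


# Sample sitemap text with the namespace already stripped (made up).
sample = (
    "<urlset>"
    "<url><loc>http://localhost:6552/</loc></url>"
    "<url><loc>http://localhost:6552/account/login</loc></url>"
    "<url><loc>http://localhost:6552/project/sqlalchemy</loc></url>"
    "</urlset>"
)
urls = relative_urls(sample)
```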
3:19 Well, at that point let's just go ahead with a bunch of requests.
3:22 So, we'll go through each one of these
3:23 and, we'll just do a request for them.
3:27 URL status should be 200.
3:31 Notice that I threw in a check: if "project" is
3:34 in the URL
3:36 and we've already tested one, let's not do it again.
3:39 In the real PyPI, imagine, this was the real PyPI data
3:43 we'd have 140,000 different packages.
3:47 It may be valuable to request every one
3:50 but, you'll probably get most of the value
3:52 by just requesting an example one.
3:55 So, if we've requested one
3:58 okay, we've tested that part of our site
4:00 we don't need to do it over and over and over again.
4:03 So, it's an optimization, it gives up some tests
4:05 makes it much faster, put it in here if you like.
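Here is a minimal sketch of that request loop, including the "only test one project page" optimization. The function name `check_urls`, the `/project/` marker, and the `FakeApp` class are assumptions for illustration. `FakeApp` stands in for a functional test app such as WebTest's `TestApp`, whose `get(url, status=200)` call fails the test when the response status differs.

```python
def check_urls(app, urls, sample_marker="/project/"):
    """Request each URL expecting a 200; request only the first project page."""
    sampled = False
    checked = []
    for url in urls:
        if sample_marker in url:
            if sampled:
                # One project page already passed; skip the rest for speed.
                continue
            sampled = True
        # With WebTest, a non-200 response raises here and fails the test.
        app.get(url, status=200)
        checked.append(url)
    return checked


class FakeApp:
    """Hypothetical stand-in for the test app; it just records requests."""

    def __init__(self):
        self.requested = []

    def get(self, url, status=None):
        self.requested.append((url, status))


app = FakeApp()
checked = check_urls(
    app, ["/", "/account/login", "/project/requests", "/project/sqlalchemy"]
)
```

The trade-off is the one described above: skipping the remaining project pages gives up some coverage in exchange for a much faster run.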
4:09 And this is it. We can go and write this simple test
4:12 against this simple site map you want anyway
4:14 and get a lot of verification right there.