June 17, 2010

Good Slideshows: includes Good Powerpoint, Good Keynote, Good Impress

PowerPoint, Keynote, Impress, or another slideshow program, or (even better because of wider compatibility) a PDF formatted for projection, is great for PowerPoint-sized ideas. Note that Keynote does not have a freely available viewer (thus, no link). If you use it, you are depending on your own machine working, or on other people having Keynote on their machines. Slide projectors are also rare these days, so be similarly wary of them. PowerPoint and Impress carry similar risks, but the free availability of a viewer for the former, and of the program itself for the latter, ameliorates them. PDF viewers, however, are extremely common, and at full screen they are great at showing slideshows. A PDF of the presentation on a USB stick is the current state of the art for a portable slideshow format. Have a second USB stick with a backup copy if it's a really important show.

I've seen a lot of positions that require "good PowerPoint skills". I can't tell if they think that "PowerPoint" is a synonym for "Public Speaking", or if they are just PP-dependent and need someone to assemble slide decks for them. For any of these positions that I consider, I will be asking whether that requirement exists because A: they assume that good PowerPoint is good speaking, B: PowerPoint is actually a good format for the kinds of results their company finds and/or needs, or C: the audience requires PowerPoint. In case A, I will point out that while I can put together PowerPoint slideshows, 4 years of Toastmasters, with a CC (Competent Communicator, formerly Competent Toastmaster) and CL (Competent Leader), may be a better indicator of my speaking ability. In case C, I'll be very discreet about the follow-up inquiries, because audiences that require PowerPoint are a red flag to me. Do they want only soundbites? Can they read technical reports when there are interlinkages and not just bullet points? Are they literate/numerate?

So what about B?

The points from "The Seven Habits of Highly Effective Pirates" (from Schlock Mercenary) are excellent examples of PowerPoint-able ideas. I recommend one per slide, like panel three of the linked comic. It is an almost perfect slide. A picture demonstrating the rule is sometimes stronger, if it doesn't upset the stomachs of your audience. Oh wait - that's panel four, but panel four has the demonstration first, and the rule second. That is the wrong order for audiences whose attention span is shorter than their reading speed.

Also note, clipping this panel and pasting a comic under the bullet point with the text of the rule does not work well. It is just bad design, and makes you look like a combination of stodgy (must have bullet point) and uncreative (particularly if you use a Dilbert clipping - in PP they are now trite, not ironic). If you're going to take someone else's work like that, do three things: get permission, make it as big as possible on the slide, and put a little bullet under it giving attribution to the artist. This shows that you understand how it highlights your words (which is the part that is your creativity), and are willing to share the credit where due (the picture itself is not your creativity). A highlight is best done large, right?

There are two other cases that are very closely related to each other: pictures and graphs. A picture of a hazard is much better than any technical description, and a florid description that replaces the picture would probably be discredited as "unprofessional". Compare the florid description: "there's a bunch of round rocks held together by mud at the top of a dirt cliff above a building site, they'll fall on you if you sneeze too loud", with the technical presentation description: "there is a rather steep, poorly vegetated slope above the site that is capped by a loosely consolidated conglomerate of riverstone."

Graphs are awesome for presentations. Use them! But don't use all of them, or use them all the time. Really - use graphs to show the patterns, use words to give your presentation. Only show the graphs you want people to remember, those that are important to your presentation. And turn the projector off when you need people to pay attention to what you are saying. If you can't turn it off, put in either blank slides, or uninteresting, you've seen them too many times to care, plain background corporate logo slides between your graphs and photos. This way the audience will listen to you when needed, and see graphs and charts and photos when they need to learn through their eyes. In dire straits, put a business card in front of the projector when you need eyes back on you.

In a twist that won't surprise anyone who has sat through "just a quick PowerPoint on our idea" or a 45-minute "highlights of our wedding" slideshow, slideshows are also useful for another situation: the presentation of "nothing". This is best explained at the end of a New York Times article about the use of PowerPoint in the military:

Senior officers say the program does come in handy when the goal is not imparting information, as in briefings for reporters.

The news media sessions often last 25 minutes, with 5 minutes left at the end for questions from anyone still awake. Those types of PowerPoint presentations, Dr. Hammes said, are known as “hypnotizing chickens.”

So, PowerPoint and other slideshows do have good uses. But please, try not to hypnotize the chickens unless you really mean to!

June 4, 2010

Look at your data

Anscombe (1973)

Possibly one of the most famous artificial data sets, and rightly so. If you're not familiar with "Anscombe's Quartet", follow the above link.

Now that we've had that graphic reminder to LOOK AT THE DATA, I'm going to skip the usual next step of writing about what to do, and instead refer you, kind reader, to the discussion of this data at the Princeton Office of Population Research. The discussion is located at this link.

And after you look at that link, please: look at your data.
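For anyone reading without the links, here is a minimal sketch of why the quartet earns its fame. The numbers are the four published data sets from Anscombe (1973); the helper name `summarize` is my own, and the choice of summary statistics (means, least-squares slope, correlation) is just one reasonable set to compare.

```python
# Anscombe's quartet: four small data sets whose summary statistics
# nearly coincide even though their scatterplots look nothing alike.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(x, y):
    """Return (mean of x, mean of y, least-squares slope, correlation)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return mean_x, mean_y, sxy / sxx, sxy / (sxx * syy) ** 0.5


for name, (x, y) in QUARTET.items():
    mean_x, mean_y, slope, r = summarize(x, y)
    print(f"{name:>3}: mean x = {mean_x:.2f}, mean y = {mean_y:.2f}, "
          f"slope = {slope:.2f}, r = {r:.2f}")
```

The four printed rows come out essentially the same, which is exactly the trap: the table says "same data", and only a plot shows otherwise.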

June 2, 2010

Lost in log-space: residuals of log-transformed data

Yesterday I was helping Jim with a model for predicting non-profit registration by county, and a little problem came up about how to explain some data. He had found a really strong relation, ran the model, and made a pretty choropleth of the results. But the legend had entries like "-0.500 standard deviation to 0.500 standard deviation". And the underlying data had been log transformed. He didn't have time to rebuild all of his maps, so I was asked this question: "What do I tell them this means?"

Good question! Log transformation of data is a common technique to deal with several problems: typically anything where the scatterplot of a relationship looks log-normal, "looks like an exponential curve", has "too long" of a rightward tail, or numerous other things. I will leave aside the question of whether the log transform is the right one in a particular instance, and proceed directly to today's (well, yesterday's) question:
When analyzing the residual deviations after the model is fit, what does the deviation of the log of the variable of interest mean? And make it friendly, statistics guy!
Well... ouch. I don't like the implication of "y'all are hard to understand". But then again, a lot of people feel that way, and that's why analysts are paid to be analysts. So what to do? What do "regular people" want to see, that is also a good representation of the reality?
Percent deviation from predicted.
And here's what they don't want to hear (but you want to do):
  • If you have "standardized residuals" (how very statistical of you!), first multiply by the RMSE to get actual deviations.
  • Exponentiate the deviations (raise e to each one, assuming a natural log transform), and respect the sign of the deviation!
  • Now you have the ratio of measured over predicted. If it's greater than one, then subtract one and multiply by 100 to get the percent "high". If it's less than one, then subtract the ratio from one and multiply by 100 to get the percent "low".
Use those numbers for your labels, don't explain the funky transformation details to a lay audience, and enjoy the lack of flustered puzzlement on the audience's faces!

Example, just for practice:
Say your RMSE is 0.5. Then one deviation high (positive) is e to the 0.5: about 1.65; and one deviation low is e to the -0.5: about 0.61.

So one up is about 65% high, and one down is about 39% low.
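
The recipe and the practice example can be sketched in a few lines of Python. This is only a sketch under two assumptions: the data were transformed with the natural log (for log base 10, swap `math.exp(d)` for `10 ** d`), and the RMSE is the one from the fitted model in log units. The function name `percent_label` is made up for illustration, and the RMSE of 0.5 is the practice value from above, not the value from Jim's model.

```python
import math


def percent_label(standardized_residual, rmse):
    """Turn a standardized residual of log-transformed data into
    (percent, direction) relative to the predicted value."""
    deviation = standardized_residual * rmse   # back to log units
    ratio = math.exp(deviation)                # measured / predicted
    if ratio >= 1:
        return (ratio - 1) * 100, "high"
    return (1 - ratio) * 100, "low"


# Relabel some legend break points, assuming the practice RMSE of 0.5.
for z in (-1.0, -0.5, 0.5, 1.0):
    percent, direction = percent_label(z, rmse=0.5)
    print(f"{z:+.1f} standard deviations -> {percent:.0f}% {direction}")
```

Each legend break point gets converted once, and the map itself does not need to be rebuilt; only the labels change.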

One additional note here: symmetrical percent bands, like 10% to 40% high or low, will be misleading because the "high" band is actually expected to have smaller counts than the "low" band if a log model is correct. But this is the way that people have become accustomed to having data presented, and people think of it as "fair", despite this potential bias.
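
To put rough numbers on that asymmetry, here is a small sketch. It assumes the residuals in natural-log space are normally distributed around zero with a standard deviation equal to the RMSE, and it reuses the practice RMSE of 0.5; both are assumptions for illustration, and `band_share` is a hypothetical helper name.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def band_share(ratio_from, ratio_to, rmse):
    """Expected share of observations whose measured/predicted ratio
    falls between ratio_from and ratio_to (with ratio_from < ratio_to)."""
    z_from = math.log(ratio_from) / rmse
    z_to = math.log(ratio_to) / rmse
    return normal_cdf(z_to) - normal_cdf(z_from)


RMSE = 0.5  # assumed, same as the practice example
high = band_share(1.10, 1.40, RMSE)  # 10% to 40% high
low = band_share(0.60, 0.90, RMSE)   # 10% to 40% low
print(f"10-40% high: {high:.1%} of observations expected")
print(f"10-40% low:  {low:.1%} of observations expected")
```

The "low" band is wider in log space (ln 0.6 to ln 0.9 spans more than ln 1.1 to ln 1.4), so it collects more of the observations even though the two percent labels look like mirror images.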